前言
本篇章主要介绍哈夫曼树及哈夫曼编码,包括哈夫曼树的一些基本概念、构造、代码实现以及哈夫曼编码,并用Python实现。
1. 基本概念
哈夫曼树
其中,
给定
代码实现:
class HuffmanTreeNode(object): def __init__(self): self.data = '#' self.weight = -1 self.parent = None self.lchild = None self.rchild = None class HuffmanTree(object): def __init__(self, data_list): self.nodes = [] # 按权重从大到小进行排列 for val in data_list: newnode = HuffmanTreeNode() newnode.data = val[0] newnode.weight = val[1] self.nodes.append(newnode) self.nodes = sorted(self.nodes, key=lambda node: node.weight, reverse=True) print([(node.data, node.weight) for node in self.nodes]) def CreateHuffmanTree(self): # 这里注意区分 # TreeNode = self.nodes[:] 变量TreeNode, 这个相当于深拷贝, TreeNode变化不影响nodes # TreeNode = self.nodes 指针TreeNode与nodes共享一个地址, 相当于浅拷贝, TreeNode变化会影响nodes TreeNode = self.nodes[:] if len(TreeNode) > 0: while len(TreeNode) > 1: letfTreeNode = TreeNode.pop() rightTreeNode = TreeNode.pop() newNode = HuffmanTreeNode() newNode.lchild = letfTreeNode newNode.rchild = rightTreeNode newNode.weight = letfTreeNode.weight + rightTreeNode.weight letfTreeNode.parent = newNode rightTreeNode.parent = newNode self.InsertTreeNode(TreeNode, newNode) return TreeNode[0] def InsertTreeNode(self, TreeNode, newNode): length = len(TreeNode) if length > 0: temp = length - 1 while temp >= 0: if newNode.weight < TreeNode[temp].weight: TreeNode.insert(temp+1, newNode) return True temp -= 1 TreeNode.insert(0, newNode)
3. 哈夫曼编码
在数据通信时,假如我们要发送
报文最短可以引申到二叉树路径最短,即构造前缀编码的实质就是构造一棵哈夫曼树,通过这种形式获得的二进制编码称为哈夫曼编码。这里的权值就是报文中字符出现的概率,出现概率越高的字符我们用越短的字符表示。
以下表中的字符及其出现的概率为例来实现哈夫曼编码:
"text-align: left">代码实现就是在哈夫曼树的基础上加一个编码的函数:
def HuffmanEncode(self, Root): TreeNode = self.nodes[:] code_result = [] for index in range(len(TreeNode)): temp = TreeNode[index] code_leaf = [temp.data] code = '' while temp is not Root: if temp.parent.lchild is temp: # 左分支 code = '0' + code else: # 右分支 code = '1' + code temp = temp.parent code_leaf.append(code) code_result.append(code_leaf) return code_result
测试结果如下:
if __name__ == '__main__': tree_obj = HuffmanTree([('A', 0.01), ('B', 0.43), ('C', 0.15), ('D', 0.02), ('E', 0.03), ('F', 0.21), ('G', 0.07), ('H', 0.08)]) huf_tree = tree_obj.CreateHuffmanTree() huf_code = tree_obj.HuffmanEncode(huf_tree) for index in range(len(huf_code)): print('{0}: {1}'.format(huf_code[index][0], huf_code[index][1]))
总结