Huffman Coding
 Huffman Coding is a famous Greedy Algorithm.
 It is used for the lossless compression of data.
 It uses variable length encoding.
 It assigns variable length code to all the characters.
 The code length of a character depends on how frequently it occurs in the given text.
 The character which occurs most frequently gets the smallest code.
 The character which occurs least frequently gets the largest code.
 It is also known as Huffman Encoding.
Prefix Rule
 Huffman Coding implements a rule known as a prefix rule.
 This is to prevent the ambiguities while decoding.
 It ensures that the code assigned to any character is not a prefix of the code assigned to any other character.
Major Steps in Huffman Coding
There are two major steps in Huffman Coding
 Building a Huffman Tree from the input characters.
 Assigning code to the characters by traversing the Huffman Tree.
Huffman Tree
The steps involved in the construction of Huffman Tree are as follows
Step01:
 Create a leaf node for each character of the text.
 Leaf node of a character contains the occurring frequency of that character.
Step02:
 Arrange all the nodes in increasing order of their frequency value.
Step03:
Considering the first two nodes having minimum frequency,
 Create a new internal node.
 The frequency of this new node is the sum of frequency of those two nodes.
 Make the first node as a left child and the other node as a right child of the newly created node.
Step04:
 Keep repeating Step02 and Step03 until all the nodes form a single tree.
 The tree finally obtained is the desired Huffman Tree.
Time Complexity
The time complexity analysis of Huffman Coding is as follows
 extractMin( ) is called 2 x (n1) times if there are n nodes.
 As extractMin( ) calls minHeapify( ), it takes O(logn) time.
Thus, Overall time complexity of Huffman Coding becomes O(nlogn).
Here, n is the number of unique characters in the given text.
Important Formulas
The following 2 formulas are important to solve the problems based on Huffman Coding
Formula01:
Formula02:
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= ∑ ( frequency_{i} x Code length_{i })
PRACTICE PROBLEM BASED ON HUFFMAN CODING
Problem
A file contains the following characters with the frequencies as shown. If Huffman Coding is used for data compression, determine
 Huffman Code for each character
 Average code length
 Length of Huffman encoded message (in bits)
Characters  Frequencies 
a  10 
e  15 
i  12 
o  3 
u  4 
s  13 
t  1 
Solution
First let us construct the Huffman Tree.
Huffman Tree is constructed in the following steps
Step01:
Step02:
Step03:
Step04:
Step05:
Step06:
Step07:
Now,
 We assign weight to all the edges of the constructed Huffman Tree.
 Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.
Rule

After assigning weight to all the edges, the modified Huffman Tree is
Now, let us answer each part of the given problem one by one
1. Huffman Code For Characters
To write Huffman Code for any character, traverse the Huffman Tree from root node to the leaf node of that character.
Following this rule, the Huffman Code for each character is
 a = 111
 e = 10
 i = 00
 o = 11001
 u = 1101
 s = 01
 t = 11000
From here, we can observe
 Characters occurring less frequently in the text are assigned the larger code.
 Characters occurring more frequently in the text are assigned the smaller code.
2. Average Code Length
Using formula01, we have
Average code length
= ∑ ( frequency_{i} x code length_{i} ) / ∑ ( frequency_{i} )
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 2.52
3. Length of Huffman Encoded Message
Using formula02, we have
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x 2.52
= 146.16
≅ 147 bits
To gain better understanding about Huffman Coding,
To practice previous years GATE problems on Huffman Coding,
Next Article 0/1 Knapsack Problem
Get more notes and other study material of Design and Analysis of Algorithms.
Watch video lectures by visiting our YouTube channel LearnVidFun.