Huffman Code
Common Core Program Outcomes (Applicable to CS/CLCO/DS Majors):
SLO #1 Programming proficiency:
Demonstrate proficiency in at least two programming languages, using design principles to solve problems.
SLO #2 Teamwork and Ethics:
Work effectively within a team, communicate clearly, and navigate ethical and societal issues.
SLO #3 Self-study Capability:
Utilize self-directed learning resources for problem-solving.
SLO #4 Professional Preparedness: Preparedness for employment in the field, demonstrated through capstone project quality and employment outcomes.
Computer Science (BA/BS) Specific Outcome:
SLO #5 Discipline Outcome:
Solid understanding of core concepts (data structures, algorithms, databases, networking, communication, security, parallel/distributed computing).
This assignment specifically addresses Outcomes 2 and 5.
Huffman is a variable-length binary code to represent characters. Huffman's Algorithm is
a greedy algorithm that produces an optimal binary code: the most frequently appearing character
is represented by the least number of bits. In Part 1 of this assignment, you will be
implementing the Huffman Algorithm presented in Section 4.4.2 of your textbook.
Objective
Given a file with ASCII characters, produce the Huffman Code for the file.
Write the code to compress and decompress a file of ASCII characters.
There will be some restrictions to simplify implementation.
Part 1: The Huffman Tree
Given an input file with ASCII characters A-Z (restriction), using Huffman's greedy algorithm, produce a binary
tree that produces the binary Huffman code for each character in the file.
The priority queue to be used must be implemented as a heap.
Create two classes:
The heap will be an implementation of your priority queue. The most appropriate thing to do would be to
create two classes: Priority Queue and Heap. The Priority Queue class would be called from the Huffman class.
The Priority Queue Class would implement the priority queue as a heap, hiding the heap implementation from Huffman.
The Priority Queue member functions would simply call the Heap class member functions. However, to simplify
this implementation, the Huffman class will directly call the Heap class member functions.
The Huffman Class will have the following public member functions:
- Constructor
- Destructor
- Display Code
The private section, will declare a pointer to a huffmannode. You will also have at least one
private member function declared, the function to produce an inorder display of the huffman tree.
Constructor
You will need the following declaration:
struct huffmannode{
char symbol;
int frequency;
int nodetype;
huffmannode * left;
huffmannode *right;
};
The Constructor is passed a string, which is a file name. The constructor should open the file
and use cassert to make sure the file opened properly. The constructor can assume the file will only contain
valid data. Valid data consists of the upper-case alphabet: A-Z. By assuming only the letters A-Z, a 26-entry
array can be used to count the frequency of each character.
After opening the file, the constructor should:
- Loop reading each character in the file and counting the frequency of each character.
- Using the frequency of each character, build the priority
queue (See NeaPolitan). The nodetype
for each node will indicate it is an internal node or a
character node (leaf). You should only make an entry in the heap
if there is at least one character occurring in the file. In other words, if the character F does not
appear in the file, do not enter a node with zero occurrences in the priority queue.
- Use the Heap Display member function to display the heap.
- Go into a for loop for i=1 to i=n-1 removing two elements from the priority queue, creating a new
huffmannode such that the left child pointer points to the first element removed, the right child
pointer points to the second element removed, frequency is the sum of the two frequencies, and the node
type indicates it is an internal, frequency node.
Insert the new node into the priority queue. See
Neapolitan for pseudocode.
At the completion of this loop, you should have a heap with one item consisting of a binary tree that represents the
Huffman code for each character in the file. A left branch is a binary 0 in the code and a right branch is a 1 in the
binary code.
- Remove the one node from the heap. It becomes the finalhuffmantree declared in the private portion of
the class definition
- Print a message that the Huffman Code has been built and exit
Display Code member function
- Call a private in-order traversal routine that prints the character and
binary huffman code as well as the frequency of the character for each leaf node and
the frequency count of each internal node.
Destructor
This function must deallocate all nodes on the huffman tree. It is recommended you reverse the process
you used to create the huffman tree.
- Insert your huffman tree into a Heap. You now have a heap with one item
- Enter a do while loop (test at the bottom), while the size of the heap is greater than one.
- At each iteration, delete an element from the heap. Create a new huffmannode with the left and
right pointers. Delete the item removed. Insert the two new huffmannodes onto the heap.
Files: huffman.cpp and huffman.h
Heap Class
You are going to implement a priority queue as a Minimum heap class.
Private:
The private section of the heap class will have:
- An array of huffmannodes. Since there is a restriction of 26 different characters, 26 should be the
maximum size you need. However, declare it to be size 100 (use a constant).
- An integer representing the size of the heap
- You will need a private rebuildheap member function.
Member Functions:
- Constructor: sets size to 0.
- HeapIsEmpty returns true or false.
- HeapInsert, inserts a huffmannode into the heap.
- HeapDelete, removes highest priority item (minimum frequency) and returns it.
- Displays the items in the heap
Files: PQHeap.cpp and PQHeap.h
Main Function
#include "huffman.h"
using namespace std;
int main()
{
string fn;
cout<<"enter the name of a file containing the characters"< cin>>fn;
huffman myhuffman(fn);
myhuffman.displayCode();
}//end main
Files: huffmancode.cpp
Get Started
Use the following files to get started:
makefile,
huffman.h
huffman.cpp
PQHeap.h
PQHeap.cpp
Turnin: Select huffman
Link to my turnin.out file.
Part 2: Compression/Decompression
Use the Huffman Tree to Compress and Decompress the file
In Part 2, you will develop a deep understanding of binary code manipulation. Use the binary compression tree built in Part 1 of this assignment
to compress and decompress the given file.
Changes to huffmancode.cpp
After prompting for the name of the input input file,
- Prompt the user for the name of the file to be used for compression.
- Prompt the user for the name of the file to be used for decompression.
- Declare the Huffman object as before.
- Call displayCode to display the in-order traversal of the Huffman tree.
- Call a new public member function, compressdecompress, passing the name of
the file to contain the compressed data and the name of the file to contain the decompressed data.
- Read and print the contents of the original file along side the contents of the
Decompressed file, one character per line.
Changes to huffman.h
- Add the declaration of the new public member function compressdecompress. It should take
two string parameters for the compress and decompress file names.
- You will need additional private data members:
- A string to remember the name of the original input file
- An array to store the compressed string and its length for each of the
26 characters of the alphabet.
- The number of characters in the original input file.
Changes to huffman.cpp
Add the code for the compressdecompress member function:
- Reopen the input file for reading. Open the file for compression file for writing.
- Use the compression tree to compress the file writing the data into the compression
file.
- Close the compression file. Reopen the compression file for reading. Open the
decompressed file for writing.
- Decompress the compressed file writing the data into the decompressed file.
Turnin
Use turnin to submit your code. Select huffmancomp. Your makefile will also be copied so that I can easily run your code.
Link to my turnin.out file.
The Group
This is a group assignment.
- Each group member should individually analyze the assignment for understanding
- The group should then meet to discuss how the work will be divided among the group
members
- The group will submit a formal document:
- Defines what each group member will be responsible for
- Defines a detailed schedule of when each individual part of the project will
be completed.
- Defines how the work will be integrated.
- An individual status report will be submitted that identifies
- Each group member's individual status (what has been completed, what remains
to be completed).
- The group's overall status (what has been completed, what remains).
- On schedule? If not, how will it be rectified.
Repeat for Part 2
The formal plan must include fixes to anything that was wrong in Part 1.
This page is maintained by
Barbara Bracken
Last Modified
10/14/2025