Page 1 of 1
Don't hassle the Huff'
Posted: Sun Sep 21, 2008 2:30 pm
by Fluid Byte
I am currently working on a implementation of the Huffman algorithm in PB. I really thought I have done this in a couple of days but it's taking much longer than expected now. At the moment I am able to count the character frequency, construct the Huffman-Tree and create the Huffman-Code for each character.
For example I have the input string "MISSISSIPPI". That would result in the following Huffman-Codes: S = 0, I = 11, P = 101, M = 100. When you use these codes to encode the string you get: 100110011001110110111. Now you split it up in put into three bytes: 10011001 / 10011101 / 10111.
I hope that this is correct so far. Because I found a Huffmancode on this forum wich results in 5 instead of 3 Bytes for the encoded string.
My question is now how do you decode the string again? I think in order to be able to decode the string even after the applications ends you have to save some addtional information in the encoded string. At least I read that on some sites I found via Google.
Posted: Sun Sep 21, 2008 3:15 pm
by Road Runner
The only other information you should need to decode your bit stream is the symbol table you just generated:S = 0, I = 11, P = 101, M = 100
You scan your coded data from the start and compare with each symbol in your symbol table. When you get a match, you have found the next symbol, you remove those bits and start again to look for the next symbol.
e.g.
Code = 100110011001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 100, matches M in symbol table
Remove those 3 bits and start again to find next symbol
Code = 110011001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 0011001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 011001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 11001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 01110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 1110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 10110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 101, matches P in symbol table
Remove those 3 bits and start again to find next symbol
Code = 10111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 101, matches P in symbol table
Remove those 3 bits and start again to find next symbol
Code = 11
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits. There are no remaining bits in the code so the decode is finished and the result is MISSISSIPPI
..editted to correct cut and paste error in P.
Posted: Sun Sep 21, 2008 4:03 pm
by blueb
Fluid Byte...
You might take a look at:
http://www.pbcrypto.com/view.php?algorithm=huffman
These are PowerBASIC files that focus on encryption techniques.
--blueb
Posted: Mon Sep 22, 2008 2:58 pm
by Fluid Byte
Hehe, now that's a really
detailed way of doing it. Actually it would have been enuff to show the decoding of the first two characters but you never know.
Thank you both, got it working now.

Posted: Mon Sep 22, 2008 3:51 pm
by Kaeru Gaman
thanx for the example, Fluid & RoadRunner...
never cared about Huff before, know I know what it is...
Posted: Mon Sep 22, 2008 4:54 pm
by Fluid Byte
@Kaeru:
Here is the page that got me started with Huffman (really easy to understand):
http://www.ziegenbalg.ph-karlsruhe.de/m ... ierung.htm
It features an excellent explanation of the algorithm and uses the same demoword as here.
Here's another site with a good info as well plus a nice javascript to demonstrate the algorithm in realtime:
http://www.inf.fh-flensburg.de/lang/alg ... uffman.htm
Both in German only, sorry.

Posted: Mon Sep 22, 2008 6:46 pm
by Rook Zimbabwe