I am currently working on a implementation of the Huffman algorithm in PB. I really thought I have done this in a couple of days but it's taking much longer than expected now. At the moment I am able to count the character frequency, construct the Huffman-Tree and create the Huffman-Code for each character.
For example I have the input string "MISSISSIPPI". That would result in the following Huffman-Codes: S = 0, I = 11, P = 101, M = 100. When you use these codes to encode the string you get: 100110011001110110111. Now you split it up in put into three bytes: 10011001 / 10011101 / 10111.
I hope that this is correct so far. Because I found a Huffmancode on this forum wich results in 5 instead of 3 Bytes for the encoded string.
My question is now how do you decode the string again? I think in order to be able to decode the string even after the applications ends you have to save some addtional information in the encoded string. At least I read that on some sites I found via Google.
Don't hassle the Huff'
- Fluid Byte
- Addict
- Posts: 2336
- Joined: Fri Jul 21, 2006 4:41 am
- Location: Berlin, Germany
Don't hassle the Huff'
Windows 10 Pro, 64-Bit / Whose Hoff is it anyway?
-
- User
- Posts: 48
- Joined: Tue Oct 07, 2003 3:10 pm
The only other information you should need to decode your bit stream is the symbol table you just generated:S = 0, I = 11, P = 101, M = 100
You scan your coded data from the start and compare with each symbol in your symbol table. When you get a match, you have found the next symbol, you remove those bits and start again to look for the next symbol.
e.g.
Code = 100110011001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 100, matches M in symbol table
Remove those 3 bits and start again to find next symbol
Code = 110011001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 0011001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 011001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 11001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 01110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 1110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 10110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 101, matches P in symbol table
Remove those 3 bits and start again to find next symbol
Code = 10111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 101, matches P in symbol table
Remove those 3 bits and start again to find next symbol
Code = 11
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits. There are no remaining bits in the code so the decode is finished and the result is MISSISSIPPI
..editted to correct cut and paste error in P.
You scan your coded data from the start and compare with each symbol in your symbol table. When you get a match, you have found the next symbol, you remove those bits and start again to look for the next symbol.
e.g.
Code = 100110011001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 100, matches M in symbol table
Remove those 3 bits and start again to find next symbol
Code = 110011001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 0011001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 011001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 11001110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 001110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 01110110111
Try 1 bit, value = 0, matches S in symbol table
Remove that 1 bit and start again to find next symbol
Code = 1110110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits and start again to find next symbol
Code = 10110111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 101, matches P in symbol table
Remove those 3 bits and start again to find next symbol
Code = 10111
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 10, no match in symbol table.
Try 3 bits, value = 101, matches P in symbol table
Remove those 3 bits and start again to find next symbol
Code = 11
Try 1 bit, value = 1, no match in symbol table.
Try 2 bits, value = 11, matches I in symbol table
Remove those 2 bits. There are no remaining bits in the code so the decode is finished and the result is MISSISSIPPI
..editted to correct cut and paste error in P.
Last edited by Road Runner on Sun Sep 21, 2008 6:53 pm, edited 1 time in total.
Fluid Byte...
You might take a look at: http://www.pbcrypto.com/view.php?algorithm=huffman
These are PowerBASIC files that focus on encryption techniques.
--blueb
You might take a look at: http://www.pbcrypto.com/view.php?algorithm=huffman
These are PowerBASIC files that focus on encryption techniques.
--blueb
- It was too lonely at the top.
System : PB 6.21(x64) and Win 11 Pro (x64)
Hardware: AMD Ryzen 9 5900X w/64 gigs Ram, AMD RX 6950 XT Graphics w/16gigs Mem
System : PB 6.21(x64) and Win 11 Pro (x64)
Hardware: AMD Ryzen 9 5900X w/64 gigs Ram, AMD RX 6950 XT Graphics w/16gigs Mem
- Fluid Byte
- Addict
- Posts: 2336
- Joined: Fri Jul 21, 2006 4:41 am
- Location: Berlin, Germany
Hehe, now that's a really detailed way of doing it. Actually it would have been enuff to show the decoding of the first two characters but you never know.
Thank you both, got it working now.

Thank you both, got it working now.

Windows 10 Pro, 64-Bit / Whose Hoff is it anyway?
- Kaeru Gaman
- Addict
- Posts: 4826
- Joined: Sun Mar 19, 2006 1:57 pm
- Location: Germany
- Fluid Byte
- Addict
- Posts: 2336
- Joined: Fri Jul 21, 2006 4:41 am
- Location: Berlin, Germany
@Kaeru:
Here is the page that got me started with Huffman (really easy to understand):
http://www.ziegenbalg.ph-karlsruhe.de/m ... ierung.htm
It features an excellent explanation of the algorithm and uses the same demoword as here.
Here's another site with a good info as well plus a nice javascript to demonstrate the algorithm in realtime:
http://www.inf.fh-flensburg.de/lang/alg ... uffman.htm
Both in German only, sorry.
Here is the page that got me started with Huffman (really easy to understand):
http://www.ziegenbalg.ph-karlsruhe.de/m ... ierung.htm
It features an excellent explanation of the algorithm and uses the same demoword as here.
Here's another site with a good info as well plus a nice javascript to demonstrate the algorithm in realtime:
http://www.inf.fh-flensburg.de/lang/alg ... uffman.htm
Both in German only, sorry.

Windows 10 Pro, 64-Bit / Whose Hoff is it anyway?
- Rook Zimbabwe
- Addict
- Posts: 4322
- Joined: Tue Jan 02, 2007 8:16 pm
- Location: Cypress TX
- Contact:
Interesting link to how the code came about:
http://www.huffmancoding.com/david/algorithm.html

Long winded English language explanation:
http://www.maths.abdn.ac.uk/~igc/tch/mx ... ode59.html
http://www.huffmancoding.com/david/algorithm.html

Long winded English language explanation:
http://www.maths.abdn.ac.uk/~igc/tch/mx ... ode59.html