Hash tables.
- netmaestro
- PureBasic Bullfrog
- Posts: 8451
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Large data sizes are where hash tables come into their own. One common use for them is in chess algorithms where the evaluations for many millions of possibilities must be searched over and over again. You won't find a mainstream chess program made today that doesn't employ hash tables, as there just isn't anything that comes close for huge jobs.
srod, thanks for posting that! I'll be putting it to work pretty soon coming up with a chess program that'll beat the world champion's cousin's retarded brother.
srod, thanks for posting that! I'll be putting it to work pretty soon coming up with a chess program that'll beat the world champion's cousin's retarded brother.
BERESHEIT
netmaestro wrote:srod, thanks for posting that! I'll be putting it to work pretty soon coming up with a chess program that'll beat the world champion's cousin's retarded brother.

Too late, I wrote one of those a few years back except that it had a tendency to throw the game after two moves!
I may look like a mule, but I'm not a complete ass.
-
- Enthusiast
- Posts: 746
- Joined: Fri Jul 14, 2006 8:53 pm
- Location: Malta
- Contact:
you people are too far ahead of me, but, euh, could i look at it like a long array like this:
then sort the array on the hash and use the hash to find the string back using binairy search?
Code: Select all
Structure hash_table
hash.s
text.s
EndStructure
Dim test.hash_table(100)
For n = 1 To 100
test(n)\text = Str(Random(999999))+Str(Random(999999))+Space(100)
test(n)\hash = MD5Fingerprint(test(n)\text,100)
Next n
( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB)
( The path to enlightenment and the PureBasic Survival Guide right here... )
( The path to enlightenment and the PureBasic Survival Guide right here... )
blueznl wrote:you people are too far ahead of me, but, euh, could i look at it like a long array like this:
then sort the array on the hash and use the hash to find the string back using binairy search?Code: Select all
Structure hash_table hash.s text.s EndStructure Dim test.hash_table(100) For n = 1 To 100 test(n)\text = Str(Random(999999))+Str(Random(999999))+Space(100) test(n)\hash = MD5Fingerprint(test(n)\text,100) Next n
If you're going to do this, why not just sort on the original text and then use a binary search etc? The hash is pretty much redundant in the above code as you're using a text string to index another text string!
A proper hash table has a number of advantages over the method you've outlined, including:
- faster access - could be as quick as single lookup (order 1) as opposed to order log2 (n) etc.
- the hash is numeric as opposed to textual and thus links directly into the underlying table/array
- no need to continually ensure that the data is ordered (to facilitate the binary chop)
There's probably more!

Hash tables are used extensively.
I may look like a mule, but I'm not a complete ass.
'cause searching through a list with elements of variable length means you'll need either another list to index the exact starting positions of the strings, or you have to keep track of all the string lengths
also on very long yet similar texts a search on the actual text is perhaps slower, i dunno'
simplified: i haven't got a clue what we're doing here
also on very long yet similar texts a search on the actual text is perhaps slower, i dunno'

simplified: i haven't got a clue what we're doing here

( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB)
( The path to enlightenment and the PureBasic Survival Guide right here... )
( The path to enlightenment and the PureBasic Survival Guide right here... )
Sorry, I'm not sure what you mean?Inf0Byt3 wrote:Hi srod, any chance for a binary version?
Great work!
Do you mean one which creates hashes/indexes based upon binary data?
If so, then it could easily be achieved by first creating a MD5Fingerprint hash code and then feeding this to the above library etc. It's probably not worth adjusting the library to work with binary data directly, just throw a little function into your own program which creates MD5Fingerprints etc.
I may look like a mule, but I'm not a complete ass.
-
- Enthusiast
- Posts: 746
- Joined: Fri Jul 14, 2006 8:53 pm
- Location: Malta
- Contact:
I have a question
about these 2 lines in HT_AddItem(HTid, key$, value) procedure
AddElement(_HTentries())
PokeL(index, _HTentries())
isnt the poke writing to the address of the linked list?
and the linked list is private memory (memory being handled by PB)
I am following the logic, and can see the results...but I am lost with this.
about these 2 lines in HT_AddItem(HTid, key$, value) procedure
AddElement(_HTentries())
PokeL(index, _HTentries())
isnt the poke writing to the address of the linked list?
and the linked list is private memory (memory being handled by PB)
I am following the logic, and can see the results...but I am lost with this.
I may not help with your coding
Just ask about mental issues!
http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
Just ask about mental issues!
http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
- DoubleDutch
- Addict
- Posts: 3220
- Joined: Thu Aug 07, 2003 7:01 pm
- Location: United Kingdom
- Contact:
I already use a hash table (with collision possibility) in my spellchecker routine. After table creation I post sort it so that it can be searched with a binary search. Because of collision possibility I then do a sequential search from the start of the collision "pack".
It seems pretty fast and it searches up to 2x160,000 words on each cursor position change (as you type spell checking). Even on a slow machine you don't know it's doing it (apart from the fact that it underlines the spelling errors).
The reason I check up to 2x the number of words is because if they type a space I have to check left and right words.
The post sort/binary search method can only be used if data is sorted each time it's added to the database.
It seems pretty fast and it searches up to 2x160,000 words on each cursor position change (as you type spell checking). Even on a slow machine you don't know it's doing it (apart from the fact that it underlines the spelling errors).
The reason I check up to 2x the number of words is because if they type a space I have to check left and right words.

The post sort/binary search method can only be used if data is sorted each time it's added to the database.
https://deluxepixel.com <- My Business website
https://reportcomplete.com <- School end of term reports system
https://reportcomplete.com <- School end of term reports system
Hi,kinglestat wrote:I have a question
about these 2 lines in HT_AddItem(HTid, key$, value) procedure
AddElement(_HTentries())
PokeL(index, _HTentries())
isnt the poke writing to the address of the linked list?
and the linked list is private memory (memory being handled by PB)
I am following the logic, and can see the results...but I am lost with this.
the Poke is writing to the address pointed to by the value of 'index' and it is writing the address of the newly created element in the linked list.
Let me try and explain.
A hash table is simply a 1 dimensional array of pointers. For example :
Code: Select all
HT_CreateHashTable(20)
When you add a key/value to the hash table, the key gets 'hashed' in order to produce an index into the array of pointers.
For example, with the hash table created above, the line :
Code: Select all
HT_AddItem(HTid, "entry1",100)
This value (13) means that the value we wish to store (100 in this case) will, in principal, be placed within the 13th entry in the hash table.
However, because the 13th entry may already contain values from other keys generating the same hash code, we need a way of storing multiple items in this same element. Thus, instead of placing the value of 100 directly in this 13th element we instead place it into a linked list and place the address of this linked list element into the 13th element of the hash table (with the PokeL() statement).
This is done in such a way that all the entries generating the same hash code from the same table form consecutive elements in the linked list.
Now, when you then issue the following command :
Code: Select all
HT_GetItem(HTid, "entry1")
I hope this helps.

Last edited by srod on Wed Nov 07, 2007 12:13 pm, edited 1 time in total.
I may look like a mule, but I'm not a complete ass.