Binary search vs. Hash table

Share your advanced PureBasic knowledge/code with the community.
Little John
Addict
Addict
Posts: 4777
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Post by Little John »

Another data structure for fast searching that I have used in the past is a ternary search tree.
Unfortunately, I can't claim that I really understand how it works. ;-)

Regards, Little John
User avatar
idle
Always Here
Always Here
Posts: 5836
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Post by idle »

Hash tables are prickly beasts which are often hard to find optimal solution. Srods table is a actually a good generic implementation and as far as it's utility goes it sure beats the alternatives of having to suffer the indignation of the Wire Brush and Dettol because you've been sitting on your arse to long!" (Martha.. fetch the wire brush and Dettol my piles have flared up again! ((a Billy Connoly joke for those that don't know))

The choice to use a hash table is really governed by the problem, you need to consider it's structure, it's size and hash function to suit the data and there's no perfect solution, there's also the issue of system performance to consider, page faults and it just goes on and on.

If you have all the time in the world to load your data a binary search is probably the way to go, though it really does depend upon what your looking up and or storing.
kinglestat
Enthusiast
Enthusiast
Posts: 746
Joined: Fri Jul 14, 2006 8:53 pm
Location: Malta
Contact:

Post by kinglestat »

and to add my own view:

>regarding large ram requests; keep in mind that on modern OSes, the OS will only grant a percentage of real RAM, and a percentage of Virtual RAM; I think the ratio is 40/60 in windows as is in redhat linux. I rarely create a hash with less than 9 million elements, yet app rarely uses more than 32mb of ram.

>Secondly, with some pre-processing, you can convert a hash table to a lookup table - like a foreignkey of a database. To this effect, for example I use sqlite memory db, since then I can mix my sql statements. I'll post some pseudo code later on this

>third, for efficiency you can use just 1 hash table for all your hashing need providing you use a generic hash function (like crc32) which gives few collissions; crc32 is quite good in that. I am using mixed C/PB code to further increase the hash efficiency but the gain is minor...about 5%

http://www.purebasic.fr/english/viewtop ... macro+hash

But there is a final twist for "small" hash tables
There was a post once for creating small hashes using macros...was a question I asked in fact, but I can't find it. Removing the overhead of calling functions (by using macros) and using as simple a hash as you can get away with, well, you have even faster hashing.
I may not help with your coding
Just ask about mental issues!

http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
Fred
Administrator
Administrator
Posts: 18162
Joined: Fri May 17, 2002 4:39 pm
Location: France
Contact:

Post by Fred »

I just added an optional param to NewMap, and with a 2.000.000 map size it's indeed much faster:

Binary lookup: 11 secs
Hashtable: 1,6 sec

I didn't used the srod hashtable, as i didn't have it.
srod
PureBasic Expert
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Post by srod »

Sounds great Fred. 8)
I may look like a mule, but I'm not a complete ass.
Post Reply