Binary search vs. Hash table

Little John · Post by **Little John** » Mon Aug 31, 2009 8:38 pm

Another data structure for fast searching that I have used in the past is a ternary search tree.
Unfortunately, I can't claim that I really understand how it works.

Regards, Little John

Post by **idle** » Tue Sep 01, 2009 10:42 am

Hash tables are prickly beasts which are often hard to find optimal solution. Srods table is a actually a good generic implementation and as far as it's utility goes it sure beats the alternatives of having to suffer the indignation of the Wire Brush and Dettol because you've been sitting on your arse to long!" (Martha.. fetch the wire brush and Dettol my piles have flared up again! ((a Billy Connoly joke for those that don't know))

The choice to use a hash table is really governed by the problem, you need to consider it's structure, it's size and hash function to suit the data and there's no perfect solution, there's also the issue of system performance to consider, page faults and it just goes on and on.

If you have all the time in the world to load your data a binary search is probably the way to go, though it really does depend upon what your looking up and or storing.

kinglestat · Post by **kinglestat** » Wed Sep 02, 2009 11:03 am

and to add my own view:

>regarding large ram requests; keep in mind that on modern OSes, the OS will only grant a percentage of real RAM, and a percentage of Virtual RAM; I think the ratio is 40/60 in windows as is in redhat linux. I rarely create a hash with less than 9 million elements, yet app rarely uses more than 32mb of ram.

>Secondly, with some pre-processing, you can convert a hash table to a lookup table - like a foreignkey of a database. To this effect, for example I use sqlite memory db, since then I can mix my sql statements. I'll post some pseudo code later on this

>third, for efficiency you can use just 1 hash table for all your hashing need providing you use a generic hash function (like crc32) which gives few collissions; crc32 is quite good in that. I am using mixed C/PB code to further increase the hash efficiency but the gain is minor...about 5%

http://www.purebasic.fr/english/viewtop ... macro+hash

But there is a final twist for "small" hash tables
There was a post once for creating small hashes using macros...was a question I asked in fact, but I can't find it. Removing the overhead of calling functions (by using macros) and using as simple a hash as you can get away with, well, you have even faster hashing.

Post by **Fred** » Thu Sep 03, 2009 6:05 pm

I just added an optional param to NewMap, and with a 2.000.000 map size it's indeed much faster:

Binary lookup: 11 secs
Hashtable: 1,6 sec

I didn't used the srod hashtable, as i didn't have it.

srod · Post by **srod** » Thu Sep 03, 2009 6:10 pm

Sounds great Fred.