Page 1 of 3

A OOP class for dynamically implementing hash tables!

Posted: Tue Dec 11, 2007 10:59 pm
by srod
Update : 15/08/08.
Have added a new method; ClearItems(), which removes all items from the underlying hash table but does not destroy the table. This is much faster than simply enumerating all keys and removing the items individually.

The download link is at the bottom of this post.


==============================================
Update : 27/12/07.
Have added a new method; EnumKeys(), which allows the host application to iterate through all the keys (and all of the data) of the underlying hash table. This is useful if some of the fields being stored contain pointers to memory which needs to be freed by the host application etc.

Have added a second demo program to show this method being used and also there is a detailed explanation of this method in the user guide.

The download link is at the bottom of this post.


==============================================
Hi,

right this little offering is not merely a conversion to OOP of my previous hash table utility (http://www.purebasic.fr/english/viewtopic.php?t=24683)
but represents a far more refined and way way more powerful cross-platform 'engine'.

It has grown out of a need of mine for a hash table capable of handling huge amounts of data - and this is it! :wink:

I've included quite a comprehensive user guide with this class and so will only sketch out the main differences between this library and my previous one for hash tables etc. Indeed, if you are unsure what a hash table is and how it can be used, then there is a short and unintelligible ( :wink: ) section in the user guide devoted to this.
  1. This new engine allows you to store any kind of data within a hash table; strings, structures, pointers to other hash tables etc. All strings are freed automatically etc. (The old hash table could only store a single 32-bit value in each item.)
  2. Far less memory overhead per item stored.
    Byte for byte, there is a saving of roughly 12 bytes per item (if storing just single 32-bit values etc), which is a lot when you scale this up!
  3. Faster access to individual items.
    My previous engine used a global linked list to store all the data from ALL the hash tables in an application. That is, if you had 10 hash tables, then all the data for all of these tables was stored in the same linked list! This was convenient because of the need to handle collisions etc.
    My new engine does not use any linked lists at all. Instead, all items sharing the same hash ('collisions') are bundled into an ordered array and are subsequently retrieved through a combination of the gerenated hash code + a binary search! Very fast!
  4. Optional protection for multithreaded access to individual hash table objects.
I will leave to the user guide + the demo program the task of instructing developers on how to use this library, but I will make very clear right now the need for anyone intending to use this class to read the guide very carefully! I mean, fine toothpick time! :)

The download includes all the source, a demo + the user guide. It also includes the source to my 'OOP array class' (http://www.purebasic.fr/english/viewtopic.php?t=29506) as this is required by the hash table class.

(Extract the zip whilst 'using folder names' and it should run straight out of the box. That, or your money back! :) )

Download

Regards.

Posted: Tue Dec 11, 2007 11:04 pm
by rsts
Another lesson and tool from one of the masters.

Many thanks.

Posted: Tue Dec 11, 2007 11:11 pm
by milan1612
Thanks srod :P , especially for the first one!

Posted: Tue Dec 11, 2007 11:19 pm
by srod
You're both welcome.

Milan, bugs withstanding, this new hash table should be far better than the first one. My advice would be to consider using this one instead. :)

Posted: Tue Dec 11, 2007 11:25 pm
by milan1612
srod wrote:You're both welcome.

Milan, bugs withstanding, this new hash table should be far better than the first one. My advice would be to consider using this one instead. :)
I'm currently diving into your code, looks complicated (haven't opened the manual yet :lol: )
EDIT: Wow, the manual looks very professional :shock:

Posted: Tue Dec 11, 2007 11:27 pm
by Dare
Thanks.

Posted: Tue Dec 11, 2007 11:32 pm
by srod
milan1612 wrote:I'm currently diving into your code, looks complicated (haven't opened the manual yet :lol: )
Haven't you read the thread in the off-topic section about monkeys being cleverer than humans?

Well, here's the proof!

No wait, did I mean ...... :wink:

EDIT: Wow, the manual looks very professional :shock:
Stick a monkey in front of a keyboard...

:)

Posted: Wed Dec 12, 2007 2:34 pm
by milan1612
Well, I'm definitely not the monkey :lol:

Posted: Wed Dec 12, 2007 2:39 pm
by srod
I'm the monkey!

Posted: Thu Dec 13, 2007 11:27 am
by byo
Thank you. Once again, a big contribution.
You deserve a Christmas with a whole lotta beer.

Posted: Thu Dec 13, 2007 12:18 pm
by srod
You're welcome byo.
You deserve a Christmas with a whole lotta beer.
Deserve has nothing to do with it byo; I'll be falling through a few hedges regardless! :wink:

Posted: Mon Dec 17, 2007 12:08 am
by kinglestat
srod, impressive work and nicely explained
so you finally got around in believing in large hashes eh?
What I like best is that unlike my implementation...which..er...serves me...in your case memory needed is for the unique values of the hash...then memory is used up as needed.
Many thanks, and well done

Posted: Mon Dec 17, 2007 11:39 am
by srod
kinglestat wrote:srod, impressive work and nicely explained
Thank you. I can't release something like this without some kind of guide.
so you finally got around in believing in large hashes eh?
srod looks around sheepishly before skulking off into the shadows!

:)
What I like best is that unlike my implementation...which..er...serves me...in your case memory needed is for the unique values of the hash...then memory is used up as needed.
Yes, aside from the main table of pointers (1 pointer per unique hash) which is allocated at the outset, all other memory allocations are performed only when required. The best way when we are catering for possibly huge hash tables etc.

I'm glad you like it.

Posted: Mon Dec 17, 2007 10:54 pm
by kinglestat
you inspired me to release mine....with some trepidation....but at the moment its buggy...when I fix bug I'll post it

Posted: Thu Dec 27, 2007 1:15 pm
by srod
Update : 27/12/07.
Have added a new method; EnumKeys(), which allows the host application to iterate through all the keys (and all of the data) of the underlying hash table. This is useful if some of the fields being stored contain pointers to memory which needs to be freed by the host application etc.

Have added a second demo program to show this method being used and also there is a detailed explanation of this method in the user guide.

Please see the first post for the download link.