Murmur3 hash

Share your advanced PureBasic knowledge/code with the community.
User avatar
Keya
Addict
Addict
Posts: 1890
Joined: Thu Jun 04, 2015 7:10 am

Murmur3 hash

Post by Keya »

ive been trying to write the Murmur3 hash in Purebasic, google suggests it hasnt been done before but it seems like a simple enough algorithm and from my reading today it has a lot of excellent qualities such as collision resistance, distribution, and exceptional speed mainly as it works on 4 bytes at a time - this should be the fastest hash Purebasic has ever seen if benchmarks are anything to go by, and we probably wont bother with CRC32 again!

This is unfinished so perhaps i shouldnt be posting it to Tips&Tricks just yet lol (i did set In Progress in the topic!), but i'm sure we'll get there soon!!!

It is currently working fine for one of the test vectors, so the code is nearrrrly there, but im at the stage now where im struggling to work out the bits i've gotten wrong lol

my version is based on code from:
https://en.wikipedia.org/wiki/MurmurHash
http://code.google.com/p/smhasher/wiki/MurmurHash3
http://smhasher.googlecode.com/svn/trun ... rHash3.cpp
http://programmers.stackexchange.com/qu ... -and-speed

and I found this useful page with lots of test vectors...
http://stackoverflow.com/questions/1474 ... st-vectors

currently mine just works against one of them, "aaaa", returning the expected hash. There are some issues still to deal with, because it works with Unsigned 32bit, so Purebasic's 32bit signed Long is throwing me off a bit there. The "tail" section in particular needs some work (that section isn't used in this demo as "aaaa" is a perfect 4-byte chunk, so there's no tail)

As you can see she's not TOOOO hard... just finicky in dotting the i's and crossing the t's heehee :) :)

Anyway I post it here to hopefully spur some interest in it!

[EDIT] see wilbert's excellent asm implementation here!

Code: Select all

CompilerIf #PB_Compiler_Unicode = 1
 CompilerError("Unicode not supported for this demo, use Ascii compile")
CompilerEndIf

Structure longx
  l.Long[0]
EndStructure

Structure asciix
  a.Ascii[0]
EndStructure


Procedure.l murmur3_32(key, llen.i, seed.l)
  c1.l = $cc9e2d51
  c2.l = $1b873593
  r1.l = 15
  r2.l = 13
  m.l = 5
  n.l = $e6546b64
  k.l = 0
  i.l = 0
  
  hash.l = seed
  
  nblocks.l = llen / 4
  *blocks.longx = key
  For i = 0 To nblocks-1
    k = PeekL(*blocks\l[i])    <-- it seems i might be doing a memory read twice here instead of the intended once, as im having to use PeekL to read the value, otherwise we're using the address not the value
    k * c1 
    k = (k << r1) | (k >> (32 - r1)) 
    k * c2 
    
    hash = hash ! k 
    hash = ((hash << r2) | (hash >> (32 - r2))) * m + n     
  Next i  
  
  *tail.asciix = key + (nblocks * 4)
  k1.l = 0
  
  ;i dont understand this part because only Case 1 modifies the hash variable
  Select (llen & 3)
    Case 3:
      k1 = k1 ! (PeekA(*tail\a[2]) << 16)
    Case 2:
      k1 = k1 ! (PeekA(*tail\a[1]) << 8)
    Case 1: 
      k1 = k1 ! (PeekA(*tail\a[0]))
      k1 * c1
      k1 << 15
      k1 * c2
      hash ! k1
  EndSelect 
  
  hash ! llen 
  hash ! (hash >> 16) 
  hash * $85ebca6b 
  hash ! (hash >> 13) 
  hash * $c2b2ae35 
  hash ! (hash >> 16)   
  ProcedureReturn hash 
EndProcedure


;// murmur3_32("aaaa", len=4, seed=$9747b28c) = $5A97808A according to http://stackoverflow.com/questions/14747343/murmurhash3-test-vectors

test.s = "aaaa"
llen.l = Len(test)
seed.l =  $9747b28c  ;0
hash.l = murmur3_32(@test, llen, seed)
Debug(Hex(hash,#PB_Long))
Last edited by Keya on Sat Nov 07, 2015 12:53 pm, edited 2 times in total.
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: [In progress] Murmur3 hash!

Post by wilbert »

Windows (x64)
Raspberry Pi OS (Arm64)
User avatar
Keya
Addict
Addict
Posts: 1890
Joined: Thu Jun 04, 2015 7:10 am

Re: [In progress] Murmur3 hash!

Post by Keya »

wilbert wrote:Did you see this post
http://www.purebasic.fr/english/viewtop ... 66#p375166
lol obviously no i didn't!!! :oops:

Strange reason why though... i searched for: "Murmur" site:purebasic.fr
And it only had 3 results, none of which included that one of yours!!! (wth!?!??!)
http://www.google.com/search?q="Murmur"+site:purebasic.fr&ie=utf-8&oe=utf-8 !?!????
I dont understand... the only thing I can think of is because it doesn't actually contain the word "Murmur" on it's own, just "Murmurhash"! i would've thought "murmur" would encompass "murmur", "murmur3", "murmurhash" et al... apparently not!

..... anyway, see, isn't Murmur interesting!?!? lol :) :) :)
Thankyou AGAIN wilbert!!!
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: [In progress] Murmur3 hash!

Post by wilbert »

Keya wrote:..... anyway, see, isn't Murmur interesting!?!? lol :) :) :)
It's indeed a nice and interesting algorithm. Fast and not so many collisions. :)
Windows (x64)
Raspberry Pi OS (Arm64)
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Murmur3 hash

Post by wilbert »

It might be worth to look at FarmHash or MetroHash also but I don't know how complicated they are to implement.
Windows (x64)
Raspberry Pi OS (Arm64)
Post Reply