For multiplication 257 is chosen. This seems to work pretty well and can also be implemented with a shift and add instead of a multiplication (see RKHash asm code).
For modulo 2^32 is chosen so the modulo calculation can be omitted when using 32 bit variables.
So you end up with
Power.l = 257 ^ WindowSize
RKHash.l = (RKHash * 257) + UnsignedByteValueToAdd - (UnsignedByteValueToRemove * Power)
or with shift and add
RKHash.l = (RKHash << 8 ) + RKHash + UnsignedByteValueToAdd - (UnsignedByteValueToRemove * Power)
In its current form it's not very useful since a simple compare would be faster but it can be useful when you want to search for multiple patterns.
I that case you can compare the rolling hash against multiple pattern hashes.
Code: Select all
Procedure.l MPow(Base, Exponent)
; Fast computation of
; Base^Exponent Mod 2^32
! mov edx, [p.v_Base]
! mov ecx, [p.v_Exponent]
! mov eax, 1
!.l0: test ecx, 1
! jz .l1
! imul eax, edx
!.l1: imul edx, edx
! shr ecx, 1
! jnz .l0
ProcedureReturn
EndProcedure
Procedure.l RKHash(*Memory, Size)
; Rabin-Karp hash
; Mul 257, Mod 2^32
CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
! mov rdx, [p.p_Memory]
! mov rcx, [p.v_Size]
! push rbx
! xor eax, eax
! test rcx, rcx
! jz .l1
!.l0: mov ebx, eax
! shl ebx, 8
! add eax, ebx
! movzx ebx, byte [rdx]
! add rdx, 1
! add eax, ebx
! sub rcx, 1
! jnz .l0
!.l1: pop rbx
CompilerElse
! mov edx, [p.p_Memory]
! mov ecx, [p.v_Size]
! push ebx
! xor eax, eax
! test ecx, ecx
! jz .l1
!.l0: mov ebx, eax
! shl ebx, 8
! add eax, ebx
! movzx ebx, byte [edx]
! add edx, 1
! add eax, ebx
! sub ecx, 1
! jnz .l0
!.l1: pop ebx
CompilerEndIf
ProcedureReturn
EndProcedure
; Rabin-Karp rolling hash example
Text.s = "A small demonstration of a rolling hash"
Pattern.s = "rolling"
TxtByteLen = StringByteLength(Text)
PatByteLen = StringByteLength(Pattern)
Power.l = MPow(257, PatByteLen)
PatHash.l = RKHash(@Pattern, PatByteLen)
TxtHash.l = RKHash(@Text, PatByteLen)
*Old.Ascii = @Text
*New.Ascii = @Text + PatByteLen
*End = @Text + TxtByteLen
Repeat
; Compare hash
If PatHash = TxtHash And CompareMemory(@Pattern, *Old, PatByteLen)
BytePosFound = *Old - @Text
Debug "Pattern found at byte position: " + Str(BytePosFound)
EndIf
; Break if end is reached
If *New = *End
Break
EndIf
; Update rolling hash
TxtHash = TxtHash * 257 + *New\a - (*Old\a * Power)
*Old + 1
*New + 1
ForEver