Never thought a piece of code would make me so happy

idle wrote: PS: don't listen to srod putting down his ***** hashtable (five stars, not abuse); he's talking four stars (****)
Code:
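Structure ByteWord
  StructureUnion
    B.c[2] ;the two leading bytes (assumes an ASCII build, where .c is one byte)
    W.l    ;long rather than word, because words are signed :(
  EndStructureUnion
EndStructure

Structure Patterns
  Count.l
  Pattern.s[10]
EndStructure

Structure MemoryArray
  Byte.c[0]
EndStructure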
#PatternCount = 10000
Dim RawPatterns.s(#PatternCount)
Dim StructPtn.Patterns(66000)

;create some 32-byte random patterns
For i = 1 To #PatternCount
  For j = 1 To 32
    RawPatterns(i) = RawPatterns(i) + Chr(Random(254) + 1) ;don't want nulls for the test
  Next
Next

;Set test pattern
RawPatterns(5000) = "alex-1234-12bx56"
;Load into search structure. Speed is not so important here, so I used Asc(Mid()) rather than pointers as it's quicker to write.
Idx.ByteWord
For i = 1 To #PatternCount
  Idx\B[0] = Asc(Mid(RawPatterns(i),1,1))
  Idx\B[1] = Asc(Mid(RawPatterns(i),2,1))
  ;check bucket overflow (slots 1 to 9 are used, slot 0 stays empty)
  If StructPtn(Idx\W)\Count > 8
    Debug "Bucket Full"
  Else
    StructPtn(Idx\W)\Count = StructPtn(Idx\W)\Count + 1
    StructPtn(Idx\W)\Pattern[StructPtn(Idx\W)\Count] = RawPatterns(i)
  EndIf
Next
;Load a file of a few MB (shell32.dll is >8 MB)
If ReadFile(0,"C:\WINDOWS\ServicePackFiles\i386\shell32.dll") = 0
  MessageRequester("Error", "Could not open the test file")
  End
EndIf
FileLen.l = Lof(0)
*FileData = AllocateMemory(FileLen)
ReadData(0,*FileData,FileLen)
CloseFile(0)

;Add search string a fair way through the file
PokeS(*FileData + 5000,"alex-1234-12bx56")
;Search
*FileByteArray.MemoryArray = *FileData
Time1 = ElapsedMilliseconds()
For i = 0 To FileLen - 32 ;offsets are zero-based, so start at 0
  ;check for index match
  Idx\B[0] = *FileByteArray\Byte[i]
  Idx\B[1] = *FileByteArray\Byte[i+1]
  For j = 1 To StructPtn(Idx\W)\Count
    ;only the first 16 bytes are compared, the length of the test pattern
    If CompareMemory(@StructPtn(Idx\W)\Pattern[j],*FileData + i,16)
      Debug i
    EndIf
  Next
Next
Debug "Done"
;Debug Str(ElapsedMilliseconds() - Time1)
MessageRequester("", Str(ElapsedMilliseconds() - Time1))
I can see that working, though when several patterns share the same leading bytes you'd run into complications. I suppose that's where associative array implementations take over from lookup tables.

pdwyer wrote: The simple point is that the first two bytes of the pattern are the array index, so there's no looping or searching there; the bytes you are checking become the lookup for the pattern.
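
To make the indexing and the shared-prefix complication concrete, here is a minimal sketch. `PrefixIndex()` is a hypothetical helper, not part of the code posted above, and the second and third pattern strings are made up for illustration. It assumes every pattern is at least two characters long and that both leading characters fit in one byte, with the first character as the low byte (which should match what the byte/word union produces on x86).

```purebasic
; Hypothetical helper: turns the first two characters of a pattern
; into a table index in the range 0 to 65535, low byte first.
Procedure.l PrefixIndex(Pattern.s)
  ProcedureReturn Asc(Mid(Pattern, 1, 1)) + Asc(Mid(Pattern, 2, 1)) * 256
EndProcedure

; "a" = 97 and "l" = 108, so the index is 97 + 108 * 256
Debug PrefixIndex("alex-1234-12bx56")

; Made-up pattern with the same two leading bytes: it lands in the same slot,
; which is why each slot needs a bucket of patterns rather than a single one
Debug PrefixIndex("alpha-0000-00000")

; Made-up pattern with a different first byte: it gets a different slot
Debug PrefixIndex("blex-1234-12bx56")
```

The first two calls print the same number, so after the table lookup the search still has to compare against every pattern stored in that slot. Once many patterns share a prefix, a fixed-size bucket overflows, and that is the point where a hash table or associative array starts to look attractive.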