FindData module
Re: FindData module
Would someone provide example use cases for this? Thank you.
Re: FindData module
It's like findstring but works with binary data and patterns and is ~10x faster than findstring. Each function has it's pros and cons
depending upon the length of what your looking for (the needle) and you also have to check if SSE is available
at runtime though that shouldn't really be an issue, unless its some rusty old box in the basement.
I just bolted it in to a tag preprocessor to look for "<?PB" and want to use SSE2_FIND which is blazingly fast for short needles
You need to test if SSE2 is available, Then set a function prototype or fall back to BM (boyer moore) which is also fast
but prefers bigger needles, as it steps through the input by the needle size.
Wilbert will have the demo code lurking somewhere. no doubt.
depending upon the length of what your looking for (the needle) and you also have to check if SSE is available
at runtime though that shouldn't really be an issue, unless its some rusty old box in the basement.
I just bolted it in to a tag preprocessor to look for "<?PB" and want to use SSE2_FIND which is blazingly fast for short needles
You need to test if SSE2 is available, Then set a function prototype or fall back to BM (boyer moore) which is also fast
but prefers bigger needles, as it steps through the input by the needle size.
Code: Select all
Procedure PreProcessInit(*pre.PreProcessor)
UseModule CPUInfo
If IsCPU(#PB_CPU_SSE2)
*pre\find = finddata::@SSE2_Find()
Else
*pre\find = FindData::@BM()
EndIf
UnuseModule CPUInfo
EndProcedure
....
While (pos > -1 And pos < len)
pos = *pre\find(*input,len,*pre\tag,*pre\taglen,pos) ;<--- hunt through *input for UTF8("<?PB")
If pos <> -1
osz = outp+(pos-p_end)
If osz > outlen
*pre\output = ReAllocateMemory(*pre\output,osz,#PB_Memory_NoClear)
outlen = osz
EndIf
...
Windows 11, Manjaro, Raspberry Pi OS
Re: FindData module
Thank you.
I know that these are hand-optimized assembly. But I would love to see a normal code version for comparison for those of us who don't know assembly.
I know that these are hand-optimized assembly. But I would love to see a normal code version for comparison for those of us who don't know assembly.
Re: FindData module
I found the link to the collection of c algorithms which largely inspired the module
they should be a bit more comprehensible.
http://www.dmi.unict.it/~faro/smart/algorithms.php
they should be a bit more comprehensible.
http://www.dmi.unict.it/~faro/smart/algorithms.php
Windows 11, Manjaro, Raspberry Pi OS
-
- Enthusiast
- Posts: 619
- Joined: Fri Feb 20, 2009 9:24 am
- Location: Almaty (Kazakhstan. not Borat, but Triple G)
- Contact:
Re: FindData module
maybe someone will have some ideas:
i have rom file - any sega mega drive game. and i have 4 files - banks for music. i read first 16bytes from that banks, then search where it lays in a rom file and remember thems adresses. for example it was $AABB01 $AACC02 $AADD03 $AAEE04 - now i need to find some place, where that four addresses can be lays near each other. not exactly near... but maybe between of them 2-20 bytes... problem is this values too short and can be a lot of matching with search. but i need only that place, where they all near - that place will be sure place, what i need this four values can be with any order... not from bigger to lower, or lower to bigger... any. just some one place, where they lays near. and it can be 4 values, but can be 3 and 2 values (when some banks is empty)
i can see something in my head... make some 4 arrays with all matching places with that addresses... and then need some + and - function probably...cant catch it still flying in my head...
i have rom file - any sega mega drive game. and i have 4 files - banks for music. i read first 16bytes from that banks, then search where it lays in a rom file and remember thems adresses. for example it was $AABB01 $AACC02 $AADD03 $AAEE04 - now i need to find some place, where that four addresses can be lays near each other. not exactly near... but maybe between of them 2-20 bytes... problem is this values too short and can be a lot of matching with search. but i need only that place, where they all near - that place will be sure place, what i need this four values can be with any order... not from bigger to lower, or lower to bigger... any. just some one place, where they lays near. and it can be 4 values, but can be 3 and 2 values (when some banks is empty)
i can see something in my head... make some 4 arrays with all matching places with that addresses... and then need some + and - function probably...cant catch it still flying in my head...
Re: FindData module
I am not sure but this might be a special case of the knapsack problem
-
- Enthusiast
- Posts: 619
- Joined: Fri Feb 20, 2009 9:24 am
- Location: Almaty (Kazakhstan. not Borat, but Triple G)
- Contact:
Re: FindData module
thanks, not clear for me this Knapsack...
i make something like that:
it should find some 2222 - second items in arrays is correct. but i am glad - it is work need a lot future tests but i think it is fine
i make something like that:
Code: Select all
Structure forsortarray
type$
value.l
EndStructure
Global Dim SortArrayValues.forsortarray(4)
; arrays of addresses
Dim inst.l(3)
Dim envel.l(3)
Dim seq.l(3)
Dim samples.l(3)
;{ fill addresses
inst(1) = $00000688
inst(2) = $0000D688
inst(3) = $001FF836
envel(1) = $0000A682
envel(2) = $0000D682
envel(3) = $0011D682
seq(1) = $0000D00C
seq(2) = $0000D67C
seq(3) = $0001D67C
samples(1) = $0000D600
samples(2) = $0000D676
samples(3) = $0010D676
;}
sizeins = ArraySize(inst())
sizeenv = ArraySize(envel())
sizeseq = ArraySize(seq())
sizesmp = ArraySize(samples())
; detect which arrays is empty
If sizeins And sizeenv And sizeseq And sizesmp
; all banks is here
For ins = 1 To sizeins
For env = 1 To sizeenv
For seq = 1 To sizeseq
For smp = 1 To sizesmp
; inst env seq smp
; fill all data in temp array
SortArrayValues(1)\type$ = "ins"
SortArrayValues(1)\value = inst(ins)
SortArrayValues(2)\type$ = "env"
SortArrayValues(2)\value = envel(env)
SortArrayValues(3)\type$ = "seq"
SortArrayValues(3)\value = seq(seq)
SortArrayValues(4)\type$ = "smp"
SortArrayValues(4)\value = samples(smp)
; sort array for #PB_Sort_Ascending
SortStructuredArray(SortArrayValues(), #PB_Sort_Ascending, OffsetOf(forsortarray\value), TypeOf(forsortarray\value))
tmp1 = SortArrayValues(4)\value - SortArrayValues(3)\value
Select tmp1
Case 4 To 20
tmp2 = SortArrayValues(3)\value - SortArrayValues(2)\value
Select tmp2
Case 4 To 20
tmp3 = SortArrayValues(2)\value - SortArrayValues(1)\value
Select tmp3
Case 4 To 20
Debug "was found at " + Str(ins) + Str(env) + Str(seq) + Str(smp)
Break 4
EndSelect
EndSelect
EndSelect
Next
Next
Next
Next
ElseIf sizeins And sizeseq And sizesmp
; no envelopes
ElseIf sizeins And sizeenv And sizeseq
; no samples
ElseIf sizeins And sizeseq
; no samples and envelopes
EndIf