Page 1 of 1
CompareMemory 4*as fast as PB´s ;)
Posted: Sun Oct 16, 2005 8:03 pm
by Franky
Run it without debugger, else it takes very long
Code: Select all
Procedure O_CompareMemory(wert1.l,wert2.l,l.l)
;Erstmal die Werte in Eax und Ebx setzen
MOV eax,wert2
MOV ebx,wert1
;dann die länge in ECX und EDX laden
MOV ecx,l
;EDX erhält den wert von EDX%4
MOV edx,l
Or edx,-4
XOR edx,-4
;ECX wird durch 4 geteilt
SAR ecx,2
;Um ne endlosschleife zu verhindern, gucken ob ecx=0, dann direkt zum EDX-Part springen
CMP ecx,0
JE l_o_cm_middle
;Wenn ecx>0, wird immer Longweise, also 4 Byte auf einmal verglichen
O_CM_Begin:
MOV edi,DWord[eax]
CMP edi,DWord[ebx]
JNE l_o_cm_unlike
ADD eax,4
ADD ebx,4
DEC ecx
JNZ l_o_cm_begin
o_cm_middle:
;Direkt weg, wenn net Modulo 4
CMP edx,0
JE l_o_cm_is
MOV ecx,0
;Hier werden die übrigen bis zu 3 Byte einzeln verglichen
O_CM_Begin_B:
MOV cl,byte[eax]
CMP cl,byte[ebx]
JNE l_o_cm_unlike
INC eax
INC ebx
DEC edx
JNZ l_o_cm_begin_b
O_cm_is:
ProcedureReturn 1
O_CM_Unlike:
ProcedureReturn 0
EndProcedure
And a little example
Code: Select all
;Test 1 228 Byte
text1.s="Das ist ein seehr langer text mit etwa 200 Zeichen, ok, noch nicht ganz, kann aber bald was werden, wenn ich nur lange mit dem Tippen von Buchstaben in dieses Fenster fortfahre und mir nicht einreden lasse, das mache keinen Sinn"
text2.s="Das ist ein seehr langer text mit etwa 200 Zeichen, ok, noch nicht ganz, kann aber bald was werden, wenn ich nur lange mit dem Tippen von Buchstaben in dieses Fenster fortfahre und mir nicht einreden lasse, das mache keinen Sinn"
x=GetTickCount_()
For a=1 To 100000
l1=CompareMemory(text1,text2,228)
Next
x2=GetTickCount_()
For a=1 To 100000
l2=o_CompareMemory(@text1,@text2,228)
Next
MessageRequester("Ergebnis:","1.) "+Str(x2-x)+" , "+Str(l1)+Chr(13)+"2.) "+Str(GetTickCount_()-x2)+" , "+Str(l2))
;Test 2, 400 kb
adra=AllocateMemory(400003)
adrb=AllocateMemory(400003)
For a=0 To 400002
PokeB(adra+a,Random(255))
Next
CopyMemory(adra,adrb,400003)
PokeB(adra+400002,19)
x=GetTickCount_()
For a=1 To 10000
l1=CompareMemory(adra,adrb,400003)
Next
x2=GetTickCount_()
For a=1 To 10000
l2=o_CompareMemory(adra,adrb,400003)
Next
MessageRequester("Ergebnis:","1.) "+Str(x2-x)+" , "+Str(l1)+Chr(13)+"2.) "+Str(GetTickCount_()-x2)+" , "+Str(l2))
Sorry for the german comments, see my signature

Posted: Sun Oct 16, 2005 10:24 pm
by va!n
Nice work Franky! Here are my results
i added and used RandomSeed to get at every run the same strings to compare, used RandomSeed(23):
Test1: 180, 1 : 0, 1
Test2: 31906, 0 : 3469, 0
Coool!
Posted: Wed Oct 19, 2005 6:58 pm
by va!n
@franky:
just found this source (ASM)... maybe it may help you and possible faster!? just check it out...
http://www.codeproject.com/string/optimize_strings.asp
PB4 fix?
Posted: Tue Sep 11, 2007 9:32 pm
by Hi-Toro
Any asm coders know how to fix this for PB4? I get a syntax error on "XOR edx,-4" here, and a 4* speedup would be very useful indeed for my current project! Hopefully it's just that line, as it compiles without error if I comment that out...
Posted: Tue Sep 11, 2007 9:44 pm
by Demivec
Hi-Toro wrote:Any asm coders know how to fix this for PB4? I get a syntax error on "XOR edx,-4" here, and a 4* speedup would be very useful indeed for my current project! Hopefully it's just that line, as it compiles without error if I comment that out...
I had a problem with the line before it too. Just put an "!" before each line that isn't recognized as inline ASM.
Posted: Tue Sep 11, 2007 9:45 pm
by Pupil
Just put a '!' as first character on that line and you should be good to go.
Posted: Tue Sep 11, 2007 10:32 pm
by Hi-Toro
Hi Demivec/Pupil,
Many thanks for the quick response! Is this because 'XOr' is a PB keyword? If so, any idea why 'Or' doesn't cause the same problem?
It doesn't *really* matter to me, as the fixed !Xor works fine (saves half a second on my duplicate file checker with certain test folders!), but I'm just curious.
Anyway, thanks again for the fix! Much appreciated.
Posted: Tue Sep 11, 2007 10:46 pm
by Derek
Can't seem to get it to not work on mine, anyway, you should read the manual concerning inline assembler as the use of ! affects the way that variables are addressed in assembler commands.
Posted: Tue Sep 11, 2007 11:04 pm
by Hi-Toro
Thanks, Derek. I really don't have the mental capacity for asm coding, so the relevant manual entry doesn't really mean much to me! Not a problem, anyway, as the 'fix' is working fine for me. I think that file/folder parsing may be more of a target for optimisation now, rather than the memory check.
Posted: Tue Sep 11, 2007 11:34 pm
by Demivec
Here's a translation of Franky's code and comments:
Code: Select all
Procedure O_CompareMemory(value1.l,value2.l,l.l)
; Copy the values into Eax and Ebx
MOV Eax,value2
MOV Ebx,value1
; then load the length into ECX and EDX
MOV Ecx,l
; EDX holds value of EDX%4 (remainder 0-3)
MOV Edx,l
OR Edx,-4
XOR Edx,-4
; ECX is divided by 4
SAR Ecx,2
; In order to prevent an infinite loop, if ecx=0, then jump directly to the EDX part
CMP Ecx,0
JE l_o_cm_middle
; If ecx>0, 4 bytes at one time will be compared
O_CM_Begin:
MOV Edi,DWord[Eax]
CMP Edi,DWord[Ebx]
JNE l_o_cm_unlike
ADD Eax,4
ADD Ebx,4
DEC Ecx
JNZ l_o_cm_begin
o_cm_middle:
; Already finished if length is an even modulo 4
CMP Edx,0
JE l_o_cm_is
MOV Ecx,0
; Here the remaining bytes, up to 3, are compared separately
O_CM_Begin_B:
MOV cl,byte[Eax]
CMP cl,byte[Ebx]
JNE l_o_cm_unlike
INC Eax
INC Ebx
DEC Edx
JNZ l_o_cm_begin_b
O_cm_is:
ProcedureReturn 1
O_CM_Unlike:
ProcedureReturn 0
EndProcedure
Posted: Wed Sep 12, 2007 2:39 pm
by breeze4me
shorter code.
Code: Select all
Procedure bz_CompareMemory(*src, *dst, len)
;CLD
PUSH esi
PUSH edi
MOV esi, [p.p_src + 8]
MOV edi, [p.p_dst + 8]
MOV eax, [p.v_len + 8]
MOV ecx, eax
AND eax, 3 ; len % 4
SHR ecx, 2 ; len / 4
XOR edx, edx ;set ZF
REPE CMPSD
!JNE .not_eq
MOV ecx, eax
XOR edx, edx ;set ZF
REPE CMPSB
!JNE .not_eq
POP edi
POP esi
ProcedureReturn 1
!.not_eq:
POP edi
POP esi
ProcedureReturn 0
EndProcedure
example.
almost same speed or a bit slower
Code: Select all
DisableDebugger
SetPriorityClass_(GetCurrentProcess_(), #REALTIME_PRIORITY_CLASS)
SetThreadPriority_(GetCurrentThread_(), #THREAD_PRIORITY_TIME_CRITICAL)
Procedure O_CompareMemory(value1.l,value2.l,l.l)
; Copy the values into Eax and Ebx
MOV Eax,value2
MOV Ebx,value1
; then load the length into ECX and EDX
MOV Ecx,l
; EDX holds value of EDX%4 (remainder 0-3)
MOV Edx,l
OR Edx,-4
XOR Edx,-4
; ECX is divided by 4
SAR Ecx,2
; In order to prevent an infinite loop, if ecx=0, then jump directly to the EDX part
CMP Ecx,0
JE l_o_cm_middle
; If ecx>0, 4 bytes at one time will be compared
O_CM_Begin:
MOV Edi,DWord[Eax]
CMP Edi,DWord[Ebx]
JNE l_o_cm_unlike
ADD Eax,4
ADD Ebx,4
DEC Ecx
JNZ l_o_cm_begin
o_cm_middle:
; Already finished if length is an even modulo 4
CMP Edx,0
JE l_o_cm_is
MOV Ecx,0
; Here the remaining bytes, up to 3, are compared separately
O_CM_Begin_B:
MOV cl,byte[Eax]
CMP cl,byte[Ebx]
JNE l_o_cm_unlike
INC Eax
INC Ebx
DEC Edx
JNZ l_o_cm_begin_b
O_cm_is:
ProcedureReturn 1
O_CM_Unlike:
ProcedureReturn 0
EndProcedure
Procedure bz_CompareMemory(*src, *dst, len)
;CLD
PUSH esi
PUSH edi
MOV esi, [p.p_src + 8]
MOV edi, [p.p_dst + 8]
MOV eax, [p.v_len + 8]
MOV ecx, eax
AND eax, 3 ; len % 4
SHR ecx, 2 ; len / 4
XOR edx, edx ;set ZF
REPE CMPSD
!JNE .not_eq
MOV ecx, eax
XOR edx, edx ;set ZF
REPE CMPSB
!JNE .not_eq
POP edi
POP esi
ProcedureReturn 1
!.not_eq:
POP edi
POP esi
ProcedureReturn 0
EndProcedure
;Test 1 228 Byte
text1.s="Das ist ein seehr langer text mit etwa 200 Zeichen, ok, noch nicht ganz, kann aber bald was werden, wenn ich nur lange mit dem Tippen von Buchstaben in dieses Fenster fortfahre und mir nicht einreden lasse, das mache keinen Sinn"
text2.s="Das ist ein seehr langer text mit etwa 200 Zeichen, ok, noch nicht ganz, kann aber bald was werden, wenn ich nur lange mit dem Tippen von Buchstaben in dieses Fenster fortfahre und mir nicht einreden lasse, das mache keinen Sinn"
x=GetTickCount_()
For a=1 To 10000000
l1=O_CompareMemory(@text1,@text2,228)
Next
x2=GetTickCount_()
For a=1 To 10000000
l2=bz_CompareMemory(@text1,@text2,228)
Next
x3=GetTickCount_()
MessageRequester("Ergebnis:", "O. "+Str(x2-x)+" , "+Str(l1)+Chr(13)+"bz. "+Str(x3-x2)+" , "+Str(l2))
x=GetTickCount_()
For a=1 To 10000000
l1=bz_CompareMemory(@text1,@text2,228)
Next
x2=GetTickCount_()
For a=1 To 10000000
l2=O_CompareMemory(@text1,@text2,228)
Next
x3=GetTickCount_()
MessageRequester("Ergebnis:", "bz. "+Str(x2-x)+" , "+Str(l1)+Chr(13)+"O. "+Str(x3-x2)+" , "+Str(l2))
;Test 2, 400 kb
adra=AllocateMemory(400003)
adrb=AllocateMemory(400003)
For a=0 To 400002
PokeB(adra+a,Random(255))
Next
CopyMemory(adra,adrb,400003)
PokeB(adra+400002,19)
x=GetTickCount_()
For a=1 To 10000
l1=O_CompareMemory(adra,adrb,400003)
Next
x2=GetTickCount_()
For a=1 To 10000
l2=bz_CompareMemory(adra,adrb,400003)
Next
x3=GetTickCount_()
MessageRequester("Ergebnis:","O. "+Str(x2-x)+" , "+Str(l1)+Chr(13)+"bz. "+Str(x3-x2)+" , "+Str(l2))
x=GetTickCount_()
For a=1 To 10000
l1=bz_CompareMemory(adra,adrb,400003)
Next
x2=GetTickCount_()
For a=1 To 10000
l2=O_CompareMemory(adra,adrb,400003)
Next
x3=GetTickCount_()
MessageRequester("Ergebnis:", "bz. "+Str(x2-x)+" , "+Str(l1)+Chr(13)+"O. "+Str(x3-x2)+" , "+Str(l2))
Edit: some PUSH/POP added.
Posted: Wed Sep 12, 2007 2:55 pm
by Hi-Toro
Thanks guys! Your code seems to run at exactly the same speed, breeze4me, but I've switched to it just because it looks neater in my program...

Posted: Wed Sep 12, 2007 6:36 pm
by Demivec
breeze4me wrote:shorter code.
Thanks for updating Franky's code. I don't see why this can't be made as a change to the way PureBasic does this, since it seems so straight forward.
Posted: Wed Sep 12, 2007 8:27 pm
by Joakim Christiansen
Demivec wrote:I don't see why this can't be made as a change to the way PureBasic does this, since it seems so straight forward.
Agreed, if anyone submits code that works faster than the inbuilt commands then the PureBasic team should use it.