Faster Len(String$) with asm

Posted: Sat Oct 14, 2006 6:06 am
by Helle
And here is an example of Len(String$) with asm (needs at least a Pentium 4):

Code:

;- "Helle" Klaus Helbing,  14.10.2006,  PB4.00

Global X.l
Global Len.l
Global String$ = "pokjhgskioUIOHR%(!§)AGJOjiioe328959askfakgj9e6t306rgrjhg409604tkmgoe" 

;-------- Test with SSE
TestTime.l = ElapsedMilliseconds() 
X = @String$                 ;string-pointer

For i = 1 To 10000000 
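 !mov ebx,[v_X]              ;ebx = start of string
 !pxor xmm1,xmm1             ;xmm1 = 0
W1: 
 !movdqu xmm0,[ebx]          ;read 16 bytes of the string (unaligned load)
 !add ebx,16
 !pcmpeqb xmm0,xmm1          ;compare each of the 16 bytes with 0: byte = 255 if equal (= null), byte = 0 if not
 !pmovmskb eax,xmm0          ;copy the sign bit of each byte into eax (bits 0-15, i.e. ax)
 !or eax,eax                 ;mask = 0 ?
 !jz l_w1                    ;yes: no null byte (end of string) in these 16 bytes, read on

 !sub ebx,17                 ;ebx = start of the current 16-byte block - 1
 !test al,255                ;null byte in the lower 8 bytes ?
 !jnz l_w2                   ;yes
 !xchg al,ah                 ;no: move the mask of the upper 8 bytes into al
 !add ebx,8
W2: 
 !inc ebx
 !shr ax,1                   ;shift the lowest mask bit into carry
 !jnc l_w2                   ;carry not set: test the next byte
Next 

 !sub ebx,[v_X]              ;ebx points to the null byte: length = ebx - start of string
 !mov [v_Len],ebx    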

Time = ElapsedMilliseconds() - TestTime
MessageRequester("Len(String) with SSE","Stringlength = " + Str(Len) + Chr(13) + "TestTime = " + Str(Time) + " ms")

;-------- Test with PB
TestTime.l = ElapsedMilliseconds()
For i = 1 To 10000000 
Len = Len(String$) 
Next
Time = ElapsedMilliseconds() - TestTime
MessageRequester("Len(String) with PB","Stringlength = " + Str(Len) + Chr(13) + "TestTime = " + Str(Time) + " ms")

End
Greetings
Helle
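
For readers who do not read x86 assembly, the scan above can be sketched in C with SSE2 intrinsics. This is only an illustration, not code from this thread: the name `len_sse2` is hypothetical, and it assumes the buffer stays readable for up to 15 bytes past the terminating null byte, because it uses the same unaligned 16-byte loads as the asm version.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper (not from the thread): length of a null-terminated
   byte string, scanned 16 bytes at a time.
   Assumption: 16 bytes are readable at every scanned position, i.e. the
   buffer is padded by at least 15 bytes past the terminator. */
size_t len_sse2(const char *s)
{
    const __m128i zero = _mm_setzero_si128();                 /* pxor xmm1,xmm1 */
    const char *p = s;

    for (;;) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)p);  /* movdqu   */
        __m128i eq    = _mm_cmpeq_epi8(chunk, zero);          /* pcmpeqb  */
        unsigned mask = (unsigned)_mm_movemask_epi8(eq);      /* pmovmskb */

        if (mask != 0) {
            /* The lowest set bit marks the first null byte of this chunk. */
            while ((mask & 1u) == 0) {
                mask >>= 1;
                p++;
            }
            return (size_t)(p - s);
        }
        p += 16;  /* no null byte in these 16 bytes, read on */
    }
}
```

The bit loop at the end plays the role of the `shr ax,1` / `jnc` loop in the asm version; a bit-scan instruction would do the same job in one step.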

Posted: Sat Oct 14, 2006 11:03 am
by wilbert
It looks nice, but it doesn't work on my computer (Athlon processor).
Your example works fine, but if I try a string with a length of 10 characters, it returns 19. The pcmpeqb instruction doesn't seem to support 128-bit comparison here, only 64-bit.

Posted: Sat Oct 14, 2006 1:08 pm
by Helle
My example needs SSE2.
PCMPEQB supports 128-bit operands only on processors with SSE2 (Pentium 4 and later, Celeron with Willamette core, Athlon 64, Opteron, or Sempron with Paris core).

Greetings
Helle
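
Since the routine depends on SSE2, a program could check for it at run time before choosing the fast path. The following is a minimal sketch in C, assuming GCC or Clang on x86 (it relies on their `<cpuid.h>` header); the function names are hypothetical. CPUID leaf 1 reports SSE in EDX bit 25 and SSE2 in EDX bit 26.

```c
#include <cpuid.h>  /* GCC/Clang helper for the CPUID instruction */

/* Hypothetical helper: returns 1 if CPUID leaf 1 sets the given EDX bit. */
static int cpu_has_edx_bit(unsigned bit)
{
    unsigned eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;  /* leaf 1 not available */
    return (int)((edx >> bit) & 1u);
}

int cpu_has_sse(void)  { return cpu_has_edx_bit(25); }  /* SSE  */
int cpu_has_sse2(void) { return cpu_has_edx_bit(26); }  /* SSE2 */
```

A caller would fall back to the ordinary Len() when `cpu_has_sse2()` returns 0.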

Posted: Sat Oct 21, 2006 8:10 am
by wilbert
Those SSE instructions are very useful :)

I adapted your example to use only SSE instructions and no SSE2, so it also works on my Athlon CPU. I also implemented a second parameter: if the length of the string is greater than that value, the procedure returns that value.

Code:

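Procedure.l Len_SSE_(*String.l, MaxLen.l = -1)
 !mov edx,[p.p_String]       ;edx = start of string
 !mov ecx,[p.v_MaxLen]       ;ecx = MaxLen
 !push ecx                   ;save MaxLen
 !add ecx,edx                ;ecx = start + MaxLen = scan limit
 !cmp edx,ecx
 !jna len_sse_maxok          ;start <= limit: no wraparound
 !mov ecx,$ffffffff          ;wrapped around (e.g. MaxLen = -1): no limit
 !len_sse_maxok:
 !pxor mm0,mm0               ;mm0 = 0
 !movq mm1,[edx]             ;read the first 8 bytes (unaligned)
 !pcmpeqb mm1,mm0            ;byte = 255 where the string byte is 0
 !pmovmskb eax,mm1           ;one mask bit per byte
 !and al,al
 !jnz len_sse_cont           ;null byte within the first 8 bytes
 !and edx,$fffffff8          ;align the pointer down to 8 bytes
 !len_sse_loop:
 !add edx,8
 !cmp edx,ecx
 !jae len_sse_cont           ;scan limit reached
 !movq mm1,[edx]             ;read the next 8 bytes (aligned)
 !pcmpeqb mm1,mm0
 !pmovmskb eax,mm1
 !and al,al
 !jz len_sse_loop            ;no null byte yet
 !len_sse_cont:
 !emms                       ;leave MMX state
 !pop ecx                    ;ecx = MaxLen
 !bsf eax,eax                ;index of the first set mask bit
 !add eax,edx
 !sub eax,[p.p_String]       ;eax = length
 !cmp eax,ecx
 !jna len_sse_cont2
 !mov eax,ecx                ;clamp the result to MaxLen
 !len_sse_cont2:
 ProcedureReturn
EndProcedure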

Macro Len_SSE(String, MaxLen = -1)
 Len_SSE_(@String, MaxLen)
EndMacro
If, for example, you only want to know whether the string is at least 20 characters long, you can check whether Len_SSE(String$,20) equals 20.
This can speed things up a lot for large strings, because the scan stops at the limit instead of running to the end of the string.
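
The meaning of the second parameter can be sketched with a plain scalar reference in C. This is not the SSE code above, only an illustration of the result it should give; the names `len_capped` and `at_least` are hypothetical, and passing `(size_t)-1` plays the role of the default MaxLen = -1 (no limit).

```c
#include <stddef.h>

/* Hypothetical reference: string length, but never more than max_len. */
size_t len_capped(const char *s, size_t max_len)
{
    size_t n = 0;

    /* Stop at the null byte or at the cap, whichever comes first. */
    while (n < max_len && s[n] != '\0')
        n++;
    return n;
}

/* "Is the string at least n characters long?" without scanning further. */
int at_least(const char *s, size_t n)
{
    return len_capped(s, n) == n;
}
```

A reference like this is also handy for checking the asm version against many test strings.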