Page 2 of 2

Re: Make string comparing faster

Posted: Tue Mar 30, 2021 2:48 pm
by wilbert
Saki wrote:Due to the double-zero termination, strings which are only one character long are also recognized correctly.
Where did you find PB strings use a double-zero termination ?
As far as I know only a single Null character is used to terminate a string. :?

Re: Make string comparing faster

Posted: Tue Mar 30, 2021 3:03 pm
by Saki
It corresponds to the logic to search the Unicode string word by word for the termination,
so a word is also expected as termination.

Code: Select all

a$=" "
ShowMemoryViewer(@a$, 100)

Re: Make string comparing faster

Posted: Tue Mar 30, 2021 3:06 pm
by Keya
Unicode uses two null bytes (0x0000) for string termination - a single null byte is not sufficient because many Unicode characters contain null bytes as either the high or the low byte

Re: Make string comparing faster

Posted: Tue Mar 30, 2021 4:08 pm
by wilbert
I misunderstood.
I thought with double-zero termination you meant two null characters (four zero bytes).

An empty string only contains a null character. In this case comparing the first two characters in theory can result in a crash if the address of the second character you try to read is on a memory page that doesn't allow to access its memory.
So if you want to compare the first two characters at once, you need to be sure that both strings have a minimum length of one.

Re: Make string comparing faster

Posted: Tue Mar 30, 2021 4:52 pm
by Saki
Primarily, I don't think Left(2) offers any noticeable speed advantage.
I myself had always used Left(2) and for the rest the simple comparison.
The rest is not big, it's always lightning fast.

Re: Make string comparing faster

Posted: Wed Mar 31, 2021 11:40 am
by Saki
Hi, if you have tested these procedures, you can post the results here, then all have something from it :wink:

Re: Make string comparing faster

Posted: Wed Mar 31, 2021 1:01 pm
by Mijikai
I remember several threads on fast strings and im sure there was also talk about string comparisons.
Besides that i distinctively remember some search pattern code which does basically the same - inline asm, very fast.
I suggest to do some searching.

Re: Make string comparing faster

Posted: Wed Mar 31, 2021 1:33 pm
by infratec
My try:

Code: Select all

CompilerIf #PB_Compiler_IsMainFile
  EnableExplicit
CompilerEndIf


Procedure.i StringCompare(*String1.Character, *String2.Character)
  
  While *String1\c = *String2\c
    If *String1\c = 0 Or *String2\c = 0
      Break
    EndIf
    *String1 + SizeOf(Character)
    *String2 + SizeOf(Character)
  Wend
  
  ProcedureReturn Bool(*String1\c = *String2\c)
  
EndProcedure



CompilerIf #PB_Compiler_IsMainFile
  
  Define a$, b$
  
  a$ = "jjkgkdjghklsdghsdklfghkldghlgögjäfjeögheögtegdfkvbjsdgkljsdfhgkdhgösdfhgklsdghskdlghksldfghskldfghsdklf"
  b$ = "jjkgkdjghklsdghsdklfghkldghlgögjäfjeögheögtegdfkvbjsdgkljsdfhgkdhgösdfhgklsdghskdlghksldfghskldfghsdklf"
  
  MessageRequester("Info", Str(StringCompare(@a$, @b$)))
  
CompilerEndIf
It is for PB strings and not for UTF8 or ASCII strings.
Of course if I convert it to asm it's faster, but this is more educative for 'normal' PB users.

Re: Make string comparing faster

Posted: Wed Mar 31, 2021 1:39 pm
by Keya
for 1/2/4/8 bytes the Peek functions are always going to be faster because they're not string functions like Left(), but obviously you won't notice any speed difference if you're not calling it very often.

But you can see here why Peek is always going to be faster...

Code: Select all

mystr$ = "ab"
twobytes.w = PeekW(@mystr$)
=

Code: Select all

mov eax, dword ptr [403168]  ;ASCII "ab"
call 00402030                ;call PeekW()
	movsx eax, word ptr [eax]
	retn
mov word ptr [40316C], ax
You could easily inline that to remove the call/retn -- the upcoming C version of PB will probably automatically do that.

But now the same, using a string function...

Code: Select all

mystr$ = "ab"
twobytes$ = Left(mystr$,2)
=

Code: Select all

mov edx, dword ptr [40316C]
push edx
push edx
push 2                   ;2 bytes
push dword ptr [403168]  ;ASCII "ab"
call 00402040            ;call Left()
	push ebx
	push ebp
	push esi
	push edi
	mov edi, dword ptr [esp+14]
	push edi
	call 004020B0
		...
	xor ecx, ecx
	mov esi, eax
	mov eax, dword ptr [esp+18]
	test eax, eax
	sets cl
	dec ecx
	and ecx, eax
	cmp esi, ecx
	jle short 00402064
	mov esi, ecx
	push edi
	call 004022E0                                        
		...
	push dword ptr [esp+1C]                              
	mov ebp, eax                                         
	push esi                                             
	call 00402320                                        
		...
	mov ebx, eax
	test ebp, ebp
	jz short 00402084
	push ebp                                             
	call 004023A0                                        
		...
	mov edi, eax
	test edi, edi
	jz short 0040209B
	test esi, esi
	jle short 0040209B
	push esi
	push edi
	push ebx
	call 004023C
		...
	pop edi
	pop esi
	pop ebp
	pop ebx
	retn 0C
	pop edi
	pop esi
	pop ebp
	mov byte ptr [ebx], 0
	pop ebx
	retn 0C
push offset 00403164
call 00402160            ;store result
	push ebp                                            
	mov ebp, esp
	push ecx
	mov eax, dword ptr [40316C]
	sub eax, dword ptr [ebp+0C]
	mov dword ptr [ebp-4], eax
	mov ecx, dword ptr [ebp+8]
	cmp dword ptr [ecx], 0
	jne short 00402193
	mov edx, dword ptr [ebp-4]
	add edx, 5
	push edx                                             ; /Size
	push 0                                               ; |Flags = 0
	mov eax, dword ptr [403170]                          ; |
	push eax                                             ; |Heap
	call dword ptr [<&KERNEL32.HeapAlloc>]               ; \NTDLL.RtlAllocateHeap
		...
	mov ecx, dword ptr [ebp+8]
	mov dword ptr [ecx], eax
	jmp short 004021B4
	mov edx, dword ptr [ebp-4]
	add edx, 5
	push edx                                             ; /Size
	mov eax, dword ptr [ebp+8]                           ; |
	mov ecx, dword ptr [eax]                             ; |
	push ecx                                             ; |pMem
	push 0                                               ; |Flags = 0
	mov edx, dword ptr [403170]                          ; |
	push edx                                             ; |Heap
	call dword ptr [<&KERNEL32.HeapReAlloc>]             ; \NTDLL.RtlReAllocateHeap
		...
	mov ecx, dword ptr [ebp+8]
	mov dword ptr [ecx], eax
	mov edx, dword ptr [ebp-4]
	push edx           
	mov eax, dword ptr [40302C]                     
	add eax, dword ptr [ebp+0C]                     
	push eax                                        
	mov ecx, dword ptr [ebp+8]                      
	mov edx, dword ptr [ecx]                        
	push edx                                        
	call 004023C0                                   
		...
	mov eax, dword ptr [ebp+0C]
	mov dword ptr [40316C], eax
	mov esp, ebp
	pop ebp
	retn 8
(there's actually even more code to it than this as i'm only showing the one call layer)

Re: Make string comparing faster

Posted: Wed Mar 31, 2021 3:44 pm
by skywalk
Now do CompareMemoryString()?