Page 2 of 2
Re: Make string comparing faster
Posted: Tue Mar 30, 2021 2:48 pm
by wilbert
Saki wrote:Due to the double-zero termination, strings which are only one character long are also recognized correctly.
Where did you find PB strings use a double-zero termination ?
As far as I know only a single Null character is used to terminate a string.

Re: Make string comparing faster
Posted: Tue Mar 30, 2021 3:03 pm
by Saki
It corresponds to the logic to search the Unicode string word by word for the termination,
so a word is also expected as termination.
Re: Make string comparing faster
Posted: Tue Mar 30, 2021 3:06 pm
by Keya
Unicode uses two null bytes (0x0000) for string termination - a single null byte is not sufficient because many Unicode characters contain null bytes as either the high or the low byte
Re: Make string comparing faster
Posted: Tue Mar 30, 2021 4:08 pm
by wilbert
I misunderstood.
I thought with double-zero termination you meant two null characters (four zero bytes).
An empty string only contains a null character. In this case comparing the first two characters in theory can result in a crash if the address of the second character you try to read is on a memory page that doesn't allow to access its memory.
So if you want to compare the first two characters at once, you need to be sure that both strings have a minimum length of one.
Re: Make string comparing faster
Posted: Tue Mar 30, 2021 4:52 pm
by Saki
Primarily, I don't think Left(2) offers any noticeable speed advantage.
I myself had always used Left(2) and for the rest the simple comparison.
The rest is not big, it's always lightning fast.
Re: Make string comparing faster
Posted: Wed Mar 31, 2021 11:40 am
by Saki
Hi, if you have tested these procedures, you can post the results here, then all have something from it

Re: Make string comparing faster
Posted: Wed Mar 31, 2021 1:01 pm
by Mijikai
I remember several threads on fast strings and im sure there was also talk about string comparisons.
Besides that i distinctively remember some search pattern code which does basically the same - inline asm, very fast.
I suggest to do some searching.
Re: Make string comparing faster
Posted: Wed Mar 31, 2021 1:33 pm
by infratec
My try:
Code: Select all
CompilerIf #PB_Compiler_IsMainFile
EnableExplicit
CompilerEndIf
Procedure.i StringCompare(*String1.Character, *String2.Character)
While *String1\c = *String2\c
If *String1\c = 0 Or *String2\c = 0
Break
EndIf
*String1 + SizeOf(Character)
*String2 + SizeOf(Character)
Wend
ProcedureReturn Bool(*String1\c = *String2\c)
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
Define a$, b$
a$ = "jjkgkdjghklsdghsdklfghkldghlgögjäfjeögheögtegdfkvbjsdgkljsdfhgkdhgösdfhgklsdghskdlghksldfghskldfghsdklf"
b$ = "jjkgkdjghklsdghsdklfghkldghlgögjäfjeögheögtegdfkvbjsdgkljsdfhgkdhgösdfhgklsdghskdlghksldfghskldfghsdklf"
MessageRequester("Info", Str(StringCompare(@a$, @b$)))
CompilerEndIf
It is for PB strings and not for UTF8 or ASCII strings.
Of course if I convert it to asm it's faster, but this is more educative for 'normal' PB users.
Re: Make string comparing faster
Posted: Wed Mar 31, 2021 1:39 pm
by Keya
for 1/2/4/8 bytes the Peek functions are always going to be faster because they're not string functions like Left(), but obviously you won't notice any speed difference if you're not calling it very often.
But you can see here why Peek is always going to be faster...
Code: Select all
mystr$ = "ab"
twobytes.w = PeekW(@mystr$)
=
Code: Select all
mov eax, dword ptr [403168] ;ASCII "ab"
call 00402030 ;call PeekW()
movsx eax, word ptr [eax]
retn
mov word ptr [40316C], ax
You could easily inline that to remove the call/retn -- the upcoming C version of PB will probably automatically do that.
But now the same, using a string function...
Code: Select all
mystr$ = "ab"
twobytes$ = Left(mystr$,2)
=
Code: Select all
mov edx, dword ptr [40316C]
push edx
push edx
push 2 ;2 bytes
push dword ptr [403168] ;ASCII "ab"
call 00402040 ;call Left()
push ebx
push ebp
push esi
push edi
mov edi, dword ptr [esp+14]
push edi
call 004020B0
...
xor ecx, ecx
mov esi, eax
mov eax, dword ptr [esp+18]
test eax, eax
sets cl
dec ecx
and ecx, eax
cmp esi, ecx
jle short 00402064
mov esi, ecx
push edi
call 004022E0
...
push dword ptr [esp+1C]
mov ebp, eax
push esi
call 00402320
...
mov ebx, eax
test ebp, ebp
jz short 00402084
push ebp
call 004023A0
...
mov edi, eax
test edi, edi
jz short 0040209B
test esi, esi
jle short 0040209B
push esi
push edi
push ebx
call 004023C
...
pop edi
pop esi
pop ebp
pop ebx
retn 0C
pop edi
pop esi
pop ebp
mov byte ptr [ebx], 0
pop ebx
retn 0C
push offset 00403164
call 00402160 ;store result
push ebp
mov ebp, esp
push ecx
mov eax, dword ptr [40316C]
sub eax, dword ptr [ebp+0C]
mov dword ptr [ebp-4], eax
mov ecx, dword ptr [ebp+8]
cmp dword ptr [ecx], 0
jne short 00402193
mov edx, dword ptr [ebp-4]
add edx, 5
push edx ; /Size
push 0 ; |Flags = 0
mov eax, dword ptr [403170] ; |
push eax ; |Heap
call dword ptr [<&KERNEL32.HeapAlloc>] ; \NTDLL.RtlAllocateHeap
...
mov ecx, dword ptr [ebp+8]
mov dword ptr [ecx], eax
jmp short 004021B4
mov edx, dword ptr [ebp-4]
add edx, 5
push edx ; /Size
mov eax, dword ptr [ebp+8] ; |
mov ecx, dword ptr [eax] ; |
push ecx ; |pMem
push 0 ; |Flags = 0
mov edx, dword ptr [403170] ; |
push edx ; |Heap
call dword ptr [<&KERNEL32.HeapReAlloc>] ; \NTDLL.RtlReAllocateHeap
...
mov ecx, dword ptr [ebp+8]
mov dword ptr [ecx], eax
mov edx, dword ptr [ebp-4]
push edx
mov eax, dword ptr [40302C]
add eax, dword ptr [ebp+0C]
push eax
mov ecx, dword ptr [ebp+8]
mov edx, dword ptr [ecx]
push edx
call 004023C0
...
mov eax, dword ptr [ebp+0C]
mov dword ptr [40316C], eax
mov esp, ebp
pop ebp
retn 8
(there's actually even more code to it than this as i'm only showing the one call layer)
Re: Make string comparing faster
Posted: Wed Mar 31, 2021 3:44 pm
by skywalk
Now do CompareMemoryString()?