Page 1 of 8
Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 8:57 pm
by uwekel
Hi,
if i run this code, it takes around 2000ms to complete, or 3900ms with #PB_String_NoCase. With Python or FreeBASIC, the same code runs over 100x faster
Code: Select all
s.s = Space(1000000) + "!"
t = ElapsedMilliseconds()
For i = 1 To 1000
x = FindString(s, "!")
Next
Debug ElapsedMilliseconds() - t
t = ElapsedMilliseconds()
For i = 1 To 1000
x = FindString(s, "!", 0, #PB_String_NoCase)
Next
Debug ElapsedMilliseconds() - t
Regards
Uwe
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 9:06 pm
by Mijikai
Iirc. thats because PureBasic does not store the string size so it needs to calculate it every time.
Helle a member in the german forum released a fast search function (sadly its only for Ascii):
Code: Select all
http://forums.purebasic.com/german/viewtopic.php?f=3&t=28159
__________________________________________________
SID deleted
25.11.2018
RSBasic
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 9:06 pm
by RSBasic
You have to turn off the debugger!
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 9:37 pm
by uwekel
It's not the debugger. Without it takes 1900ms and 3300ms. FreeBASIC take around 30ms, Python 34ms.
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 9:41 pm
by RSBasic
Okay, I was just wondering because you wrote "Debug" in the code, because it only works with Debugger.

Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 9:49 pm
by #NULL
Why would FindString() calculate the length beforehand?
I think copying the string for the function argument takes quite a bit of the time.
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 9:56 pm
by Mijikai
Findstring calls wcslen_() mutliple times.
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 10:06 pm
by infratec
Can you show me your FreeBasic code?
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 10:17 pm
by RSBasic
Is it faster if you compile this code?
Code: Select all
t = ElapsedMilliseconds()
For i = 1 To 1000
x = FindString(LCase(s), LCase("!"))
Next
Debug ElapsedMilliseconds() - t
(not tested)
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 10:29 pm
by ccode
@RSBasic
x = FindString(LCase(s), LCase("!"))
is slower!!!
Mijikai wrote:thats because PureBasic does not store the string size so it needs to calculate it every time.
- Optimized: FindString() is up to twice as fast
And this is slower:
Code: Select all
Procedure.i FindStringMems(offset.l, *String, LenStr.i, *Srch, LenSrch.i, Enc.i=#PB_Ascii)
; REV: 150621, skywalk
; + changes by keya: 1) added 'offset', 2) changed from 0 To 1 base As per the other algos (0=not found)
; Return byte location(1-based) of *Srch in *String.
; Useful to search an Ascii string in Unicode app(default as of v5.4+).
Protected.i lenChar
If Enc = #PB_Unicode ; 2 bytes/char
lenChar = 2
Else ; #PB_Ascii = 1 bytes/char, #PB_UTF8 = variable bytes/char
lenChar = 1
EndIf
If *Srch
Protected.i *pos = *String + (offset - 1)
lenStr * lenchar + *String
While *pos <= lenStr
If CompareMemory(*pos, *Srch, lenSrch)
ProcedureReturn *pos - *String + 1
EndIf
*pos + lenChar
Wend
Else
ProcedureReturn 0
EndIf
EndProcedure
Or
Code: Select all
Procedure findMemory(*pString.Character, stringToFind.s)
Protected *pFind.Character = @stringToFind
Protected *start = *pString
If (*pFind\c = 0)
ProcedureReturn 1
EndIf
While *pString\c
If (*pFind\c = *pString\c)
*pFind + SizeOf(Character)
*pString + SizeOf(Character)
If (*pFind\c = 0)
Break
EndIf
Else
*pString + SizeOf(Character)
*pFind = @stringToFind
EndIf
Wend
If (*pFind\c = 0)
ProcedureReturn 1 + (*pString - (*pFind - @stringToFind) - *start) / SizeOf(Character)
Else
ProcedureReturn 0
EndIf
EndProcedure
We need a faster assembler routine

Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 10:55 pm
by STARGĂ…TE
uwekel wrote:It's not the debugger. Without it takes 1900ms and 3300ms. FreeBASIC take around 30ms, Python 34ms.
Can you show the FreeBASIC and Python code?
I can not believe that!
Even if I just run a
ASM loop with 1000000 steps (your string length) it needs 629ms (compared to 2787 with FindString)
Code: Select all
OpenConsole()
t = ElapsedMilliseconds()
For i = 1 To 1000
! MOV rcx, 1000000
Loop:
! DEC rcx
! JNZ l_loop
Next
PrintN(Str(ElapsedMilliseconds()-t))
s.s = Space(1000000) + "!"
t = ElapsedMilliseconds()
For i = 1 To 1000
x = FindString(s, "!")
Next
PrintN(Str(ElapsedMilliseconds()-t))
Input()
Even with 8th-steps (Integer steps) it needs 80ms.
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 11:04 pm
by ccode
I have tested it.
FreeBasic is faster!
Code: Select all
Dim teststring As String
DIM i AS INTEGER
Dim x As Integer
Dim t As Double
t = TIMER
teststring = Space(1000000) + "!"
For i = 0 to 1000
x = InStr(teststring, "!")
next
Print TIMER - t
Sleep
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 11:31 pm
by srod
Freebasic passes strings by reference so only passes the address to a function as opposed to PB in which the function itself makes a copy of the string which has been passed. This will inevitably slow things down.
Re: Why FindString() is so terribly slow?
Posted: Sun Nov 25, 2018 11:45 pm
by ccode
Code: Select all
;No optimized C strings are slow.
OpenConsole("FindString-Test")
s.s = Space(1000000) + "!"
t = ElapsedMilliseconds()
For i = 1 To 1000
x = strcspn_(Ascii(s), @"!")
Next
PrintN(Str(x+1)) ;Index is 0
PrintN(Str(ElapsedMilliseconds() - t))
Input()
CloseConsole()
The function must be designed for optimized C strings.
For example, an optimized string:
s.s{1000001}
FreeBasic-Strings:
Strings of variable length are handled internally via a descriptor (dt: identifier). This identifier contains a pointer to the actual string and the length of the string.
Command reference entry VARPTR returns a pointer to the identifier, while command reference entry STRPTR returns a pointer to the actual string.
At the end of a string, an instruction reference entry CHR (0) is automatically appended. This can be dispensed with the transfer to external functions on time-consuming copying.
The identifier of a variable-length string contains three things: the address of the string, its length, and the space reserved for the string. For this reason, Command Reference entry outputs SIZEOF (STRING) 12 (for a 32-bit compiler) or 24 (for a 64-bit compiler). This value results from a pointer address of 4 or 8 bytes and two instruction reference entries INTEGER, also 4 and 8 bytes, respectively. The reason that the length of the string and the amount of space reserved usually do not match is simply that it does not require costly reservation and copying of the memory each time the string content is changed
https://www.freebasic-portal.de/befehls ... p-412.html
Re: Why FindString() is so terribly slow?
Posted: Mon Nov 26, 2018 6:33 am
by uwekel
As requested, here are the other code examples.
FreeBASIC:
Code: Select all
var s = Space(1000000) + "!"
var t = Timer
for i as integer = 1 to 1000
var x = instr(s, "!")
next
print int((Timer - t) * 1000) & "ms"
Python:
Code: Select all
from time import time
s = ' ' * 1000000 + '!'
t = time()
for i in range(1000):
x = s.find('!')
print(int((time()-t)*1000), 'ms')
Regards
Uwe