Slow string merge in a loop
Posted: Tue Feb 25, 2025 2:10 am
by AZJIO
There is a well-known problem with concatenating strings in a loop. I had to wait 11 minutes for 100,000 lines. Writing directly to memory with CopyMemoryString() is almost instantaneous. The problem is repeated memory reallocation. I've read that slightly more memory is allocated than necessary, so appending 2 characters doesn't trigger a reallocation.
Is there any way to pre-allocate memory for a variable so that reallocation does not occur?
Current method:
Code: Select all
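; Pass 1: add up the total length
ForEach StrList()
  Len + Len(StrList())
Next
; Pass 2: allocate the result once, then copy each string to the write pointer
*Result\s = Space(Len)
*Point = @*Result\s
ForEach StrList()
  CopyMemoryString(StrList(), @*Point) ; *Point advances past the copied text
Next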
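For readers who want to run the fragment above, here is a minimal self-contained sketch of the same two-pass idea. The list contents and the names Total, Result$ and *Pos are made up for illustration, and the check for empty elements is an added precaution rather than part of the original fragment.

```purebasic
EnableExplicit

NewList StrList.s()
Define Total, Result$, *Pos

; Illustrative data only
AddElement(StrList()) : StrList() = "alpha"
AddElement(StrList()) : StrList() = "beta"
AddElement(StrList()) : StrList() = "gamma"

; Pass 1: total length in characters
ForEach StrList()
  Total + Len(StrList())
Next

; Pass 2: allocate the result once, then copy each element to the write position
Result$ = Space(Total)
*Pos = @Result$
ForEach StrList()
  If StrList() <> ""                   ; skip empty elements (added precaution)
    CopyMemoryString(StrList(), @*Pos) ; *Pos is moved to the end of the copied text
  EndIf
Next

Debug Result$
```

The result string is allocated exactly once, so the loop never has to grow it.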
Proposed way (Option() here is a hypothetical function, not an existing one):
Code: Select all
ForEach StrList()
Len + Len(StrList())
Next
Option(#String, Result$, Len) ; forcibly set the length of the variable and prevent the variable length from decreasing
ForEach StrList()
Result$ + StrList()
Next
Option(#String, Result$, 0) ; reset the forced variable length
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 2:56 am
by idle
Yes, this is a case where the compiler could be a bit smarter and push the strings onto the stack.
When you append strings inline, like str$ = "a" + "b" + "c", it's actually fast,
but when you do it in a loop it becomes
str = str + "a"
str = str + "b"
str = str + "c"
See here, where I show how it could be fixed:
https://www.purebasic.fr/english/viewto ... 16#p595816
C backend:
Code: Select all
Global s1.s
Global s2.s
s1 = "hello"
s2 = "world"
; Test 1: ordinary concatenation in a loop
st = ElapsedMilliseconds()
For a = 0 To 10000
  s1 + s2
Next
et = ElapsedMilliseconds()
; Test 2: inline C calling the runtime's string routines directly (C backend only)
st1 = ElapsedMilliseconds()
!SYS_PushStringBasePosition();
For a = 0 To 10000
  !SYS_CopyString(g_s2);
Next
!SYS_AllocateString4(&g_s1,SYS_PopStringBasePosition());
et1 = ElapsedMilliseconds()
; Show both timings in milliseconds
out.s = Str(et-st) + " " + Str(et1-st1)
MessageRequester("test",out)
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 6:08 am
by RASHAD
Maybe
Code: Select all
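count = ListSize(StrList()) - 1 ; last array index (the list must not be empty)
Dim mText.s(count)
; Copy the list into an array
ForEach StrList()
  mText(i) = StrList()
  i+1
Next
; Concatenate from the array
For t = 0 To count
  text.s = text.s+mText(t)
Next
ReDim mText.s(0)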
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 7:33 am
by AZJIO
Example for the test
Code: Select all
EnableExplicit
Define Result$
Define *m
Define i, StartTime
Define NewList ListStr.s()
#StrSize = 70 ; string length
#Count = 5000 ; number of lines
*m = AllocateMemory(#StrSize * 2 + 4)
If Not *m
End
EndIf
RandomSeed(123456789)
Procedure Filling(*c.Character, List ListStr.s())
Protected i, j, *c0
*c0 = *c
For i = 1 To #Count
*c = *c0
For j = 1 To #StrSize
*c\c = Random(122, 65)
*c + 2
Next
AddElement(ListStr())
ListStr() = PeekS(*c0)
Next
EndProcedure
Filling(*m, ListStr())
FreeMemory(*m)
; Output 5 strings showing that the strings exist
ResetList(ListStr())
For i = 1 To 5
NextElement(ListStr())
Debug ListStr()
Next
DisableDebugger
StartTime = ElapsedMilliseconds()
ForEach ListStr()
Result$ + ListStr()
Next
StartTime = (ElapsedMilliseconds() - StartTime)
EnableDebugger
Debug FormatNumber(StartTime / 1000, 3, ".", "") ; seconds
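A rough estimate of why the loop in this test is slow, assuming (the thread does not confirm this for PureBasic's runtime) that every `Result$ + ListStr()` copies the whole accumulated string. With n = #Count = 5000 lines of k = #StrSize = 70 characters, the i-th append handles about i·k characters:

```latex
\sum_{i=1}^{n} i\,k \;=\; k\,\frac{n(n+1)}{2}
\;=\; 70 \cdot \frac{5000 \cdot 5001}{2}
\;=\; 875{,}175{,}000 \ \text{characters copied}
```

A single pass into a pre-allocated buffer copies only n·k = 350,000 characters. If the runtime allocates a little extra on each growth, as mentioned in the first post, the real figure for the loop will be lower than this estimate.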
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 10:23 am
by Piero
I remember you could use this trick in other languages.
I wonder if it can be done "directly" in PB in some way…
Edit:
Actually, it was with lists
Code: Select all
myList = (myList=[]) + myList + new_item;
but maybe it can be applied to PB strings in some way…
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 10:43 am
by SMaag
Example for the test
I tested:
your example: time 0.367 s
JoinList: time 0.001 s (uses CopyMemoryString)
I changed to console output to be sure it is not a debugger problem. Same result!
Can you confirm this result?
Code: Select all
EnableExplicit
Define Result$
Define *m
Define i, StartTime
Define NewList ListStr.s()
#StrSize = 70 ; string length
#Count = 5000 ; number of lines
*m = AllocateMemory(#StrSize * 2 + 4)
If Not *m
End
EndIf
RandomSeed(123456789)
Procedure Filling(*c.Character, List ListStr.s())
Protected i, j, *c0
*c0 = *c
For i = 1 To #Count
*c = *c0
For j = 1 To #StrSize
*c\c = Random(122, 65)
*c + 2
Next
AddElement(ListStr())
ListStr() = PeekS(*c0)
Next
EndProcedure
Procedure.s JoinList(List lst.s(), Separator$, *IOutLen.Integer=0)
; ============================================================================
; NAME: JoinList
; DESC: Join all ListElements to a single String
; VAR(lst.s()) : The String List
; VAR(Separator$) : A separator String
; VAR(*IOutLen) : Pointer to an Integer variable for optionally returning the string length
; RET.s: the String
; ============================================================================
Protected ret$
Protected I, L, N, lenSep
Protected *ptr
;lenSep = MemoryStringLength(@Separator$)
lenSep = Len(Separator$)
N = ListSize(lst())
Debug "ListLength = " + N
If N
; ----------------------------------------
; Total length (strings plus separators), then allocate once
; ----------------------------------------
ForEach lst()
L = L + Len(lst())
Next
L = L + (N-1) * lenSep
ret$ = Space(L)
*ptr = @ret$
If lenSep > 0
ForEach lst()
If lst()<>#Null$
CopyMemoryString(lst(), @*ptr)
EndIf
I + 1
If I < N
CopyMemoryString(Separator$, @*ptr)
EndIf
Next
Else
; ----------------------------------------
; Without Separator
; ----------------------------------------
ForEach lst()
If lst()<>#Null$
CopyMemoryString(lst(), @*ptr)
EndIf
Next
EndIf
EndIf
If *IOutLen
*IOutLen\i = L
EndIf
ProcedureReturn ret$
EndProcedure
Filling(*m, ListStr())
FreeMemory(*m)
OpenConsole()
; Output 5 strings showing that the strings exist
ResetList(ListStr())
For i = 1 To 5
NextElement(ListStr())
PrintN(ListStr())
Next
DisableDebugger
StartTime = ElapsedMilliseconds()
ForEach ListStr()
Result$ + ListStr()
;Result$ = JoinList(ListStr(),"")
Next
StartTime = (ElapsedMilliseconds() - StartTime)
EnableDebugger
; Debug FormatNumber(StartTime / 1000, 3, ".", "") ; seconds
PrintN(FormatNumber(StartTime / 1000, 3, ".", "")) ; seconds
PrintN("")
PrintN("press any key!")
Input()
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 2:08 pm
by AZJIO
SMaag
Does the OpenConsole() function change anything? Disabling the debugger does the job. The debugger is only turned off where the measurement is taken.
This isn't about the speed of any particular function; this topic has been brought up 100 times and many options have been suggested. See my first post, where I suggested a simplified way that needs no pointer or structure setup. I'm using the fast method, but I wish the code looked simpler (set the size, then reset the size).
1. I added one of the modules (by mk-soft, link)
2. I even have my own function: ListTostring
Re: Slow string merge in a loop
Posted: Tue Feb 25, 2025 3:39 pm
by SMaag
Thanks for the information!
I read it again, and yes, it was a misunderstanding on my side!