Slow string merge in a loop

AZJIO
Addict
Posts: 2223
Joined: Sun May 14, 2017 1:48 am

Post by AZJIO »

Concatenating strings in a loop is a well-known performance problem. I had to wait 11 minutes for 100,000 lines. Writing directly to memory with CopyMemoryString() is almost instantaneous. The problem is memory reallocation. I've read that a string is allocated with a little more memory than it needs, so appending 2 characters doesn't trigger a reallocation.
Is there any way to pre-allocate memory for a variable so that reallocation does not occur?

Current method:

Code: Select all

ForEach StrList()
	Len + Len(StrList())
Next

*Result\s = Space(Len)
*Point = @*Result\s
ForEach StrList()
	CopyMemoryString(StrList(), @*Point)
Next

New way (proposed syntax):

Code: Select all

ForEach StrList()
	Len + Len(StrList())
Next
Option(#String, Result$, Len) ; forcibly set the length of the variable and prevent the variable length from decreasing
ForEach StrList()
	Result$ + StrList()
Next
Option(#String, Result$, 0) ; reset the forced variable length
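
For anyone who wants to run the "current method" on its own, here is a minimal self-contained sketch. The procedure name ConcatList and the sample list are hypothetical and do not come from the thread; the technique is the same one shown above (one Space() allocation, then CopyMemoryString() with a moving pointer).

```purebasic
EnableExplicit

; ConcatList is a hypothetical helper name used only for this sketch.
Procedure.s ConcatList(List Items.s())
  Protected Total, Result$, *Cursor

  ; Pass 1: add up the length (in characters) of every element.
  ForEach Items()
    Total + Len(Items())
  Next

  If Total = 0
    ProcedureReturn ""
  EndIf

  ; Allocate the result once at its final size.
  Result$ = Space(Total)
  *Cursor = @Result$

  ; Pass 2: copy each element to the cursor position.
  ; CopyMemoryString() advances *Cursor to the end of what it wrote.
  ForEach Items()
    If Len(Items()) ; skip empty elements
      CopyMemoryString(Items(), @*Cursor)
    EndIf
  Next

  ProcedureReturn Result$
EndProcedure

NewList Lines.s()
AddElement(Lines()) : Lines() = "alpha"
AddElement(Lines()) : Lines() = "beta"
AddElement(Lines()) : Lines() = "gamma"

Debug ConcatList(Lines()) ; alphabetagamma
```

Wrapping the two passes in a procedure keeps the pointer handling out of the calling code, which is the closest this sketch gets to the "simpler-looking" code the question asks for.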
idle
Always Here
Posts: 6026
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Slow string merge in a loop

Post by idle »

Yes, this is a case where the compiler could be a bit smarter and push the strings onto the stack.
When you append strings inline, like str$ = "a" + "b" + "c", it's actually fast,
but when you do it in a loop it becomes
str = str + "a"
str = str + "b"
str = str + "c"
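
A back-of-the-envelope count shows why the looped form degrades so badly. It assumes (the thread does not state this) that each `str = str + piece` copies the whole accumulated string at least once, and it borrows n = 100,000 lines from the first post and m = 70 characters per line from the test example later in the thread:

```latex
\[
\text{characters copied} \;\approx\; \sum_{k=1}^{n} k\,m \;=\; m\,\frac{n(n+1)}{2}
\]
\[
n = 100\,000,\quad m = 70:\qquad
70 \cdot \frac{100\,000 \cdot 100\,001}{2} = 350\,003\,500\,000 \approx 3.5 \times 10^{11}
\]
```

Under the same assumptions, a single pre-sized buffer copies each character once, about n·m = 7 × 10^6 characters, so the cost grows linearly rather than quadratically with the number of lines.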

See here, where I show how it could be fixed:
https://www.purebasic.fr/english/viewto ... 16#p595816

C backend:

Code: Select all

Global s1.s  
Global s2.s   

s1 = "hello" 
s2 = "world" 

st = ElapsedMilliseconds() 
For a =0 To 10000 
   s1 + s2 
Next   
et = ElapsedMilliseconds() 

st1 = ElapsedMilliseconds()
!SYS_PushStringBasePosition();
For a = 0 To 10000 
  !SYS_CopyString(g_s2);
Next 
!SYS_AllocateString4(&g_s1,SYS_PopStringBasePosition());
et1 = ElapsedMilliseconds() 

out.s = Str(et-st) + " " + Str(et1-st1) 
MessageRequester("test",out) 


RASHAD
PureBasic Expert
Posts: 4991
Joined: Sun Apr 12, 2009 6:27 am

Re: Slow string merge in a loop

Post by RASHAD »

Maybe

Code: Select all

count = ListSize(StrList()) - 1 ; highest array index (assumes the list is not empty)
Dim mText.s(count)

ForEach StrList()
	mText(i) = StrList()
	i+1
Next

For t = 0 To count
   text.s = text.s+mText(t)
Next

ReDim mtext.s(0)
AZJIO
Addict
Posts: 2223
Joined: Sun May 14, 2017 1:48 am

Re: Slow string merge in a loop

Post by AZJIO »

Example for the test

Code: Select all

EnableExplicit

Define Result$
Define *m
Define i, StartTime
Define NewList ListStr.s()

#StrSize = 70 ; string length
#Count = 5000 ; number of lines
*m = AllocateMemory(#StrSize * 2 + 4)
If Not *m
	End
EndIf

RandomSeed(123456789)

Procedure Filling(*c.Character, List ListStr.s())
	Protected i, j, *c0
	*c0 = *c
	For i = 1 To #Count
		*c = *c0
		For j = 1 To #StrSize
			*c\c = Random(122, 65)
			*c + 2
		Next
		AddElement(ListStr())
		ListStr() = PeekS(*c0)
	Next
EndProcedure

Filling(*m, ListStr())

FreeMemory(*m)

; Output 5 strings showing that the strings exist
ResetList(ListStr())
For i = 1 To 5
	NextElement(ListStr())
	Debug ListStr()
Next

DisableDebugger
StartTime = ElapsedMilliseconds()
ForEach ListStr()
	Result$ + ListStr()
Next
StartTime = (ElapsedMilliseconds() - StartTime)
EnableDebugger
Debug FormatNumber(StartTime / 1000, 3, ".", "") ; seconds
Piero
Addict
Posts: 1040
Joined: Sat Apr 29, 2023 6:04 pm
Location: Italy

Re: Slow string merge in a loop

Post by Piero »

I remember you could use this trick

Code: Select all

a$ = (a$ = "") + a$ + "b"

in other languages.
I wonder if it can be done "directly" in PB in some way…

Edit: Actually, it was with lists:

Code: Select all

myList = (myList=[]) + myList + new_item;

But maybe it can be applied to PB strings in some way…
SMaag
Enthusiast
Posts: 327
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: Slow string merge in a loop

Post by SMaag »

"Example for the test"

I tested:
your example: 0.367 s
JoinList: 0.001 s (uses CopyMemoryString)

I switched to console output to make sure it is not a debugger problem. Same result!
Can you confirm this result?

Code: Select all

EnableExplicit

Define Result$
Define *m
Define i, StartTime
Define NewList ListStr.s()

#StrSize = 70 ; string length
#Count = 5000 ; number of lines
*m = AllocateMemory(#StrSize * 2 + 4)
If Not *m
	End
EndIf

RandomSeed(123456789)

Procedure Filling(*c.Character, List ListStr.s())
	Protected i, j, *c0
	*c0 = *c
	For i = 1 To #Count
		*c = *c0
		For j = 1 To #StrSize
			*c\c = Random(122, 65)
			*c + 2
		Next
		AddElement(ListStr())
		ListStr() = PeekS(*c0)
	Next
EndProcedure

Procedure.s JoinList(List lst.s(), Separator$, *IOutLen.Integer=0)
  ; ============================================================================
  ; NAME: JoinList
  ; DESC: Join all ListElements to a single String
  ; VAR(lst.s()) : The String List
  ; VAR(Separator$) : A separator String
  ; VAR(*IOutLen)   : Pointer to an Integer for optional return of the string length
  ; RET.s: the String
  ; ============================================================================
    Protected ret$
    Protected I, L, N, lenSep
    Protected *ptr
    
    ;lenSep = MemoryStringLength(@Separator$)
    lenSep = Len(Separator$)
    
    N = ListSize(lst())
    Debug "ListLength = " + N
    
    If N
      ; ----------------------------------------
      ;  With Separator
      ; ----------------------------------------
      ForEach lst()
        L = L + Len(lst()) 
      Next
      L = L + (N-1) * lenSep
      ret$ = Space(L)
      *ptr = @ret$
            
      If lenSep > 0 
        
        ForEach lst()
          If lst()<>#Null$
            CopyMemoryString(lst(), @*ptr)
          EndIf
          
          I + 1
          If I < N
            CopyMemoryString(Separator$, @*ptr)
          EndIf
        Next
        
      Else          
      ; ----------------------------------------
      ;  Without Separator
      ; ----------------------------------------
        
        ForEach lst()
           If lst()<>#Null$
            CopyMemoryString(lst(), @*ptr)
          EndIf
        Next
    
      EndIf
      
    EndIf
    
    If *IOutLen
      *IOutLen\i = L
    EndIf
    
    ProcedureReturn ret$
  EndProcedure


Filling(*m, ListStr())

FreeMemory(*m)

OpenConsole()

; Output 5 strings showing that the strings exist
ResetList(ListStr())
For i = 1 To 5
	NextElement(ListStr())
	PrintN(ListStr())
Next

DisableDebugger
StartTime = ElapsedMilliseconds()
ForEach ListStr()
  
  Result$ + ListStr()
  ;Result$ = JoinList(ListStr(),"")
Next
StartTime = (ElapsedMilliseconds() - StartTime)
EnableDebugger
; Debug FormatNumber(StartTime / 1000, 3, ".", "") ; seconds
PrintN(FormatNumber(StartTime / 1000, 3, ".", "")) ; seconds
PrintN("")
PrintN("press any key!")
Input()
AZJIO
Addict
Posts: 2223
Joined: Sun May 14, 2017 1:48 am

Re: Slow string merge in a loop

Post by AZJIO »

SMaag,
Does the OpenConsole() function change anything? Disabling the debugger is enough, and it is only turned off around the part being measured.
It's not about the speed of any particular function; this topic has been brought up a hundred times and many options have been suggested. See my first post, where I suggested a simpler way that needs no pointer or structure setup. I already use the fast method, but I wish the code looked simpler (set the size, then reset the size).

1. I added one of the modules (by mk-soft, link)
2. I even have my own function: ListTostring
SMaag
Enthusiast
Posts: 327
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: Slow string merge in a loop

Post by SMaag »

Thanks for the information!
I read it again, and yes, it was a misunderstanding on my side!