PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 1:36 pm
by ChrisR
With the new string management, where strings are now prefixed by their length,
I'm trying to understand why this modified snippet from ExString (quite close to the new string management) is still faster for concatenation.
Do you have an explanation?

Code:

;modified snippet from Exstring to try to be close to the new string management with strings that are now prefixed by their length

;- Use a big cache (#PacketSize) to not ReAllocate the memory at each concat, or #False to Always ReAllocate the memory at each concat with the required Size
#UseCache = #False  ;#True/#False 
#PacketSize = 4096  ;32

Structure StString
  Len.i
  *Address
EndStructure

Macro ProcedureReturnIf(Cond, ReturnVal = 0)
  If Cond : ProcedureReturn ReturnVal : EndIf
EndMacro

CompilerIf #UseCache
  
  Procedure CheckResizeMemory(*THIS.StString, Length)
    Protected *Add, MemSize, Result = #True
    
    MemSize = MemorySize(*THIS\Address)
    While Length + (*THIS\Len * SizeOf(Character)) + SizeOf(Character) > MemSize
      MemSize + #PacketSize
      *Add = ReAllocateMemory(*THIS\Address, MemSize, #PB_Memory_NoClear)
      If *Add
        If *Add <> *THIS\Address
          *THIS\Address = *Add
        EndIf
      Else
        Result = #False
        Break
      EndIf
    Wend
    
    ProcedureReturn Result
  EndProcedure
  
CompilerElse
  
  Procedure ResizeMemory(*THIS.StString, Length)
    Protected *Add, Result = #True
    
    *Add = ReAllocateMemory(*THIS\Address, Length + (*THIS\Len * SizeOf(Character)) + SizeOf(Character), #PB_Memory_NoClear)
    If *Add
      If *Add <> *THIS\Address
        *THIS\Address = *Add
      EndIf
    Else
      MessageRequester("ReAllocateMemory Error", "ReAllocateMemory Error" +#CRLF$+ "Help: it is usually a result of a memory corruption at an earlier time in the program by writing at an area outside of the allocated memory area", #PB_MessageRequester_Error)
      Result = #False
    EndIf
    
    ProcedureReturn Result
  EndProcedure
  
CompilerEndIf

Procedure NewString()
  Protected *Buffer, Result
  
  *Buffer = AllocateMemory(1, #PB_Memory_NoClear)
  If *Buffer
    Protected *THIS.StString
    *THIS = AllocateStructure(StString)
    *THIS\Address     = *Buffer
    Result           = *THIS
  EndIf
  
  ProcedureReturn Result
EndProcedure

Procedure AddString(*THIS.StString, String$)
  ProcedureReturnIf(*THIS\Address = 0)
  Protected Length = StringByteLength(String$)
  ProcedureReturnIf(Length < 1)
  Protected Pointer, Result = #True
  
  ;ShowMemoryViewer(*THIS\Address, Length + (*THIS\Len * SizeOf(Character)))
  CompilerIf #UseCache
    Result = CheckResizeMemory(*THIS, Length)
  CompilerElse
    Result = ResizeMemory(*THIS, Length)
  CompilerEndIf
  If Result
    Pointer = *THIS\Address + (*THIS\Len * SizeOf(Character))
    CopyMemoryString(@String$, @Pointer)
    *THIS\Len + (Length / SizeOf(Character))
  EndIf
  
  ProcedureReturn Result
EndProcedure

Procedure LenString(*THIS.StString)
  ProcedureReturnIf(*THIS\Address = 0)
  
  ProcedureReturn *THIS\Len
EndProcedure

Procedure.s GetString(*THIS.StString)
  If *THIS\Address = 0 : ProcedureReturn : EndIf
  
  ProcedureReturn PeekS(*THIS\Address, *THIS\Len)
EndProcedure

Procedure.s RightString(*THIS.StString, Len)
  If *THIS\Address = 0 Or Len < 1 : ProcedureReturn : EndIf
  
  If Len > *THIS\Len
    Len = *THIS\Len
  EndIf
  
  ProcedureReturn PeekS(*THIS\Address + ((*THIS\Len - Len) * SizeOf(Character)), Len)
EndProcedure

Procedure FreeString(*THIS.StString)
  ProcedureReturnIf(*THIS\Address = 0)
  
  FreeMemory(*THIS\Address)
  FreeStructure(*THIS)  ; also release the structure from AllocateStructure() in NewString(); *THIS is invalid afterwards
EndProcedure

;-----------  Main  ----------
CompilerIf #PB_Compiler_Debugger
  CompilerError "Compile me without Debugger!!!"
CompilerEndIf

a$ = ReplaceString(Space(20), " ", " Hello World!")   ; Len=13*20=260 - Final Len=260*2000=520000

OpenConsole("(Ex)String concatenation & right time:")
;- String concatenation time
PrintN("String concatenation & right time:")
Start = ElapsedMilliseconds()
For l = 1 To 2000
  b$ = b$ + a$ 
Next
PrintN("  Final length = " + Str(Len(b$)))
PrintN("  Large concat: "+Str(ElapsedMilliseconds() - Start) + " ms")

Start = ElapsedMilliseconds()
For l = 1 To 500000
  c$ = Right(b$, 6) 
Next
PrintN("  Right String (" + c$ + "): " +Str(ElapsedMilliseconds() - Start) + " ms")
PrintN("")

;- ExString concatenation time
PrintN("ExString concatenation & right time:")
Start = ElapsedMilliseconds()
ThisString = NewString()
For l = 1 To 2000
  AddString(ThisString, a$)
Next
PrintN("  Final length = " + Str(LenString(ThisString)))
PrintN("  Large concat: "+Str(ElapsedMilliseconds() - Start) + " ms")

Start = ElapsedMilliseconds()
For l = 1 To 500000
  c$ = RightString(ThisString, 6) 
Next
PrintN("  Right String (" + c$ + "): " +Str(ElapsedMilliseconds() - Start) + " ms")

FreeString(ThisString)
PrintN("")
PrintN("Press the Return key to close")
Input()

Output:

String concatenation & right time:
  Final length = 520000
  Large concat: 31 ms
  Right String (World!): 4 ms

ExString concatenation & right time:
  Final length = 520000
  Large concat: 2 ms
  Right String (World!): 4 ms

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 1:53 pm
by Fred
It's a stringbuilder: you have a big cache (#PacketSize), so it doesn't reallocate at each concat. PureBasic can't do that internally (reserving 8k for each string), as it would result in wasted memory. I will probably add a stringbuilder library at some point, which will do just the same.

You can compare with 6.30 to see the improvement for internal concat.
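
The effect of the cache can be estimated with a small simulation of the growth test in CheckResizeMemory() from the first post. This is only a sketch: CountReallocs() is a hypothetical helper, it allocates nothing, and it assumes a Unicode build where SizeOf(Character) is 2. With the benchmark's values (2000 concats of 260 characters, #PacketSize = 4096) it should count 254 buffer growths, against 2000 reallocations when the buffer is resized at every concat.

```purebasic
EnableExplicit

#PacketSize = 4096

; Hypothetical helper: counts how often the cached CheckResizeMemory()
; from the first post would have to grow the buffer. No memory is allocated.
Procedure CountReallocs(Concats, AddBytes)
  Protected lenChars, memSize = 1, reallocs, i  ; NewString() starts with a 1-byte buffer

  For i = 1 To Concats
    ; Same condition as in CheckResizeMemory(): grow in #PacketSize steps until the data fits
    While AddBytes + (lenChars * SizeOf(Character)) + SizeOf(Character) > memSize
      memSize + #PacketSize
      reallocs + 1
    Wend
    lenChars + (AddBytes / SizeOf(Character))
  Next

  ProcedureReturn reallocs
EndProcedure

OpenConsole()
PrintN("Reallocations with #PacketSize cache: " + Str(CountReallocs(2000, 260 * SizeOf(Character))))
PrintN("Reallocations without cache:          2000 (one per concat)")
```

The price is the unused tail of the last packet, which is the wasted memory mentioned above when the same reservation is applied to every string in a program.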

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 2:45 pm
by ChrisR
Thanks for the quick reply, Fred.
It would indeed be nice to have a native stringbuilder library 8)
I understand that the big cache in the StringBuilder avoids reallocating at each concat.

But I just tried changing the procedure CheckResizeMemory() to allocate only the required size, so the memory is reallocated at each concat.
StringBuilder is slightly slower, but it is still faster than the new string management, so I still need a little more to understand why.

Code:

Procedure CheckResizeMemory(*THIS.StString, Length)
  Protected *Add, Result = #True

  *Add = ReAllocateMemory(*THIS\Address, Length + (*THIS\Len * SizeOf(Character)) + SizeOf(Character), #PB_Memory_NoClear)
  If *Add
    If *Add <> *THIS\Address
      *THIS\Address = *Add
    EndIf
  Else
    Result = #False
  EndIf
  
  ProcedureReturn Result
EndProcedure

Large concat: 4 ms (ExString, reallocating at each concat) vs 34 ms (new string management)

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 3:21 pm
by Fred
PB currently always does an AllocateMemory(); it doesn't use ReAlloc(). I will give it a try to see if I can change that.
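
The two strategies can be compared in user code. The sketch below is only an illustration of allocate-copy-free versus ReAllocateMemory(), not PureBasic's internal string code; AppendByAllocate() and AppendByReAllocate() are hypothetical helper names and the loop count is arbitrary. A reallocation can often extend the existing block in place, which avoids copying the whole content at every concat.

```purebasic
EnableExplicit

; Hypothetical helpers for illustration; not PureBasic's internal implementation.
; Error handling is minimal: on failure both return 0.

; Strategy 1: allocate a new block, copy the old content, free the old block.
Procedure AppendByAllocate(*Old, OldBytes, *Src, SrcBytes)
  Protected *New = AllocateMemory(OldBytes + SrcBytes, #PB_Memory_NoClear)
  If *New
    If *Old
      If OldBytes > 0 : CopyMemory(*Old, *New, OldBytes) : EndIf
      FreeMemory(*Old)
    EndIf
    CopyMemory(*Src, *New + OldBytes, SrcBytes)
  EndIf
  ProcedureReturn *New
EndProcedure

; Strategy 2: let the allocator resize the existing block, which may avoid the copy.
Procedure AppendByReAllocate(*Old, OldBytes, *Src, SrcBytes)
  Protected *New
  If *Old
    *New = ReAllocateMemory(*Old, OldBytes + SrcBytes, #PB_Memory_NoClear)
  Else
    *New = AllocateMemory(SrcBytes, #PB_Memory_NoClear)
  EndIf
  If *New
    CopyMemory(*Src, *New + OldBytes, SrcBytes)
  EndIf
  ProcedureReturn *New
EndProcedure

Define a$ = "Hello World!"
Define bytes = StringByteLength(a$)
Define *buf1, *buf2, used, i, start

OpenConsole()

start = ElapsedMilliseconds()
For i = 1 To 10000
  *buf1 = AppendByAllocate(*buf1, used, @a$, bytes)
  used + bytes
Next
PrintN("Allocate + copy + free: " + Str(ElapsedMilliseconds() - start) + " ms")

used = 0
start = ElapsedMilliseconds()
For i = 1 To 10000
  *buf2 = AppendByReAllocate(*buf2, used, @a$, bytes)
  used + bytes
Next
PrintN("ReAllocateMemory:       " + Str(ElapsedMilliseconds() - start) + " ms")

If *buf1 : FreeMemory(*buf1) : EndIf
If *buf2 : FreeMemory(*buf2) : EndIf
```

Compile without the debugger, as in the first post. The timings depend on the memory allocator of the operating system, so the gap can differ from the figures reported in this thread.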

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 3:36 pm
by ChrisR
Thanks :)

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 4:24 pm
by skywalk
Thanks for checking this!

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Tue Jan 27, 2026 5:06 pm
by ChrisR
Thank you, but all credit goes to Fred for this significant change and for responding and looking into this topic.
For my part, I use ExString a lot for concatenation when generating code in my app.
And when I was trying to understand what “All strings are now prefixed by their length” meant, I thought ExString was headed for the trash.

Otherwise, I can't wait to understand how it works and why we need to reset the string length when texts are modified directly via APIs.

I updated the code in the 1st post with the constant: #UseCache = #True ;#False, to use a cache or to always reallocate the memory to the required size.

Re: PB 6.40 Alpha 1 - new String management speed vs ExString

Posted: Thu Jan 29, 2026 10:27 pm
by idle
I tested UCase and LCase against the UTF16 module. On 6.40, strUCase / strLCase are about 4 times faster in place and 3 times faster on copy, versus about 2 times faster on copy on 6.30. Definitely an improvement, though the speed of the UTF16 functions is really due to them working in place, so if you don't want them to change the string you need to copy it first.

Code:

Procedure StrUCase_(*in.Unicode)  ; changes the case of the string in place
  Protected *char.Unicode
  *char = *in
  While *char\u
    *char\u = casemappingUC(*char\u)
    *char + 2
  Wend
EndProcedure

Timings (each for 1,000,000):

Version  Mode      strLCase / strUCase  LCase / UCase
6.40     In place  53 ms                216 ms
6.40     Copy      71 ms                216 ms
6.30     In place  49 ms                348 ms
6.30     Copy      142 ms               343 ms
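
The in-place versus copy trade-off can be sketched as follows. UpperInPlace() is a hypothetical, ASCII-only stand-in for the module's StrUCase_() (casemappingUC is not reproduced here), and the copy is made into a separately allocated buffer so that the original string is left untouched.

```purebasic
EnableExplicit

; Hypothetical ASCII-only stand-in for StrUCase_(); handles a-z only.
Procedure UpperInPlace(*in.Unicode)
  While *in\u
    If *in\u >= 'a' And *in\u <= 'z'
      *in\u = *in\u - 32          ; 'a' - 'A' = 32 in ASCII
    EndIf
    *in + SizeOf(Unicode)         ; advance to the next UTF-16 code unit
  Wend
EndProcedure

Define original$ = "hello world"

; Copy first: work on a private buffer so original$ stays unchanged.
Define *copy = AllocateMemory(StringByteLength(original$) + SizeOf(Character))
If *copy
  PokeS(*copy, original$)         ; writes the characters plus the terminating null
  UpperInPlace(*copy)
  OpenConsole()
  PrintN(original$ + " -> " + PeekS(*copy))
  FreeMemory(*copy)
EndIf
```

Calling UpperInPlace(@original$) instead would skip the allocation and the copy, at the cost of overwriting original$.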