
Re: String concatenation performance

Posted: Sat Jul 25, 2020 4:32 pm
by wilbert
cas wrote: Replace std::string with std::wstring
Is it possible for you to see what assembly code is generated?
It would be nice to see how it is translated by the C compiler.

Re: String concatenation performance

Posted: Sat Jul 25, 2020 4:36 pm
by cas
Copy/paste it into Compiler Explorer: https://godbolt.org/
Edit: you can also test the code on many online C++ compilers. For example, this one gives me good numbers (about 15 ms vs 8 ms on my local PC): http://cpp.sh/

Re: String concatenation performance

Posted: Sat Jul 25, 2020 5:04 pm
by wilbert
cas wrote: Copy/paste it into Compiler Explorer: https://godbolt.org/
Edit: you can also test the code on many online C++ compilers. For example, this one gives me good numbers (about 15 ms vs 8 ms on my local PC): http://cpp.sh/
Thanks. I didn't know of that website. :)
I'm not familiar with C++, but it seems strings are objects that store their length. That explains a large part of the speed.
It also seems to reallocate memory, which helps a lot: especially with smaller strings, there's a reasonable chance the block can be grown in place without allocating a new block and copying the existing string data.

Re: String concatenation performance

Posted: Sat Jul 25, 2020 9:03 pm
by cas
Yes. The C++ string class, translated to PB and simplified, would look something like this:

Code:

DisableDebugger
EnableExplicit

DeclareModule OptimizedString
  DisableDebugger
  EnableExplicit
  
  Structure str
    *start
    *end
    capacity.i
    ;sso.c[16] ;TODO: small string optimization
  EndStructure
  
  Declare Append(*s.str,*chars,nChars=-1)
  Declare Reserve(*s.str,additionalChars)
  Declare Length(*s.str)
  Declare Pointer(*s.str)
  Declare.s NativeString(*s.str)
EndDeclareModule

Module OptimizedString
  
  Procedure Append(*s.str,*chars,nChars=-1)
    If nChars=-1
      nChars=MemoryStringLength(*chars)
    EndIf
    If nChars>0
      Reserve(*s,nChars)
      CopyMemory(*chars,*s\end,nChars*SizeOf(Character))
      *s\end+(nChars*SizeOf(Character))
      PokeC(*s\end,0)
    EndIf
  EndProcedure
  
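  Procedure Reserve(*s.str,additionalChars)
    ; Grow the buffer only when the free space is too small for the new characters.
    Protected len=Length(*s)
    Protected free=*s\capacity-len
    If free<additionalChars
      ; Roughly double the capacity so that repeated appends rarely reallocate.
      *s\capacity=(*s\capacity+(additionalChars-free))*2
      ; +1 leaves room for the terminating null character written by Append().
      *s\start=ReAllocateMemory(*s\start,(*s\capacity+1)*SizeOf(Character),#PB_Memory_NoClear)
      ; The block may have moved, so recompute the end pointer from the saved length.
      *s\end=*s\start+(len*SizeOf(Character))
    EndIf
  EndProcedure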
  
  Procedure Length(*s.str)
    ProcedureReturn (*s\end-*s\start)/SizeOf(Character)
  EndProcedure
  
  Procedure Pointer(*s.str)
    ProcedureReturn *s\start
  EndProcedure
  
  Procedure.s NativeString(*s.str)
    ProcedureReturn PeekS(Pointer(*s),Length(*s))
  EndProcedure
  
EndModule


DisableExplicit

#N_REPEATS=10000

b.s = "..."

t1 = ElapsedMilliseconds()

a.s = "hello"

CompilerIf #N_REPEATS=<10000 ;do not test if loop is over 10k
For i = 1 To #N_REPEATS
  a + b
Next
CompilerEndIf
t2 = ElapsedMilliseconds()

ostr.OptimizedString::str
OptimizedString::Append(@ostr,@"hello")

For i = 1 To #N_REPEATS
  OptimizedString::Append(@ostr,@b)
Next

optimized_a.s=OptimizedString::NativeString(@ostr)

t3 = ElapsedMilliseconds()

t1s.s="<skipped>"
If Len(a)>5
  t1s.s=Str(t2-t1)+"ms"
  If optimized_a<>a
    MessageRequester("ERROR","results should be same")
  EndIf
EndIf

MessageRequester("Timings", t1s.s+" vs "+Str(t3-t2)+"ms")
10k: 0ms
100k: 4ms
1M: 38ms

Re: String concatenation performance

Posted: Sat Jul 25, 2020 9:16 pm
by cas
mk-soft wrote:With own structured memory strings works faster.
Need ca 4ms for 10000 loops and 33ms for 100000 loops. CPU Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
You are still measuring with the debugger enabled inside all procedures; you must disable the debugger at the top of your source file.

Re: String concatenation performance

Posted: Sat Jul 25, 2020 9:42 pm
by mk-soft
So I noticed.
100000 loops with FastString

macOS (host system)
Timings: 34481 vs 24511 vs 325 vs 11

Windows 7 (virtual machine)
Timings: 31359 vs 23673 vs 294 vs 8

Ubuntu 18.04 (virtual machine)
Timings: 9076 vs 12284 vs 107 vs 3

Re: String concatenation performance

Posted: Sun Jul 26, 2020 4:34 am
by Rinzwind
mk-soft wrote: That here a VB script is faster than Purebasic is very sad.
That's even ignoring the fact that VBScript's implementation doesn't optimize concatenation the way JScript's does (one can work around that somewhat with an array and a final Join).

VBScript

Code:

Option Explicit

Dim i, a, b, t1, t2

Function FormatMS(value)
	FormatMS = FormatNumber(value * 1000, 0, 0, 0, 0)
End Function


a = "Hello"
b = "..."
t1 = Timer
For i = 1 To 50000
	a = a & b
Next
t2 = Timer
WScript.Echo FormatMS(t2 - t1)
'WScript.Echo a
406 ms

JScript

Code:

var i, a, b, t1, t2;
a = "Hello";
b = "...";
t1 = new Date();
for (i = 0; i < 50000; i++) {
 a = a + b;
}
t2 = new Date();
WScript.Echo(t2 - t1)
//WScript.Echo(a);
17 ms

JavaScript in a browser is even faster (around 3..9 ms).

PB

Code:

EnableExplicit

DisableDebugger

Define a.s, b.s, i, t1, t2
a = "Hello"
b = "..."

t1 = ElapsedMilliseconds()
For i = 1 To 50000
  a + b  
Next
t2 = ElapsedMilliseconds()
MessageRequester("", Str(t2 - t1))

9964 ms

FastString
28..34 ms

OptimizedString
2..4 ms

My own solution

Code:

Procedure.s ListToString(List StringList.s(), Delimiter.s = " ")
  Protected String.s, l, c = ListSize(StringList()), *p, i
  
  If c = 0
    ProcedureReturn ""
  EndIf
  ForEach StringList()
    l + Len(StringList())
  Next
  String = Space(l + Len(Delimiter) * c)
  *p = @String
  ResetList(StringList())
  NextElement(StringList())
  CopyMemoryString(StringList(), @*p)
  While NextElement(StringList())
    CopyMemoryString(@Delimiter)
    CopyMemoryString(StringList())
  Wend
  ProcedureReturn String
EndProcedure
8..16 ms

Which just means PB's native implementation could really use optimization (even if this artificial example is far from real-world use).

Re: String concatenation performance

Posted: Sun Jul 26, 2020 11:47 am
by mk-soft
@Rinzwind
FastString for 50000 loops: ca. 4 ms

Interesting that JScript is so fast even under the module ActiveScript
---------------------------
Timings of loops 50000
---------------------------
Time: PB 7983ms / VBS 474ms / JScript 14ms

Len VBS = 150005
Len JScript = 150005
---------------------------
OK
---------------------------
Update

Code:

;-TOP

; Comment   : Modul ActiveScript Example 14
; Version   : v2.09

; Link to ActiveScript  : https://www.purebasic.fr/english/viewtopic.php?f=12&t=71399
; Link to SmartTags     : https://www.purebasic.fr/english/viewtopic.php?f=12&t=71399#p527089
; Link to VariantHelper : https://www.purebasic.fr/english/viewtopic.php?f=12&t=71399#p527090

; ***************************************************************************************

XIncludeFile "Modul_ActiveScript.pb"
;XIncludeFile "Modul_SmartTags.pb"
;XIncludeFile "VariantHelper.pb"

UseModule ActiveScript
;UseModule ActiveSmartTags

; -------------------------------------------------------------------------------------

Procedure.s GetDataSectionText(*Addr.Character)
  Protected result.s, temp.s
  While *Addr\c <> #ETX
    temp = PeekS(*Addr)
    *Addr + StringByteLength(temp) + SizeOf(Character)
    result + temp + #LF$
  Wend
  ProcedureReturn result
EndProcedure

; -------------------------------------------------------------------------------------

Global script.s, sValue1.s, sValue2.s

Runtime sValue1, sValue2

; -------------------------------------------------------------------------------------
;-TOP

DisableDebugger

Define loops = 50000
Runtime loops

b.s = "..."
a.s = "hello"

t1 = ElapsedMilliseconds()
For i = 1 To loops
  a + b
Next
t2 = ElapsedMilliseconds()

*Control = NewActiveScript()
If *Control
  Debug "*** Parse ScriptText ***"
  
  script = GetDataSectionText(?vbs)
  
  t3 = ElapsedMilliseconds()
  r1 = ParseScriptText(*Control, script)
  If r1 = #S_OK
    Debug "Code Ready 1."
  EndIf
  t4 = ElapsedMilliseconds()
  
  
  Debug "*** Free ActiveScript ***"
  FreeActiveScript(*Control)
  
  Debug "************************************************************"
EndIf

*Control = NewActiveScript("JScript")
If *Control
  Debug "*** Parse ScriptText ***"
  
  script = GetDataSectionText(?JScript)
  
  t5 = ElapsedMilliseconds()
  r1 = ParseScriptText(*Control, script)
  If r1 = #S_OK
    Debug "Code Ready 1."
  EndIf
  t6 = ElapsedMilliseconds()
  
  Debug "*** Free ActiveScript ***"
  FreeActiveScript(*Control)
  
  Debug "************************************************************"
EndIf

info.s = "Time: PB " + Str(t2-t1) + "ms / VBS " + Str(t4-t3) + "ms / JScript " + Str(t6-t5) + "ms"
info.s + #LF$ + #LF$ + "Len VBS = " + Len(sValue1) + #LF$ + "Len JScript = " + Len(sValue2)
MessageRequester("Timings of loops " + loops, info)

; -------------------------------------------------------------------------------------

DataSection
  vbs:
  Data.s ~"On Error Resume Next"
  Data.s ~""
  Data.s ~"Dim loops, i, a, b"
  Data.s ~""
  Data.s ~"a = \"Hello\""
  Data.s ~"b = \"...\""
  Data.s ~"loops = Runtime.Integer(\"loops\")"
  Data.s ~""
  Data.s ~"For i = 1 to loops"
  Data.s ~" a = a + b"
  Data.s ~"Next"
  Data.s ~"Runtime.String(\"sValue1\") = a"
  Data.s ~""
  Data.s #ETX$
  Data.i 0
  jscript:
  Data.s ~"var i, a, b, loops;"
  Data.s ~"a = \"Hello\";"
  Data.s ~"b = \"...\";"
  Data.s ~"loops = Runtime.Integer(\"loops\");"
  Data.s ~"for (i = 0; i < loops; i++) {"
  Data.s ~" a = a + b;"
  Data.s ~"}"
  Data.s ~"Runtime.String(\"sValue2\") = a;"
  Data.s #ETX$
  Data.i 0
EndDataSection

Re: String concatenation performance

Posted: Wed Oct 13, 2021 9:49 am
by Rinzwind
P.S. The C backend doesn't improve anything; same results (no surprise there).