Performance of string concatenation
Posted: Sun Sep 18, 2022 7:45 pm
by Oso
I'm struggling with a performance problem when concatenating a single character onto a buffer. The code below takes 40 seconds to run, even though the file I'm reading is only 160 KB. To be clear, the performance problem isn't in the file reading; it's in the concatenation on the line marked with a comment inside the While loop. I need to be able to perform a test on each character. If I remove the line buffer.s = buffer.s + output.s, the code finishes in only 38 ms. Am I doing something wrong? Thanks
Code:
OpenConsole()
#FILENAME = "C:\TEST\TESTFILE.TXT"
; Create a 160 KB test file filled with spaces
If OpenFile(0, #FILENAME, #PB_Ascii)
  PrintN("Creating new file...")
  WriteString(0, Space(163840), #PB_Ascii)
  TruncateFile(0)
  CloseFile(0)
EndIf
PrintN("Processing file for reading...")
If FileSize(#FILENAME) < 0
  PrintN("File error : The file does not exist")
Else
  If OpenFile(0, #FILENAME, #PB_Ascii)
    PrintN(Str(ElapsedMilliseconds())) ; start time
    FileSeek(0, 0)
    While Not(Eof(0))
      inputchr.b = ReadByte(0)
      output.s = Chr(inputchr.b & $FF)
      buffer.s = buffer.s + output.s ; <---- Remove this line; the bottleneck is here
    Wend
    PrintN(Str(ElapsedMilliseconds())) ; end time
    CloseFile(0)
  EndIf
EndIf
Print("Press Enter : ")
Input()
Re: Performance of string concatenation
Posted: Sun Sep 18, 2022 9:03 pm
by mk-soft
Wrong approach ...
Do not read in character by character, but read the entire file into memory and then evaluate it in memory.
If you enlarge the string character by character, the system has to keep reallocating the memory block and copying the existing data over each time.
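As a rough illustration of that advice, here is a minimal sketch that loads the whole file with a single ReadData() call and then tests each byte in memory. The file path, the variable names and the "count the spaces" test are only placeholders (assumptions, not taken from the original post); substitute whatever per-character test is actually needed.

```purebasic
; Sketch only: read the file once, then test each byte in memory.
; #FILENAME is a hypothetical placeholder path.
#FILENAME = "C:\TEST\TESTFILE.TXT"

OpenConsole()
If ReadFile(0, #FILENAME)
  length = Lof(0)                          ; file size in bytes
  If length > 0
    *mem = AllocateMemory(length)
    If *mem
      bytesRead = ReadData(0, *mem, length) ; one read for the whole file
      start = ElapsedMilliseconds()
      spaces = 0
      For i = 0 To bytesRead - 1
        If PeekA(*mem + i) = 32             ; placeholder test: is it a space?
          spaces + 1
        EndIf
      Next
      PrintN("Spaces: " + Str(spaces) + ", elapsed ms: " + Str(ElapsedMilliseconds() - start))
      FreeMemory(*mem)
    EndIf
  EndIf
  CloseFile(0)
Else
  PrintN("Could not open " + #FILENAME)
EndIf
Print("Press Enter : ")
Input()
```

No string is grown inside the loop, so the cost per character stays constant however large the file is.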
P.S.
I don't know why people still put wild folders or files in the main drive "C:\". This can also lead to a fatal error. Either use the user folder or an extra drive for example with the folder name "D:\Testing".
Re: Performance of string concatenation
Posted: Sun Sep 18, 2022 11:42 pm
by AZJIO
See an example of concatenating strings with CopyMemoryString()
viewtopic.php?p=566355#p566355
viewtopic.php?p=584676#p584676
Code:
Len=0
ForEach Files()
  Len + Len(Files())+2
Next
*Result\s = Space(Len)
*Point = @*Result\s
ForEach Files()
  CopyMemoryString(Files()+#CRLF$, @*Point)
Next
ClearList(Files())
The strings are first added to the list and then copied to a pre-allocated memory buffer.
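The fragment above depends on a Files() list and a *Result structure defined elsewhere. A self-contained sketch of the same idea might look like the following; the list contents and the names total, Result$ and *Point are made up for illustration.

```purebasic
; Sketch only: join list elements into one pre-sized string
; using CopyMemoryString(). The file names are hypothetical sample data.
NewList Files.s()
AddElement(Files()) : Files() = "one.txt"
AddElement(Files()) : Files() = "two.txt"
AddElement(Files()) : Files() = "three.txt"

; First pass: work out the final length (+2 for each CR LF pair)
total = 0
ForEach Files()
  total + Len(Files()) + 2
Next

; Allocate the result once, at its final size
Result$ = Space(total)
*Point = @Result$                 ; moving write position inside Result$

; Second pass: copy each element; CopyMemoryString advances *Point itself
ForEach Files()
  CopyMemoryString(Files() + #CRLF$, @*Point)
Next

Debug Result$
```

Because the result is allocated once, each element is copied a single time, rather than the whole accumulated string being copied again on every append.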
Search "Fast concatenating" on the forum
viewtopic.php?t=69079
In your example, it's better to get the file size in order to allocate a buffer of the right size, and then use PokeB
Code:
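If inputchr.b
  PokeB(*MemoryBuffer, inputchr.b) ; write at the current position first
  *MemoryBuffer + 1                ; then advance, so the first byte of the buffer is not skipped
EndIf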
; ...
Text$ = PeekS(*MemoryBufferStart, *MemoryBuffer - *MemoryBufferStart, #PB_Ascii)
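Put together, the approach could be sketched as below: size an output buffer from the file length, poke in only the bytes that pass the test, and convert to a string once at the end. The path, the pointer names (*in, *out, *write) and the "skip NUL bytes" test are assumptions for illustration only.

```purebasic
; Sketch only: filter bytes into a pre-sized buffer, then build the string once.
; #FILENAME is a hypothetical placeholder path.
#FILENAME = "C:\TEST\TESTFILE.TXT"

If ReadFile(0, #FILENAME)
  length = Lof(0)
  If length > 0
    *in  = AllocateMemory(length)
    *out = AllocateMemory(length)          ; the output can never exceed the input
    If *in And *out
      bytesRead = ReadData(0, *in, length)
      *write = *out                        ; moving write pointer
      For i = 0 To bytesRead - 1
        c.a = PeekA(*in + i)
        If c <> 0                          ; placeholder test: skip NUL bytes
          PokeA(*write, c)                 ; store the byte...
          *write + 1                       ; ...then advance
        EndIf
      Next
      ; Length is the number of bytes actually written
      Text$ = PeekS(*out, *write - *out, #PB_Ascii)
      Debug Len(Text$)
    EndIf
    If *in  : FreeMemory(*in)  : EndIf
    If *out : FreeMemory(*out) : EndIf
  EndIf
  CloseFile(0)
EndIf
```

PokeA/PeekA are used here instead of PokeB so that byte values above 127 are not treated as negative numbers; for plain ASCII data either works.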
Re: Performance of string concatenation
Posted: Mon Sep 19, 2022 8:45 pm
by Oso
AZJIO wrote: Sun Sep 18, 2022 11:42 pm
In your example, it's better to get the file size in order to allocate a buffer of the right size, and then use PokeB
Code:
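If inputchr.b
  PokeB(*MemoryBuffer, inputchr.b) ; write at the current position first
  *MemoryBuffer + 1                ; then advance, so the first byte of the buffer is not skipped
EndIf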
Text$ = PeekS(*MemoryBufferStart, *MemoryBuffer - *MemoryBufferStart, #PB_Ascii)
Thanks to you both for the replies. It was my intention to process the file in sections anyway, but I was playing around with ideas initially, which led me to find that string concatenation can be slow.
For your example above @AZJIO do I have the initialisation of the pointers below correct? I wasn't absolutely sure how to do it, but it appears to work.
Code:
mybuffer$ = Space(163840)
*MemoryBuffer = @mybuffer$
*MemoryBufferStart = *MemoryBuffer
I was confused at first by an example here https://www.purebasic.com/documentation ... emory.html under the heading "Pointers and character strings". It gives quite a long example which I don't fully understand. I'm not sure why it requires .String and *Pointer\s plus two separate Text variables, as I was able to define the pointers above more simply.
Code:
Text$ = "Hello"
*Text = @Text$ ; *Text stores the address of the string in memory
*Pointer.String = @*Text ; *Pointer points to *Text
Debug *Pointer\s ; Display the string living at the address stored in *Pointer (i.e. @Text$)
Re: Performance of string concatenation
Posted: Mon Sep 19, 2022 11:07 pm
by HeX0R
You can also try this:
viewtopic.php?p=558277#p558277
Maybe there are even better ways, but since no one knows exactly what you want to achieve, it's difficult to be of any help.
You just said
I need to be able to perform a test on each character.
Testing is one thing, but what should happen to those characters when your test flags something?