
Lack of "fast strings"

Posted: Mon Feb 13, 2023 9:59 pm
by marcoagpinto
Hello!

This has been my battle for years.

My app, Proofing Tool GUI, is a linguistic tool, and thus it uses tons of strings and is as slow as hell.

For years, people have been complaining that PureBasic looks for the null byte in strings instead of storing the size in bytes.
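
To make the difference between searching for the null byte and storing the size concrete, here is a minimal C sketch. The `sized_str` type and its functions are hypothetical names made up for this illustration; they are not PureBasic's internal representation or any real library's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a string that stores its length next to its bytes. */
typedef struct {
    size_t len;   /* number of bytes, not counting the terminator   */
    char  *data;  /* still null-terminated so C functions can read it */
} sized_str;

/* Build a sized_str from a C string; the only full scan happens here. */
sized_str sized_str_from(const char *s) {
    sized_str out;
    out.len = strlen(s);
    out.data = malloc(out.len + 1);
    assert(out.data != NULL);
    memcpy(out.data, s, out.len + 1);
    return out;
}

/* Constant time: read the stored length instead of looking for '\0'. */
size_t sized_str_len(const sized_str *s) {
    return s->len;
}

/* Append src to dst; the write position is already known from dst->len. */
void sized_str_append(sized_str *dst, const sized_str *src) {
    assert(dst != src);  /* appending a string to itself is not supported here */
    char *grown = realloc(dst->data, dst->len + src->len + 1);
    assert(grown != NULL);
    memcpy(grown + dst->len, src->data, src->len + 1);
    dst->data = grown;
    dst->len += src->len;
}

/* Release the storage and reset the fields. */
void sized_str_free(sized_str *s) {
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

With a plain null-terminated string, every length query or append has to walk the string to find its end first; with the stored length, that step disappears.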

Isn't there an add-on or plugin for GCC that could be activated with a setting in PB and increase the speed of strings dozens of times?

That idea of creating a blah blah blah fixer myself is not practicable. It is similar to when Ubuntu 22.04 LTS came out and PB didn't support it and a user wrote dozens or hundreds of lines of code to fix the windows style back then… this is simply impracticable. It either comes built-in or no one will do it.

Thanks!

// No Bug. Moved from "Bugs - C backend" to "Feature Requests and Wishlists" (Kiffi)

Re: Lack of "fast strings"

Posted: Mon Feb 13, 2023 10:17 pm
by Tenaja
You are a little late to this complaint! (It's not new.)

This is a great c library, and it's not even 400 lines of code. I'm sure it would do it, and convert rather quickly.
https://github.com/spitstec/milkstrings.

The only slow aspect of PB's strings is continuously adding strings together. Restructure your code so there's spaces buffered at the end, and your code will be faster.
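
One way to read the "space buffered at the end" advice is a buffer that keeps spare capacity, so that most appends only copy the new piece. The following is a minimal C sketch under that assumption; `str_builder`, `sb_init`, `sb_append` and `sb_free` are hypothetical names for this illustration, not functions from PureBasic or from the library linked above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical builder: the text plus spare room at the end. */
typedef struct {
    char  *buf;
    size_t len;  /* bytes used, not counting the terminator */
    size_t cap;  /* bytes allocated, including the terminator */
} str_builder;

/* Reserve room for `reserve` bytes of text up front. */
void sb_init(str_builder *sb, size_t reserve) {
    sb->cap = reserve + 1;
    sb->len = 0;
    sb->buf = malloc(sb->cap);
    assert(sb->buf != NULL);
    sb->buf[0] = '\0';
}

/* Append s; only s itself is copied unless the buffer has to grow. */
void sb_append(str_builder *sb, const char *s) {
    size_t n = strlen(s);
    if (sb->len + n + 1 > sb->cap) {
        /* Grow geometrically so that reallocations stay rare. */
        size_t new_cap = sb->cap * 2;
        if (new_cap < sb->len + n + 1) {
            new_cap = sb->len + n + 1;
        }
        char *grown = realloc(sb->buf, new_cap);
        assert(grown != NULL);
        sb->buf = grown;
        sb->cap = new_cap;
    }
    memcpy(sb->buf + sb->len, s, n + 1);  /* includes the terminator */
    sb->len += n;
}

/* Release the buffer and reset the fields. */
void sb_free(str_builder *sb) {
    free(sb->buf);
    sb->buf = NULL;
    sb->len = 0;
    sb->cap = 0;
}
```

Typical use would be `sb_init(&sb, 1024)`, a loop of `sb_append(&sb, piece)` calls, and a final read of `sb.buf`.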

Oh, yeah... Use pointers instead of passing strings to procedures. Doing the latter forces it to copy the string to a new memory location every time.

BTW, this isn't a bug. Managed strings are for people who want simplicity, not speed.

Re: Lack of "fast strings"

Posted: Mon Feb 13, 2023 10:20 pm
by BarryG
I agree with Marco that it should be built-in. Like him, I don't want to change all my strings to use pointers and do other hacks to make them fast.

Re: Lack of "fast strings"

Posted: Mon Feb 13, 2023 11:43 pm
by Bitblazer
Tenaja wrote: Mon Feb 13, 2023 10:17 pm This is a great c library, and it's not even 400 lines of code. I'm sure it would do it, and convert rather quickly.
https://github.com/spitstec/milkstrings.
// Version : 0.0.1
// Description : Easy strings in c limited length and lifetime
Are you sure you want that?

Re: Lack of "fast strings"

Posted: Mon Feb 13, 2023 11:45 pm
by marcoagpinto
Anyway, I have been supporting open-source from my own pocket for decades.

Is it a matter of $$$$$?

How much? 100 EUR? To implement it?

I can't afford more than that.

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 12:02 am
by Tenaja
BarryG wrote: Mon Feb 13, 2023 10:20 pm I agree with Marco that it should be built-in. Like him, I don't want to change all my strings to use pointers and do other hacks to make them fast.
I haven't wanted to, either, but this request is decades old. Are you going to bang a dull drum, or come up with an alternative? GitHub has hundreds of string libraries that are far superior to PB's or C's stdlib. My example was just one that was a mere 400ish lines long.

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 4:28 am
by Rinzwind
Did you use the mentioned "great" library? Or is it a random GitHub pick?

Anyway, here are some interesting c string libs:
https://github.com/oz123/awesome-c#string-manipulation

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 7:30 am
by AZJIO

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 9:36 am
by BarryG
Tenaja wrote: Tue Feb 14, 2023 12:02 am Are you going to bang a dull drum
We're hoping that the squeaky wheel will (eventually) get the grease.

In the meantime, I'm more than open to using a third-party drop-in lib that doesn't require any changes to my sources to speed up strings.

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 2:08 pm
by Tenaja
BarryG wrote: Tue Feb 14, 2023 9:36 am
In the meantime, I'm more than open to using a third-party drop-in lib that doesn't require any changes to my sources to speed up strings.
Fred, is there a way to replace the built-in string libraries?

We might be able to easily replace the string function calls (replacing the precompiled library) but those that use pointers (coded at compile time) are unlikely to be replaceable.

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 2:30 pm
by Kiffi
If I'm not completely mistaken, the really slow thing about string functions is concatenating strings.

Code:

Define myString.s

For Counter = 0 To 99
 myString + " sooo slow "
Next

Debug myString
What if there was some kind of StringBuilder (like in C#, for example)? I think the basics are already there: the LinkedList.

Code:

NewList myList.s()

For Counter = 0 To 99
  AddElement(myList()) : myList() = "much faster"
Next
Now we just need a new command to 'merge' the LinkedList into a string. For example, let's call it ConcatList(LinkedList, [Separator.s]). The separator is an optional string used to concatenate the individual elements of the LinkedList.

Code:

Debug ConcatList(myList())
much fastermuch fastermuch faster ...

Code:

Debug ConcatList(myList(), ", ")
much faster, much faster, much faster, ...
just my 2 cents...
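
The proposed ConcatList command does not exist in PureBasic, so the following is only a sketch of how such a merge could work, written in C. The function name `concat_list` and its signature are assumptions for this illustration. It measures all elements first, allocates the result once, and then copies each element exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join `count` strings with an optional separator.
   Pass NULL as `sep` to join without one. The caller frees the result. */
char *concat_list(const char *const *items, size_t count, const char *sep) {
    size_t sep_len = (sep != NULL) ? strlen(sep) : 0;

    /* First pass: work out the final size so one allocation is enough. */
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += strlen(items[i]);
    }
    if (count > 1) {
        total += sep_len * (count - 1);
    }

    char *out = malloc(total + 1);
    assert(out != NULL);

    /* Second pass: copy every element once, tracking the write position. */
    char *p = out;
    for (size_t i = 0; i < count; i++) {
        if (i > 0 && sep_len > 0) {
            memcpy(p, sep, sep_len);
            p += sep_len;
        }
        size_t n = strlen(items[i]);
        memcpy(p, items[i], n);
        p += n;
    }
    *p = '\0';
    return out;
}
```

Because the write position only moves forward, the total work grows with the length of the result rather than with the number of elements times that length.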

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 7:36 pm
by HeX0R
You can already build a StringBuilder with PB commands => viewtopic.php?p=558277#p558277

Re: Lack of "fast strings"

Posted: Tue Feb 14, 2023 11:32 pm
by idle
The concatenation problem has more to do with how the code is generated for the + operator.
As long as the "+" is on one line of code, it appends a copy onto the stack before unwinding it.
If the strings are short, it's fast:
s1 = s1 + s2 + s3 + s4 + s5 ;...
But as soon as you call it in a loop, it blows up, because it keeps copying the appended string onto the stack
and popping it off again.

For example, this is what you get with s1 + s2 in a For loop:

Code:

// For a =0 To 10000 
v_a=0;
while(1) {
if (!(((integer)10000LL>=v_a))) { break; }
// s1 + s2 
SYS_PushStringBasePosition();
SYS_CopyString(v_s1) ;                     
SYS_CopyString(v_s2);
SYS_AllocateString4(&v_s1,SYS_PopStringBasePosition());
// Next   
next1:
v_a+=1;
}
il_next2:;


vs. what you want, with inline C:

Code:

Global s1.s  
Global s2.s   

s1 = "hello" 
s2 = "world" 

st = ElapsedMilliseconds() 
For a =0 To 10000 
   s1 + s2 
Next   
et = ElapsedMilliseconds() 

st1 = ElapsedMilliseconds()
!SYS_PushStringBasePosition();
For a = 0 To 10000 
  !SYS_CopyString(v_s2);
Next 
!SYS_AllocateString4(&v_s1,SYS_PopStringBasePosition());
et1 = ElapsedMilliseconds() 

out.s = Str(et-st) + " " + Str(et1-st1) 
MessageRequester("test",out) 
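
The gap between the two loops above can be estimated with a small cost model in C. This is a simplification and not a measurement of PureBasic's runtime: it only counts bytes copied, assuming the first loop copies the whole of s1 plus s2 on every pass and the second loop copies only s2. The function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes copied if every pass copies the current string and then the piece. */
unsigned long long copied_rebuild(size_t start_len, size_t piece_len,
                                  size_t iterations) {
    unsigned long long total = 0;
    unsigned long long len = start_len;
    for (size_t i = 0; i < iterations; i++) {
        unsigned long long step = len + piece_len;  /* copy s1, then s2 */
        assert(total + step >= total);  /* no wrap-around in this toy model */
        total += step;
        len += piece_len;               /* s1 is now one piece longer */
    }
    return total;
}

/* Bytes copied if every pass only copies the piece onto the end. */
unsigned long long copied_append(size_t piece_len, size_t iterations) {
    return (unsigned long long)piece_len * iterations;
}
```

For the values in the test above (5-byte strings, 10001 passes), this model gives 250,125,010 bytes copied for the rebuilding loop against 50,005 bytes for the append-only loop. That quadratic growth is consistent with the loop version blowing up.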


Re: Lack of "fast strings"

Posted: Wed Feb 15, 2023 1:29 am
by ChrisR
I compared the inline C version with HeX0R's version, using a.s = "abcdefghijklmnopqrstuvwxyz" + #CRLF$.
With a loop of 10000, they take more or less the same time: C-Way = 1 ms, StringBuilder = 1 ms, and PB-Way = 2820 ms.
But with a loop of 20000, the C way slows down, and I don't know why: C-Way = 210 ms, StringBuilder = 3 ms, and PB-Way = 15041 ms.