Lack of "fast strings"
- marcoagpinto
- Addict
- Posts: 947
- Joined: Sun Mar 10, 2013 3:01 pm
- Location: Portugal
- Contact:
Lack of "fast strings"
Hello!
This has been my battle for years.
My app, Proofing Tool GUI, is a linguistic tool, and thus it uses tons of strings and is as slow as hell.
For years, people have been complaining that PureBasic looks for the null byte in strings instead of storing the size in bytes.
Isn't there an add-on or plugin for GCC that could be activated with a setting in PB and increase the speed of strings dozens of times?
That idea of creating a blah blah blah fixer myself is not practicable. It is similar to when Ubuntu 22.04 LTS came out and PB didn't support it and a user wrote dozens or hundreds of lines of code to fix the windows style back then… this is simply impracticable. It either comes built-in or no one will do it.
Thanks!
// No Bug. Moved from "Bugs - C backend" to "Feature Requests and Wishlists" (Kiffi)
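For readers who want to see the difference being argued about here, a minimal C sketch follows. The CountedStr type and the cs_* names are made up for illustration; they are not anything PureBasic or GCC provides. It only shows why a stored size makes the length lookup constant-time, while a null-terminated string has to be scanned.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted-string type: the length is stored next to the data. */
typedef struct {
    size_t len;   /* number of bytes, excluding the terminator */
    char  *data;  /* still null-terminated so it can be passed to C APIs */
} CountedStr;

/* Build a counted string from a C string.
   The scan for the null byte (strlen) happens exactly once, here. */
static CountedStr cs_from(const char *src)
{
    CountedStr s;
    s.len = strlen(src);
    s.data = malloc(s.len + 1);
    if (s.data == NULL) {
        abort(); /* keep the sketch short: no recovery on out-of-memory */
    }
    memcpy(s.data, src, s.len + 1);
    return s;
}

/* O(1): read the stored length instead of searching for the null byte. */
static size_t cs_len(const CountedStr *s)
{
    return s->len;
}

/* Release the buffer and reset the fields. */
static void cs_free(CountedStr *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

With a plain C string every length query costs a full scan, so code that asks for the length inside a loop pays for it again on each pass.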
Re: Lack of "fast strings"
You are a little late to this complaint! (It's not new.)
This is a great C library, and it's not even 400 lines of code. I'm sure it would do it, and convert rather quickly.
https://github.com/spitstec/milkstrings
The only slow aspect of PB's strings is continuously adding strings together. Restructure your code so there are spaces buffered at the end, and your code will be faster.
Oh, yeah... Use pointers instead of passing strings to procedures. Doing the latter forces it to copy the string to a new memory location every time.
BTW, this isn't a bug. Managed strings are for people who want simplicity, not speed.
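The "spare room buffered at the end" advice can be sketched roughly like this in C. This is an illustration of the general technique under my own assumptions, not PB's internals: StrBuf, sb_init, sb_append and sb_free are hypothetical names. The buffer is over-allocated and the end position is remembered, so each append copies only the new piece.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: 'len' tracks where the text ends,
   'cap' is how much room has been reserved. */
typedef struct {
    char  *buf;
    size_t len;
    size_t cap;
} StrBuf;

/* Reserve an initial buffer; returns 0 on success, -1 on failure. */
static int sb_init(StrBuf *sb, size_t initial_cap)
{
    if (initial_cap == 0) {
        initial_cap = 16;
    }
    sb->buf = malloc(initial_cap);
    if (sb->buf == NULL) {
        return -1;
    }
    sb->buf[0] = '\0';
    sb->len = 0;
    sb->cap = initial_cap;
    return 0;
}

/* Append 's' at the tracked end position. The capacity is doubled when
   needed, so existing text is moved only on the occasional realloc. */
static int sb_append(StrBuf *sb, const char *s)
{
    size_t n = strlen(s);
    if (sb->len + n + 1 > sb->cap) {
        size_t new_cap = sb->cap;
        while (new_cap < sb->len + n + 1) {
            new_cap *= 2;
        }
        char *p = realloc(sb->buf, new_cap);
        if (p == NULL) {
            return -1;
        }
        sb->buf = p;
        sb->cap = new_cap;
    }
    memcpy(sb->buf + sb->len, s, n + 1); /* copies the terminator too */
    sb->len += n;
    return 0;
}

/* Release the buffer. */
static void sb_free(StrBuf *sb)
{
    free(sb->buf);
    sb->buf = NULL;
    sb->len = 0;
    sb->cap = 0;
}
```

Passing a StrBuf pointer around also avoids the per-call string copy mentioned above, since only the address is handed to the callee.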
Re: Lack of "fast strings"
Tenaja wrote: ↑Mon Feb 13, 2023 10:17 pm You are a little late to this complaint! (It's not new.)
This is a great c library, and it's not even 400 lines of code. I'm sure it would do it, and convert rather quickly.
https://github.com/spitstec/milkstrings.
The only slow aspect of PB's strings is continuously adding strings together. Restructure your code so there's spaces buffered at the end, and your code will be faster.
Oh, yeah... Use pointers instead of passing strings to procedures. Doing the later forces it to copy the string to a new memory location every time.
BTW, this isn't a bug. Managed strings are for people who want simplicity, not speed.
Re: Lack of "fast strings"
I agree with Marco that it should be built-in. Like him, I don't want to change all my strings to use pointers and do other hacks to make them fast.
Re: Lack of "fast strings"
Tenaja wrote: ↑Mon Feb 13, 2023 10:17 pm This is a great c library, and it's not even 400 lines of code. I'm sure it would do it, and convert rather quickly.
https://github.com/spitstec/milkstrings.
Are you sure you want that?
// Version : 0.0.1
// Description : Easy strings in c limited length and lifetime
Last edited by Bitblazer on Mon Feb 13, 2023 11:45 pm, edited 1 time in total.
- marcoagpinto
- Addict
- Posts: 947
- Joined: Sun Mar 10, 2013 3:01 pm
- Location: Portugal
- Contact:
Re: Lack of "fast strings"
Anyway, I have been supporting open-source from my own pocket for decades.
Is it a matter of $$$$$?
How much? 100 EUR? To implement it?
I can't afford more than that.
Re: Lack of "fast strings"
I haven't wanted to, either, but this request is decades old. Are you going to bang a dull drum, or come up with an alternative? GitHub has hundreds of string libraries that are far superior to PB's or C's stdlib. My example was just one that was a mere 400ish lines long.
Re: Lack of "fast strings"
Did you actually use the "great" library you mentioned, or was it a random GitHub pick?
Anyway, here are some interesting c string libs:
https://github.com/oz123/awesome-c#string-manipulation
Re: Lack of "fast strings"
Fred, is there a way to replace the built-in string libraries?
We might be able to easily replace the string function calls (replacing the precompiled library) but those that use pointers (coded at compile time) are unlikely to be replaceable.
Re: Lack of "fast strings"
If I'm not completely mistaken, the really slow thing about string functions is concatenating strings.
What if there were some kind of StringBuilder (as in C#, for example)? I think the basics are already there: the LinkedList.
Now we just need a new command to 'merge' the LinkedList into a string. For example, let's call it ConcatList(LinkedList, [Separator.s]). The separator is an optional string inserted between the individual elements of the LinkedList.
Code: Select all
Define myString.s
For Counter = 0 To 99
myString + " sooo slow "
Next
Debug myString
Code: Select all
NewList myList.s()
For Counter = 0 To 99
AddElement(myList()) : myList() = "much faster"
Next
Code: Select all
Debug ConcatList(myList())
much fastermuch fastermuch faster ...
Code: Select all
Debug ConcatList(myList(), ", ")
much faster, much faster, much faster, ...
just my 2 cents...
Hygge
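To make the ConcatList idea a bit more concrete, here is a rough C analogue. It is only a sketch: concat_list is a hypothetical function, it works on a plain array of C strings instead of a PB LinkedList, and it says nothing about how such a command would be implemented in PB. The point is the two passes: measure everything first, allocate once, then copy each element exactly once.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of the proposed ConcatList(list, [separator]).
   Joins 'count' strings with 'sep' between them (NULL means no separator).
   Returns a malloc'ed string the caller must free, or NULL on failure. */
static char *concat_list(const char **items, size_t count, const char *sep)
{
    if (sep == NULL) {
        sep = "";
    }
    size_t sep_len = strlen(sep);

    /* Pass 1: compute the final length so only one allocation is needed. */
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += strlen(items[i]);
    }
    if (count > 1) {
        total += sep_len * (count - 1);
    }

    char *out = malloc(total + 1);
    if (out == NULL) {
        return NULL;
    }

    /* Pass 2: copy each piece once, advancing a write pointer. */
    char *p = out;
    for (size_t i = 0; i < count; i++) {
        if (i > 0) {
            memcpy(p, sep, sep_len);
            p += sep_len;
        }
        size_t n = strlen(items[i]);
        memcpy(p, items[i], n);
        p += n;
    }
    *p = '\0';
    return out;
}
```

Called with three "much faster" elements and ", " as the separator, this produces the same kind of result as the second Debug line above.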
Re: Lack of "fast strings"
You can already build a StringBuilder with PB commands => viewtopic.php?p=558277#p558277
Re: Lack of "fast strings"
The concatenation problem has more to do with how the code is generated for the + operator.
As long as the "+" is on one line of code, it appends a copy onto the stack before unwinding it.
If the strings are short, it's fast:
s1 = s1 + s2 + s3 + s4 + s5 ;...
But as soon as you call it in a loop it blows up, as it keeps copying the appended string onto the stack and popping it off again.
For example, this is what you get with s1+s2 in a For loop:
Code: Select all
// For a =0 To 10000
v_a=0;
while(1) {
if (!(((integer)10000LL>=v_a))) { break; }
// s1 + s2
SYS_PushStringBasePosition();
SYS_CopyString(v_s1) ;
SYS_CopyString(v_s2);
SYS_AllocateString4(&v_s1,SYS_PopStringBasePosition());
// Next
next1:
v_a+=1;
}
il_next2:;
vs. what you want with inline C:
Code: Select all
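Global s1.s
Global s2.s
s1 = "hello"
s2 = "world"
; Test 1: native PB concatenation inside a loop
st = ElapsedMilliseconds()
For a =0 To 10000
s1 + s2
Next
et = ElapsedMilliseconds()
; Test 2: inline C (C backend only): push the string base once, append s2
; on each pass, and allocate the result into s1 once at the end
st1 = ElapsedMilliseconds()
!SYS_PushStringBasePosition();
For a = 0 To 10000
!SYS_CopyString(v_s2);
Next
!SYS_AllocateString4(&v_s1,SYS_PopStringBasePosition());
et1 = ElapsedMilliseconds()
; Show both timings in milliseconds: PB way first, then inline C
out.s = Str(et-st) + " " + Str(et1-st1)
MessageRequester("test",out)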
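As a back-of-the-envelope check on why the loop version blows up, here is a small C sketch that just counts bytes. It is a simplified model of my own, not a measurement of PB: it assumes each pass of the naive loop copies the whole string built so far once, and it ignores the initial content of s1, the copy back into the variable, and allocator overhead. The function names are hypothetical.

```c
#include <stdint.h>

/* Naive loop: every pass re-copies the entire string built so far,
   so the work per pass grows with the string. */
static uint64_t bytes_copied_naive(uint64_t piece_len, uint64_t iterations)
{
    uint64_t total = 0;
    uint64_t current = 0; /* length of the string after each pass */
    for (uint64_t i = 0; i < iterations; i++) {
        current += piece_len;
        total += current;
    }
    return total;
}

/* Append-in-place: every pass copies only the new piece. */
static uint64_t bytes_copied_append(uint64_t piece_len, uint64_t iterations)
{
    return piece_len * iterations;
}
```

Under this model the naive count is piece_len * n * (n + 1) / 2 for n passes, so doubling the loop count roughly quadruples the work, while the append count only doubles.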
Re: Lack of "fast strings"
I compared the inline C version with HeXOR's version, using a.s = "abcdefghijklmnopqrstuvwxyz" + #CRLF$
With a loop of 10000 the two take more or less the same time: C-Way = 1 ms, StringBuilder = 1 ms, PB-Way = 2820 ms
But with a loop of 20000 the C way slows down, and I don't know why: C-Way = 210 ms, StringBuilder = 3 ms, PB-Way = 15041 ms