String length should be stored for string variables

Got an idea for enhancing PureBasic? New command(s) you'd like to see?

Re: String length should be stored for string variables

Post by Sicro »

mk-soft wrote: Sun Nov 13, 2022 6:49 pm The normal string functions are terminated with zero bytes. Like now with PB.
Yes, in this feature request I wrote that strings should still end with a null character. They would only gain a length field stored in front of the characters, so that the string functions don't have to recalculate the string length on every call.
mk-soft wrote: Sun Nov 13, 2022 6:49 pm If change to Type B-STR in PB, this leads of course to problems.
Yes, if a PB string is modified directly through a WinAPI function or a pointer, PB's string management does not notice the change and the length field is not updated. That's why I suggested in my last post a new function, UpdateStringLength(), that updates the length field either manually (the string length is passed as a parameter) or automatically (it searches for the null character, like PeekS() but without creating a new string).
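
To make the UpdateStringLength() idea more concrete, here is a rough sketch in C. It is only an illustration under my own assumptions, not how PB works internally: the LenString layout and the names lenstr_new(), lenstr_len(), update_string_length() and update_string_length_auto() are all hypothetical, and it uses single-byte characters for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a length field stored directly in front of the characters. */
typedef struct {
    size_t len;     /* cached character count, excluding the terminator */
    char   data[];  /* null-terminated characters; callers only see this part */
} LenString;

/* Recover the header from the character pointer handed out to callers. */
static LenString *header_of(char *s) {
    return (LenString *)(s - offsetof(LenString, data));
}

/* Allocate a string with room for `capacity` characters plus the terminator. */
char *lenstr_new(const char *init, size_t capacity) {
    size_t n = strlen(init);
    assert(n <= capacity);
    LenString *h = malloc(sizeof *h + capacity + 1);
    if (h == NULL) return NULL;
    memcpy(h->data, init, n + 1);
    h->len = n;
    return h->data;
}

/* O(1): reads the cached length instead of scanning for the null character. */
size_t lenstr_len(char *s) { return header_of(s)->len; }

/* Manual variant: the caller already knows the new length. */
void update_string_length(char *s, size_t new_len) {
    header_of(s)->len = new_len;
}

/* Automatic variant: scan for the null character once and cache the result. */
void update_string_length_auto(char *s) {
    header_of(s)->len = strlen(s);
}

void lenstr_free(char *s) { free(header_of(s)); }
```

After code outside the string management writes into the buffer, lenstr_len() keeps returning the stale value until one of the two update functions is called.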

Alternatively, a new, additional string variable type could be implemented (see also my last post). I think this is the cleanest solution, because it avoids any problems with backward compatibility.

@juergenkulow:
Are you saying that PB strings on Linux are already prefixed with a string length field? Please explain in more detail what you mean.

Post by skywalk »

The Simple Dynamic Strings (SDS) library for C stores the string length in a header directly in front of the pointer it returns to the user. This keeps the strings compatible with null-terminated strings and with almost all function calls, with the possible exception of aliasing rules.
MySDS$ = SomeFn(MySDS$, someparam)

Code:

+------------+-------------------------------+-----------+
| StrLenHere | Binary safe C alike string... | Null term |
+------------+-------------------------------+-----------+
             |
             `-> Pointer returned to user.
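
For illustration, the layout in the diagram could look roughly like this in C. This is only a sketch, not the SDS source: StrHeader, pstr_new(), pstr_len() and pstr_free() are made-up names, and a real library would probably also track the allocated capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header placed in front of the characters (the "StrLenHere" cell). */
typedef struct {
    size_t len;
} StrHeader;

/* Create a string from `len` bytes; the bytes may contain embedded nulls. */
char *pstr_new(const void *bytes, size_t len) {
    StrHeader *h = malloc(sizeof(StrHeader) + len + 1);
    if (h == NULL) return NULL;
    h->len = len;
    char *chars = (char *)(h + 1);   /* first byte after the header */
    memcpy(chars, bytes, len);
    chars[len] = '\0';               /* the "Null term" cell */
    return chars;                    /* pointer returned to user */
}

/* Step back over the header to read the stored length in O(1). */
size_t pstr_len(const char *s) {
    assert(s != NULL);
    return ((const StrHeader *)s - 1)->len;
}

/* Free the whole allocation, which starts at the header, not at the characters. */
void pstr_free(char *s) {
    if (s != NULL) free((StrHeader *)s - 1);
}
```

Because the returned pointer addresses the first character, it can be passed unchanged to functions such as strlen() or strcmp(). For data with embedded nulls, only the stored length is reliable.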

Post by Fred »

Yes, it would be the best way to go for PB, but it needs a major rework of all PB functions that handle strings, and of the compiler's code generation as well, which is not that straightforward. It's still on my mind though, so I will probably give it a try at some point.

Post by Sicro »

skywalk wrote: Sun Dec 04, 2022 9:15 pm

Code:

+------------+-------------------------------+-----------+
| StrLenHere | Binary safe C alike string... | Null term |
+------------+-------------------------------+-----------+
             |
             `-> Pointer returned to user.
Yes, that is how I envision it and how I described it here.
Fred wrote: Mon Dec 05, 2022 10:29 am Yes, it would be the best way to go for PB, but it needs a major rework of all PB functions that handle strings, and of the compiler's code generation as well, which is not that straightforward. It's still on my mind though, so I will probably give it a try at some point.
Thanks for keeping an eye on it.

Yes, the new, additional string variable type (maybe ".z") should be built natively into the PB compiler. If we PB users wrote a module with our own string functions and later passed a string back to the PB string functions, those functions would have to calculate the string length again. It would therefore be better if the PB string functions supported such string variables directly.

Edit:
Quote from skywalk fixed by inserting code tags.
Last edited by Sicro on Sat Dec 10, 2022 8:25 pm, edited 1 time in total.

Post by mk-soft »

I don't think a new type is needed.
As with PB's AllocateStructure(), the information can be stored in front of the data.

Code:

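; Windows only: a BSTR stores its length in bytes in the 4 bytes before the string data
*bStr = SysAllocString_("Hello World!")

sText.s = PeekS(*bStr)                      ; read the null-terminated characters
lenText = PeekL(*bStr - SizeOf(LONG)) >> 1  ; byte count / 2 = number of UTF-16 characters

Debug "Len = " + lenText
Debug "Txt = " + sText

SysFreeString_(*bStr)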

Post by skywalk »

@mk-soft
I know you are showing this as an example, but BSTRs are Windows-only.
Whatever Fred comes up with will be a big performance leap regardless. 8)

My biggest concern is aliasing, since I may have it sprinkled throughout many areas of my code.
Converting to PB's next-gen fast string lib would give me pause if aliasing were an issue.
Ex.: myfaststring = Mid(myfaststring, 2, 5)
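
One way to read the aliasing concern: the source and the destination of the call are the same buffer, so an implementation must not overwrite characters it still has to read. Below is a minimal C sketch of an alias-safe, in-place Mid(), assuming a length header in front of the characters. The names StrHeader, fstr_new(), fstr_mid() and fstr_free() are hypothetical and say nothing about how PB would actually implement this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { size_t len; } StrHeader;  /* hypothetical length header */

/* The header sits immediately before the first character. */
static StrHeader *hdr(char *s) { return (StrHeader *)s - 1; }

char *fstr_new(const char *init) {
    size_t n = strlen(init);
    StrHeader *h = malloc(sizeof(StrHeader) + n + 1);
    if (h == NULL) return NULL;
    h->len = n;
    memcpy(h + 1, init, n + 1);            /* copy characters and terminator */
    return (char *)(h + 1);
}

/* In-place Mid(): keeps `count` characters starting at the 1-based `start`.
   Source and destination overlap, so memmove is used instead of memcpy.
   Returns s so that callers can write s = fstr_mid(s, ...). */
char *fstr_mid(char *s, size_t start, size_t count) {
    assert(s != NULL);
    size_t len = hdr(s)->len;
    if (start < 1 || start > len) {        /* out of range: empty result */
        s[0] = '\0';
        hdr(s)->len = 0;
        return s;
    }
    size_t avail = len - (start - 1);      /* characters left from `start` on */
    if (count > avail) count = avail;
    memmove(s, s + (start - 1), count);
    s[count] = '\0';
    hdr(s)->len = count;                   /* keep the length field in sync */
    return s;
}

void fstr_free(char *s) { free(hdr(s)); }
```

With s holding "Hello World!", s = fstr_mid(s, 2, 5) leaves "ello " in the same buffer and sets the stored length to 5 without a rescan.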

@Sicro
Maybe your reply was shifted?
For interoperability, the returned pointer must point to the start of the characters, not to the header containing the string length.
+------------+-------------------------------+-----------+
| StrLenHere | Binary safe C alike string... | Null term |
+------------+-------------------------------+-----------+
|
`-> Pointer returned to user. <-- This is wrong.

I am not sure what its license allows for commercial use, but this lib appears to solve every issue!
Better String Lib

Post by Sicro »

skywalk wrote: Sat Dec 10, 2022 6:47 pm @Sicro
Maybe your reply was shifted?
For interoperability, the returned pointer must point to the start of the characters, not to the header containing the string length.
I fixed your quote in my post (the code tags had not been copied).