String concatenation performance


Re: String concatenation performance

Post by wilbert »

cas wrote: Replace std::string with std::wstring
Is it possible for you to see what assembly code is generated?
It would be nice to see how the C compiler translates it.

Post by cas »

Copy/paste it to compiler explorer: https://godbolt.org/
Edit: you can also test the code on one of the many online C++ compilers. For example, this one gives me good numbers (about 15ms vs 8ms on my local PC): http://cpp.sh/

Post by wilbert »

cas wrote: Copy/paste it to compiler explorer: https://godbolt.org/
Thanks. I didn't know of that website. :)
I'm not familiar with C++, but it seems strings are objects that store their length. That explains a large part of the speed.
It also seems to reallocate memory, which helps a lot: especially with smaller strings, there's a reasonable chance the block can be grown in place without allocating a new block and copying the existing string data.
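
To make the point about the stored length concrete, here is a small sketch in JavaScript. It is only a model, not how PureBasic or std::string is actually implemented: the names appendWithScan and appendWithLength are made up for this example, and it assumes that a string without a stored length has to be scanned for its terminator before every append. It simply counts how many character slots each approach touches.

```javascript
// Hypothetical model: count the character slots touched while appending
// `piece` to a buffer `n` times. Not real PureBasic or C++ library code.

// Zero-terminated buffer: the end must be found by scanning on every append.
function appendWithScan(n, piece) {
  var buf = [0];               // 0 marks the end of the string
  var visited = 0;
  for (var i = 0; i < n; i++) {
    var end = 0;
    while (buf[end] !== 0) {   // walk to the terminator
      end++;
      visited++;
    }
    for (var j = 0; j < piece.length; j++) {
      buf[end++] = piece.charCodeAt(j);
      visited++;
    }
    buf[end] = 0;              // write the new terminator
  }
  return visited;
}

// Length-tracking buffer: the end position is already known.
function appendWithLength(n, piece) {
  var buf = [];
  var length = 0;
  var visited = 0;
  for (var i = 0; i < n; i++) {
    for (var j = 0; j < piece.length; j++) {
      buf[length++] = piece.charCodeAt(j);
      visited++;
    }
  }
  return visited;
}
```

With n = 1000 and piece = "...", appendWithLength touches 3000 slots and appendWithScan touches 1501500, so in this model the scanning version grows quadratically with the number of appends.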

Post by cas »

Yes. The C++ string class, translated to PB and simplified, would look something like this:

Code: Select all

DisableDebugger
EnableExplicit

DeclareModule OptimizedString
  DisableDebugger
  EnableExplicit
  
  Structure str
    *start
    *end
    capacity.i
    ;sso.c[16] ;TODO: small string optimization
  EndStructure
  
  Declare Append(*s.str,*chars,nChars=-1)
  Declare Reserve(*s.str,additionalChars)
  Declare Length(*s.str)
  Declare Pointer(*s.str)
  Declare.s NativeString(*s.str)
EndDeclareModule

Module OptimizedString
  
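  Procedure Append(*s.str,*chars,nChars=-1)
    ; Append nChars characters from *chars; -1 means "up to the terminating null".
    If nChars=-1
      nChars=MemoryStringLength(*chars)
    EndIf
    If nChars>0 And Reserve(*s,nChars)
      CopyMemory(*chars,*s\end,nChars*SizeOf(Character))
      *s\end+(nChars*SizeOf(Character))
      PokeC(*s\end,0) ; keep the buffer null-terminated
    EndIf
  EndProcedure
  
  Procedure Reserve(*s.str,additionalChars)
    ; Make sure there is room for additionalChars more characters.
    ; Returns #True on success, #False if the reallocation failed.
    Protected len=Length(*s)
    Protected free=*s\capacity-len
    Protected newCapacity, *new
    If free<additionalChars
      ; Grow to twice the required size so that most appends need no reallocation.
      newCapacity=(*s\capacity+(additionalChars-free))*2
      *new=ReAllocateMemory(*s\start,(newCapacity+1)*SizeOf(Character),#PB_Memory_NoClear)
      If *new=0
        ProcedureReturn #False ; the old buffer is still valid
      EndIf
      *s\start=*new
      *s\capacity=newCapacity
      *s\end=*s\start+(len*SizeOf(Character))
    EndIf
    ProcedureReturn #True
  EndProcedure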
  
  Procedure Length(*s.str)
    ProcedureReturn (*s\end-*s\start)/SizeOf(Character)
  EndProcedure
  
  Procedure Pointer(*s.str)
    ProcedureReturn *s\start
  EndProcedure
  
  Procedure.s NativeString(*s.str)
    ProcedureReturn PeekS(Pointer(*s),Length(*s))
  EndProcedure
  
EndModule


DisableExplicit

#N_REPEATS=10000

b.s = "..."

t1 = ElapsedMilliseconds()

a.s = "hello"

CompilerIf #N_REPEATS=<10000 ;do not test if loop is over 10k
For i = 1 To #N_REPEATS
  a + b
Next
CompilerEndIf
t2 = ElapsedMilliseconds()

ostr.OptimizedString::str
OptimizedString::Append(@ostr,@"hello")

For i = 1 To #N_REPEATS
  OptimizedString::Append(@ostr,@b)
Next

optimized_a.s=OptimizedString::NativeString(@ostr)

t3 = ElapsedMilliseconds()

t1s.s="<skipped>"
If Len(a)>5
  t1s.s=Str(t2-t1)+"ms"
  If optimized_a<>a
    MessageRequester("ERROR","results should be same")
  EndIf
EndIf

MessageRequester("Timings", t1s.s+" vs "+Str(t3-t2)+"ms")
OptimizedString timings by loop count:
10k: 0ms
100k: 4ms
1M: 38ms
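
A side note on why the timings above grow roughly in step with the loop count: the doubling in Reserve() means the buffer is reallocated only a handful of times. The sketch below is a JavaScript model of that growth rule, not part of the module. simulateGrowth and its parameters are hypothetical names, and `copied` assumes the worst case where every reallocation moves the block (ReAllocateMemory may well be able to grow it in place).

```javascript
// Hypothetical model of the capacity rule used in Reserve():
//   newCapacity = (capacity + missing) * 2   when `doubling` is true
//   newCapacity =  capacity + missing        when `doubling` is false
// It only counts work; it does not allocate any memory.
function simulateGrowth(appends, pieceLen, doubling) {
  var len = 0, capacity = 0, reallocations = 0, copied = 0;
  for (var i = 0; i < appends; i++) {
    var free = capacity - len;
    if (free < pieceLen) {
      var needed = capacity + (pieceLen - free);
      capacity = doubling ? needed * 2 : needed;
      reallocations++;
      copied += len;           // worst case: existing data moves to a new block
    }
    len += pieceLen;
  }
  return { reallocations: reallocations, copied: copied };
}
```

For 10000 appends of 3 characters this gives 13 reallocations and at most 49068 copied characters with doubling, against 10000 reallocations and up to 149985000 copied characters when the buffer grows by exactly what is needed each time.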

Post by cas »

mk-soft wrote:With own structured memory strings works faster.
Need ca 4ms for 10000 loops and 33ms for 100000 loops. CPU Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
You are still measuring with the debugger enabled inside all procedures. You must disable the debugger at the top of your source file.

Post by mk-soft »

So I noticed.
100000 loops with FastString

macOS (host system)
Timings: 34481 vs 24511 vs 325 vs 11
Windows 7 (virtual machine)
Timings: 31359 vs 23673 vs 294 vs 8
Ubuntu 18.04 (virtual machine)
Timings: 9076 vs 12284 vs 107 vs 3

Post by Rinzwind »

mk-soft wrote: That here a VB script is faster than Purebasic is very sad.
That's even ignoring the fact that VBScript's implementation doesn't optimize this the way JScript does (one can work around that somewhat with an array and a final Join).
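
For reference, the array-and-join workaround looks roughly like this, sketched in JavaScript rather than VBScript (where the same idea would use an array and Join). buildWithJoin and buildWithConcat are names made up for this example. Whether the join variant is actually faster depends on the script engine, so treat it as something to measure, not a guarantee.

```javascript
// Collect the pieces and join once, instead of growing the string in the loop.
function buildWithJoin(first, piece, repeats) {
  var parts = [first];
  for (var i = 0; i < repeats; i++) {
    parts.push(piece);
  }
  return parts.join("");       // single concatenation at the end
}

// The plain loop used in the benchmarks, kept here for comparison.
function buildWithConcat(first, piece, repeats) {
  var a = first;
  for (var i = 0; i < repeats; i++) {
    a = a + piece;
  }
  return a;
}
```

With "Hello", "..." and 50000 repeats, both functions return the same 150005-character string.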

VBScript

Code: Select all

Option Explicit

Dim i, a, b, t1, t2

Function FormatMS(value)
	FormatMS = FormatNumber(value * 1000, 0, 0, 0, 0)
End Function


a = "Hello"
b = "..."
t1 = Timer
For i = 1 To 50000
	a = a & b
Next
t2 = Timer
WScript.Echo FormatMS(t2 - t1)
'WScript.Echo a
406 ms

JScript

Code: Select all

var i, a, b, t1, t2;
a = "Hello";
b = "...";
t1 = new Date();
for (i = 0; i < 50000; i++) {
 a = a + b;
}
t2 = new Date();
WScript.Echo(t2 - t1)
//WScript.Echo(a);
17 ms

JavaScript in a browser is even faster (around 3..9 ms).

PB

Code: Select all

EnableExplicit

DisableDebugger

Define a.s, b.s, i, t1, t2
a = "Hello"
b = "..."

t1 = ElapsedMilliseconds()
For i = 1 To 50000
  a + b  
Next
t2 = ElapsedMilliseconds()
MessageRequester("", Str(t2 - t1))

9964 ms

FastString
28..34 ms

OptimizedString
2..4 ms

My own solution

Code: Select all

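Procedure.s ListToString(List StringList.s(), Delimiter.s = " ")
  ; Join all list elements into one string, allocating the result only once.
  Protected String.s, l, c = ListSize(StringList()), *p
  
  If c = 0
    ProcedureReturn ""
  EndIf
  ; First pass: total length of all elements.
  ForEach StringList()
    l + Len(StringList())
  Next
  ; Pre-size the result; c elements need c - 1 delimiters.
  String = Space(l + Len(Delimiter) * (c - 1))
  *p = @String
  ; Second pass: copy each piece; CopyMemoryString() advances *p itself.
  ResetList(StringList())
  NextElement(StringList())
  CopyMemoryString(StringList(), @*p)
  While NextElement(StringList())
    CopyMemoryString(@Delimiter)
    CopyMemoryString(StringList())
  Wend
  ProcedureReturn String
EndProcedure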
8..16 ms

Which just means PB's native implementation could really use some optimization (even if this artificial example is far from real-world use).

Post by mk-soft »

@Rinzwind
FastString needs about 4 ms for 50000 loops.

Interesting that JScript is so fast, even when run through the ActiveScript module.

Timings of loops 50000
Time: PB 7983ms / VBS 474ms / JScript 14ms
Len VBS = 150005
Len JScript = 150005
Update

Code: Select all

;-TOP

; Comment   : Modul ActiveScript Example 14
; Version   : v2.09

; Link to ActiveScript  : https://www.purebasic.fr/english/viewtopic.php?f=12&t=71399
; Link to SmartTags     : https://www.purebasic.fr/english/viewtopic.php?f=12&t=71399#p527089
; Link to VariantHelper : https://www.purebasic.fr/english/viewtopic.php?f=12&t=71399#p527090

; ***************************************************************************************

XIncludeFile "Modul_ActiveScript.pb"
;XIncludeFile "Modul_SmartTags.pb"
;XIncludeFile "VariantHelper.pb"

UseModule ActiveScript
;UseModule ActiveSmartTags

; -------------------------------------------------------------------------------------

Procedure.s GetDataSectionText(*Addr.Character)
  Protected result.s, temp.s
  While *Addr\c <> #ETX
    temp = PeekS(*Addr)
    *Addr + StringByteLength(temp) + SizeOf(Character)
    result + temp + #LF$
  Wend
  ProcedureReturn result
EndProcedure

; -------------------------------------------------------------------------------------

Global script.s, sValue1.s, sValue2.s

Runtime sValue1, sValue2

; -------------------------------------------------------------------------------------
;-TOP

DisableDebugger

Define loops = 50000
Runtime loops

b.s = "..."
a.s = "hello"

t1 = ElapsedMilliseconds()
For i = 1 To loops
  a + b
Next
t2 = ElapsedMilliseconds()

*Control = NewActiveScript()
If *Control
  Debug "*** Parse ScriptText ***"
  
  script = GetDataSectionText(?vbs)
  
  t3 = ElapsedMilliseconds()
  r1 = ParseScriptText(*Control, script)
  If r1 = #S_OK
    Debug "Code Ready 1."
  EndIf
  t4 = ElapsedMilliseconds()
  
  
  Debug "*** Free ActiveScript ***"
  FreeActiveScript(*Control)
  
  Debug "************************************************************"
EndIf

*Control = NewActiveScript("JScript")
If *Control
  Debug "*** Parse ScriptText ***"
  
  script = GetDataSectionText(?JScript)
  
  t5 = ElapsedMilliseconds()
  r1 = ParseScriptText(*Control, script)
  If r1 = #S_OK
    Debug "Code Ready 1."
  EndIf
  t6 = ElapsedMilliseconds()
  
  Debug "*** Free ActiveScript ***"
  FreeActiveScript(*Control)
  
  Debug "************************************************************"
EndIf

info.s = "Time: PB " + Str(t2-t1) + "ms / VBS " + Str(t4-t3) + "ms / JScript " + Str(t6-t5) + "ms"
info.s + #LF$ + #LF$ + "Len VBS = " + Len(sValue1) + #LF$ + "Len JScript = " + Len(sValue2)
MessageRequester("Timings of loops " + loops, info)

; -------------------------------------------------------------------------------------

DataSection
  vbs:
  Data.s ~"On Error Resume Next"
  Data.s ~""
  Data.s ~"Dim loops, i, a, b"
  Data.s ~""
  Data.s ~"a = \"Hello\""
  Data.s ~"b = \"...\""
  Data.s ~"loops = Runtime.Integer(\"loops\")"
  Data.s ~""
  Data.s ~"For i = 1 to loops"
  Data.s ~" a = a + b"
  Data.s ~"Next"
  Data.s ~"Runtime.String(\"sValue1\") = a"
  Data.s ~""
  Data.s #ETX$
  Data.i 0
  jscript:
  Data.s ~"var i, a, b, loops;"
  Data.s ~"a = \"Hello\";"
  Data.s ~"b = \"...\";"
  Data.s ~"loops = Runtime.Integer(\"loops\");"
  Data.s ~"for (i = 0; i < loops; i++) {"
  Data.s ~" a = a + b;"
  Data.s ~"}"
  Data.s ~"Runtime.String(\"sValue2\") = a;"
  Data.s #ETX$
  Data.i 0
EndDataSection

Post by Rinzwind »

P.S. The C backend won't improve anything; same results (no surprise there).