Removing 'ASCII' switch from PureBasic

wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Removing 'ASCII' switch from PureBasic

Post by wilbert »

User_Russian wrote:Unicode strings in PB, much slower than the ASCII
It doesn't have to stay this way.
When there's only one version, there might be more room for the PB team to optimize the PB string system. I don't see why it would have to be much slower.
RichAlgeni wrote:If we could get SSE42 based string functions as Wilbert suggested
I suggested SSE2 since that is part of the x86-64 specification and guaranteed to work on all x86-64 processors :wink:
juror wrote:PB 5.22 LTS expires in mid 2015
When support expires that doesn't mean you can't use it anymore.
If 5.2x LTS is stable enough for someone to release a commercial application at this time, why wouldn't it be suitable anymore after support ends?
You could still keep the 5.2x LTS compiler for existing projects requiring ascii only and use a newer PB version for new projects.
Windows (x64)
Raspberry Pi OS (Arm64)
Little John
Addict
Posts: 4519
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: Removing 'ASCII' switch from PureBasic

Post by Little John »

wilbert wrote:When support expires that doesn't mean you can't use it anymore.
If 5.2x LTS is stable enough for someone to release a commercial application at this time, why wouldn't it be suitable anymore after support ends?
When we write new code, there is always the possibility that we encounter new bugs.
So I can understand that, especially for writing commercial programs, people want to use a PB version which is actively supported (read: gets bug fixes from time to time).
Lebostein
Addict
Posts: 807
Joined: Fri Jun 11, 2004 7:07 am

Re: Removing 'ASCII' switch from PureBasic

Post by Lebostein »

All string-based operations are slower with Unicode, for example everything based on Map():

Code:

NewMap test.l()

; Prepare keys
#count = 200000
Dim key$(#count)
For i = 0 To #count
  key$(i) = "Testing Key " + Str(i)
Next i

a1 = ElapsedMilliseconds()
For i = 0 To #count
  AddMapElement(test(), key$(i))
Next i
a2 = ElapsedMilliseconds()
For i = 0 To #count
  FindMapElement(test(),key$(i))
Next i
a3 = ElapsedMilliseconds()

text$ = "Add: " + Str(a2-a1) + #CRLF$ + "Find: " + Str(a3-a2)
SetClipboardText(text$)
MessageRequester("Results", text$)
UNICODE
Add: 2815 ms
Find: 2655 ms

ASCII
Add: 2250 ms
Find: 2139 ms
If you remove the word "highly" from "compiler which creates highly optimized executables" from the PB homepage, then I agree with this step (reluctantly)
Samuel
Enthusiast
Posts: 755
Joined: Sun Jul 29, 2012 10:33 pm
Location: United States

Re: Removing 'ASCII' switch from PureBasic

Post by Samuel »

wilbert wrote: When support expires that doesn't mean you can't use it anymore.
If 5.2x LTS is stable enough for someone to release a commercial application at this time, why wouldn't it be suitable anymore after support ends?
You could still keep the 5.2x LTS compiler for existing projects requiring ascii only and use a newer PB version for new projects.
You're exactly right.

If people are having issues with this, I wonder what will happen when Fred drops 32-bit.
Is everyone going to go grab their pitchforks and torches?
Deluxe0321
User
Posts: 69
Joined: Tue Sep 16, 2008 6:11 am
Location: ger

Re: Removing 'ASCII' switch from PureBasic

Post by Deluxe0321 »

Let's make a deal then;

If Fred fixes the speed related issues in the string library I would fully support the transition.

In addition: of course that would mean that he implements an easy way to output content in ASCII too, via memory (ToAscii()?) or in some other way.
What happens inside PB is not the problem; just make sure we won't lose speed.
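
For what it's worth, here is a minimal sketch of writing a string to memory as ASCII from a Unicode program. It assumes the existing format flag #PB_Ascii of PokeS(), PeekS() and StringByteLength() stays available once the compiler switch is gone; the variable names are only illustrative.

```purebasic
; Sketch: output a string as ASCII bytes regardless of the compiler mode.
; Assumption: the #PB_Ascii format flag keeps working after the switch is removed.
Text$ = "Hello ASCII"

; One byte per character, plus one byte for the terminating null.
*Buffer = AllocateMemory(StringByteLength(Text$, #PB_Ascii) + 1)
If *Buffer
  PokeS(*Buffer, Text$, -1, #PB_Ascii)   ; convert and write as ASCII
  Debug PeekS(*Buffer, -1, #PB_Ascii)    ; read it back to check the round trip
  FreeMemory(*Buffer)
EndIf
```

The conversion happens only at the point of output, so the rest of the program can keep using ordinary strings.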

Thank you!
luis
Addict
Posts: 3876
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: Removing 'ASCII' switch from PureBasic

Post by luis »

Deluxe0321 wrote: What happens inside PB is not the problem; just make sure we won't lose speed.
UCS-2 data is twice as large and must be moved around in memory.

EDIT: btw this could be worked around by using a newer instruction set instead of 386+FPU only, since you can then move more data at once, but that would improve speed everywhere and not only for Unicode string operations.
Also, I don't know if the PB string library is written in C or not (I suppose it is). If it is, enabling SSE/SSE2 in the C compiler could be enough to get a boost (as someone already said, I think). The optimum would be for PB to generate SSE/SSE2 instructions for our code too.
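
A small sketch of the size difference, using StringByteLength() with an explicit format so that it gives the same numbers in either compiler mode. The sample text is arbitrary.

```purebasic
; Sketch: byte size of the same 17-character text in ASCII and in UCS-2.
; StringByteLength() does not count the terminating null.
Text$ = "Testing Key 12345"

Debug StringByteLength(Text$, #PB_Ascii)     ; 17 (one byte per character)
Debug StringByteLength(Text$, #PB_Unicode)   ; 34 (two bytes per character)
```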
Last edited by luis on Fri Aug 08, 2014 12:05 pm, edited 1 time in total.
"Have you tried turning it off and on again ?"
A little PureBasic review
codeprof
User
Posts: 65
Joined: Sun Sep 16, 2012 12:13 pm

Re: Removing 'ASCII' switch from PureBasic

Post by codeprof »

freak wrote:1) Support for the 3 OS
2) Support for ascii/unicode
3) Support for quirks of specific OS versions within the same OS type (largely fixes for glitches in specific Windows versions)
4) Support for threaded programs
5) Support for 32bit/64bit
This sounds a bit strange to me. Supporting a whole processor architecture is less effort than supporting ascii strings?

Personally I need the UCase()/LCase() commands really often when I work with strings. However, these are very much slower in Unicode.

Code:

DisableDebugger

Str.s
#Text = "1234567890"

Time = ElapsedMilliseconds()

For i=1 To 10000
  Str = UCase(Str+#Text)
Next i

MessageRequester("", StrF((ElapsedMilliseconds()-Time)/1000, 3))

;Results:  (Tested with Linux 64Bit)
;0.7s ASCII
;5.5s Unicode
luis
Addict
Posts: 3876
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: Removing 'ASCII' switch from PureBasic

Post by luis »

codeprof wrote: This sounds a bit strange to me. Supporting a whole processor architecture is less effort than supporting ascii strings?
Once it has been written, yes. That sounds reasonable to me.
In theory the code generation always stays the same unless it has bugs, while the support libraries change and grow.
The changes and additions to the libraries must support both ASCII and Unicode (case in point) and must be hand-tailored every time for that.
The code generation step does not care about any of this and stays the same.
"Have you tried turning it off and on again ?"
A little PureBasic review
heartbone
Addict
Posts: 1058
Joined: Fri Apr 12, 2013 1:55 pm
Location: just outside of Ferguson

Re: Removing 'ASCII' switch from PureBasic

Post by heartbone »

juror wrote:And if you don't understand Rescator you need to understand Dunning-Kruger.
:D
Thank you for that perspective juror.
I was feeling somewhat stupid and obsolete for considering text to be a string of characters. :oops:
Message body:
Enter your message here, it may contain no more than 60000 characters.
Now I won't need to ponder the true meaning of that instruction.
Keep it BASIC.
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Removing 'ASCII' switch from PureBasic

Post by wilbert »

codeprof wrote: Personally I need the UCase()/LCase() commands really often when I work with strings. However, these are very much slower in Unicode.
They are a bit slower but not that much. What is slow in your code is combining strings.
Try this to compare UCase only:

Code:

DisableDebugger

Str.s
#Text = "1234567890"

For i=1 To 1000
  Str + #Text
Next i

Time = ElapsedMilliseconds()

For i=1 To 10000
  UCase(Str)
Next i

MessageRequester("", StrF((ElapsedMilliseconds()-Time)/1000, 3))
String handling could probably be sped up a lot if PB cached the length of strings.
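
A rough user-level sketch of the idea, not how PB's string system is actually implemented: keep the length next to the text so it never has to be recounted by scanning for the terminating null. The CachedString structure and the AppendCached() procedure are hypothetical names made up for this example.

```purebasic
; Hypothetical sketch: a string that carries its own length.
Structure CachedString
  Text.s
  Length.i   ; number of characters in Text, kept up to date by hand
EndStructure

Procedure AppendCached(*s.CachedString, Add$)
  *s\Text = *s\Text + Add$
  *s\Length = *s\Length + Len(Add$)   ; only the appended part is measured
EndProcedure

Define s.CachedString
For i = 1 To 1000
  AppendCached(@s, "1234567890")
Next i

Debug s\Length   ; read directly, no scan over the whole string
```

The concatenation itself still goes through PB's normal string code here, so this only illustrates the bookkeeping. The real gain would have to come from the string library using such a cached length internally.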
Windows (x64)
Raspberry Pi OS (Arm64)
juror
Enthusiast
Posts: 228
Joined: Mon Jul 09, 2007 4:47 pm
Location: Courthouse

Re: Removing 'ASCII' switch from PureBasic

Post by juror »

wilbert wrote:
juror wrote:PB 5.22 LTS expires in mid 2015
When support expires that doesn't mean you can't use it anymore.
If 5.2x LTS is stable enough for someone to release a commercial application at this time, why wouldn't it be suitable anymore after support ends?
You could still keep the 5.2x LTS compiler for existing projects requiring ascii only and use a newer PB version for new projects.
That is certainly a possibility - unless you are a small vendor (as we are) and have contractual agreements with larger customers, especially govt agencies, who in the interests of removing all liability from themselves, have a contract provision which states to the effect "any/all software/utilities/products provided to (customer) by HTC (us) are warranted to have been developed and maintained using fully licensed and supported hardware and software. Furthermore, HTC warrants continuing support of all HTC provided products throughout the licensing period."

The exact terminology will vary and is not present in all contracts, but you get the idea. They want to protect themselves from any liability from 1) us using illegal software in our development environment and/or on hardware which isn't ours/theirs (either of which they feel could make them partially liable) and 2) assure our provided products are and will be maintained. This forces us to make the conversion to unicode while 5.22 LTS is still supported. We can do it, but it would be nicer to have more of a cushion, e.g. another ascii LTS.

Sure, we can push back, but that may well mean we do not get the contract and frankly, we can't afford to lose contracts. Selling to agencies has become the lifeblood of our company. We could not make it on individual sales to end users.
DK_PETER
Addict
Posts: 898
Joined: Sat Feb 19, 2011 10:06 am
Location: Denmark

Re: Removing 'ASCII' switch from PureBasic

Post by DK_PETER »

@Juror
That is certainly a possibility - unless you are a small vendor (as we are) and have contractual agreements with larger customers, especially govt agencies, who in the interests of removing all liability from themselves, have a contract provision which states to the effect "any/all software/utilities/products provided to (customer) by HTC (us) are warranted to have been developed and maintained using fully licensed and supported hardware and software. Furthermore, HTC warrants continuing support of all HTC provided products throughout the licensing period."
Those conditions are in my view completely intolerable.
If you can satisfy their needs using an obsolete version of PB, why should they have a problem?
Do they require, that you show receipts for your PB purchase and examine your version of PB?
This is pure madness. Under no circumstance would I agree to such terms. If I can provide the services they need to the required standards
by using COBOL, Pascal or Casper Fudd's minor league programming language, then the demands are met.
If these truly are the terms, then I would recommend that you switch to Unicode ASAP.
Current configurations:
Ubuntu 20.04/64 bit - Window 10 64 bit
Intel 6800K, GeForce Gtx 1060, 32 gb ram.
Amd Ryzen 9 5950X, GeForce 3070, 128 gb ram.
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: Removing 'ASCII' switch from PureBasic

Post by IdeasVacuum »

we have not had 1 request for Unicode
Same here - but isn't that because the customers do not program themselves? What I have got are customers that want their app to support a multitude of different languages (inc. Chinese Simplified/Traditional) and using Unicode has made that easy to develop and test.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
graph100
Enthusiast
Posts: 115
Joined: Tue Aug 10, 2010 3:17 pm

Re: Removing 'ASCII' switch from PureBasic

Post by graph100 »

I took codeprof's code and ran some tests:

Code:

DisableDebugger

Str.s
#Text = "1234567890"

Time = ElapsedMilliseconds()

For i=1 To 10000
	Str = UCase(Str+#Text)
Next i

time = ElapsedMilliseconds()-Time                      

MessageRequester("", StrF(time/1000, 3))

;Results:  (Tested With Linux 64Bit)
;5.5s Unicode
;0.7s ASCII


;Windows 8 x64, PB 5.21 x64
; 3.38 - 3.47 unicode
; 3.26 - 3.21 ascii

;Windows 8 x64, PB 5.22 x86
; 3.56 - 3.50 unicode
; 3.13 - 3.10 ascii

;Windows 8 x64, PB 5.30 x86
; 3.55 - 3.45 unicode
; 3.37 - 3.39 ascii

;Mandriva 2010.2 x86, PB 5.22 x86
; 5.38 - 5.22 unicode
; 0.52 - 0.50 ascii

;MacOS X, PB 5.21 x86
; 3.72 - 3.62 unicode
; 3.25 - 3.13 ascii
-> ASCII is always faster than Unicode.
-> On my Windows and Mac machines, ASCII and Unicode are really close: Unicode is only about 2 to 15% slower.
-> On Linux (my x86 and his x64), Unicode is much slower than ASCII, around 8 to 10 times slower.

The difference on Linux must come from some really well optimized routines for ASCII.
Only the devs can know where it comes from.
_________________________________________________
My Website : CeriseCode (Warning : perpetual changes & not completed ;))
juror
Enthusiast
Posts: 228
Joined: Mon Jul 09, 2007 4:47 pm
Location: Courthouse

Re: Removing 'ASCII' switch from PureBasic

Post by juror »

DK_PETER wrote:@Juror
Those conditions are in my view completely intolerable.
Actually, they're not that unusual for government agencies in this country. I've worked for large companies where the contracts with small vendors were even worse.
DK_PETER wrote: If you can satisfy their needs using an obsolete version of PB, why should they have a problem?
Most of these types of contracts have grown over the years to accommodate each/every eventuality. Some agency may have been burned sometime using unsupported software. As I said, I've seen worse than these. When I worked in the pharmaceutical industry, our contracts were much worse than these, largely because we had to cover every eventuality from an FDA regulatory view. From what I've read in the forums, frequent contributor DoubleDutch may work in an environment even more restrictive than ours.
DK_PETER wrote:Do they require, that you show receipts for your PB purchase and examine your version of PB?
They haven't yet, but our records are auditable by them anytime they demand.
DK_PETER wrote:This is pure madness. Under no circumstance would I agree to such terms.
You're obviously much more successful than we are. We don't love the terms, but we can live with them in order to get the business. As I said, without their contract business, we don't remain a viable business. Individual end-user sales are not sufficient. We're actually phasing out our end-user sales in order to concentrate on securing additional contracts. It's possible that since many small suppliers feel like you do, and will not agree to their terms, we are at an advantage because we will.

Sorry. I'm beginning to stray far too "off-topic". We will accommodate the unicode change. It would just be nicer if we had a little more time. And it's not an option for us to "just use an unsupported version".