Does a string contain only alphabetical characters?

XCoder
User
Posts: 68
Joined: Tue Dec 31, 2013 9:18 pm

Post by XCoder »

I wrote the following procedure to determine whether a string contains only alphabetical characters as PB does not have a function for this purpose and I could not find a solution in the forum.

I used asm for speed.

Code: Select all

Procedure.l IsStrAlpha(*theString)
  ;=========================================================================================
  ; PURPOSE.... determines whether a string contains only alphabetical characters
  
  ; PARAMETER.. *theString - Address of the string to check
  
  ; RETURNS.... 1 if the string contains only alphabetical characters
  ;............ 0 if the string contains any non-alphabetical characters (including spaces)
  ;=========================================================================================
  
  !mov eax, [p.p_theString] ; Get address of string in eax
  !push esi               ; preserve esi
  !mov esi, eax           ; copy address of string to esi
  !cld                    ; makes esi count upwards when lodsw is used (hence fetches next character in string)
  !sub eax, eax           ; clear eax
  
!l_NextChar:  
  !lodsw                  ; get word pointed to by esi in ax - use word for Unicode strings [for ascii strings use lodsb and test al,al]
  !test   ax,ax           ; check if the whole word is zero ie is this end of string? (testing al alone stops early at characters such as U+0100)
  !jz  l_IsAlpha          ; if we have reached the end of the string then the string is alphabetical
  
  !cmp ax, 65             ; A=65 If ax is less than 65 then ax holds a non-alphabetical character
  !jl l_NotAlpha
  !cmp ax, 91             ; Z=90 If ax is less than 91 then ax holds an alphabetical character
  !jl l_NextChar          ; get next character to examine
  
  !cmp ax, 97             ; a=97 If ax is less than 97 then ax holds a non-alphabetical character
  !jl l_NotAlpha           
  !cmp ax, 123            ; z=122 If ax is less than 123 then ax holds an alphabetical character
  !jl l_NextChar          ; get next character to examine
  
!l_NotAlpha:              ; string contains a non-alphabetical character
  !pop esi                ; restore esi
  !mov eax, 0
  !jmp l_exit

!l_IsAlpha:               ; string contains only alphabetical characters
  !pop esi                ; restore esi
  !mov eax, 1

!l_exit:

  ProcedureReturn
EndProcedure

a$ = "12345"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$ = "thequickbrownfoxjumpsoverthelazydogs"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$="THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$ = "thequickbrownfoxjumpsoverthelazydog9"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$="THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGS9"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)
I hope others may find this useful. :lol:
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Does a string contain only alphabetical characters?

Post by wilbert »

Unfortunately your code is 32-bit only.
Here's a procedure that should give the same results but also works on 64-bit.

Code: Select all

Procedure.l IsStrAlpha(*theString)
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_theString]
    !.l0:
    !movzx eax, word [rdx]
    !add rdx, 2
  CompilerElse
    !mov edx, [p.p_theString]
    !.l0:
    !movzx eax, word [edx]
    !add edx, 2
  CompilerEndIf
  !lea ecx, [eax - 65]
  !and ecx, -33
  !cmp ecx, 26
  !jb .l0
  !sub eax, 1
  !shr eax, 31
  ProcedureReturn
EndProcedure
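
For anyone who does not read asm, the range check above can be sketched in plain PureBasic. This is only an illustration under assumptions: the name IsStrAlphaPB is made up for this sketch, and it assumes a Unicode build (2 bytes per character). Subtracting 65 moves 'A' to 0, and clearing bit 5 (value 32) folds 'a'..'z' onto the same 0..25 range, so a single range test covers both cases.

```purebasic
; Sketch only: IsStrAlphaPB is a hypothetical name, not part of PureBasic.
Procedure.i IsStrAlphaPB(*p.Unicode)
  Protected c.i
  While *p\u <> 0                   ; stop at the terminating null character
    c = (*p\u - 65) & ~32           ; 'A'..'Z' and 'a'..'z' both end up in 0..25
    If c < 0 Or c > 25              ; anything else is outside that range
      ProcedureReturn #False
    EndIf
    *p + SizeOf(Unicode)            ; advance one 2-byte character
  Wend
  ProcedureReturn #True             ; also reached for an empty string
EndProcedure

a$ = "HelloWorld"
Debug IsStrAlphaPB(@a$)             ; only letters
b$ = "Hello World"
Debug IsStrAlphaPB(@b$)             ; contains a space
```

Like the asm version, it returns 1 for an empty string. It is meant to show the logic, not to compete with the asm for speed.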
Last edited by wilbert on Wed Sep 30, 2020 6:21 am, edited 1 time in total.
Windows (x64)
Raspberry Pi OS (Arm64)
kvitaliy
Enthusiast
Posts: 162
Joined: Mon May 10, 2010 4:02 pm

Re: Does a string contain only alphabetical characters?

Post by kvitaliy »

Does this only work for English strings?

Code: Select all

a$ = "Anfängerfragen" ; German
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$="expérience " ; French
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)
Tawbie
User
Posts: 26
Joined: Fri Jul 10, 2020 2:36 am

Re: Does a string contain only alphabetical characters?

Post by Tawbie »

Testing whether a Unicode string (i.e. a string supporting international languages) consists only of alpha characters is, of course, a much bigger job than for ASCII alone, and for that I think it is best to use API functions.

Here's a quick example for WINDOWS ONLY. Note that this code is not built for speed, just simplicity:

Code: Select all


Procedure.i IsStrAlpha(*p)
  ;
  ; procedure tests if unicode string is comprised of alpha characters only. Returns 1 for alpha, 0 otherwise.
  ; Windows ONLY, using Windows API
  ; *p = pointer to string
  ; PB v.5.72; x.64, Unicode only, Windows 10 Pro
  ;  
  Protected Alpha.i

  Alpha = #True

  While PeekU(*p) <> 0                    ; loop until you reach Null at end of string
    If IsCharAlpha_(PeekU(*p)) = 0
      Alpha = #False
      Break                               ; exit at first non-alpha character
    EndIf
    *p+2                                  ; unicode uses 2 bytes per character
  Wend
  
  ProcedureReturn Alpha

EndProcedure


; TESTING:
; First, a string comprised of Unicode alpha characters:
X$ = "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)

; Next, a string containing symbols as well:
Y$ = "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF) + Chr($2605) + Chr($266F)

Debug X$
Debug IsStrAlpha(@X$)
Debug ""
Debug Y$
Debug IsStrAlpha(@Y$)


End
minimy
Enthusiast
Posts: 344
Joined: Mon Jul 08, 2013 8:43 pm

Re: Does a string contain only alphabetical characters?

Post by minimy »

Hi everyone!

Nice job! Thanks for sharing!
But I tried it with the special Spanish character Ñ, and it does not work as expected.

Ñ= 209
ñ= 241

Code: Select all

X$ = "ABCabc Ñ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)

; Next, a string containing symbols as well:
Y$ = "ABCabc Ñ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF) + Chr($2605) + Chr($266F)

Debug X$
Debug IsStrAlpha(@X$)
Debug ""
Debug Y$
Debug IsStrAlpha(@Y$)
It returns this:

Code: Select all

ABCabc ÑÖߟƛṀᾯ
0

ABCabc ÑÖߟƛṀᾯ★♯
0
If translation=Error: reply="Sorry, Im Spanish": Endif
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Does a string contain only alphabetical characters?

Post by wilbert »

minimy wrote: But I tried it with the special Spanish character Ñ, and it does not work as expected.
If you want to handle special characters as well, the fastest way is to use asm combined with a lookup table that indicates which characters are valid.
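
A rough sketch of the lookup-table idea, written in plain PureBasic rather than asm to keep it short. Everything named here is an assumption for illustration: IsAlphaTable, InitAlphaTable and IsStrAlphaTable are made-up names, and the table contents (ASCII letters plus the Latin-1 letters $C0 to $FF, without the × and ÷ signs) are just one example of what could be marked as valid. It assumes a Unicode build.

```purebasic
; Sketch only: all names below are hypothetical.
Global Dim IsAlphaTable.a(65535)    ; one byte per 16-bit character code, 1 = valid

Procedure InitAlphaTable()
  Protected i
  For i = 'A' To 'Z' : IsAlphaTable(i) = 1 : Next
  For i = 'a' To 'z' : IsAlphaTable(i) = 1 : Next
  For i = $C0 To $FF                ; Latin-1 letters such as Ñ (209) and ñ (241)
    If i <> $D7 And i <> $F7        ; skip the multiplication and division signs
      IsAlphaTable(i) = 1
    EndIf
  Next
EndProcedure

Procedure.i IsStrAlphaTable(*p.Unicode)
  While *p\u <> 0                   ; stop at the terminating null character
    If IsAlphaTable(*p\u) = 0       ; one table lookup per character
      ProcedureReturn #False
    EndIf
    *p + SizeOf(Unicode)
  Wend
  ProcedureReturn #True
EndProcedure

InitAlphaTable()
a$ = "ABCabc" + Chr(209)            ; Chr(209) = Ñ
Debug IsStrAlphaTable(@a$)
b$ = "ABCabc " + Chr(209)           ; same, but with a space before Ñ
Debug IsStrAlphaTable(@b$)
```

The table decides what counts as a letter, so other alphabets can be added by marking more entries in InitAlphaTable().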
Windows (x64)
Raspberry Pi OS (Arm64)
Tawbie
User
User
Posts: 26
Joined: Fri Jul 10, 2020 2:36 am

Re: Does a string contain only alphabetical characters?

Post by Tawbie »

minimy wrote: But I tried it with the special Spanish character Ñ, and it does not work as expected.
...
X$ = "ABCabc Ñ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)
...
ABCabc ÑÖߟƛṀᾯ
0
It actually did work as expected. The space character is NOT an alpha character: you inserted a space before the Spanish character Ñ, and that made the string not all alpha.
If you remove the space, it should work as expected:

X$ = "ABCabcÑ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)

Debug Output:
ABCabcÑÖߟƛṀᾯ
1
Marc56us
Addict
Posts: 1477
Joined: Sat Feb 08, 2014 3:26 pm

Re: Does a string contain only alphabetical characters?

Post by Marc56us »

The RegEx version

Code: Select all

; IsStrAlpha
; Marc56us - 2020/10/02 - PB 5.72 LTS

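#RegEx = "[^\p{L}]+" ; matches any run of non-letter characters

Enumeration
    #RegEx_NonLetter ; ID of the compiled regular expression (the original used an undeclared variable that only worked because it defaulted to 0)
EndEnumeration

If Not CreateRegularExpression(#RegEx_NonLetter, #RegEx)
    Debug "RegEx error" : End
EndIf

Procedure.l IsStrAlpha(theString$)
    If MatchRegularExpression(#RegEx_NonLetter, theString$)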
        ProcedureReturn 0
    Else    
        ProcedureReturn 1   
    EndIf
EndProcedure

Repeat
    Read.s a$
    If a$ = "EOT" : Break : EndIf
    Debug "" + IsStrAlpha(a$) + " - " + a$
ForEver

End

DataSection
    Data.s "12345"
    Data.s "thequickbrownfoxjumpsoverthelazydogs"
    Data.s "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG"
    Data.s "thequickbrownfoxjumpsoverthelazydog9"
    Data.s "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGS9"
    Data.s "Anfängerfragen" ; German
    Data.s "expérience"     ; French
    Data.s "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)
    Data.s "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF) + Chr($2605) + Chr($266F)
    Data.s "EOT"
EndDataSection

Code: Select all

0 - 12345
1 - thequickbrownfoxjumpsoverthelazydogs
1 - THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG
0 - thequickbrownfoxjumpsoverthelazydog9
0 - THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGS9
1 - Anfängerfragen
1 - expérience
1 - ABCabcÖߟƛṀᾯ
0 - ABCabcÖߟƛṀᾯ★♯
:wink:
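
The pattern above looks for any non-letter and inverts the result. The same test can probably also be written the other way round, as a pattern that must match the whole string. A minimal sketch, assuming the same PB 5.72 Unicode setup; #RegEx_AllLetters and IsStrAlphaRx are names made up for this example:

```purebasic
; Sketch only: #RegEx_AllLetters and IsStrAlphaRx are hypothetical names.
Enumeration
  #RegEx_AllLetters
EndEnumeration

; \A and \z anchor the match to the whole string; \p{L}+ needs at least one letter
If Not CreateRegularExpression(#RegEx_AllLetters, "\A\p{L}+\z")
  Debug "RegEx error" : End
EndIf

Procedure.i IsStrAlphaRx(theString$)
  If MatchRegularExpression(#RegEx_AllLetters, theString$)
    ProcedureReturn 1
  EndIf
  ProcedureReturn 0
EndProcedure

Debug IsStrAlphaRx("Anfängerfragen")
Debug IsStrAlphaRx("thequickbrownfoxjumpsoverthelazydog9")
```

One difference in behaviour: because \p{L}+ requires at least one letter, an empty string should come back as 0 here, whereas the negated version reports 1 for it.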
kvitaliy
Enthusiast
Posts: 162
Joined: Mon May 10, 2010 4:02 pm

Re: Does a string contain only alphabetical characters?

Post by kvitaliy »

Marc56us wrote:The RegEx version
Excellent! It also works in Russian.
1 - qwertyйцукенQWERTYЙЦУКЕН