Does a string contain only alphabetical characters?

Posted: Tue Sep 29, 2020 5:53 pm
by XCoder
I wrote the following procedure to determine whether a string contains only alphabetical characters, as PB does not have a function for this purpose and I could not find a solution in the forum.

I used asm for speed.

Code: Select all

Procedure.l IsStrAlpha(*theString)
  ;=========================================================================================
  ; PURPOSE.... determines whether a string contains only alphabetical characters
  
  ; PARAMETER.. *theString - Address of the string to check
  
  ; RETURNS.... 1 if the string contains only alphabetical characters
  ;............ 0 if the string contains any non-alphabetical characters (including spaces)
  ;=========================================================================================
  
  !mov eax, [p.p_theString] ; Get address of string in eax
  !push esi               ; preserve esi
  !mov esi, eax           ; copy address of string to esi
  !cld                    ; makes esi count upwards when lodsw is used (hence fetches next character in string)
  !sub eax, eax           ; clear eax
  
!l_NextChar:  
  !lodsw                  ; get word pointed to by esi in ax - use word for Unicode strings [lodsb for ascii strings]
  !test   ax,ax           ; check if the whole 16-bit character is zero ie is this end of string?
  !jz  l_IsAlpha          ; if we have reached the end of the string then the string is alphabetical
  
  !cmp ax, 65             ; A=65 If ax is less than 65 then ax holds a non-alphabetical character
  !jl l_NotAlpha
  !cmp ax, 91             ; Z=90 If ax is less than 91 then ax holds an alphabetical character
  !jl l_NextChar          ; get next character to examine
  
  !cmp ax, 97             ; a=97 If ax is less than 97 then ax holds a non-alphabetical character
  !jl l_NotAlpha           
  !cmp ax, 123            ; z=122 If ax is less than 123 then ax holds an alphabetical character
  !jl l_NextChar          ; get next character to examine
  
!l_NotAlpha:              ; string contains a non-alphabetical character
  !pop esi                ; restore esi
  !mov eax, 0
  !jmp l_exit

!l_IsAlpha:               ; string contains only alphabetical characters
  !pop esi                ; restore esi
  !mov eax, 1

!l_exit:

  ProcedureReturn
EndProcedure

a$ = "12345"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$ = "thequickbrownfoxjumpsoverthelazydogs"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$="THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$ = "thequickbrownfoxjumpsoverthelazydog9"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$="THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGS9"
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)
I hope others may find this useful. :lol:

Re: Does a string contain only alphabetical characters?

Posted: Tue Sep 29, 2020 6:20 pm
by wilbert
Unfortunately, your code is 32-bit only.
Here's a procedure that should give the same results but also works on 64-bit.

Code: Select all

Procedure.l IsStrAlpha(*theString)
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_theString]
    !.l0:
    !movzx eax, word [rdx]
    !add rdx, 2
  CompilerElse
    !mov edx, [p.p_theString]
    !.l0:
    !movzx eax, word [edx]
    !add edx, 2
  CompilerEndIf
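  !lea ecx, [eax - 65]    ; ecx = character - 'A'
  !and ecx, -33           ; clear bit 5 so that 'a'..'z' map onto the same range as 'A'..'Z'
  !cmp ecx, 26
  !jb .l0                 ; unsigned compare: 0..25 means a letter, so fetch the next character
  !sub eax, 1             ; only a terminating null (0) wraps to -1 here
  !shr eax, 31            ; move the sign bit into bit 0: 1 if the loop ended on the null, else 0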
  ProcedureReturn
EndProcedure
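
For readers who do not read asm, the range test used above can be sketched in plain PureBasic. This is an illustrative sketch only, not wilbert's code and not tuned for speed; the name IsStrAlphaPlain is hypothetical, and it assumes a Unicode build where each character occupies 2 bytes.

```purebasic
; Sketch only: plain PureBasic rendering of the (char - 65) And Not 32 range test.
; IsStrAlphaPlain is a hypothetical name, not part of the original posts.
Procedure.l IsStrAlphaPlain(*theString)
  Protected c.l, folded.l
  Repeat
    c = PeekU(*theString)              ; read one 16-bit character
    *theString + 2                     ; advance to the next character
    folded = (c - 65) & ~32            ; clearing bit 5 maps 'a'..'z' onto 'A'..'Z'
  Until folded < 0 Or folded >= 26     ; leave the loop at the first non-letter
  If c = 0
    ProcedureReturn 1                  ; stopped on the terminating null: only letters seen
  EndIf
  ProcedureReturn 0                    ; stopped on some other character
EndProcedure

a$ = "HelloWorld"
Debug IsStrAlphaPlain(@a$)             ; 1
a$ = "Hello World"
Debug IsStrAlphaPlain(@a$)             ; 0 (the space is not a letter)
```

Like both asm versions, this returns 1 for an empty string and only recognises A-Z and a-z.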

Re: Does a string contain only alphabetical characters?

Posted: Tue Sep 29, 2020 7:34 pm
by kvitaliy
Does this only work for English strings?

Code: Select all

a$ = "Anfängerfragen" ; German
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

a$="expérience" ; French
Debug "String "+ a$ + " returns " + IsStrAlpha(@a$)

Re: Does a string contain only alphabetical characters?

Posted: Wed Sep 30, 2020 4:35 am
by Tawbie
Testing whether a Unicode string (i.e. a string supporting international languages) consists only of alpha characters is, of course, a much bigger job than for ASCII alone, and for that I think it is best to use API functions.

Here's a quick example for WINDOWS ONLY. Note that this code is not built for speed, just simplicity:

Code: Select all


Procedure.i IsStrAlpha(*p)
  ;
  ; procedure tests if unicode string is comprised of alpha characters only. Returns 1 for alpha, 0 otherwise.
  ; Windows ONLY, using Windows API
  ; *p = pointer to string
  ; PB v.5.72; x.64, Unicode only, Windows 10 Pro
  ;  
  Protected Alpha.i

  Alpha = #True

  While PeekU(*p) <> 0                    ; loop until you reach Null at end of string
    If IsCharAlpha_(PeekU(*p)) = 0
      Alpha = #False
      Break                               ; exit at first non-alpha character
    EndIf
    *p+2                                  ; unicode uses 2 bytes per character
  Wend
  
  ProcedureReturn Alpha

EndProcedure


; TESTING:
; First, a string comprised of Unicode alpha characters:
X$ = "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)

; Next, a string containing symbols as well:
Y$ = "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF) + Chr($2605) + Chr($266F)

Debug X$
Debug IsStrAlpha(@X$)
Debug ""
Debug Y$
Debug IsStrAlpha(@Y$)


End

Re: Does a string contain only alphabetical characters?

Posted: Thu Oct 01, 2020 4:38 pm
by minimy
Hi everyone!

Nice job! Thanks for sharing!
But I tried it with the special Spanish character Ñ and it did not work as expected.

Ñ= 209
ñ= 241

Code: Select all

X$ = "ABCabc Ñ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)

; Next, a string containing symbols as well:
Y$ = "ABCabc Ñ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF) + Chr($2605) + Chr($266F)

Debug X$
Debug IsStrAlpha(@X$)
Debug ""
Debug Y$
Debug IsStrAlpha(@Y$)
It returns this:

Code: Select all

ABCabc ÑÖߟƛṀᾯ
0

ABCabc ÑÖߟƛṀᾯ★♯
0

Re: Does a string contain only alphabetical characters?

Posted: Thu Oct 01, 2020 5:14 pm
by wilbert
minimy wrote: But I tried it with the special Spanish character Ñ and it did not work as expected.
If you want to handle special characters as well, the fastest way is to use asm combined with a lookup table that indicates which characters are valid.
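
The lookup-table idea can be sketched roughly as follows. This is plain PureBasic rather than asm, and it is only an illustration under stated assumptions: the names IsAlphaTable, MarkRange and IsStrAlphaTable are hypothetical, and the choice to flag the Latin-1 letters ($C0-$FF without the multiplication and division signs) is just an example of a table you would fill to suit your own needs.

```purebasic
; Sketch only: table-driven check in plain PureBasic (not asm).
; IsAlphaTable, MarkRange and IsStrAlphaTable are hypothetical names.
Global Dim IsAlphaTable.a(65535)          ; one flag per 16-bit character, all 0 at start

Procedure MarkRange(firstChar, lastChar)
  Protected c
  For c = firstChar To lastChar
    IsAlphaTable(c) = 1                   ; 1 = this character counts as a letter
  Next
EndProcedure

MarkRange('A', 'Z')
MarkRange('a', 'z')
MarkRange($C0, $D6)                       ; assumption: accept Latin-1 letters (includes Ñ = 209)
MarkRange($D8, $F6)                       ; skips the multiplication sign ($D7)
MarkRange($F8, $FF)                       ; skips the division sign ($F7)

Procedure.l IsStrAlphaTable(*theString)
  Protected c
  c = PeekU(*theString)
  While c <> 0                            ; loop until the terminating null
    If IsAlphaTable(c) = 0
      ProcedureReturn 0                   ; first character not flagged in the table
    EndIf
    *theString + 2                        ; Unicode strings use 2 bytes per character
    c = PeekU(*theString)
  Wend
  ProcedureReturn 1
EndProcedure

a$ = "ABCabc" + Chr(209)                  ; ends with Ñ
Debug IsStrAlphaTable(@a$)                ; 1
```

The per-character work is a single array read, which is what makes the approach easy to carry over to asm later.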

Re: Does a string contain only alphabetical characters?

Posted: Thu Oct 01, 2020 11:55 pm
by Tawbie
minimy wrote: But I tried it with the special Spanish character Ñ and it did not work as expected.
...
X$ = "ABCabc Ñ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)
...
ABCabc ÑÖߟƛṀᾯ
0
It actually did work as expected: the space character is NOT an alpha character. You inserted a space before the Spanish character Ñ, and that made the string not all alpha.
If you remove the space, it should work as expected:

X$ = "ABCabcÑ" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)

Debug Output:
ABCabcÑÖߟƛṀᾯ
1

Re: Does a string contain only alphabetical characters?

Posted: Fri Oct 02, 2020 10:30 am
by Marc56us
The RegEx version

Code: Select all

; IsStrAlpha
; Marc56us - 2020/10/02 - PB 5.72 LTS

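#RegEx = "[^\p{L}]+"   ; matches any run of characters that are not Unicode letters
#RegEx_ID = 0          ; regular expression number, visible in the main code and in procedures

If Not CreateRegularExpression(#RegEx_ID, #RegEx)
    Debug "RegEx error" : End
EndIf

Procedure.l IsStrAlpha(theString$)
    ; a match means at least one non-letter character was found
    If MatchRegularExpression(#RegEx_ID, theString$)
        ProcedureReturn 0
    Else    
        ProcedureReturn 1   
    EndIf
EndProcedure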

Repeat
    Read.s a$
    If a$ = "EOT" : Break : EndIf
    Debug "" + IsStrAlpha(a$) + " - " + a$
ForEver

End

DataSection
    Data.s "12345"
    Data.s "thequickbrownfoxjumpsoverthelazydogs"
    Data.s "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG"
    Data.s "thequickbrownfoxjumpsoverthelazydog9"
    Data.s "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGS9"
    Data.s "Anfängerfragen" ; German
    Data.s "expérience"     ; French
    Data.s "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF)
    Data.s "ABCabc" + Chr($00D6) + Chr($00DF) + Chr($0178) + Chr($019B) + Chr($1E40) + Chr($1FAF) + Chr($2605) + Chr($266F)
    Data.s "EOT"
EndDataSection

Code: Select all

0 - 12345
1 - thequickbrownfoxjumpsoverthelazydogs
1 - THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG
0 - thequickbrownfoxjumpsoverthelazydog9
0 - THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGS9
1 - Anfängerfragen
1 - expérience
1 - ABCabcÖߟƛṀᾯ
0 - ABCabcÖߟƛṀᾯ★♯
:wink:

Re: Does a string contain only alphabetical characters?

Posted: Fri Oct 02, 2020 12:49 pm
by kvitaliy
Marc56us wrote:The RegEx version
Excellent! It also works in Russian.
1 - qwertyйцукенQWERTYЙЦУКЕН