IsAlpha IsNumeric


Re: IsAlpha IsNumeric

Post by Demivec »

SFSxOI wrote:To add other specific characters in case you want to refine the alphanumeric expression just add them after the [:alnum:] and before the following bracket - for example I added a '.' to support floats with this [:alnum:].] (note the '.' after [:alnum:] but before the next bracket)
@SFSxOI: I thought the topic of the thread was about functions for determining 'IsAlpha' or 'IsNumeric', not 'IsAlphaNumeric' :) .

Even though floats are numerals plus symbols, they don't include letters. Wouldn't that make alphanumeric detection a bit of overkill, unless you are detecting number bases higher than ten?

Post by SFSxOI »

Demivec wrote:
@SFSxOI: I thought the topic of the thread was about functions for determining 'IsAlpha' or 'IsNumeric', not 'IsAlphaNumeric' :) .

Even though floats are numerals + symbols they don't include letters. Wouldn't that make alpha-numeric detection a bit of an overkill unless you are detecting number bases higher than ten.

I was just offering examples specific to the other, more recent posts, which used character classes, and showing how to modify character class usage for special cases; this can simplify regular expressions for common alpha and numeric checks. For example, luis asked for something that would handle "1,0", so I offered IsAlphaNumeric because of the thread topic, but he could have used a 'digit' class regular expression and added a ',' if all he was interested in was strings of digits with a ',' in them. However, since you brought it up, back to straight Alpha or Numeric (and one for floats):


[:digit:] = Only the digits 0 to 9.
[:alnum:] = Any alphanumeric character: 0 to 9, A to Z or a to z.
[:alpha:] = Any alpha character: A to Z or a to z.
[:blank:] = Space and TAB characters only.
[:xdigit:] = Hexadecimal digits: 0-9, A-F, a-f.
[:punct:] = Punctuation symbols . , " ' ? ! ; : # $ % & ( ) * + - / < > = @ [ ] \ ^ _ ` { } | ~
[:print:] = Any printable character, including the space.
[:space:] = Any whitespace character (space, TAB, NL, VT, FF, CR). Many systems abbreviate this as \s.
[:graph:] = Any printable character except the space (i.e. [:print:] without the space). It has no shorthand; \W means "not a word character".
[:upper:] = Any alpha character A to Z.
[:lower:] = Any alpha character a to z.
[:cntrl:] = Control characters (ASCII 0 to 31 and 127): NUL SOH STX ETX EOT ENQ ACK BEL BS TAB LF VT FF CR SO SI DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC IS1 IS2 IS3 IS4 DEL.

Code: Select all

Procedure IsNumeric(in_str.s)
  rex_IsNumeric = CreateRegularExpression(#PB_Any,"^[[:digit:]]+$") ; Any digit 0-9
  Is_Numeric.b = MatchRegularExpression(rex_IsNumeric, in_str)
  ProcedureReturn Is_Numeric
EndProcedure

Procedure IsNumericFloat(in_str.s)
  rex_IsNumericFloat = CreateRegularExpression(#PB_Any,"^[[:digit:].]+$") ; Any digit 0-9 and float
  Is_NumericFloat.b = MatchRegularExpression(rex_IsNumericFloat, in_str)
  ProcedureReturn Is_NumericFloat
EndProcedure

Procedure IsAlpha(in_str.s)
  rex_isAlpha = CreateRegularExpression(#PB_Any,"^[[:alpha:]]+$") ; A-Z and a-z
  is_Alpha.b = MatchRegularExpression(rex_isAlpha, in_str)
  ProcedureReturn is_Alpha
EndProcedure

Debug IsNumeric("1234567890")
Debug IsNumericFloat("1.0")
Debug IsNumericFloat("123456901.1235")
Debug IsAlpha("ABCDabcd")

There are also abbreviations for some of the above:

\d = Match any character in the range 0 - 9 = [[:digit:]]
\D = Match any character NOT in the range 0 - 9 = [^[:digit:]]
\s = Match any whitespace character (space, tab etc.) = [[:space:]], except that older PCRE versions do not treat VT as \s
\S = Match any character that is NOT whitespace = [^[:space:]]
\w = Match any character in the range 0 - 9, A - Z, a - z, or the underscore = [[:alnum:]_]
\W = Match any character NOT in the range 0 - 9, A - Z, a - z, and not the underscore = [^[:alnum:]_]

Example using abbreviation:

Code: Select all

Procedure IsNumeric(in_str.s)
  rex_IsNumeric = CreateRegularExpression(#PB_Any,"^\d+$") ; Any digit 0-9
  Is_Numeric.b = MatchRegularExpression(rex_IsNumeric, in_str)
  ProcedureReturn Is_Numeric
EndProcedure

Debug IsNumeric("123")
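
One difference between the abbreviations and the classes is easy to miss: \w also accepts the underscore, while [:alnum:] does not. The following is only a minimal sketch of that difference, assuming PureBasic's PCRE-based regular expression library; `RegexMatches` is a hypothetical helper name used for illustration, not a PureBasic function.

```purebasic
EnableExplicit

; Hypothetical helper: nonzero if text$ matches pattern$, 0 otherwise.
; The expression is freed again before returning.
Procedure.i RegexMatches(pattern$, text$)
  Protected result = #False
  Protected rex = CreateRegularExpression(#PB_Any, pattern$)
  If rex
    result = MatchRegularExpression(rex, text$)
    FreeRegularExpression(rex)
  Else
    Debug "Pattern error: " + RegularExpressionError()
  EndIf
  ProcedureReturn result
EndProcedure

Debug RegexMatches("^\w+$", "abc_123")            ; \w accepts the underscore: match
Debug RegexMatches("^[[:alnum:]]+$", "abc_123")   ; [:alnum:] has no underscore: no match
Debug RegexMatches("^[[:alnum:]_]+$", "abc_123")  ; class plus explicit underscore: match
```

If the underscore should not count as alphanumeric, the class form is the safer choice.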
Last edited by SFSxOI on Thu Aug 16, 2012 6:14 pm, edited 1 time in total.
The advantage of a 64 bit operating system over a 32 bit operating system comes down to only being twice the headache.

Post by Guimauve »

@SFSxOI

I'm not sure, but I think your code can lead to a memory leak, because each function call creates a regular expression that is never released. To avoid this, you have to do at least:

Code: Select all

Procedure IsNumeric(in_str.s)
	rex_IsNumeric = CreateRegularExpression(#PB_Any,"^[[:digit:]]+$") ; Any digit 0-9
	Is_Numeric.b = MatchRegularExpression(rex_IsNumeric, in_str)
	FreeRegularExpression(rex_IsNumeric)
	ProcedureReturn Is_Numeric
EndProcedure

Procedure IsNumericFloat(in_str.s)
	rex_IsNumericFloat = CreateRegularExpression(#PB_Any,"^[[:digit:].]+$") ; Any digit 0-9 and float
	Is_NumericFloat.b = MatchRegularExpression(rex_IsNumericFloat, in_str)
	FreeRegularExpression(rex_IsNumericFloat)
	ProcedureReturn Is_NumericFloat
EndProcedure

Procedure IsAlpha(in_str.s)
	rex_IsAlpha = CreateRegularExpression(#PB_Any,"^[[:alpha:]]+$") ; A-Z and a-z
	is_Alpha.b = MatchRegularExpression(rex_isAlpha, in_str)
	FreeRegularExpression(rex_IsAlpha)
	ProcedureReturn is_Alpha
EndProcedure

Debug IsNumeric("1234567890")
Debug IsNumericFloat("1.0")
Debug IsNumericFloat("123456901.1235")
Debug IsAlpha("ABCDabcd")
Or, much better and faster (as long as you accept having a few regular expressions alive for the whole run of your program):

Code: Select all

Procedure.b IsNumeric(in_str.s)
  
  Static rex_IsNumeric
  
  If rex_IsNumeric = #Null
    rex_IsNumeric = CreateRegularExpression(#PB_Any,"^[[:digit:]]+$") ; Any digit 0-9
  EndIf
  
  ProcedureReturn MatchRegularExpression(rex_IsNumeric, in_str)
EndProcedure

Procedure.b IsNumericFloat(in_str.s)
  
  Static rex_IsNumericFloat
  
  If rex_IsNumericFloat = #Null
    rex_IsNumericFloat = CreateRegularExpression(#PB_Any,"^[[:digit:].]+$") ; Any digit 0-9 and float
  EndIf
  
  ProcedureReturn MatchRegularExpression(rex_IsNumericFloat, in_str)
EndProcedure

Procedure.b IsAlpha(in_str.s)
  
  Static rex_isAlpha
  
  If rex_isAlpha = #Null
    rex_isAlpha = CreateRegularExpression(#PB_Any,"^[[:alpha:]]+$") ; A-Z and a-z
  EndIf
  
  ProcedureReturn MatchRegularExpression(rex_isAlpha, in_str)
EndProcedure

Debug IsNumeric("1234567890")
Debug IsNumericFloat("1.0")
Debug IsNumericFloat("123456901.1235")
Debug IsAlpha("ABCDabcd")
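
The cached procedures above pass the handle straight to MatchRegularExpression(), even if CreateRegularExpression() returned 0. With a fixed, valid pattern that should not happen, but a defensive variant could look like the sketch below. `IsNumericCached` and `triedCreate` are hypothetical names; this is one possible arrangement, not the only one.

```purebasic
EnableExplicit

Procedure.i IsNumericCached(in_str.s)
  Static rex          ; created on the first call, then reused for every later call
  Static triedCreate  ; so a failed creation is reported only once

  If triedCreate = #False
    triedCreate = #True
    rex = CreateRegularExpression(#PB_Any, "^[[:digit:]]+$")
    If rex = 0
      Debug "Pattern error: " + RegularExpressionError()
    EndIf
  EndIf

  If rex
    ProcedureReturn MatchRegularExpression(rex, in_str)
  EndIf
  ProcedureReturn #False ; no usable expression, so report "not numeric"
EndProcedure

Debug IsNumericCached("1234567890") ; digits only: match
Debug IsNumericCached("12a4")       ; contains a letter: no match
```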
Best regards
Guimauve

Post by ts-soft »

@sfx

Code: Select all

Procedure IsNumeric(numstr.s)
  Protected Result, Pattern.s = "^[-+]?[0-9]*\.?[0-9]+$"
  Protected RegEx = CreateRegularExpression(#PB_Any, Pattern)
  If RegEx
    Result = MatchRegularExpression(RegEx, numstr)
    FreeRegularExpression(RegEx)
    ProcedureReturn Result
  EndIf
EndProcedure


Procedure IsNumericFloat(in_str.s)
  rex_IsNumericFloat = CreateRegularExpression(#PB_Any,"^[[:digit:].]+$") ; Any digit 0-9 and float
  Is_NumericFloat.b = MatchRegularExpression(rex_IsNumericFloat, in_str)
  ProcedureReturn Is_NumericFloat
EndProcedure

Debug IsNumericFloat("1.012.3")
Debug IsNumeric("1.012.3")
My code is correct; a floating-point number has only one dot!
PureBasic 5.73 | SpiderBasic 2.30 | Windows 10 Pro (x64) | Linux Mint 20.1 (x64)
Old bugs good, new bugs bad! Updates are evil: might fix old bugs and introduce no new ones.

Post by SFSxOI »

Guimauve wrote:I'm not sure but I think your code can lead to a Memory leak problem because at each function calls a Regular expression is created but never released. [...]

Yes, you can add 'FreeRegularExpression' and arrange it any way you wish; mine were just quick examples. It depends on the usage.

Post by SFSxOI »

ts-soft wrote:My code is correct, a floatingpoint number has only one dot!

Yes, I'm aware that a float has only one dot. I was just offering quick, generic examples without getting too involved, since everyone has their own way of doing things; people seemed to be talking about regular expressions, and I wanted to show it could be done differently. You can also modify the ones I posted to accept only one dot if that is a concern. If your code generates what is supposed to be a float and it contains more than one dot, your code will most likely give other indications that something is wrong anyway.

I'm not saying your code is incorrect, just pointing out that things can be simplified and streamlined a little by using classes instead of lengthening a code line with complicated syntax, and by lowering the line count in the procedure to three lines instead of seven.
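
For reference, a class-based pattern that accepts only one dot could look like the sketch below. `IsNumericOneDot` is a hypothetical name, and the pattern is an assumption about what "only one dot" should mean here: it requires digits on both sides of the dot and takes no sign, so it is stricter than ts-soft's pattern, which also accepts a leading sign and forms such as ".5".

```purebasic
EnableExplicit

; Hypothetical name. Accepts "123" and "1.5"; rejects "1.012.3", "1." and ".5".
Procedure.i IsNumericOneDot(in_str.s)
  Protected result = #False
  Protected rex = CreateRegularExpression(#PB_Any, "^[[:digit:]]+(\.[[:digit:]]+)?$")
  If rex
    result = MatchRegularExpression(rex, in_str)
    FreeRegularExpression(rex)
  EndIf
  ProcedureReturn result
EndProcedure

Debug IsNumericOneDot("1.0")            ; one dot: match
Debug IsNumericOneDot("123456901.1235") ; one dot: match
Debug IsNumericOneDot("1.012.3")        ; two dots: no match
```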

Post by ts-soft »

My code is not longer; I only check that the regular expression is okay and free it :wink:
Using classes doesn't shorten the code in any way, only the pattern string.

Post by SFSxOI »

ts-soft wrote:My code is not longer, i only check if regularexpression is okay and free it :wink:
The using of classes doesn't short the code in any way, only the patternstring.
I'm not being critical of your code. I just think a shorter or more "grammatical read" pattern string is easier to envision when troubleshooting and coding than a longer one with more complicated syntax. It's easy to make a mistake, and "^[-+]?[0-9]*\.?[0-9]+$" is a little complicated when you can do "^[[:digit:].]+$". The pattern string is part of the code.

To each his own. :)

Quick example regular expression for whole numbers or floats, with a single decimal point or comma, using the abbreviation "\d" for the "[:digit:]" class:

Code: Select all

Procedure IsNumeric(in_str.s) ; whole number - or - floats - with - single decimal point or comma
  rex_IsNumeric = CreateRegularExpression(#PB_Any,"^\d*[.,]?\d+$") ; Digits 0-9 with at most one '.' or ','; must end in a digit, so an empty string or a lone "." does not match
  Is_Numeric.b = MatchRegularExpression(rex_IsNumeric, in_str)
  ; FreeRegularExpression(rex_IsNumeric) ; if needed or desired - regular expression freed when program ends anyway
  ProcedureReturn Is_Numeric
EndProcedure

Debug IsNumeric("123456789102345678901.12345678910234567890") ; one decimal point
Debug IsNumeric("123456789102345678901,12345678910234567890") ; one comma
Debug IsNumeric("123456789102345678901") ; no decimal point or comma
Debug IsNumeric("0,123456789102345678901") ; preceding 0 then comma
Debug IsNumeric("0.123456789102345678901") ;  preceding 0 then decimal point
Of course, a pattern can get longer or more complicated looking and still remain easier to follow with the simplified syntax of a "grammatical read" form. With a little expansion, the example above becomes the example below, which allows matching math expressions by supporting: a preceding +/- or 'e'; scientific notation with an 'e' or 'E' and +/- (e.g. 10e+34 or 10E+34); a preceding '$' for U.S. currency; the division '/' and percentage '%' symbols; the exponentiation ("raised to the power of") ^ symbol; the greater-than and less-than symbols <>; the equals = symbol; and the sometimes used math "approximately equal" ~ symbol.

Code: Select all

Procedure Is_NumericMathExpression(in_NumMEx_str.s) 
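  ; '-' is placed first in each class so it is a literal and not a range; (?=.*\d) requires at least one digit
  rex_IsNumericMathExpression = CreateRegularExpression(#PB_Any,"^(?=.*\d)[-+$~]?e?\d*([.,]\d+)?([-+^eE<>=~%/]+\d*([.,]\d+)?)*$")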
  Is_NumericExpressionMath.b = MatchRegularExpression(rex_IsNumericMathExpression, in_NumMEx_str)
  FreeRegularExpression(rex_IsNumericMathExpression) ; if needed or desired - regular expression freed when program ends anyway
  ProcedureReturn Is_NumericExpressionMath
EndProcedure

Debug Is_NumericMathExpression("123/2")
Debug Is_NumericMathExpression("123%")
Debug Is_NumericMathExpression("123456789102345678901.12345678910234567890") 
Debug Is_NumericMathExpression("+123456789102345678901,12345678910234567890")
Debug Is_NumericMathExpression("-e123456789102345678901.12345678910234567890")
Debug Is_NumericMathExpression("e123456789102345678901.12345678910234567890")
Debug Is_NumericMathExpression("123456789102345678901e-34")
Debug Is_NumericMathExpression("123456789102345678901.213456e+34")
Debug Is_NumericMathExpression("123456789102345678901,213456E+34")
Debug Is_NumericMathExpression("123E-34")
Debug Is_NumericMathExpression("123456789102345678901^34")
Debug Is_NumericMathExpression("$1.20")
Debug Is_NumericMathExpression("0.23456789102345678901")
Debug Is_NumericMathExpression("0.023456789102345678901")
Debug Is_NumericMathExpression("1.2")
Debug Is_NumericMathExpression("1,2")
Debug Is_NumericMathExpression("1")
Debug Is_NumericMathExpression("0")
Debug Is_NumericMathExpression("2>1")
Debug Is_NumericMathExpression("1<2")
Debug Is_NumericMathExpression("~1.234")
Debug Is_NumericMathExpression("1=1")
Debug Is_NumericMathExpression("1.0001~1.001")
Debug Is_NumericMathExpression("1.001+1=2.001~2")
The point of the bottom example is that determining whether something is numeric is not restricted to whole numbers or floats. Regular expressions can be expanded as needed to cover special cases such as math expressions, where a symbol such as a decimal point can occur more than once.

Note for those not familiar with regular expressions: The * (asterisk or star) matches the preceding element 0 or more times. The ? (question mark) matches the preceding element 0 or 1 times. The '\' is the escape character; it removes the special meaning of a metacharacter so that it can be matched as a literal, for example a decimal point or the currency $ symbol. Inside brackets ([...]) you generally don't need to escape a metacharacter to use it as a literal, but the placement of some characters becomes important. The ^ (circumflex or caret) negates the class if it comes first after the left bracket: [^Ff] means anything except upper- or lower-case F, and [^a-z] means everything except lower-case a to z. In the bottom example, the ^ in the class containing '^eE<>=~%/' is not placed first after the bracket, so it is matched as a literal. The - (hyphen) needs the same care: between two characters it forms a range (so +-^ means every character from + to ^, which includes the digits and upper-case letters), and it is only a literal when placed first or last in the class.
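
The placement rules for ^ and - inside brackets are easy to check with a few one-line tests. This is only a sketch, assuming PureBasic's PCRE-based regular expression library; `RegexMatches` is a hypothetical helper name, not a PureBasic function.

```purebasic
EnableExplicit

; Hypothetical helper: nonzero if text$ matches pattern$, 0 otherwise.
Procedure.i RegexMatches(pattern$, text$)
  Protected result = #False
  Protected rex = CreateRegularExpression(#PB_Any, pattern$)
  If rex
    result = MatchRegularExpression(rex, text$)
    FreeRegularExpression(rex)
  EndIf
  ProcedureReturn result
EndProcedure

Debug RegexMatches("^[+-^]+$", "ABC") ; '+-^' is a range from '+' to '^', so upper-case letters match
Debug RegexMatches("^[-+^]+$", "ABC") ; '-' first: only '-', '+' and '^' are allowed, no match
Debug RegexMatches("^[-+^]+$", "+-^") ; the three literal characters: match
Debug RegexMatches("^[^-+]+$", "ABC") ; '^' first negates: anything except '-' and '+', match
```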
Last edited by SFSxOI on Sun Aug 19, 2012 9:16 pm, edited 26 times in total.

Post by CONVERT »

Thank you very much to all of you, you are great!

I'll study all your code in detail; I'll learn so much.
PureBasic 6.20 beta 2 (x64) | Windows 10 Pro x64 | Intel(R) Core(TM) i7-8700 CPU @ 3.20Ghz 16 GB RAM, SSD 500 GB, PC locally assembled.
Come back to 6.11 LTS 64 bits because of an issue with #PB_ComboBox_UpperCase in ComboBoxGadget() (Oct. 10, 2024).