Screenshot, code and executable can be found here:
http://www.xs4all.nl/~bluez/purebasic/p ... tm#4_reval
Direct download:
http://www.xs4all.nl/~bluez/purebasic/reval.zip
Perhaps it might be a good idea to add a little 'quick reference' sheet to the purebasic docs? Something like this: (I've hidden it in the 'help' function of REval.)
Quick Reference.
----------------
This text does not claim to be the complete (or correct) list of regular
expression components. It only provides a quick and dirty overview of
some of the (sometimes esoteric :-)) options.
Characters:
a - single character 'a'
\t - tab, chr(9)
\r - return, cr, chr(13)
\n - line feed, lf, chr(10)
\e - escape, esc, chr(27)
\x09 - specific character in hexadecimal, \x09 is for example tab
\u20AC - unicode character, \u20AC is the euro currency sign
Special characters:
\d - digit ie. single character 0 to 9
\w - word character ie. single character a-z, A-Z, 0-9, underscore
\s - single 'whitespace' character ie. space, tab, line breaks
. - any single character except line breaks
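A quick way to try the shorthand classes above. This is only a sketch using Python's `re` module as a stand-in (REval itself is PureBasic, whose regex engine may differ in details); the sample text is made up.

```python
import re

text = "Room 42,\tfloor_3"   # made-up sample containing a space and a tab

digits = re.findall(r"\d", text)    # every single digit
words = re.findall(r"\w+", text)    # runs of word characters
spaces = re.findall(r"\s", text)    # every whitespace character

# By default the dot skips line breaks; DOTALL is Python's name for the flag
# that makes it match them too.
dot_default = re.findall(r".", "a\nb")
dot_all = re.findall(r".", "a\nb", flags=re.DOTALL)

print(digits, words, spaces)
print(dot_default, dot_all)
```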
Classes:
a class is a group of characters enclosed by square brackets, each character
inside the brackets is a valid match, ranges can be specified by using '-'
[a] - single character 'a'
[ab] - single character, either 'a' or 'b'
[^a] - inside brackets! any single character except 'a'
[0-9] - single character 0 to 9
[0-9A-F] - single hexadecimal character
c[ao]t - matches 'cat' and 'cot' but not 'cit' or 'ct'
inside classes you do not have to escape any metacharacters except ^ \ ] -
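The class examples above can be checked with a few lines of code. Again a sketch in Python's `re`, assumed here to behave like REval for these simple patterns; the test words are my own.

```python
import re

# c[ao]t accepts 'a' or 'o' in the middle position and nothing else
cat_or_cot = re.compile(r"c[ao]t")
class_results = {word: bool(cat_or_cot.fullmatch(word))
                 for word in ["cat", "cot", "cit", "ct"]}

# [0-9A-F] is one uppercase hexadecimal digit
hex_digits = re.findall(r"[0-9A-F]", "7G-fE")

# [^a] is any single character except 'a'
not_a = re.findall(r"[^a]", "banana")

print(class_results)
print(hex_digits, not_a)
```

Note that `[0-9A-F]` does not match the lowercase 'f'; add `a-f` to the class (or use a case-insensitive flag) if you need both.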
Metacharacters:
[ \ ^ $ . | ? * + ( )
several characters have a special meaning, you need to 'escape' them (precede
them with a backslash) if you want to use them literally
cats|dogs - matches 'cats' or 'dogs'
cats\|dogs - matches 'cats|dogs'
colou?r - matches 'color' and 'colour'
colou\?r - matches 'colou?r'
inside square brackets (classes) the dash '-' needs escaping as well
[1\-2] - matches '1' '2' and '-'
[1-2] - matches '1' '2' but not '-'
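To see the difference between an escaped and an unescaped metacharacter, something like the following can help. It is a sketch in Python's `re` (a stand-in, not REval's own code); `re.escape` is a Python helper and I am not claiming PureBasic has an equivalent.

```python
import re

sample = "cats|dogs"
alternation = re.findall(r"cats|dogs", sample)     # pipe acts as 'or'
literal_pipe = re.findall(r"cats\|dogs", sample)   # pipe is matched literally

optional_u = re.findall(r"colou?r", "color colour colr")   # 'u' is optional
literal_qm = re.findall(r"colou\?r", "colour colou?r")     # '?' is a plain character

with_dash = re.findall(r"[1\-2]", "1-2")    # escaped dash is a member of the class
range_only = re.findall(r"[1-2]", "1-2")    # unescaped dash forms a range

# Python-specific convenience: escape every metacharacter in a literal string
escaped = re.escape(sample)

print(alternation, literal_pipe)
print(optional_u, literal_qm)
print(with_dash, range_only)
```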
Repetition / wildcards:
. - any single character except line breaks (see the 'dotall' flag)
? - preceding condition zero or one time
* - zero up to many times
+ - one up to many times
{n} - n times
{n1,n2} - minimal n1 times, maximal n2 times
a? - nothing or 'a'
test{3} - matches 'testtt'
[0-9]{3} - matches '000' '001' etc. all the way up to '999'
(test){3} - matches 'testtesttest'
get(value)? - matches 'get' as well as 'getvalue'
remember that the wildcards * and + are 'greedy'
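The point that a quantifier only applies to the item directly before it, and what 'greedy' means in practice, is perhaps easiest to see by running it. A sketch in Python's `re` as a stand-in; the lazy `+?` variant is shown only for contrast and the HTML-ish sample is made up.

```python
import re

# {3} repeats only the last 't', unless the whole word is grouped
last_char_repeated = bool(re.fullmatch(r"test{3}", "testtt"))
group_repeated = bool(re.fullmatch(r"(test){3}", "testtesttest"))

three_digits = re.findall(r"[0-9]{3}", "007 42 999")

optional_group = [bool(re.fullmatch(r"get(value)?", s))
                  for s in ["get", "getvalue", "getval"]]

# Greedy: .+ grabs as much as it can while still allowing a match
greedy = re.findall(r"<.+>", "<b>bold</b>")
# Lazy: .+? stops at the first possible '>'
lazy = re.findall(r"<.+?>", "<b>bold</b>")

print(last_char_repeated, group_repeated, three_digits, optional_group)
print(greedy, lazy)
```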
Grouping:
() - groups a part of the expression
(test){3} - matches 'testtesttest'
Position:
^ - outside brackets! start of the string
$ - end of string
bob - bob may exist anywhere in the string
^bob - string must start with 'bob'
bob$ - string must end with 'bob'
^bob$ - string must be 'bob'
\b - word boundary
\bis - would match 'this is' but would not match 'thisis'
\B - opposite of \b, ie. the match must be 'inside' a word
\Bis - would match 'thatis' but would not match 'that is'
\A \Z \z - start and end of strings, see the web for more details
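The anchors and word boundaries above, tried out on a few strings. A sketch in Python's `re` (a stand-in for REval); `matches` is a hypothetical helper of mine that reports whether the pattern is found anywhere in the text.

```python
import re

def matches(pattern, text):
    """Return True if pattern is found somewhere in text."""
    return re.search(pattern, text) is not None

anchor_checks = [
    matches(r"bob", "a bobsled"),    # anywhere in the string
    matches(r"^bob", "bobsled"),     # must start with 'bob'
    matches(r"^bob", "a bob"),       # does not start with 'bob'
    matches(r"bob$", "a bob"),       # must end with 'bob'
    matches(r"^bob$", "bob"),        # must be exactly 'bob'
    matches(r"^bob$", "bobbob"),     # too long
]

boundary_checks = [
    matches(r"\bis", "this is"),     # second 'is' starts a word
    matches(r"\bis", "thisis"),      # no 'is' at a word start
    matches(r"\Bis", "thatis"),      # 'is' sits inside a word
    matches(r"\Bis", "that is"),     # 'is' starts a word, so \B fails
]

print(anchor_checks)
print(boundary_checks)
```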
Alternatives:
| - separate multiple options by pipe characters
dog|cat - matches any string that contains 'dog' or 'cat'
(dog)|(cat) - for readability some may prefer to group alternatives
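One thing worth knowing about alternatives is that the pipe splits the whole expression unless a group limits it, so grouping is not only a matter of readability. A small sketch in Python's `re` (a stand-in); the gray/grey patterns are my own example, not from REval.

```python
import re

# The pipe finds either option anywhere in the text
pets = re.findall(r"dog|cat", "hotdog, catalog, bird")

# Grouped: 'gr', then 'a' or 'e', then 'y'
grouped = [bool(re.fullmatch(r"gr(a|e)y", s)) for s in ["gray", "grey"]]

# Ungrouped: the alternatives are 'gra' and 'ey', which is something else entirely
ungrouped = [bool(re.fullmatch(r"gra|ey", s)) for s in ["gray", "grey", "gra", "ey"]]

print(pets, grouped, ungrouped)
```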
Exotics not covered:
backreferences, greedy vs. lazy vs. possessive, lookaround, modifiers, and more
(these can become a sure cause for a definite headache) for example:
([0-9])\1{2} - will match '111' '222' etc. but not '110'
(?i)appel - will match 'appel' as well as 'APPEL'
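For the curious, here is how the backreference and the inline case-insensitive modifier behave when tested. In `([0-9])\1{2}` the group captures one digit and `\1{2}` asks for two more copies of it, so three identical digits in total. A sketch in Python's `re` as a stand-in for REval; the test strings are made up.

```python
import re

# One captured digit followed by exactly two repeats of that same digit
triple = re.compile(r"([0-9])\1{2}")
triple_results = {s: bool(triple.fullmatch(s))
                  for s in ["111", "222", "11", "110"]}

# (?i) at the start switches the whole pattern to case-insensitive matching
appel = re.compile(r"(?i)appel")
appel_results = [bool(appel.fullmatch(s)) for s in ["appel", "APPEL", "Appel", "apple"]]

print(triple_results)
print(appel_results)
```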