Regular Expression

loulou2522
Enthusiast
Posts: 495
Joined: Tue Oct 14, 2014 12:09 pm

Regular Expression

Post by loulou2522 »

Hi all,
I want to negate this expression, which tests a string that must have exactly 8 or 11 characters:

( ^[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]$ | ^[A-Z]{6,6}([A-Z2-9][A-NP-Z0-9][A-Z0-9]{3,3} ){1,1}$ )
Can someone help me solve this problem?
Thanks in advance.
Marc56us
Addict
Posts: 1477
Joined: Sat Feb 08, 2014 3:26 pm

Re: Regular Expression

Post by Marc56us »

loulou2522 wrote:I want to negate this expression, which tests a string that must have exactly 8 or 11 characters

( ^[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]$ | ^[A-Z]{6,6}([A-Z2-9][A-NP-Z0-9][A-Z0-9]{3,3} ){1,1}$ )
Not sure this works, but try this:

Code: Select all

^(?!.*^[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]$ | ^[A-Z]{6,6}([A-Z2-9][A-NP-Z0-9][A-Z0-9]{3,3} ){1,1}$).*$
(Need some data samples to really test)

:wink:
infratec
Always Here
Posts: 6817
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: Regular Expression

Post by infratec »

Hmmm....

my favourite test site

https://regex101.com/

tells me that your expression in general does not work.

Should AAAAAAAA match or not?

Bernd
Marc56us
Addict
Posts: 1477
Joined: Sat Feb 08, 2014 3:26 pm

Re: Regular Expression

Post by Marc56us »

For the first exclusion, this works:

Code: Select all

^(?!.*^[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]$).*$
Note: {6,6} is unusual but works. Shorter: {6}

Code: Select all

; Match (= exclude)
AAAAAA2B

; No match 
BBBBBBB2C
 AAAAAA2B
BBBBBB1A
To negate your own regex:

Code: Select all

^(?!.* your_expression_here ).*$
:wink:
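
As a rough illustration of the template above, here is a minimal PureBasic sketch. It assumes the two alternatives of the original expression can be merged into one pattern (without the stray spaces, and with the 3-character suffix made optional). The constant name `#NotBic` and the sample strings are only illustrative.

```purebasic
; Sketch: negating a BIC-style pattern with a negative lookahead.
; The merged pattern and the sample strings are assumptions for illustration.
#NotBic = 0

; Inner pattern: 6 letters, 2 location characters, optional 3-character suffix
; (8 or 11 characters in total). The lookahead rejects exactly those strings.
If CreateRegularExpression(#NotBic, "^(?![A-Z]{6}[A-Z2-9][A-NP-Z0-9](?:[A-Z0-9]{3})?$).*$")
  Debug MatchRegularExpression(#NotBic, "AAAAAA2B")     ; well-formed, so the negated regex does not match
  Debug MatchRegularExpression(#NotBic, "AAAAAA2BXXX")  ; well-formed (11 characters), no match either
  Debug MatchRegularExpression(#NotBic, " AAAAAA2B")    ; leading space, so the negated regex matches
  Debug MatchRegularExpression(#NotBic, "BBBBBB1A")     ; "1" is not allowed at position 7, so it matches
  FreeRegularExpression(#NotBic)
Else
  Debug RegularExpressionError()
EndIf
```

MatchRegularExpression() returns a non-zero value when the string matches, so a non-zero result here means "not a well-formed code".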
#NULL
Addict
Posts: 1440
Joined: Thu Aug 30, 2007 11:54 pm
Location: right here

Re: Regular Expression

Post by #NULL »

Marc56us wrote:Need some data samples to really test
For testing you could probably use any BIC code.
loulou2522
Enthusiast
Posts: 495
Joined: Tue Oct 14, 2014 12:09 pm

Re: Regular Expression

Post by loulou2522 »

In fact, yes, I want to test a BIC code.
The BIC must be 8 or 11 characters long, with no Chr(32) (space) anywhere in the string, including at the beginning or at the end.
Zebuddi123
Enthusiast
Posts: 794
Joined: Wed Feb 01, 2012 3:30 pm
Location: Nottinghamshire UK

Re: Regular Expression

Post by Zebuddi123 »

Hi loulou2522, this matches 8- or 11-character codes.

Zebuddi :)

\b([a-zA-Z]{4}[a-zA-Z]{2}[a-zA-Z1-9]{2}([a-zA-Z0-9]{3})?)\b|\b([a-zA-Z]{4}[a-zA-Z]{2}[a-zA-Z1-9]{2})\b

https://bank-code.net

tests:
6 ARAB BANK AUSTRALIA LIMITED SYDNEY ARABAU2S
7 ARRIUM FINANCE PTY LIMITED SYDNEY ARUMAU2S
8 ASSETSECURE PTY LTD SYDNEY ASEYAU2S
9 ASX OPERATIONS PTY LIMITED SYDNEY (RTGS SETTLEMENT) XASXAU2SRTG
10 ASX OPERATIONS PTY LIMITED SYDNEY XASXAU2S
11 AUSTRACLEAR LIMITED SYDNEY (BKO 201) ACLRAU2S201
12 AUSTRACLEAR LIMITED SYDNEY (BKO 202) ACLRAU2S202
13 AUSTRACLEAR LIMITED SYDNEY (BKO 203) ACLRAU2S203
14 AUSTRACLEAR LIMITED SYDNEY (BKO 204) ACLRAU2S204
15 AUSTRACLEAR LIMITED SYDNEY (BKO 205) ACLRAU2S205
16 AUSTRACLEAR LIMITED SYDNEY (BKO 207) ACLRAU2S207
17 AUSTRACLEAR LIMITED SYDNEY (BKO 208) ACLRAU2S208
18 AUSTRACLEAR LIMITED SYDNEY (BKO 209) ACLRAU2S209

Bank code - 4 alphabetic characters
Country code - 2 letters
Location code - 2 alphanumeric characters, except zero
Branch code - 3 alphanumeric characters

ACLRAU2S209

The SWIFT code / BIC code is made up of 8 or 11 characters, broken down as follows:

4 letters: institution code or bank code.
2 letters: ISO 3166-1 alpha-2 country code.
2 letters or digits: location code.
    If the second character is "0", then it is typically a test BIC as opposed to a BIC used on the live network.
    If the second character is "1", then it denotes a passive participant in the SWIFT network.
    If the second character is "2", then it typically indicates a reverse billing BIC, where the recipient pays for the message as opposed to the more usual mode whereby the sender pays for the message.
3 letters or digits: branch code, optional ('XXX' for primary office).
Where an 8-digit code is given, it may be assumed that it refers to the primary office.
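
The breakdown above can be sketched in PureBasic as follows. This is only an illustration that assumes the code has already been validated and contains no separators. The procedure name `DescribeBic` and the output labels are hypothetical.

```purebasic
; Sketch: splitting an already validated BIC into its documented fields.
; The procedure name and the Debug output labels are hypothetical.
Procedure DescribeBic(Bic$)
  Debug "Institution: " + Left(Bic$, 4)      ; 4 letters
  Debug "Country:     " + Mid(Bic$, 5, 2)    ; ISO 3166-1 alpha-2 country code
  Debug "Location:    " + Mid(Bic$, 7, 2)    ; 2 letters or digits
  If Len(Bic$) = 11
    Debug "Branch:      " + Mid(Bic$, 9, 3)  ; optional branch code
  Else
    Debug "Branch:      (none, primary office assumed)"
  EndIf
  ; The second character of the location code carries the special meanings listed above.
  Select Mid(Bic$, 8, 1)
    Case "0" : Debug "Note: typically a test BIC"
    Case "1" : Debug "Note: passive participant"
    Case "2" : Debug "Note: typically a reverse billing BIC"
  EndSelect
EndProcedure

DescribeBic("ACLRAU2S209")
```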

SWIFT Standards, a division of The Society for Worldwide Interbank Financial Telecommunication (SWIFT), handles the registration of these codes. Because SWIFT originally introduced what was later standardized as Business Identifier Codes (BICs), they are still often called SWIFT addresses or codes.
malleo, caput, bang. Ego, comprehendunt in tempore
loulou2522
Enthusiast
Posts: 495
Joined: Tue Oct 14, 2014 12:09 pm

Re: Regular Expression

Post by loulou2522 »

Thanks Zebuddi, but I have a problem testing the following code (it is an invalid BIC, and that is what I want to detect):
BIC = "CMC FRPPXXX"
<BIC>CMC FRPPXXX</BIC>
This is not a valid BIC because of the space. In this case I want to extract the BIC
CMCI FRPPXXX and replace it in the XML file with "NOTPROVIDED" if its length is < 7, or between 9 and 10, or if it contains one or more spaces,
so that the result is <BIC>NOTPROVIDED</BIC>
Zebuddi123
Enthusiast
Posts: 794
Joined: Wed Feb 01, 2012 3:30 pm
Location: Nottinghamshire UK

Re: Regular Expression

Post by Zebuddi123 »

Hi loulou2522, if you mean matching CMC FRPPXXX or CMC FRPPXXX:

Zebuddi. :)

\b(([a-zA-Z]{3}\s{1}[a-zA-Z]{1})[a-zA-Z]{2}[a-zA-Z0-9]{2})\b|\b(([a-zA-Z]{3}\s{1}[a-zA-Z]{1})[a-zA-Z]{2}[a-zA-Z1-9]{2}[a-zA-Z0-9]{3}?)\b
loulou2522
Enthusiast
Posts: 495
Joined: Tue Oct 14, 2014 12:09 pm

Re: Regular Expression

Post by loulou2522 »

In fact, the space can be anywhere in the expression, not at a fixed position.
I took CMCI FRPP as an example, but I can also have, for example, BN AFRPPXXX.
I want to replace every BIC that does not conform with NOTPROVIDED.
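
One possible way to do this replacement is sketched below in PureBasic. It assumes the BICs sit in `<BIC>...</BIC>` elements, as in the earlier example, and it reuses the expression from the first post (merged into one pattern, without the stray spaces) as the strict check. The procedure name `FixBics`, the constants and the sample XML are hypothetical.

```purebasic
; Sketch: replace every non-conforming <BIC> value with NOTPROVIDED.
; Procedure name, regex IDs and the sample XML are assumptions.
#RxTag = 0 ; finds each <BIC>...</BIC> element
#RxBic = 1 ; strict check: 8 or 11 characters, no spaces

Procedure.s FixBics(Xml$)
  Protected Result$ = Xml$
  If CreateRegularExpression(#RxTag, "<BIC>(.*?)</BIC>")
    If CreateRegularExpression(#RxBic, "^[A-Z]{6}[A-Z2-9][A-NP-Z0-9]([A-Z0-9]{3})?$")
      If ExamineRegularExpression(#RxTag, Xml$)
        While NextRegularExpressionMatch(#RxTag)
          ; Any space, wrong length or wrong character makes the strict check fail.
          If MatchRegularExpression(#RxBic, RegularExpressionGroup(#RxTag, 1)) = 0
            Result$ = ReplaceString(Result$, RegularExpressionMatchString(#RxTag), "<BIC>NOTPROVIDED</BIC>")
          EndIf
        Wend
      EndIf
      FreeRegularExpression(#RxBic)
    EndIf
    FreeRegularExpression(#RxTag)
  EndIf
  ProcedureReturn Result$
EndProcedure

Debug FixBics("<BIC>CMC FRPPXXX</BIC><BIC>ACLRAU2S201</BIC><BIC>BN AFRPPXXX</BIC>")
; <BIC>NOTPROVIDED</BIC><BIC>ACLRAU2S201</BIC><BIC>NOTPROVIDED</BIC>
```

The matches are examined on the untouched input while the replacements go into a copy, so the position of the space does not matter: anything that fails the strict check is replaced.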
Zebuddi123
Enthusiast
Enthusiast
Posts: 794
Joined: Wed Feb 01, 2012 3:30 pm
Location: Nottinghamshire UK

Re: Regular Expression

Post by Zebuddi123 »

Hi loulou2522, the best I can come up with is a procedure that creates regexes on the fly, checking for a valid combination with one space at any position (2nd to 7th place for 8-character codes, 2nd to 10th place for 11-character codes). It removes the white space and reparses for a valid BIC code, which can be returned or changed to whatever you want.

I hope I understood what you wanted to do, or something along a similar vein. Anyway, I enjoyed playing with it :).

Zebuddi. :)

Code: Select all

Procedure CheckForBic(sString.s) ; sString = file contents, large string, web page source, etc.
	Protected iIndex.i, iNbr.i, iRegex.i, iRegexMain.i, iLen.i
	; Creates a regex on the fly for each candidate length (9 to 13 characters, spaces included),
	; extracts the candidates and removes any spaces from them,
	; then rechecks each candidate for a valid BIC code and prints it out.
	
	iRegexMain = CreateRegularExpression(#PB_Any, "\b([a-zA-Z]{4}[a-zA-Z]{2}[a-zA-Z1-9]{2}([a-zA-Z0-9]{3})?)\b|\b([a-zA-Z]{4}[a-zA-Z]{2}[a-zA-Z1-9]{2})\b")
	For iLen = 9 To 13
		iRegex = CreateRegularExpression(#PB_Any, "\b([\d\w\ ]{" + Str(iLen) + "})\b", #PB_RegularExpression_DotAll|#PB_RegularExpression_AnyNewLine)
		If MatchRegularExpression(iRegex, sString)
			Dim t$(0)
			iNbr = ExtractRegularExpression(iRegex, sString, t$())
			For iIndex = 0 To (iNbr - 1)
				t$(iIndex) = RemoveString(t$(iIndex), Chr(32)) ; strip the spaces
				If MatchRegularExpression(iRegexMain, t$(iIndex))
					Dim tt$(0)
					ExtractRegularExpression(iRegexMain, t$(iIndex), tt$()) ; genuine BIC code
					Debug tt$(0) ; -----------> your code here
				EndIf
			Next
			FreeArray(t$())
			FreeArray(tt$())
		EndIf
		FreeRegularExpression(iRegex) ; free the regex even when nothing matched
	Next
	FreeRegularExpression(iRegexMain)
EndProcedure


CheckForBic("<BIC>ACLR AU2S201</BIC>") ; pass a string or the contents of a file full of BIC codes
Marc56us
Addict
Posts: 1477
Joined: Sat Feb 08, 2014 3:26 pm

Re: Regular Expression

Post by Marc56us »

:idea: Perhaps a useful trick: recognize whether the BIC is valid or not, even if it is badly formatted.

This matches a good BIC even with the wrong format (field separator: space, . or - or ,).
The BIC can be alone on a line or delimited by a boundary (" > < ").

Code: Select all

(?:\b|["> ])([A-Z]{4}[ .,-]?[A-Z]{2}[ .,-]?[A-Z0-9]{2}[ .,-]?([A-Z0-9]{3})?)(?:["< \r\n]|\b)

--- Good BIC found (even if bad formatted)
BIC = "ACLR AU2S201"
<BIC>ACLR AU2S201</BIC>
ACLRAU2S201
ACLR AU2S201
ACLR-AU2S201
ACLR-AU-2S-201
ACLR AU 2S 201
ACLR,AU,2S,201
ACLR.AU.2S.201

--- Bad BIC
BIC = "CMC FRPPXXX"
<BIC>CMC FRPPXXX</BIC>
:arrow: $1 BIC 8
:arrow: $2 branch code (optional)
:arrow: $1+$2 BIC 11

:!: To be sure, check the country code against the ISO 3166-1 alpha-2 list.

Use ReplaceString() to remove spaces at any position.

:wink:
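
A minimal PureBasic sketch of these two suggestions follows. The three-entry country map is only a placeholder for the full ISO 3166-1 alpha-2 list, and the names `NormaliseBic` and `Country()` are hypothetical.

```purebasic
; Sketch: normalise a loosely formatted BIC, then check its country code.
; The three-entry map is only a placeholder for the full ISO 3166-1 alpha-2 list.
NewMap Country.i()
Country("AU") = #True
Country("FR") = #True
Country("DE") = #True

Procedure.s NormaliseBic(Raw$)
  ; Strip the separators accepted by the tolerant regex: space . , -
  Raw$ = ReplaceString(Raw$, " ", "")
  Raw$ = ReplaceString(Raw$, ".", "")
  Raw$ = ReplaceString(Raw$, ",", "")
  Raw$ = ReplaceString(Raw$, "-", "")
  ProcedureReturn Raw$
EndProcedure

Bic$ = NormaliseBic("ACLR-AU-2S-201")
Debug Bic$ ; ACLRAU2S201

; Characters 5 and 6 hold the country code.
If FindMapElement(Country(), Mid(Bic$, 5, 2))
  Debug "Country code is in the list"
Else
  Debug "Unknown country code"
EndIf
```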
Zebuddi123
Enthusiast
Posts: 794
Joined: Wed Feb 01, 2012 3:30 pm
Location: Nottinghamshire UK

Re: Regular Expression

Post by Zebuddi123 »

@Marc56us Glad you posted that regex :) I wasn't aware of the (?:...) group construct (a non-capturing group that matches everything enclosed). I have been trying to figure out how to do this with regex for quite a while, and failing, usually with a headache lol.

Zebuddi. :)

PS I'll be playing with this all day lol.
Marc56us
Addict
Posts: 1477
Joined: Sat Feb 08, 2014 3:26 pm

Re: Regular Expression

Post by Marc56us »

A sample code with some tricks.

Code: Select all

; BIC Extractor 
; Marc56 - 2017/04/04
; http://www.purebasic.fr/english/viewtopic.php?f=13&t=68185

; Data sample
; Need to escape \" in datasection (no need if read from file)
DataSection
     Data.s ~"BIC = \"ACLR AU2S201\""
     Data.s ~"<BIC>ACLR AU2S201</BIC>"
     Data.s ~"ACLRAU2S201"
     Data.s ~"ACLR AU2S203"
     Data.s ~"ACLR-AU2S201"
     Data.s ~"ACUE-AU-2S-201"
     Data.s ~"ETLR AU 2S 201"
     Data.s ~"ZALR,AU,2S,201"
     Data.s ~"FTLR,AU,2S"
     Data.s ~"ACLR.AU.2S.201"
     Data.s ~"BIC = \"CMC FRPPXXX\""
     Data.s ~"<BIC>CMC FRPPXXX</BIC> "
     Data.s Chr(3)
EndDataSection

; Real regex: (?:\b|["> ])([A-Z]{4}[ .,-]?[A-Z]{2}[ .,-]?[A-Z0-9]{2}[ .,-]?([A-Z0-9]{3})?)(?:["< \r\n]|\b)
; Can't escape string here so use horrible Chr(34) syntax :-(
; 4 capturing groups (the first and last groups are boundaries, so they are non-capturing: (?:...))
If CreateRegularExpression(RegEx, "(?:\b|[> " + 
                                  Chr(34) + 
                                  "])([A-Z]{4})[ .,-]?([A-Z]{2})[ .,-]?([A-Z0-9]{2})[ .,-]?([A-Z0-9]{3})?(?:[< " + 
                                  Chr(34) + "\r\n]|\b)")
     
     Debug "RegEx OK, let's go :-)" + #CRLF$
Else
     Debug ~"RegEx KO !\nDrink another coffee and try again...\n:-/"
     End
EndIf

Repeat
     Read.s Data_Sample.s
     
     ; Is there a way to stop reading data when there is nothing left to read, without an IMA (invalid memory access)?
     ; I don't know, so I put a flag at the end of the datasection and check for it.
     If Data_Sample = Chr(3)
          Debug #CRLF$ + "End Of (sample) Text"
          Break
     EndIf
     
     BIC.s = ""
     ; Concatenate the 4 fields as one BIC code
     If MatchRegularExpression(RegEx, Data_Sample)
          ExamineRegularExpression(RegEx, Data_Sample)
          While NextRegularExpressionMatch(RegEx)
               BIC + RegularExpressionGroup(RegEx, 1) +
                     RegularExpressionGroup(RegEx, 2) +
                     RegularExpressionGroup(RegEx, 3) +
                     RegularExpressionGroup(RegEx, 4)
          Wend 
     Else
          BIC = "No BIC"
     EndIf
     
     Debug "Found: " + LSet(BIC, 11, " ") + " in: " + Data_Sample
ForEver
Hope this helps.
:wink:
mestnyi
Addict
Posts: 995
Joined: Mon Nov 25, 2013 6:41 am

Re: Regular Expression

Post by mestnyi »

Can you help me create a regular expression for the following example?
https://regex101.com/r/PZntfz/1/
result
a(b(),c())
a1(b1,c1+d1)
a2("b2",c2(),(d2+"e2")-33)