Regex and parenthesis problem

danny88
User
Posts: 38
Joined: Sun Jan 21, 2024 8:13 am

Regex and parenthesis problem

Post by danny88 »

Hi,
I have this text:

Code:

<input type="text" name="tiersForm.cin" value="C314769" id="tiersForm.cin" class="text_field" onclick="javascript:initActionTiers();" onchange="javascript:doPublish(\'ihmTiers\'); return false;">
The following regex should return C314769, but instead it returns <input type="text" name="tiersForm.cin" value="C314769"

Code:

REGEX : <input[^>]*name="tiersForm\.cin"[^>]*value="([A-Z]\d+)"
What is wrong? Thanks.

Here is my code:

Code:

regex.s = ~"<input[^>]*name=\"tiersForm\\.cin\"[^>]*value=\"([A-Z]\\d+)\""
ReadFile(0, "test.htm")
text.s = ReadString(0, #PB_File_IgnoreEOL)
CloseFile(0)

If CreateRegularExpression(0, regex)
  Dim Result$(0)  
  NbResults = ExtractRegularExpression(0, text, result$())
  Debug "Number of matches found: " + NbResults
  For i = 0 To NbResults - 1
    Debug Result$(i)
  Next
Else
  MessageRequester("Error", RegularExpressionError())
EndIf
spikey
Enthusiast
Posts: 750
Joined: Wed Sep 22, 2010 1:17 pm
Location: United Kingdom

Re: Regex and parenthesis problem

Post by spikey »

You need a non-capturing group and a capturing group; by default you get a capturing expression. Use the non-capturing group to match the preamble and postamble, and the capturing group to get the text you want. You can define a non-capturing group by prefixing its contents with ?: (question mark, colon).
Try:

Code:

(?:<input[^>]*name="tiersForm\.cin"[^>]*value=")([A-Z]\d+)(?:")
Last edited by spikey on Sun Nov 10, 2024 9:04 pm, edited 1 time in total.
danny88
User
User
Posts: 38
Joined: Sun Jan 21, 2024 8:13 am

Re: Regex and parenthesis problem

Post by danny88 »

Same result.
DarkDragon
Addict
Posts: 2344
Joined: Mon Jun 02, 2003 9:16 am
Location: Germany

Re: Regex and parenthesis problem

Post by DarkDragon »

RegularExpressionGroup(#RegularExpression, 1)

0 is always the whole match
1 is the first parenthesized group
...
https://www.purebasic.com/documentation ... group.html
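As a minimal sketch of the above, the loop below walks the matches and reads the first parenthesized group, rather than taking the whole match that ExtractRegularExpression() puts into its array. The sample text is inlined and shortened here (an assumption, for brevity) instead of being read from test.htm.

```purebasic
; Sketch only: the shortened sample text below is an assumption,
; the original post reads the full tag from test.htm.
regex.s = ~"<input[^>]*name=\"tiersForm\\.cin\"[^>]*value=\"([A-Z]\\d+)\""
text.s  = ~"<input type=\"text\" name=\"tiersForm.cin\" value=\"C314769\" id=\"tiersForm.cin\">"

If CreateRegularExpression(0, regex)
  If ExamineRegularExpression(0, text)
    While NextRegularExpressionMatch(0)
      Debug RegularExpressionMatchString(0)  ; the whole match, preamble included
      Debug RegularExpressionGroup(0, 1)     ; only the text inside the first ( )
    Wend
  EndIf
  FreeRegularExpression(0)
Else
  Debug RegularExpressionError()
EndIf
```

The pattern itself is unchanged from the first post; only the way the result is read differs.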
bye,
Daniel
spikey
Enthusiast
Posts: 750
Joined: Wed Sep 22, 2010 1:17 pm
Location: United Kingdom

Re: Regex and parenthesis problem

Post by spikey »

danny88 wrote: Sun Nov 10, 2024 9:00 pm same result
Weird, that should work (but I have to admit, I didn't test it - it was dinner time!) Try this:

Code:

; Three capturing groups: the preamble (1), the wanted value (2) and the closing quote (3).
regex.s = "(<input[^>]*name=" + #DQUOTE$ + "tiersForm\.cin" + #DQUOTE$ + "[^>]*value=" + #DQUOTE$ + ")([A-Z]\d+)(" + #DQUOTE$ + ")"
Debug regex

text.s = "<input type=" + #DQUOTE$ + "text" + #DQUOTE$ + " name=" + #DQUOTE$ + "tiersForm.cin" + #DQUOTE$ + " value=" + #DQUOTE$ + "C314769" + #DQUOTE$ +
         " id=" + #DQUOTE$ + "tiersForm.cin" + #DQUOTE$ + " class=" + #DQUOTE$ + "text_field" + #DQUOTE$ + " onclick=" + #DQUOTE$ + "javascript:initActionTiers();" + #DQUOTE$ +
         " onchange=" + #DQUOTE$ + "javascript:doPublish(\'ihmTiers\'); return false;" + #DQUOTE$ + ">"

If CreateRegularExpression(0, regex)
  If ExamineRegularExpression(0, text)
    Debug CountRegularExpressionGroups(0)  ; number of capturing groups in the pattern
    If NextRegularExpressionMatch(0)       ; advance to the first match
      Debug RegularExpressionGroup(0, 2)   ; the second group holds the value
    EndIf
  EndIf
Else
  Debug RegularExpressionError()
EndIf
danny88
User
User
Posts: 38
Joined: Sun Jan 21, 2024 8:13 am

Re: Regex and parenthesis problem

Post by danny88 »

DarkDragon wrote: Sun Nov 10, 2024 9:19 pm RegularExpressionGroup(#RegularExpression, 1)

Works, thanks.
danny88
User
User
Posts: 38
Joined: Sun Jan 21, 2024 8:13 am

Re: Regex and parenthesis problem

Post by danny88 »

spikey wrote: Sun Nov 10, 2024 10:38 pm
Works, thanks.
AZJIO
Addict
Posts: 2143
Joined: Sun May 14, 2017 1:48 am

Re: Regex and parenthesis problem

Post by AZJIO »

Code:

; A single capturing group around the value; the rest of the pattern is left ungrouped.
regex.s = "<input[^>]*name=" + #DQUOTE$ + "tiersForm\.cin" + #DQUOTE$ + "[^>]*value=" + #DQUOTE$ + "([A-Z]\d+)" + #DQUOTE$
Debug regex

text.s = "<input type=" + #DQUOTE$ + "text" + #DQUOTE$ + " name=" + #DQUOTE$ + "tiersForm.cin" + #DQUOTE$ + " value=" + #DQUOTE$ + "C314769" + #DQUOTE$ +
         " id=" + #DQUOTE$ + "tiersForm.cin" + #DQUOTE$ + " class=" + #DQUOTE$ + "text_field" + #DQUOTE$ + " onclick=" + #DQUOTE$ + "javascript:initActionTiers();" + #DQUOTE$ +
         " onchange=" + #DQUOTE$ + "javascript:doPublish(\'ihmTiers\'); return false;" + #DQUOTE$ + ">"

If CreateRegularExpression(0, regex)
  If ExamineRegularExpression(0, text)
    Debug CountRegularExpressionGroups(0)  ; number of capturing groups in the pattern
    If NextRegularExpressionMatch(0)       ; advance to the first match
      Debug RegularExpressionGroup(0, 1)   ; the first group holds the value
    EndIf
  EndIf
Else
  Debug RegularExpressionError()
EndIf
.

Code:

; \K drops everything matched so far from the reported match, and (?=\") checks for
; the closing quote without consuming it, so the whole match is just the value.
regex.s = ~"<input[^>]*name=\"tiersForm\\.cin\"[^>]*value=\"\\K([A-Z]\\d+)(?=\")"

text.s = "<input type=" + #DQUOTE$ + "text" + #DQUOTE$ + " name=" + #DQUOTE$ + "tiersForm.cin" + #DQUOTE$ + " value=" + #DQUOTE$ + "C314769" + #DQUOTE$ +
         " id=" + #DQUOTE$ + "tiersForm.cin" + #DQUOTE$ + " class=" + #DQUOTE$ + "text_field" + #DQUOTE$ + " onclick=" + #DQUOTE$ + "javascript:initActionTiers();" + #DQUOTE$ +
         " onchange=" + #DQUOTE$ + "javascript:doPublish(\'ihmTiers\'); return false;" + #DQUOTE$ + ">"

If CreateRegularExpression(0, regex)
  Dim Result$(0)
  NbResults = ExtractRegularExpression(0, text, Result$())  ; fills the array with whole matches
  Debug "Number of matches found: " + NbResults
  For i = 0 To NbResults - 1
    Debug Result$(i)
  Next
Else
  MessageRequester("Error", RegularExpressionError())
EndIf