Regex and $1 parameter
Posted: Sun Nov 28, 2021 8:47 am
by BarryG
Hi, back again with another regular expression question. I was interested in converting camel case to title case, and found the following example at StackOverflow. It has 442 upvotes, so it must be correct, hehe. But I can't make it work with PureBasic (I've only tried putting spaces before each capital in my code below). Please help. Thanks.
https://stackoverflow.com/a/4149393/7908170
Code:
text$="thisStringIsGood"
r=CreateRegularExpression(#PB_Any,"/([A-Z])/g")
If r
  If MatchRegularExpression(r,text$)
    text$=ReplaceRegularExpression(r,text$," $1")
  EndIf
  FreeRegularExpression(r)
  Debug text$ ; Want "This String Is Good"
EndIf
Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 9:34 am
by Marc56us
ReplaceRegularExpression()
...
Remarks
Back references (usually described as \1, \2, etc.) are not supported. ExtractRegularExpression() combined with ReplaceString() should achieve the requested behaviour.
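
For readers who want to see what that remark could look like in practice, here is a minimal sketch, not taken from the documentation. It assumes every match is a single capital letter, so replacing each distinct match once covers all of its occurrences; the `seen()` map and the variable names are illustrative only. Patterns whose matches overlap as substrings would need a position-based approach instead.

```purebasic
; Sketch only: emulate " $1" with ExtractRegularExpression() + ReplaceString().
; Assumption: each match is one capital letter, so a single ReplaceString()
; per distinct match string handles every occurrence of it.
EnableExplicit

Define text$ = "thisStringIsGood"
Define i, count
Dim match$(0)          ; filled by ExtractRegularExpression()
NewMap seen.i()        ; remembers which match strings were already replaced

If CreateRegularExpression(0, "[A-Z]")
  count = ExtractRegularExpression(0, text$, match$())
  For i = 0 To count - 1
    If Not FindMapElement(seen(), match$(i))
      ; Without this check a repeated capital would get two spaces
      text$ = ReplaceString(text$, match$(i), " " + match$(i))
      seen(match$(i)) = #True
    EndIf
  Next
  FreeRegularExpression(0)
  Debug text$
Else
  Debug RegularExpressionError()
EndIf
```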

Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 9:41 am
by #NULL
I didn't see the doc talking about ExtractRegularExpression() as Marc56us posted, but this seems to work:
Code:
s.s = "thisStringIsGood"
If CreateRegularExpression(0, "([A-Z])")
  If ExamineRegularExpression(0, s)
    While NextRegularExpressionMatch(0)
      s = ReplaceString(s,
                        RegularExpressionGroup(0, 1),
                        " " + RegularExpressionGroup(0, 1),
                        #PB_String_CaseSensitive,
                        RegularExpressionGroupPosition(0, 1),
                        1)
    Wend
  EndIf
Else
  Debug RegularExpressionError()
EndIf
Debug s
If CreateRegularExpression(0, "(^.)")
  If ExamineRegularExpression(0, s)
    While NextRegularExpressionMatch(0)
      s = ReplaceString(s,
                        RegularExpressionGroup(0, 1),
                        UCase(RegularExpressionGroup(0, 1)),
                        #PB_String_CaseSensitive,
                        RegularExpressionGroupPosition(0, 1),
                        1)
    Wend
  EndIf
Else
  Debug RegularExpressionError()
EndIf
Debug s
Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 9:46 am
by BarryG
Oh crap, so "$1" is what PureBasic doesn't support? Darn. My app was to offer regex for its users to specify the regex text that they need, but obviously they can't now. So I'll have to not offer that feature, which is a real shame. I can't use entire replacements like #NULL's example for the reasons I just explained. This is very disappointing. Unless there's some other unofficial way to support regex fully and ignore PureBasic's version?
Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 9:46 am
by Marc56us
A quick and dirty solution without regex
Code:
text$="thisStringIsGood"
For i = 1 To Len(text$)
  Char$ = Mid(text$, i, 1)
  If Char$ = UCase(Char$)
    Char$ = " " + Char$
  EndIf
  Full$ + Char$
Next
Debug Full$
(need to add a line for first char)
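
A possible way to add that missing line, offered as a variation rather than as the original code: the sketch below upper-cases the first character and also swaps the test to `char$ <> LCase(char$)`, since `Char$ = UCase(Char$)` is also true for digits and punctuation. The variable names are illustrative.

```purebasic
; Sketch: loop-based camel case to title case, with first-character handling.
EnableExplicit

Define text$ = "thisStringIsGood"
Define full$, char$, i

For i = 1 To Len(text$)
  char$ = Mid(text$, i, 1)
  If i = 1
    char$ = UCase(char$)            ; capitalize the first character, no space before it
  ElseIf char$ <> LCase(char$)      ; true only for upper-case letters
    char$ = " " + char$
  EndIf
  full$ + char$
Next

Debug full$
```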

Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 9:48 am
by BarryG
No good Marc56us - see my post above yours for why. PureBasic doesn't support drop-in regex statements obtained from the web, so it can't be used.
Re: [Ignore] Regex and $1 parameter
Posted: Sun Nov 28, 2021 9:56 am
by Marc56us
Unless there's some other unofficial way to support regex fully and ignore PureBasic's version?
Use RunProgram() and call an external tool like SED or AWK (yes, these Unix tools exist for Windows too)

Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 1:12 pm
by BarryG
Hi #NULL, your code works great for your example text ("thisStringIsGood") but if I change it to something just slightly different ("thisIsCamelCase") then it fails (has extra spaces, and "CamelCase" doesn't separate like "IsGood" does). I can't work out why that would be. Any ideas?
Here's what I'm testing with:
Code:
new$=" "
text$="thisStringIsGood" ; Works.
text$="thisIsCamelCase" ; Fails.
r=CreateRegularExpression(#PB_Any,"([A-Z])")
If r
  If ExamineRegularExpression(r,text$)
    While NextRegularExpressionMatch(r)
      text$=ReplaceString(text$,RegularExpressionGroup(r,1),new$+RegularExpressionGroup(r,1),#PB_String_CaseSensitive,RegularExpressionGroupPosition(r,1),1)
    Wend
  EndIf
  FreeRegularExpression(r)
EndIf
Debug text$
Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 1:26 pm
by #NULL
The StartPosition parameter for ReplaceString needs to be changed from RegularExpressionGroupPosition(r,1) to RegularExpressionMatchPosition(r). Group position isn't correct there so the first C gets replaced twice.
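
One caveat, stated as an assumption rather than something tested in this thread: match positions refer to the string as it was when ExamineRegularExpression() was called, so editing that string inside the loop shifts everything after each inserted space, and inputs with adjacent identical capitals may still go wrong. A sketch that sidesteps this by building a new string from the match positions (variable names are illustrative):

```purebasic
; Sketch: put a space before each capital by assembling a new string,
; so earlier insertions cannot shift later match positions.
EnableExplicit

Define text$ = "thisIsCamelCase"
Define result$, pos, offset = 1
Define r = CreateRegularExpression(#PB_Any, "[A-Z]")

If r
  If ExamineRegularExpression(r, text$)
    While NextRegularExpressionMatch(r)
      pos = RegularExpressionMatchPosition(r)          ; 1-based, in the original text$
      If pos > offset
        result$ + Mid(text$, offset, pos - offset)     ; unchanged text before the match
      EndIf
      result$ + " " + RegularExpressionMatchString(r)  ; the capital, prefixed with a space
      offset = pos + RegularExpressionMatchLength(r)
    Wend
  EndIf
  result$ + Mid(text$, offset)                         ; tail after the last match
  FreeRegularExpression(r)
  result$ = UCase(Left(result$, 1)) + Mid(result$, 2)  ; capitalize the first character
  Debug result$
Else
  Debug RegularExpressionError()
EndIf
```

Note that a string which already starts with a capital would come out with a leading space; trimming that is left out to keep the sketch short.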

Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 1:41 pm
by BarryG
You da man! Haha. That works, but it's bedtime now so I'll test more extensively tomorrow. Thanks!
Re: Regex and $1 parameter
Posted: Sun Nov 28, 2021 4:04 pm
by AZJIO
Re: Regex and $1 parameter
Posted: Mon Nov 29, 2021 10:41 am
by BarryG
AZJIO, I took a look but can't see how that helps with my question here. It outputs the original string. Granted, I'm not great with regexes so I'm probably doing something wrong. The aim is to make the regex work exactly the same way as the StackOverflow version at the start of this thread, since that's what my users will be providing.
Here's your code and what I tried:
Code:
#RegExp = 0
Procedure.s RegexReplace2(RgEx, *Result.string, Replace0$)
  Protected i, CountGr, Pos, Offset = 1
  Protected Result$, Replace$
  Protected NewList item.s()
  Protected LenT, *Point
  CountGr = CountRegularExpressionGroups(RgEx)
  If CountGr > 9
    CountGr = 9
  EndIf
  If ExamineRegularExpression(RgEx, *Result\s)
    While NextRegularExpressionMatch(RgEx)
      Pos = RegularExpressionMatchPosition(RgEx)
      Replace$ = ReplaceString(Replace0$,"\0", RegularExpressionMatchString(RgEx))
      For i = 1 To CountGr
        Replace$ = ReplaceString(Replace$, "\"+Str(i), RegularExpressionGroup(RgEx, i))
      Next
      If AddElement(item())
        item() = Mid(*Result\s, Offset, Pos - Offset) + Replace$
      EndIf
      Offset = Pos + RegularExpressionMatchLength(RgEx)
    Wend
    If AddElement(item())
      item() = Mid(*Result\s, Offset)
    EndIf
    LenT = 0
    ForEach item()
      LenT + Len(item())
    Next
    *Result\s = Space(LenT)
    *Point = @*Result\s
    ForEach item()
      CopyMemoryString(item(), @*Point)
    Next
    FreeList(item())
  EndIf
EndProcedure
#RegExp = 0
Define Text.string
Text\s = "thisStringIsGood"
CreateRegularExpression(#RegExp , "/([A-Z])/g" )
RegexReplace2(#RegExp, @Text, " \1" )
FreeRegularExpression(#RegExp)
Debug Text\s ; thisStringIsGood
Re: Regex and $1 parameter
Posted: Mon Nov 29, 2021 2:17 pm
by AZJIO
I'm busy right now, but two hints:
1. The problem of groups is being solved here.
2. You need to remove the "/" at the beginning and the "/[gim]+" at the end. Don't just discard them, though: use those flags to enable the corresponding modes.
Re: Regex and $1 parameter
Posted: Mon Nov 29, 2021 2:53 pm
by Marc56us
(Just for fun, following up on my suggestion https://www.purebasic.fr/english/viewto ... 74#p577574)
Quick and dirty code that takes the user input and passes it as is to SED (so it uses SED's regex syntax).
(SED uses \1 instead of $1.)
Code:
; Regex And $1 parameter
; Post by BarryG » Sun Nov 28, 2021 8:47 am
; https://www.purebasic.fr/english/viewtopic.php?p=577567#p577567
; Marc56 - 2021-11-23
EnableExplicit
Enumeration
  #RegExp
EndEnumeration
Procedure RegexReplaceNew(RegEx$, Text$, Replace$)
  Debug "Regex source : " + Regex$
  RegEx$ = ReplaceString(RegEx$, "(", "\(")
  RegEx$ = ReplaceString(RegEx$, ")", "\)")
  RegEx$ = RTrim(RegEx$, "g")
  Debug "Regex with esc : " + Regex$
  Protected Arg$ = "sed 's" + RegEx$ + Replace$ + "/g' Tmp_File.in > Tmp_File.out"
  Debug "SED command line: " + Arg$
  Protected Run = RunProgram("wsl", Arg$, GetTemporaryDirectory(), #PB_Program_Wait)
  Protected Tmp_File$ = GetTemporaryDirectory() + "Tmp_File.out"
  If FileSize(Tmp_File$) > 0
    ReadFile(1, Tmp_File$)
    Protected New_Line$ = ReadString(1)
    CloseFile(1)
    Debug "---"
    Debug Text$
    Debug New_Line$
    Debug UCase(Left(New_Line$, 1)) + Right(New_Line$, Len(New_Line$) -1)
  Else
    Debug "No file"
  EndIf
EndProcedure
Global Text$ = "thisStringIsGood"
If OpenFile(0, GetTemporaryDirectory() + "Tmp_File.in")
  WriteString(0, "thisStringIsGood")
  CloseFile(0)
  Global RegEx$ = "/([A-Z])/g"
  RegexReplaceNew(RegEx$ ,Text$, " \1")
Else
  Debug "Can't create Temp file"
  End
EndIf
DeleteFile(GetTemporaryDirectory() + "Tmp_File.in")
DeleteFile(GetTemporaryDirectory() + "Tmp_File.out")
End
(Using SED of WSL 1. If you don't have it installed, download SED from Unix Tools for Windows instead)
Code:
Regex source : /([A-Z])/g
Regex with esc : /\([A-Z]\)/
SED command line: sed 's/\([A-Z]\)/ \1/g' Tmp_File.in > Tmp_File.out
---
thisStringIsGood
this String Is Good
This String Is Good
But the simplest solution would obviously be to parse the user input (remove the // delimiters and the trailing flags) and use PB's regular expression functions. If you want to write your own regular expression filter that covers every case, though, I hope you have lots of coffee and time.

Re: Regex and $1 parameter
Posted: Tue Nov 30, 2021 5:29 am
by AZJIO
Code:
EnableExplicit
Procedure.s RegexReplace2(RgEx, *Result.string, Replace0$, Once = 0)
  Protected i, CountGr, Pos, Offset = 1
  Protected Result$, Replace$
  Protected NewList item.s()
  Protected LenT, *Point
  CountGr = CountRegularExpressionGroups(RgEx)
  If CountGr > 9
    CountGr = 9
  EndIf
  If ExamineRegularExpression(RgEx, *Result\s)
    While NextRegularExpressionMatch(RgEx)
      Pos = RegularExpressionMatchPosition(RgEx)
      Replace$ = ReplaceString(Replace0$,"\0", RegularExpressionMatchString(RgEx))
      For i = 1 To CountGr
        Replace$ = ReplaceString(Replace$, "\"+Str(i), RegularExpressionGroup(RgEx, i))
      Next
      If AddElement(item())
        item() = Mid(*Result\s, Offset, Pos - Offset) + Replace$
      EndIf
      Offset = Pos + RegularExpressionMatchLength(RgEx)
      If Once
        Break
      EndIf
    Wend
    If AddElement(item())
      item() = Mid(*Result\s, Offset)
    EndIf
    LenT = 0
    ForEach item()
      LenT + Len(item())
    Next
    *Result\s = Space(LenT)
    *Point = @*Result\s
    ForEach item()
      CopyMemoryString(item(), @*Point)
    Next
    FreeList(item())
  EndIf
EndProcedure
Define reSource$, reFlag$, User_entered$, re, re2, CreFlags = 0, Once = 0
User_entered$ = "/([A-Z])/g"
re=CreateRegularExpression(#PB_Any,"/(.+?)/([gim]*)")
If re
  If ExamineRegularExpression(re, User_entered$)
    If NextRegularExpressionMatch(re)
      reSource$ = RegularExpressionGroup(re, 1)
      reFlag$ = RegularExpressionGroup(re, 2)
    EndIf
  EndIf
  FreeRegularExpression(re)
EndIf
If Not Asc(reSource$)
  Debug "User, you're wrong. Empty regular expression"
  Debug "The regular expression should be in the following format: /anything/gim"
  End
EndIf
If FindString(reFlag$, "i")
  CreFlags | #PB_RegularExpression_NoCase
EndIf
If FindString(reFlag$, "m")
  CreFlags | #PB_RegularExpression_MultiLine
EndIf
If Not FindString(reFlag$, "g")
  Once = 1
EndIf
Define Text.string
Text\s = "thisStringIsGood"
re2 = CreateRegularExpression(#PB_Any, reSource$, CreFlags)
If re2
  RegexReplace2(re2, @Text, " \1", Once)
  FreeRegularExpression(re2)
  Debug Text\s ; this String Is Good
Else
  Debug "User, you're wrong:"
  Debug RegularExpressionError()
  End
EndIf
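
Since users will be pasting StackOverflow-style replacement strings such as " $1", while RegexReplace2() looks for "\1", a small conversion step may be needed before the call. The helper below is hypothetical (the name DollarToBackslash is not from this thread) and is only a sketch: it handles "$0" to "$9" and does not deal with escapes like "$$" or with group numbers above 9.

```purebasic
; Hypothetical helper: rewrite "$0".."$9" in a user-supplied replacement
; string to the "\0".."\9" form that RegexReplace2() searches for.
EnableExplicit

Procedure.s DollarToBackslash(Replace$)
  Protected i
  For i = 0 To 9
    ; "\" is a literal backslash here, since this is not an escaped (~"...") string
    Replace$ = ReplaceString(Replace$, "$" + Str(i), "\" + Str(i))
  Next
  ProcedureReturn Replace$
EndProcedure

Debug DollarToBackslash(" $1")
```

The result could then be passed as the third argument of RegexReplace2() in place of the hard-coded " \1".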