[Solved] RegEx question
Posted: Fri Jan 08, 2021 7:26 am
by BarryG
This is hard for me to work out, but should be easy for regular expression experts here. Hehe. See this string:
Code: Select all
{}{-first}1{-two}2{-three}3{-hello}{-last}{keep}
I just want to remove anything starting with "{-" and ending with "}", including those curly braces, so the above would become: {}123{keep}
I have no idea how to code this in regex. I've looked here but it's still too confusing because it wants to remove everything between, instead of only if starting with "{-":
https://stackoverflow.com/questions/239 ... o-brackets
Appreciate any help.
Re: RegEx question
Posted: Fri Jan 08, 2021 9:23 am
by STARGÅTE
Code: Select all
If CreateRegularExpression(0, "\{-.*?\}")
  Define Result.s = ReplaceRegularExpression(0, "{}{-first}1{-two}2{-three}3{-hello}{-last}{keep}", "")
  Debug Result
Else
  Debug RegularExpressionError()
EndIf
Re: RegEx question
Posted: Fri Jan 08, 2021 9:41 am
by BarryG
Thank you, Stargate.
I put your expression into https://regex101.com/ to learn how it works, and this came up:
Now I have a better understanding of it. I really appreciate your pointing the way. Thank you again!
BTW, the manual says CreateRegularExpression() returns 0 if the expression was not created successfully, but how likely is this in reality? It would only be if our expression syntax was wrong, yes? Since the expression here is 100% valid, I wouldn't need to do an If/EndIf for it?
Re: [Solved] RegEx question
Posted: Fri Jan 08, 2021 9:47 am
by infratec
If you also accept a solution without regex:
Code: Select all
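Procedure.s Remove(*TextPtr.Character)
  Protected Result$, Help$, State.i
  While *TextPtr\c
    Help$ + Chr(*TextPtr\c)
    Select State
      Case 0 ; outside any brace
        If *TextPtr\c = '{'
          State = 1
        Else
          Result$ + Help$
          Help$ = ""
        EndIf
      Case 1 ; the previous character was '{'
        If *TextPtr\c = '-'
          State = 2
        ElseIf *TextPtr\c = '{'
          Result$ + "{" ; keep the earlier brace; this one may still start "{-"
          Help$ = "{"
        Else
          Result$ + Help$ ; ordinary brace, keep it
          Help$ = ""
          State = 0
        EndIf
      Case 2 ; inside "{-...": discard everything up to the closing '}'
        If *TextPtr\c = '}'
          Help$ = ""
          State = 0
        EndIf
    EndSelect
    *TextPtr + SizeOf(Character) ; advance by one character, whatever its byte size
  Wend
  ProcedureReturn Result$ + Help$ ; keep an unfinished "{" or "{-..." at the end
EndProcedure
Text$ = "{}{-first}1{-two}2{-three}3{-hello}{-last}{keep}"
Debug Remove(@Text$)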
Finding what you don't want is easy: {-\w*}
But creating the negative lookahead ...
Re: [Solved] RegEx question
Posted: Fri Jan 08, 2021 9:49 am
by loulou2522
Hi BarryG,
Try Rexman; it shows you how to build and test a regexp.
viewtopic.php?f=27&t=37212&hilit=rexman
Re: [Solved] RegEx question
Posted: Fri Jan 08, 2021 9:51 am
by infratec
Ah ....
replace the matches with ""
But my solution saves a lot of kb for the exe

Re: [Solved] RegEx question
Posted: Fri Jan 08, 2021 9:56 am
by BarryG
Thanks for the alternative, infratec! Saves about 124 KB added to my exe. But at least I learned a bit about regex formats today. @loulou2522, I'll download Rexman and have a play.
Re: [Solved] RegEx question
Posted: Fri Jan 08, 2021 10:26 am
by Marc56us
Hi,
It can be simplified: there is no need to escape { and } when they are not used as a quantifier.
Code: Select all
If CreateRegularExpression(0, "{-.*?}")
  Define Result.s = ReplaceRegularExpression(0, "{}{-first}1{-two}2{-three}3{-hello}{-last}{keep}", "")
  Debug Result
Else
  Debug RegularExpressionError()
EndIf
And another way for those (like me) who do not understand pointers: a FindString() version.
Code: Select all
; -------12345678901234567890123456789012345678901234567890
Text$ = "{}{-first}1{-two}2{-three}3{-hello}{-last}{keep}"
; Expected
; {}123{keep}
Debug "Text: " + Text$
Nb = CountString(Text$, "{-")
Debug "Found: {-*} " + Nb + " time(s)"
For i = 1 To Nb
  Debug #CRLF$ + "#" + i
  STX = FindString(Text$, "{-")
  If STX > 0
    Debug " STX: " + STX
    ETX = FindString(Text$, "}", STX)
    Debug " ETX: " + ETX
    Text$ = RemoveString(Text$, Mid(Text$, STX, ETX-STX+1))
    STX = 0
  EndIf
Next
Debug #CRLF$ + "Result: " + Text$
It would probably be slow with large text, but PB functions are fast.

Re: [Solved] RegEx question
Posted: Fri Jan 08, 2021 5:20 pm
by ebs
Marc56us's FindString() code can be simplified and made somewhat faster by 1) not counting the number of "{-" occurrences and 2) starting the next FindString() at STX, not the beginning of the string:
Code: Select all
Procedure.s Remove4(Text$)
  Protected STX, ETX
  STX = 1
  Repeat
    STX = FindString(Text$, "{-", STX)
    If STX
      ETX = FindString(Text$, "}", STX)
      If ETX = 0
        Break ; no closing brace: nothing valid to remove, so stop
      EndIf
      Text$ = RemoveString(Text$, Mid(Text$, STX, ETX-STX+1))
    Else
      Break
    EndIf
  ForEver
  ProcedureReturn Text$
EndProcedure
It also works if you omit the initialization of "STX = 1" and start at zero instead, but that would be wrong

Re: [Solved] RegEx question
Posted: Sat Dec 31, 2022 11:18 am
by BarryG
Hi all, back on this topic. This time, I want to convert abc {?913} --- {?9745} xyz to abc 913 --- 9745 xyz. As you can see, I just want to remove any leading "{?" and the closing "}" after it. These markers will be present multiple times in the string. My tired brain can't work it out. Any tips? Thanks.
I can do it this way, but it's slow and inefficient. I'm posting it just to prove I tried before blindly asking for help. <Wink>.
Code: Select all
text$="abc {?913} --- {?9745} xyz"
While FindString(text$,"{?")
  startPos = FindString(text$, "{?")
  endPos = FindString(text$, "}")
  text$ = Left(text$, startPos - 1) + Mid(text$, startPos + 2, endPos - startPos - 2) + Mid(text$, endPos + 1)
Wend
Debug text$ ; abc 913 --- 9745 xyz
I think a modification of infratec's code would be best, but I'm dumb tonight. Here's a template to start:
Code: Select all
Procedure.s RemoveMarkers(*TextPtr.Character)
  Protected Result$, Help$, State.i
  While *TextPtr\c
    Help$ + Chr(*TextPtr\c)
    Select State
      Case 0
        If *TextPtr\c = '{'
          State = 1
        Else
          Result$ + Help$
          Help$ = ""
        EndIf
      Case 1
        If *TextPtr\c = '?'
          State = 2
        ElseIf *TextPtr\c = '}'
          Result$ + Help$
          Help$ = ""
          State = 0
        EndIf
      Case 2
        If *TextPtr\c = '}'
          Help$ = ""
          State = 0
        EndIf
    EndSelect
    *TextPtr + 2
  Wend
  ProcedureReturn Result$
EndProcedure
Text$="abc {?913} --- {?9745} xyz"
Debug RemoveMarkers(@Text$)
Re: [Solved] RegEx question
Posted: Sat Dec 31, 2022 11:45 am
by Marc56us
I want to convert abc {?913} --- {?9745} xyz to abc 913 --- 9745 xyz. As you can see, I just want to remove any leading "{?" and the closing "}" after it. These markers will be present multiple times in the string.
Code: Select all
EnableExplicit
Define Txt$ = "abc {?913} --- {?9745} xyz"
Debug Txt$
Txt$ = ReplaceString(Txt$, "{?", "")
Txt$ = ReplaceString(Txt$, "}", "")
Debug Txt$
End
Code: Select all
abc {?913} --- {?9745} xyz
abc 913 --- 9745 xyz
Re: RegEx question
Posted: Sat Dec 31, 2022 11:57 am
by BarryG
Not quite, Marc56us. I'm not that dumb. Hehe. Your code removes all "{?" and "}", which is not the goal. Only if "{?" is first followed by a "}" should those two sets be removed. Your code fails here, for example:
Code: Select all
Txt$ = "abc} {?913} --- {?9745} } } } {?xyz"
Txt$ = ReplaceString(Txt$, "{?", "")
Txt$ = ReplaceString(Txt$, "}", "")
; Next line debugs the wrong answer.
Debug Txt$ ; abc 913 --- 9745    xyz
Re: RegEx question
Posted: Sat Dec 31, 2022 1:12 pm
by infratec
Code: Select all
Txt$ = "abc} {?913} --- {?9745} } } } {?xyz"
Pos1 = FindString(Txt$, "{?")
While Pos1
  Pos2 = FindString(Txt$, "}", Pos1)
  If Pos2
    Txt$ = Left(Txt$, Pos1 - 1) + Mid(Txt$, Pos2 + 1)
  Else
    Break
  EndIf
  Pos1 = FindString(Txt$, "{?", Pos1)
Wend
Debug Txt$
Re: RegEx question
Posted: Sat Dec 31, 2022 1:18 pm
by infratec
Extended version:
Code: Select all
Procedure.s CutOut(String$, CutStart$, CutEnd$)
  Protected.i Pos1, Pos2, CutEndLen
  CutEndLen = Len(CutEnd$)
  Pos1 = FindString(String$, CutStart$)
  While Pos1
    Pos2 = FindString(String$, CutEnd$, Pos1)
    If Pos2
      String$ = Left(String$, Pos1 - 1) + Mid(String$, Pos2 + CutEndLen)
    Else
      Break
    EndIf
    Pos1 = FindString(String$, CutStart$, Pos1)
  Wend
  ProcedureReturn String$
EndProcedure
Txt$ = "abc} {?913} --- {?9745} } } } {?xyz"
Debug CutOut(Txt$, "{?", "}")
Txt$ = "{}{-first}1{-two}2{-three}3{-hello}{-last}{keep}"
Debug CutOut(Txt$, "{-", "}")
You can speed it up by using pointers and CopyMemory

Re: RegEx question
Posted: Sat Dec 31, 2022 1:46 pm
by BarryG
Sorry infratec, but both your examples delete the 913 and 9745, which are to be kept. See my original post with the green text of what's to be left. Only the leading marker of "{?" and closing marker of "}" is to be removed, and nothing else. It's a pain to code, as you can see.
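For reference, the requirement stated here (drop only the "{?" and the "}" that closes it, keep what sits between them) can be sketched with FindString() and Mid() alone. This is only a rough sketch: the procedure name StripMarkers() and its parameters are made up for illustration, and leaving an unclosed "{?" untouched is an assumption based on the failing example earlier in the thread.

```purebasic
; Hypothetical helper (name and parameters are illustrative only):
; removes each Open$ marker and the next Close$ after it,
; but keeps the text between the two markers.
Procedure.s StripMarkers(Text$, Open$, Close$)
  Protected Result$, Pos, StartPos, EndPos, OpenLen, CloseLen
  OpenLen = Len(Open$)
  CloseLen = Len(Close$)
  Pos = 1 ; first character not yet copied to Result$
  Repeat
    StartPos = FindString(Text$, Open$, Pos)
    If StartPos = 0
      Break ; no more opening markers
    EndIf
    EndPos = FindString(Text$, Close$, StartPos + OpenLen)
    If EndPos = 0
      Break ; assumption: an opening marker without a closing one is left as-is
    EndIf
    ; Copy the text before the marker, then the text inside the marker.
    Result$ + Mid(Text$, Pos, StartPos - Pos)
    Result$ + Mid(Text$, StartPos + OpenLen, EndPos - StartPos - OpenLen)
    Pos = EndPos + CloseLen
  ForEver
  ProcedureReturn Result$ + Mid(Text$, Pos) ; append the untouched remainder
EndProcedure

Debug StripMarkers("abc {?913} --- {?9745} xyz", "{?", "}") ; abc 913 --- 9745 xyz
Debug StripMarkers("abc} {?913} --- {?9745} } } } {?xyz", "{?", "}") ; abc} 913 --- 9745 } } } {?xyz
```

The source string is never modified; the result is built up in one left-to-right pass, so each search starts where the previous one ended instead of rescanning from the beginning.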