Extracting regular expressions while keeping track

superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

I'd like to keep track of exactly where each match came from when using ExtractRegularExpression(). Is this possible? In my case I need not only to match against my regular expression, but also to know where each match came from.

Imagine matching numbers in a list of 1000 numbers: you'd typically want to know where each match came from, so you can highlight it in your application or do something with those lines.

In my case I'm matching words, and knowing the line each match came from would be great.

I fear I'd have to give up on regular expressions to do this or I'd have to process the string twice (once with regular expressions, the second time just to find where each match came from).

That sounds feasible but slow. I'm OK with the speed of regexp, but not so much with having to do everything twice...
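To make the goal concrete, here is a minimal sketch of reporting the line of each match in a single pass. It is written in Python rather than PureBasic, purely as an illustration; the function name matches_with_lines and the sample text are hypothetical. It assumes the regex engine can report each match's start offset, which is exactly what ExtractRegularExpression() does not give you.

```python
import re

def matches_with_lines(pattern, text):
    """Yield (line_number, column, matched_text) for each match, scanning text once."""
    line = 1          # 1-based line number of the current match
    line_start = 0    # offset of the first character of that line
    scanned = 0       # offset up to which newlines have already been counted
    for m in re.finditer(pattern, text):
        start = m.start()
        # Count only the newlines between the previous match and this one.
        newlines = text.count("\n", scanned, start)
        if newlines:
            line += newlines
            line_start = text.rfind("\n", scanned, start) + 1
        scanned = start
        yield line, start - line_start + 1, m.group()

sample = "alpha 12\nbeta\ngamma 7 and 99\n"
for line_no, column, value in matches_with_lines(r"\d+", sample):
    print(line_no, column, value)
```

Because newlines are only counted in the gap between consecutive matches, the text is walked once for matching and once for line counting, with no separate search per match.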

Should I bash the keyboard and give up?
Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

I've tried to use regexes several times - all have failed miserably due to similar inherent restrictions. They only match, they don't actually do anything useful. It's so frustrating!
Frank Webb
New User
Posts: 9
Joined: Wed Sep 19, 2007 12:57 am
Location: AUSTRALIA

Re: Extracting regular expressions while keeping track

Post by Frank Webb »

I know this reply is a bit late!

I had the same problem and tried a number of ways to obtain the starting character position of the first pattern match of a regular expression in a given string.

Using the Mid(inputstring$, 1, 1) technique to build up a string character by character, and passing it to MatchRegularExpression(#RegularExpression, String$) as the string grows in length, will work, but it is painfully slow.

What works for me is to detect a pattern match with MatchRegularExpression(#RegularExpression, String$).

Once a match has been detected, the matched text is extracted using ExtractRegularExpression(#RegularExpression, String$, Array$()).

That text is then passed to Position = FindString(String$, StringToFind$, StartPosition),

where the "StringToFind$" argument is Array$(0), obtained from the ExtractRegularExpression() call.

This works really well and is fast.

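If the searched string contains multiple pattern matches, it is simply a matter of substituting Array$(0), Array$(1), ... Array$(N) in turn to find all the starting positions, moving StartPosition past each match already found so that a repeated value is not reported at the same position twice.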
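The extract-then-search idea above can be sketched as follows. This is Python, not PureBasic, and only illustrates the logic: re.findall stands in for ExtractRegularExpression() and str.find for FindString(). The function name positions_by_search is hypothetical, the pattern is assumed to contain no capture groups, and positions are 0-based here whereas FindString() counts from 1.

```python
import re

def positions_by_search(pattern, text):
    """Extract all matches first, then locate each one with a plain substring search."""
    found = re.findall(pattern, text)   # plays the role of the filled Array$()
    positions = []
    start = 0                           # plays the role of StartPosition
    for value in found:
        pos = text.find(value, start)
        positions.append((pos, value))
        # Move past this match so an identical later match is found further on.
        start = pos + max(len(value), 1)
    return positions

print(positions_by_search(r"\d+", "7 cats, 7 dogs, 42 birds"))
```

One caveat with this approach: the substring search knows nothing about the pattern's context, so with anchors, word boundaries, or lookarounds it can land on an earlier occurrence of the same text that the regular expression itself did not match.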

I hope this late reply will help someone with the same problem.
applePi
Addict
Posts: 1404
Joined: Sun Jun 25, 2006 7:28 pm

Re: Extracting regular expressions while keeping track

Post by applePi »

I ran COMatePLUS for the first time (http://www.purebasic.fr/english/viewtop ... 14&t=37214, download from http://www.nxsoftware.com/). Among the basic demos there is a file, "Demo_RegExp.pb", that shows how to search for the positions of the pattern "IS." in a string using the VBScript RegExp engine, and this is great.
I will study how to do the same with PCRE.
For the string "IS1 is2 IS3 is4", the positions of the pattern "IS." (ignoring case) are:

Results of running "Demo_RegExp.pb":
Running regular expression find sample...
------------------------------------------
Match found at characters 0 to 2 Match value is IS1
Match found at characters 4 to 6 Match value is is2
Match found at characters 8 to 10 Match value is IS3
Match found at characters 12 to 14 Match value is is4

Running regular expression replace sample...
------------------------------------------
The quick brown cat jumped over the lazy dog.
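For comparison, the match positions in the find sample above can be reproduced with any engine that exposes match offsets. The following is a small Python sketch, not the COMatePLUS demo and not PureBasic; the function name find_spans is hypothetical. It reports 0-based start and inclusive end positions, in the same way as the output listed above.

```python
import re

def find_spans(pattern, text):
    """Return (start, inclusive_end, value) for every case-insensitive match."""
    return [
        (m.start(), m.end() - 1, m.group())
        for m in re.finditer(pattern, text, re.IGNORECASE)
    ]

for start, end, value in find_spans("IS.", "IS1 is2 IS3 is4"):
    print(f"Match found at characters {start} to {end} Match value is {value}")
```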


P.S. To learn more about how the VBScript RegExp object is used in VB6, look here:
http://www.regular-expressions.info/vbscript.html