Position number for ExamineRegularExpression()

AZJIO
Addict
Posts: 2143
Joined: Sun May 14, 2017 1:48 am


Post by AZJIO »

Code:

ExamineRegularExpression(#RegularExpression, String$[, Position])
; Or
SetRegularExpressionPosition(#RegularExpression, String$, Position) ; for ExamineRegularExpression(), ReplaceRegularExpression(), etc
Currently the search always starts at the beginning of the string. The NextRegularExpressionMatch() function remembers the position from which to continue the search. But what if I want to start the search not from the beginning, but from a position found earlier by another function? What if I want to cache a position to speed up a later search?
highend
Enthusiast
Posts: 162
Joined: Tue Jun 17, 2014 4:49 pm

Re: Position number for ExamineRegularExpression()

Post by highend »

There is

Code:

RegularExpressionMatchPosition()
Just store these positions during the loop if you want to access them (later)?
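Storing the positions during the loop could look roughly like this. It is a minimal sketch using only documented regex commands; the sample text, the pattern and the list name Positions() are made up for illustration.

```purebasic
; Collect the 1-based start position of every match in a list.
NewList Positions.i()
Define Text$ = "a1 b22 c333"

If CreateRegularExpression(0, "\d+")
  If ExamineRegularExpression(0, Text$)
    While NextRegularExpressionMatch(0)
      AddElement(Positions())
      Positions() = RegularExpressionMatchPosition(0)
    Wend
  EndIf
  FreeRegularExpression(0)
EndIf

ForEach Positions()
  Debug Positions()   ; 2, then 5, then 9
Next
```

This only records where matches were found; it does not let a later search start from one of those positions.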
AZJIO

Re: Position number for ExamineRegularExpression()

Post by AZJIO »

highend wrote: Sun Dec 31, 2023 1:13 am There is

Code:

RegularExpressionMatchPosition()
Just store these positions during the loop if you want to access them (later)?
I need to use the position, not just know it. I need it to speed up the search. Let's say I have 1000 files of 1 MB each and I use a complex regular expression with lookahead and lookbehind. If the text I need is near the end of each file, a start position would mean scanning not 1000 MB of data but only 100 MB, or just 1 MB if I step through one element at a time and never look more than one element ahead. That is the difference between waiting 1 second and waiting 1000 seconds (more than 16 minutes). I hope that matters to you.
highend

Re: Position number for ExamineRegularExpression()

Post by highend »

Afaik that's not possible. That's internal data of the regex engine; you can't start or resume a regex search from a different position.

What you can do: once you have the position of the first match, split the string at that position (-1) and run the search on the second part of the split.
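A rough sketch of that split workaround, assuming a hypothetical helper named FindFrom() (not a PureBasic command). It searches a copy of the tail made with Mid() and shifts the result back to a position in the full string.

```purebasic
; Hypothetical helper: emulates a start position by searching a copy of the tail.
; Caveats: Mid() copies the tail, and lookbehind or ^ cannot see
; any text that lies before Position.
Procedure.i FindFrom(Regex.i, String$, Position.i)
  Protected Result.i = 0
  If Position < 1 : Position = 1 : EndIf
  If ExamineRegularExpression(Regex, Mid(String$, Position))
    If NextRegularExpressionMatch(Regex)
      ; Match positions are 1-based and relative to the tail, so shift them back
      Result = RegularExpressionMatchPosition(Regex) + Position - 1
    EndIf
  EndIf
  ProcedureReturn Result   ; 0 means no match at or after Position
EndProcedure

Define Text$ = "id=1 id=22 id=333"
If CreateRegularExpression(0, "id=\d+")
  Debug FindFrom(0, Text$, 1)   ; 1
  Debug FindFrom(0, Text$, 2)   ; 6
  Debug FindFrom(0, Text$, 7)   ; 12
  FreeRegularExpression(0)
EndIf
```

The copy costs time and memory on large strings, which is exactly what a built-in Position parameter would avoid.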
Marc56us
Addict
Posts: 1600
Joined: Sat Feb 08, 2014 3:26 pm

Re: Position number for ExamineRegularExpression()

Post by Marc56us »

AZJIO wrote: Sat Dec 30, 2023 11:43 pm

Code:

ExamineRegularExpression(#RegularExpression, String$[, Position])
Good suggestion.
In the meantime, some ideas (not tested)

1. Isolate the remaining part of the string with a pointer. The search will then be performed on this new string.
- or -
2. If the starting point is known for all files (e.g. search only the end), then create the equivalent of the unix command 'tail' using e.g. FileSeek() (in my opinion the best solution, as you don't load the whole file into RAM)
- or -
3. If you want to start the search from a certain point, then ignore the beginning (remember to use the correct encoding), e.g. ignore the first 10000 chars:

Code:

^(?:.{10000}).+?(<regex>)
etc
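Idea 2 (a 'tail' built on FileSeek()) might be sketched as follows. ReadTail() is a hypothetical helper, the file name and pattern are placeholders, and the code assumes the file is UTF-8.

```purebasic
; Hypothetical helper: returns at most MaxBytes from the end of a file as a string.
; Assumes UTF-8. The cut may land inside a multi-byte character, so the
; first character of the result can be damaged.
Procedure.s ReadTail(FileName$, MaxBytes.q)
  Protected Tail$, *Buffer, Size.q, Start.q, Length.i
  Protected File.i = ReadFile(#PB_Any, FileName$)
  If File
    Size = Lof(File)
    If Size > MaxBytes : Start = Size - MaxBytes : EndIf
    Length = Size - Start
    If Length > 0
      *Buffer = AllocateMemory(Length)
      If *Buffer
        FileSeek(File, Start)                      ; skip everything before the tail
        Length = ReadData(File, *Buffer, Length)   ; bytes actually read
        Tail$ = PeekS(*Buffer, Length, #PB_UTF8 | #PB_ByteLength)
        FreeMemory(*Buffer)
      EndIf
    EndIf
    CloseFile(File)
  EndIf
  ProcedureReturn Tail$
EndProcedure

; "data.log" and the pattern are placeholders for illustration
Define Tail$ = ReadTail("data.log", 65536)
If Tail$ <> "" And CreateRegularExpression(0, "ERROR \d+")
  If ExamineRegularExpression(0, Tail$)
    While NextRegularExpressionMatch(0)
      Debug RegularExpressionMatchString(0)
    Wend
  EndIf
  FreeRegularExpression(0)
EndIf
```

Only the last MaxBytes of each file are read and scanned, so this helps when the interesting text is known to sit near the end.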
AZJIO

Re: Position number for ExamineRegularExpression()

Post by AZJIO »

highend wrote: Sun Dec 31, 2023 8:55 am Afaik that's not possible. That's internal data of the regex engine; you can't start or resume a regex search from a different position.

What you can do: once you have the position of the first match, split the string at that position (-1) and run the search on the second part of the split.
Look at the "offset" parameter of AutoIt's StringRegExp function.
In C, a string is just a pointer to a sequential piece of data. You add the number "offset" to the pointer and you get a pointer to the remaining part of the string. That new pointer is what needs to be passed to the regular expression engine. That's it.
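The same pointer arithmetic can be shown in PureBasic itself. This is only an illustration with a made-up string and offset: ExamineRegularExpression() takes a String$ rather than a pointer, so today the tail would still have to be copied into a new string before it can be searched.

```purebasic
Define Text$ = "alpha beta gamma"
Define Offset.i = 6                                  ; zero-based character offset, chosen for illustration
Define *Tail = @Text$ + Offset * SizeOf(Character)   ; points into the same buffer, nothing is copied
Debug PeekS(*Tail)                                   ; beta gamma
```

PeekS() creates a copy here only so that the result can be displayed; a Position parameter handled inside the command would not need one.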
Marc56us wrote: Sun Dec 31, 2023 9:15 am In the meantime, some ideas (not tested)
I know these methods.