Regular Expressions Engine (deelx) with PureBasic
Posted: Tue Jul 31, 2012 7:45 am
Here is how to use PureBasic with an excellent Perl-compatible regular expression engine called DEELX (freeware). First download the DLL version of the binary from CodeProject http://www.codeproject.com/Articles/159 ... gine-for-C or, if you don't have a free account there, from here http://www.mediafire.com/?33basaihl2e5xez which also includes the PB code and the text file. The author's site for the C++ version is here http://www.regexlab.com/en/
The VB demo from CodeProject contains a fun replacement example: it searches for words that end in a digit and moves that digit to the beginning of the word.
The string is: "x1 yy2 zzz3"
The pattern is: (\w+)(\d)
The replacement pattern $2$1 swaps the two captured groups, so the digit comes first and the rest of the word follows.
So the result will be "1x 2yy 3zzz"
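The DLL contains 20 functions, which we can list with this code:
Code:
; list every function exported by the DLL
Define dll.i = OpenLibrary(#PB_Any, "libdeelx.dll")
If dll
  If ExamineLibraryFunctions(dll)
    While NextLibraryFunction()
      Debug LibraryFunctionName()
    Wend
  EndIf
  CloseLibrary(dll) ; FreeLibrary_() is the WinAPI call and expects an OS handle, not a PB library number
Else
  Debug "cannot open libdeelx.dll"
EndIf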
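Since everything below goes through GetFunction() and CallFunctionFast(), a misspelled or missing name would hand a null pointer to CallFunctionFast(). As a small sketch, the function names used in this post can be checked up front. The procedure name CountMissing is made up for this example, and the name list is simply the one the code in this post uses, not the full set of 20.

```purebasic
; Sketch: count how many of the given comma-separated function names
; cannot be found in an already opened library.
Procedure.i CountMissing(lib.i, names.s)
  Protected missing.i = 0, i.i, name.s
  For i = 1 To CountString(names, ",") + 1
    name = StringField(names, i, ",")
    If GetFunction(lib, name) = 0
      Debug "missing: " + name
      missing + 1
    EndIf
  Next
  ProcedureReturn missing
EndProcedure

; the names used by the code in this post
names.s = "regexp_create,regexp_compile,regexp_match,regexp_replace,regexp_free"
names + ",result_create,result_ismatched,result_start,result_end,result_free"

dll.i = OpenLibrary(#PB_Any, "libdeelx.dll")
If dll
  Debug Str(CountMissing(dll, names)) + " function(s) missing"
  CloseLibrary(dll)
Else
  Debug "cannot open libdeelx.dll"
EndIf
```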
Searching for a pattern:
Suppose we have the string "ww12v345PY24536567" and want the start and end positions of every match of the pattern "2.*?3", i.e. the digit 2 followed by any characters up to the nearest digit 3 (the ? makes .* match as little as possible), searched over the whole string.
Code:
txt.s   = "ww12v345PY24536567"
regex.s = "2.*?3"

If OpenLibrary(0, "libdeelx.dll") = 0
  Debug "cannot open libdeelx.dll"
  End
EndIf

; look up each function once instead of on every pass of the loop
*regexpCreate    = GetFunction(0, "regexp_create")
*regexpCompile   = GetFunction(0, "regexp_compile")
*regexpMatch     = GetFunction(0, "regexp_match")
*regexpFree      = GetFunction(0, "regexp_free")
*resultCreate    = GetFunction(0, "result_create")
*resultIsMatched = GetFunction(0, "result_ismatched")
*resultStart     = GetFunction(0, "result_start")
*resultEnd       = GetFunction(0, "result_end")
*resultFree      = GetFunction(0, "result_free")

; .i instead of .l so the handles are pointer-sized
handle.i = CallFunctionFast(*regexpCreate)
result.i = CallFunctionFast(*resultCreate)
CallFunctionFast(*regexpCompile, handle, @regex, 0)

Debug "text = " + txt
Debug "pattern = " + regex
Debug ".........................................."

startpos.i = -1 ; the first search starts with -1
For i = 1 To Len(txt) ; there cannot be more matches than characters
  CallFunctionFast(*regexpMatch, handle, @txt, startpos, result)
  If CallFunctionFast(*resultIsMatched, result) = 0
    Break ; no further match
  EndIf
  matchStart.i = CallFunctionFast(*resultStart, result) ; 0-based position of the first matched character
  matchEnd.i   = CallFunctionFast(*resultEnd, result)   ; 0-based position just after the match
  Debug "pattern begins at " + Str(matchStart)
  Debug "pattern ends at " + Str(matchEnd - 1)
  Debug "substring matched " + Mid(txt, matchStart + 1, matchEnd - matchStart)
  Debug "==================================="
  startpos = matchEnd ; continue the search after this match
Next

CallFunctionFast(*regexpFree, handle)
CallFunctionFast(*resultFree, result)
CloseLibrary(0) ; not FreeLibrary_(0), which is the WinAPI call
The results should be this:
pattern begins at 3
pattern ends at 5
substring matched 2v3
===================================
pattern begins at 10
pattern ends at 13
substring matched 2453
===================================
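The positions printed above count from 0, while PureBasic's Mid() counts from 1, which is why the search code adds 1 to the start and prints the end minus 1. A small sketch of that arithmetic, which needs no DLL; the procedure name MatchText is made up for this example, and the end value is taken to point just after the match, as the search code treats it.

```purebasic
; Sketch: turn a 0-based start and a 0-based "just after the match" end
; into the matched text. Mid() is 1-based, so the start is shifted by one.
Procedure.s MatchText(text.s, startPos.i, endPos.i)
  ProcedureReturn Mid(text, startPos + 1, endPos - startPos)
EndProcedure

txt.s = "ww12v345PY24536567"
Debug MatchText(txt, 3, 6)   ; first match above (begins at 3, ends at 5): 2v3
Debug MatchText(txt, 10, 14) ; second match above (begins at 10, ends at 13): 2453
```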
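Replacing text:
Download this text file to use with the following code:
http://www.textfiles.com/stories/abbey.txt
We want to replace every "the", in capital or small letters, with "CATSZ". The pattern used, [Tt][Hh][Ee], has no word boundaries, so it also matches inside longer words such as "there".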
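Code:
Enumeration
  #DLL
  #FILE
  #OUT
EndEnumeration

regex.s      = "[Tt][Hh][Ee]"
ReplacedTo.s = "CATSZ"
Define null.s ; left empty, passed when we only want the length

If OpenLibrary(#DLL, "libdeelx.dll") = 0
  Debug "cannot open libdeelx.dll"
  End
EndIf

; look up each function once
*regexpCreate  = GetFunction(#DLL, "regexp_create")
*regexpCompile = GetFunction(#DLL, "regexp_compile")
*regexpReplace = GetFunction(#DLL, "regexp_replace")
*regexpFree    = GetFunction(#DLL, "regexp_free")

handle.i = CallFunctionFast(*regexpCreate)
CallFunctionFast(*regexpCompile, handle, @regex, 0)

If ReadFile(#FILE, "abbey.txt")
  If CreateFile(#OUT, "out.txt")
    While Eof(#FILE) = 0
      txt.s = ReadString(#FILE)
      If Len(txt) = 0
        Continue ; empty lines are skipped and do not appear in out.txt
      EndIf
      ; first call with a zero-size buffer: get the length of the replaced line
      ; (the replacement text is passed here, not the regex)
      result_length.i = CallFunctionFast(*regexpReplace, handle, @txt, @ReplacedTo, -1, -1, @null, 0)
      result.s = Space(result_length * 2) ; buffer with room to spare
      ; second call: do the replacement into the buffer
      CallFunctionFast(*regexpReplace, handle, @txt, @ReplacedTo, -1, -1, @result, result_length * 2)
      WriteString(#OUT, result)
      WriteString(#OUT, Chr(13) + Chr(10))
    Wend
    CloseFile(#OUT)
  EndIf
  CloseFile(#FILE) ; only close files that were actually opened
EndIf

CallFunctionFast(*regexpFree, handle)
CloseLibrary(#DLL) ; not FreeLibrary_(#DLL), which is the WinAPI call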
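The same two-call replace can be tried on the VB demo example from CodeProject, where the string "x1 yy2 zzz3" with the pattern (\w+)(\d) and the replacement $2$1 should come out with each digit moved to the front of its word. This is only a sketch: the procedure name DeelxReplace is made up, and it assumes regexp_replace takes its arguments in the same order as in the code above and that libdeelx.dll can be found. I have not checked the other flags or parameters.

```purebasic
; Sketch: one-shot replace through libdeelx.dll, using only the calls
; already shown in this post. Returns "" when the DLL cannot be opened.
Procedure.s DeelxReplace(text.s, pattern.s, replacement.s)
  Protected out.s = "", empty.s, handle.i, length.i, lib.i
  Protected *fnCreate, *fnCompile, *fnReplace, *fnFree

  lib = OpenLibrary(#PB_Any, "libdeelx.dll")
  If lib = 0
    ProcedureReturn ""
  EndIf
  *fnCreate  = GetFunction(lib, "regexp_create")
  *fnCompile = GetFunction(lib, "regexp_compile")
  *fnReplace = GetFunction(lib, "regexp_replace")
  *fnFree    = GetFunction(lib, "regexp_free")

  handle = CallFunctionFast(*fnCreate)
  CallFunctionFast(*fnCompile, handle, @pattern, 0)

  ; first call with a zero-size buffer to get the length, second call to fill it
  length = CallFunctionFast(*fnReplace, handle, @text, @replacement, -1, -1, @empty, 0)
  out = Space(length * 2)
  CallFunctionFast(*fnReplace, handle, @text, @replacement, -1, -1, @out, length * 2)

  CallFunctionFast(*fnFree, handle)
  CloseLibrary(lib)
  ProcedureReturn out
EndProcedure

; $2 is the digit, $1 is the rest of the word
Debug DeelxReplace("x1 yy2 zzz3", "(\w+)(\d)", "$2$1") ; the VB demo gives 1x 2yy 3zzz
```

Opening and closing the library on every call keeps the sketch short; for a whole file it would be better to open it once, as in the code above.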
The best introductory book is "Sams Teach Yourself Regular Expressions in 10 Minutes":
http://www.forta.com/books/0672325667/
By "10 minutes" he means a month or more.
Note that I have used only a few of the functions in the engine DLL; I don't yet know how to use the others. The functions above are sufficient for me.
More references:
Regular Expressions for DarkBasic:
http://forum.thegamecreators.com/?m=for ... 196596&b=1