More Sophisticated Lower Case function
Posted: Wed May 08, 2013 3:35 pm
I have a long list of words that I want to search for a case-insensitive match. The very fast PureBasic LCase() function can't be used because I want É to be found if I search for "e", and so on. Furthermore, I want to find "ae" (two letters) if I search for Æ, and "ss" if I search for ß. To this end I wrote the code below. It is reasonably satisfactory, but I wonder if anyone can improve on it. There is a very fast lower-case routine here: http://www.purebasic.fr/german/viewtopi ... 32#p248832 which can do everything except substitute two letters for one. I have no idea whether it could be adapted to do this.
Note: You should turn off the debugger when running this code or it will take a very long time!
Code:
EnableExplicit

Global Inc.l, FirstTick, Word.s, EventId.l, TicksTaken.l
; Lookup table: the character at byte offset N is the folded form of character code N
Global LcaseAsciiFixed.s{256}
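
; Builds the lookup table one character code at a time (codes as in Windows-1252).
Procedure CreateLowerCaseCharString()
  Protected Inc.l, LCaseAsciiStr.s
  For Inc = 1 To 44 ; 1..44 unchanged
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  LCaseAsciiStr = LCaseAsciiStr + " " ; 45: replaces hyphen by space
  For Inc = 46 To 64 ; 46..64 unchanged
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  LCaseAsciiStr = LCaseAsciiStr + "abcdefghijklmnopqrstuvwxyz" ; 65..90: A-Z -> a-z
  For Inc = 91 To 137 ; 91..137 unchanged (includes a-z)
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  LCaseAsciiStr = LCaseAsciiStr + "s‹œ" + Chr(141) + "z" ; 138..142: Š -> s, Œ -> œ, Ž -> z
  For Inc = 143 To 144
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  LCaseAsciiStr = LCaseAsciiStr + "''" + Chr(34) + Chr(34) ; 145..148: curly quotes -> straight quotes
  For Inc = 149 To 153
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  LCaseAsciiStr = LCaseAsciiStr + "s" ; 154: š -> s
  For Inc = 155 To 157 ; 156 (œ) is kept here and expanded to "oe" in MakeLowerCase()
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  LCaseAsciiStr = LCaseAsciiStr + "zy ¡¢£¤y" ; 158..165: ž -> z, Ÿ -> y, sticky space to space, ¥ -> y
  For Inc = 166 To 191
    LCaseAsciiStr = LCaseAsciiStr + Chr(Inc)
  Next
  ; 192..255: accented letters fold to their base letter; Ð/ð -> d, Þ/þ -> þ.
  ; æ and ß are kept as single characters so that MakeLowerCase() can expand them.
  LCaseAsciiStr = LCaseAsciiStr + "aaaaaaæceeeeiiiidnooooo×ouuuuyþßaaaaaaæceeeeiiiidnooooo÷ouuuuyþy"
  LcaseAsciiFixed = " " + LCaseAsciiStr ; the leading character fills the slot for code 0
EndProcedure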
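
; Returns theStr folded for matching: lower-cased, accents removed,
; and œ, ß, æ expanded to "oe", "ss", "ae".
; Reads and writes one byte per character, so this assumes an ASCII (non-Unicode) build.
Procedure.s MakeLowerCase(theStr.s)
  Protected AscChar.a, StrEnd.l, Inc.l
  StrEnd = Len(theStr) - 1 ; zero-based offset of the last character
  For Inc = 0 To StrEnd
    ; Look up the folded form of the character at offset Inc
    AscChar = PeekA(@LcaseAsciiFixed + PeekA(@theStr + Inc))
    Select AscChar
      Case 1 To 155, 157 To 222, 224 To 229, 231 To 255
        PokeA(@theStr + Inc, AscChar) ; one-to-one replacement
      Case 156 ; œ -> "oe"
        PokeA(@theStr + Inc, 111) ; "o"
        StrEnd + 1: Inc + 1 ; the string grows by one character; step past the inserted letter
        theStr = InsertString(theStr, "e", Inc + 1)
      Case 223 ; ß -> "ss"
        PokeA(@theStr + Inc, 115) ; "s"
        StrEnd + 1: Inc + 1
        theStr = InsertString(theStr, "s", Inc + 1)
      Case 230 ; æ -> "ae"
        PokeA(@theStr + Inc, 97) ; "a"
        StrEnd + 1: Inc + 1
        theStr = InsertString(theStr, "e", Inc + 1)
    EndSelect
  Next
  ProcedureReturn theStr
EndProcedure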

CreateLowerCaseCharString() ; build the lookup table once, before any call to MakeLowerCase()
Word = "ÆXAMPLE ßTRING"
If OpenWindow(0, 20, 20, 400, 300, "Lower Case")
  TextGadget(0, 20, 20, 250, 18, "Original String: " + Word)
  TextGadget(1, 20, 40, 250, 18, "Lower case String: " + MakeLowerCase(Word))
  TextGadget(2, 20, 60, 250, 18, "")
  ; Time one million conversions (GetTickCount_() is the Windows API tick counter, in milliseconds)
  FirstTick = GetTickCount_()
  For Inc = 1 To 1000000
    MakeLowerCase(Word)
  Next
  TicksTaken = GetTickCount_() - FirstTick
  SetGadgetText(2, "Time Taken: " + Str(TicksTaken) + " ms")
  Repeat
    EventId = WaitWindowEvent()
  Until EventId = #PB_Event_CloseWindow
  CloseWindow(0)
EndIf
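
For the search itself, one way to use MakeLowerCase() is to fold both the search text and each list entry, then compare the folded results. The following is only a sketch, assuming it is appended after the listing above in an ASCII-mode build. FindFolded() and the Words() list are hypothetical names that are not part of the original code.

```purebasic
; Hypothetical helper (not part of the original listing).
; Returns the 1-based index of the first entry in Words() whose folded form
; contains the folded search text, or 0 if no entry matches.
; Assumes CreateLowerCaseCharString() has already been called.
Procedure.l FindFolded(List Words.s(), SearchText.s)
  Protected Needle.s = MakeLowerCase(SearchText) ; fold the search text once
  Protected Index.l = 0
  ForEach Words()
    Index + 1
    ; Fold the candidate the same way, then do a plain substring search
    If FindString(MakeLowerCase(Words()), Needle, 1)
      ProcedureReturn Index
    EndIf
  Next
  ProcedureReturn 0
EndProcedure

; Example use with a small illustrative list
NewList Words.s()
AddElement(Words()) : Words() = "Straße"
AddElement(Words()) : Words() = "Encyclopædia"
Debug FindFolded(Words(), "STRASSE")
```

This folds every list entry again on each search. For a long list it may be cheaper to fold the entries once into a second list and search that instead.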