[Implemented] #PB_String_NoCase for FindString()

Posted: Sat Oct 11, 2008 1:24 am
by zxtunes.com
Please add a "#PB_String_NoCase" flag to the FindString() function.

Posted: Sat Oct 11, 2008 11:58 am
by Trond

Code: Select all

FindString(LCase(S.s), LCase(S.s), 0)
FindString(S.s, S.s, 0, #PB_String_NoCase)
Now tell me which one is shorter.

Posted: Sat Oct 11, 2008 12:04 pm
by milan1612
...maybe shorter, but very inefficient. I use FindString quite often with big strings,
but I wrote my own procedure when I needed it case-insensitive, as LCase is quite
slow with big strings. So +1 from me :P
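
For illustration only, a hand-written procedure of this kind might look roughly like the sketch below. It is not milan1612's actual code: the name FindStringNoCase is hypothetical, and it assumes that folding only the plain ASCII letters A-Z is good enough, so umlauts and other special characters are still compared case-sensitively. It compares in place and makes no LCase() copies.

```purebasic
; Hypothetical helper (not a PureBasic built-in): case-insensitive search.
; Assumption: only ASCII A-Z are folded; all other characters must match exactly.
Procedure.i FindStringNoCase(Haystack.s, Needle.s, StartPos.i = 1)
  Protected hLen.i = Len(Haystack)
  Protected nLen.i = Len(Needle)
  Protected i.i, j.i, c1.i, c2.i, match.i
  Protected *h.Character, *n.Character

  If nLen = 0 Or StartPos < 1 Or nLen > hLen
    ProcedureReturn 0
  EndIf

  For i = StartPos To hLen - nLen + 1
    *h = @Haystack + (i - 1) * SizeOf(Character)
    *n = @Needle
    match = #True
    For j = 1 To nLen
      c1 = *h\c
      c2 = *n\c
      ; Fold upper-case ASCII letters to lower case before comparing
      If c1 >= 'A' And c1 <= 'Z' : c1 + 32 : EndIf
      If c2 >= 'A' And c2 <= 'Z' : c2 + 32 : EndIf
      If c1 <> c2
        match = #False
        Break
      EndIf
      *h + SizeOf(Character)
      *n + SizeOf(Character)
    Next
    If match
      ProcedureReturn i   ; 1-based position of the first match
    EndIf
  Next

  ProcedureReturn 0       ; not found
EndProcedure

Debug FindStringNoCase("Hello World", "WORLD")   ; 7
```

Whether this is actually faster than LCase() plus FindString() would have to be measured. The inner loop is a naive search, and the built-in function may well be better optimized.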

Posted: Sat Oct 11, 2008 12:39 pm
by PB
> tell me which one is shorter

What do you mean by shorter? Shorter code to type, or shorter to execute?

Posted: Sat Oct 11, 2008 1:43 pm
by Wolf
+100

FindString is very handy when working with strings, but whenever I use it I feel that #PB_String_NoCase is missing...

I use LCase() too, but only out of necessity!


@Trond

If you like shorter, then instead of:

Code: Select all

FindString(LCase(S.s), LCase(S.s), 0)
FindString(S.s, S.s, 0, #PB_String_NoCase)
You can use:

Code: Select all

FindString(LCase(S.s), LCase(S.s), 0)
FindString(S.s, S.s, 0, 1)
Now which one is shorter and faster? :D

Posted: Sat Oct 11, 2008 2:13 pm
by Trond
How do you plan to implement this any faster than with lcase()? Whether the conversion happens inside or outside FindString() doesn't matter, it will still be just as slow.

Posted: Sat Oct 11, 2008 2:20 pm
by Kaeru Gaman
not really.

a #PB_String_NoCase would mean that one bit is ignored while comparing.
(capital letters have bit5 cleared, small have bit5 set, the rest is identical for both alphabets.)

using LCase means having a function run through both full strings, changing all chars to lower case.
additionally, since we don't change the original string but create a copy
that we pass to the FindString function, we need additional memory to do so.

I think the first could be achieved with an even faster algorithm.
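
As a quick illustration of the bit relationship described above, here is a tiny sketch. It assumes plain ASCII letters only and says nothing about other characters.

```purebasic
; ASCII letters only: upper and lower case differ in bit 5 (value $20).
Debug 'A' ! 'a'         ; XOR of the two character codes: 32
Debug Chr('a' & $DF)    ; clearing bit 5 gives "A"
Debug Chr('A' | $20)    ; setting bit 5 gives "a"
```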

Posted: Sat Oct 11, 2008 2:55 pm
by Little John
Kaeru Gaman wrote: a #PB_String_NoCase would mean that one bit is ignored while comparing.
(capital letters have bit5 cleared, small have bit5 set, the rest is identical for both alphabets.)
This is true for characters A-Z and a-z, respectively. Maybe also for some special characters, but certainly not for all of them. Many languages have special characters, such as our funny German umlauts. And the situation is even different with Unicode.
Of course, we want a case-insensitive FindString() function to be reliable also for special characters.

Regards, Little John

Posted: Sat Oct 11, 2008 4:11 pm
by Trond
Kaeru Gaman wrote: a #PB_String_NoCase would mean that one bit is ignored while comparing.
Sorry, but that would be a complete disaster, even in ASCII mode. In Unicode mode it would be worse...

Code: Select all

; IDE: Plain text source code
; Compiler: ascii mode

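; Compares two strings character by character while ignoring bit 5
; (value $20, i.e. the sixth bit when counting from 1).
Procedure.s CompareIgnoreBit6(S1.s, S2.s)
  If Len(S1) <> Len(S2)
    ProcedureReturn "wrong length"
  EndIf
  For I = 1 To Len(S1)
    ; The mask %11011111 clears bit 5, so "a" and "A" become the same value
    C1.c = Asc(Mid(S1, I, 1)) & %11011111
    C2.c = Asc(Mid(S2, I, 1)) & %11011111
    If C1 <> C2
      ProcedureReturn "NOT equal at " + Str(I)
    EndIf
  Next
  ProcedureReturn "equal"
EndProcedure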

Debug CompareIgnoreBit6("ABCDEF", "abcdef")
Debug CompareIgnoreBit6("ABCDEF", "aBcDeF")
Debug CompareIgnoreBit6("ABCDEF", "MNBVDE")
Debug CompareIgnoreBit6("ABCDEF", "ABCVDE")
Debug "Everything was ok so far..."
Debug "----"

S1.s = "{8 × 2 ^ 10] Ÿ ß"
S2.s = "[8 ÷ 2 ~ 10} ¿ ÿ"
Debug S1
Debug S2
Debug CompareIgnoreBit6(S1, S2) + "!?!?!"