String Characters & PB_COMPILER_UNICODE (Ascii/Unicode)

Xombie · Post by **Xombie** » Wed Sep 27, 2006 9:07 pm

Code updated for 5.20+.
Here's something that I just recently noticed and started using. Probably obvious to other people but... well... I'm slow.

Anyway, in the past if I was writing a routine to scan through a string and needed it to be compatible with ascii and unicode, I'd do some compiler-if statements for incrementing the string pointer, getting the character index, etc...

However, you can use bit-shifting to be lazier:

Code: Select all

Structure CHAR
   c.c
EndStructure
;-
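Procedure.l FindDash(Text.s)
   ; Returns the 1-based index of the first '-' in Text.
   ; If there is no dash, the result is Len(Text) + 1.
   Define.l HoldIndex
   ; Point at the first character of the string.
   Define.CHAR *HoldChar = @Text
   ; Walk forward until a dash or the terminating null is reached.
   While *HoldChar\c <> '-' And *HoldChar\c
      ; Step 1 byte in ascii (1 << 0) or 2 bytes in unicode (1 << 1).
      *HoldChar + (1 << #PB_COMPILER_UNICODE)
      ;
   Wend
   ; Convert the byte offset back to a character index.
   HoldIndex = ((*HoldChar - @Text) >> #PB_COMPILER_UNICODE) + 1
   ;
   ProcedureReturn HoldIndex
   ;
EndProcedure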
;-
Debug FindDash("Please - test this string.")
;-
End
;-

Like that. Try it under both unicode & non-unicode to see what I mean.

..: Edit :.. Fixed a misplaced addition

srod · Post by **srod** » Wed Sep 27, 2006 11:42 pm

There's nowt wrong with being lazy on occasion!

A neat trick, although SizeOf(CHARACTER) would probably work as well, maybe not as fast though.

Rescator · Post by **Rescator** » Thu Sep 28, 2006 3:54 am

Here is a nice "helper" for unicode and ascii.

Code: Select all

CompilerIf #PB_Compiler_Unicode
 #Char=2
CompilerElse
 #Char=1
CompilerEndIf

Debug #Char

horst · Post by **horst** » Thu Sep 28, 2006 8:26 am

@Xombie:
No need to define a structure. With *pointer.character you can use *pointer\c.
Character handling by pointer becomes interesting when you use the pointer in procedures that need several character operations. Then you can use macros such as:

Code: Select all

Macro StepChar(pointer)
  pointer + SizeOf(pointer#\c)
EndMacro 

Macro SkipSpace(pointer)
  While pointer#\c = ' ' : StepChar(pointer) : Wend 
EndMacro 

Macro GotoChar(pointer,char)
  While pointer#\c 
    If pointer#\c = char : Break : EndIf 
    StepChar(pointer)
  Wend 
EndMacro

Example:

Code: Select all

string$ = "1234567890"
*string.character = @string$ 
GotoChar(*string,'7') ; stop at the '7'
StepChar(*string)     ; advance one character (1 byte in ascii, 2 in unicode)
Debug Chr(*string\c)

Post by **Fred** » Thu Sep 28, 2006 8:17 pm

SizeOf(Character) is definitely the way to go, instead of the 1 << #PB_Compiler_Unicode trick (nice one, BTW).

Xombie · Post by **Xombie** » Thu Sep 28, 2006 9:42 pm

Thanks, Fred.

I know about SizeOf() for the char type, and I think it'd work fine for incrementing and for in-order searches, but what about getting a character at a specific index without actually looping through the string? That's why I originally settled on #PB_Compiler_Unicode. It's either 0 for ascii or 1 for unicode. With the 1, you can bit-shift left or right to multiply or divide by 2 and get the proper position in a unicode string. For ascii, shifting left or right by 0 does nothing, so the byte offset is the index.

Also, the nicely different colored #PB_Compiler_Unicode (at least in Japbe) lets you know that you're messing with ascii/unicode and you can use it multiple times in a line as needed.

Just my two cents

At the very least it's much less cumbersome than what I was doing with the compiler-if-else-endif.

Post by **Fred** » Thu Sep 28, 2006 10:06 pm

Just use pos*SizeOf(Character). If SizeOf() is 1, nothing is done (PB removes the useless multiply), and if SizeOf() is 2, it goes to the right place.
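
For anyone reading along, here is a minimal sketch of that idea. CharAt() is a hypothetical helper name (it is not from any post in this thread), and the 1-based index is an assumption chosen to match how Mid() counts characters.

```purebasic
; Hypothetical helper, not part of the original posts.
; Returns the character code at a 1-based Index, or 0 if Index is out of range.
Procedure.i CharAt(Text.s, Index.i)
  Protected *Char.Character
  If Index < 1 Or Index > Len(Text)
    ProcedureReturn 0
  EndIf
  ; Byte offset = zero-based position * bytes per character
  ; (1 byte in ascii, 2 bytes in unicode).
  *Char = @Text + (Index - 1) * SizeOf(Character)
  ProcedureReturn *Char\c
EndProcedure

; The dash is the 8th character of this string.
Debug Chr(CharAt("Please - test this string.", 8))
```

The range check is there only so the sketch never reads past the end of the string. Drop it if the caller already guarantees a valid index.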
