String Characters & PB_COMPILER_UNICODE (Ascii/Unicode)


Post by Xombie »

Code updated for 5.20+
Here's something that I just recently noticed and started using. Probably obvious to other people but... well... I'm slow.

Anyway, in the past if I was writing a routine to scan through a string and needed it to be compatible with ascii and unicode, I'd do some compiler-if statements for incrementing the string pointer, getting the character index, etc...

However, you can use bit-shifting to be more lazy :)

Code:

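Structure CHAR
   c.c
EndStructure

Procedure.l FindDash(Text.s)
   Define.l HoldIndex
   Define.CHAR *HoldChar = @Text
   ; Walk the string until a dash or the terminating null is reached.
   While *HoldChar\c <> '-' And *HoldChar\c
      ; Step 1 byte in ascii (1 << 0) or 2 bytes in unicode (1 << 1).
      *HoldChar + (1 << #PB_Compiler_Unicode)
   Wend
   ; Convert the byte offset back to a 1-based character index.
   ; If there is no dash, this is the string length plus one.
   HoldIndex = ((*HoldChar - @Text) >> #PB_Compiler_Unicode) + 1
   ProcedureReturn HoldIndex
EndProcedure

Debug FindDash("Please - test this string.")
End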
Like that. Try it under both unicode & non-unicode to see what I mean.

..: Edit :.. Fixed a misplaced addition :)

Post by srod »

There's nowt wrong with being lazy on occasion! :D

A neat trick, although SizeOf(CHARACTER) would probably work as well; maybe not as fast, though.
I may look like a mule, but I'm not a complete ass.
Post by Rescator »

Here is a nice "helper" for unicode and ascii.

Code:

CompilerIf #PB_Compiler_Unicode
 #Char=2
CompilerElse
 #Char=1
CompilerEndIf

Debug #Char

Post by horst »

@Xombie:
No need to define a structure. With *pointer.character you can use *pointer\c directly.
Character handling by pointer becomes interesting when a procedure needs several character operations on the same pointer. Then you can use macros such as:

Code:

Macro StepChar(pointer)
  pointer + SizeOf(pointer#\c)
EndMacro 

Macro SkipSpace(pointer)
  While pointer#\c = ' ' : StepChar(pointer) : Wend 
EndMacro 

Macro GotoChar(pointer,char)
  While pointer#\c
    If pointer#\c = char : Break : EndIf
    StepChar(pointer)
  Wend
EndMacro 
Example:

Code:

string$ = "1234567890"
*string.character = @string$
GotoChar(*string,'7')   ; stop on the '7'
StepChar(*string)       ; advance one character, not one byte
Debug Chr(*string\c)    ; shows "8"
Horst.
Post by Fred »

SizeOf(Character) is definitely the way to go, instead of the 1 << #PB_Compiler_Unicode trick (nice one BTW :P)
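
A minimal sketch of that suggestion, assuming the FindDash routine from the first post: the shifts are replaced with SizeOf(Character), and the built-in Character structure stands in for the custom CHAR one. The name FindDashSized, the null-pointer guard, and the 0 return when no dash is found are illustrative choices, not part of the original code.

```purebasic
; Sketch only: FindDashSized is a hypothetical name for this variant.
Procedure.i FindDashSized(Text.s)
  Protected *Char.Character = @Text
  Protected Index.i
  ; Guard: an empty string may have no buffer behind it.
  If *Char = 0
    ProcedureReturn 0
  EndIf
  ; Advance one character at a time until a dash or the terminating null.
  While *Char\c And *Char\c <> '-'
    *Char + SizeOf(Character)
  Wend
  ; Assumption: report 0 when no dash exists (the original returns length + 1).
  If *Char\c = 0
    ProcedureReturn 0
  EndIf
  ; Byte offset divided by the character size is the 0-based index; add 1.
  Index = (*Char - @Text) / SizeOf(Character) + 1
  ProcedureReturn Index
EndProcedure

Debug FindDashSized("Please - test this string.") ; 8, same as FindDash
```

Because SizeOf(Character) is a compile-time constant, the same source should build unchanged in both ascii and unicode modes.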
Post by Xombie »

Thanks, fred :)

I know about SizeOf() for the char type, and I think it'd work fine for incrementing and for in-order searches. But what about getting a character at a specific index without actually looping through the string? That's why I originally went with #PB_Compiler_Unicode. It's either 0 for ascii or 1 for unicode. With 1, shifting left multiplies a character index by 2 to get the byte offset in a unicode string, and shifting right divides a byte offset by 2 to get back to the index. For ascii, shifting left or right by 0 does nothing, so the byte offset is the index.

Also, the nicely different colored #PB_Compiler_Unicode (at least in Japbe) lets you know that you're messing with ascii/unicode and you can use it multiple times in a line as needed.

Just my two cents :) At the very least it's much less cumbersome than what I was doing with the compiler-if-else-endif.
Post by Fred »

Just use pos*SizeOf(Character): if SizeOf() is 1, nothing will be done (PB will remove this useless multiply), and if SizeOf() is 2 it will go to the right place ;).
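
As a rough illustration of pos*SizeOf(Character), the sketch below reads a single character by its 0-based index without looping. CharAt is a hypothetical helper, and it does no bounds checking, so the index is assumed to lie inside the string.

```purebasic
; Hypothetical helper: character code at a 0-based index (no bounds check).
Procedure.i CharAt(Text.s, Index.i)
  ; Index * SizeOf(Character) is the byte offset of that character.
  ProcedureReturn PeekC(@Text + Index * SizeOf(Character))
EndProcedure

Debug Chr(CharAt("Please - test this string.", 7)) ; the dash, "-"
```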