Character Count of Unicode String

tkaltschmidt
User
Posts: 13
Joined: Sun Aug 11, 2024 1:15 pm
Location: Germany, Hannover

Post by tkaltschmidt »

Hi there,

I'm evaluating PureBasic and have to say: very impressed so far!

A small question: can PureBasic count the perceived characters of a Unicode string? The following code returns 2 characters, but there is only one. Am I missing the right function, or is this a known limitation?

Code:

EnableExplicit

Define MyString1$ = "😍"

; Len() counts 16-bit units, not perceived characters
Define MyLength.i = Len(MyString1$)
Debug MyLength ; shows 2

best, Thomas
DarkDragon
Addict
Posts: 2344
Joined: Mon Jun 02, 2003 9:16 am
Location: Germany

Re: Character Count of Unicode String

Post by DarkDragon »

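PureBasic stores strings internally as 16-bit wide characters (UCS-2 style), so the built-in string functions count 16-bit units and Unicode support is effectively limited to the Basic Multilingual Plane.

Your emoji is U+1F60D, which lies outside that plane. It is stored as two 16-bit units (a surrogate pair), which is why Len() returns 2.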
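
To make the two units visible, here is a minimal sketch that walks the string and prints each 16-bit unit. It assumes the source file is saved as UTF-8 and a Unicode build of PureBasic; the variable names are only for illustration.

```purebasic
EnableExplicit

Define s$ = "😍"
Define *p.Unicode = @s$

; Walk the string one 16-bit unit at a time until the terminating null
While *p\u
  Debug "$" + Hex(*p\u, #PB_Unicode)
  *p + SizeOf(Unicode)
Wend
; For U+1F60D this should list $D83D and then $DE0D
```

The first value falls in the high-surrogate range ($D800 to $DBFF), the second in the low-surrogate range ($DC00 to $DFFF); together they encode one code point.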
bye,
Daniel
miskox
Enthusiast
Posts: 107
Joined: Sun Aug 27, 2017 7:37 pm
Location: Slovenia

Re: Character Count of Unicode String

Post by miskox »

tkaltschmidt

Re: Character Count of Unicode String

Post by tkaltschmidt »

Thank you, Daniel and Saso!
Fred
Administrator
Posts: 18162
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Character Count of Unicode String

Post by Fred »

This module should do exactly what you want: https://www.purebasic.fr/english/viewto ... ilit=Utf16
tkaltschmidt

Re: Character Count of Unicode String

Post by tkaltschmidt »

Looks good, thank you, Fred!
idle
Always Here
Posts: 5836
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Character Count of Unicode String

Post by idle »

I just updated UTF16.pb to expose strLen(string$), but if all you need is the string length, this is enough:

Code:

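Procedure StrLen_(str.s)
  ; Counts code points instead of 16-bit units
  Protected *Char.Unicode = @str
  Protected cnt
  If *Char
    While *Char\u
      ; High surrogate followed by a low surrogate: one character in two units.
      ; Checking the second unit keeps a lone surrogate at the end of the
      ; string from moving the pointer past the terminating null.
      If *Char\u >= $D800 And *Char\u <= $DBFF And PeekU(*Char + 2) >= $DC00 And PeekU(*Char + 2) <= $DFFF
        *Char + 4
      Else
        *Char + 2
      EndIf
      cnt + 1
    Wend
  EndIf
  ProcedureReturn cnt
EndProcedure

Define example$ = "😁A😁😁K😁"
Debug StrLen_(example$) ; 6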
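
If the actual code point is needed rather than just the count, the same surrogate test can be extended to combine the two units. The following is only a sketch: CodePointAt is a hypothetical helper name, not something taken from UTF16.pb, and it assumes the pointer refers to a null-terminated PureBasic string.

```purebasic
EnableExplicit

; Hypothetical helper: returns the code point starting at *Char
Procedure.i CodePointAt(*Char.Unicode)
  Protected hi, lo
  If *Char = 0
    ProcedureReturn -1 ; no string to read
  EndIf
  hi = *Char\u
  If hi >= $D800 And hi <= $DBFF
    lo = PeekU(*Char + 2)
    If lo >= $DC00 And lo <= $DFFF
      ; Combine the 10 payload bits of each surrogate
      ProcedureReturn $10000 + ((hi - $D800) << 10) + (lo - $DC00)
    EndIf
  EndIf
  ProcedureReturn hi ; BMP character or unpaired surrogate
EndProcedure

Define s$ = "😍"
Debug "U+" + Hex(CodePointAt(@s$)) ; U+1F60D
```

Note that both this and StrLen_ work on code points. Sequences such as flags, skin-tone emoji, or a letter followed by a combining accent consist of several code points, so they would still count as more than one even though they are perceived as a single character.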
tkaltschmidt

Re: Character Count of Unicode String

Post by tkaltschmidt »

That's awesome, thanks, idle!

May I ask: what is the difference between UTF16.pb and UTF16a.pb?
idle

Re: Character Count of Unicode String

Post by idle »

UTF16a.pb additionally includes a mapping to strip accents.