Character Count of Unicode String
Posted: Mon Aug 19, 2024 3:31 pm
by tkaltschmidt
Hi there,
I'm evaluating PureBasic and have to say: very impressed so far!
Little question: Is PureBasic capable of counting the perceived characters of a Unicode string? The following code returns 2 characters, but there is only one. Am I missing the correct function, or is this a known limitation?
Code:
EnableExplicit
Define MyString1$ = "😍"
Define MyLength.i = Len(MyString1$)
Debug MyLength ; shows 2, although only one character is visible
best, Thomas
Re: Character Count of Unicode String
Posted: Mon Aug 19, 2024 3:40 pm
by DarkDragon
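PureBasic uses 16-bit wide characters (Widechar/UCS-2) as its internal string representation, so Unicode support in the string functions is limited to the Basic Multilingual Plane.
Your emoji is U+1F60D, which lies outside that plane. It is therefore stored as a surrogate pair of two 16-bit units, and Len() counts both.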
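To make the two units visible, here is a minimal sketch that walks the string one 16-bit unit at a time. It assumes a Unicode build of PureBasic and assumes that Chr() passes surrogate values through unchanged. The names emoji$ and *unit are only illustrative. In UTF-16, U+1F60D is encoded as the pair $D83D, $DE0D.

```purebasic
EnableExplicit

; Build U+1F60D from its two UTF-16 surrogate halves, so the example does not
; depend on how the source file is encoded (assumption: Chr() accepts them).
Define emoji$ = Chr($D83D) + Chr($DE0D)
Define *unit.Unicode = @emoji$

; Len() counts 16-bit units, not code points.
Debug Len(emoji$)

; Walk the string one 16-bit unit at a time and show each unit in hex.
While *unit\u
  Debug Hex(*unit\u)
  *unit + SizeOf(Unicode)
Wend
```

The debug output should list the length first, then one hex value per 16-bit unit, which shows why a single emoji is reported as more than one character.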
Re: Character Count of Unicode String
Posted: Mon Aug 19, 2024 3:51 pm
by miskox
Re: Character Count of Unicode String
Posted: Mon Aug 19, 2024 4:11 pm
by tkaltschmidt
Thank you, Daniel and Saso!
Re: Character Count of Unicode String
Posted: Mon Aug 19, 2024 4:59 pm
by Fred
Re: Character Count of Unicode String
Posted: Mon Aug 19, 2024 6:45 pm
by tkaltschmidt
Looks good, thank you, Fred!
Re: Character Count of Unicode String
Posted: Mon Aug 19, 2024 11:10 pm
by idle
I just updated UTF16.pb to expose strLen(string$), but if all you need is the string length, this will do:
Code:
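Procedure StrLen_(str.s)
  ; Counts code points instead of 16-bit units: a surrogate pair counts once.
  Protected *Char.Unicode
  Protected cnt
  *Char.Unicode = @str
  If *Char
    While *Char\u
      ; High surrogate ($D800-$DBFF) followed by a low surrogate ($DC00-$DFFF)?
      If *Char\u >= $D800 And *Char\u <= $DBFF And PeekU(*Char + 2) >= $DC00 And PeekU(*Char + 2) <= $DFFF
        *Char + 4 ; skip both halves of the pair
      Else
        *Char + 2 ; BMP character, or an unpaired surrogate (never step past the terminator)
      EndIf
      cnt + 1
    Wend
  EndIf
  ProcedureReturn cnt
EndProcedure

Define example$ = "😁A😁😁K😁"
Debug StrLen_(example$) ; 6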
Re: Character Count of Unicode String
Posted: Tue Aug 20, 2024 2:56 pm
by tkaltschmidt
That's awesome, thanks, idle!
May I ask: what is the difference between UTF16.pb and UTF16a.pb?
Re: Character Count of Unicode String
Posted: Tue Aug 20, 2024 8:22 pm
by idle
UTF16a.pb additionally includes a mapping to strip accents.