Chr and Asc bug


Post by User_Russian »

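Code:

Char.q = $81949FF0               ; bytes F0 9F 94 81 in memory (little-endian): UTF-8 for U+1F501
s.s = PeekS(@Char, 1, #PB_UTF8)  ; decode the UTF-8 bytes into a PureBasic string
ShowMemoryViewer(@s, 8)          ; inspect the string's UTF-16 buffer
Debug s
Debug Asc(s)                     ; returns only the first UTF-16 code unit
Debug Chr(Asc(s))                ; so the round trip does not reproduce s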

Post by useful »

https://www.compart.com/en/unicode/U+1F501
UTF-8 Encoding: 0xF0 0x9F 0x94 0x81
UTF-16 Encoding: 0xD83D 0xDD01
UTF-32 Encoding: 0x0001F501

Code:

EnableExplicit
Structure CHU
  ch_utf.a[4]
EndStructure
Define ccc.CHU
; UTF-8 encoding of U+1F501
ccc\ch_utf[0] = $F0
ccc\ch_utf[1] = $9F
ccc\ch_utf[2] = $94
ccc\ch_utf[3] = $81
ShowMemoryViewer(@ccc, 4)
Define sss.s = PeekS(@ccc, 1, #PB_UTF8)
Debug sss
Debug Len(sss)              ; 2, not 1: the character is stored as two UTF-16 code units
Debug StringByteLength(sss) ; 4 bytes
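
Since Len() counts UTF-16 code units, a rough way to count actual characters (code points) is to walk the string and skip the second half of every surrogate pair. This is only a sketch: CountCodePoints is a made-up name, not a built-in PureBasic function, and it assumes the string holds well-formed UTF-16.

```purebasic
EnableExplicit

; Hypothetical helper (not part of PureBasic): count code points in a string.
Procedure.i CountCodePoints(s.s)
  Protected *p.Unicode = @s
  Protected count.i = 0
  If *p = 0                      ; an empty string may have a null address
    ProcedureReturn 0
  EndIf
  While *p\u <> 0
    ; Low surrogates ($DC00..$DFFF) are the second unit of a pair: do not count them.
    If *p\u < $DC00 Or *p\u > $DFFF
      count + 1
    EndIf
    *p + SizeOf(Unicode)         ; advance by one 16-bit code unit
  Wend
  ProcedureReturn count
EndProcedure

Define bytes.q = $81949FF0       ; UTF-8 bytes F0 9F 94 81, followed by zero bytes
Define text.s = PeekS(@bytes, -1, #PB_UTF8)
Debug Len(text)                  ; code units
Debug CountCodePoints(text)      ; code points
```

For the string from this thread, the two Debug lines should differ: Len() sees two code units, while the helper treats the pair as one character.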

Post by User_Russian »

useful wrote: Sun Jan 28, 2024 1:21 pm
UTF-16 Encoding: 0xD83D 0xDD01

Chr() and Asc() don't support this: each handles a single 16-bit code unit, so a surrogate pair is cut in half.

Workaround:

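Code:

; Pack one character into a long: low word = first UTF-16 code unit,
; high word = second code unit of a surrogate pair (if any).
; Note: the result is the packed pair, not the Unicode code point.
Procedure.l NewAsc(s.s)
  Protected r = Asc(s)
  If r>=$D800 And r<=$DBFF ; high surrogate: a second code unit follows
    r | ((Asc(PeekS(@s+2, 1))&$FFFF)<<16)
  EndIf
  ProcedureReturn r
EndProcedure

; Inverse of NewAsc(): rebuild the one- or two-unit string.
Procedure.s NewChr(Char.l)
  Protected x = Char & $FFFF
  Protected r.s = Chr(x)
  If x>=$D800 And x<=$DBFF
    r + Chr((Char>>16) & $FFFF) ; mask, because Char may be negative as a signed long
  EndIf
  ProcedureReturn r
EndProcedure

Char = $81949FF0
s.s=PeekS(@Char, 1, #PB_UTF8)
ShowMemoryViewer(@s, 8)
Debug s
Debug NewChr(NewAsc(s))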

Post by STARGÅTE »

PureBasic can handle only the Basic Multilingual Plane (BMP) of the Unicode Standard,
i.e. code points from 0 to $FFFF.
For higher code points you need additional functions:
https://www.purebasic.fr/german/viewtop ... 14#p340514
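
For reference, here is a minimal sketch of what such additional functions could look like, using the standard UTF-16 surrogate formula. It is not the code from the link above. CodePointToString and StringToCodePoint are hypothetical names, and the sketch assumes Chr() accepts surrogate values ($D800..$DFFF), as the workaround earlier in this thread does.

```purebasic
EnableExplicit

; Hypothetical helper: encode a code point (0..$10FFFF) as a UTF-16 string.
Procedure.s CodePointToString(CodePoint.i)
  Protected v.i
  If CodePoint < $10000
    ProcedureReturn Chr(CodePoint)            ; BMP: a single code unit
  EndIf
  v = CodePoint - $10000                      ; 20-bit offset above the BMP
  ; High surrogate carries the top 10 bits, low surrogate the bottom 10 bits.
  ProcedureReturn Chr($D800 + (v >> 10)) + Chr($DC00 + (v & $3FF))
EndProcedure

; Hypothetical helper: decode the code point at the start of s.
Procedure.i StringToCodePoint(s.s)
  Protected hi.i = Asc(s)
  Protected lo.i
  If hi >= $D800 And hi <= $DBFF And Len(s) >= 2
    lo = Asc(Mid(s, 2, 1))
    If lo >= $DC00 And lo <= $DFFF
      ProcedureReturn $10000 + ((hi - $D800) << 10) + (lo - $DC00)
    EndIf
  EndIf
  ProcedureReturn hi                          ; BMP character or unpaired surrogate
EndProcedure

Define s.s = CodePointToString($1F501)
Debug Hex(Asc(s))                 ; first code unit
Debug Hex(Asc(Mid(s, 2, 1)))      ; second code unit
Debug Hex(StringToCodePoint(s))   ; decoded code point
```

By the formula, U+1F501 maps to the pair 0xD83D 0xDD01 quoted earlier in the thread, and decoding that pair gives $1F501 back. Unlike NewAsc()/NewChr(), these helpers work with the real code point rather than a packed pair of code units.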

Post by idle »

If you need UTF-16 support, take a look at my UTF16 module:
https://github.com/idle-PB/UTF16/blob/main/UTF16a.pb