Hello!
I am using emojis and they appear well if I paste them into the IDE, but while displaying them during execution I get corrupt results:
http://pastebin.com/WxBZ8WF5
Arial font.
Symbols appear corrupted
- marcoagpinto
- Addict

- Posts: 1076
- Joined: Sun Mar 10, 2013 3:01 pm
- Location: Portugal
- Contact:
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
It's probably the Arial font, which does not support these symbols?
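One rough way to check the font theory is to show the same character in a gadget with a different font. The sketch below assumes a Unicode build on Windows and a font that covers these code points ("Segoe UI Symbol" is the one named later in this thread). The window title, sizes and gadget numbers are made up for illustration.

```purebasic
; Sketch only: assumes a Unicode executable and that "Segoe UI Symbol" is installed.
; $D83D + $DD2B is the UTF-16 surrogate pair for code point 128299 ($1F52B).
If OpenWindow(0, 0, 0, 320, 120, "Emoji font test", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  TextGadget(0, 10, 10, 300, 100, Chr($D83D) + Chr($DD2B))
  If LoadFont(0, "Segoe UI Symbol", 24)
    SetGadgetFont(0, FontID(0)) ; swap the gadget font; comment this out to compare with the default
  EndIf
  Repeat
  Until WaitWindowEvent() = #PB_Event_CloseWindow
EndIf
```

If the symbol shows with this font but not with Arial, the string itself is fine and the font is the limiting factor.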
- marcoagpinto
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
Fred wrote: It's probably the Arial font, which does not support these symbols?
I could swear that I had the Arial font selected when I inserted the symbols in LibreOffice.
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
marcoagpinto wrote: Hello!
I am using emojis and they appear well if I paste them into the IDE, but while displaying them during execution I get corrupt results:
http://pastebin.com/WxBZ8WF5
Arial font.
It should be the same problem as you talked about in this thread: http://www.purebasic.fr/english/viewtopic.php?f=13&t=67687&p=501763#p501763
You need to realize that the code point for each of these emojis requires 4 bytes (a surrogate pair) when stored in a UTF-16 encoded string (the kind PureBasic uses), although it can also be stored as UTF-8 (like the source code). The problem usually occurs when either a literal string or a value composed using Chr() is used in the source code, as these may not be properly encoded for UTF-16. These problems can be avoided by using a buffer to hold the properly encoded value.
My example code in your other thread demonstrates another way to solve this problem for these high code point values in Unicode: using a custom function in place of Chr().
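One possible reading of the buffer approach, as a minimal sketch: write the two UTF-16 code units into memory yourself and read them back as a string. This assumes a Unicode executable. The variable names are illustrative, and the surrogate values are the ones for code point 128299 ($1F52B).

```purebasic
; Sketch only: assumes a Unicode (UTF-16) executable.
*buf = AllocateMemory(3 * SizeOf(Unicode))     ; two code units plus a terminating null
If *buf
  PokeU(*buf, $D83D)                           ; high (lead) surrogate of $1F52B
  PokeU(*buf + SizeOf(Unicode), $DD2B)         ; low (tail) surrogate of $1F52B
  PokeU(*buf + 2 * SizeOf(Unicode), 0)         ; null terminator
  Text.s = PeekS(*buf, -1, #PB_Unicode)        ; read the buffer back as a UTF-16 string
  FreeMemory(*buf)
  Debug Text
EndIf
```

Whether the symbol then shows up correctly still depends on the font used for display.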
- marcoagpinto
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
Demivec wrote: It should be the same problem as you talked about in this thread: http://www.purebasic.fr/english/viewtopic.php?f=13&t=67687&p=501763#p501763
You need to realize that the code point for each of these emojis requires 4 bytes (a surrogate pair) when stored in a UTF-16 encoded string (the kind PureBasic uses), although it can also be stored as UTF-8 (like the source code). The problem usually occurs when either a literal string or a value composed using Chr() is used in the source code, as these may not be properly encoded for UTF-16. These problems can be avoided by using a buffer to hold the properly encoded value.
My example code in your other thread demonstrates another way to solve this problem for these high code point values in Unicode: using a custom function in place of Chr().
My friend,
Could you explain how I can convert the symbols to ASCII values so that I can use your function?
Asc(symbol) gave me the same value for both emojis.
Thank you,
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
1. I use "Segoe UI Symbol" for the Debugger, because it supports (most?) emoji and other symbols.
Tested "Arial" font - did not show symbols.
Back to "Segoe UI Symbol" - output changed to symbols, without even re-running the test program.
2. Asc() gave you the same value, because Asc() only returns a 16-bit value... so you're only getting part of any codepoint over $FFFF (the "high surrogate" half).
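To make point 2 concrete, here is a small worked sketch of the surrogate arithmetic for the two code points used in the test code, 128299 ($1F52B) and 128294 ($1F526). The variable names are only illustrative. Both code points end up with the same high surrogate, $D83D (55357), which is the value a 16-bit Asc() would report for either emoji.

```purebasic
; Sketch only: plain integer arithmetic, no string handling involved.
Define cp.i, offset.i, high.i, low.i

cp = 128299                        ; $1F52B
offset = cp - $10000               ; $F52B, the 20-bit value that gets split in two
high = $D800 + (offset >> 10)      ; top 10 bits    -> $D83D
low  = $DC00 + (offset & $3FF)     ; bottom 10 bits -> $DD2B
Debug "$" + Hex(high) + " $" + Hex(low)

cp = 128294                        ; $1F526
offset = cp - $10000               ; $F526
high = $D800 + (offset >> 10)      ; same top 10 bits -> $D83D again
low  = $DC00 + (offset & $3FF)     ; -> $DD26
Debug "$" + Hex(high) + " $" + Hex(low)
```

Only the low surrogate differs between the two symbols, so telling them apart requires reading both halves.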
From http://www.purebasic.fr/english/viewtop ... 12&t=64947
Code: Select all
Procedure.s _Chr(v.i) ; return a proper surrogate pair for Unicode values outside the BMP (Basic Multilingual Plane)
  Protected high, low
  If v < $10000
    ProcedureReturn Chr(v)
  Else
    ; calculate the surrogate pair of UTF-16 code units that represents the value
    v - $10000
    high = v / $400 + $D800 ; high/lead surrogate value
    low = v % $400 + $DC00 ; low/tail surrogate value
    ProcedureReturn Chr(high) + Chr(low)
  EndIf
EndProcedure

Procedure _Asc(u$) ; return a proper code point value for a UTF-16 surrogate pair
  Protected *u = @u$, high = PeekU(*u), low
  Select high
    Case 0 To $D7FF, $DC00 To $FFFF ; includes range for low surrogate value ($DC00 to $DFFF)
      ProcedureReturn high ; return value as is (may be an unmatched low surrogate value)
    Case $D800 To $DBFF
      low = PeekU(*u + SizeOf(Unicode))
      If low >= $DC00 And low <= $DFFF ; explicit range check; the bit mask test also accepted $FC00 to $FFFF
        ProcedureReturn (high - $D800) * $400 + (low - $DC00) + $10000 ; return decoded surrogate pair
      EndIf
      ProcedureReturn high ; an unmatched high surrogate value, return value as is
  EndSelect
EndProcedure

Text.s = _Chr(128299)
Debug Text
Debug Asc(Text)
Debug _Asc(Text)
Text.s = _Chr(128294)
Debug Text
Debug Asc(Text)
Debug _Asc(Text)
- marcoagpinto
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
Thank you, my friend!
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
Interesting and eye-opening thread, thanks to all posters. I've had no problems getting them to display in Win7-64 (tested both PB x86 and x64), but I've had no luck getting them to display in XP-32 (even using the exact same Segoe UI .ttf file from Win7). Not sure if that's just a limitation of XP or of a 32-bit OS (I haven't tried any other 32-bit Windows), or maybe I'm just doing something wrong, or maybe my XP VM sucks. Actually, I know my XP VM sucks, but I trust it more than my Win10 VM, which sucks more.
Re: [PB5.44][PB5.60b6] Symbols appear corrupted
Keya wrote: Interesting and eye-opening thread, thanks to all posters. I've had no problems getting them to display in Win7-64 (tested both PB x86 and x64), but I've had no luck getting them to display in XP-32 (even using the exact same Segoe UI .ttf file from Win7). Not sure if that's just a limitation of XP or of a 32-bit OS (I haven't tried any other 32-bit Windows), or maybe I'm just doing something wrong, or maybe my XP VM sucks. Actually, I know my XP VM sucks, but I trust it more than my Win10 VM, which sucks more.
Maybe XP has an older version of the Segoe UI font, from before the emoji code points were added to Unicode. They were added in version 6.0 of the Unicode Standard, released in October 2010. You would have to install a later version of the font on XP to see the characters.
@Edit: Added details on the codepoints in question.
