PB 6.20: Emojis above 65535?

marcoagpinto
Addict
Posts: 1076
Joined: Sun Mar 10, 2013 3:01 pm
Location: Portugal

PB 6.20: Emojis above 65535?

Post by marcoagpinto »

Heya,

Since PB 6.20 accepts loading text in ASCII, UTF-8, and UTF-16, does it mean that I can now display emojis above chr 65535 in UTF-8?

Thanks!
jacdelad
Addict
Posts: 2030
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: PB 6.20: Emojis above 65535?

Post by jacdelad »

marcoagpinto wrote: Sun Feb 16, 2025 8:55 am Heya,

Since PB 6.20 accepts loading text in ASCII, UTF-8, and UTF-16,[...]
Hi,
where did you get this information? I can't find it.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
marcoagpinto
Addict
Posts: 1076
Joined: Sun Mar 10, 2013 3:01 pm
Location: Portugal

Re: PB 6.20: Emojis above 65535?

Post by marcoagpinto »

jacdelad wrote: Sun Feb 16, 2025 9:01 am
marcoagpinto wrote: Sun Feb 16, 2025 8:55 am Heya,

Since PB 6.20 accepts loading text in ASCII, UTF-8, and UTF-16,[...]
Hi,
where did you get this information? I can't find it.

Code:

ReadFile(1, File$)
string_format = ReadStringFormat(1)

Press F1 on:

Code:

ReadStringFormat(1)
infratec
Always Here
Posts: 7662
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: PB 6.20: Emojis above 65535?

Post by infratec »

:?: :?: :?:

For a long time the documentation has said that ReadStringFormat() can detect many formats, but only
#PB_Ascii, #PB_UTF8 and #PB_Unicode can be used directly.

And what does this have to do with "does it mean that I can now display emojis above chr 65535 in UTF-8"?

A character encoded as UTF-8 can take up to 4 bytes.
Whether you can display all of these characters depends on the font you are using.
I think you have to switch to an emoji font for the emojis you want and then back to your normal font.
HeX0R
Addict
Posts: 1218
Joined: Mon Sep 20, 2004 7:12 am
Location: Hell

Re: PB 6.20: Emojis above 65535?

Post by HeX0R »

Code:

l = $81989FF0 ; bytes F0 9F 98 81 in memory (little endian): U+1F601 in UTF-8
a$ = PeekS(@l, 4, #PB_UTF8 | #PB_ByteLength) ; read exactly 4 bytes
Debug a$
infratec
Always Here
Posts: 7662
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: PB 6.20: Emojis above 65535?

Post by infratec »

Code:

; https://github.com/idle-PB/UTF16/blob/main/UTF16.pb

IncludeFile "UTF16.pb"

UseModule UTF16

If LoadFont(0, "Segoe UI Emoji", 14)
  SetGadgetFont(#PB_Default, FontID(0))
  Debug "Ok"
EndIf

If OpenWindow(0, 0, 0, 322, 150, "EditorGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  EditorGadget(0, 8, 8, 306, 133)
  For a = 0 To 9
    AddGadgetItem(0, a, Hex(128512 + a) + ": " + StrChr(128512 + a))
  Next
  Repeat : Until WaitWindowEvent() = #PB_Event_CloseWindow
EndIf
infratec
Always Here
Posts: 7662
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: PB 6.20: Emojis above 65535?

Post by infratec »

Extended version of HeX0R's example:

Code:

; https://www.compart.com/en/unicode/U+1F600


If LoadFont(0, "Segoe UI Emoji", 14)
  SetGadgetFont(#PB_Default, FontID(0))
  Debug "Ok"
EndIf

;0xF0 0x9F 0x98 0x80  emoji 1F600 in utf8 (see link above)
l = $80989FF0 ; in little endian

If OpenWindow(0, 0, 0, 322, 150, "EditorGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  EditorGadget(0, 8, 8, 306, 133)
  AddGadgetItem(0, -1, "Emoji: " + PeekS(@l, 4, #PB_UTF8 | #PB_ByteLength)) ; -1 appends the line
  Repeat : Until WaitWindowEvent() = #PB_Event_CloseWindow
EndIf
If you use a font that includes the emojis, you can display them :wink:
marcoagpinto
Addict
Posts: 1076
Joined: Sun Mar 10, 2013 3:01 pm
Location: Portugal

Re: PB 6.20: Emojis above 65535?

Post by marcoagpinto »

Heya,

Why aren't you using Chr() in the examples?

I tried to convert the following values from the Portuguese LibreOffice autocorrect file to emojis:

Code:

<block-list:block block-list:abbreviated-name=":zebra:" block-list:name="&#x1F993;"/>
<block-list:block block-list:abbreviated-name=":zeta:" block-list:name="&#x3B6;"/>
<block-list:block block-list:abbreviated-name=":Zeta:" block-list:name="&#x396;"/>
<block-list:block block-list:abbreviated-name=":zombie:" block-list:name="&#x1F9DF;"/>
<block-list:block block-list:abbreviated-name=":zzz:" block-list:name="&#x1F4A4;"/>
And they all appear corrupted.

However, among the many emojis in the file, some appear correctly.

I am using the Arial font.

Thanks!
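
A note on the values quoted above: the block-list:name attributes are XML numeric character references, so they first have to be turned into code points. The following is only a minimal sketch, assuming every entry has the hexadecimal form "&#x...;". EntityToCodePoint is a made-up name for illustration, not a PureBasic function.

```purebasic
; Hypothetical helper (not a PureBasic built-in): converts an XML numeric
; character reference of the form "&#xHHHH;" to its code point.
; Returns -1 if the text does not have that form.
Procedure.i EntityToCodePoint(Entity.s)
  If Left(Entity, 3) = "&#x" And Right(Entity, 1) = ";"
    ; Val() parses hexadecimal when the number is prefixed with "$"
    ProcedureReturn Val("$" + Mid(Entity, 4, Len(Entity) - 4))
  EndIf
  ProcedureReturn -1
EndProcedure

Debug Hex(EntityToCodePoint("&#x1F993;")) ; zebra -> 1F993
Debug Hex(EntityToCodePoint("&#x3B6;"))   ; zeta  -> 3B6
```

Three of the five values quoted above (zebra, zombie, zzz) are larger than $FFFF, while zeta and Zeta are not.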
idle
Always Here
Posts: 6026
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: PB 6.20: Emojis above 65535?

Post by idle »

Chr() will fail with the debugger enabled, because it is intended to return a single UCS-2 character.
kenmo
Addict
Posts: 2051
Joined: Tue Dec 23, 2003 3:54 am

Re: PB 6.20: Emojis above 65535?

Post by kenmo »

PureBasic has "unofficially" supported emoji and other characters >$FFFF for years. Although PB counts all characters as fixed 16-bit (UCS-2?) they seem to be treated as UTF-16 by the operating system when rendered.

But the provided Chr() still doesn't accept higher codepoints and won't return a UTF-16 surrogate pair string.

See my ChrU() procedure or Demivec's _Chr() procedure here:
https://www.purebasic.fr/english/viewtopic.php?t=66836
https://www.purebasic.fr/english/viewtopic.php?t=64947
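
For readers who only want the idea behind such a replacement: a code point above $FFFF is split into two 16-bit surrogates. The following is a minimal sketch of that arithmetic. It is not kenmo's ChrU() or Demivec's _Chr(); the name ChrSurrogate is made up for illustration, and it assumes a Unicode build in which Chr() accepts values in the surrogate range.

```purebasic
; Hypothetical helper: returns a string for any code point up to $10FFFF.
; Code points above $FFFF become a UTF-16 surrogate pair (two 16-bit units).
Procedure.s ChrSurrogate(CodePoint.i)
  Protected v.i
  If CodePoint < $10000
    ProcedureReturn Chr(CodePoint)      ; fits into one 16-bit unit
  EndIf
  v = CodePoint - $10000                ; 20 bits remain
  ; high surrogate = $D800 + upper 10 bits, low surrogate = $DC00 + lower 10 bits
  ProcedureReturn Chr($D800 + (v >> 10)) + Chr($DC00 + (v & $3FF))
EndProcedure

Define s.s = ChrSurrogate($1F600)
Debug Len(s)                ; 2, because PB counts 16-bit units
Debug Hex(Asc(Left(s, 1)))  ; D83D
Debug Hex(Asc(Right(s, 1))) ; DE00
```

Whether the resulting string is rendered as one emoji still depends on the operating system and the font.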
Little John
Addict
Posts: 4802
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: PB 6.20: Emojis above 65535?

Post by Little John »

marcoagpinto wrote: Sun Feb 16, 2025 7:00 pm Why aren't you using Chr() in the examples?
For Unicode codepoints above $FFFF (= 65535), Chr() cannot be used.
Use this replacement instead.

//edit: kenmo was a few seconds quicker. :-)
Last edited by Little John on Sun Feb 16, 2025 8:12 pm, edited 1 time in total.
infratec
Always Here
Posts: 7662
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: PB 6.20: Emojis above 65535?

Post by infratec »

Which font did you use?

Code:

; https://www.compart.com/en/unicode/U+1F600


; zebra:  1F993 -> UTF8 = 0xF0 0x9F 0xA6 0x93
; zeta:   3B6   -> UTF8 = 0xCE 0xB6
; Zeta:   396   -> UTF8 = 0xCE 0x96
; zombie: 1F9DF -> UTF8 = 0xF0 0x9F 0xA7 0x9F
; zzz:    1F4A4 -> UTF8 = 0xF0 0x9F 0x92 0xA4

#zebra = $93a69ff0
#lzeta = $b6ce
#uZeta = $96ce
#zombie = $9fa79ff0
#zzz = $a4929ff0

Define Emoji.l

If LoadFont(0, "Segoe UI Emoji", 14)
  SetGadgetFont(#PB_Default, FontID(0))
EndIf


If OpenWindow(0, 0, 0, 322, 150, "EditorGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  EditorGadget(0, 8, 8, 306, 133)
  Emoji = #zebra
  AddGadgetItem(0, 1, PeekS(@Emoji, 4, #PB_UTF8|#PB_ByteLength))
  Emoji = #lzeta
  AddGadgetItem(0, 2, PeekS(@Emoji, 2, #PB_UTF8|#PB_ByteLength))
  Emoji = #uZeta
  AddGadgetItem(0, 3, PeekS(@Emoji, 2, #PB_UTF8|#PB_ByteLength))
  Emoji = #zombie
  AddGadgetItem(0, 4, PeekS(@Emoji, 4, #PB_UTF8|#PB_ByteLength))
  Emoji = #zzz
  AddGadgetItem(0, 5, PeekS(@Emoji, 4, #PB_UTF8|#PB_ByteLength))
  Repeat : Until WaitWindowEvent() = #PB_Event_CloseWindow
EndIf
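
The byte values in the comments above follow from the UTF-8 encoding rules, so they can also be computed at run time instead of being hard-coded as little-endian constants. A minimal sketch: CodePointToString is a hypothetical name, and the sketch assumes, as the examples in this thread suggest, that PeekS() with #PB_UTF8 turns a 4-byte sequence into a displayable string.

```purebasic
; Hypothetical helper: encodes one code point as UTF-8 in a small buffer
; and lets PeekS() decode it into a normal PureBasic string.
Procedure.s CodePointToString(CodePoint.i)
  Protected Dim b.a(4)  ; up to 4 data bytes plus one spare zero byte
  Protected n.i
  If CodePoint < $80                      ; 1 byte:  0xxxxxxx
    b(0) = CodePoint : n = 1
  ElseIf CodePoint < $800                 ; 2 bytes: 110xxxxx 10xxxxxx
    b(0) = $C0 | (CodePoint >> 6)
    b(1) = $80 | (CodePoint & $3F) : n = 2
  ElseIf CodePoint < $10000               ; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    b(0) = $E0 | (CodePoint >> 12)
    b(1) = $80 | ((CodePoint >> 6) & $3F)
    b(2) = $80 | (CodePoint & $3F) : n = 3
  Else                                    ; 4 bytes: 11110xxx and three 10xxxxxx
    b(0) = $F0 | (CodePoint >> 18)
    b(1) = $80 | ((CodePoint >> 12) & $3F)
    b(2) = $80 | ((CodePoint >> 6) & $3F)
    b(3) = $80 | (CodePoint & $3F) : n = 4
  EndIf
  ProcedureReturn PeekS(@b(0), n, #PB_UTF8 | #PB_ByteLength)
EndProcedure

Debug CodePointToString($3B6)    ; zeta, encoded as CE B6
Debug CodePointToString($1F993)  ; zebra, encoded as F0 9F A6 93
```

In the window example, a call such as AddGadgetItem(0, -1, CodePointToString($1F993)) would then replace the constant plus PeekS() pair.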

marcoagpinto
Addict
Posts: 1076
Joined: Sun Mar 10, 2013 3:01 pm
Location: Portugal

Re: PB 6.20: Emojis above 65535?

Post by marcoagpinto »

infratec wrote: Sun Feb 16, 2025 8:11 pm Which font did you use ???

[...]

I have been using Arial for over 10 years.

Maybe it is time to switch fonts?

Which one do you advise for Windows, Linux and Mac?

Thanks!
kenmo
Addict
Posts: 2051
Joined: Tue Dec 23, 2003 3:54 am

Re: PB 6.20: Emojis above 65535?

Post by kenmo »

Little John wrote: Sun Feb 16, 2025 8:09 pm //edit: kenmo was a few seconds quicker. :-)
8)
infratec
Always Here
Posts: 7662
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: PB 6.20: Emojis above 65535?

Post by infratec »

As written in my first answer: you need a font that contains all the emojis.
Arial is not such a font.

You can try

Code:

LoadFont(0, "Noto Color Emoji", 14)
Maybe this font is available on all operating systems if LibreOffice is installed.
But you need such an emoji font only for the emojis. The other text can be displayed in Arial.
Otherwise your program has to ship this font.
It is a free Google font:
https://fonts.google.com/noto/specimen/Noto+Color+Emoji
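
The point about needing the emoji font only for the emojis can be sketched by setting the font on a single gadget instead of changing the default. This is a minimal sketch, assuming "Noto Color Emoji" is installed; whether the glyphs are actually rendered in colour, or at all, depends on the operating system and the gadget type.

```purebasic
; Sketch: normal text keeps the default font, only gadget 1 gets the emoji font.
; Assumption: a font named "Noto Color Emoji" is installed on the system.
Define Emoji.l = $80989FF0 ; UTF-8 bytes F0 9F 98 80 (U+1F600), little endian

If OpenWindow(0, 0, 0, 322, 100, "Two fonts", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  TextGadget(0, 8, 8, 306, 24, "Normal text in the default font")
  TextGadget(1, 8, 40, 306, 40, PeekS(@Emoji, 4, #PB_UTF8 | #PB_ByteLength))
  If LoadFont(0, "Noto Color Emoji", 14)
    SetGadgetFont(1, FontID(0)) ; applies to gadget 1 only
  EndIf
  Repeat : Until WaitWindowEvent() = #PB_Event_CloseWindow
EndIf
```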