PB 5.73 - PlainText vs UTF-8 difference

Posted: Tue May 25, 2021 10:50 am
by StarBootics
Hello everyone,

Apparently, a program compiled from plain-text source code displays French characters correctly in the final executable. On the other hand, a program compiled from UTF-8 source code displays French characters incorrectly in the final executable. See the screen capture here:

https://www.dropbox.com/s/ml307vnybpeqm ... 8.png?dl=0

In both cases the source code looks exactly the same. Save this one as Test UTF-8.pb, making sure the file format is UTF-8:

Code: Select all

If OpenWindow(0,400,300,400,300,"Test UTF8",#PB_Window_SystemMenu) 
  TextGadget(0, 10, 10,380,25,"Gadget texte standard (texte aligné à gauche)")
  TextGadget(1, 10, 40,380,25,"Gadget texte (texte aligné à droite)", #PB_Text_Right)
  TextGadget(2, 10, 70,380,25,"Gadget texte (texte centré)",#PB_Text_Center)
  TextGadget(3, 10,100,380,25,"Gadget texte avec bordure",#PB_Text_Border)
  TextGadget(4, 10,130,380,25,"Gadget texte (texte centré) + bordure", #PB_Text_Center | #PB_Text_Border)
  Repeat : Until WaitWindowEvent()=#PB_Event_CloseWindow
EndIf
Save this one as Test TextBrute.pb, making sure the file format is plain text:

Code: Select all

If OpenWindow(0,400,300,400,300,"Test Text brute",#PB_Window_SystemMenu) 
  TextGadget(0, 10, 10,380,25,"Gadget texte standard (texte aligné à gauche)")
  TextGadget(1, 10, 40,380,25,"Gadget texte (texte aligné à droite)", #PB_Text_Right)
  TextGadget(2, 10, 70,380,25,"Gadget texte (texte centré)",#PB_Text_Center)
  TextGadget(3, 10,100,380,25,"Gadget texte avec bordure",#PB_Text_Border)
  TextGadget(4, 10,130,380,25,"Gadget texte (texte centré) + bordure", #PB_Text_Center | #PB_Text_Border)
  Repeat : Until WaitWindowEvent()=#PB_Event_CloseWindow
EndIf
I don't know exactly what, but something seems to be wrong somewhere in the compiler.

PB 5.73 LTS x64
Ubuntu 21.04 x64

Workaround: always save your source code in plain text format.

Best regards
StarBootics
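
The screenshot itself is not reproduced in this thread, so the exact garbling is unknown. As a minimal sketch, assuming the symptom is the usual one where UTF-8 bytes are read one byte per character, the following Python snippet shows what one of the gadget strings would look like. The function name `misread_utf8_as_latin1` is hypothetical, and Latin-1 is only an assumed stand-in for whatever single-byte interpretation might be involved.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode each byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


# One of the gadget captions from the test program above.
original = "Gadget texte (texte aligné à gauche)"
garbled = misread_utf8_as_latin1(original)

print(garbled)

# The damage is reversible as long as no byte was lost along the way.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored == original)
```

Each accented letter takes two bytes in UTF-8, so it turns into two characters when misread this way; for instance "é" becomes "Ã©". If the executable shows something different, this assumed mechanism is probably not the cause.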

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Sun Jul 02, 2023 9:33 am
by Fred
Seems to work as expected here. Can anybody else confirm?

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 12:28 am
by StarBootics
Just re-tested with PB V6.03 Beta 2 x64: when I save the source code as UTF-8, French characters don't show up correctly. See the screenshot uploaded to my Dropbox account.

Everything is OK when the source code is saved as Plain text.

Tested under Debian 11 x64

Best regards
StarBootics

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 9:17 am
by Fred
Maybe converting from plain text to UTF-8 doesn't work well on your box due to some locale issue. If you type French characters directly in the UTF-8 file, does it work?

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 9:29 am
by StarBootics
Fred wrote: Mon Jul 03, 2023 9:17 am Maybe converting from plain text to UTF-8 doesn't work well on your box due to some locale issue. If you type French characters directly in the UTF-8 file, does it work?
Yes, it works. Converting the source code back and forth between plain text and UTF-8 works absolutely fine. The problem only becomes apparent in the executable compiled from UTF-8 encoded source code. If I use gedit to look at the source code saved as plain text or as UTF-8, both files show the French accented characters correctly.

The locale I'm using is fr_CA (Canadian French).

Best regards
StarBootics
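
Since gedit renders both files identically, the editor view cannot tell them apart; looking at the raw bytes can. Below is a rough sketch in Python, not a full encoding detector. The function name `describe_encoding` and its labels are made up for illustration. Whether the PureBasic compiler depends on a byte order mark (BOM) to recognise UTF-8 is an assumption that this thread does not confirm; the script only reports what is in the file.

```python
import codecs
import sys


def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes with a simple heuristic."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Accented letters stored as single bytes are not valid UTF-8.
        return "not valid utf-8 (likely a single-byte encoding)"
    if all(byte < 0x80 for byte in data):
        return "pure ascii"
    return "utf-8 without BOM"


if __name__ == "__main__":
    # Usage: python3 check_encoding.py "Test UTF-8.pb" "Test TextBrute.pb"
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, "->", describe_encoding(handle.read()))
```

Running it on both test files would show whether the two really differ on disk and, if so, how.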

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 10:07 am
by Fred
Could you send me your UTF-8 source?

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 10:15 am
by StarBootics

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 10:22 am
by Fred
Here is my locale output, set to UTF-8 (I didn't change anything, so I guess it's the Ubuntu default):

Code: Select all

fred@ubuntu:~/svn/v5.80$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Re: PB 5.73 - PlainText vs UTF-8 difference

Posted: Mon Jul 03, 2023 10:44 am
by StarBootics
Here is mine, on Debian 11 x64:

Code: Select all

LANG=fr_CA.UTF-8
LANGUAGE=fr_CA:fr
LC_CTYPE="fr_CA.UTF-8"
LC_NUMERIC="fr_CA.UTF-8"
LC_TIME="fr_CA.UTF-8"
LC_COLLATE="fr_CA.UTF-8"
LC_MONETARY="fr_CA.UTF-8"
LC_MESSAGES="fr_CA.UTF-8"
LC_PAPER="fr_CA.UTF-8"
LC_NAME="fr_CA.UTF-8"
LC_ADDRESS="fr_CA.UTF-8"
LC_TELEPHONE="fr_CA.UTF-8"
LC_MEASUREMENT="fr_CA.UTF-8"
LC_IDENTIFICATION="fr_CA.UTF-8"
LC_ALL=
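
The two locale listings differ in language and territory (en_US versus fr_CA) but declare the same character set. As a small sketch for comparing them mechanically, the Python helpers below parse `locale` output and extract the codeset part. The names `parse_locale` and `codeset` are hypothetical, and the sample strings are shortened copies of the listings above.

```python
def parse_locale(output: str) -> dict:
    """Turn the text printed by the `locale` command into a dict."""
    settings = {}
    for line in output.splitlines():
        if "=" not in line:
            continue  # e.g. a pasted shell prompt line
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def codeset(value: str) -> str:
    """Return the codeset of a locale name: 'fr_CA.UTF-8' -> 'UTF-8'."""
    _, _, rest = value.partition(".")
    return rest.partition("@")[0]  # drop a modifier such as '@euro'


# Shortened samples taken from the two posts above.
fred = parse_locale('LANG=en_US.UTF-8\nLC_CTYPE="en_US.UTF-8"\nLC_ALL=')
star = parse_locale('LANG=fr_CA.UTF-8\nLANGUAGE=fr_CA:fr\nLC_CTYPE="fr_CA.UTF-8"\nLC_ALL=')

same_codeset = codeset(fred["LC_CTYPE"]) == codeset(star["LC_CTYPE"])
print("Same codeset:", same_codeset)
```

Both systems report UTF-8 for LC_CTYPE, so a missing UTF-8 locale does not explain the difference by itself.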