Error when open file in UTF-8 w/o BOM

Working on new editor enhancements?
Allen
User
Posts: 92
Joined: Wed Nov 10, 2021 2:05 am

Error when open file in UTF-8 w/o BOM

Post by Allen »

Hi,

I found an error when loading a file stored as UTF-8 without a BOM. Please copy and paste the code below into Windows Notepad and save the file as test.pb in UTF-8 (not UTF-8 with BOM). Then load and run it in the IDE. Can you confirm whether this is intended behavior? Can the IDE accept a file saved as plain UTF-8?

Code:

Global.s No$="零一二三四五六七八九十百千萬"
Debug No$

Thanks

Allen
mk-soft
Always Here
Posts: 5393
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Error when open file in UTF-8 w/o BOM

Post by mk-soft »

In the PB settings, you must set the source file text encoding under Compiler -> Defaults.
By default, the file is always interpreted as UTF-8 (which is also the better choice).
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
Allen
User
Posts: 92
Joined: Wed Nov 10, 2021 2:05 am

Re: Error when open file in UTF-8 w/o BOM

Post by Allen »

Thanks for the advice.

Under File > Preferences > Compiler > Defaults > Source File Text Encoding

there are two choices, Plain Text and UTF-8. I had already chosen UTF-8, but it did not work.

I made two identical files in UTF-8, one with a BOM and one without, and loaded them in the IDE. The files look exactly the same, but the one without a BOM did not run properly. Even with a file compare function, the files look identical.

Any suggestions?
STARGÅTE
Addict
Posts: 2085
Joined: Thu Jan 10, 2008 1:30 pm
Location: Germany, Glienicke

Re: Error when open file in UTF-8 w/o BOM

Post by STARGÅTE »

A file without a BOM is always interpreted as ASCII.
If there is no BOM in the file, where should the string format be stored?

UTF-8 files must have a BOM to load correctly as UTF-8 in the IDE, independent of the default configuration settings.
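
To see the difference that the IDE and a text-based file compare do not show, the first bytes of each file can be inspected. The following is only a minimal sketch and not from the posts in this thread: `HasUTF8BOM` is a hypothetical helper name, and "test.pb" is assumed to be in the current directory. It checks whether a file starts with the three UTF-8 BOM bytes EF BB BF.

```purebasic
; Hypothetical helper (assumption, not part of PureBasic):
; returns #True if the file starts with the UTF-8 BOM bytes $EF $BB $BF.
Procedure.i HasUTF8BOM(FileName.s)
  Protected file, result = #False
  file = ReadFile(#PB_Any, FileName)
  If file
    ; A BOM needs at least three bytes
    If Lof(file) >= 3
      ; ReadAsciiCharacter() returns one unsigned byte (0-255)
      If ReadAsciiCharacter(file) = $EF And ReadAsciiCharacter(file) = $BB And ReadAsciiCharacter(file) = $BF
        result = #True
      EndIf
    EndIf
    CloseFile(file)
  EndIf
  ProcedureReturn result
EndProcedure

; Assumed file name, taken from the first post
Debug HasUTF8BOM("test.pb")
```

Running this on the two otherwise identical files should report a BOM for one and none for the other, which is the only difference between them.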
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and more
Typeface - Sprite-based font include/module
mk-soft
Always Here
Posts: 5393
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Error when open file in UTF-8 w/o BOM

Post by mk-soft »

Any file without a BOM is always opened as ASCII, because the file could have been saved as plain text.

Small helper ... only for UTF-8 files with a missing BOM (invalid file format)

Code:

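;-TOP

; Writes a copy of FileName with a UTF-8 BOM, saved as FileName + ".utf8".
; Returns the new file name, or an empty string if nothing was written.
Procedure.s AddUTF8BOM(FileName.s)
  Protected file, newfile, NewFileName.s, ft, content.s
  file = ReadFile(#PB_Any, FileName)
  If file
    ; ReadStringFormat() reports #PB_Ascii when no BOM is found
    ft = ReadStringFormat(file)
    If ft = #PB_Ascii
      newfile = CreateFile(#PB_Any, FileName + ".utf8")
      If newfile
        ; Write the UTF-8 BOM first
        WriteStringFormat(newfile, #PB_UTF8)
        ; Read the whole file as UTF-8, keeping the line breaks in the string
        content = ReadString(file, #PB_UTF8 | #PB_File_IgnoreEOL)
        ; WriteString() avoids appending an extra newline to the copy
        WriteString(newfile, content, #PB_UTF8)
        CloseFile(newfile)
        ; Set the return value only when the copy was actually written
        NewFileName = FileName + ".utf8"
      EndIf
    EndIf
    CloseFile(file)
  EndIf
  ProcedureReturn NewFileName
EndProcedure

file.s = OpenFileRequester("Textfile", "", "", 0)
; OpenFileRequester() returns an empty string if the dialog was cancelled
If file <> ""
  r1.s = AddUTF8BOM(file)
  Debug r1
EndIf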

Allen
User
Posts: 92
Joined: Wed Nov 10, 2021 2:05 am

Re: Error when open file in UTF-8 w/o BOM

Post by Allen »

Thanks STARGÅTE for the clarification.
Thanks mk-soft for the example.

Allen