What should ReadStringFormat(#File) return?

AZJIO
Addict
Posts: 2143
Joined: Sun May 14, 2017 1:48 am

Post by AZJIO »

simkot wrote: Wed Feb 26, 2025 7:11 am.
For example, in AutoIt this is solved in one line:

FileGetEncoding ( "filehandle/filename" [, mode = 1] )
In this case, UTF-8 without BOM is also distinguished from ANSI.
Detecting a file's encoding from its BOM is the classic approach. Detecting it from the content is a guessing algorithm. The developers could add a function like CheckDataEncoding(*p, length), but such a function cannot be part of ReadStringFormat() in any way.
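To make the difference concrete, here is a rough sketch in Python (not PureBasic; chosen only because it is easy to run). `detect_bom` is the BOM check, and `check_data_encoding` is a hypothetical stand-in for the CheckDataEncoding(*p, length) idea: it only tests whether the bytes are valid UTF-8. Both names and the returned labels are assumptions for illustration, not real PureBasic or AutoIt functions.

```python
import codecs

# BOM signatures. UTF-32 must be tested before UTF-16 because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-bom"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return an encoding label if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def check_data_encoding(data: bytes) -> str:
    """Hypothetical content-based guess, meant for data without a BOM."""
    if not data:
        return "unknown"
    if all(b < 0x80 for b in data):
        return "ascii"  # pure 7-bit data is the same in ANSI and UTF-8
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "ansi"  # some 8-bit code page; which one is not determined
    return "utf-8"  # valid UTF-8, which is likely but not certain
```

The first function gives a definite answer or none at all. The second always answers, but it can be wrong, since a short 8-bit text may happen to be valid UTF-8, and it says nothing about which ANSI code page is in use.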
simkot wrote: Wed Feb 26, 2025 7:11 am.
PureBasic behaves somewhat illogically. It encodes its .pb files in UTF8 with BOM, but reads files without BOM. It should behave consistently in both cases.
Everything is logical: source code should not be in ANSI format, which belongs to the last century. If your source code is opened in another country (with a different ANSI code page), the text in your native language will look like gibberish, and a translator will not be able to translate it into another language. First you would need a code page recognizer: an algorithm that analyzes letter frequencies or checks whether the words exist in a dictionary. I'm not an expert in these algorithms, but the Russian code page recognition engine in Notepad++ is faulty. It always gives the wrong result, and you need to disable it so as not to break your files.
I checked: UTF-8 without a BOM produces gibberish (it opens as ANSI), and ANSI opens as ANSI. So everything is working as it should. There is no ANSI (cp1251) on Linux, so there even ANSI files will be broken.
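The gibberish effect can be reproduced outside the IDE. This is a small sketch in Python, assuming cp1251 as the ANSI code page and a sample word chosen only for illustration. It decodes BOM-less UTF-8 bytes the way a reader that assumes ANSI would.

```python
text = "Привет"  # sample text, chosen only for illustration

utf8_bytes = text.encode("utf-8")      # what a BOM-less UTF-8 file contains
as_ansi = utf8_bytes.decode("cp1251")  # what a reader assuming cp1251 displays

ansi_bytes = text.encode("cp1251")       # a genuine cp1251 file
round_trip = ansi_bytes.decode("cp1251")  # read back with the same code page

# Each Cyrillic letter is two bytes in UTF-8, so the misread text is
# twice as long and no longer matches the original.
print(len(text), len(as_ansi))
print(as_ansi == text, round_trip == text)
```

Nothing is lost in the misread case: the bytes are intact and only the interpretation is wrong, which is why a BOM or a correct guess is enough to fix the display.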
simkot
User
Posts: 31
Joined: Sat Oct 26, 2024 8:25 am

Post by simkot »

What does ANSI have to do with it? I meant either UTF-8 or UTF-8 with BOM.
AZJIO
Addict
Posts: 2143
Joined: Sun May 14, 2017 1:48 am

Post by AZJIO »

How is ReadStringFormat(#File) related to the IDE?
simkot wrote: Wed Feb 26, 2025 7:11 am.
It encodes its .pb files in UTF8 with BOM, but reads files without BOM.
The IDE does not read "UTF8 without BOM".