What should ReadStringFormat(#File) return?

AZJIO · Post by **AZJIO** » Fri Feb 28, 2025 6:57 am

simkot wrote: Wed Feb 26, 2025 7:11 am.
For example, in AutoIt this is solved with one line:
FileGetEncoding ( "filehandle/filename" [, mode = 1] )
In this case, ANSI is also detected in UTF-8 without BOM.

Determining the encoding of a file from its BOM is the classic, deterministic approach. Determining the encoding from the content is a guessing algorithm. The developers could add a separate function like CheckDataEncoding(*p, length), but such a function cannot be part of ReadStringFormat() in any way.
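
To illustrate the difference, here is a minimal Python sketch (not PureBasic, and not how ReadStringFormat() is actually implemented). detect_bom() is the deterministic BOM check. check_data_encoding() is a hypothetical stand-in for the CheckDataEncoding(*p, length) idea and only guesses: it assumes that anything that is not valid UTF-8 is ANSI, and it cannot tell which code page.

```python
import codecs

# BOM signatures. UTF-32 LE comes before UTF-16 LE because its BOM
# starts with the same two bytes (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is no BOM."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def check_data_encoding(data: bytes) -> str:
    """Guess from content (hypothetical helper): valid UTF-8 or 'ansi'."""
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: not valid UTF-8 means some single-byte ANSI code page.
        return "ansi"
```

Note that plain ASCII data is reported as "utf-8" here, because ASCII is a subset of UTF-8, and that a short ANSI text can by chance also be valid UTF-8. That is why the second function is a guess and the first one is not.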

simkot wrote: Wed Feb 26, 2025 7:11 am.
PureBasic behaves somewhat illogically. It encodes its .pb files in UTF8 with BOM, but reads files without BOM. It should behave the same way in both cases.

Everything is logical: source code should not be in ANSI format; that belongs to the last century. If your source code is opened in another country (with a different ANSI code page), the text in your native language will look like gibberish, and a translator will not be able to translate it into another language. First you would need a code page recognizer: an algorithm that measures letter frequencies or checks whether words exist in a dictionary. I'm not an expert in these algorithms, but the Russian code page recognition engine in Notepad++ is faulty: it always gives the wrong result, and you need to disable it so that it does not break your files.
I checked: UTF-8 without a BOM produces gibberish (it opens as ANSI), and ANSI opens as ANSI. So everything is working as it should. There is no ANSI (cp1251) on Linux, so there even ANSI files will be broken.
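
The gibberish can be reproduced outside the IDE. The following Python sketch is only an illustration: the helper name open_as_ansi and the sample word are made up for the example, and cp1251 is assumed as the ANSI code page.

```python
def open_as_ansi(data: bytes, codepage: str = "cp1251") -> str:
    """Interpret raw file bytes as ANSI text, as happens when no BOM is found."""
    return data.decode(codepage)


text = "Привет"

# UTF-8 bytes without a BOM, read as ANSI: every Cyrillic letter
# becomes two unrelated characters.
garbled = open_as_ansi(text.encode("utf-8"))

# ANSI bytes read as ANSI: the text survives unchanged.
intact = open_as_ansi(text.encode("cp1251"))

print(garbled)
print(intact)
```

The bytes themselves are not damaged: encoding the garbled string back to cp1251 and decoding it as UTF-8 restores the original text, as long as the file has not been re-saved in between.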

simkot · Post by **simkot** » Fri Feb 28, 2025 1:27 pm

What does ANSI have to do with it? I meant either UTF-8 or UTF-8 with BOM.

AZJIO · Post by **AZJIO** » Fri Feb 28, 2025 5:31 pm

How is ReadStringFormat(#File) related to the IDE?

simkot wrote: Wed Feb 26, 2025 7:11 am.
It encodes its .pb files in UTF8 with BOM, but reads files without BOM.

The IDE does not read "UTF8 without BOM".

PureBasic Forums - English
