Re: Difference between ASCII and Unicode
Posted: Tue Feb 10, 2015 5:13 pm
I'll try to give some answers.
Joris wrote: * Are there rules (or tools) to check if a source needs unicode or not ?
Unicode is only about strings, not about numbers, and not about music (as someone wrote here recently).
All kinds of strings are affected: string constants, string variables, strings read from or written to a text file.
If all your strings contain only ASCII characters, then you don't need Unicode. But many languages have so-called "special characters". To handle those characters correctly, your program probably needs to be compiled in Unicode mode (in the end, it all depends on what exactly your program does).
Support for ASCII compilation with PureBasic will be dropped in the foreseeable future, so sooner or later all PB programs compiled with up-to-date PB versions will be Unicode programs. For these reasons, Unicode is the present and the future. ASCII is a technology from the last century.
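The rule of thumb above (only ASCII characters means no Unicode is needed) can be checked mechanically for the text of a source file. Here is a minimal sketch in Python, not a PB tool; the function names `find_non_ascii` and `needs_unicode` are made up for this example. It only looks at the text it is given, so strings that a program reads from files at run time are not covered.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # ASCII covers code points 0..127 only
                hits.append((line_no, col, ch))
    return hits


def needs_unicode(text):
    """True if the text contains at least one character outside ASCII."""
    return bool(find_non_ascii(text))


if __name__ == "__main__":
    # Hypothetical two-line source text; the second line has "special characters".
    sample = 'Debug "Hello"\nDebug "Grüße"\n'
    for line_no, col, ch in find_non_ascii(sample):
        print(f"line {line_no}, column {col}: {ch!r}")
    print("needs Unicode:", needs_unicode(sample))
```

If the list comes back empty, the source text itself is plain ASCII; what the program does with external data is a separate question.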
Joris wrote: * Will the use of FindString, StringField or ExtractRegularExpression etc. in my sources have any influence when they are used with unicode files (already working fine with only ASCII) ?
Read the help for the PB functions you are interested in. It should be documented whether they work differently in ASCII or Unicode mode. Note whether a function has optional parameters such as #PB_Ascii, #PB_Unicode, #PB_UTF8, ...
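One reason a function can behave differently between the two modes is that the same string occupies a different number of bytes depending on the encoding. The sketch below (Python, with a made-up helper name `byte_lengths`) compares character count and byte count. It assumes that an ASCII executable stores one byte per character and a Unicode executable two bytes per character (UTF-16); that is my understanding of PB, but please check it against the help.

```python
def byte_lengths(s):
    """Character count of s and its size in bytes under three encodings."""
    return {
        "characters": len(s),
        "latin-1": len(s.encode("latin-1")),      # one byte per character
        "utf-8": len(s.encode("utf-8")),          # one to four bytes per character
        "utf-16-le": len(s.encode("utf-16-le")),  # two bytes per character here
    }


if __name__ == "__main__":
    for word in ("Hello", "Grüße"):
        print(word, byte_lengths(word))
```

Anything that counts characters should give the same result in every encoding, while anything that works on bytes or memory buffers will not.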
Joris wrote: * If setting these below, will it make a difference too if no unicode is in use ?
    In any case, for properly handling Unicode text, at least 2 settings should be made in the PureBasic IDE:
    Compiler > Compiler Options ... > [v] Create Unicode executable
    File > File format > Encoding: UTF-8
As far as I can see, File > File format > Encoding: UTF-8 will not make a difference if no unicode is in use.
In other words, the UTF-8 source file format is safe for both ASCII and Unicode mode, and I'd recommend using it for all new PB programs.
With existing PB programs that are saved as "plain text" files, it's a bit different: if you just switch the file format setting in the PB IDE from "plain text" to "UTF-8", some characters may become unreadable. For conversion, it is better to use a good text editor that has a command such as "Save as UTF-8".
Compiler > Compiler Options ... > [v] Create Unicode executable might or might not make your program work differently; it depends on what your program does.
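For converting an existing "plain text" source, a text editor is the simplest route. If you have many files, a small script can do the same job. This is only a sketch in Python: `convert_to_utf8` is a made-up name, and it assumes the old file was saved in Windows-1252, which is a guess you must adapt to your system. Whether the PB IDE expects a byte order mark is something I have not verified, so it is left as an option.

```python
import os
import tempfile


def convert_to_utf8(src_path, dst_path, source_encoding="cp1252", add_bom=False):
    """Re-encode a text file as UTF-8, keeping its line endings unchanged."""
    # newline="" switches off newline translation for both reading and writing.
    with open(src_path, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    target = "utf-8-sig" if add_bom else "utf-8"  # "utf-8-sig" writes a BOM first
    with open(dst_path, "w", encoding=target, newline="") as dst:
        dst.write(text)


if __name__ == "__main__":
    # Demo with a throwaway file; the file names are placeholders.
    folder = tempfile.mkdtemp()
    old_file = os.path.join(folder, "old.pb")
    new_file = os.path.join(folder, "new.pb")
    with open(old_file, "wb") as f:
        f.write(b'Debug "Gr\xfc\xdfe"\r\n')  # "Grüße" as Windows-1252 bytes
    convert_to_utf8(old_file, new_file)
    with open(new_file, "rb") as f:
        print(f.read())
```

Writing to a new file instead of overwriting the original keeps a backup in case the assumed source encoding was wrong.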