5.30 advanced usage UTF-8(UNICODE)

Posted: Fri May 30, 2014 7:34 pm
by useful
Given the move towards UTF-8, I would like to be able to use national alphabets in identifiers: constants, variables, procedures, macros, ...

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 4:29 am
by useful
Is this interesting only to me?
I would like to know the opinion of the team and the community on what is, in my opinion, a fundamental issue.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 6:08 am
by Shield
I would also be in favor of that, even though I probably wouldn't use it.

From a logical / programming point of view, there is no real counterargument against a feature
like this, other than that it isn't really necessary and that it would probably require some modifications in the compiler.
Other modern languages support this.

However, regarding the implementation, I'm not sure whether other components like FASM support that.
If not, the compiler would have to convert all identifiers into some weird ASCII
string that wouldn't be readable anymore.

That's probably the main problem.
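
To make the concern concrete, here is a minimal sketch of what such a conversion might look like. It is purely hypothetical: `MangleIdentifier` and the `_XX` escape scheme are made up for illustration and are not how the PureBasic compiler or FASM actually handles names. ASCII bytes pass through unchanged, and every other UTF-8 byte becomes `_` followed by two hex digits.

```purebasic
; Hypothetical name mangling, for illustration only.
; This is NOT what the PureBasic compiler or FASM actually does.
Procedure.s MangleIdentifier(Name.s)
  Protected Result.s, i.i, ByteCount.i, Value.a
  Protected *Buffer

  ; Encode the name as UTF-8 into a temporary buffer.
  ByteCount = StringByteLength(Name, #PB_UTF8)
  *Buffer = AllocateMemory(ByteCount + 1) ; +1 for the terminating zero
  If *Buffer = 0
    ProcedureReturn ""
  EndIf
  PokeS(*Buffer, Name, -1, #PB_UTF8)

  ; Copy ASCII bytes as they are, escape everything else as _XX.
  For i = 0 To ByteCount - 1
    Value = PeekA(*Buffer + i)
    If Value < 128
      Result = Result + Chr(Value)
    Else
      Result = Result + "_" + RSet(Hex(Value), 2, "0")
    EndIf
  Next

  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

; "Größe" built with Chr() so the example does not depend on the source file encoding.
; In UTF-8, ö is C3 B6 and ß is C3 9F, so both letters get escaped.
Debug MangleIdentifier("Gr" + Chr($F6) + Chr($DF) + "e")
```

Even for a short German word the result is mostly escape codes, and a name that already contains an underscore would need extra care to stay unambiguous. That is the readability problem in a nutshell.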

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 6:34 am
by Little John
useful wrote:be able to use national alphabets in identifiers
-1
I hope that this is not going to happen.

If Greek people used Greek characters for identifiers, the rest of the world would not understand them; if Japanese people used Japanese characters for identifiers, the rest of the world would not understand them either; etc.
The main purpose of any (natural or technical) language is to allow for versatile, reliable and as easy as possible communication among all individuals for whom this is of benefit. Thanks to the internet, we live in a "global village" today, and everyone here on the forum benefits from this fact. In our "global village" it's already a considerable problem that there are so many different natural languages. We can be glad when a technical language such as PureBasic does not reproduce the problem of the diversity of natural languages, but makes it as easy as possible for all PureBasic programmers in the world to read and understand each other's code.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 6:48 am
by useful
This topic can be moved to Off Topic.
I agree about the challenges of globalization, but sometimes I want to feel comfortable at home.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 8:18 am
by Shield
Little John wrote:If Greek people used Greek characters for identifiers,
the rest of the world would not understand them; if Japanese people used Japanese characters for identifiers, the rest of the world would not understand them either; etc.
And when Germans use German variable names and Chinese programmers use Pinyin to romanize their words,
the rest of the world can't understand them either. Not having this feature merely restricts how they can spell those names.
Not everyone speaks English, and allowing users to stick more closely to their native language may help beginners a lot.
If you take a look at other languages that support this feature, no serious project uses non-English names for variables,
functions etc. So this isn't a problem for code sharing.

As I said, it's not a mandatory feature but it's nice to have for some people, for example for the user 'useful'.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 8:44 am
by useful
If you remember how BASIC stored programs 50 years ago, then at a new stage of development one could create a localized frontend in the IDE and a backend before the asm/link step, with translation driven by a database,
i.e. the language of the writer <-> the system language <-> the language of the reader, and back again.
Such a project would blow up the world.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 9:33 am
by IdeasVacuum
It is not worth the effort in my opinion. Most programming languages are based around American English and for you to understand the millions of code examples available around the world, you still need to use it.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 10:04 am
by useful
To have the option, but not be obliged to use it (it is not required).

I'm afraid my translation of the phrase is not accurate.

P.S. For ordinary users, localized applications have long been standard; only programmers are often deprived of this :mrgreen:

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Mon Jun 02, 2014 10:22 am
by Little John
Shield wrote:And when Germans use German variable names and Chinese programmers use Pinyin to romanize their words,
the rest of the world also can't understand.
That's true. But it's no reason for introducing a feature into PB that would enable and encourage people to write code which is even less readable for the rest of the world -- for the reasons that I mentioned.
Shield wrote:but it's nice to have for some people, for example for the user 'useful'.
No language is made for the private use of single persons. A language is a tool for communication.

A general purpose programming language such as PB is not only for communication between a programmer and a compiler, but also for communication between programmers. We can be glad when PB makes it as easy as possible for all PureBasic programmers in the world to read and understand each other's code; everyone here on the forum, for example, benefits from this fact.

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Wed Aug 13, 2014 8:16 am
by RomanR
I have to totally agree with Little John.
Little John wrote:A general purpose programming language such as PB is not only for communication between a programmer and a compiler, but also for communication between programmers.
Although English is not my mother tongue, all variables, comments AND interface texts (menus, gadgets, other text) in my programs are in English (optionally with localization from a language file). It helps tremendously when sharing code or getting help from the community.

If you start to fully localize variables (with all possible special letters of all languages), then you also have to localize the keywords (e.g. for German: If .. Else .. EndIf -> Wenn .. Sonst .. EndeWenn, like it's done in MS Office !!!). The result would be that you have to translate the code you get from somebody else into your local language! You will never be able to learn from some genius who wrote modules, includes (pbi) or other useful source code. And if you use precompiled includes (userlibraries) written by someone whose mother tongue is different from yours, you have to write the keywords in that language -> have fun typing German umlauts on non-German keyboards, or maybe Russian or Greek? :mrgreen:

My humble opinion: keep everything that is not between double quotes ("") as ASCII text. :!:

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Wed Aug 13, 2014 1:35 pm
by Danilo
RomanR wrote:(e.g. for German: If .. Else .. EndIf -> Wenn .. Sonst .. EndeWenn, like it's done in MS Office !!!).
- PB auf Deutsch ;)

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Thu Aug 14, 2014 2:40 pm
by RomanR
Danilo wrote:- PB auf Deutsch ;)
:lol: impressive... :lol: :mrgreen: 8)

Interesting as a demonstration of "natural" programming, but not practical for real programs. I understand that someone who is not so fluent in English would prefer a programming language in her/his mother tongue. As long as you are only writing programs to entertain yourself, everything is allowed.

For a programming beginner, localised variable names may look appealing. But as you gain more and more experience, there comes a point where you would like to share your work with others or need help with a particular problem. Then you have to translate your code (or code snippet) to post it here and then translate a possible solution back. :? :( :cry:

I think the intention of PureBasic is to be a simple yet powerful programming language, which makes it possible to write cool programs on the three (four - if you include good old Amiga with v4.00 :wink: ) main platforms. Better to fix bugs and implement new (and some missing) features than to make the compiler more complicated than necessary. (Why complicated: have you ever tried to translate UTF-8 to UCS2? This is necessary if you are using the ScintillaGadget and want to print colored text. Try translating € (UTF-8: E2 82 AC -> Unicode: 20 AC). Happy coding :wink: - would like to see your code snippets).

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Fri Aug 15, 2014 1:11 am
by Danilo
RomanR wrote:(Why complicated: have you ever tried to translate UTF-8 to UCS2? This is necessary if you are using the ScintillaGadget and want to print colored text. Try translating € (UTF-8: E2 82 AC -> Unicode: 20 AC). Happy coding :wink: - would like to see your code-snippets).
In Unicode mode it is as easy as that:

Code:

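; Read the zero-terminated UTF-8 bytes at EuroSign into a native string.
Euro.s = PeekS(?EuroSign,-1,#PB_UTF8)
; In Unicode mode, Asc() returns the 16-bit character value ($20AC for the euro sign).
MessageRequester("Euro Sign",Euro+" - "+Hex(Asc(Euro)))

DataSection
    EuroSign:
    Data.a $E2, $82, $AC ; UTF-8 encoding of U+20AC
    Data.a 0             ; terminating zero
EndDataSection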
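
For anyone curious what that conversion involves at the bit level, the same value can be worked out by hand. This is only a minimal sketch: it assumes a valid three-byte sequence, does no error checking, and the variable names are illustrative.

```purebasic
; Manual decode of one three-byte UTF-8 sequence: 1110xxxx 10xxxxxx 10xxxxxx.
; Assumes valid input; a real decoder must also handle 1-, 2- and 4-byte forms.
b0.a = $E2
b1.a = $82
b2.a = $AC

; Strip the marker bits and shift the payload bits into place:
; ($E2 & $0F) << 12 = $2000, ($82 & $3F) << 6 = $0080, ($AC & $3F) = $002C
CodePoint.i = ((b0 & $0F) << 12) | ((b1 & $3F) << 6) | (b2 & $3F)

Debug Hex(CodePoint) ; $2000 | $0080 | $002C = $20AC
```

Because U+20AC lies below U+10000, it fits into a single 16-bit UCS-2 unit, which is why the result is just the two bytes 20 AC.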

Re: 5.30 advanced usage UTF-8(UNICODE)

Posted: Fri Aug 15, 2014 6:50 am
by useful
Danilo wrote:
RomanR wrote:(e.g. for German: If .. Else .. EndIf -> Wenn .. Sonst .. EndeWenn, like it's done in MS Office !!!).
- PB auf Deutsch ;)

And you can manage without (Ä ä, Ö ö, Ü ü, ß)? :)