5.30 advanced usage UTF-8(UNICODE)

useful
Enthusiast
Posts: 402
Joined: Fri Jul 19, 2013 7:36 am

Post by useful »

Given the move towards UTF-8, I would like to be able to use national alphabets in identifiers: constants, variables, procedures, macros, and so on.
Dawn will come inevitably.
useful
Enthusiast
Posts: 402
Joined: Fri Jul 19, 2013 7:36 am

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by useful »

Is this interesting only to me?
I would like to know the opinion of the team and the community on what is, in my opinion, a fundamental issue.
Shield
Addict
Posts: 1021
Joined: Fri Jan 21, 2011 8:25 am
Location: 'stralia!

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by Shield »

I would also be in favor of that, even though I probably wouldn't use it.

From a logical / programming point of view, there is no real counterargument against a feature
like this, other than that it isn't really necessary and that it would probably require some modifications in the compiler.
Other modern languages support this.

However, regarding the implementation, I'm not sure whether other components like FASM support that.
If not, the compiler would have to convert all identifiers into some weird ASCII
string that wouldn't be readable anymore.

That's probably the main problem.
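
To make the "weird ASCII string" concern concrete, here is a minimal sketch of what such a conversion could look like. It is written in Python purely for illustration; the `mangle` function and the `_uXXXX_` escape scheme are made up here and are not how the PureBasic compiler or FASM actually handle names.

```python
def mangle(identifier: str) -> str:
    """Rewrite an identifier so that only ASCII letters and digits stay
    as they are; every other character (including '_') becomes an
    escape of the form _uXXXX_ built from its Unicode code point.
    Hypothetical scheme, for illustration only."""
    out = []
    for ch in identifier:
        if ch.isascii() and ch.isalnum():
            out.append(ch)                    # plain ASCII letter or digit
        else:
            out.append(f"_u{ord(ch):04X}_")   # escaped code point
    return "".join(out)


if __name__ == "__main__":
    for name in ("count", "Größe"):
        print(name, "->", mangle(name))
```

With this scheme a German name like `Größe` would end up as `Gr_u00F6__u00DF_e`, which is the kind of symbol that is hard to read in generated assembly or in a debugger.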
Blog: Why Does It Suck? (http://whydoesitsuck.com/)
"You can disagree with me as much as you want, but during this talk, by definition, anybody who disagrees is stupid and ugly."
- Linus Torvalds
Little John
Addict
Posts: 4777
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by Little John »

useful wrote: be able to use national alphabets in identifiers
-1
I hope that this is not going to happen.

If Greek people used Greek characters for identifiers, the rest of the world could not understand them; if Japanese people used Japanese characters for identifiers, the rest of the world could not understand them; etc.
The main purpose of any (natural or technical) language is to allow for versatile, reliable and as easy as possible communication among all individuals for whom this is of benefit. Thanks to the internet, we live in a "global village" today, and everyone here on the forum benefits from this fact. In our "global village" it's already a considerable problem that there are so many different natural languages. We can be happy when a technical language such as PureBasic does not reproduce the problem of the diversity of natural languages, but makes it as easy as possible for all PureBasic programmers in the world to read and understand each other's code.
useful
Enthusiast
Posts: 402
Joined: Fri Jul 19, 2013 7:36 am

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by useful »

This topic can be moved to Off Topic.
I agree about the challenges of globalization, but sometimes at home I want to feel comfortable.
Shield
Addict
Posts: 1021
Joined: Fri Jan 21, 2011 8:25 am
Location: 'stralia!

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by Shield »

Little John wrote: If Greek people used Greek characters for identifiers,
the rest of the world could not understand them; if Japanese people used Japanese characters for identifiers, the rest of the world could not understand them; etc.
And when Germans use German variable names and Chinese programmers use Pinyin to romanize their words,
the rest of the world also can't understand. Not having this feature is just grammatically limiting them.
Not everyone speaks English and allowing users to stick more to their native language may help beginners a lot.
If you take a look at other languages that support this feature, no serious project uses non-English variable names,
functions etc. So this isn't a problem for code sharing.

As I said, it's not a mandatory feature but it's nice to have for some people, for example for the user 'useful'.
useful
Enthusiast
Posts: 402
Joined: Fri Jul 19, 2013 7:36 am

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by useful »

Remember how BASIC kept its programs 50 years ago. At a new stage of development, one could create a localized front end in the IDE and a back end before the asm/link step, with translation based on a database,
i.e. language of the writer <-> system language <-> language of the reader, and vice versa.
Such a project would blow up the world.
IdeasVacuum
Always Here
Posts: 6426
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by IdeasVacuum »

It is not worth the effort in my opinion. Most programming languages are based around American English and for you to understand the millions of code examples available around the world, you still need to use it.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
useful
Enthusiast
Posts: 402
Joined: Fri Jul 19, 2013 7:36 am

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by useful »

To have the opportunity, but not the obligation (it would not be required).

I'm afraid my translation of the phrase is not accurate.

P.S. For ordinary users, localized applications have long been standard; only programmers are often deprived of this :mrgreen:
Little John
Addict
Posts: 4777
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by Little John »

Shield wrote: And when Germans use German variable names and Chinese programmers use Pinyin to romanize their words,
the rest of the world also can't understand.
That's true. But it's no reason for introducing a feature into PB that would enable and encourage people to write code which is even less readable for the rest of the world -- for the reasons that I mentioned.
Shield wrote: but it's nice to have for some people, for example for the user 'useful'.
No language is made for the private use of single persons. A language is a tool for communication.

A general purpose programming language such as PB is not only for communication between a programmer and a compiler, but also for communication between programmers. We can be happy when PB makes it as easy as possible for all PureBasic programmers in the world to read and understand each other's code, and e.g. everyone here on the forum benefits from this fact.
RomanR
User
Posts: 16
Joined: Wed Jul 11, 2012 3:54 pm

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by RomanR »

I have to totally agree with Little John.
Little John wrote: A general purpose programming language such as PB is not only for communication between a programmer and a compiler, but also for communication between programmers.
Although English is not my mother tongue, everything in my programs is written in English: all variables, comments AND interface texts (menus, gadgets, other text), optionally with a localization loaded from a language file. It helps tremendously to share code or get help from the community.

If you start to fully localize variables (with all possible special letters of all languages), then you also have to localize the keywords (e.g. for German: If .. Else .. EndIf -> Wenn .. Sonst .. EndeWenn, like it's done in MS Office !!!). The result would be that you have to translate the code you get from somebody else into your local language! You will never be able to learn from some genius who wrote modules, includes (pbi) or other useful source code. And if you use precompiled includes (user libraries) written by someone whose mother tongue is different from yours, you have to write the keywords in that language -> have fun typing German umlauts on non-German keyboards, or maybe Russian or Greek? :mrgreen:

My humble opinion: keep everything that is not between double quotes ("") as ASCII text. :!:
Danilo
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by Danilo »

RomanR wrote: (e.g. for German: If .. Else .. EndIf -> Wenn .. Sonst .. EndeWenn, like it's done in MS Office !!!).
- PB auf Deutsch ;)
RomanR
User
Posts: 16
Joined: Wed Jul 11, 2012 3:54 pm

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by RomanR »

Danilo wrote: - PB auf Deutsch ;)
:lol: impressive... :lol: :mrgreen: 8)

As a demonstration of "natural" programming it is interesting, but for real programs it is not practical. I understand that someone who is not so fluent in English would prefer a programming language in her/his mother tongue. As long as you are only writing programs to entertain yourself, everything is allowed.

For a beginner in programming, localisation of variables may look appealing. But as you gain more and more experience, there comes a point where you want to share your work with others or need help with a particular problem. Then you have to translate your code (or code snippet) to post it here and then translate a possible solution back. :? :( :cry:

I think the intention of PureBasic is to be a simple yet powerful programming language that makes it possible to write cool programs on the three (four, if you include the good old Amiga with v4.00 :wink: ) main platforms. It is better to fix bugs and implement new (and some missing) features than to make the compiler more complicated than necessary. (Why complicated: have you ever tried to translate UTF-8 to UCS-2? This is necessary if you are using the ScintillaGadget and want to print colored text. Try translating € (UTF-8: E2 82 AC -> Unicode: 20 AC). Happy coding :wink: - I would like to see your code snippets.)
Danilo
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by Danilo »

RomanR wrote: (Why complicated: have you ever tried to translate UTF-8 to UCS-2? This is necessary if you are using the ScintillaGadget and want to print colored text. Try translating € (UTF-8: E2 82 AC -> Unicode: 20 AC). Happy coding :wink: - I would like to see your code snippets.)
In Unicode mode it is as easy as this:

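; Read the zero-terminated UTF-8 bytes at EuroSign into a PureBasic string
Euro.s = PeekS(?EuroSign,-1,#PB_UTF8)
; Show the character together with its character code in hex
MessageRequester("Euro Sign",Euro+" - "+Hex(Asc(Euro)))

DataSection
    EuroSign:
    Data.a $E2, $82, $AC ; UTF-8 bytes of the euro sign
    Data.a 0             ; terminating zero
EndDataSection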
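
For anyone who wants to see the arithmetic behind the E2 82 AC -> 20 AC conversion, here is a hand-rolled sketch in Python (not PureBasic; the function name `decode_utf8_3byte` is only for illustration). It assumes a well-formed three-byte UTF-8 sequence and does no validation.

```python
def decode_utf8_3byte(b1: int, b2: int, b3: int) -> int:
    """Combine the payload bits of a three-byte UTF-8 sequence.

    Byte layout: 1110xxxx 10yyyyyy 10zzzzzz
    Code point:  xxxx yyyyyy zzzzzz (16 bits)
    Illustrative helper; assumes the input is a valid sequence.
    """
    return ((b1 & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F)


if __name__ == "__main__":
    code_point = decode_utf8_3byte(0xE2, 0x82, 0xAC)
    print(f"U+{code_point:04X}")  # prints U+20AC
    # Cross-check against Python's built-in UTF-8 decoder
    assert chr(code_point) == bytes([0xE2, 0x82, 0xAC]).decode("utf-8")
```

The lead byte contributes four bits and each continuation byte six, which gives the 16-bit value 20AC that also fits into a single UCS-2 unit.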
useful
Enthusiast
Posts: 402
Joined: Fri Jul 19, 2013 7:36 am

Re: 5.30 advanced usage UTF-8(UNICODE)

Post by useful »

Danilo wrote:
RomanR wrote:(e.g. for german: If .. Else .. EndIf -> Wenn .. Sonst .. EndeWenn like its done in MS Office !!!).
- PB auf Deutsch ;)

And can you do without Ä ä, Ö ö, Ü ü, ß? :)