Removing 'ASCII' switch from PureBasic

Post by **freak** » Sat Aug 09, 2014 11:45 am

Official announcement here: http://www.purebasic.fr/english/viewtop ... 14&t=60214

Little John · Post by **Little John** » Sat Aug 09, 2014 11:46 am

Hi davido!

davido wrote:That is why I asked the question.

I knew that that was the reason.

And that's exactly why I'm strictly against improper names in general: because they cause nothing but confusion.

User_Russian · Post by **User_Russian** » Sat Aug 09, 2014 12:12 pm

Danilo wrote:Let me rephrase: Especially for them, Ascii is quite useless, in my opinion.

This is not so.
The Russian alphabet has 33 letters, and the range 128 - 255 of the ASCII encoding can be used for them. There is a code page, Windows-1251, which is the default on Russian Windows. With it, the ASCII encoding supports all English and Russian characters.

Danilo wrote: As you can see with the PB compiler, Ascii applications only create extra trouble (see example "C:\Программы\"). With full Unicode support, you don't have these problems, and that's what makes Ascii applications pretty obsolete.

With ASCII there is no problem.
The problem is with Unicode, because the compiler does not support UTF-8 encoding of the source text.

Danilo · Post by **Danilo** » Sat Aug 09, 2014 1:00 pm

User_Russian wrote: The Russian alphabet has 33 letters, and the range 128 - 255 of the ASCII encoding can be used for them. There is a code page, Windows-1251, which is the default on Russian Windows. With it, the ASCII encoding supports all English and Russian characters.

On my Windows, the ASCII characters 128 - 255 are not Russian. It works only in your world.
With UNICODE, your Russian characters work for the rest of the world, too.
Anyway, the decision has been made by the PB team, so no problem if you still don't want to understand the overall problem.
You will still be able to work with codepage 1251, even after the change. If Microsoft removes the codepage stuff from Windows, you just need to write a small conversion table to convert chars 128 - 255 to the Russian UNICODE range.
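
To make the two points above concrete, here is a minimal sketch, not taken from any post in the thread. It is written in Python only because its built-in codecs make the byte values easy to inspect. It reads the same six bytes under Windows-1251 and under Windows-1252 (assumed here to stand in for a non-Russian Windows default), then builds the kind of small 128 - 255 conversion table Danilo describes. The helper names `build_cp1251_table` and `cp1251_to_unicode` are made up for illustration.

```python
# The same six bytes, read under two different Windows code pages.
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])

as_cp1251 = raw.decode("cp1251")  # default code page on Russian Windows
as_cp1252 = raw.decode("cp1252")  # assumed default on a Western European Windows


def build_cp1251_table():
    """Map each byte 128..255 to the Unicode character Windows-1251 assigns it."""
    table = {}
    for byte in range(128, 256):
        try:
            table[byte] = bytes([byte]).decode("cp1251")
        except UnicodeDecodeError:
            pass  # skip any byte the codec leaves unassigned
    return table


def cp1251_to_unicode(data, table):
    """Convert bytes to text: 0..127 are plain ASCII, the rest go through the table."""
    return "".join(
        chr(b) if b < 128 else table.get(b, "\ufffd") for b in data
    )


table = build_cp1251_table()
converted = cp1251_to_unicode(raw, table)

print(as_cp1251)               # Russian text
print(as_cp1252)               # accented Latin letters from the same bytes
print(converted == as_cp1251)  # the hand-built table matches the codec
```

The bytes themselves carry no hint of which code page they belong to, which is why the result depends on the machine that reads them. Once converted to Unicode, the text reads the same everywhere.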

heartbone · Post by **heartbone** » Sat Aug 09, 2014 1:21 pm

luis wrote:
Danilo wrote: Especially for the Russian guys here I had expected they would welcome the change. Now I see it's the opposite, and it makes me wonder.
Welcome the change?

They can use unicode already. The proposal is not to add unicode for 5.40, it is to remove the ability to make ascii builds.

That does put it into the proper perspective.

User_Russian · Post by **User_Russian** » Sat Aug 09, 2014 1:29 pm

Not every project needs Unicode. For example, many DLLs have no need for it.
If a project needs Unicode, I use it.
But for many projects ASCII is preferable, and developing with Unicode creates a lot of difficulties. Example: http://www.purebasic.fr/english/viewtop ... 33#p450133
Also, the compiler does not support UTF-8, and if the project uses Unicode, the project's folder names must be English only! If the project is ASCII, then I can use both Russian and English folder names.

Danilo · Post by **Danilo** » Sat Aug 09, 2014 1:36 pm

User_Russian wrote: Also, the compiler does not support UTF-8, and if the project uses Unicode, the project's folder names must be English only! If the project is ASCII, then I can use both Russian and English folder names.

I understand that, User_Russian. I hope they make the compiler completely Unicode, too.
If they do, there will be no more problems with paths, whether Chinese / Russian / Korean.
More things need to get changed and verified, not just removing ASCII compilation.

Let's hope for the best...

chris319 · Post by **chris319** » Sat Aug 09, 2014 9:34 pm

This works:

myString$ = ascii string

myString.s = unicode string

myString.s{16} = 16 unicode characters

myString.s = "hello"

Len(myString.s) = 5

SizeOf(myString.s) = 32 bytes

Mid(myString.s, 2, 1) = "e" = Mid(myString$, 2, 1)

Mid(myString.s, 2,3) = "ell" = Mid(myString$, 2,3)

It's a lot less ugly than ToAscii/ToUTF8() or p-ascii, p-utf8. This supports both types of string and eliminates the compiler switch and isn't ugly. Win-win.

You're going to break a lot of code if you abandon ASCII entirely.

The mathematical concepts of Log() and Sqr() are centuries old and are thus old technology. I hope the PureBasic team doesn't abandon those.
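
The Len/SizeOf figures in the proposal above can be checked with a quick sketch. This is only an illustration in Python, not PureBasic. It assumes the "unicode" string is stored as UTF-16 with two bytes per character (which is what the 32-byte figure for 16 characters implies) and that a fixed-length string is padded with NUL characters. `FIXED_LEN` is a name made up for the example.

```python
text = "hello"
FIXED_LEN = 16  # characters, mirroring the proposed myString.s{16}

# Character count is the same whatever the storage format.
char_count = len(text)

# Byte size depends on the encoding used to store the characters.
ascii_size = len(text.encode("ascii"))      # one byte per character
utf16_size = len(text.encode("utf-16-le"))  # two bytes per character

# A fixed-length string occupies its full length, padded here with NULs.
fixed = text.ljust(FIXED_LEN, "\0")
fixed_ascii_size = len(fixed.encode("ascii"))
fixed_utf16_size = len(fixed.encode("utf-16-le"))

# Substrings are taken by character position, so both formats agree.
# Python slices are 0-based; Mid() in the post is 1-based.
second_char = text[1:2]
three_chars = text[1:4]

print(char_count, ascii_size, utf16_size)
print(fixed_ascii_size, fixed_utf16_size)
print(second_char, three_chars)
```

Length in characters and size in bytes only coincide for the one-byte-per-character case, which is the distinction the two proposed string types would have to keep apart.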

c4s · Post by **c4s** » Sun Aug 10, 2014 8:52 am

chris319 wrote:The mathematical concepts of Log() and Sqr() are centuries old and are thus old technology. I hope the PureBasic team doesn't abandon those.

Wow, I have no words. Are you serious? Unfortunately the "arguments" in this thread are getting worse and worse...

luis · Post by **luis** » Sun Aug 10, 2014 10:55 am

chris319 wrote: myString$ = ascii string

myString.s = unicode string

Personally I use $ for all my strings (no .s) because it makes it easier to spot strings in code, so $ for both ascii and unicode.
Currently .s and $ are synonymous and I hope they will not change meaning in the future.

Little John · Post by **Little John** » Sun Aug 10, 2014 11:06 am

luis wrote:Currently .s and $ are synonymous and I hope they will not change meaning in the future.

Changing the meaning of .s or $ would break a lot of existing code.
I can't imagine that the PB team will do so.

chris319 · Post by **chris319** » Sun Aug 10, 2014 11:15 am

c4s wrote:
chris319 wrote:The mathematical concepts of Log() and Sqr() are centuries old and are thus old technology. I hope the PureBasic team doesn't abandon those.
Wow, I have no words. Are you serious? Unfortunately the "arguments" in this thread are getting worse and worse...

It's a joke.

chris319 · Post by **chris319** » Sun Aug 10, 2014 11:56 am

Little John wrote:
luis wrote:Currently .s and $ are synonymous and I hope they will not change meaning in the future.
Changing the meaning of .s or $ would break a lot of existing code.
I can't imagine that the PB team will do so.

How's this?

myString.s and myString$ -> ASCII strings. No legacy code broken.

myString.n (second letter in uNicode -- "u" is logical but not available)

It's not butt-ugly like:

*AsciiBuffer = ToAscii(String$)
*UTF8Buffer = ToUTF8(String$)

In the above example, is String$ ASCII or unicode or could it be either? If it's unicode, you've just broken legacy programs. If it's either, it's ambiguous.

I think we all agree on two things:

1. ASCII strings must not be abandoned entirely because some applications require it.

2. Users need control over whether their strings are ASCII or unicode.

As I understand it, myChar.c will be unicode, breaking code that relies on it being ASCII. A bad decision comes home to roost.

If they make us use strcpy and strcmp, I'm going back to "C".

wilbert · Post by **wilbert** » Sun Aug 10, 2014 12:27 pm

chris319 wrote:I think we all agree on two things:

1. ASCII strings must not be abandoned entirely because some applications require it.

2. Users need control over whether their strings are ASCII or unicode.

As I understand it, myChar.c will be unicode, breaking code that relies on it being ASCII. A bad decision comes home to roost.

It's no problem for me if strings are internally based on unicode. It's not different from a language like Visual Basic that also uses strings based on unicode internally (BSTR).
What's required is that the compiler can quickly convert between whatever it is using internally and another format if required like the current pseudotypes PB uses.
Code that relies on .c to be ASCII is similar to code that relies on .i to be 32 bits. Both .c and .i can have a different length and that is clearly mentioned.
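
The idea of keeping strings in Unicode internally and converting only at the boundary can be sketched roughly like this. It is an illustration in Python, not how PureBasic's pseudotypes are implemented. The helper names `to_c_buffer` and `from_c_buffer` are made up, and the NUL terminator is an assumption about what an external C-style API would expect.

```python
def to_c_buffer(text, encoding, errors="strict"):
    """Encode internal text for an external API and append a NUL terminator.

    The terminator is encoded too, so it is one byte for ASCII/UTF-8
    and two bytes for UTF-16.
    """
    return text.encode(encoding, errors) + "\0".encode(encoding)


def from_c_buffer(buf, encoding):
    """Decode a NUL-terminated buffer back into internal text."""
    return buf.decode(encoding).split("\0", 1)[0]


internal = "hello"

ascii_buf = to_c_buffer(internal, "ascii")
utf8_buf = to_c_buffer(internal, "utf-8")
utf16_buf = to_c_buffer(internal, "utf-16-le")

# Characters outside ASCII cannot survive an ASCII conversion;
# errors="replace" substitutes "?" instead of raising an exception.
lossy_buf = to_c_buffer("Привет", "ascii", errors="replace")

print(ascii_buf, utf8_buf, utf16_buf)
print(lossy_buf)
print(from_c_buffer(utf16_buf, "utf-16-le") == internal)
```

The conversion is cheap and lossless for plain English text. It only becomes lossy when the target format cannot represent the characters, which is the case the ASCII versus Unicode argument in this thread keeps coming back to.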

chris319 · Post by **chris319** » Sun Aug 10, 2014 12:47 pm

wilbert wrote: Code that relies on .c to be ASCII is similar to code that relies on .i to be 32 bits. Both .c and .i can have a different length and that is clearly mentioned.

It clearly says it depends on the compiler mode. With the compiler switch gone it will be unicode and that's that. It breaks legacy code and takes away user control.
