Removing 'ASCII' switch from PureBasic

Poshu · Post by **Poshu** » Wed Aug 06, 2014 12:45 pm

Strings are already slow as hell in PB (because of some necessary dumbproofing), so Unicode won't make your code significantly slower. Anyway, if this kind of speed is your priority, then PB isn't the tool you need.

And your DLL example is fundamentally biased: of course, in a one-function procedure, it'll look much longer, especially if you write it so badly; but in a real-world example it won't be that much of a drag, especially since you could write a nice little macro to do all that.

Unicode is better for everything. There, I said it.

Lord · Post by **Lord** » Wed Aug 06, 2014 12:53 pm

Please stay BASIC.
ASCII is BASIC.
If there is too much work: drop SpiderXXXXX

As a hobbyprogrammer, I don't need Unicode and
I don't like to have to change all my recent programs.

As a (not as young as I want to be) hobbyprogrammer,
I don't like new ways of handling just ASCII strings.

As a hobbyprogrammer, I don't like to have double the
size of text in my programs.

But I'm just a hobbyprogrammer.
Maybe I'm just one of a few.
Maybe there are a lot more hobbyprogrammers, but they
are not often in this (international) forum.
Maybe they also think, I'm just a hobbyprogrammer and
will not be heard because the Pros are making a lot more
noise.

So, I think, the ASCII-switch will be dropped, regardless if
there is a silent majority which will not like this.

wilbert · Post by **wilbert** » Wed Aug 06, 2014 1:21 pm

Lord wrote: Please stay BASIC.
ASCII is BASIC.

It's not like ASCII is basic and Unicode more advanced.
ASCII only covers codes 0-127. Codes 128-255 are extended codes that PB uses, but they are not cross-platform compatible.
Unicode is truly cross-platform compatible and not limited to the English language.
If you create international applications, Unicode is simply a requirement for them to function properly.

davido · Post by **davido** » Wed Aug 06, 2014 2:02 pm

@wilbert

Thank you for the explanation. Never bothered to look at Unicode as I didn't think I'd ever need it.

However, I'm always pleased to move with the times. So I'll be going with Unicode from now on.

Thanks.

heartbone · Post by **heartbone** » Wed Aug 06, 2014 2:30 pm

Lord wrote: So, I think, the ASCII-switch will be dropped, regardless if
there is a silent majority which will not like this.

I agree with you on this.
And I really don't have any problems with the change, as it does not affect my programming.
But it does seem like a significant change.
Perhaps instead of a 5.40 branch, it deserves the 6.00 designation?

skywalk · Post by **skywalk** » Wed Aug 06, 2014 3:12 pm

I detect a weird schism developing in this thread. Having both ASCII and Unicode support is NOT a bad thing for the user. It is a delight and a benefit to me. Clearly, Fred is feeling the burden of dual support and is asking about its necessity.
As of v5.3, I "absolutely" need the Unicode compile switch to:
Display international characters in a gadget.
...
That is 10% of my apps. So, for the other 90%, I compile in ASCII and enjoy the benefits of smaller code and slightly faster execution.

Detour: My idea of modern would be the ability to import C++ libs without having to edit/export its functions in a C++ compiler.

-rant off

Olby · Post by **Olby** » Wed Aug 06, 2014 3:34 pm

I have been using the Unicode switch for the past 5 years and can't even recall the last time I built something purely in ASCII. No problems here whatsoever. All my projects have been converted to Unicode and (some) to x64. I support the move. If you want to stay old school, go back to QBASIC.

Lebostein · Post by **Lebostein** » Wed Aug 06, 2014 8:42 pm

16-bit? That is not Unicode. Doesn't Unicode need more than 16 bits? I heard about 100,000 characters...

wilbert · Post by **wilbert** » Wed Aug 06, 2014 8:51 pm

Lebostein wrote: 16-bit? That is not Unicode. Doesn't Unicode need more than 16 bits? I heard about 100,000 characters...

I think PureBasic only supports the first plane.
http://en.wikipedia.org/wiki/Plane_(Unicode)

luis · Post by **luis** » Wed Aug 06, 2014 8:54 pm

@Lebostein

According to the manual PB uses UCS-2 internally, so 16 bits.

Also the .u type is defined as 16 bit and this is a pretty strong hint it's 16 bits.

UTF-8, UTF-16 and UCS-2 are different encodings for Unicode characters.
What you are referring to is UTF-16, which will continue to be filled with the most absurd stuff if history is any indication.
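
The 16-bit point can be checked from within PB itself. This is only a minimal sketch, assuming the program is compiled with the Unicode switch enabled (in an ASCII build the first and third values would differ); the variable names `Value` and `Text$` are illustrative and not from the posts above.

```purebasic
; Assumes a Unicode build. Variable names are illustrative only.
Define Value.u          ; the .u type mentioned above
Define Text$ = "abc"

Debug SizeOf(Character)                   ; bytes per character in the current string mode
Debug SizeOf(Value)                       ; bytes used by a .u variable
Debug StringByteLength(Text$)             ; bytes needed in the current string mode (no terminating null)
Debug StringByteLength(Text$, #PB_Ascii)  ; bytes needed if stored as ASCII
Debug StringByteLength(Text$, #PB_UTF8)   ; bytes needed if stored as UTF-8
```

In a Unicode build this should print 2, 2, 6, 3 and 3: the in-memory string doubles in size for plain English text, while the same text written out as ASCII or UTF-8 does not.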

Lebostein · Post by **Lebostein** » Thu Aug 07, 2014 7:05 am

What about an unsigned Byte and Word variable? At the moment we have to abuse the Ascii and Unicode types to handle that...

Little John · Post by **Little John** » Thu Aug 07, 2014 7:12 am

Lebostein wrote: What about an unsigned Byte and Word variable? At the moment we have to abuse the Ascii and Unicode types to handle that...

That's no "abuse". Those variable types are just misnamed.
(And that has got nothing to do with the topic of this thread.)

Bananenfreak · Post by **Bananenfreak** » Thu Aug 07, 2014 7:54 am

What about OGRE? Paths will be still in ASCII...
We can't use äüöß, Russian letters, etc. in paths yet, but OGRE is OGRE and it will stay on the ASCII side even if we have only Unicode support, won't it?

EDIT: I think there is no problem.

All strings in PB will be handled as UCS2 (16-bit) strings internally. So if you used "@String$" somewhere in your code, chances are high it won't work anymore (if dealing with an API, for example)

So what do I have to do to get my program working?

Danilo · Post by **Danilo** » Thu Aug 07, 2014 8:11 am

Bananenfreak wrote:
All strings in PB will be handled as UCS2 (16-bit) strings internally. So if you used "@String$" somewhere in your code, chances are high it won't work anymore (if dealing with an API, for example)
So what do I have to do to get my program working?

It depends on the function where you used '@String$'. With most WinAPI functions there shouldn't be a problem, because
they automatically use the Unicode version then (FunctionA vs. FunctionW).

PB functions that work with pointers to strings need to use '*p + SizeOf(Character)' instead of '*p + 1' to move to the next character.
That's what most guys here have been doing for years anyway. So no problems with that, either.
Many codes here in the forum will work like before, because many people have already been using Unicode for years,
and they made sure that their codes worked in both modes.
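
The pointer stepping described above can be sketched as follows. `CountChars()` is a hypothetical helper written for this illustration, not a PB library function; because it advances by `SizeOf(Character)`, it should behave the same in ASCII and Unicode builds.

```purebasic
; Hypothetical helper for illustration: counts characters by walking the string buffer.
Procedure.i CountChars(*p.Character)
  Protected n.i = 0
  While *p\c <> 0              ; stop at the terminating null character
    n + 1
    *p + SizeOf(Character)     ; 1 byte in ASCII mode, 2 bytes in Unicode mode
  Wend
  ProcedureReturn n
EndProcedure

Define Text$ = "Hello"
Debug CountChars(@Text$)       ; walks the buffer that @Text$ points to
```

With '*p + 1' the same loop would land in the middle of each 16-bit character in a Unicode build, which is the kind of breakage old '@String$' code can run into.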

For external functions (DLLs, wrappers, static lib imports), you may need to alter the function declarations or Prototypes.
This can be done with Pseudo-Types like p-ascii and p-utf8. If you already used the correct (pseudo-)types, you don't need to change anything.
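
As a rough sketch of the pseudo-type approach: the library name "mylib.dll" and the function "SetTitle" below are made up for this illustration, and the example assumes the external function expects a null-terminated ASCII string.

```purebasic
; "mylib.dll" and "SetTitle" are hypothetical names, used only for illustration.
; The p-ascii pseudo-type asks PB to convert the string to ASCII for the call.
Prototype.i SetTitleProto(Title.p-ascii)

Define Lib = OpenLibrary(#PB_Any, "mylib.dll")
If Lib
  Define SetTitle.SetTitleProto = GetFunction(Lib, "SetTitle")
  If SetTitle
    SetTitle("Hello")          ; PB string in, ASCII string handed to the DLL
  EndIf
  CloseLibrary(Lib)
EndIf
```

Declared this way, the calling code does not need to change between ASCII and Unicode builds; a function expecting UTF-8 would use p-utf8 in the same position.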

Little John wrote: That's no "abuse". Those variable types are just misnamed.

```purebasic
Macro ub:a:EndMacro
Macro uw:u:EndMacro

Define uByte.ub = $FF
Define uWord.uw = $FFFF
```

Maybe they didn't want to use 2-character-types like "ub" or "uw" etc.

Didelphodon · Post by **Didelphodon** » Thu Aug 07, 2014 8:39 am

PB wrote: I used to avoid Unicode like the plague until I read this article:

http://www.joelonsoftware.com/articles/Unicode.html

PureBasic is moving in the right direction. Unicode is the future.

Thanks for this article; it's *really* well written. Especially compared to those book chapters with dozens of pages discussing characters, character sets, encodings and so on, which I never managed to read entirely due to sinking motivation.

However, I still have a lot of code around from the time before PureBasic's Unicode switch, which I would have to check.

Cheers,
Didel.
