Removing 'ASCII' switch from PureBasic

Fred
Administrator
Posts: 16617
Joined: Fri May 17, 2002 4:39 pm
Location: France

Removing 'ASCII' switch from PureBasic

Post by Fred »

Hi there,

Since PB 5.30, the minimum Windows version required to run PureBasic is Windows XP. That means that every supported OS (Windows XP+, OS X 10.5+, Linux) now supports Unicode natively, so Timo and I discussed the opportunity to remove ASCII support from PB and provide a Unicode-only compiler. Continuing to support ASCII is a lot of work for us, as we need to provide duplicate functions when dealing with strings, which can lead to more bugs. Also, ASCII is old tech and is condemned to disappear sooner or later, as Unicode can handle it as well.

What would change for you:

- Basically, if your software runs with the "unicode" switch ON, nothing will change, and you can skip the following text :). If not, you can enable it and test your code.
- All strings in PB will be handled as UCS-2 (16-bit) strings internally. So if you used "@String$" somewhere in your code, chances are high it won't work anymore (when dealing with an API, for example).
- We plan to provide 2 new functions, to ease things a bit:
*AsciiBuffer = ToAscii(String$)
*UTF8Buffer = ToUTF8(String$)
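
To illustrate why "@String$" stops working with byte-oriented APIs, here is a minimal sketch. It assumes a Unicode-mode compile and uses only the existing PokeS() and StringByteLength() functions, not the planned ToAscii(). The variable names are made up for the example.

```purebasic
EnableExplicit

Define Text$ = "Hello"
Define *Ascii

; Reserve one byte per character plus one for the terminating null.
*Ascii = AllocateMemory(StringByteLength(Text$, #PB_Ascii) + 1)
If *Ascii
  ; Write an ASCII copy of the string into the buffer.
  PokeS(*Ascii, Text$, -1, #PB_Ascii)

  ; In Unicode mode @Text$ points to 16-bit characters, so the second
  ; byte is the high byte of "H" and not the letter "e".
  Debug PeekA(@Text$ + 1)  ; 0 when compiled in Unicode mode
  Debug PeekA(*Ascii + 1)  ; 101, the ASCII code of "e"

  FreeMemory(*Ascii)
EndIf
```

An API that expects a char pointer would see only "H" followed by a null byte if it were handed @Text$ directly, which is why a converted buffer is needed.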

What would change for us:

- Faster building time, less code in our source tree, makefiles much shorter
- Less bugs because of code reduction
- No more unicode switch, so it's easier when sharing code source on the forum, or when developping an user lib (everybody is unicode)
- Makes PB definitely more modern.

We would like to do it for the 5.40 version. What are your thoughts about it ? Is it a deal breaker for you ?

edit: before freaking out, we are just talking about removing the "unicode switch", not all ascii related operations !

The Fantaisie Software Team
Ocean
New User
Posts: 6
Joined: Wed Mar 19, 2014 10:39 am
Location: Europe

Re: Removing ASCII mode from PureBasic

Post by Ocean »

Hi Fred,

Serial devices (GPS receivers, embedded boards, sensors, etc.) still largely communicate using ASCII-based character sets, meaning that we need to push single-byte values to those devices and receive single-byte values from them... With no ASCII character set left in PureBasic, a programmer would need to do the encoding him-/herself...

cheers
Ocean
Fred
Administrator
Posts: 16617
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Removing ASCII mode from PureBasic

Post by Fred »

What about using ToAscii()/ToUTF8(), or prototypes with pseudotypes (p-ascii, p-utf8) when importing these functions? We are talking about PB internals, not about whether PB will be able to handle ASCII data (you will still be able to read ASCII files, etc.). For example, WriteSerialString() in the serial lib already supports the #PB_Ascii flag (http://www.purebasic.com/documentation/ ... tring.html).
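
As a rough sketch of the pseudotype approach, the following imports the Windows API function lstrlenA through a prototype with a p-ascii parameter, so the Unicode string is converted to ASCII at call time. It is Windows-only, and the names ProtoStrLenA and StrLenA are made up for the example.

```purebasic
CompilerIf #PB_Compiler_OS = #PB_OS_Windows

  ; p-ascii tells the compiler to pass the string as an ASCII buffer.
  Prototype.i ProtoStrLenA(Text.p-ascii)

  If OpenLibrary(0, "kernel32.dll")
    Define StrLenA.ProtoStrLenA = GetFunction(0, "lstrlenA")
    If StrLenA
      ; The callee receives single-byte characters and counts 5 of them.
      Debug StrLenA("Hello")  ; 5
    EndIf
    CloseLibrary(0)
  EndIf

CompilerEndIf
```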
Danilo
Addict
Posts: 3037
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Removing ASCII mode from PureBasic

Post by Danilo »

Will functions like PokeS() and PeekS() still be supported? To read and write Ascii/UTF8/Unicode strings from/to memory buffers.
Same for ReadString() and WriteString() file functions, is it still supported to read/write ASCII and UTF files?
What about p-ascii and p-utf8 pseudo types for calling external library functions/APIs?

I always use UNICODE compiler mode, and by using the above functions, I am still able to interact
with functions that work with ASCII/UTF8 data.
For serial communication (mentioned by Ocean), we still have the Byte and Ascii types (.b .Byte .a .Ascii), which we can use
with memory buffers.

If it is only about removing the ASCII compiler mode, and conversion functions (see above) are still
present, I see no problem so far.
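
A small sketch of what this looks like for files, assuming the format flags of WriteStringN() and ReadString() keep working as they do today. The file name is arbitrary.

```purebasic
EnableExplicit

Define File$ = GetTemporaryDirectory() + "ascii_demo.txt"

; Write one line as single-byte ASCII, regardless of the compiler mode.
If CreateFile(0, File$)
  WriteStringN(0, "plain ASCII line", #PB_Ascii)
  CloseFile(0)
EndIf

; Read it back, telling ReadString() which encoding the file uses.
If ReadFile(0, File$)
  Debug ReadString(0, #PB_Ascii)
  CloseFile(0)
EndIf

DeleteFile(File$)
```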
Fred
Administrator
Posts: 16617
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Removing ASCII mode from PureBasic

Post by Fred »

Yes, all of this will indeed be supported; we are just talking about the "Unicode switch" found in the "compiler options" window. Maybe I wasn't explicit enough in the first post. I will edit it.
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Removing ASCII mode from PureBasic

Post by wilbert »

If pseudotypes remain, I think the idea itself is a good one. The timing, however, looks bad to me.
It would have been much better if a decision like this had been made prior to the 5.2x LTS release.
For years to come you will still have to post code to the forum that supports both ASCII and Unicode mode if you want it to be LTS compatible.
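
For code that has to compile in both modes, a sketch along these lines is one option. It relies on the #PB_Compiler_Unicode constant and on SizeOf(Character) to avoid hard-coding the character width.

```purebasic
EnableExplicit

; Branch at compile time on the string mode of the compiler.
CompilerIf #PB_Compiler_Unicode
  Debug "Compiled in Unicode mode"
CompilerElse
  Debug "Compiled in ASCII mode"
CompilerEndIf

; Compute buffer sizes from SizeOf(Character) instead of assuming 1 or 2.
Define Text$ = "Hello"
Define Bytes = (Len(Text$) + 1) * SizeOf(Character)
Debug Bytes  ; size of the native string including its null terminator
```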
Windows (x64)
Raspberry Pi OS (Arm64)
STARGÅTE
Addict
Posts: 2067
Joined: Thu Jan 10, 2008 1:30 pm
Location: Germany, Glienicke

Re: Removing ASCII mode from PureBasic

Post by STARGÅTE »

This is a strong change, but for me it is ok.

If we need a "special" string format, we already have functions like PokeS() and PeekS(), together with StringByteLength(), to write a string as ASCII/UTF-8.
@Ocean: When we communicate with a device, we send data (memory, not a string). So you can read the ASCII data with PeekS().

But: many users need to be careful when they use memory functions like MD5 or Base64 as string functions!
I often see code like this: MD5Fingerprint(@String) or Base64Decoder(@String).
I have written and posted functions that make these functions work with strings in ASCII, UTF-8 and Unicode.
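
A hedged sketch of the safer pattern: convert the string to a defined encoding first, then hash the resulting bytes. It assumes a PB version whose Cipher library provides UseMD5Fingerprint() and Fingerprint() (5.40 or later, as far as I know). On older versions MD5Fingerprint(*Buffer, Size) would take the same buffer.

```purebasic
EnableExplicit

UseMD5Fingerprint()

Define Text$ = "abc"
Define Size = StringByteLength(Text$, #PB_UTF8)
Define *Utf8 = AllocateMemory(Size + 1)  ; +1 for the terminating null

If *Utf8
  ; Hash the UTF-8 bytes, not the internal 16-bit representation.
  PokeS(*Utf8, Text$, -1, #PB_UTF8)
  Debug Fingerprint(*Utf8, Size, #PB_Cipher_MD5)
  FreeMemory(*Utf8)
EndIf
```

Hashing @Text$ directly would give a different digest in ASCII and Unicode mode, because the bytes behind the pointer differ.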

*AsciiBuffer = ToAscii(String$) is a nice feature, but do we have to free the memory ourselves?

Code: Select all

Procedure.i ToAscii(String.s)
  Protected *Buffer = AllocateMemory(StringByteLength(String, #PB_Ascii))
  PokeS(*Buffer, String, -1, #PB_Ascii)
  ProcedureReturn *Buffer
EndProcedure
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and more
Typeface - Sprite-based font include/module
Fred
Administrator
Posts: 16617
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Removing ASCII mode from PureBasic

Post by Fred »

STARGÅTE wrote: *AsciiBuffer = ToAscii(String$) is a nice feature, but do we have to free the memory ourselves?

Code: Select all

Procedure.i ToAscii(String.s)
  Protected *Buffer = AllocateMemory(StringByteLength(String, #PB_Ascii))
  PokeS(*Buffer, String, -1, #PB_Ascii)
  ProcedureReturn *Buffer
EndProcedure
It's not decided yet. BTW, you need to add one byte in AllocateMemory() for the terminating null byte.
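
A minimal sketch of such a helper with the extra byte included, assuming the caller is responsible for freeing the buffer. The name ToAsciiBuffer is hypothetical and is not the planned built-in.

```purebasic
EnableExplicit

; Hypothetical helper: returns a null-terminated ASCII copy of String,
; or 0 if the allocation fails. The caller must call FreeMemory().
Procedure.i ToAsciiBuffer(String.s)
  ; +1 byte so PokeS() has room for the terminating null.
  Protected *Buffer = AllocateMemory(StringByteLength(String, #PB_Ascii) + 1)
  If *Buffer
    PokeS(*Buffer, String, -1, #PB_Ascii)
  EndIf
  ProcedureReturn *Buffer
EndProcedure

Define *Ascii = ToAsciiBuffer("Hello")
If *Ascii
  Debug PeekS(*Ascii, -1, #PB_Ascii)  ; Hello
  FreeMemory(*Ascii)
EndIf
```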
Fred
Administrator
Posts: 16617
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Removing ASCII mode from PureBasic

Post by Fred »

wilbert wrote: If pseudotypes remain, I think the idea itself is a good one. The timing, however, looks bad to me.
It would have been much better if a decision like this had been made prior to the 5.2x LTS release.
For years to come you will still have to post code to the forum that supports both ASCII and Unicode mode if you want it to be LTS compatible.
The new LTS will be out in about 1 year, so by the time 5.40 is out, it will be less than one year away, which seems OK to me.
Ocean
New User
Posts: 6
Joined: Wed Mar 19, 2014 10:39 am
Location: Europe

Re: Removing 'ASCII' switch from PureBasic

Post by Ocean »

If streaming ASCII is still going to be supported by the various libraries, I think your proposal is a good thing.

cheers
Ocean
User_Russian
Addict
Posts: 1443
Joined: Wed Nov 12, 2008 5:01 pm
Location: Russia

Re: Removing ASCII mode from PureBasic

Post by User_Russian »

Fred wrote: We would like to do it for the 5.40 version. What are your thoughts about it ? Is it a deal breaker for you ?
This is a very, very bad idea! :( :?
Some projects may be ASCII-only! For example, a DLL called from other programs, including ones not created in PureBasic. This greatly complicates programming! We would have to abandon strings and use memory buffers!
In addition, the conversion from Unicode to ASCII, and from UTF-8 to ASCII, does not always work correctly, and some time ago I published examples of incorrect conversion!
Fred
Administrator
Posts: 16617
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Removing 'ASCII' switch from PureBasic

Post by Fred »

Could you be more specific about your DLL example? About the bugs, could you point to the topics? We use a standard function (WideCharToMultiByte_()) to do it, so it should be OK.
Little John
Addict
Posts: 4519
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: Removing 'ASCII' switch from PureBasic

Post by Little John »

I've also got a question regarding DLLs:
When I create a DLL which is a plug-in for a 3rd party program, and that 3rd party program expects ASCII strings ... will that still be possible?
luis
Addict
Posts: 3876
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: Removing 'ASCII' switch from PureBasic

Post by luis »

If the only thing changing is that the unicode switch is permanently on (so to speak), all the ASCII flags supported by the various functions are kept intact (so it's still possible to both create and read ASCII buffers), and the pseudotypes are also kept as they are, then it's acceptable for me.

Negatives:

When Unicode is not needed, ASCII-only strings may reduce the size of the final executable a lot, and for some people this may be important.

When you must communicate with something using ASCII only, all is well for single characters as long as you still have the Byte and Ascii data types. But when you have to send strings, you may need to process them first. You will be forced to keep them in Unicode to use the string library functions, since those only understand Unicode, and then convert them (using pseudotypes when possible and manually when it isn't).

EDIT: see UTF8() and Ascii() added in 5.50 to do just so -> http://www.purebasic.fr/english/viewtop ... 14&t=65868
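
A short sketch of that workflow using Ascii() as it exists from PB 5.50 on: the string is processed with the normal string functions and converted only at the boundary. The command text is made up for the example.

```purebasic
EnableExplicit

; Build and process the command as a normal (Unicode) string.
Define Command$ = UCase("at+gmr") + #CR$

; Convert at the last moment. Ascii() allocates a new buffer,
; which has to be released with FreeMemory().
Define *Bytes = Ascii(Command$)
If *Bytes
  ; Number of single-byte characters before the terminating null.
  Debug MemoryStringLength(*Bytes, #PB_Ascii)
  ; *Bytes could now be handed to WriteSerialPortData() or a DLL call.
  FreeMemory(*Bytes)
EndIf
```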

With an ASCII mode, it would all be more straightforward, as it is now.

In the end, when the other half of the software layer you are communicating with expects ASCII, you will sooner or later need to waste some CPU time on conversions and memory on temporary buffers, write some additional code to operate on those transitional buffers, and so on.

So it will make things a little more complicated for PB users and a little more bug-prone for some.

Having the option, I would keep the ascii mode obviously. Having ascii/unicode/x86/x64 was one of the many strong points of PB. If you remove ascii now and maybe then x86 later on... it will certainly lose some appeal for many.
Last edited by luis on Wed Jun 08, 2016 12:39 am, edited 1 time in total.
"Have you tried turning it off and on again ?"
A little PureBasic review
Danilo
Addict
Posts: 3037
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Removing 'ASCII' switch from PureBasic

Post by Danilo »

@Little John:

Code: Select all

ProcedureDLL.i DLLfunc(*input.Ascii)
    If *input
        theString.s = PeekS(*input,-1,#PB_Ascii)
    EndIf
    ProcedureReturn toAscii("returnString")
EndProcedure
It is all already possible. If you enable UNICODE compiler mode, you can still use ASCII functions/libs/DLLs.

The major problem will be for people that still work with ASCII compiler mode.
Either they convert their projects to work with Unicode strings or they stay with 5.3x for that project and
PB 5.4x for new projects only.

It requires the same effort as for people who still always use 32-bit PB and use the .l data type everywhere.
When they move to 64-bit, many things don't work anymore. The day will come when all OSes are
64-bit or higher, and 32-bit PB is not required anymore...