chris319 wrote: The mathematical concepts of Log() and Sqr() are centuries old and are thus old technology. I hope the PureBasic team doesn't abandon those.
c4s wrote: Wow, I have no words. Are you serious? Unfortunately the "arguments" in this thread are getting worse and worse...
chris319 wrote: It's a joke.
Don't worry, I got that. My point was that comparing the removal of ASCII-only compilation with fundamental mathematical concepts is wrong on so many levels that it doesn't make any sense (even as a joke).
Removing 'ASCII' switch from PureBasic
Re: Removing 'ASCII' switch from PureBasic
If any of you native English speakers have any suggestions for the above text, please let me know (via PM). Thanks!
Re: Removing 'ASCII' switch from PureBasic
We all seem to agree on two things:
1. ASCII strings must not be abandoned entirely because some applications require them.
2. Users need control over whether their strings are ASCII or unicode.
Here is a proposed solution:
We have a somewhat similar* situation with character variables where, without a compiler switch, myChar.c will always be unicode. In order to have it be ASCII one would use myChar.a. The programmer still has control over what kind of variable it is.
Why not do something similar with strings?
myString.s and myString$ are always ASCII strings.
myString.n is a unicode string (second letter in uNicode -- ".u" is not available)
It's not ugly like:
*AsciiBuffer = ToAscii(String$)
*UTF8Buffer = ToUTF8(String$)
The above example is ambiguous. String$ could be either ASCII or unicode, "depending". My idea is unambiguous. myString.n is always unicode. myString$ and myString.s are always ASCII. There is no ambiguity.
Without a unicode switch, myChar.c will always be unicode with no ambiguity. That's an advancement. Any code relying on it being compiled as ASCII will have to be rewritten with myChar.a. Any string written as myString$ or myString.s and made unicode at compile time would have to be rewritten as myString.n
*In reality, .a and .c are not character variables; they are numeric variables. You can't do:
myChar.a = "x", or myChar.b = "y" or myChar.c = "z"
But you can do:
myChar.c = 123, assigning a numeric value to a character. Presently, if myChar.c is compiled as unicode you can do myChar.c = 65000. If it is not compiled as unicode, you cannot.
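The footnote about .a and .c being numeric can be tried out with a minimal sketch like the one below. The variable names are only illustrative. It assumes single-quoted character literals, which yield the numeric character code rather than a string.

```purebasic
; .a and .c are numeric types, so they take character codes, not strings.
myAscii.a = 'x'          ; single quotes give the character code (120)
myChar.c  = 'z'          ; character code 122

Debug myAscii            ; shows the numeric value
Debug Chr(myAscii)       ; converts the code back to a one-character string

Debug SizeOf(Ascii)      ; .a is always 1 byte
Debug SizeOf(Character)  ; .c is 1 byte in an ASCII build, 2 bytes in a unicode build
```

A value such as 65000 only fits into myChar.c when SizeOf(Character) is 2, which is the point made above.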
- Enthusiast
- Posts: 542
- Joined: Tue Apr 24, 2012 5:08 pm
- Location: Ontario, Canada
Re: Removing 'ASCII' switch from PureBasic
After reading all the above comments I can't see what all the fuss is about.
In the 50+ years that I've been programming there have always been conflicts between internal and external data representation, and programmers have always had to work around them.
Second generation systems, right up to the early 1970s, typically used 6-bit bytes. For example, the IBM 1401 and 1620 systems used four data bits, a word mark bit, and a check bit. Text required 2 bytes per character and numeric data used 1 byte per decimal digit.
When the IBM 360 arrived in 1964, not only were internal data formats completely different, every single program had to be rewritten. Characters now used 1 byte and internal data formats could be binary or BCD. Programmers like myself wrote programs to simulate the 2nd generation hardware, in order to allow the old executable files to be used while programs were rewritten. Some of these simulators were in use for over 10 years.
Then when Windows, VB, and Unicode arrived, systems had to be converted from DOS to Windows. This required the new programs to correctly read and write Ascii file data, so programmers devised methods for handling the I/O. And it wasn't rocket science.
The latest changes to PB are just part of the evolutionary process that has been happening to software since computers were invented. And having to adapt to such changes is just part of being a programmer. It's not the end of the world. There are simple strategies for making the transition to Unicode, while at the same time handling Ascii data from external sources.
It's not a big issue.
For ten years Caesar ruled with an iron hand, then with a wooden foot, and finally with a piece of string.
~ Spike Milligan
- Didelphodon
- PureBasic Expert
- Posts: 448
- Joined: Sat Dec 18, 2004 11:56 am
- Location: Vienna - Austria
Re: Removing 'ASCII' switch from PureBasic
Just for the sake of completeness ...
If you want to set the font of a Scintilla gadget (via styles) and your program is unicode based, you still have to keep the font name in zero-terminated ASCII.
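One way to do this in a unicode executable might look like the sketch below. The helper name SetScintillaFontAscii and its parameters are made up for illustration. It assumes an already created Scintilla gadget, and that Scintilla keeps its own copy of the name so the temporary buffer can be freed afterwards.

```purebasic
; Hypothetical helper (name and parameters are illustrative, not part of PureBasic).
; Converts FontName$ to a zero-terminated ASCII buffer before handing it to Scintilla.
Procedure SetScintillaFontAscii(Gadget, Style, FontName$)
  Protected *Name = AllocateMemory(StringByteLength(FontName$, #PB_Ascii) + 1) ; +1 for the zero byte
  If *Name
    PokeS(*Name, FontName$, -1, #PB_Ascii)   ; writes the text followed by a terminating zero
    ScintillaSendMessage(Gadget, #SCI_STYLESETFONT, Style, *Name)
    FreeMemory(*Name)                        ; assumption: Scintilla has copied the name
  EndIf
EndProcedure
```

It would be called as, for example, SetScintillaFontAscii(#Editor, #STYLE_DEFAULT, "Courier New"), where #Editor stands for your own gadget number.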
Go, tell it on the mountains.
Re: Removing 'ASCII' switch from PureBasic
Fred wrote: edit: before freaking out, we are just talking about removing the "unicode switch", not all ascii related operations !
The Fantaisie Software Team
I am not sure I understood this. I need all the ASCII control characters for my RS232-controlled machinery.
So, if I understand correctly, there is no problem with that?
Otherwise, will PB 5.22 and the 5.23 LTS still use ASCII?
thanks,
Marc
- every professional was once an amateur - greetings from Pajottenland - Belgium -
PS: sorry for my english I speak flemish ...
- Always Here
- Posts: 6425
- Joined: Fri Oct 23, 2009 2:33 am
- Location: Wales, UK
Re: Removing 'ASCII' switch from PureBasic
Hi Marc
RS-232 is gradually disappearing, even from industrial machines. Apparently RS-232 does not define the character encoding to be used in any case, so that is defined by the device (no doubt you have a handbook describing the device's input and output).
So, given the age of the device design, it sounds like you only need the standard 128 characters? In which case you don't need any conversion functions. Wiki: UTF-8 uses one byte for any ASCII character, all of which have the same code values in both UTF-8 and ASCII encoding.
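The quoted statement about UTF-8 and ASCII can be checked with a small sketch like this one. The command text is made up. It only shows that the ASCII and UTF-8 byte sequences match for 7-bit characters, not that PB strings are stored that way internally.

```purebasic
; Made-up command string using only 7-bit characters, including CR and LF.
Text$ = "START" + Chr(13) + Chr(10)

AsciiSize = StringByteLength(Text$, #PB_Ascii)   ; bytes needed, without the zero terminator
Utf8Size  = StringByteLength(Text$, #PB_UTF8)

*Ascii = AllocateMemory(AsciiSize + 1)
*Utf8  = AllocateMemory(Utf8Size + 1)

If *Ascii And *Utf8
  PokeS(*Ascii, Text$, -1, #PB_Ascii)
  PokeS(*Utf8,  Text$, -1, #PB_UTF8)
  ; Same length and same bytes means the two encodings agree for this text.
  If AsciiSize = Utf8Size And CompareMemory(*Ascii, *Utf8, AsciiSize)
    Debug "ASCII and UTF-8 bytes are identical"
  EndIf
EndIf

If *Ascii : FreeMemory(*Ascii) : EndIf
If *Utf8  : FreeMemory(*Utf8)  : EndIf
```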
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: Removing 'ASCII' switch from PureBasic
@IdeasVacuum
You are right, but...
Fred is talking about unicode and not UTF-8, so he needs conversions.
@Marc
You will have to use the #PB_Ascii flag very often.
But it will work.
You can test this already: simply enable the 'Unicode executable' flag in the compiler options.
Bernd
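As a rough illustration of the conversion meant here, the following sketch writes a string into an ASCII buffer from a program compiled as a unicode executable. The command text is made up. The resulting buffer and size are what a data function such as WriteSerialPortData() would expect.

```purebasic
Command$ = "RUN"   ; made-up device command

; In a unicode executable the string occupies 2 bytes per character in memory,
; while the device expects 1 byte per character.
Debug StringByteLength(Command$, #PB_Unicode)
Debug StringByteLength(Command$, #PB_Ascii)

Size = StringByteLength(Command$, #PB_Ascii)
*Buffer = AllocateMemory(Size + 1)            ; +1 for the zero terminator written by PokeS()
If *Buffer
  PokeS(*Buffer, Command$, -1, #PB_Ascii)     ; convert while writing
  ; *Buffer and Size could now be passed to a data function for the serial port.
  Debug PeekS(*Buffer, -1, #PB_Ascii)         ; convert back while reading
  FreeMemory(*Buffer)
EndIf
```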
Re: Removing 'ASCII' switch from PureBasic
The ability to work with data buffers is still there, for input and output. Memory buffers can contain
any data you want, including Byte data and Ascii characters.
Code: Select all
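Structure AsciiArr : a.a[0] : EndStructure     ; lets a pointer index single bytes

Length = 1024
*Buffer.Ascii = AllocateMemory(Length)         ; AllocateMemory() returns zero-filled memory
*pointer.AsciiArr = *Buffer

If *pointer
  ; write single ASCII bytes directly
  *pointer\a[0] = 'A'
  *pointer\a[1] = 'B'
  *pointer\a[2] = 'C'
  Debug PeekS(*pointer, -1, #PB_Ascii)         ; read them back as a string

  ; write a whole string as ASCII, then inspect it byte by byte
  PokeS(*pointer, "Hello World", -1, #PB_Ascii)
  For i = 0 To 12                              ; 11 characters, then zero bytes
    Debug " Dec: " + *pointer\a[i] +
          " Hex: " + Hex( *pointer\a[i] ) +
          " Chr: " + Chr( *pointer\a[i] )
  Next
  FreeMemory(*Buffer)
EndIf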
CompilerIf 0
;--------------------------------------------------------------------------------------------------------------
; Data functions                                    ; String functions
;--------------------------------------------------------------------------------------------------------------
;                                                   ;
;                                                   ;
; Serial Port                                       ;
;                                                   ;
WriteSerialPortData(#SerialPort, *Buffer, Length)   ; WriteSerialPortString(#SerialPort, String$ [, Format])
ReadSerialPortData (#SerialPort, *Buffer, Length)   ;
;--------------------------------------------------------------------------------------------------------------
;                                                   ;
;                                                   ;
; Process                                           ;
;                                                   ;
WriteProgramData(Program, *Buffer, Length)          ; PB 5.2x: WriteProgramString (Program, String$) -- optional [, Format] missing (use data functions)
                                                    ;          WriteProgramStringN(Program, String$) -- optional [, Format] missing (use data functions)
                                                    ; PB 5.3+: WriteProgramString (Program, String$ [, Flags])
                                                    ;          WriteProgramStringN(Program, String$ [, Flags])
                                                    ;
ReadProgramData (Program, *Buffer, Length)          ; PB 5.2x: ReadProgramString(Program) -- optional [, Format] missing (use data functions)
                                                    ; PB 5.3+: ReadProgramString(Program [, Flags])
;--------------------------------------------------------------------------------------------------------------
;                                                   ;
; Network                                           ;
;                                                   ;
SendNetworkData (Connection, *Buffer, Length)       ; SendNetworkString(Connection, String$ [, Format])
ReceiveNetworkData(Connection, *Buffer, Length)     ;
;--------------------------------------------------------------------------------------------------------------
;                                                   ;
; File                                              ;
;                                                   ;
WriteData(#File, *Buffer, Length)                   ; WriteString (#File, Text$ [, Format])
                                                    ; WriteStringN(#File, Text$ [, Format])
ReadData (#File, *Buffer, Length)                   ; ReadString (#File [, Flags [, Length]])
;--------------------------------------------------------------------------------------------------------------
;                                                   ;
; Console                                           ;
;                                                   ;
WriteConsoleData(*Buffer, Length)                   ; Print (Text$) -- optional [, Format] missing (use data functions)
                                                    ; PrintN(Text$) -- optional [, Format] missing (use data functions)
ReadConsoleData (*Buffer, Length)                   ; String$ = Input() -- optional [, Format] missing (use data functions)
;--------------------------------------------------------------------------------------------------------------
CompilerEndIf
Last edited by Danilo on Mon Aug 11, 2014 6:40 pm, edited 1 time in total.
Re: Removing 'ASCII' switch from PureBasic
Danilo is right,
I use PB to write programs to control machines using Rs232, Rs485, or Rs422 and I compile them as "unicode executable" since I have interfaces in different languages.
I too save log files for microcontrollers in 8 bit; I can confirm that they all work fine if you use the correct parameters
Re: Removing 'ASCII' switch from PureBasic
@Danilo: since 5.30, Read/WriteProgramString() now have the format parameters as well
- Addict
- Posts: 4527
- Joined: Thu Jun 07, 2007 3:25 pm
- Location: Berlin, Germany
Re: Removing 'ASCII' switch from PureBasic
BorisTheOld wrote: It's not a big issue.
Generally speaking, I agree with what you wrote.
However, there are still some bugs when compiling in Unicode mode.
These bugs should be fixed before ASCII mode is dropped, otherwise PB users will not be amused.
Re: Removing 'ASCII' switch from PureBasic
Fred wrote: @Danilo: since 5.30, Read/WriteProgramString() now have the format parameters as well
Thanks, changed it in the table. (I'm still using LTS, of course with Unicode)
- Addict
- Posts: 4527
- Joined: Thu Jun 07, 2007 3:25 pm
- Location: Berlin, Germany
Re: Removing 'ASCII' switch from PureBasic
//edit 2016-06-08:
Transmogrified the code, and moved it to the "Tricks 'n' Tips" section
http://www.purebasic.fr/english/viewtop ... 12&t=65905
Last edited by Little John on Wed Jun 08, 2016 9:34 pm, edited 1 time in total.
Re: Removing 'ASCII' switch from PureBasic
Little John wrote: BorisTheOld wrote: It's not a big issue.
Generally speaking, I agree with what you wrote.
However, there are still some bugs when compiling in Unicode mode.
Speaking of which, I am having an issue with my PCI-Z software.
After activating Unicode mode and changing all PeekS() calls to use the proper #PB_Ascii parameter, the software works fine when run in Debug mode.
However, when I compile the executable and run it, I get this.
Code: Select all
Problem signature:
Problem Event Name: APPCRASH
Application Name: PCI-Z.exe
Application Version: 1.3.0.1
Application Timestamp: 53e9d317
Fault Module Name: MSVCRT.dll
Fault Module Version: 7.0.7601.17744
Fault Module Timestamp: 4eeaf722
Exception Code: c0000005
Exception Offset: 0001d33d
OS Version: 6.1.7601.2.1.0.256.48
Locale ID: 1050
Additional Information 1: 0a9e
Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789
Read our privacy statement online:
http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409
If the online privacy statement is not available, please read our privacy statement offline:
C:\Windows\system32\en-US\erofflps.txt
I am aware that this may belong in the "Problem" section of the forum, but since this problem seems to be directly related to Unicode, perhaps someone has an insight? Windows 7 and PB 5.30 x86, but it also happens with the x64 version.
Re: Removing 'ASCII' switch from PureBasic
What about things like existing databases? (SQLite, etc)