Removing 'ASCII' switch from PureBasic

c4s
Addict
Posts: 1981
Joined: Thu Nov 01, 2007 5:37 pm
Location: Germany

Re: Removing 'ASCII' switch from PureBasic

Post by c4s »

chris319 wrote:
c4s wrote:
chris319 wrote:The mathematical concepts of Log() and Sqr() are centuries old and are thus old technology. I hope the PureBasic team doesn't abandon those.
Wow, I have no words. Are you serious? Unfortunately the "arguments" in this thread are getting worse and worse...
It's a joke.
Don't worry, I got that. ;) My point was that comparing the removal of Ascii-only compilation with fundamental mathematical concepts is wrong on so many levels that it doesn't make any sense (even as a joke).
If any of you native English speakers have any suggestions for the above text, please let me know (via PM). Thanks!
chris319
Enthusiast
Posts: 782
Joined: Mon Oct 24, 2005 1:05 pm

Re: Removing 'ASCII' switch from PureBasic

Post by chris319 »

We all seem to agree on two things:

1. ASCII strings must not be abandoned entirely because some applications require them.

2. Users need control over whether their strings are ASCII or Unicode.

Here is a proposed solution:

We have a somewhat similar* situation with character variables: without a compiler switch, myChar.c will always be Unicode. To make it ASCII, one would use myChar.a. The programmer still controls what kind of variable it is.

Why not do something similar with strings?

myString.s and myString$ are always ASCII strings.

myString.n is a Unicode string (the second letter in uNicode; ".u" is not available).

It's not ugly like:
*AsciiBuffer = ToAscii(String$)
*UTF8Buffer = ToUTF8(String$)

The above example is ambiguous. String$ could be either ASCII or Unicode, depending on the compiler setting. My idea is unambiguous: myString.n is always Unicode, while myString$ and myString.s are always ASCII.

Without a Unicode switch, myChar.c will always be Unicode, with no ambiguity. That's an advancement. Any code relying on it being compiled as ASCII will have to be rewritten with myChar.a. Likewise, any string written as myString$ or myString.s and made Unicode at compile time would have to be rewritten as myString.n.

*In reality, .a and .c are not character variables; they are numeric variables. You can't do:

myChar.a = "x", or myChar.b = "y" or myChar.c = "z"

But you can do:

myChar.c = 123, assigning a numeric value to a character. Presently, if myChar.c is compiled as Unicode you can do myChar.c = 65000. If it is not compiled as Unicode, you cannot, because the variable is then only one byte wide.
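
To illustrate the footnote, here is a minimal sketch. It assumes the code is compiled as a Unicode executable, and the variable names are made up for the example. Note that a character literal in single quotes (as opposed to a string in double quotes) can be assigned, because it is just a number:

```purebasic
; Sketch only: assumes a Unicode executable; variable names are illustrative.
Define asciiChar.a = 'x'   ; single quotes give the numeric character code
Define wideChar.c  = 8364  ; code point of the euro sign, too large for one byte

Debug asciiChar            ; 120
Debug Chr(asciiChar)       ; x
Debug SizeOf(Ascii)        ; 1
Debug SizeOf(Character)    ; 2 in a Unicode executable
Debug Chr(wideChar)        ; the euro sign
```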
BorisTheOld
Enthusiast
Posts: 542
Joined: Tue Apr 24, 2012 5:08 pm
Location: Ontario, Canada

Re: Removing 'ASCII' switch from PureBasic

Post by BorisTheOld »

After reading all the above comments I can't see what all the fuss is about.

In the 50+ years that I've been programming there have always been conflicts between internal and external data representation, and programmers have always had to work around them.

Second generation systems, right up to the early 1970s, typically used 6-bit bytes. For example, the IBM 1401 and 1620 systems used four data bits, a word mark bit, and a check bit. Text required 2 bytes per character and numeric data used 1 byte per decimal digit.

When the IBM 360 arrived in 1964, not only were internal data formats completely different, every single program had to be rewritten. Characters now used 1 byte and internal data formats could be binary or BCD. Programmers like myself wrote programs to simulate the 2nd generation hardware, in order to allow the old executable files to be used while programs were rewritten. Some of these simulators were in use for over 10 years.

Then when Windows, VB, and Unicode arrived, systems had to be converted from DOS to Windows. This required the new programs to correctly read and write Ascii file data, so programmers devised methods for handling the I/O. And it wasn't rocket science.

The latest changes to PB are just part of the evolutionary process that has been happening to software since computers were invented. And having to adapt to such changes is just part of being a programmer. It's not the end of the world. There are simple strategies for making the transition to Unicode, while at the same time handling Ascii data from external sources.

It's not a big issue.
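
One such strategy, sketched under assumptions: keep strings as Unicode inside the program and name the external encoding only at the point of I/O. The file name and the two sample lines below are invented for the example; the format flag on WriteStringN() and ReadString() does the conversion.

```purebasic
; Sketch only: the file name and its contents are hypothetical.
FileName$ = GetTemporaryDirectory() + "ascii_demo.txt"

; Write two lines as 8-bit ASCII, whatever the program's internal string format is.
If CreateFile(0, FileName$)
  WriteStringN(0, "TEMP=21", #PB_Ascii)
  WriteStringN(0, "STATE=OK", #PB_Ascii)
  CloseFile(0)
EndIf

; Read them back, telling ReadString() that the bytes on disk are ASCII.
If ReadFile(0, FileName$)
  While Eof(0) = 0
    Debug ReadString(0, #PB_Ascii)
  Wend
  CloseFile(0)
EndIf

DeleteFile(FileName$)
```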
For ten years Caesar ruled with an iron hand, then with a wooden foot, and finally with a piece of string.
~ Spike Milligan
Didelphodon
PureBasic Expert
Posts: 448
Joined: Sat Dec 18, 2004 11:56 am
Location: Vienna - Austria

Re: Removing 'ASCII' switch from PureBasic

Post by Didelphodon »

Just for the sake of completeness ...
If you want to set the font of a Scintilla gadget (via styles) and your program is Unicode-based, you still have to pass the font name as zero-terminated ASCII.
Go, tell it on the mountains.
marc_256
Enthusiast
Posts: 745
Joined: Thu May 06, 2010 10:16 am
Location: Belgium

Re: Removing 'ASCII' switch from PureBasic

Post by marc_256 »

Fred wrote: edit: before freaking out, we are just talking about removing the "unicode switch", not all ascii related operations !

The Fantaisie Software Team
I'm not sure I understood this. I need all the ASCII controls for my RS232-controlled machinery.
So, if I understand correctly, there is no problem with that?
Otherwise, will PB 5.22 and 5.23 LTS still use ASCII?

thanks,
Marc
- every professional was once an amateur - greetings from Pajottenland - Belgium -
PS: sorry for my english I speak flemish ...
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: Removing 'ASCII' switch from PureBasic

Post by IdeasVacuum »

Hi Marc

.....RS-232 is gradually disappearing, even from industrial machines. Apparently RS-232 does not in any case define the character encoding to be used, so that is defined by the device (no doubt you have a handbook for the device in-out).

So, given the age of the device design, sounds like you only need the standard 128 characters? In which case you don't need any conversion functions. Wiki: UTF-8 uses one byte for any ASCII character, all of which have the same code values in both UTF-8 and ASCII encoding.
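
A quick way to check the Wiki statement, as a sketch that assumes a Unicode executable and uses a made-up command string: for text limited to the standard 128 characters, the ASCII and UTF-8 byte sequences are identical, while the in-memory Unicode form is twice as long.

```purebasic
; Sketch only: "START 42" is a hypothetical device command.
Text$ = "START 42"                           ; 7-bit ASCII characters only

Debug StringByteLength(Text$, #PB_Ascii)     ; 8
Debug StringByteLength(Text$, #PB_UTF8)      ; 8
Debug StringByteLength(Text$, #PB_Unicode)   ; 16

; Encode the same text both ways (+1 byte for the terminating zero).
*asciiCopy = AllocateMemory(StringByteLength(Text$, #PB_Ascii) + 1)
*utf8Copy  = AllocateMemory(StringByteLength(Text$, #PB_UTF8) + 1)

If *asciiCopy And *utf8Copy
  PokeS(*asciiCopy, Text$, -1, #PB_Ascii)
  PokeS(*utf8Copy, Text$, -1, #PB_UTF8)

  ; CompareMemory() returns nonzero when both areas hold the same bytes.
  If CompareMemory(*asciiCopy, *utf8Copy, StringByteLength(Text$, #PB_Ascii))
    Debug "ASCII and UTF-8 bytes are identical"
  EndIf

  FreeMemory(*asciiCopy)
  FreeMemory(*utf8Copy)
EndIf
```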
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
infratec
Always Here
Posts: 6873
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: Removing 'ASCII' switch from PureBasic

Post by infratec »

@IdeasVacuum
You are right, but...
Fred is talking about Unicode and not UTF-8, so conversions are needed.

@Marc
You will have to use the #PB_Ascii flag very often.
But it will work.
You can test this already: simply enable the 'Unicode executable' flag in the compiler options. ;)
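
For an RS232 device this could look roughly like the sketch below. The port name "COM1", the 9600 8N1 settings, the "STATUS?" command and the 200 ms wait are all assumptions; take the real values from the device handbook.

```purebasic
; Sketch only: port name, line settings and command are hypothetical.
If OpenSerialPort(0, "COM1", 9600, #PB_SerialPort_NoParity, 8, 1, #PB_SerialPort_NoHandshake, 1024, 1024)

  ; Command$ is Unicode in memory; the flag makes it go out as 8-bit ASCII.
  Command$ = "STATUS?" + Chr(13)
  WriteSerialPortString(0, Command$, #PB_Ascii)

  Delay(200)                                   ; crude wait for the reply
  Available = AvailableSerialPortInput(0)
  If Available > 0
    *reply = AllocateMemory(Available)
    If *reply
      Received = ReadSerialPortData(0, *reply, Available)
      Debug PeekS(*reply, Received, #PB_Ascii) ; decode the raw bytes as ASCII
      FreeMemory(*reply)
    EndIf
  EndIf

  CloseSerialPort(0)
Else
  Debug "Could not open the serial port"
EndIf
```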

Bernd
Danilo
Addict
Posts: 3037
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Removing 'ASCII' switch from PureBasic

Post by Danilo »

The ability to work with data buffers is still there, for input and output. Memory buffers can contain
any data you want, including Byte data and Ascii characters.


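; Structure with an open-ended array of unsigned bytes, used to index into a buffer
Structure AsciiArr : a.a[0] : EndStructure

Length = 1024
*Buffer.Ascii = AllocateMemory(Length)   ; AllocateMemory() returns zero-filled memory

*pointer.AsciiArr = *Buffer
If *pointer
    ; Write three bytes directly; the zero-filled rest terminates the string
    *pointer\a[0] = 'A'
    *pointer\a[1] = 'B'
    *pointer\a[2] = 'C'
    
    Debug PeekS(*pointer, -1, #PB_Ascii)   ; read the buffer as an ASCII string
    
    ; Store a string in the buffer as 8-bit ASCII, including the terminating zero
    PokeS(*pointer, "Hello World", -1, #PB_Ascii)
    
    ; Dump the 11 characters plus the two zero bytes that follow them
    For i = 0 To 12
        Debug " Dec: " + *pointer\a[i] +
              " Hex: " + Hex( *pointer\a[i] ) +
              " Chr: " + Chr( *pointer\a[i] )
    Next
    
    FreeMemory(*Buffer)
EndIf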

CompilerIf 0

;--------------------------------------------------------------------------------------------------------------
;  Data functions                                      ;  String functions
;--------------------------------------------------------------------------------------------------------------
;                                                      ;
;                                                      ;
; Serial Port                                          ;
;                                                      ;
WriteSerialPortData(#SerialPort, *Buffer, Length)      ; WriteSerialPortString(#SerialPort, String$ [, Format])
ReadSerialPortData (#SerialPort, *Buffer, Length)      ;
;--------------------------------------------------------------------------------------------------------------
;                                                      ;
;                                                      ;
; Process                                              ;
;                                                      ;
WriteProgramData(Program, *Buffer, Length)             ; PB 5.2x: WriteProgramString (Program, String$)   --  optional [, Format] missing (use data functions)
                                                       ;          WriteProgramStringN(Program, String$)   --  optional [, Format] missing (use data functions)
                                                       ; PB 5.3+: WriteProgramString (Program, String$ [, Flags])
                                                       ;          WriteProgramStringN(Program, String$ [, Flags])
                                                       ;                                                       
ReadProgramData (Program, *Buffer, Length)             ; PB 5.2x: ReadProgramString(Program)             --  optional [, Format] missing (use data functions)
                                                       ; PB 5.3+: ReadProgramString(Program [, Flags])
;--------------------------------------------------------------------------------------------------------------
;                                                      ;
; Network                                              ;
;                                                      ;
SendNetworkData   (Connection, *Buffer, Length)        ; SendNetworkString(Connection, String$ [, Format])
ReceiveNetworkData(Connection, *Buffer, Length)        ;
;--------------------------------------------------------------------------------------------------------------
;                                                      ;
; File                                                 ;
;                                                      ;
WriteData(#File, *Buffer, Length)                      ; WriteString (#File, Text$ [, Format])
                                                       ; WriteStringN(#File, Text$ [, Format])
ReadData (#File, *Buffer, Length)                      ; ReadString  (#File [, Flags [, Length]])
;--------------------------------------------------------------------------------------------------------------
;                                                      ;
; Console                                              ;
;                                                      ;
WriteConsoleData(*Buffer, Length)                      ; Print (Text$)                          --  optional [, Format] missing (use data functions)
                                                       ; PrintN(Text$)                          --  optional [, Format] missing (use data functions)
ReadConsoleData (*Buffer, Length)                      ; String$ = Input()                      --  optional [, Format] missing (use data functions)
;--------------------------------------------------------------------------------------------------------------

CompilerEndIf
Last edited by Danilo on Mon Aug 11, 2014 6:40 pm, edited 1 time in total.
luciano
Enthusiast
Posts: 151
Joined: Wed Mar 09, 2011 8:25 pm

Re: Removing 'ASCII' switch from PureBasic

Post by luciano »

Danilo is right.
I use PB to write programs that control machines over RS232, RS485, or RS422, and I compile them as "Unicode executable" because I have interfaces in different languages.
I also save log files for microcontrollers in 8-bit format; I can confirm that it all works fine if you use the correct parameters.
Fred
Administrator
Posts: 16681
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Removing 'ASCII' switch from PureBasic

Post by Fred »

@Danilo: since 5.30, Read/WriteProgramString() now have the format parameters as well
Little John
Addict
Posts: 4527
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: Removing 'ASCII' switch from PureBasic

Post by Little John »

BorisTheOld wrote:It's not a big issue.
Generally speaking, I agree with what you wrote.
However, there are still some bugs when compiling in Unicode mode.
These bugs should be fixed before ASCII mode is dropped, otherwise PB users will not be amused.
Danilo
Addict
Posts: 3037
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Removing 'ASCII' switch from PureBasic

Post by Danilo »

Fred wrote:@Danilo: since 5.30, Read/WriteProgramString() now have the format parameters as well
Thanks, changed it in the table. (I'm still using LTS, of course with Unicode ;))
Little John
Addict
Posts: 4527
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: Removing 'ASCII' switch from PureBasic

Post by Little John »

//edit 2016-06-08:
Transmogrified the code, and moved it to the "Tricks 'n' Tips" section
http://www.purebasic.fr/english/viewtop ... 12&t=65905
Last edited by Little John on Wed Jun 08, 2016 9:34 pm, edited 1 time in total.
bbanelli
Enthusiast
Posts: 543
Joined: Tue May 28, 2013 10:51 pm
Location: Europe

Re: Removing 'ASCII' switch from PureBasic

Post by bbanelli »

Little John wrote:
BorisTheOld wrote:It's not a big issue.
Generally speaking, I agree with what you wrote.
However, there are still some bugs when compiling in Unicode mode.
Speaking of which, I am having an issue with my PCI-Z software.

After activating Unicode mode and giving all PeekS calls the proper #PB_Ascii parameter, the software works fine when compiled in Debug mode.

However, when I compile the executable and run it, I get this.


Problem signature:
  Problem Event Name:	APPCRASH
  Application Name:	PCI-Z.exe
  Application Version:	1.3.0.1
  Application Timestamp:	53e9d317
  Fault Module Name:	MSVCRT.dll
  Fault Module Version:	7.0.7601.17744
  Fault Module Timestamp:	4eeaf722
  Exception Code:	c0000005
  Exception Offset:	0001d33d
  OS Version:	6.1.7601.2.1.0.256.48
  Locale ID:	1050
  Additional Information 1:	0a9e
  Additional Information 2:	0a9e372d3b4ad19135b953a78882e789
  Additional Information 3:	0a9e
  Additional Information 4:	0a9e372d3b4ad19135b953a78882e789

Read our privacy statement online:
  http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409

If the online privacy statement is not available, please read our privacy statement offline:
  C:\Windows\system32\en-US\erofflps.txt
As I have both CLI and GUI modes in this software, I am sure that the bug isn't in the part that accesses PCI devices or gets the proper IDs from the compressed database, since CLI output works properly, but rather somewhere in the GUI part itself.

I am aware that this may be something for the "Problem" section of the forum, but since this problem seems to be directly related to Unicode, perhaps someone has an insight? Windows 7 and PB 5.30 x86, but it also happens with the x64 version.
"If you lie to the compiler, it will get its revenge."
Henry Spencer
https://www.pci-z.com/
jassing
Addict
Posts: 1774
Joined: Wed Feb 17, 2010 12:00 am

Re: Removing 'ASCII' switch from PureBasic

Post by jassing »

What about things like existing databases? (SQLite, etc)