PB 5.45 Data.u in ASCII mode

Post by fryquez »

Can we still report bugs for PB 5.45 LTS?
It's the last one that supports ASCII mode and this problem only occurs in ASCII mode.

Code: Select all

DataSection
  uText:
  Data.u "UNICODE"
  Data.u 'U', 'N', 'I', 'C', 'O', 'D', 'E', 0
EndDataSection
This gets translated to:

Code: Select all

l_utext:
 dw "UNICODE",0
 dw 85,78,73,67,79,68,69,0
which raises a FASM error.

The correct translation should be:

Code: Select all

l_utext:
 du "UNICODE",0
 dw 85,78,73,67,79,68,69,0

Post by Little John »

fryquez wrote:

Code: Select all

DataSection
  uText:
  Data.u "UNICODE"
  Data.u 'U', 'N', 'I', 'C', 'O', 'D', 'E', 0
EndDataSection
Shouldn't the 3rd quoted line read
Data.s "UNICODE"
?

Post by fryquez »

Nope, you have 3 cases.

Data.a - Ascii string or character array
Data.u - Unicode string or character array
Data.s - depends on compiler mode

Post by Demivec »

fryquez wrote:Nope, you have 3 cases.

Data.a - Ascii string or character array
Data.u - Unicode string or character array
Data.s - depends on compiler mode
Data.a and Data.u are for numeric values only, not strings.

PB only permits encoding string literals in the same encoding as the compiler mode, either ASCII or Unicode but not both.
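
For illustration, a minimal sketch of the numeric route, assuming the goal is a UTF-16 string that still assembles in ASCII mode: each character is stored as a 16-bit Data.u value and read back with PeekS(). The label name is taken from the first post; this is a sketch, not an official recommendation.

```purebasic
; Sketch only: store UTF-16 code units as numeric Data.u values.
; This is the form that translated to a valid "dw" line in the first post.
DataSection
  uText:
  Data.u 'U', 'N', 'I', 'C', 'O', 'D', 'E', 0   ; the trailing 0 terminates the string
EndDataSection

; ?uText is the address of the label; #PB_Unicode tells PeekS the buffer is UTF-16.
Debug PeekS(?uText, -1, #PB_Unicode)
```

The drawback is readability: every character has to be spelled out by hand.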

Post by fryquez »

You're wrong about that; check out the three options in Unicode mode.

Post by Little John »

fryquez wrote:Data.a - Ascii string or character array
Data.u - Unicode string or character array
This does not correspond with reality. As Demivec wrote, Data.a and Data.u are for numeric values only.
This is not a bug in PureBasic but a bug in your code, as I already pointed out in my first post here.

Just read the documentation:
Data
Variables and Types

Post by bosker »

Well, I have some sympathy for fryquez because it's quite likely that you may need ASCII strings in a Unicode program or Unicode strings in an ASCII program. I certainly do.

Using...
Data.u "string"
Data.a "string"
.. is the only way of getting those. I've been using data.a "string" as a stopgap for a year (see later).

In Unicode programs (5.50) these work exactly as required but, as noted, Data.u doesn't work in ASCII programs.
Like many things in PureBasic, the manual doesn't tell everything. For example, I don't think the Data section mentions that you can write data.i @"string" (but you can).

The problem here is that it's not consistent and I guess it never will be with the removal of ASCII support. However, I think it is a bug in Purebasic because it's emitting incorrect assembler - it's FASM that reports an error.

@fryquez - your (not ideal) workaround for static strings...

Instead of:

Code: Select all

uText:
    Data.u "UNICODE"
write:

Code: Select all

utext:
    ! du 'UNICODE',0
The removal of ASCII support is a complete show-stopper for me and I'll be starting translation of my major customer project to another language in January, after 7 years of supporting it in Purebasic. A lot of unproductive work. Pity.

Post by skywalk »

@Bosker - I too would appreciate Ascii Strings in the current PB flow, but I made do with

Code: Select all

*p_ascii_string = Ascii("unicode_string")
to emit Ascii strings when required.
What is the showstopper that does not work?
Or is it the pain of handling mem pointers and FreeMemory()?
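
A minimal sketch of that approach, assuming PB 5.50 or later, where Ascii() is available. SomeDllCall is a hypothetical placeholder, not a real function.

```purebasic
; Ascii() returns a newly allocated, zero-terminated ASCII copy of the string.
*cmd = Ascii("unicode_string")
If *cmd
  ; SomeDllCall(*cmd)               ; hypothetical: pass the ASCII buffer to the DLL here
  Debug PeekS(*cmd, -1, #PB_Ascii)  ; read the buffer back to check its contents
  FreeMemory(*cmd)                  ; the caller owns the buffer and has to free it
EndIf
```

The buffer stays valid until FreeMemory() is called, so its lifetime is entirely up to the caller.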
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum

Post by bosker »

Hi Skywalk
The project is (probably) quite unusual...

A brief synopsis...
I have a bunch of programs that manipulate streams of (ASCII) data.
The actual nitty-gritty manipulation is done by a collection of (16?) proprietary DLLs.
I only have the interfaces to the customer DLLs and they all use ASCII (and as far as I can tell will always do so).

Much of the detailed processing is controlled by command strings (ASCII) passed via the DLL interfaces.
The problem with these interfaces is that when they get any string parameter, they save the address (possibly for much later use), not a string copy. (Written in C I suspect)

So this works well for strings that are held static in each program string space but collapses spectacularly when an inline conversion from Unicode to ASCII is done because the converted string is temporary. Did that make sense?
The net result of all this is that I cannot make the damned thing work with a non-ASCII compiler without more messing about than is sensible.
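
To illustrate the lifetime problem, here is a small sketch in which a PureBasic procedure stands in for the DLL. FakeDllSetCommand and *saved are invented for this example; the real interfaces are not shown in this thread.

```purebasic
Global *saved                          ; the address the pretend DLL keeps for later

Procedure FakeDllSetCommand(*text)     ; hypothetical stand-in for a DLL call
  *saved = *text                       ; stores the pointer, does not copy the text
EndProcedure

*tmp = AllocateMemory(32)              ; plays the role of a temporary conversion buffer
If *tmp
  PokeS(*tmp, "FIRST", -1, #PB_Ascii)
  FakeDllSetCommand(*tmp)
  PokeS(*tmp, "SECOND", -1, #PB_Ascii) ; the buffer is reused for the next command
  Debug PeekS(*saved, -1, #PB_Ascii)   ; shows SECOND: the first command is gone
  FreeMemory(*tmp)
EndIf
```

With a buffer that is freed rather than reused, the saved address would point at released memory, which is worse still.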

As an added complication, there's a (roughly) monthly update of 'signature' strings that have to be built into the software when they arrive (yes, ASCII again, about 5-6000 lines).

I have tried what you suggest and the memory management pain is indeed great. I've even trialled predeclaring all the interface strings using Data.a and passing the static addresses to the DLLs (for one very small section). This does actually work but makes the whole thing very hard work and error-prone. Without the command strings inline, it loses all clarity and I'd never get anything through a review again.

I've been investigating all this for about the last year and on balance, I've come to the conclusion it's better to remake it all and keep it sensible rather than trying to hack round the lack of ASCII support. My current plan is to continue to maintain using 5.45 for the next year (approx) while I change languages (and gui) in the background.

Sorry, this is a lot longer post than I intended but I guess it's therapeutic. Thanks for your interest.
If you have any ideas / magic wand, please let me know. ;-)

Post by Little John »

bosker wrote:Well, I have some sympathy for fryquez because it's quite likely that you may need ASCII strings in a Unicode program or Unicode strings in an ASCII program. I certainly do.
AFAIR some time ago someone made a feature request for that.
bosker wrote:Using...
Data.u "string"
Data.a "string"
.. is the only way of getting those. I've been using data.a "string" as a stopgap for a year (see later).

In Unicode programs (5.50) these work exactly as required but, as noted, Data.u doesn't work in ASCII programs.
Like many things in PureBasic, the manual doesn't tell everything. For example, I don't think the Data section mentions that you can write data.i @"string" (but you can).
Some undocumented things might work (for a while) by chance. However, people use those "features" at their own risk and shouldn't be surprised when they stop working.

Post by skywalk »

bosker wrote:Hi Skywalk
The project is (probably) quite unusual...

A brief synopsis...
I have a bunch of programs that manipulate streams of (ASCII) data.
The actual nitty-gritty manipulation is done by a collection of (16?) proprietary DLLs.
I only have the interfaces to the customer DLLs and they all use ASCII (and as far as I can tell will always do so).

Much of the detailed processing is controlled by command strings (ASCII) passed via the DLL interfaces.
The problem with these interfaces is that when they get any string parameter, they save the address (possibly for much later use), not a string copy. (Written in C I suspect)

So this works well for strings that are held static in each program string space but collapses spectacularly when an inline conversion from Unicode to ASCII is done because the converted string is temporary.
Yeah, I am never surprised by the number of custom apps that must be written. Work is plentiful. But I am confused: why must an ASCII parameter you pass to a DLL remain untouched? If the DLL needs time to complete its task, just put the parameter strings in a command array of your own design and let them persist.
The amount and frequency of strings does not sound like a gene sequencer. This should be doable in some fashion with PB.

Post by fryquez »

@bosker thanks for understanding the subject.

Yes I can use inline assembly, but a string would be much more comfortable.
Especially ones like ~"\"Quoted Line1\"\nUnquoted Line2".
Well seems this is another thing I should fix in my preprocessor.


The worst part is actually what's going on in this forum.
We have no bug tracker, so we have to post here.

But what happens? Some people who do not understand the issue have to butt in.
Another one silently moves the topic.

Post by bosker »

@Little John
Thanks for the response...
I admit I thought the data.u / data.a WAS the way to get Unicode / Ascii strings in the opposite kind of exe.
The data.i @"string" was specifically added some years ago (by Timo I think) but has never shown up in the docs.
I agree with your caution re: using undocumented features but I have other solutions standing by if these things stopped working.
The alternatives rely on using asm data and macros.

@skywalk
When an analysis is being set up, there are lots of commands shovelled across the DLL interfaces to get everything ready before the final 'run' is issued. There's also a data logging feature in there in addition to the sequence analysis and manipulation.
Whatever the reason, the "string address" thing is part of the original req so I have to deal with it.

You're right that this could still be done in PB, but I haven't come up with a scheme I (or my client) can live with in the last 6 months, so I think straightforward conversion is less work and more likely to end in a happy place.
Converting PB to (say) C11 isn't too difficult (apart from the GUI).

You are also correct - it's not a gene sequencer. ;-)

@fryquez
I agree that doing the strings in PB is a lot easier, but I keep the assembler options as cover in case something stops working.
Unfortunately, I didn't cover the loss of ASCII support in the compiler.

Post by skywalk »

Why can't you store the ASCII commands in a byte array (multiple arrays or multi-dimensional)? Those would persist. And you can manipulate and sort them and other stuff. I also store my logs as ANSI text, not UTF-8.
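
A rough sketch of what such a persistent store could look like, assuming a fixed maximum command length of 63 characters. The structure name, the array size and the command texts are made up for illustration.

```purebasic
Structure AsciiCmd
  text.a[64]                   ; fixed-size byte buffer: 63 characters plus terminator
EndStructure

Global Dim Cmd.AsciiCmd(2)     ; elements 0..2 exist for the lifetime of the program

; Write each command once as ASCII; 63 caps the characters written before the terminator.
PokeS(@Cmd(0)\text[0], "OPEN stream1", 63, #PB_Ascii)
PokeS(@Cmd(1)\text[0], "LOG on", 63, #PB_Ascii)

; @Cmd(n)\text[0] is the address that would be handed to the DLL.
Debug PeekS(@Cmd(1)\text[0], -1, #PB_Ascii)
```

The addresses stay stable only as long as the array is not resized with ReDim, so the size would need to be chosen up front.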