PB 5.45 Data.u in ASCII mode
Posted: Thu Oct 19, 2017 12:41 pm
by fryquez
Can we still report bugs for PB 5.45 LTS?
It's the last one that supports ASCII mode and this problem only occurs in ASCII mode.
Code: Select all
DataSection
uText:
Data.u "UNICODE"
Data.u 'U', 'N', 'I', 'C', 'O', 'D', 'E', 0
EndDataSection
Gets translated to:
Code: Select all
l_utext:
dw "UNICODE",0
dw 85,78,73,67,79,68,69,0
which raises a FASM error.
The correct translation should be:
Code: Select all
l_utext:
du "UNICODE",0
dw 85,78,73,67,79,68,69,0
Re: PB 5.45 Data.u in ASCII mode
Posted: Thu Oct 19, 2017 4:35 pm
by Little John
fryquez wrote:
Code: Select all
DataSection
uText:
Data.u "UNICODE"
Data.u 'U', 'N', 'I', 'C', 'O', 'D', 'E', 0
EndDataSection
Shouldn't the 3rd quoted line read Data.s "UNICODE"?
Re: PB 5.45 Data.u in ASCII mode
Posted: Thu Oct 19, 2017 5:21 pm
by fryquez
Nope, you have 3 cases.
Data.a - Ascii string or character array
Data.u - Unicode string or character array
Data.s - depends on compiler mode
Re: PB 5.45 Data.u in ASCII mode
Posted: Fri Oct 20, 2017 2:14 am
by Demivec
fryquez wrote:
Nope, you have 3 cases.
Data.a - Ascii string or character array
Data.u - Unicode string or character array
Data.s - depends on compiler mode
Data.a and Data.u are for numeric values only, not strings.
PB only permits encoding string literals in the same encoding as the compiler mode, either ASCII or Unicode but not both.
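For reference, the numeric form of Data.u that fryquez's own listing shows compiling to a valid dw line can be read back as a string. This is only a minimal sketch: it assumes the values are zero-terminated 16-bit code units and uses PeekS() with the #PB_Unicode flag to interpret them. The label name is taken from the first post.
Code: Select all
; Sketch: store the text as numeric 16-bit values, one per character.
DataSection
  uText:
  Data.u 'U', 'N', 'I', 'C', 'O', 'D', 'E', 0   ; trailing 0 terminates the string
EndDataSection

; Read the buffer as a 16-bit (Unicode) string, independent of the compiler mode.
Debug PeekS(?uText, -1, #PB_Unicode)
This avoids the quoted-string form entirely, at the cost of spelling out each character.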
Re: PB 5.45 Data.u in ASCII mode
Posted: Fri Oct 20, 2017 7:58 am
by fryquez
You're wrong about that; check out the 3 options in Unicode mode.
Re: PB 5.45 Data.u in ASCII mode
Posted: Fri Oct 20, 2017 8:21 am
by Little John
fryquez wrote:
Data.a - Ascii string or character array
Data.u - Unicode string or character array
This does not correspond with reality. As Demivec wrote, Data.a and Data.u are for numeric values only.
This is not a bug in PureBasic but a bug in your code, as I already pointed out in my first post here.
Just read the documentation:
Data
Variables and Types
Re: PB 5.45 Data.u in ASCII mode
Posted: Fri Oct 20, 2017 9:32 pm
by bosker
Well, I have some sympathy for fryquez because it's quite likely that you may need ASCII strings in a Unicode program or Unicode strings in an ASCII program. I certainly do.
Using...
Data.u "string"
Data.a "string"
.. is the only way of getting those. I've been using data.a "string" as a stopgap for a year (see later).
In Unicode programs (5.50) these work exactly as required but as noted, data.u doesn't work in ASCII progs.
Like many things in Purebasic, the manual doesn't tell everything - for example I don't think it mentions in data that you can write data.i @"string" (but you can).
The problem here is that it's not consistent and I guess it never will be with the removal of ASCII support. However, I think it is a bug in Purebasic because it's emitting incorrect assembler - it's FASM that reports an error.
@fryquez - your (not ideal) workaround for static strings...
Instead of:
write:
The removal of ASCII support is a complete show-stopper for me and I'll be starting translation of my major customer project to another language in January, after 7 years of supporting it in Purebasic. A lot of unproductive work. Pity.
Re: PB 5.45 Data.u in ASCII mode
Posted: Fri Oct 20, 2017 10:23 pm
by skywalk
@Bosker - I too would appreciate Ascii Strings in the current PB flow, but I made do with
Code: Select all
*p_ascii_string = Ascii("unicode_string")
to emit Ascii strings when required.
What is the showstopper that does not work?
Or is it the pain of handling mem pointers and FreeMemory()?
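The memory handling mentioned above looks roughly like this. It is a minimal sketch, assuming PureBasic 5.50 or later (where Ascii() is available). The command text is made up for illustration.
Code: Select all
; Ascii() allocates a new zero-terminated ASCII buffer and returns its address.
*cmd = Ascii("RUN 1")                  ; "RUN 1" is a hypothetical command string
If *cmd
  Debug PeekS(*cmd, -1, #PB_Ascii)     ; read the buffer back, for illustration only
  ; ... pass *cmd to the external function here ...
  FreeMemory(*cmd)                     ; the address must not be used after this call
EndIf
The buffer stays valid until FreeMemory() is called, so its lifetime is under the caller's control.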
Re: PB 5.45 Data.u in ASCII mode
Posted: Sat Oct 21, 2017 12:37 am
by bosker
Hi Skywalk
The project is (probably) quite unusual...
A brief synopsis...
I have a bunch of programs that manipulate streams of (ASCII) data.
The actual nitty-gritty manipulation is done by a collection of (16?) proprietary DLL's.
I only have the interfaces to the customer DLL's and they all use ASCII (and as far as I can tell will always do so).
Much of the detailed processing is controlled by command strings (ASCII) passed via the DLL interfaces.
The problem with these interfaces is that when they get any string parameter, they save the address (possibly for much later use), not a string copy. (Written in C I suspect)
So this works well for strings that are held static in each program string space but collapses spectacularly when an inline conversion from Unicode to ASCII is done because the converted string is temporary. Did that make sense?
The net result of all this is that I cannot make the damned thing work with a non-ASCII compiler without more messing about than is sensible.
As an added complication, there's a (roughly) monthly update of 'signature' strings that have to be built into the software when they arrive (yes, ASCII again, about 5-6000 lines).
I have tried what you suggest and the memory management pain is indeed great. I've even trialled predeclaring all the interface strings using data.a and passing the static addresses to the DLL's (for one very small section). This does actually work but makes the whole thing very hard work and error-prone. Without the command strings inline, it loses all clarity and I'd never get anything through a review again.
I've been investigating all this for about the last year and on balance, I've come to the conclusion it's better to remake it all and keep it sensible rather than trying to hack round the lack of ASCII support. My current plan is to continue to maintain using 5.45 for the next year (approx) while I change languages (and gui) in the background.
Sorry, this is a lot longer post than I intended but I guess it's therapeutic. Thanks for your interest.
If you have any ideas / magic wand, please let me know.

Re: PB 5.45 Data.u in ASCII mode
Posted: Sat Oct 21, 2017 4:33 am
by Little John
bosker wrote:
Well, I have some sympathy for fryquez because it's quite likely that you may need ASCII strings in a Unicode program or Unicode strings in an ASCII program. I certainly do.
AFAIR some time ago someone made a feature request for that.
bosker wrote:
Using...
Data.u "string"
Data.a "string"
.. is the only way of getting those. I've been using data.a "string" as a stopgap for a year (see later).
In Unicode programs (5.50) these work exactly as required but as noted, data.u doesn't work in ASCII progs.
Like many things in Purebasic, the manual doesn't tell everything - for example I don't think it mentions in data that you can write data.i @"string" (but you can).
Some undocumented things might work (for a while) by chance. However, people are using those "features" at their own risk and shouldn't be surprised when they stop working.
Re: PB 5.45 Data.u in ASCII mode
Posted: Sat Oct 21, 2017 5:30 am
by skywalk
bosker wrote:
Hi Skywalk
The project is (probably) quite unusual...
A brief synopsis...
I have a bunch of programs that manipulate streams of (ASCII) data.
The actual nitty-gritty manipulation is done by a collection of (16?) proprietary DLL's.
I only have the interfaces to the customer DLL's and they all use ASCII (and as far as I can tell will always do so).
Much of the detailed processing is controlled by command strings (ASCII) passed via the DLL interfaces.
The problem with these interfaces is that when they get any string parameter, they save the address (possibly for much later use), not a string copy. (Written in C I suspect)
So this works well for strings that are held static in each program string space but collapses spectacularly when an inline conversion from Unicode to ASCII is done because the converted string is temporary.
Yeah, I am never surprised by the number of custom apps that must be written. Work is plentiful. But I am confused why an ASCII parameter you pass to a DLL must remain untouched. If the DLL needs time to complete its task, just put the parameter strings in a command array of your design and let them persist.
The amount and frequency of strings does not sound like a gene sequencer. This should be doable in some fashion with PB.
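One way to let the strings persist is sketched below. It assumes PureBasic 5.50 or later for Ascii(). The helper PersistentAscii(), the map AsciiPool() and the command text are hypothetical names, not part of PureBasic or of bosker's project.
Code: Select all
; Hypothetical helper: keeps one ASCII buffer per distinct text for the program's lifetime.
Global NewMap AsciiPool.i()

Procedure.i PersistentAscii(Text.s)
  If FindMapElement(AsciiPool(), Text) = 0
    AsciiPool(Text) = Ascii(Text)      ; allocated once and deliberately never freed
  EndIf
  ProcedureReturn AsciiPool(Text)      ; address stays valid because the buffer is kept
EndProcedure

*a = PersistentAscii("SET MODE 1")     ; made-up command text
*b = PersistentAscii("SET MODE 1")
Debug Bool(*a = *b)                    ; both calls hand out the same buffer
Debug PeekS(*a, -1, #PB_Ascii)
The command text stays inline at the call site, and the buffers are never freed, which is the point here but also means memory grows with the number of distinct strings.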
Re: PB 5.45 Data.u in ASCII mode
Posted: Sat Oct 21, 2017 4:34 pm
by fryquez
@bosker thanks for understanding the subject.
Yes I can use inline assembly, but a string would be much more comfortable.
Especially ones like ~"\"Quoted Line1\"\nUnquoted Line2".
Well, it seems this is another thing I should fix in my preprocessor.
The worst part is actually what's going on in this forum.
We have no bug tracker, so we have to post here.
But what happens? Some people who do not understand the issue have to butt in.
Another one silently moves the topic.
Re: PB 5.45 Data.u in ASCII mode
Posted: Sun Oct 22, 2017 2:59 pm
by bosker
@Little John
Thanks for the response...
I admit I thought the data.u / data.a WAS the way to get Unicode / Ascii strings in the opposite kind of exe.
The data.i @"string" was specifically added some years ago (by Timo I think) but has never shown up in the docs.
I agree with your caution re: using undocumented features but I have other solutions standing by if these things stopped working.
The alternatives rely on using asm data and macros.
@skywalk
When an analysis is being set up, there are lots of commands shovelled across the DLL interfaces to get everything ready before the final 'run' is issued. There's also a data logging feature in there in addition to the sequence analysis and manipulation.
Whatever the reason, the "string address" thing is part of the original req so I have to deal with it.
You're right that this could still be done in PB, but I haven't come up with a scheme I (or my client) can live with in the last 6 months, so I think straightforward conversion is less work and more likely to end in a happy place.
Converting PB to (say) C11 isn't too difficult (apart from the GUI).
You are also correct - it's not a gene sequencer.
@fryquez
I agree that doing the strings in PB is a lot easier, but I keep the assembler options as cover in case something stops working.
Unfortunately, I didn't cover the loss of ASCII support in the compiler.
Re: PB 5.45 Data.u in ASCII mode
Posted: Sun Oct 22, 2017 3:17 pm
by skywalk
Why can't you store the ASCII commands in a byte array (multiple arrays or a multi-dimensional one)? Those would persist, and you can manipulate and sort them. I also store my logs as ANSI text, not UTF-8.