How about separating the ASCII and Unicode functions, like this:
ALen() for ASCII length
ULen() for Unicode length
AMid()
UMid()
ALeft()
ULeft()
etc...
Just put an "A" in front of the ASCII functions and a "U" in front of the Unicode ones.
It would probably be easier for the coders to write the functions and easier for programmers to work with.
Ascii and Unicode
-
- User
- Posts: 14
- Joined: Mon Jul 20, 2009 9:29 pm
- Location: Santa Barbara California US
Re: Ascii and Unicode
And break all the code in these forums and our apps? No thanks.
I compile using 5.31 (x86) on Win 7 Ultimate (64-bit).
"PureBasic won't be object oriented, period" - Fred.
It always seemed incongruous to me that you set a data type in the compiler options of the IDE. After all, we don't set all numeric variables to Integer / Float / Double / Quad / Boolean in the IDE, so why strings? Surely strings should be defined as Unicode / ASCII in the code.
I agree with PB that breaking all the string commands isn't such a hot idea, but if we could have the code below, that would be brilliant.
Code:
Len("ABC" [, #PB_ASCII | #PB_Unicode])
Mid("ABC", 1, 2 [, #PB_ASCII | #PB_Unicode])
with the optional last parameter overriding the compiler switch, or the compiler switch being used (as it is now) if the parameter is omitted.
Alternatively, perhaps the string type (Unicode / ASCII) could be defined in the same way as numeric variables:
Code:
ABC.s = "HELLO WORLD" ; defines ABC as a Unicode / ASCII string according to the IDE switch
ABC.a = "HELLO WORLD" ; defines ABC as an ASCII string
ABC.u = "HELLO WORLD" ; defines ABC as a Unicode string
Anyway, the ASCII / Unicode switch in the compiler acts globally. If you want that, then yes, it's much easier. But if you want to write code that operates with either Unicode -or- ASCII, then (I think) it's currently impossible.
I had this issue when I wrote a DLL that could be accessed by MS Excel either as a VBA routine or as an imported Excel formula. It seems that it is impossible, since VBA is ASCII and the Excel grid interface is Unicode :-\ Doh!
Basically, to do what I wanted, I needed to compile the same code as 2 separate DLLs, one ASCII and one Unicode, to provide the same functions to Excel and VBA.

I'm guessing the compiler switch swaps in a bunch of ASCII -or- Unicode stuff, so perhaps it would be a lot of work to change. Only Fred & Team PB know that...
Last edited by naw on Tue Jul 21, 2009 1:18 pm, edited 2 times in total.
Ta - N
-
- PureBasic Expert
- Posts: 4229
- Joined: Sat Apr 26, 2003 8:27 am
- Location: Strasbourg / France
- Contact:
naw wrote:But the ascii / unicode switch in the compiler acts globally. If you want that, then yes - its much easier. But if you want to write code that operates either with Unicode -or- ASCII, then its impossible currently.
PeekS() and PokeS() with #PB_Ascii/#PB_UTF8/#PB_Unicode should help you to convert ANSI to Unicode and vice versa.
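As a minimal sketch of the PeekS()/PokeS() conversion mentioned above: the same string is written to memory once as ASCII and once as Unicode (UTF-16), then read back. The variable names and the sample text are only illustrative assumptions.

```purebasic
; Sketch only: variable names and the sample text are assumptions.
Text$ = "HELLO WORLD"

; Write the string as ASCII: 1 byte per character plus a 1-byte terminating null.
*ascii = AllocateMemory(Len(Text$) + 1)

; Write the string as UTF-16: 2 bytes per character plus a 2-byte terminating null.
*wide = AllocateMemory((Len(Text$) + 1) * 2)

If *ascii And *wide
  PokeS(*ascii, Text$, -1, #PB_Ascii)
  PokeS(*wide, Text$, -1, #PB_Unicode)

  ; Read both buffers back into native strings, whatever the compiler switch is.
  Debug PeekS(*ascii, -1, #PB_Ascii)
  Debug PeekS(*wide, -1, #PB_Unicode)
EndIf

If *ascii : FreeMemory(*ascii) : EndIf
If *wide : FreeMemory(*wide) : EndIf
```

The explicit format flag on each call is what makes the conversion independent of the global compiler setting.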
For free libraries and tools, visit my web site (also home of jaPBe V3 and PureFORM).
gnozal wrote:PeekS() and PokeS() with #PB_Ascii/#PB_UTF8/#PB_Unicode should help you to convert ANSI to Unicode and vice versa.
I'm a humble part-time programmer

While Peek / Poke will let me process ASCII / Unicode within the PB code, I suspect that a DLL compiled as ASCII would NOT be able to return a Unicode value. If I'm right, then that's the crux of the problem...
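One possible way around this, sketched under assumptions: instead of returning a PB string, the DLL could return the address of a buffer that it filled explicitly with PokeS() and #PB_Unicode. The function name GetGreetingW is hypothetical, and whether Excel or VBA accepts a raw pointer like this (rather than, say, a BSTR) is not covered here.

```purebasic
; Hypothetical exported function; the name GetGreetingW is an assumption.
; It returns the address of null-terminated UTF-16 text, whatever the compiler switch is.
ProcedureDLL.i GetGreetingW()
  Static *buffer                    ; kept between calls so the caller can still read it
  Protected Text$ = "HELLO WORLD"

  If *buffer
    FreeMemory(*buffer)             ; release the result of the previous call
  EndIf

  ; 2 bytes per UTF-16 character, plus a 2-byte terminating null
  *buffer = AllocateMemory((Len(Text$) + 1) * 2)
  If *buffer
    PokeS(*buffer, Text$, -1, #PB_Unicode)
  EndIf

  ProcedureReturn *buffer           ; 0 if the allocation failed
EndProcedure
```

The caller would have to treat the return value as a pointer to wide characters and copy the text before calling the function again, since the buffer is reused.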
Ta - N
Ascii/Ansi is pretty much deprecated anyway.
Starting with NT (W2K 5.0, XP 5.1, 2003 5.2, Vista 6.0, Win7 6.1) the OS is native unicode.
Any ascii calls are remapped to the Unicode version.
So compiling as Unicode gives less OS overhead when using API calls on modern Windows.
Only on Win9x (95, 98, Me) do you need to use Ascii,
although in fact you could do Unicode on 98 and Me but you need an extra dll and some trickery to get it to work.
I don't support anything older than W2K at all any more.
So if you see any ascii dlls or exe's in modern Windows it's only for backwards compatibility, or there hasn't been an update of that part yet.
Mac and Linux have been unicode for ages.
In modern apps it is better to use unicode and use the PeekS and similar PB functions to handle the few cases of ascii calls and conversion you need to make, it's been only in a few odd cases I needed to do that these last few years.
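A small sketch of one of those "few odd cases", assuming Windows: a prototype with the p-ascii pseudotype lets a Unicode program pass a normal string to an ASCII-only function, with the conversion done at the call. lstrlenA in kernel32.dll is used here purely as a convenient ASCII entry point; the prototype and variable names are assumptions.

```purebasic
; Windows-only sketch; ProtoStrLenA and StrLenA are made-up names.
; The p-ascii pseudotype converts the string argument to ASCII for the call.
Prototype.i ProtoStrLenA(text.p-ascii)

If OpenLibrary(0, "kernel32.dll")
  StrLenA.ProtoStrLenA = GetFunction(0, "lstrlenA")
  If StrLenA
    Debug StrLenA("HELLO WORLD")    ; length reported by the ASCII API
  EndIf
  CloseLibrary(0)
EndIf
```

For results coming back from such a call as ASCII text in memory, PeekS() with #PB_Ascii does the reverse conversion.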