Support for Ascii compilation ends after the next LTS cycle

Rescator · Post by **Rescator** » Sat Aug 09, 2014 6:13 pm

luis wrote:I'm wondering... would it be wise to continue to use SizeOf(Character), StringByteLength() and similar constructs, to be able to support in the future Unicode encodings that are not exactly 2 bytes per character, or is it OK to simply assume a char length of 2, like you did for ASCII with a char length of 1? This is for when you know your program will only have to access Unicode data and no other formats.
I would still tend towards writing this type of code without making assumptions. In short, to continue to code as if an ASCII build were still possible.

Don't think you have to worry about that. SizeOf(Character) will probably remain around for ages (or never go away); as for StringByteLength() and similar stuff, they will/may be needed for UTF8 (and possibly UCS2/UTF16 as well).
And in cases where you are dealing with binary data (savegame files for example or data conversions) then you might need it too.

They are not going to remove any functions, they are even planning to add "two" (which I'm looking forward to).
They'll just get rid of a lot of duplicate calls/functionality, and require those who use future versions of PureBasic to always code with Unicode (and remember to convert to/from Unicode, which is what all this huffing and puffing is about, along with the panic that functionality is being removed).

Don't people realize that an ascii string can be stored in a unicode string? *shakes head*

But back to your subject. SizeOf and other stuff won't change at all and .a and .c will not vanish either.
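The point about an ASCII string fitting inside a Unicode string can be checked with a few lines. This is a minimal sketch in Python rather than PureBasic, and the sample text is made up; it only shows that the characters survive unchanged while the byte length depends on the encoding, which is the kind of value a byte-length function has to report.

```python
# Hypothetical sample text containing only ASCII characters.
text = "Savegame v1"

# Encode the same characters three ways (little-endian UTF-16, no BOM).
encoded = {
    "ascii": text.encode("ascii"),
    "utf-8": text.encode("utf-8"),
    "utf-16-le": text.encode("utf-16-le"),
}

# Byte length per encoding: same characters, different storage size.
byte_lengths = {name: len(data) for name, data in encoded.items()}

# Every encoding decodes back to the identical text, so nothing is lost
# by storing ASCII content in a Unicode string.
round_trips = {name: data.decode(name) == text for name, data in encoded.items()}

for name in encoded:
    print(name, byte_lengths[name], round_trips[name])
```

For pure ASCII content the ASCII and UTF-8 bytes are identical, and the UTF-16 form is exactly twice as long.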

DK_PETER · Post by **DK_PETER** » Sat Aug 09, 2014 6:50 pm

@Team.
That's a very generous compromise.

Rescator wrote:Do certain people here have an axe to grind? It certainly seems that way.
[...]
""Lazy programmer" how dare you call me that?" they will shout. And I'll simply tell them, "It takes one to know one!"...

Så blev det sagt en gang for alle. Håber at folk forstår det nu. Godt gået. (Danish: "So that has been said once and for all. Hope people understand it now. Well done.")

Little John · Post by **Little John** » Sat Aug 09, 2014 6:57 pm

DK_PETER wrote:Så blev det sagt en gang for alle. Håber at folk forstår det nu. Godt gået.

Situs vilate inisit avernit.

DK_PETER · Post by **DK_PETER** » Sat Aug 09, 2014 7:02 pm

Little John wrote:
DK_PETER wrote:Så blev det sagt en gang for alle. Håber at folk forstår det nu. Godt gået.
Situs vilate inisit (et) avernit.

Richtig, aber es ist nur Dänisch. (German: "Correct, but it is only Danish.")

Little John · Post by **Little John** » Sat Aug 09, 2014 7:28 pm

DK_PETER wrote:
Little John wrote:
DK_PETER wrote:Så blev det sagt en gang for alle. Håber at folk forstår det nu. Godt gået.
Situs vilate inisit (et) avernit.

THIS IS WRONG CITATION!
There was no (et) in my text!
Regardless whether or not you understood what I wrote:
AS A GENERAL RULE, DO NOT POST BOGUS QUOTES!

luis · Post by **luis** » Sat Aug 09, 2014 7:32 pm

Rescator, I think you misunderstood my post or I wasn't clear enough, because you are telling me things I already knew and you didn't answer my question.

I'll try to rephrase that...

First, as I said, I was talking about writing a program where you have to access only unicode data and not other formats.
I think it's pretty obvious you need the above mentioned constructs (and more) if that's not the case.
So, I'm talking about a unicode program manipulating only unicode data.

"would it be wise to continue to use SizeOf(Character), StringByteLength() and similar constructs, to be able to support in the future Unicode encodings that are not exactly 2 bytes per character, or is it OK to simply assume a char length of 2, like you did for ASCII with a char length of 1?"

Translated: when coding for the ASCII world, a char has a length of 1. You can rely on that, so your code never needs to expect any other size. Today, coding for the Unicode world in PB means a char has a length of 2, simply because the Unicode encoding used is UCS-2. So, is it needed, required or merely wise to not assume a char length of 2, and to still write such a program using a mix of compile-time and run-time constructs that treat all the sizes involved as potentially not a multiple of 2, since another Unicode encoding could use 3 or more bytes per char?

"I would still tend towards writing this type of code without making assumptions. In short, to continue to code as if an ASCII build were still possible."

So in the end I answered myself, but I was looking for an opinion if any.
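A minimal sketch of the difference between the two styles, in Python rather than PureBasic. The names `CHAR_SIZES`, `buffer_size` and `naive_buffer_size` are made up for illustration, and the table only covers fixed-width encodings; a variable-width encoding such as UTF-8 would need the actual encoded length instead.

```python
# Bytes per character for some fixed-width encodings (illustrative table).
CHAR_SIZES = {"ascii": 1, "ucs-2": 2, "ucs-4": 4}


def buffer_size(char_count: int, encoding: str, with_terminator: bool = True) -> int:
    """Bytes needed for char_count characters, derived from the character size."""
    char_size = CHAR_SIZES[encoding]
    total_chars = char_count + (1 if with_terminator else 0)
    return total_chars * char_size


def naive_buffer_size(char_count: int) -> int:
    """Same calculation with the character size hard-coded to 2."""
    return (char_count + 1) * 2


for name in CHAR_SIZES:
    print(name, buffer_size(10, name), naive_buffer_size(10))
```

Both functions agree for UCS-2, but the hard-coded version under-allocates as soon as the character size is 4, which is the situation the question is worried about.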

edit: typos

DK_PETER · Post by **DK_PETER** » Sat Aug 09, 2014 7:35 pm

Little John wrote:
DK_PETER wrote:
Little John wrote:
DK_PETER wrote:Så blev det sagt en gang for alle. Håber at folk forstår det nu. Godt gået.
Situs vilate inisit (et) avernit.
THIS IS WRONG CITATION!
There was no (et) in my text!
Regardless whether or not you understood what I wrote:
AS A GENERAL RULE, DO NOT POST BOGUS QUOTES!

ROFLMAO!!! As far as I know (et) must be present. Hence the addition. My Latin is rusty, but.....
Just found this: Ref: http://mundmische.de/bedeutung/17148-et ... et_avernit

Have a nice day, little john

Little John · Post by **Little John** » Sat Aug 09, 2014 7:44 pm

DK_PETER wrote:As far as I know (et) must be present. Hence the addition..My latin is rusty, but.....

That text was not Latin, so it seems you didn't actually understand it.

But that's not the question here. You had better not show disrespect to other people by posting bogus quotes.
You are not entitled to put words into other people's mouths. This is a basic communication rule.
Is it really too hard for you to understand?

DK_PETER · Post by **DK_PETER** » Sat Aug 09, 2014 7:50 pm

Little John wrote:
DK_PETER wrote: ROFLMAO!!! As far as I know (et) must be present. Hence the addition..My latin is rusty, but.....
That text was not Latin, so it seems you didn't actually understand it.

But that's not the question here. You just better should not show disrespect to other people by posting bogus quotes.
Is that really too hard to understand for you?

Really dude.. If you want to do bogus Latin.. please do so 'correctly'. In the bogus Latin the (et) is there.. (Not that I give a crap).
But, just to make one thing quite clear. Your comments are useless and non productive and you bore the hell out of me!! So please..
Do us both a great favor. Stop responding to my posts. Just....ignore me completely...I'll promise to return the favor.
The forum will be sooo much easier to traverse and enjoy.. Bye sucker!

Little John · Post by **Little John** » Sat Aug 09, 2014 8:01 pm

DK_PETER wrote:Your comments are useless

Yes, obviously you are not able to learn very basic rules of communication and respect.

DK_PETER wrote:Stop responding to my posts.

Well, when you post crap on a public forum, you shouldn't be surprised when someone posts a comment and calls it crap.

To the other forum members:
I'm sorry about this deviation from the topic, but there is misbehaviour which simply is not acceptable.

DK_PETER · Post by **DK_PETER** » Sat Aug 09, 2014 8:05 pm

@Kleiner john.

Respect must be earned. You earned nothing ZZZZZZZZzzzzzzzzz!
http://youtu.be/WVFVCUOjt74
Last post to this douchebag from me.

Little John · Post by **Little John** » Sat Aug 09, 2014 8:11 pm

Actually it is not necessary to give us more insight into your antisocial mindset ...

Tenaja · Post by **Tenaja** » Sat Aug 09, 2014 8:12 pm

marroh wrote:Odd logic, I need the feature to build ASCII exe, this is a fact for me. Pointless to discuss it, that is something you and others should accept.

Then keep using 5.2. Or 5.3. Or 5.4, all of which will support it. And possibly even more versions.

skywalk · Post by **skywalk** » Sat Aug 09, 2014 8:14 pm

Rescator wrote:Do certain people here have an axe to grind? It certainly seems that way.

I can only suspect it is you given the depth of your posts on this subject.

Rescator wrote:If you are "surprised" by unicode today, where the hell were you a decade ago or two decades ago? In a daze? Or were you even born yet? Or is it the other direction that is the issue?
Are you a 125 year old geriatric programmer who dreams of the good old COBOL and pre-Y2K days when using two digit years was not an issue?

Are you addressing Fred and Freak here too? C'mon already, they admitted it was a large piece of work to convert the IDE recently to Unicode and there is not full acceptance from some of the UTF8 fans.
I am accepting of the Unicode compile only, but please stop the dismissal of its impact to users with lots of code to convert and debug.

Rescator · Post by **Rescator** » Sat Aug 09, 2014 8:18 pm

luis wrote:Rescator, I think you misunderstood my post or I wasn't clear enough, because you are telling me things already knew and you didn't answer my question.

I'll try to rephrase that...

Ah, stupid me. Right. In theory yes you have the right idea.

But in reality I have no idea if that is correct though.

I don't feel like diving into MSDN right now. But if I recall correctly a character will always be 16 bits, but it may be 16+16 in cases where a character above 65535 needs to be represented.
Extending like this is kind of like how UTF8 works (using 8-bit units). (Strictly speaking this 16+16 scheme is UTF-16; UCS-2 proper is fixed at 16 bits. The UCS-2 stuff is kind of confusing, throw in Wide and Multibyte conversions and even I get easily confused.) I've been spoiled by PureBasic dealing with the unicode stuff for me, so I never had to look into how it's "physically" stored.

And I mentioned this before but I think wxWidgets on Linux uses 32bit characters, thereby avoiding the issue of UTF8 and UTF16 "strings" having to be parsed for extended characters as it's just a binary string of fixed 32bit characters instead. (this might explain some odd Ascii vs Unicode comparison results on Linux)
So in theory you are right, SizeOf(Character) could be 1 (for UTF8), 2 (for UCS-2/UTF-16/16bit) and 4 (for UCS-4/32bit), same with the .c type.
So if you want to be future proof then use Character and .c, I would call that a very smart choice.
At the very least it saves you a ton of work later changing .w to .c hehe. (I imagine many will be changing from .a to .c or from .b to .c in their code going forwards now)

SizeOf(Character) and .c should be considered "safe" and future proof.
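The 1/2/4 sizes and the "16+16" case can be seen by counting code units per character. A minimal sketch in Python rather than PureBasic; the helper name `unit_counts` and the sample characters are chosen for illustration only.

```python
def unit_counts(ch: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes, UTF-16 16-bit units, UTF-32 32-bit units) for one character."""
    utf8_units = len(ch.encode("utf-8"))            # 1 byte per unit
    utf16_units = len(ch.encode("utf-16-le")) // 2  # 2 bytes per unit, no BOM
    utf32_units = len(ch.encode("utf-32-le")) // 4  # 4 bytes per unit, no BOM
    return utf8_units, utf16_units, utf32_units


# U+0041, U+00E9, U+20AC are below U+FFFF; U+1F600 is above it.
samples = ["\u0041", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", unit_counts(ch))
```

Only the fixed 32-bit form needs exactly one unit for every character. In UTF-16 a character above U+FFFF takes two 16-bit units (a surrogate pair), and in UTF-8 the count ranges from 1 to 4 bytes.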
