UTF8 and strings...

helpy · Mon Jan 18, 2010 11:05 am

Behaviour changed in 4.41 RC1:

unicode compiler option == OFF ==> Result of TEST:

PeekS(?string,?dataEnd-?string,#PB_UTF8) ==> Test ? 123

Dump of ?string:
004223FC  54 65 73 74 20 E9 20 31 32 33                      |Test é 123|

unicode compiler option == ON ==> Result of TEST:

PeekS(?string,?dataEnd-?string,#PB_UTF8) ==> Test  123

Dump of ?string:
004224B8  54 65 73 74 20 E9 20 31 32 33                      |Test é 123|

Joakim Christiansen · Mon Jan 18, 2010 12:02 pm

helpy wrote: Behaviour changed in 4.41 RC1:

Yeah, I posted my example in the bug section and they wrote "fixed".
And I'm happy with the new behavior (maybe still not perfect, but good enough for me). That's what's so nice about PureBasic: the developers listen to their users.

Trond · Mon Jan 18, 2010 12:59 pm

Extended ASCII bytes (128 and above) can't be read as UTF-8 on their own; this is by design.
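To illustrate the point with the bytes from the dump above, here is a small sketch in Python (used only as a neutral way to show UTF-8 decoding; it is not PureBasic's implementation, and the variable names are made up for the example). It reads the same ten bytes once as Latin-1 and once as strict UTF-8.

```python
# The ten bytes from the hex dump quoted in the thread.
raw = bytes.fromhex("54 65 73 74 20 E9 20 31 32 33")

# Read as Latin-1 (one byte per character), 0xE9 is the letter é.
as_latin1 = raw.decode("latin-1")

# Read as strict UTF-8, the lone 0xE9 byte is rejected, because it
# announces a multi-byte sequence that never follows.
try:
    raw.decode("utf-8")
    bad_offset = None
except UnicodeDecodeError as err:
    bad_offset = err.start  # index of the first offending byte

print(as_latin1)
print(bad_offset)
```

The offending offset is the position of the E9 byte in the dump, which is where a UTF-8 reader has to decide what to do with the invalid data.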

Also, you can't expect Unicode characters to display the same in both Unicode and ASCII mode. (When you use a UTF-8-encoded é, it's a Unicode character.) Even if the character is the same as the extended-ASCII character é, the two are stored differently (UTF-8 encodes é as the two bytes C3 A9, not the single byte E9), and a character with a character number > 255 is lost when it is converted to ASCII.
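As a rough illustration of the conversion issue, the following Python sketch (again just a stand-in for the general encoding rules, not PureBasic code; the euro sign is an arbitrary example character chosen here) shows how é is stored in UTF-8 versus a single-byte code page, and what happens to a character above 255 when it is forced into one byte.

```python
e_acute = "\u00e9"  # the letter é

# The same character, stored two different ways.
utf8_bytes = e_acute.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = e_acute.encode("latin-1")  # one byte in Latin-1

# A character numbered above 255 has no single-byte form at all,
# so a lossy conversion substitutes a placeholder.
euro = "\u20ac"  # the euro sign, character number 8364
lossy = euro.encode("latin-1", errors="replace")

print(utf8_bytes.hex(), latin1_bytes.hex(), lossy)
```

So a test string that is meant to be read with #PB_UTF8 would need é stored as C3 A9 rather than as the single byte E9 shown in the dump.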

Of course, just cutting the string off at this character (as 4.40 did) isn't the right thing to do either.
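The three outcomes mentioned in this thread (cutting the string, "Test ? 123", and "Test  123") correspond to three common ways of handling an invalid byte. The Python sketch below imitates them on the dumped bytes; it is only an approximation of the visible results, not a description of how PureBasic implements PeekS, and the helper name `decode_truncating` is hypothetical.

```python
# The ten bytes from the hex dump quoted in the thread.
raw = bytes.fromhex("54 65 73 74 20 E9 20 31 32 33")


def decode_truncating(data: bytes) -> str:
    """Stop at the first invalid byte (the cut-off described for 4.40)."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        return data[:err.start].decode("utf-8")


# 1. Cut the string at the bad byte.
truncated = decode_truncating(raw)

# 2. Substitute a placeholder for the bad byte.
replaced = raw.decode("utf-8", errors="replace").replace("\ufffd", "?")

# 3. Silently drop the bad byte.
dropped = raw.decode("utf-8", errors="ignore")

print(repr(truncated))
print(repr(replaced))
print(repr(dropped))
```

Substituting or dropping keeps the rest of the text readable, which is why either is usually preferable to truncation.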
