UTF8 and strings...

helpy
Enthusiast
Posts: 552
Joined: Sat Jun 28, 2003 12:01 am

Re: UTF8 and strings...

Post by helpy »

Behaviour changed in 4.41 RC1:

unicode compiler option == OFF ==> Result of TEST:


PeekS(?string,?dataEnd-?string,#PB_UTF8) ==> Test ? 123

Dump of ?string:
004223FC  54 65 73 74 20 E9 20 31 32 33                      |Test é 123|
unicode compiler option == ON ==> Result of TEST:


PeekS(?string,?dataEnd-?string,#PB_UTF8) ==> Test  123

Dump of ?string:
004224B8  54 65 73 74 20 E9 20 31 32 33                      |Test é 123|
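For anyone who wants to reproduce this, here is a minimal sketch of how such a test could be set up. The labels string and dataEnd are taken from the PeekS() calls above and the bytes are copied from the dump; the DataSection layout and the trailing terminator byte are assumptions, since the original test source is not shown in this post.

```purebasic
; Hypothetical reconstruction of the test. The bytes come from the dump above;
; the DataSection layout and the terminator are assumptions.
DataSection
  string:
  Data.b $54, $65, $73, $74, $20, $E9, $20, $31, $32, $33 ; "Test é 123", with é stored as the single byte $E9
  dataEnd:
  Data.b 0 ; terminator so that a read cannot run past the data
EndDataSection

; Same call as in the results above; compile once with the unicode option off and once with it on.
Debug PeekS(?string, ?dataEnd - ?string, #PB_UTF8)
```

Note that $E9 is followed by $20 here, so the buffer is not a valid UTF-8 byte sequence at that position.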
Windows 10 / Windows 7
PB Last Final / Last Beta Testing
Joakim Christiansen
Addict
Posts: 2452
Joined: Wed Dec 22, 2004 4:12 pm
Location: Norway

Re: UTF8 and strings...

Post by Joakim Christiansen »

helpy wrote: Behaviour changed in 4.41 RC1:
Yeah, I posted my example in the bug section and they wrote "fixed".
And I'm happy with the new behavior (maybe still not perfect, but good enough for me). That's what's so nice about PureBasic: the developers listen to their users.
I like logic, hence I dislike humans but love computers.
Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Re: UTF8 and strings...

Post by Trond »

Extended ASCII bytes (values above 127) can't be read as UTF-8 on their own; this is by design.

Also, you can't expect Unicode characters to display the same in both Unicode and ASCII mode. (When you use a UTF-8-encoded é, it's a Unicode character.) Even if the character is the same as the ASCII character é, the two are stored as different bytes ($E9 in the ANSI code page, $C3 $A9 in UTF-8), and when a character with a character number > 255 is converted to ASCII, it is lost.

Of course, just cutting the string at this character (like in 4.40) isn't the right thing to do, though.
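To illustrate the point about encodings, here is a small sketch of the same text with é stored as a proper two-byte UTF-8 sequence. The label name utf8String is made up for this example, and passing -1 as the length (read up to the null terminator) is assumed to behave as in current PureBasic versions; it may differ in 4.4x.

```purebasic
; Sketch only: "Test é 123" with é encoded as the UTF-8 sequence $C3 $A9.
; The label name is hypothetical.
DataSection
  utf8String:
  Data.b $54, $65, $73, $74, $20, $C3, $A9, $20, $31, $32, $33
  Data.b 0 ; null terminator, so PeekS can stop at the end of the string
EndDataSection

; Length -1 is assumed to mean "read until the null terminator".
Debug PeekS(?utf8String, -1, #PB_UTF8)
```

Here the $C3 lead byte is followed by a continuation byte ($A9), so the sequence is valid UTF-8. A lone $E9 followed by a space is not valid, and a decoder cannot read it as é.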