PureBasic Forums - English

Posted: **Fri Oct 19, 2007 3:34 pm**

Revisiting this, it looks like a bug!

This is not consistent

Code: Select all

OpenFile(0,"F:\Programming\PureBasicCode\SQLite\uni kanji.txt") ; contains the unicode of "漢字" ("Kanji")
Text.s = ReadString(0,#PB_Unicode)

Char1 = Asc(Mid(text,1,1))
Char2 = Asc(Mid(text,2,1))

MessageRequester("", text)                              ; Displays "漢字" okay
MessageRequester("", Str(char1) + " " + Str(char2))     ; Displays "65279 28450" okay
MessageRequester("", Chr(char1))                        ; Displays nothing
MessageRequester("", Chr(char2))                        ; Displays "漢" okay
MessageRequester("", Chr(char1) + Chr(char2))           ; Displays only "漢"

If you change the top text to be

Code: Select all

Text.s = "AB"     ; Two characters, but English

then everything works as expected

Posted: **Fri Oct 19, 2007 3:49 pm**

pdwyer wrote: Revisiting this, it looks like a bug!

This is not consistent

Code: Select all


OpenFile(0,"F:\Programming\PureBasicCode\SQLite\uni kanji.txt") ; contains the unicode of "漢字" ("Kanji")
Text.s = ReadString(0,#PB_Unicode)

Char1 = Asc(Mid(text,1,1))
Char2 = Asc(Mid(text,2,1))

MessageRequester("", text)                              ; Displays "漢字" okay
MessageRequester("", Str(char1) + " " + Str(char2))     ; Displays "65279 28450" okay
MessageRequester("", Chr(char1))                        ; Displays nothing
MessageRequester("", Chr(char2))                        ; Displays "漢" okay
MessageRequester("", Chr(char1) + Chr(char2))           ; Displays only "漢"

If you change the top text to be

Code: Select all

Text.s = "AB"     ; Two characters, but English

then everything works as expected

Maybe there's a null character in front of the original string? Looks like you're missing the Vanilla Ice standing on one leg character (字).

Posted: **Fri Oct 19, 2007 3:52 pm**

ARRRRHHH

It's that damn byte order mark!

Forgot about that grrrrr
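
For anyone hitting the same thing: the 65279 in Char1 is $FEFF, the byte order mark at the start of the file, and it is not a printable character. A minimal sketch of one way around it, assuming the file really does start with a BOM: call ReadStringFormat() before the first read, which detects the BOM and moves the file pointer past it. The file name is a placeholder, and the value in the last comment only applies to the file described above when compiled in unicode mode.

```purebasic
; Sketch only: "uni kanji.txt" is a placeholder path, adjust to your own file.
If ReadFile(0, "uni kanji.txt")
  Format = ReadStringFormat(0)     ; detects a BOM (if present) and skips past it
  Text.s = ReadString(0, Format)   ; the string now starts at the first real character
  CloseFile(0)

  Debug Asc(Text)                  ; for the file above this should no longer be 65279
EndIf
```

If the file has no BOM, ReadStringFormat() reports #PB_Ascii, so for BOM-less UTF-16 files you would still have to pass #PB_Unicode yourself.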

Posted: **Fri Oct 19, 2007 3:53 pm**

pdwyer wrote: ARRRRHHH

It's that damn byte order mark!

Forgot about that grrrrr

Textbook.

Posted: **Fri Oct 19, 2007 3:56 pm**

It's not a bug. I had the exact same problem today. :roll:

http://www.purebasic.fr/english/viewtopic.php?t=29216

Posted: **Fri Oct 19, 2007 3:58 pm**

"Magic!"

What's with this then?

Code: Select all


MessageRequester("", Chr($6F22))        ;No Good
MessageRequester("", Chr(28450))        ;No Good
MessageRequester("", Chr(Int(28450)))   ;OK

;all same in debugger
Debug 28450             
Debug $6F22             
Debug Int(28450)

Does chr() have some type limitations for the int?

Posted: **Fri Oct 19, 2007 4:14 pm**

pdwyer wrote: "Magic!"

What's with this then?

Code: Select all

MessageRequester("", Chr($6F22))        ;No Good
MessageRequester("", Chr(28450))        ;No Good
MessageRequester("", Chr(Int(28450)))   ;OK

;all same in debugger
Debug 28450
Debug $6F22
Debug Int(28450)

Does chr() have some type limitations for the int?

Looks like it isn't casting to an int on the fly. It needs an actual int, not a string or whatever.

Ah, BASIC. Everything's a string... except when it's not.

Posted: **Fri Oct 19, 2007 4:24 pm**

Things have changed just a tad since QBasic, dood. Perhaps you spent too long with Pick BASIC.

Actually, if you aren't explicit, "everything's a long". But in this case, perhaps "everything's a bug".

Posted: **Fri Oct 19, 2007 4:28 pm**

I'm writing my own AAC codec for WMP.

My head just exploded...

Posted: **Fri Oct 19, 2007 4:31 pm**

pdwyer wrote: Perhaps you spent too long with Pick BASIC.

ooooo yeah. <shudder>

Posted: **Fri Oct 19, 2007 4:54 pm**

KingNips wrote: I'm writing my own AAC codec for WMP.

My head just exploded...

In what language? I have this nagging suspicion that you're just a tourist here.

Posted: **Fri Oct 19, 2007 4:58 pm**

If you're bored you should take a look here
http://projecteuler.net/

<throws down gauntlet>
I haven't been back in a couple of months, but I'm a 13% genius so far.

Posted: **Fri Oct 19, 2007 6:41 pm**

Actually, it is a bug (now fixed). It's better to use UTF-8 source files if you write unicode programs (then you wouldn't have had this issue).

Posted: **Fri Oct 19, 2007 10:53 pm**

For things like that I always use (compiled in unicode):

Code: Select all

kanji.c = 28450 ; or $6F22
kanjiStr.s = PeekS(@kanji,1)

Debug kanjiStr
MessageRequester ("test", kanjiStr)

I never used chr() because historically that was basically meant for ASCII only.

I personally almost exclusively use UTF8 so I can enter Japanese chars directly into the source avoiding a lot of complications.

Posted: **Sat Oct 20, 2007 5:22 am**

Fred wrote: It's better to use UTF-8 source files if you write unicode programs (then you wouldn't have had this issue).

I'm not sure I understand this. My IDE is set to use UTF-8, but in the MessageRequester() example there's no non-ASCII text in the source. I passed a unicode value to Chr(), and unless it was of a certain type it wouldn't display. If I used UTF-8, how would I have passed it to MessageRequester() unless I converted it again?

<RANT>
Thankfully, my intl requirements are (at this stage anyway) just Japanese, so I can avoid unicode. My system handles code page 932, so I can type kanji into the IDE, it displays fine, the apps are fine, it even displays in the debugger, and I don't need to compile in unicode mode! I just need to be careful about string lengths, Chr(), etc.

I was wondering about these Chr() functions with UTF-16, but generally they look fine in unicode mode. But UTF-8... ??? I suppose the advantage is that for most apps that don't need intl support, UTF-8 just works like ASCII, but other than its ability to hold intl data it's a real pain to work with, more painful than UTF-16 and non-unicode code pages.

</RANT>

@mskuma: How is your OS set up? For me, Japanese is fine in the IDE, but typing Japanese doesn't go in as UTF-8; everything is cp932. My system has multi-language support installed, so my wife's profile is all Japanese, right down to the Start button and menus, and my profile is English but has support for Japanese (more than just the Asian fonts installed). But I see nowhere in the regional settings or the like to use UTF-8 rather than code pages. I'm not familiar with any way of setting the OS to UTF-8 either, except for IE.

chr() and unicode