Posted: Fri Oct 19, 2007 3:34 pm
by pdwyer
Revisiting this, it looks like a bug!
This is not consistent
Code:
OpenFile(0,"F:\Programming\PureBasicCode\SQLite\uni kanji.txt") ; contains the unicode of "漢字" ("Kanji")
Text.s = ReadString(0,#PB_Unicode)
Char1 = Asc(Mid(text,1,1))
Char2 = Asc(Mid(text,2,1))
MessageRequester("", text) ; Displays "漢字" okay
MessageRequester("", Str(char1) + " " + Str(char2)) ; Displays "65279 28450" okay
MessageRequester("", Chr(char1)) ; Displays nothing
MessageRequester("", Chr(char2)) ; Displays "漢" okay
MessageRequester("", Chr(char1) + Chr(char2)) ; Displays only "漢"
If you change the text at the top to
Code:
Text = "AB" ; Two characters, but English
then everything works as expected.
Posted: Fri Oct 19, 2007 3:49 pm
by KingNips
pdwyer wrote: Revisiting this, it looks like a bug!
This is not consistent
Code:
OpenFile(0,"F:\Programming\PureBasicCode\SQLite\uni kanji.txt") ; contains the unicode of "漢字" ("Kanji")
Text.s = ReadString(0,#PB_Unicode)
Char1 = Asc(Mid(text,1,1))
Char2 = Asc(Mid(text,2,1))
MessageRequester("", text) ; Displays "漢字" okay
MessageRequester("", Str(char1) + " " + Str(char2)) ; Displays "65279 28450" okay
MessageRequester("", Chr(char1)) ; Displays nothing
MessageRequester("", Chr(char2)) ; Displays "漢" okay
MessageRequester("", Chr(char1) + Chr(char2)) ; Displays only "漢"
If you change the text at the top to
Code:
Text = "AB" ; Two characters, but English
then everything works as expected.
Maybe there's a null character in front of the original string? Looks like you're missing the Vanilla Ice standing on one leg character (字).
Posted: Fri Oct 19, 2007 3:52 pm
by pdwyer
ARRRRHHH
It's that damn byte order mark!
Forgot about that grrrrr
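For anyone hitting the same thing, a minimal sketch of reading the file with the BOM skipped. It assumes ReadStringFormat() is available in your PureBasic version, and the file name is only a placeholder for your own path:

Code:
; Sketch only: consume the byte order mark before reading the text.
; "uni kanji.txt" is a placeholder path, adjust it to your own file.
If ReadFile(0, "uni kanji.txt")
  Format = ReadStringFormat(0)   ; reads the BOM if there is one and returns the matching format
  Text.s = ReadString(0, Format) ; the string should now start at the first real character
  CloseFile(0)
  Debug Asc(Mid(Text, 1, 1))     ; should no longer report 65279 if the BOM was consumed
EndIf

If the file has no BOM, the format falls back to ASCII, so this only helps for files that were saved with one.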
Posted: Fri Oct 19, 2007 3:53 pm
by KingNips
pdwyer wrote: ARRRRHHH
It's that damn byte order mark!
Forgot about that grrrrr
Textbook.
Posted: Fri Oct 19, 2007 3:56 pm
by Mistrel
It's not a bug. I had the exact same problem today. :roll:
http://www.purebasic.fr/english/viewtopic.php?t=29216
Posted: Fri Oct 19, 2007 3:58 pm
by pdwyer
"Magic!"
What's with this then?
Code:
MessageRequester("", Chr($6F22)) ;No Good
MessageRequester("", Chr(28450)) ;No Good
MessageRequester("", Chr(Int(28450))) ;OK
;all same in debugger
Debug 28450
Debug $6F22
Debug Int(28450)
Does Chr() have some type limitation on its integer argument?
Posted: Fri Oct 19, 2007 4:14 pm
by KingNips
pdwyer wrote: "Magic!"
What's with this then?
Code:
MessageRequester("", Chr($6F22)) ;No Good
MessageRequester("", Chr(28450)) ;No Good
MessageRequester("", Chr(Int(28450))) ;OK
;all same in debugger
Debug 28450
Debug $6F22
Debug Int(28450)
Does Chr() have some type limitation on its integer argument?
Looks like it isn't casting to an int on the fly. It needs an actual int, not a string or whatever.
Ah, BASIC. Everything's a string... except when it's not.
Posted: Fri Oct 19, 2007 4:24 pm
by pdwyer
Things have changed just a tad since QBasic, dood. Perhaps you spent too long with Pick BASIC.
Actually, if you aren't explicit, "everything's a long". But in this case, perhaps "everything's a bug".

Posted: Fri Oct 19, 2007 4:28 pm
by KingNips
I'm writing my own AAC codec for WMP.
My head just exploded...
Posted: Fri Oct 19, 2007 4:31 pm
by KingNips
pdwyer wrote: Perhaps you spent too long with Pick BASIC.
ooooo yeah. <shudder>
Posted: Fri Oct 19, 2007 4:54 pm
by pdwyer
KingNips wrote: I'm writing my own AAC codec for WMP.
My head just exploded...
In what language? I have this nagging suspicion that you're just a tourist here.

Posted: Fri Oct 19, 2007 4:58 pm
by pdwyer
If you're bored you should take a look here
http://projecteuler.net/
<throws down gauntlet>
I haven't been back in a couple of months, but I'm a 13% genius so far.

Posted: Fri Oct 19, 2007 6:41 pm
by Fred
Actually, it is a bug (now fixed). It's better to use UTF-8 source files if you write Unicode programs (then you wouldn't have had this issue).
Posted: Fri Oct 19, 2007 10:53 pm
by mskuma
For things like that I always use (compiled in unicode):
Code:
kanji.c = 28450 ; or $6F22
kanjiStr.s = PeekS(@kanji,1)
Debug kanjiStr
MessageRequester ("test", kanjiStr)
I never used chr() because historically that was basically meant for ASCII only.
I personally almost exclusively use UTF8 so I can enter Japanese chars directly into the source avoiding a lot of complications.
Posted: Sat Oct 20, 2007 5:22 am
by pdwyer
Fred wrote: It's better to use UTF-8 source files if you write Unicode programs (then you wouldn't have had this issue).
I'm not sure I understand this. My IDE is set to use UTF-8, but in the MessageRequester() example there's nothing non-ASCII in the IDE. I passed a Unicode value to Chr(), and unless it was of a certain type it wouldn't display. If I used UTF-8, how would I have passed it to MessageRequester() unless I converted it again?
<RANT>
Thankfully, my intl requirements are (at this stage anyway) just Japanese, so I can avoid Unicode. My system handles code page 932, so I can type kanji into the IDE, it displays fine, the apps are fine, and it even displays in the debugger, and I don't need to compile in Unicode! I just need to be careful about string lengths, Chr(), etc.
I was wondering about these Chr() functions with UTF-16, but generally they look fine in Unicode mode. But UTF-8... ??? I suppose the advantage is that for most apps that don't need intl support UTF-8 just works like ASCII, but other than its ability to hold intl data it's a real pain to work with, more painful than UTF-16 and non-Unicode code pages.
</RANT>
@mskuma: How is your OS set up? For me, Japanese is fine in the IDE, but typed Japanese doesn't go in as UTF-8; everything is cp932. My system has multi-language support installed, so my wife's profile is all Japanese, right down to the Start button and menus, and my profile is English but has support for Japanese (more than just the Asian fonts installed). But I see nowhere in the regional settings or elsewhere to use UTF-8 rather than code pages. I'm not familiar with any way of setting the OS to UTF-8 either, except for IE.
