Using Chr() With Val()

Posted: Thu Feb 19, 2015 1:44 am
by chris319
I am compiling the following as unicode:

Code:

PrintN (Chr($2046))

a.s = "$2046": PrintN (Chr(Val(a.s)))
The first line prints an uppercase "F" as expected.

The second line prints a question mark.

How can I print "F" using the Val() of a string?

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 2:30 am
by skywalk

Code:

Debug Chr($46)
a$ = "$46"
Debug Chr(Val(a$))

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 2:47 am
by chris319
That doesn't help. I'm importing this hex number from a file generated by another application so I have to use that application's character encoding.

It works if I pass the number directly but not if I try to convert a string to a number using val(), or even to an intermediate variable.

Code:

z$ = "$2046"
m.q = Val(z$): Debug m
n.u = m
Debug Chr(n.u)
Debug Chr(8262)

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 3:53 am
by heartbone
chris319 wrote:I am compiling the following as unicode:

Code:

PrintN (Chr($2046))

a.s = "$2046": PrintN (Chr(Val(a.s)))
The first line prints an uppercase "F" as expected.

The second line prints a question mark.

How can I print "F" using the Val() of a string?
chris319 wrote:That doesn't help. I'm importing this hex number from a file generated by another application so I have to use that application's character encoding.

It works if I pass the number directly but not if I try to convert a string to a number using val(), or even to an intermediate variable.

Code:

z$ = "$2046"
m.q = Val(z$): Debug m
n.u = m
Debug Chr(n.u)
Debug Chr(8262)
This whole Unicode thing can be somewhat confusing at times, so I'm probably not much help here, chris319.
I have no idea why Chr($2046) works when the built-in IDE help declares Chr() to be an ASCII (ANSI?) command.
That may have something to do with it.
Or not. :|

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 3:55 am
by chris319
From the help for Chr(): "The character value. It can be an ASCII value, or an unicode value if the program is compiled as Unicode mode."

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 4:28 am
by Dude
Seems to be a bug with Chr? This should work fine when compiled in Unicode, but I get an F and a dash.

Code:

h$="$2046" ; 8262
Debug Chr(8262) ; F
Debug Chr(Val(h$)) ; -

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 4:54 am
by BasicallyPure
It looks like Chr() doesn't like that big of a number.
You can force it to work by masking it down to 13 binary bits.
The largest number Chr() can take is $1FFF or 8191.

Code:

z$ = "$2046"

m.q = Val(z$)
Debug Bin(m) ; = %10000001000110 = 14 significant bits

n.u = m 
Debug Bin(n) ; still = %10000001000110 = 14 significant bits

n & %1111111111111 ; remove all bits above 13.
Debug Bin(n) ; = %0000001000110

Debug Chr(n) ; now it works

Debug Chr(8262)

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 5:42 am
by Little John
Dude wrote:Seems to be a bug with Chr? This should work fine when compiled in Unicode, but I get an F and a dash.

Code:

h$="$2046" ; 8262
Debug Chr(8262) ; F
Debug Chr(Val(h$)) ; -
No bug with Chr().
Using MessageRequester() instead of Debug works as expected, i.e. both lines show the same character (at least with PB 5.31 on Windows):

Code:

h$="$2046"
MessageRequester("", Chr($2046) + #LF$ +
                     Chr(Val(h$)))
So this seems to be a limitation of Debug, or of the font that Debug uses on your or my system.
A Unicode character can only be displayed if the font in use has a glyph for it.
Many fonts lack glyphs for a number of Unicode characters.
BasicallyPure wrote:Debug Chr(n) ; now it works
No, that shows F here.
But Chr($2046) is ⁆ (U+2046, RIGHT SQUARE BRACKET WITH QUILL), not F.
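
A quick way to see why the masked version shows F, sketched under the assumption of a Unicode build (the variable names are only illustrative): keeping the low 13 bits of $2046 leaves $46, which is the code for F, so masking changes the character instead of fixing the display.

```purebasic
h$ = "$2046"
code = Val(h$)           ; 8262, parsed at run time
masked = code & $1FFF    ; keep only the low 13 bits, as in the masking suggestion

Debug Hex(code)          ; 2046
Debug Hex(masked)        ; 46
Debug Chr(masked)        ; F, i.e. U+0046 and no longer U+2046
```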

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 5:51 am
by Demivec
BasicallyPure wrote: It looks like Chr() doesn't like that big of a number.
You can force it to work by masking it down to 13 binary bits.
The largest number Chr() can take is $1FFF or 8191
I didn't have any problems with numbers as high as $FFFF.

This code ran without problems and showed characters as high as $FFFD.

Code:

For i = 1 To $ffff
  Debug "" + i + "= (" + Hex(i) + ") : " + Chr(i)
Next
If you are debugging the value to the output window you may need to select a custom font in the IDE preferences (for the Debugger).

For the Console window (as in the first post) I could only see a selection of 3 fonts which didn't include much in the way of Unicode characters. The font was selectable from the console's system menu.

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 6:02 am
by netmaestro
Ascii or Unicode:

Code:

n=$2046
Debug Chr(Val(StrU(n, #PB_Ascii)))

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 6:07 am
by Little John
netmaestro wrote:Ascii or Unicode:

Code:

n=$2046
Debug Chr(Val(StrU(n, #PB_Ascii)))
That wrongly displays F here.
For more information read http://www.purebasic.fr/english/viewtop ... 08#p461308

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 2:29 pm
by luis
You didn't mention the OS, the compiler version, etc.

So I'll assume Windows 7 and 5.31 x86 because it's what I'm using by default.
chris319 wrote: The first line prints an uppercase "F" as expected.
The expected result for a Unicode program, or at least a meaningful one, would be ⁆ and not F (as LJ already noted).

What you get depends on the font in use by the console in Unicode mode, and on the font in use by the console and its currently selected code page in ASCII mode.

I don't get the char above printed because the font I have selected (Lucida Console) does not have the char mapped.

See here -> http://www.fileformat.info/info/unicode ... s/grid.htm

But I get the "empty rectangle" symbol printed twice (so the same output twice).

I know it maps Cyrillic chars, so after replacing your $2046 with a Cyrillic one ($044f), both expressions seem to work as expected and I get the correct char (я) printed twice in the console.

Code:

OpenConsole()

PrintN(Chr($044f)) ; OK
a.s = "$044f" : PrintN(Chr(Val(a.s))) ; OK

Input()
CloseConsole()

Even this works as expected:

Code:

Debug (Chr($044f)) ; OK
a.s = "$044f" : Debug (Chr(Val(a.s))) ; OK
This also works as expected:

Code:

z$ = "$2046"
m.q = Val(z$): Debug m
n.u = m
Debug Chr(n.u) ; OK
Debug Chr(8262) ; OK
This also works as expected:

Code:

h$="$2046" ; 8262
Debug Chr(8262) ; OK
Debug Chr(Val(h$)) ; OK

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 2:30 pm
by heartbone
Little John wrote:So this seems to be a limitation of Debug, or of the font that Debug uses on your or my system.
A Unicode character can only be displayed, if the used font has a glyph for it.
In many fonts, glyphs for several Unicode characters are missing.
Is there even a suitable replacement font available to test your theory?
luis wrote:I don't get the char above printed because the font I have selected (Lucida Console) does not have the char mapped.
I'm also using Lucida Console Regular 12 as my source code font. That must be the default, although I was under the impression that I had selected it. But I didn't think the debug output and requesters used the same font.

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 4:18 pm
by kenmo
1. Compiler > Compiler Options > Create Unicode Executable
2. File > File Format > Encoding: UTF-8

It's a bug, see here:
http://www.purebasic.fr/english/viewtop ... 24&t=60930

The compiler does a small optimization: it converts constant Chr() calls directly to strings at compile time.

This causes a problem when using Unicode values in Chr() in ASCII format source files.

The compiler tries to convert Chr($2046) to a character, but the character 0x2046 cannot fit in one ASCII byte, so it apparently is truncated to 0x46 ("F").
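
If the optimization only applies to constant arguments, as described above, then routing the value through a procedure parameter should keep Chr() from being converted at compile time. A minimal sketch: RuntimeChr() is a hypothetical helper, not part of PureBasic, and the behavior is an assumption based on this thread, not on documented compiler behavior.

```purebasic
; Hypothetical helper: the argument is a variable, so Chr() is called at run time.
Procedure.s RuntimeChr(code.i)
  ProcedureReturn Chr(code)
EndProcedure

Debug RuntimeChr($2046)   ; Unicode build: the character U+2046, if the font has a glyph for it
Debug RuntimeChr($46)     ; F
```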

Re: Using Chr() With Val()

Posted: Thu Feb 19, 2015 4:55 pm
by luis
kenmo wrote: The compiler tries to convert Chr($2046) to a character, but the character 0x2046 cannot fit in one ASCII byte, so it apparently is truncated to 0x46 ("F").
You are right about the bug, and there are two funny things going on here:

The first funny thing is that the compiler could actually store the number without problems, since it's using double words for it -> http://www.purebasic.fr/english/viewtop ... 91#p453091
It just writes the wrong numbers there for no apparent reason other than the file format of the source, which has nothing to do with the code it actually generates, since what it sees is just an ASCII literal constant (the number) and not a symbol encoded in some way.

The second funny thing is that the problem reported here says the first print is OK (the 'F'), when the expected result from Chr($2046) in a Unicode program is not F. That's the wrong output, not the good one. Maybe it's what he would like to get, but it's not the Unicode char mapped to that constant.

The second output with the question mark is actually the correct one, because by using a string + Val() instead of the constant as in the first print, you bypass the bug in the compiler: the second number is not stored the wrong way in the exe but is evaluated at runtime.
The question mark is there just for the reasons explained above.
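
To tell a wrong character apart from a missing glyph, one option is to print the numeric code of the resulting string instead of relying on the font. A small sketch, assuming a Unicode build; the variable names are illustrative.

```purebasic
a$ = "$2046"
s$ = Chr(Val(a$))     ; built at run time from the string
Debug Hex(Asc(s$))    ; expected to be 2046 in a Unicode build, whatever the font shows
```

If this reports 2046 while the console still prints a question mark or an empty box, the string is right and only the glyph is missing.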