
about percent hex...

Posted: Thu Dec 19, 2013 3:29 pm
by AlanFoo
Hi experts,

I hope someone can help me with this.

I converted a Unicode character, e.g. the Chinese character "中", using Asc() and got 20013.
Hex(20013) then gives 4E2D.

When the same character "中" is sent through a browser, e.g. www.host.com/?中,
the browser converts "中" to %E4%B8%AD, which I understand is called percent hex and is equivalent to 4E2D.

I am not sure how to use PureBasic to get from the Unicode character "中" to the percent hex %E4%B8%AD.

Or is there any way in PureBasic to convert 4E2D to %E4%B8%AD, so that I can process it the same way PHP does in the browser?

<?php
$query_string = $_SERVER['QUERY_STRING'];
print $query_string;
?>

This PHP code displays %E4%B8%AD
instead of 4E2D.

Please help if there is a PB routine for this conversion.

Regards
Alan

Re: about percent hex...

Posted: Thu Dec 19, 2013 3:35 pm
by JHPJHP
Hi AlanFoo,

Not knowing if your setup may be a factor...
Possibly: URLEncoder(URL$) / URLDecoder(URL$)

Re: about percent hex...

Posted: Thu Dec 19, 2013 4:22 pm
by AlanFoo
JHPJHP wrote:Hi AlanFoo,

Not knowing if your setup may be a factor...
Possibly: URLEncoder(URL$) / URLDecoder(URL$)
Thanks for the reply.

No, I don't think it is URLEncoder, but rather the way browsers convert Unicode to the percent hex format %E4%B8%AD, which
actually represents the same value as decimal 20013. I believe it is another form of hex; the one PB produces takes the form 4E2D.

I am not sure how to convert %E4%B8%AD to decimal 20013.

Alan

Re: about percent hex...

Posted: Thu Dec 19, 2013 4:40 pm
by JHPJHP
Just some more information to assist tracing the problem: http://www.charbase.com/4e2d-unicode-cj ... -ideograph
- UTF-16 to UTF-8
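
For reference, the UTF-16 to UTF-8 step for this character can be worked out by hand. The following is only a sketch, assuming a code point in the range $0800 to $FFFF (the three-byte UTF-8 form); the variable names are made up for illustration.

```purebasic
; Sketch: manual UTF-8 encoding of one code point in the 3-byte range
Define c = $4E2D                        ; 20013, the character "中"

; UTF-8 layout for this range: 1110xxxx 10xxxxxx 10xxxxxx
Define b1 = $E0 | (c >> 12)             ; top 4 bits    -> $E4
Define b2 = $80 | ((c >> 6) & $3F)      ; middle 6 bits -> $B8
Define b3 = $80 | (c & $3F)             ; low 6 bits    -> $AD

Debug "%" + Hex(b1) + "%" + Hex(b2) + "%" + Hex(b3) ; %E4%B8%AD
```

So 4E2D and %E4%B8%AD are not the same number written two ways: the first is the code point, the second is its three UTF-8 bytes.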

Re: about percent hex...

Posted: Thu Dec 19, 2013 5:53 pm
by PMV

Code: Select all

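; Percent-encode the UTF-8 bytes of a string
Define Letter.s = "中"
Define *Buffer = AllocateMemory(StringByteLength(Letter, #PB_UTF8) + 1) ; +1 for the terminating null
PokeS(*Buffer, Letter, -1, #PB_UTF8) ; -1 writes the whole string, not only the first character

Define *w.BYTE = *Buffer

Define Result.s = ""
While *w\b
  Result + "%" + RSet(Hex(*w\b, #PB_Byte), 2, "0") ; always two hex digits, e.g. %0A
  *w + 1
Wend

Debug Result ; %E4%B8%AD
FreeMemory(*Buffer)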

Re: about percent hex...

Posted: Thu Dec 19, 2013 7:19 pm
by Demivec
@AlanFoo: Here's an example to convert from hex percent to the hex value of the character and back again depending on your preference.

Code: Select all

Define i, UTF8_ByteCount

Define *buffer1, browserText.s

;convert browser text into PB Unicode String, assumes encoding is UTF8
browserText.s = "%E4%B8%AD" ;isolated text in hex percent encoding
Debug browserText

UTF8_ByteCount = CountString(browserText, "%")
*buffer1 = AllocateMemory(UTF8_ByteCount + 1) ;add 1 byte for a null

For i = 1 To UTF8_ByteCount
  PokeB(*buffer1 + i -1, Val("$" + StringField(browserText, i + 1, "%")))
Next

Debug PeekS(*buffer1, -1, #PB_UTF8) ;display character
Debug "$" + Hex(Asc(PeekS(*buffer1, -1, #PB_UTF8)), #PB_Word) ;display hex value of character
 
Define c, *buffer2, convertedText.s

Debug "-----------"

;convert unicode character into hex percent format
c = $4E2D ;character
Debug "$" + Hex(c, #PB_Word) ;display hex value of character to convert

*buffer2 = AllocateMemory(StringByteLength(Chr(c), #PB_UTF8) + 1) ;hold UTF8 version of unicode character
PokeS(*buffer2, Chr(c), 1, #PB_UTF8)
UTF8_ByteCount = MemoryStringLength(*buffer2 , #PB_UTF8)

convertedText.s
For i = 1 To UTF8_ByteCount
  convertedText + "%" + Hex(PeekB(*buffer2 + i - 1), #PB_Byte)
Next
Debug convertedText

The part that converts to hex percent is virtually identical to PMV's version. I started mine before he posted and thought I would post it anyway. I think his looks neater too. :)
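
The first half of the example above can also be wrapped in a procedure and used to read off the decimal value of each decoded character. This is a rough sketch rather than a tested routine: `PercentHexToString` is a hypothetical name, the input is assumed to consist only of `%XX` groups that form valid UTF-8, and the program is assumed to be compiled in Unicode mode so that Asc() can return values above 255.

```purebasic
; Sketch: decode a run of %XX groups (UTF-8) into a PB string
Procedure.s PercentHexToString(text.s)
  Protected i, count = CountString(text, "%")
  Protected *buf = AllocateMemory(count + 1) ; +1 for the terminating null
  Protected result.s
  If *buf
    For i = 1 To count
      ; field 1 is the empty text before the first "%", so start at field 2
      PokeA(*buf + i - 1, Val("$" + StringField(text, i + 1, "%")))
    Next
    result = PeekS(*buf, -1, #PB_UTF8)
    FreeMemory(*buf)
  EndIf
  ProcedureReturn result
EndProcedure

Define decoded.s = PercentHexToString("%E4%B8%AD%E6%96%87") ; "中文"
Define k
For k = 1 To Len(decoded)
  Debug Asc(Mid(decoded, k, 1)) ; 20013, then 25991 in a Unicode build
Next
```

Letting PeekS() do the UTF-8 decoding avoids hand-written bit masks, at the cost of a temporary buffer.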

Re: about percent hex...

Posted: Thu Dec 19, 2013 10:14 pm
by AlanFoo
Dear PMV,

Thanks a lot.

Your conversion is very neat.
Converting from "中" is one of the things I need, and Demivec has provided the routine to go from "%E4%B8%AD" to Unicode "中".

As posted, I also need to convert from the UTF-8 "%E4%B8%AD" to decimal 20013.
Can you help?

Regards
Alan

Re: about percent hex...

Posted: Fri Dec 20, 2013 1:24 am
by AlanFoo
Thanks indeed to both Demivec and PMV for the prompt responses.

Demivec, your routine converts the UTF-8 to hex and back to UTF-8. Very useful to me.

I will need to know how to convert to ascii directly from utf8 "%E4%B8%AD" percent hex to dec.

regards
Alan

Re: about percent hex...

Posted: Fri Dec 20, 2013 6:54 am
by Shield
Just for the record I copy my post here since the OP did a double post
and I seemed to be the only one who didn't post in this version.

@Moderators: please delete the duplicate topic, thank you.
Shield (in other duplicate post) wrote:PB gives you the UTF-16 encoding but in the URL it is encoded as UTF-8.
The following code demonstrates how to get the correct hex values:

Code: Select all

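string.s = "中"
value.i = 0 ; used as a tiny buffer: only safe for very short strings such as this one

length = PokeS(@value, string, -1, #PB_UTF8) ; number of bytes written, not counting the null
For i = 0 To length - 1
  Debug Hex(PeekA(@value + i))
Next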

Re: about percent hex...

Posted: Fri Dec 20, 2013 7:20 am
by wilbert
AlanFoo wrote:I will need to know how to convert to ascii directly from utf8 "%E4%B8%AD" percent hex to dec.
You will have to check it with some more codes, but I think this will do.

Code: Select all

Procedure.u PercentHexToDec(PercentHex.s)
  Protected result.u, l.l = Val("$" + RemoveString(PercentHex, "%"))
  If l & $FFFFFF80 = 0
    ; 1 byte code
    result = l
  ElseIf l & $FFFFE0C0 = $C080
    ; 2 byte code
    result = l & $3F + l >> 2 & $7C0
  ElseIf l & $FFF0C0C0 = $E08080
    ; 3 byte code
    result = l & $3F + l >> 2 & $FC0 + l >> 4 & $F000
  EndIf
  ProcedureReturn result  
EndProcedure

Debug PercentHexToDec("%E4%B8%AD")
Debug PercentHexToDec("%D5%B3")
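
The three-byte branch above may be easier to follow when the bytes are handled one at a time. A small sketch, assuming the three bytes have already been split out of "%E4%B8%AD" (`b1`, `b2`, `b3` and `codePoint` are illustrative names, not part of the procedure above):

```purebasic
; Sketch: rebuild the code point from the three UTF-8 bytes of "%E4%B8%AD"
Define b1 = $E4, b2 = $B8, b3 = $AD

; Strip the marker bits (1110.... / 10......) and join the payload bits
Define codePoint = ((b1 & $0F) << 12) | ((b2 & $3F) << 6) | (b3 & $3F)

Debug codePoint        ; 20013
Debug Hex(codePoint)   ; 4E2D
```

The procedure above does the same thing in one expression by masking and shifting the packed value $E4B8AD.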

Re: about percent hex...

Posted: Fri Dec 20, 2013 7:58 am
by AlanFoo
Got it ... with thanks.

Alan

Re: about percent hex...

Posted: Fri Dec 20, 2013 8:00 am
by AlanFoo
Thanks, it works.
Alan