about percent hex...
Posted: Thu Dec 19, 2013 3:29 pm
by AlanFoo
Hi experts,
I hope someone can help me with this.
I converted a Unicode character, e.g. the Chinese character "中", using Asc() and got 20013,
and Hex(20013) gives me 4E2D.
When I pass the same character "中" to a browser, e.g.
www.host.com/?中
the browser converts "中" to %E4%B8%AD, which I understand is called percent hex and is equivalent to 4E2D.
I am not sure how to use PureBasic to get from the Unicode character "中" to %E4%B8%AD.
Or is there any way in PureBasic to convert 4E2D to %E4%B8%AD, so that I can process it the same way PHP sees it from the browser?
<?php
$query_string = $_SERVER['QUERY_STRING'];
print $query_string;
?>
This PHP code displays %E4%B8%AD
instead of 4E2D.
Please help if there is a PB routine for this conversion.
Regards
Alan
Re: about percent hex...
Posted: Thu Dec 19, 2013 3:35 pm
by JHPJHP
Hi AlanFoo,
Not knowing if your setup may be a factor...
Possibly: URLEncoder(URL$) / URLDecoder(URL$)
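A quick way to try this is to run the URL through both functions and compare the result with what the browser sends. This is only a sketch: it assumes a Unicode executable, and whether URLEncoder() produces the UTF-8 percent form for non-ASCII characters may depend on the PureBasic version, so no output is shown.

```purebasic
; Assumes a Unicode build. Chr($4E2D) is used instead of a literal "中"
; so the result does not depend on the encoding of the source file.
Define url.s = "http://www.host.com/?" + Chr($4E2D)
Define encoded.s = URLEncoder(url)
Debug encoded             ; compare this with the query string PHP prints
Debug URLDecoder(encoded) ; expected to give back the original URL if both calls use the same encoding
```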
Re: about percent hex...
Posted: Thu Dec 19, 2013 4:22 pm
by AlanFoo
JHPJHP wrote:Hi AlanFoo,
Not knowing if your setup may be a factor...
Possibly: URLEncoder(URL$) / URLDecoder(URL$)
Thanks for the reply.
No, I don't think it is URLEncoder, but rather the way browsers convert Unicode to the percent hex format %E4%B8%AD, which
actually represents the same value as decimal 20013. I believe it is another form of hex; the one PB produces takes the form 4E2D.
I am not sure how to convert %E4%B8%AD to decimal 20013.
Alan
Re: about percent hex...
Posted: Thu Dec 19, 2013 4:40 pm
by JHPJHP
Just some more information to assist tracing the problem:
http://www.charbase.com/4e2d-unicode-cj ... -ideograph
- UTF16 to UTF8
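To spell out the UTF16 to UTF8 step: Asc() returns the character code $4E2D (20013), while the browser sends the UTF-8 bytes of that same character, each written as %XX. In UTF-8 the bits of the code are spread over one to three bytes for codes up to $FFFF. The sketch below does that by hand. CodePointToPercentHex and HexByte are hypothetical helper names, not PureBasic built-ins, and codes above $FFFF are not handled.

```purebasic
; Sketch only: hypothetical helpers, limited to character codes up to $FFFF.
Procedure.s HexByte(value.i)
  ProcedureReturn "%" + RSet(Hex(value & $FF), 2, "0") ; always two hex digits
EndProcedure

Procedure.s CodePointToPercentHex(cp.i)
  If cp < $80
    ; 0xxxxxxx
    ProcedureReturn HexByte(cp)
  ElseIf cp < $800
    ; 110xxxxx 10xxxxxx
    ProcedureReturn HexByte($C0 | (cp >> 6)) + HexByte($80 | (cp & $3F))
  ElseIf cp <= $FFFF
    ; 1110xxxx 10xxxxxx 10xxxxxx
    ProcedureReturn HexByte($E0 | (cp >> 12)) + HexByte($80 | ((cp >> 6) & $3F)) + HexByte($80 | (cp & $3F))
  EndIf
  ProcedureReturn "" ; 4-byte sequences are outside this sketch
EndProcedure

Debug CodePointToPercentHex($4E2D) ; %E4%B8%AD
Debug CodePointToPercentHex($573)  ; %D5%B3
```

Note that this escapes every character, including plain ASCII ("A" becomes %41), whereas a browser leaves unreserved characters as they are.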
Re: about percent hex...
Posted: Thu Dec 19, 2013 5:53 pm
by PMV
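Code:
Define Letter.s = "中"
; reserve room for the UTF-8 bytes plus the terminating null
Define *Buffer = AllocateMemory(StringByteLength(Letter, #PB_UTF8) + 1)
PokeS(*Buffer, Letter, -1, #PB_UTF8) ; write the whole string as UTF-8
Define *w.BYTE = *Buffer
Define Result.s = ""
While *w\b ; walk the bytes until the null
  Result + "%" + RSet(Hex(*w\b, #PB_Byte), 2, "0") ; pad to two hex digits, e.g. %0A rather than %A
  *w + 1
Wend
FreeMemory(*Buffer)
Debug Result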
Re: about percent hex...
Posted: Thu Dec 19, 2013 7:19 pm
by Demivec
@AlanFoo: Here's an example to convert from hex percent to the hex value of the character and back again depending on your preference.
Code:
Define i, UTF8_ByteCount
Define *buffer1, browserText.s
;convert browser text into PB Unicode string, assumes encoding is UTF8
;and that every byte is percent-escaped (no plain characters in between)
browserText = "%E4%B8%AD" ;isolated text in hex percent encoding
Debug browserText
UTF8_ByteCount = CountString(browserText, "%")
*buffer1 = AllocateMemory(UTF8_ByteCount + 1) ;add 1 byte for a null
For i = 1 To UTF8_ByteCount
  ;field 1 is the empty text before the first "%", so the bytes start at field 2
  PokeB(*buffer1 + i - 1, Val("$" + StringField(browserText, i + 1, "%")))
Next
Debug PeekS(*buffer1, -1, #PB_UTF8) ;display character
Debug "$" + Hex(Asc(PeekS(*buffer1, -1, #PB_UTF8)), #PB_Word) ;display hex value of character
Define c, *buffer2, convertedText.s
Debug "-----------"
;convert unicode character into hex percent format
c = $4E2D ;character
Debug "$" + Hex(c, #PB_Word) ;display hex value of character to convert
*buffer2 = AllocateMemory(StringByteLength(Chr(c), #PB_UTF8) + 1) ;hold UTF8 version of unicode character
PokeS(*buffer2, Chr(c), 1, #PB_UTF8)
UTF8_ByteCount = StringByteLength(Chr(c), #PB_UTF8) ;number of UTF8 bytes, not characters
For i = 1 To UTF8_ByteCount
  convertedText + "%" + RSet(Hex(PeekB(*buffer2 + i - 1), #PB_Byte), 2, "0") ;two hex digits per byte
Next
Debug convertedText
The part that converts to hex percent is virtually identical to PMV's version. I started mine before he posted and thought I would post it anyway, even though he got there first. I think his looks neater too.

Re: about percent hex...
Posted: Thu Dec 19, 2013 10:14 pm
by AlanFoo
Dear PMV,
Thanks a lot.
Your conversion is very neat.
Converting from "中" is one of the things I need, and Demivec has provided the routine to go from "%E4%B8%AD" to Unicode "中".
As posted, I also need to convert from UTF-8 "%E4%B8%AD" to decimal 20013.
Can you help?
Regards
Alan
Re: about percent hex...
Posted: Fri Dec 20, 2013 1:24 am
by AlanFoo
Demivec wrote: @AlanFoo: Here's an example to convert from hex percent to the hex value of the character and back again depending on your preference.
Thanks indeed for the prompt response, to both you and PMV.
Your routine converts the UTF-8 to hex and back to UTF-8. Very useful to me.
I still need to know how to convert the UTF-8 percent hex "%E4%B8%AD" directly to its decimal character code.
Regards
Alan
Re: about percent hex...
Posted: Fri Dec 20, 2013 6:54 am
by Shield
Just for the record, I am copying my post here since the OP double-posted
and I seem to be the only one who didn't post in this copy of the thread.
@Moderators: please delete the duplicate topic, thank you.
Shield (in other duplicate post) wrote: PB gives you the UTF-16 encoding, but in the URL it is encoded as UTF-8.
The following code demonstrates how to get the correct hex values:
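Code:
string.s = "中"
value.i = 0 ; used as a tiny buffer: 3 UTF-8 bytes plus the null just fit; longer text needs AllocateMemory()
length = PokeS(@value, string, -1, #PB_UTF8) ; number of bytes written, without the null
For i = 0 To length - 1
  Debug Hex(PeekA(@value + i)) ; E4, B8, AD
Next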
Re: about percent hex...
Posted: Fri Dec 20, 2013 7:20 am
by wilbert
AlanFoo wrote: I still need to know how to convert the UTF-8 percent hex "%E4%B8%AD" directly to its decimal character code.
You will have to check with some more codes, but I think this will do:
Code:
Procedure.u PercentHexToDec(PercentHex.s)
  ; read the escaped bytes as one hex number, e.g. "%E4%B8%AD" -> $E4B8AD
  Protected result.u, l.l = Val("$" + RemoveString(PercentHex, "%"))
  If l & $FFFFFF80 = 0
    ; 1 byte code: 0xxxxxxx
    result = l
  ElseIf l & $FFFFE0C0 = $C080
    ; 2 byte code: 110xxxxx 10xxxxxx
    result = l & $3F + l >> 2 & $7C0
  ElseIf l & $FFF0C0C0 = $E08080
    ; 3 byte code: 1110xxxx 10xxxxxx 10xxxxxx
    result = l & $3F + l >> 2 & $FC0 + l >> 4 & $F000
  EndIf
  ; anything else (4 byte codes, malformed input) returns 0
  ProcedureReturn result
EndProcedure
Debug PercentHexToDec("%E4%B8%AD") ; 20013
Debug PercentHexToDec("%D5%B3") ; 1395
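One way to "check with some more codes" is a round trip over every character code up to $FFFF: encode the character to percent hex, decode it with the procedure above, and compare. This is a sketch under a few assumptions: a Unicode executable, PokeS() returning the number of bytes written (as Shield's snippet relies on), and CharToPercentHex being a hypothetical helper written just for this test.

```purebasic
; wilbert's procedure, repeated unchanged so that this sketch runs on its own
Procedure.u PercentHexToDec(PercentHex.s)
  Protected result.u, l.l = Val("$" + RemoveString(PercentHex, "%"))
  If l & $FFFFFF80 = 0
    result = l
  ElseIf l & $FFFFE0C0 = $C080
    result = l & $3F + l >> 2 & $7C0
  ElseIf l & $FFF0C0C0 = $E08080
    result = l & $3F + l >> 2 & $FC0 + l >> 4 & $F000
  EndIf
  ProcedureReturn result
EndProcedure

; Hypothetical helper: percent-encode one character through PokeS(), as in PMV's post
Procedure.s CharToPercentHex(c.i)
  Protected text.s = Chr(c), out.s, i, length
  Protected *mem = AllocateMemory(StringByteLength(text, #PB_UTF8) + 1)
  If *mem
    length = PokeS(*mem, text, -1, #PB_UTF8) ; bytes written, without the null
    For i = 0 To length - 1
      out + "%" + RSet(Hex(PeekA(*mem + i)), 2, "0") ; two hex digits per byte
    Next
    FreeMemory(*mem)
  EndIf
  ProcedureReturn out
EndProcedure

Define c, mismatches
For c = 1 To $FFFF
  If c < $D800 Or c > $DFFF ; skip the surrogate range, those are not characters
    If PercentHexToDec(CharToPercentHex(c)) <> c
      mismatches + 1
      Debug "mismatch at $" + Hex(c)
    EndIf
  EndIf
Next
Debug "mismatches: " + Str(mismatches)
```

Code 0 is left out because Chr(0) gives an empty string, so there is nothing to encode.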
Re: about percent hex...
Posted: Fri Dec 20, 2013 7:58 am
by AlanFoo
wilbert wrote: You will have to check with some more codes but I think this will do
Got it ... with thanks.
Alan
Re: about percent hex...
Posted: Fri Dec 20, 2013 8:00 am
by AlanFoo
Shield wrote: PB gives you the UTF-16 encoding but in the URL it is encoded as UTF-8.
Thanks it works...
Alan