about percent hex...

AlanFoo
Enthusiast
Posts: 172
Joined: Fri Jul 24, 2009 6:24 am
Location: Malaysia

Post by AlanFoo »

Hi experts,

I hope someone can help me with this.

I converted a Unicode character, e.g. the Chinese character "中", using Asc() and got 20013,
and Hex(20013) gives 4E2D.

When I pass this same character "中" in a URL, e.g. www.host.com/?中,
the browser converts "中" to %E4%B8%AD, which I understand is called percent hex and is equivalent to 4E2D.

I am not sure how to use PureBasic to arrive at %E4%B8%AD from the Unicode character "中".

Or is there any way, using PureBasic, to convert 4E2D to %E4%B8%AD so that I can process it the same way PHP sees it from the browser?

<?php
$query_string = $_SERVER['QUERY_STRING'];
print $query_string;
?>

This PHP code displays %E4%B8%AD
instead of 4E2D.

Please help: is there a PB routine to do the conversion?

Rgds
Alan
JHPJHP
Addict
Posts: 2251
Joined: Sat Oct 09, 2010 3:47 am

Re: about percent hex...

Post by JHPJHP »

Hi AlanFoo,

Not knowing if your setup may be a factor...
Possibly: URLEncoder(URL$) / URLDecoder(URL$)
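As a hedged illustration of that suggestion: in a Unicode build of PureBasic, where URLEncoder() and URLDecoder() work with UTF-8 by default, something like the following should show both directions. The variable names are made up for this sketch, and the source file is assumed to be saved as UTF-8 so the "中" literal survives.

```purebasic
; Sketch only: assumes a Unicode build where URLEncoder()/URLDecoder() default to UTF-8.
Define encoded.s = URLEncoder("中")          ; character -> percent-encoded UTF-8 bytes
Debug encoded

Define decoded.s = URLDecoder("%E4%B8%AD")   ; percent-encoded UTF-8 bytes -> character
Debug decoded
Debug Asc(decoded)                           ; code point of the first character, in decimal
```

If the decoded string holds a single character, Asc() returns its code point as a decimal number, which can then be passed to Hex() as in the first post.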

If you're not investing in yourself, you're falling behind.

My PureBasic StuffFREE STUFF, Scripts & Programs.
My PureBasic Forum ➤ Questions, Requests & Comments.
AlanFoo
Enthusiast
Posts: 172
Joined: Fri Jul 24, 2009 6:24 am
Location: Malaysia

Re: about percent hex...

Post by AlanFoo »

JHPJHP wrote:Hi AlanFoo,

Not knowing if your setup may be a factor...
Possibly: URLEncoder(URL$) / URLDecoder(URL$)
Thanks for the reply.

No, I don't think it is URLEncoder, but rather the way browsers convert Unicode to the percent hex format %E4%B8%AD, which
actually is the same value as decimal 20013. It is, I believe, another form of hex; the one PB produces takes the form 4E2D.

I am not sure how to convert %E4%B8%AD to decimal 20013.

alan
JHPJHP
Addict
Posts: 2251
Joined: Sat Oct 09, 2010 3:47 am

Re: about percent hex...

Post by JHPJHP »

Just some more information to assist tracing the problem: http://www.charbase.com/4e2d-unicode-cj ... -ideograph
- UTF16 to UTF8
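
To make the "UTF16 to UTF8" step concrete, here is a minimal sketch of the bit arithmetic, assuming the code point lies in the range U+0000 to U+FFFF and is not a surrogate. CodePointToPercentHex is a hypothetical helper name, not a PureBasic built-in.

```purebasic
; Hypothetical helper: percent-encode one code point (U+0000..U+FFFF) as UTF-8 bytes.
; Returns an empty string for values outside that range.
Procedure.s CodePointToPercentHex(c.l)
  Protected result.s = ""
  If c >= 0 And c < $80
    ; 1 byte: 0xxxxxxx (padded so values below $10 still give two hex digits)
    result + "%" + RSet(Hex(c), 2, "0")
  ElseIf c >= $80 And c < $800
    ; 2 bytes: 110xxxxx 10xxxxxx
    result + "%" + Hex($C0 | (c >> 6))
    result + "%" + Hex($80 | (c & $3F))
  ElseIf c >= $800 And c < $10000
    ; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    result + "%" + Hex($E0 | (c >> 12))
    result + "%" + Hex($80 | ((c >> 6) & $3F))
    result + "%" + Hex($80 | (c & $3F))
  EndIf
  ProcedureReturn result
EndProcedure

Debug CodePointToPercentHex($4E2D) ; %E4%B8%AD
```

For $4E2D the three pieces are $4, $38 and $2D, which become $E4, $B8 and $AD once the UTF-8 marker bits are added. A real URL encoder would normally leave unreserved ASCII characters such as letters and digits unescaped; this sketch escapes everything to keep the arithmetic visible.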

PMV
Enthusiast
Posts: 727
Joined: Sat Feb 24, 2007 3:15 pm
Location: Germany

Re: about percent hex...

Post by PMV »

Code: Select all

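Define Letter.s = "中"
; Reserve room for the UTF-8 bytes plus the terminating null byte
Define *Buffer = AllocateMemory(StringByteLength(Letter, #PB_UTF8) + 1)
PokeS(*Buffer, Letter, -1, #PB_UTF8) ; -1 writes the whole string, not just one character

Define *w.BYTE = *Buffer

Define Result.s = ""
While *w\b ; walk the bytes until the terminating null
  ; pad to two digits so a byte below $10 still gives a valid %XX triplet
  Result + "%" + RSet(Hex(*w\b, #PB_Byte), 2, "0")
  *w + 1
Wend

Debug Result
FreeMemory(*Buffer)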
Demivec
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Re: about percent hex...

Post by Demivec »

@AlanFoo: Here's an example to convert from hex percent to the hex value of the character and back again depending on your preference.

Code: Select all

Define i, UTF8_ByteCount

Define *buffer1, browserText.s

;convert browser text into PB Unicode String, assumes encoding is UTF8
browserText.s = "%E4%B8%AD" ;isolated text in hex percent encoding
Debug browserText

UTF8_ByteCount = CountString(browserText, "%")
*buffer1 = AllocateMemory(UTF8_ByteCount + 1) ;add 1 byte for a null

For i = 1 To UTF8_ByteCount
  PokeB(*buffer1 + i -1, Val("$" + StringField(browserText, i + 1, "%")))
Next

Debug PeekS(*buffer1, -1, #PB_UTF8) ;display character
Debug "$" + Hex(Asc(PeekS(*buffer1, -1, #PB_UTF8)), #PB_Word) ;display hex value of character
 
Define c, *buffer2, convertedText.s

Debug "-----------"

;convert unicode character into hex percent format
c = $4E2D ;character
Debug "$" + Hex(c, #PB_Word) ;display hex value of character to convert

*buffer2 = AllocateMemory(StringByteLength(Chr(c), #PB_UTF8) + 1) ;hold UTF8 version of unicode character
PokeS(*buffer2, Chr(c), 1, #PB_UTF8)
UTF8_ByteCount = StringByteLength(Chr(c), #PB_UTF8) ;count bytes, not characters (MemoryStringLength() may report characters)

convertedText = ""
For i = 1 To UTF8_ByteCount
  convertedText + "%" + Hex(PeekB(*buffer2 + i - 1), #PB_Byte)
Next
Debug convertedText

The part that converts to hex percent is virtually identical to PMV's version. I started mine before he posted and thought I would post it anyway. I think his looks neater too. :)
AlanFoo
Enthusiast
Posts: 172
Joined: Fri Jul 24, 2009 6:24 am
Location: Malaysia

Re: about percent hex...

Post by AlanFoo »

Dear PMV,

Thanks a lot.

Your conversion is very neat.
Converting from "中" is one of the things I need, and Demivec has provided the routine to go from "%E4%B8%AD" back to Unicode "中".

As posted, I also need to convert from the UTF-8 "%E4%B8%AD" to decimal 20013.
Can you help?

Regards
Alan
AlanFoo
Enthusiast
Posts: 172
Joined: Fri Jul 24, 2009 6:24 am
Location: Malaysia

Re: about percent hex...

Post by AlanFoo »

Demivec wrote:@AlanFoo: Here's an example to convert from hex percent to the hex value of the character and back again depending on your preference.

Thanks indeed to both you and PMV for the prompt responses.

Your routine converts the UTF-8 percent hex to the character's hex value and back again. Very useful to me.

I will need to know how to convert to ascii directly from utf8 "%E4%B8%AD" percent hex to dec.

regards
Alan
Shield
Addict
Posts: 1021
Joined: Fri Jan 21, 2011 8:25 am
Location: 'stralia!
Contact:

Re: about percent hex...

Post by Shield »

Just for the record I copy my post here since the OP did a double post
and I seemed to be the only one who didn't post in this version.

@Moderators: please delete the duplicate topic, thank you.
Shield (in other duplicate post) wrote:PB gives you the UTF-16 encoding but in the URL it is encoded as UTF-8.
The following code demonstrates how to get the correct hex values:

Code: Select all

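string.s = "中"
; Allocate room for the UTF-8 bytes plus the terminating null. Poking into an
; integer variable only works while the encoded text fits in 4 (x86) or 8 (x64) bytes.
*buffer = AllocateMemory(StringByteLength(string, #PB_UTF8) + 1)

length = PokeS(*buffer, string, -1, #PB_UTF8) ; bytes written, not counting the null
For i = 0 To length - 1
	Debug Hex(PeekA(*buffer + i))
Next
FreeMemory(*buffer)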
Blog: Why Does It Suck? (http://whydoesitsuck.com/)
"You can disagree with me as much as you want, but during this talk, by definition, anybody who disagrees is stupid and ugly."
- Linus Torvalds
wilbert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: about percent hex...

Post by wilbert »

AlanFoo wrote:I will need to know how to convert to ascii directly from utf8 "%E4%B8%AD" percent hex to dec.
You will have to check with some more codes but I think this will do

Code: Select all

Procedure.u PercentHexToDec(PercentHex.s)
  Protected result.u, l.l = Val("$" + RemoveString(PercentHex, "%"))
  If l & $FFFFFF80 = 0
    ; 1 byte code
    result = l
  ElseIf l & $FFFFE0C0 = $C080
    ; 2 byte code
    result = l & $3F + l >> 2 & $7C0
  ElseIf l & $FFF0C0C0 = $E08080
    ; 3 byte code
    result = l & $3F + l >> 2 & $FC0 + l >> 4 & $F000
  EndIf
  ProcedureReturn result  
EndProcedure

Debug PercentHexToDec("%E4%B8%AD")
Debug PercentHexToDec("%D5%B3")
Windows (x64)
Raspberry Pi OS (Arm64)
AlanFoo
Enthusiast
Posts: 172
Joined: Fri Jul 24, 2009 6:24 am
Location: Malaysia

Re: about percent hex...

Post by AlanFoo »

wilbert wrote:
AlanFoo wrote:I will need to know how to convert to ascii directly from utf8 "%E4%B8%AD" percent hex to dec.
You will have to check with some more codes but I think this will do

Got it ... with thanks.

Alan
AlanFoo
Enthusiast
Posts: 172
Joined: Fri Jul 24, 2009 6:24 am
Location: Malaysia

Re: about percent hex...

Post by AlanFoo »

Shield wrote:Just for the record I copy my post here since the OP did a double post
and I seemed to be the only one who didn't post in this version.

@Moderators: please delete the duplicate topic, thank you.
Shield (in other duplicate post) wrote:PB gives you the UTF-16 encoding but in the URL it is encoded as UTF-8.
The following code demonstrates how to get the correct hex values:

Thanks it works...
Alan