Some questions about url escaping

Posted: Sat Jul 22, 2006 10:32 am
by lexvictory
I was wondering, which special characters should NOT be encoded as %XX?
I've been working on some code to do the escaping/unescaping, and I remember a topic about doing it before, but I can't seem to find it again...

Here's the code (Droopy's lib is needed; download it from droopyslib.us.to, get the newest version and follow the instructions).

Download it here: http://demonioardente.us.to/escapingcode.zip (it wouldn't display properly here...)

NOTE: you may need to use a Unicode-enabled font in the IDE to display some of the characters in the code (remember to compile it in Unicode mode).

So, can anyone help with which characters don't need to be escaped?

Also, what do you think of the code? (It didn't take long to write, and don't ask me where I got the idea to use &$FF... I coded that bit this morning.)
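
For reference on the question above: RFC 3986 lists the "unreserved" characters, which never need escaping, as A-Z, a-z, 0-9 and the four symbols - . _ ~. The following is only a minimal sketch of an encoder built on that rule, not the code from the zip file. It assumes PureBasic 4 or later (for StringByteLength() and PokeS() with #PB_UTF8), and the procedure name URLEncodeUTF8 is made up for illustration. It also shows why masking with &$FF is useful: PeekB() returns a signed byte, so UTF-8 bytes above 127 would otherwise come out negative.

```purebasic
; Sketch: percent-encode the UTF-8 bytes of a string, leaving only the
; RFC 3986 unreserved characters (A-Z a-z 0-9 - . _ ~) untouched.
; URLEncodeUTF8 is a hypothetical name, not part of any library.
Procedure.s URLEncodeUTF8(Text.s)
  Protected Result.s, *Buffer, Size, i, Value

  Size = StringByteLength(Text, #PB_UTF8)   ; byte count without the null
  If Size = 0
    ProcedureReturn ""
  EndIf

  *Buffer = AllocateMemory(Size + 1)        ; +1 for the terminating null
  If *Buffer = 0
    ProcedureReturn ""
  EndIf
  PokeS(*Buffer, Text, -1, #PB_UTF8)        ; write the string as UTF-8

  For i = 0 To Size - 1
    Value = PeekB(*Buffer + i) & $FF        ; mask to an unsigned 0..255 value
    Select Value
      Case 'A' To 'Z', 'a' To 'z', '0' To '9', '-', '.', '_', '~'
        Result + Chr(Value)                 ; unreserved: copy as-is
      Default
        Result + "%" + RSet(Hex(Value), 2, "0")  ; always two hex digits
    EndSelect
  Next i

  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

Debug URLEncodeUTF8("my file.html")   ; shows my%20file.html
```

Note that this escapes "/" and ":" as well, so it is meant for a single path segment or query value rather than a whole URL.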

Posted: Sat Jul 22, 2006 2:05 pm
by Num3
You can encode them all... or just the ones that fall outside plain ASCII.

The server should always decode them...

Code: Select all

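Procedure.s URL_Encode(string.s)
  String$ = ""
  For i = 1 To Len(string)
    Char$ = Mid(string, i, 1)
    ; Leave path separators alone and percent-encode everything else.
    ; (The test needs And here: with Or it is true for every character.)
    If Char$ <> "\" And Char$ <> "/"
      ; Pad to two hex digits; assumes single-byte (ASCII) characters
      Letter$ = "%" + RSet(Hex(Asc(Char$)), 2, "0")
    Else
      Letter$ = Char$
    EndIf
    String$ + Letter$
  Next i

  ProcedureReturn String$

EndProcedure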

Code: Select all

Procedure Hex2Dec(HexNumber.s)
  Structure OneByte
    a.b
  EndStructure
  *t.OneByte = @HexNumber
  Result.l = 0
  While *t\a <> 0
    If *t\a >= '0' And *t\a <= '9'
      Result = (Result << 4) + (*t\a - 48)
    ElseIf *t\a >= 'A' And *t\a <= 'F'
      Result = (Result << 4) + (*t\a - 55)
    ElseIf *t\a >= 'a' And *t\a <= 'f'
      Result = (Result << 4) + (*t\a - 87)
    Else
      Result = (Result << 4) + (*t\a - 55)
    EndIf
    *t + 1
  Wend
  ProcedureReturn Result
EndProcedure;

Procedure.s URL_Decode(string.s)
  out.s=""
  For a=1 To Len(string.s)
    c$=Mid(string.s,a,1)
    If c$="%"
      k$=Mid(string.s,a+1,2)
      out.s+ Chr(Hex2Dec(k$))
      a+2
    Else
      out.s+c$
    EndIf
  Next
  ProcedureReturn out
EndProcedure

Posted: Sat Jul 22, 2006 2:12 pm
by lexvictory
OK, but have you tested your code with UTF-8? That's the main reason I made my procedures the way I did...

OK, I'll encode all the non-alphanumeric ones...
I mainly made this stuff for my own internal use, and as long as it can decode URLs from browsers, it'll be fine :D
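
On the UTF-8 point: browsers send a non-ASCII character as several %XX bytes, so a decoder that converts each %XX straight to a character with Chr() will split it into pieces. One way around that, sketched below under the assumption of PureBasic 4 or later (PeekS() with #PB_UTF8), is to collect the decoded bytes in a memory buffer first and convert the whole buffer from UTF-8 at the end. The names HexDigit and URLDecodeUTF8 are made up for this example, and the input is assumed to be a plain ASCII percent-encoded string.

```purebasic
; Returns 0..15 for a hex digit character code, or -1 if it is not one.
; HexDigit and URLDecodeUTF8 are hypothetical names for this sketch.
Procedure HexDigit(Code)
  Select Code
    Case '0' To '9' : ProcedureReturn Code - '0'
    Case 'A' To 'F' : ProcedureReturn Code - 'A' + 10
    Case 'a' To 'f' : ProcedureReturn Code - 'a' + 10
  EndSelect
  ProcedureReturn -1
EndProcedure

Procedure.s URLDecodeUTF8(Text.s)
  Protected Result.s, *Buffer, Length, i, Pos, Hi, Lo

  Length = Len(Text)
  ; Decoded data is never longer than the input. AllocateMemory() zero-fills
  ; the buffer, so the extra byte acts as the terminating null.
  *Buffer = AllocateMemory(Length + 1)
  If *Buffer = 0
    ProcedureReturn ""
  EndIf

  i = 1
  While i <= Length
    Hi = -1 : Lo = -1
    If Mid(Text, i, 1) = "%" And i + 2 <= Length
      Hi = HexDigit(Asc(Mid(Text, i + 1, 1)))
      Lo = HexDigit(Asc(Mid(Text, i + 2, 1)))
    EndIf
    If Hi >= 0 And Lo >= 0
      PokeB(*Buffer + Pos, (Hi << 4) | Lo)       ; valid %XX: store one byte
      i + 3
    Else
      PokeB(*Buffer + Pos, Asc(Mid(Text, i, 1))) ; anything else: copy as-is
      i + 1
    EndIf
    Pos + 1
  Wend

  Result = PeekS(*Buffer, -1, #PB_UTF8)          ; interpret the bytes as UTF-8
  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

Debug URLDecodeUTF8("www%2edomain%2ecom%2fmy%5ffile%2ehtml")   ; shows www.domain.com/my_file.html
```

A malformed escape such as a lone "%" is copied through unchanged here rather than rejected, and "+" is not turned into a space; both are design choices that may need changing for form data.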

Posted: Sun Jul 23, 2006 12:52 am
by Phoenix
I'm not sure but if you use this code then Windows will do it for you, so you don't have to worry about catching all characters....

Code: Select all

url$="www%2edomain%2ecom%2fmy%5ffile%2ehtml"

If OpenLibrary(0, "shlwapi.dll")
  CallFunction(0, "UrlUnescapeA", @url$, 0, 0, #URL_UNESCAPE_INPLACE)
  CloseLibrary(0)
EndIf

Debug url$

Posted: Sun Jul 23, 2006 12:01 pm
by lexvictory
The reason I made my own routines is that the WinAPI ones weren't working in Unicode mode.
And the ones I made work, so it doesn't matter (and they're fast enough for me).

And Phoenix, you're using the ASCII version, so it wouldn't work with Unicode data.

Posted: Sun Jul 23, 2006 12:12 pm
by Flype

Code: Select all

url.s = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"

UrlUnescape_(url, 0, 0, #URL_UNESCAPE_INPLACE)

Debug url

With PB4, this works in both ASCII and Unicode mode!


And whenever a function in PureBasic does not support Unicode, you can easily make it compatible (if Windows already supports it):

Code: Select all

Import "shlwapi.lib"
  CompilerIf #PB_Compiler_Unicode
  UrlUnescape(pszURL.s, *pszUnescaped, *pcchUnescaped, dwFlags.l) As "_UrlUnescapeW@16"
  CompilerElse
  UrlUnescape(pszURL.s, *pszUnescaped, *pcchUnescaped, dwFlags.l) As "_UrlUnescapeA@16"
  CompilerEndIf
EndImport

url.s = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"

If UrlUnescape(url, 0, 0, #URL_UNESCAPE_INPLACE) = #NO_ERROR
  Debug url
EndIf

If UrlUnescape_(url, 0, 0, #URL_UNESCAPE_INPLACE) = #NO_ERROR
  Debug url
EndIf

Posted: Sun Jul 23, 2006 12:47 pm
by lexvictory
I think it was when Unicode data was introduced that the API ones didn't do it properly, i.e. escaping Cyrillic and Asian characters, etc.

Posted: Sun Jul 23, 2006 1:28 pm
by Flype
Ah OK, another Windows feature :D