Some questions about url escaping

lexvictory
Addict
Posts: 1027
Joined: Sun May 15, 2005 5:15 am
Location: Australia
Contact:

Post by lexvictory »

I was wondering: which special characters should NOT be encoded as %XX?
I've been working on some code to do the escaping/unescaping, and I remember a topic about doing it before, but I can't seem to find it again...

Here's the code (Droopy's lib is needed; download it from droopyslib.us.to, get the newest version and follow the instructions).

Download it here: http://demonioardente.us.to/escapingcode.zip (it wouldn't display properly here...)

NOTE: you may need to use a Unicode-enabled font in the IDE to display some of the characters in the code (remember to compile it in Unicode mode).

So, can anyone help with which characters don't need to be escaped?

Also, what do you think of the code? (It didn't take long to write, and don't ask me where I got the idea to use &$FF... I coded that bit this morning.)
Demonio Ardente

Currently managing Linux & OS X Tailbite
OS X TailBite now up to date with Windows!
Num3
PureBasic Expert
Posts: 2812
Joined: Fri Apr 25, 2003 4:51 pm
Location: Portugal, Lisbon
Contact:

Post by Num3 »

You can encode them all, or just the ones that fall outside plain ASCII.

The server should always decode them.

Code: Select all

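Procedure.s URL_Encode(string.s)
  ; Percent-encode every character except "\" and "/".
  ; Only suitable for single-byte characters (codes up to $FF).
  String$ = ""
  For i = 1 To Len(string)
    Char$ = Mid(string, i, 1)
    If Char$ <> "\" And Char$ <> "/"   ; "Or" here would always be true
      ; Pad to two hex digits so codes below $10 give %0A rather than %A
      Letter$ = "%" + RSet(Hex(Asc(Char$)), 2, "0")
    Else
      Letter$ = Char$   ; keep the separator itself
    EndIf
    String$ + Letter$
  Next i

  ProcedureReturn String$
EndProcedure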

Code: Select all

Procedure Hex2Dec(HexNumber.s)
  Structure OneByte
    a.b
  EndStructure
  *t.OneByte = @HexNumber
  Result.l = 0
  While *t\a <> 0
    If *t\a >= '0' And *t\a <= '9'
      Result = (Result << 4) + (*t\a - 48)
    ElseIf *t\a >= 'A' And *t\a <= 'F'
      Result = (Result << 4) + (*t\a - 55)
    ElseIf *t\a >= 'a' And *t\a <= 'f'
      Result = (Result << 4) + (*t\a - 87)
    Else
      Result = (Result << 4) + (*t\a - 55)
    EndIf
    *t + 1
  Wend
  ProcedureReturn Result
EndProcedure

Procedure.s URL_Decode(string.s)
  out.s=""
  For a=1 To Len(string.s)
    c$=Mid(string.s,a,1)
    If c$="%"
      k$=Mid(string.s,a+1,2)
      out.s+ Chr(Hex2Dec(k$))
      a+2
    Else
      out.s+c$
    EndIf
  Next
  ProcedureReturn out
EndProcedure

Post by lexvictory »

OK, but have you tested your code with UTF-8? That's the main reason I made my procedures the way I did...

OK, I'll encode all the non-alphanumeric ones.
I mainly made this stuff for my own internal use, and as long as it can decode URLs from browsers, it'll be fine :D
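
For reference, here is a minimal sketch of what "encode everything except a safe set" could look like when the string is treated as UTF-8. It is not the code from the zip above. The safe set used is the "unreserved" characters of RFC 3986 (A-Z, a-z, 0-9, "-", ".", "_", "~"), and the procedure name URLEncodeUTF8 is made up for this example. It assumes a PureBasic version that has the .a type and PeekA().

```purebasic
; Percent-encode a string as UTF-8 bytes, leaving only the RFC 3986
; "unreserved" characters untouched.
; URLEncodeUTF8 is a hypothetical name, not a PureBasic built-in.
Procedure.s URLEncodeUTF8(Text.s)
  Protected Result.s, *Buffer, Size.i, i.i, Value.a

  Size = StringByteLength(Text, #PB_UTF8)   ; UTF-8 byte count, without the null
  If Size = 0
    ProcedureReturn ""
  EndIf

  *Buffer = AllocateMemory(Size + 1)        ; +1 for the null that PokeS writes
  If *Buffer = 0
    ProcedureReturn ""
  EndIf
  PokeS(*Buffer, Text, -1, #PB_UTF8)

  For i = 0 To Size - 1
    Value = PeekA(*Buffer + i)              ; unsigned byte, 0..255
    Select Value
      Case 'A' To 'Z', 'a' To 'z', '0' To '9', '-', '.', '_', '~'
        Result + Chr(Value)                 ; safe character, copy as-is
      Default
        Result + "%" + RSet(Hex(Value), 2, "0")
    EndSelect
  Next

  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

Debug URLEncodeUTF8("my file.html")   ; my%20file.html
```

Because the loop works on bytes rather than characters, a character outside ASCII simply turns into several %XX groups, which is the form browsers normally send.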
Phoenix
Enthusiast
Posts: 141
Joined: Sun Sep 04, 2005 2:25 am

Post by Phoenix »

I'm not sure, but if you use this code then Windows will do it for you, so you don't have to worry about catching all the characters...

Code: Select all

url$="www%2edomain%2ecom%2fmy%5ffile%2ehtml"

If OpenLibrary(0, "shlwapi.dll")
  CallFunction(0, "UrlUnescapeA", @url$, 0, 0, #URL_UNESCAPE_INPLACE)
  CloseLibrary(0)
EndIf

Debug url$

Post by lexvictory »

The reason I made my own routines is that the WinAPI ones weren't working in Unicode mode.
The ones I made work, so it doesn't matter (and they're fast enough for me).

And Phoenix, you're using the ASCII version (UrlUnescapeA), so it wouldn't work with Unicode data.
Flype
Addict
Posts: 1542
Joined: Tue Jul 22, 2003 5:02 pm
Location: In a long distant galaxy

Post by Flype »

Code: Select all

url.s = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"

UrlUnescape_(url, 0, 0, #URL_UNESCAPE_INPLACE)

Debug url
With PB4, this works in both ASCII and Unicode mode!


And whenever a function in PureBasic does not support Unicode, you can easily make it compatible (if Windows already supports it):

Code: Select all

Import "shlwapi.lib"
  CompilerIf #PB_Compiler_Unicode
  UrlUnescape(pszURL.s, *pszUnescaped, *pcchUnescaped, dwFlags.l) As "_UrlUnescapeW@16"
  CompilerElse
  UrlUnescape(pszURL.s, *pszUnescaped, *pcchUnescaped, dwFlags.l) As "_UrlUnescapeA@16"
  CompilerEndIf
EndImport

url.s = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"

If UrlUnescape(url, 0, 0, #URL_UNESCAPE_INPLACE) = #NO_ERROR
  Debug url
EndIf

If UrlUnescape_(url, 0, 0, #URL_UNESCAPE_INPLACE) = #NO_ERROR
  Debug url
EndIf
No programming language is perfect. There is not even a single best language.
There are only languages well suited or perhaps poorly suited for particular purposes. Herbert Mayer

Post by lexvictory »

I think it was when Unicode data was introduced that the API ones didn't do it properly, i.e. escaping Cyrillic and Asian characters, etc.
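
One way around that, sketched here as an illustration only, is to decode the %XX groups into raw bytes first and interpret the whole result as UTF-8 afterwards, so that multi-byte characters are reassembled correctly. URLDecodeUTF8 is a made-up name, the sketch does no validation of the hex digits, and it assumes the escaped data really is UTF-8.

```purebasic
; Decode %XX escapes into raw bytes, then read the buffer back as UTF-8.
; URLDecodeUTF8 is a hypothetical name, not a PureBasic built-in.
; Assumptions: the input is UTF-8 based and well formed; a decoded %00
; would cut the result short because the buffer is read up to the null.
Procedure.s URLDecodeUTF8(Text.s)
  Protected Result.s, *Buffer, Size.i, i.i, Count.i, Value.a

  Size = StringByteLength(Text, #PB_UTF8)
  If Size = 0
    ProcedureReturn ""
  EndIf

  *Buffer = AllocateMemory(Size + 1)        ; +1 for the terminating null
  If *Buffer = 0
    ProcedureReturn ""
  EndIf
  PokeS(*Buffer, Text, -1, #PB_UTF8)

  ; Decode in place: the write position (Count) never passes the read
  ; position (i), because each escape shrinks three bytes to one.
  While i < Size
    Value = PeekA(*Buffer + i)
    If Value = '%' And i + 2 < Size
      Value = Val("$" + UCase(PeekS(*Buffer + i + 1, 2, #PB_Ascii)))
      i + 3
    Else
      i + 1
    EndIf
    PokeA(*Buffer + Count, Value)
    Count + 1
  Wend

  PokeA(*Buffer + Count, 0)                 ; terminate the decoded bytes
  Result = PeekS(*Buffer, -1, #PB_UTF8)
  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

Debug URLDecodeUTF8("www%2edomain%2ecom%2fmy%5ffile%2ehtml")   ; www.domain.com/my_file.html
Debug URLDecodeUTF8("%D0%96")   ; the Cyrillic letter U+0416 in a Unicode build
```

The difference from a character-by-character Chr() approach is that the two bytes $D0 $96 end up as one character instead of two.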

Post by Flype »

Ah OK, another Windows feature :D