Some questions about url escaping
Posted: Sat Jul 22, 2006 10:32 am
by lexvictory
I was wondering: which special characters should NOT be encoded as %XX?
I've been working on some code to do the escaping/unescaping. I remember a topic about doing this before, but I can't seem to find it again...
Here's the code (Droopy's lib needed: download it from droopyslib.us.to, get the newest version and follow the instructions).
Download it here:
http://demonioardente.us.to/escapingcode.zip (it wouldn't display properly here...)
NOTE: you may need to use a Unicode-enabled font in the IDE to display some of the characters in the code (remember to compile it in Unicode mode).
So, can anyone help with which characters don't need to be escaped?
Also, what do you think of the code? (It didn't take long to write, and don't ask me where I got the idea to use &$FF... I coded that bit this morning.)
Posted: Sat Jul 22, 2006 2:05 pm
by Num3
You can encode them all... or just the ones that fall outside plain ASCII.
The server should always decode them.
Code:
Procedure.s URL_Encode(string.s)
  String$ = ""
  For i = 1 To Len(string)
    Letter$ = Mid(string, i, 1)
    ; Keep path separators as they are; percent-encode everything else
    If Letter$ <> "\" And Letter$ <> "/"
      ; Pad to two hex digits so that e.g. Chr(9) becomes %09, not %9
      Letter$ = "%" + RSet(Hex(Asc(Letter$)), 2, "0")
    EndIf
    String$ + Letter$
  Next i
  ProcedureReturn String$
EndProcedure
Code:
Procedure Hex2Dec(HexNumber.s)
  Structure OneByte
    a.b
  EndStructure
  *t.OneByte = @HexNumber
  Result.l = 0
  While *t\a <> 0
    If *t\a >= '0' And *t\a <= '9'
      Result = (Result << 4) + (*t\a - 48)
    ElseIf *t\a >= 'A' And *t\a <= 'F'
      Result = (Result << 4) + (*t\a - 55)
    ElseIf *t\a >= 'a' And *t\a <= 'f'
      Result = (Result << 4) + (*t\a - 87)
    Else
      Result = (Result << 4) + (*t\a - 55)
    EndIf
    *t + 1
  Wend
  ProcedureReturn Result
EndProcedure
Procedure.s URL_Decode(string.s)
  out.s = ""
  For a = 1 To Len(string.s)
    c$ = Mid(string.s, a, 1)
    If c$ = "%"
      k$ = Mid(string.s, a + 1, 2)
      out.s + Chr(Hex2Dec(k$))
      a + 2
    Else
      out.s + c$
    EndIf
  Next
  ProcedureReturn out
EndProcedure
Posted: Sat Jul 22, 2006 2:12 pm
by lexvictory
OK, but have you tested your code with UTF-8? That's the main reason I made my procedures the way I did...
OK, I'll encode all the non-alphanumeric ones.
I mainly made this stuff for my own internal use, and as long as it can decode URLs from browsers, it'll be fine.
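For reference, here is a minimal sketch of a UTF-8-aware encoder. It is not the code from the zip above, and URL_EncodeUTF8 is a hypothetical name. It assumes the RFC 3986 "unreserved" set (A-Z, a-z, 0-9, "-", ".", "_", "~") as the characters that never need escaping, and it percent-encodes every other UTF-8 byte, so it also escapes "/" and other reserved characters that a full URL would normally keep.

```purebasic
; Sketch only: URL_EncodeUTF8 is a hypothetical name, not code from this thread.
; Leaves the RFC 3986 "unreserved" characters alone and writes every other
; UTF-8 byte as %XX.
Procedure.s URL_EncodeUTF8(Text.s)
  Protected Result.s, *Buffer, Size.l, i.l, Value.l

  Size = StringByteLength(Text, #PB_UTF8)   ; byte count without the terminating 0
  *Buffer = AllocateMemory(Size + 1)
  If *Buffer = 0
    ProcedureReturn ""
  EndIf
  PokeS(*Buffer, Text, -1, #PB_UTF8)        ; convert to UTF-8 in either compile mode

  For i = 0 To Size - 1
    Value = PeekB(*Buffer + i) & $FF        ; & $FF because PeekB returns a signed byte
    Select Value
      Case 'A' To 'Z', 'a' To 'z', '0' To '9', '-', '.', '_', '~'
        Result + Chr(Value)
      Default
        Result + "%" + RSet(Hex(Value), 2, "0")
    EndSelect
  Next

  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

Debug URL_EncodeUTF8("my file_1.html")      ; my%20file_1.html
```

Working on the UTF-8 bytes rather than on Asc() of each character means a character above 127 becomes two or more %XX groups, which is the form browsers send.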

Posted: Sun Jul 23, 2006 12:52 am
by Phoenix
I'm not sure, but if you use this code then Windows will do it for you, so you don't have to worry about catching every character.
Code:
url$ = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"
If OpenLibrary(0, "shlwapi.dll")
  CallFunction(0, "UrlUnescapeA", @url$, 0, 0, #URL_UNESCAPE_INPLACE)
  CloseLibrary(0)
EndIf
Debug url$
Posted: Sun Jul 23, 2006 12:01 pm
by lexvictory
The reason I made my own routines is that the WinAPI ones weren't working in Unicode mode.
The ones I made work, so it doesn't matter (and they're fast enough for me).
And Phoenix, you're using the ASCII version, so it wouldn't work with Unicode data.
Posted: Sun Jul 23, 2006 12:12 pm
by Flype
Code:
url.s = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"
UrlUnescape_(url, 0, 0, #URL_UNESCAPE_INPLACE)
Debug url
With PB4, this works in both ASCII and Unicode mode.
And whenever a PureBasic function does not support Unicode, you can easily make it compatible (if Windows already supports it):
Code:
Import "shlwapi.lib"
  CompilerIf #PB_Compiler_Unicode
    UrlUnescape(pszURL.s, *pszUnescaped, *pcchUnescaped, dwFlags.l) As "_UrlUnescapeW@16"
  CompilerElse
    UrlUnescape(pszURL.s, *pszUnescaped, *pcchUnescaped, dwFlags.l) As "_UrlUnescapeA@16"
  CompilerEndIf
EndImport
url.s = "www%2edomain%2ecom%2fmy%5ffile%2ehtml"
If UrlUnescape(url, 0, 0, #URL_UNESCAPE_INPLACE) = #NO_ERROR
  Debug url
EndIf
If UrlUnescape_(url, 0, 0, #URL_UNESCAPE_INPLACE) = #NO_ERROR
  Debug url
EndIf
Posted: Sun Jul 23, 2006 12:47 pm
by lexvictory
I think it was when Unicode data was involved that the API ones didn't do it properly, i.e. when escaping Cyrillic and Asian characters, etc.
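One possible explanation, stated as an assumption and not something tested in this thread: browsers percent-encode the UTF-8 bytes of a character, so a decoder has to collect those bytes and convert them from UTF-8 together, instead of turning each %XX into a character of its own. A minimal sketch of that approach follows; HexDigit and URL_DecodeUTF8 are hypothetical names.

```purebasic
; Sketch only: HexDigit and URL_DecodeUTF8 are hypothetical names.
; Returns 0..15 for a hex digit character code, or -1 if it is not one.
Procedure HexDigit(Value)
  Select Value
    Case '0' To '9' : ProcedureReturn Value - '0'
    Case 'A' To 'F' : ProcedureReturn Value - 'A' + 10
    Case 'a' To 'f' : ProcedureReturn Value - 'a' + 10
  EndSelect
  ProcedureReturn -1
EndProcedure

Procedure.s URL_DecodeUTF8(Text.s)
  Protected Result.s, *Buffer, Size.l, ReadPos.l, WritePos.l
  Protected Value.l, High.l, Low.l

  Size = StringByteLength(Text, #PB_UTF8)
  *Buffer = AllocateMemory(Size + 1)
  If *Buffer = 0
    ProcedureReturn ""
  EndIf
  PokeS(*Buffer, Text, -1, #PB_UTF8)

  ; Decode in place: the output is never longer than the input
  While ReadPos < Size
    Value = PeekB(*Buffer + ReadPos) & $FF
    If Value = '%' And ReadPos + 2 < Size
      High = HexDigit(PeekB(*Buffer + ReadPos + 1) & $FF)
      Low  = HexDigit(PeekB(*Buffer + ReadPos + 2) & $FF)
      If High >= 0 And Low >= 0             ; malformed escapes are copied as-is
        Value = (High << 4) | Low
        ReadPos + 2
      EndIf
    EndIf
    PokeB(*Buffer + WritePos, Value)
    ReadPos + 1
    WritePos + 1
  Wend
  PokeB(*Buffer + WritePos, 0)              ; terminate the decoded byte string

  Result = PeekS(*Buffer, -1, #PB_UTF8)     ; interpret all bytes as UTF-8 at once
  FreeMemory(*Buffer)
  ProcedureReturn Result
EndProcedure

Debug URL_DecodeUTF8("www%2edomain%2ecom%2fmy%5ffile%2ehtml") ; www.domain.com/my_file.html
```

Non-ASCII results only survive the final PeekS() when the program is compiled in Unicode mode, and a decoded %00 would cut the string short, so real input may need extra checks.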
Posted: Sun Jul 23, 2006 1:28 pm
by Flype
Ah OK, another Windows feature.
