UrlEncoder() doesn't encode "#" character
Posted: Thu Jun 16, 2016 12:48 am
by Lunasole
Tested with PureBasic 5.42:
Code:
Debug URLEncoder("D:\#something\")
Re: UrlEncoder() doesn't encode "#" character
Posted: Thu Jun 16, 2016 8:22 am
by RSBasic
@Lunasole
Please use the latest version to test.
I tested in 5.50 Beta 1:
Output wrote:D:%5C#something%5C
Re: UrlEncoder() doesn't encode "#" character
Posted: Thu Jun 16, 2016 9:25 am
by DarkDragon
RSBasic wrote:@Lunasole
Please use the latest version to test.
I tested in 5.50 Beta 1:
Output wrote:D:%5C#something%5C
So it doesn't work in yours either? "#" is not encoded. But I guess that's intentional. There are often multi-level encode functions available.
Re: UrlEncoder() doesn't encode "#" character
Posted: Thu Jun 16, 2016 10:55 am
by DontTalkToMe
'#' is not encoded, but ':' is not encoded either.
It should be D%3A%5C%23something%5C
http://www.w3schools.com/tags/ref_urlencode.asp
Decoding works:
Debug URLDecoder("D%3A%5C%23something%5C") ; D:\#something\
Re: UrlEncoder() doesn't encode "#" character
Posted: Thu Jun 16, 2016 1:46 pm
by helpy
Consider the following!
It is not possible to use the same functions URLEncoder/URLDecoder for different purposes:
- De/encoding of the URI path
- De/encoding of the query arguments (you have to de/encode each argument separately)
If the "=" character is part of an argument, then it has to be encoded!
If the "=" character is used as separator between argument name and argument value, it should not be encoded!
If the "#" character is used as separator between path/query and fragment, it should not be encoded!
If the "#" character is part of an argument, then it has to be encoded!
... and so on!
==> This cannot be done by only one pair of functions URLEncoder/URLDecoder!
Be aware of this!
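A minimal sketch of the rule above, written in TypeScript only because its built-in encodeURIComponent() is a convenient per-component encoder; it says nothing about how PureBasic's URLEncoder() is implemented. The helper name buildUrl, its parameters and the example.com address are hypothetical. Each argument name and value is encoded on its own, and the delimiters "?", "&", "=" and "#" are added afterwards, unencoded.

```typescript
// Hypothetical helper: assemble a URL from parts that are still raw data.
function buildUrl(
  base: string,                    // scheme, host and path, assumed to be valid already
  args: Array<[string, string]>,   // query arguments as [name, value] pairs
  fragment?: string                // optional fragment, without the leading "#"
): string {
  // Encode every name and every value separately, then join with real delimiters.
  const query = args
    .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
    .join("&");

  let url = base;
  if (query.length > 0) url += "?" + query;
  if (fragment !== undefined) url += "#" + encodeURIComponent(fragment);
  return url;
}

// "=" and "#" inside the value are data, so they end up percent-encoded;
// the "=" between name and value and the "#" before the fragment stay as they are.
const url = buildUrl("http://example.com/search", [["q", "a=b#c"], ["lang", "en"]], "top");
console.log(url); // http://example.com/search?q=a%3Db%23c&lang=en#top
```

The same character is treated differently depending on whether it arrives as data or is inserted as a delimiter, which is why a single whole-string encoder cannot make that decision.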
Re: UrlEncoder() doesn't encode "#" character
Posted: Thu Jun 16, 2016 3:07 pm
by Little John
What helpy wrote is crucial here.
RFC 3986, section 2.4 wrote:Under normal circumstances, the only time when octets within a URI
are percent-encoded is during the process of producing the URI from
its component parts. This is when an implementation determines which
of the reserved characters are to be used as subcomponent delimiters
and which can be safely used as data.
RFC 3986, section 2.4 wrote:When a URI is dereferenced, the components and subcomponents
significant to the scheme-specific dereferencing process (if any)
must be parsed and separated before the percent-encoded octets within
those components can be safely decoded, as otherwise the data may be
mistaken for component delimiters.
Re: UrlEncoder() doesn't encode "#" character
Posted: Fri Jun 17, 2016 5:41 am
by Lunasole
RSBasic wrote:@Lunasole
Please use the latest version to test.
I tested in 5.50 Beta 1:
Output wrote:D:%5C#something%5C
I'm not using betas and similar unstable stuff. Also, if I don't see a fix in the changelogs, I assume no fix was made.
And in this case there wasn't one.
Re: UrlEncoder() doesn't encode "#" character
Posted: Fri Jun 17, 2016 7:25 am
by DarkDragon
That's what I meant by multi-level encode functions.
Re: UrlEncoder() doesn't encode "#" character
Posted: Fri Jun 17, 2016 11:16 am
by helpy
See also the URL specification:
==> https://url.spec.whatwg.org/#simple-encode-set
There are different encode sets.
I do not fully understand this, but it seems that a different encode set is used depending on which part of the URL is being encoded.
If you want to create a correctly encoded URL, you have to encode the parts first and then build the whole URL.
Re: UrlEncoder() doesn't encode "#" character
Posted: Sat Jun 18, 2016 8:47 pm
by Lunasole
For simplicity there should of course be one UrlEncoder().
I don't know the exact sets of characters that typical URL encoders in other (web) languages encode, but at least the Notepad++ encoder plugin encodes the "#" character, and as far as I remember JavaScript encodes it too.
So, talking about "how it should be", I think it is better to copy this function from a JavaScript runtime instead of reinventing the wheel and guessing how it should work. There are many implementations of it in browser engines, JS engines and so on.
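For reference, JavaScript itself ships two encoders rather than one, and they disagree on exactly this input. The sketch below (TypeScript, using only the standard encodeURI() and encodeURIComponent() functions) runs both on the path from the first post. It only shows what JavaScript does and makes no claim about how PureBasic's URLEncoder() is implemented.

```typescript
const path = "D:\\#something\\"; // the string D:\#something\ from the first post

// encodeURI() treats its input as a complete URI and leaves
// delimiters such as ":" and "#" untouched.
const whole = encodeURI(path);

// encodeURIComponent() treats its input as data inside one component
// and percent-encodes those delimiters as well.
const component = encodeURIComponent(path);

console.log(whole);     // D:%5C#something%5C
console.log(component); // D%3A%5C%23something%5C
```

The first result is the same string RSBasic reported for 5.50 Beta 1, and the second is the string DontTalkToMe expected, so "like in JavaScript" still requires choosing which of the two behaviours is meant.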
Re: UrlEncoder() doesn't encode "#" character
Posted: Sat Jun 18, 2016 9:05 pm
by DarkDragon
Lunasole wrote:For simplicity there should of course be one UrlEncoder().
I don't know the exact sets of characters that typical URL encoders in other (web) languages encode, but at least the Notepad++ encoder plugin encodes the "#" character, and as far as I remember JavaScript encodes it too.
So, talking about "how it should be", I think it is better to copy this function from a JavaScript runtime instead of reinventing the wheel and guessing how it should work. There are many implementations of it in browser engines, JS engines and so on.
That "JavaScript is my god" attitude is not something I'll agree with, but the most basic call to URLEncode should intuitively work at the level of argument values (encoding everything). There should, however, be a flag for specifying whether the input is an argument value, an argument, a query string or a whole URL.
Re: UrlEncoder() doesn't encode "#" character
Posted: Mon Jun 20, 2016 10:41 am
by c4s
I needed a similar function too, so I coded something that does what you need. You can find it in the following thread:
Encode String to URL Format (with "+" as Space etc.)
Re: UrlEncoder() doesn't encode "#" character
Posted: Mon Jun 20, 2016 10:07 pm
by Lunasole
DarkDragon wrote:
That "JavaScript is my god" attitude is not something I'll agree with, but the most basic call to URLEncode should intuitively work at the level of argument values (encoding everything). There should, however, be a flag for specifying whether the input is an argument value, an argument, a query string or a whole URL.
It is not a god.
I dislike JS a lot, and web stuff in general. All of it is poor, stupid, boring and primitive; web technologies and languages generally remind me of .bat files, and even in 2016 the tools and IDEs for them are functionally like desktop IDEs from 1999. None of them can even optimize a CSS file automatically, for example, so you load a lot of unused trash in the CSS of almost any site you visit. And that is before we talk about PHP, how the web is built internally, or the perverted thinking patterns of web developers who have never coded native desktop software ....
But SpiderBasic is JS-based and JS is a kind of "standard" on the web, so it is reasonable to make related things work "like in JS".
@c4s: thanks. For my case I've made a simple workaround:
Code:
; Convert a file path to a URL
Procedure$ PathToURL(Path$)
  Path$ = ReplaceString(Path$, "\", "/")     ; use forward slashes
  Path$ = URLEncoder(Path$, #PB_UTF8)        ; percent-encode as UTF-8
  Path$ = ReplaceString(Path$, "#", "%23")   ; workaround: URLEncoder() leaves "#" unencoded
  ProcedureReturn Path$
EndProcedure
Re: UrlEncoder() doesn't encode "#" character
Posted: Tue Jun 21, 2016 2:55 am
by Keya
i see your Javascript god, and i raise you my NoScript firefox plugin god
