Encode String to URL Format (with "+" as Space etc.)

Posted: Thu Aug 20, 2015 9:44 am
by c4s
I realized that unfortunately UrlEncoder() does not do that correctly. So how can I convert a string to the "encoded URL format"?

With the test string "Äbc: Dêfg - Hìjkl & Mnöpqr?" UrlEncoder() returns the following:
%C4bc:%20D%EAfg%20-%20H%ECjkl%20&%20Mn%F6pqr?
Note the underlined parts, which will most likely cause problems. From a proper converter I would expect something like this:
%C3%84bc%3A+D%C3%AAfg+-+H%C3%ACjkl+%26+Mn%C3%B6pqr%3F
I couldn't find anything else in the help file nor the "Tips and Tricks" section. So I guess this topic is also a coding question...

Re: Encode String to URL Format (with "+" as Space etc.)

Posted: Thu Aug 20, 2015 10:48 am
by Fred
In the next version (5.40), it will support proper UTF-8 encoding, but '?' and '&' won't be encoded, as you can pass a whole URL including parameters. What you are looking for is a new mode that encodes only a 'parameter' part.

Re: Encode String to URL Format (with "+" as Space etc.)

Posted: Thu Aug 20, 2015 1:29 pm
by c4s
Thanks for your reply and for implementing the UTF8 support in 5.40.

A new flag for UrlEncoder() or a new function for URL parameter encoding would be awesome!

Re: Encode String to URL Format (with "+" as Space etc.)

Posted: Thu Aug 20, 2015 3:36 pm
by Little John
c4s wrote: A new flag for UrlEncoder() or a new function for URL parameter encoding would be awesome!
I agree. I would appreciate it if PB had two functions named, say, EncodeURLComponent() and DecodeURLComponent().
That would be in line with RFC 3986, section 2.4.
There it reads for example:
When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters.
See also: UTF-8 support for encoding and decoding URLs
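
To illustrate the point from the RFC, here is a minimal sketch, assuming the built-in URLDecoder() and a made-up query string in which one parameter value contains an encoded '&'. If the whole string is decoded first, the %26 becomes a real '&' and looks like an extra delimiter; splitting on '&' first and decoding each component afterwards keeps the value intact. Note that URLDecoder() is only a stand-in here, since it does not turn '+' back into a space.

```purebasic
; Hypothetical query string: the value of "name" contains an encoded '&'
Define Query.s = "name=Tom%26Jerry&lang=en"
Define i, Pair.s

; Wrong order: decoding first turns %26 into a real '&' delimiter,
; so the string appears to hold one parameter more than it really does
Debug CountString(URLDecoder(Query), "&") + 1

; Right order: split on '&' first, then decode each component on its own
For i = 1 To CountString(Query, "&") + 1
	Pair = StringField(Query, i, "&")
	Debug URLDecoder(StringField(Pair, 1, "=")) + " -> " + URLDecoder(StringField(Pair, 2, "="))
Next
```

This is the reason a separate component-level function is useful: it can safely encode or decode '&', '=' and '?' because it never sees the delimiters of the surrounding URL.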

Re: Encode String to URL Format (with "+" as Space etc.)

Posted: Sun Sep 06, 2015 1:21 am
by RichAlgeni
If you are interested in using them, there are encode and decode functions built in.

See: http://www.purebasic.fr/english/viewtop ... 29#p470729

That's 'if' you are interested.

Re: Encode String to URL Format (with "+" as Space etc.)

Posted: Sun Sep 06, 2015 5:56 pm
by c4s
Thanks for the tip, but a few days ago I hacked together a small procedure for encoding purposes and am quite satisfied with the results:

Code:

Procedure.s WebURIEncode(URI.s)
	Protected *MemoryID, *c.Ascii
	Protected Result.s = ""
	
	If URI = "" : ProcedureReturn Result : EndIf
	
	
	*MemoryID = AllocateMemory(StringByteLength(URI, #PB_UTF8) + SizeOf(Character))
	If *MemoryID
		; Convert to UTF-8 and write to memory
		PokeS(*MemoryID, URI, -1, #PB_UTF8) : *c = *MemoryID
		
		; Build encoded result string
		While *c\a <> 0
			Select *c\a
				Case '0' To '9', '-', '.', 'A' To 'Z', '_', 'a' To 'z', '~'  ; Allowed chars
					Result + Chr(*c\a)
				Case ' '  ; Space char
					Result + "+"
				Default  ; Everything else
					Result + "%" + RSet(Hex(*c\a, #PB_Ascii), 2, "0")  ; Pad to two hex digits (e.g. %0A, not %A)
			EndSelect
			
			*c + SizeOf(Ascii)
		Wend
		
		FreeMemory(*MemoryID)
	EndIf

	ProcedureReturn Result
EndProcedure
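
For the opposite direction, something along these lines might work. It is only a rough sketch: the name WebURIDecode() is made up for illustration, and the input is assumed to be well-formed (every '%' followed by two hex digits, everything else plain ASCII). It collects the decoded bytes in a buffer and reads them back as UTF-8 at the end.

```purebasic
; Sketch of a decoder for strings in the format produced above.
; WebURIDecode() is a hypothetical name, not a built-in function.
Procedure.s WebURIDecode(Encoded.s)
	Protected *Buffer, *Out.Ascii
	Protected i = 1, Length = Len(Encoded), c
	Protected Result.s = ""
	
	If Length = 0 : ProcedureReturn Result : EndIf
	
	; Decoded data is never longer than the input; +1 byte for the terminating null
	*Buffer = AllocateMemory(Length + 1)
	If *Buffer
		*Out = *Buffer
		
		While i <= Length
			c = Asc(Mid(Encoded, i, 1))
			Select c
				Case '+'  ; Encoded space
					*Out\a = ' '
				Case '%'  ; Two hex digits follow (assumes well-formed input)
					*Out\a = Val("$" + Mid(Encoded, i + 1, 2))
					i + 2
				Default  ; Unreserved ASCII char, copied as-is
					*Out\a = c
			EndSelect
			
			*Out + SizeOf(Ascii)
			i + 1
		Wend
		
		; Interpret the collected bytes as UTF-8; AllocateMemory() zero-fills
		; the buffer by default, so the data is already null-terminated
		Result = PeekS(*Buffer, -1, #PB_UTF8)
		FreeMemory(*Buffer)
	EndIf
	
	ProcedureReturn Result
EndProcedure

; Round trip of the expected string from the first post
Debug WebURIDecode("%C3%84bc%3A+D%C3%AAfg+-+H%C3%ACjkl+%26+Mn%C3%B6pqr%3F")
```

As discussed earlier in the thread, this should only be applied to a single parameter value after the URL has been split at '&' and '=', never to a whole URL.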