Encode String to URL Format (with "+" as Space etc.)

c4s
Addict
Posts: 1981
Joined: Thu Nov 01, 2007 5:37 pm
Location: Germany

Post by c4s »

I realized that unfortunately UrlEncoder() does not do that correctly. So how can I convert a string to the "encoded URL format"?

With the test string "Äbc: Dêfg - Hìjkl & Mnöpqr?" UrlEncoder() returns the following:
%C4bc:%20D%EAfg%20-%20H%ECjkl%20&%20Mn%F6pqr?
Check out the underlined parts that will most likely cause problems. From a proper converter I would like to see or rather expect something like this:
%C3%84bc%3A+D%C3%AAfg+-+H%C3%ACjkl+%26+Mn%C3%B6pqr%3F
I couldn't find anything else in the help file nor the "Tips and Tricks" section. So I guess this topic is also a coding question...
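For reference, here is a minimal sketch of where the expected %C3%84 for "Ä" comes from: it writes the character to memory as UTF-8 and prints each byte as a percent-escape. The variable names are made up for illustration, and it assumes a Unicode build so that Chr($C4) yields "Ä".

```purebasic
; Sketch: list the UTF-8 bytes of one character as percent-escapes.
; Assumption: Unicode build, so Chr($C4) is "Ä" (U+00C4).
Define Text.s, Escaped.s
Define *Buffer, *Byte.Ascii

Text = Chr($C4)

; One extra byte for the terminating null that PokeS() writes
*Buffer = AllocateMemory(StringByteLength(Text, #PB_UTF8) + 1)
If *Buffer
  PokeS(*Buffer, Text, -1, #PB_UTF8)
  *Byte = *Buffer
  While *Byte\a <> 0
    ; RSet() pads to two hex digits, e.g. "0A" instead of "A"
    Escaped + "%" + RSet(Hex(*Byte\a), 2, "0")
    *Byte + 1
  Wend
  FreeMemory(*Buffer)
EndIf

Debug Escaped  ; %C3%84
```

A single-byte %C4 is the Latin-1 value of the same character, which is what the current output contains.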
If any of you native English speakers have any suggestions for the above text, please let me know (via PM). Thanks!
Fred
Administrator
Posts: 18162
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Encode String to URL Format (with "+" as Space etc.)

Post by Fred »

In the next version (5.40), it will support proper UTF-8 encoding, but '?' and '&' won't be encoded, because you can pass a whole URL including its parameters. What you are looking for is a new mode that encodes a string as a 'parameter' part.

Re: Encode String to URL Format (with "+" as Space etc.)

Post by c4s »

Thanks for your reply and for implementing the UTF8 support in 5.40.

A new flag for UrlEncoder() or a new function for URL parameter encoding would be awesome!
Little John
Addict
Posts: 4779
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: Encode String to URL Format (with "+" as Space etc.)

Post by Little John »

c4s wrote: A new flag for UrlEncoder() or a new function for URL parameter encoding would be awesome!
I agree. I would appreciate it if PB had two functions named, say, EncodeURLComponent() and DecodeURLComponent().
That would be in line with RFC 3986, section 2.4, which reads, for example:
"When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters."
See also: UTF-8 support for encoding and decoding URLs
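The RFC passage can be illustrated with a rough sketch. DecodeComponent() below is a hypothetical helper, not a PureBasic built-in, and the query string is made up. It assumes the input contains only ASCII characters (as a percent-encoded string should) and treats "+" as a space. Malformed escapes are not handled.

```purebasic
; Hypothetical helper; the name DecodeComponent is an assumption.
Procedure.s DecodeComponent(Encoded.s)
  Protected *Buffer, *Out.Ascii
  Protected i, Length, Code
  Protected Result.s = ""

  Length = Len(Encoded)
  If Length = 0 : ProcedureReturn Result : EndIf

  ; Decoded data is never longer than the input. AllocateMemory() zero-fills,
  ; so the extra byte serves as the string terminator.
  *Buffer = AllocateMemory(Length + 1)
  If *Buffer
    *Out = *Buffer
    i = 1
    While i <= Length
      Code = Asc(Mid(Encoded, i, 1))
      Select Code
        Case '+'  ; Form-style space
          *Out\a = ' '
        Case '%'  ; Two hex digits follow
          *Out\a = Val("$" + Mid(Encoded, i + 1, 2))
          i + 2
        Default   ; Plain ASCII character
          *Out\a = Code
      EndSelect
      *Out + 1
      i + 1
    Wend
    ; Interpret the collected bytes as UTF-8
    Result = PeekS(*Buffer, -1, #PB_UTF8)
    FreeMemory(*Buffer)
  EndIf

  ProcedureReturn Result
EndProcedure

Define Query.s = "q=Tom+%26+Jerry&lang=en"
Define Pair.s, n

; Split on "&" first, then decode each name and value separately.
For n = 1 To CountString(Query, "&") + 1
  Pair = StringField(Query, n, "&")
  Debug DecodeComponent(StringField(Pair, 1, "=")) + " -> " + DecodeComponent(StringField(Pair, 2, "="))
Next

; Decoding the whole query first turns %26 into a literal "&",
; so a later split sees three fields instead of two.
Debug CountString(DecodeComponent(Query), "&") + 1
```

Splitting before decoding keeps "Tom & Jerry" intact as one value, which is the reason for having separate component-level functions.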
RichAlgeni
Addict
Posts: 935
Joined: Wed Sep 22, 2010 1:50 am
Location: Bradenton, FL

Re: Encode String to URL Format (with "+" as Space etc.)

Post by RichAlgeni »

If you are interested in using them, there are encode and decode functions built in.

See: http://www.purebasic.fr/english/viewtop ... 29#p470729

That's 'if' you are interested.

Re: Encode String to URL Format (with "+" as Space etc.)

Post by c4s »

Thanks for the tip, but a few days ago I hacked together a small procedure for encoding purposes and am quite satisfied with the results:

Procedure.s WebURIEncode(URI.s)
	Protected *MemoryID, *c.Ascii
	Protected Result.s = ""
	
	If URI = "" : ProcedureReturn Result : EndIf
	
	
	*MemoryID = AllocateMemory(StringByteLength(URI, #PB_UTF8) + SizeOf(Character))
	If *MemoryID
		; Convert to UTF-8 and write to memory
		PokeS(*MemoryID, URI, -1, #PB_UTF8) : *c = *MemoryID
		
		; Build encoded result string
		While *c\a <> 0
			Select *c\a
				Case '0' To '9', '-', '.', 'A' To 'Z', '_', 'a' To 'z', '~'  ; Allowed chars
					Result + Chr(*c\a)
				Case ' '  ; Space char
					Result + "+"
				Default  ; Everything else
					Result + "%" + RSet(Hex(*c\a, #PB_Ascii), 2, "0")  ; Pad to two hex digits so bytes below $10 (e.g. line feed) become %0A, not %A
			EndSelect
			
			*c + SizeOf(Ascii)
		Wend
		
		FreeMemory(*MemoryID)
	EndIf

	ProcedureReturn Result
EndProcedure