Switching my app to Unicode breaks things

Post by MachineCode »

Okay, so I enabled compiling to Unicode for one of my apps, but now it doesn't work. For example, one part uses the following procedure, which returns the HTML of a web page into a variable. In ASCII, it works. In Unicode, it returns a string like "?????????????????????????". Great. Anyone know how to fix this? And does it mean all my string procedures are now going to be broken because of Unicode?

Code:

Procedure.s DownloadHTML(url$)
  #INTERNET_FLAG_RELOAD=$80000000
  hInet=InternetOpen_(url$,1,0,0,0)
  If hInet
    hURL=InternetOpenUrl_(hInet,url$,0,0,#INTERNET_FLAG_RELOAD,0)
    If hURL
      html$=Space(256)
      If InternetReadFile_(hURL,@html$,Len(html$),@bytes)
        html$=Trim(html$)
      EndIf
    EndIf
    InternetCloseHandle_(hInet)
  EndIf
  ProcedureReturn html$
EndProcedure
Microsoft Visual Basic only lasted 7 short years: 1991 to 1998.
PureBasic: Born in 1998 and still going strong to this very day!

Post by Shield »

You need to convert the string from ASCII to Unicode in order to read it properly in a PB program that is compiled in Unicode mode.
Just a quick hack here to demonstrate this:

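Code:

Procedure.s DownloadHTML(url$)
	#INTERNET_FLAG_RELOAD = $80000000
	Protected hInet, hURL, *buffer, bytes.l, html$
	hInet = InternetOpen_(url$, 1, 0, 0, 0)
	If hInet
		hURL = InternetOpenUrl_(hInet, url$, 0, 0, #INTERNET_FLAG_RELOAD, 0)
		If hURL
			; Quick hack: only the first 256 bytes of the page are read.
			*buffer = AllocateMemory(256)
			If *buffer
				; InternetReadFile_() writes raw bytes and reports the count in 'bytes'.
				If InternetReadFile_(hURL, *buffer, MemorySize(*buffer), @bytes)
					; Decode only the bytes actually received, treating them as ASCII.
					html$ = PeekS(*buffer, bytes, #PB_Ascii)
				EndIf
				FreeMemory(*buffer)
			EndIf
			InternetCloseHandle_(hURL)
		EndIf
		InternetCloseHandle_(hInet)
	EndIf
	ProcedureReturn html$
EndProcedure


Debug DownloadHTML("http://www.google.com/")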
You need to do this because the API just writes the bytes it receives into memory. It doesn't care about the encoding;
you need to take care of that yourself. :)
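
To see the effect without any network access, here is a minimal sketch that fakes the API by writing single-byte text into a buffer and then reads it back both ways. The buffer size and the "<html>" sample text are made up for illustration, and the second read only misbehaves when the program is compiled in Unicode mode.

```purebasic
; Sketch: simulate an API that fills a buffer with single-byte (ASCII) text.
*buffer = AllocateMemory(16)
If *buffer
  ; PokeS() with #PB_Ascii stores one byte per character, much like the
  ; bytes an ASCII web page would deliver.
  PokeS(*buffer, "<html>", -1, #PB_Ascii)

  ; Correct: tell PeekS() how the bytes are encoded and how many characters to read.
  decoded$ = PeekS(*buffer, 6, #PB_Ascii)
  Debug decoded$

  ; In a Unicode build, PeekS() without a format takes 2 bytes per character,
  ; so the same memory is misread as three unrelated characters.
  Debug PeekS(*buffer, 3)

  FreeMemory(*buffer)
EndIf
```

If the page were served as UTF-8 rather than plain ASCII, #PB_UTF8 would be the format to pass instead; which one applies depends on the server, so treat #PB_Ascii here as an assumption carried over from the example above.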
Blog: Why Does It Suck? (http://whydoesitsuck.com/)
"You can disagree with me as much as you want, but during this talk, by definition, anybody who disagrees is stupid and ugly."
- Linus Torvalds

Post by happer66 »

Another take: convert the string to ASCII.

Code:

Procedure.s DownloadHTML(url$)
<removed> pasted the wrong code!
In Unicode mode, each character takes 2 bytes when stored in memory. When you want to count the number of characters you actually see, you can use Len(), but when you want to know the string's size in memory you have to use StringByteLength() or Len(string$)*SizeOf(Character). Don't forget that the terminating NUL also takes up the size of one character. Hope that makes some sense :)
Otherwise I'm sure someone will correct it/me.
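
A small sketch of the difference between character count and byte size, using only the functions named above. The variable names and the sample text are arbitrary, and the byte values depend on whether the program is compiled in ASCII or Unicode mode.

```purebasic
text$ = "Hello"

Debug Len(text$)                       ; number of characters: 5
Debug SizeOf(Character)                ; bytes per character in this build (2 in Unicode mode)
Debug StringByteLength(text$)          ; bytes used by the characters, without the terminating NUL
Debug Len(text$) * SizeOf(Character)   ; the same value, computed by hand

; Size of a buffer that can hold the string plus its terminating NUL.
bufferSize = StringByteLength(text$) + SizeOf(Character)
Debug bufferSize
```

This is why passing Len(html$) as a byte count to an API is risky in Unicode mode: the string occupies more bytes than it has characters.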

edit: Almost forgot, this is quite good
http://www.joelonsoftware.com/articles/Unicode.html
If your code isn't clean, at least make sure it's pure!

Post by MachineCode »

Shield wrote: html$ = PeekS(*buffer, MemorySize(*buffer), #PB_Ascii)
Ewwwwww. I need to PeekS() everything now? Can't the compiler just do this for me if I've set Unicode on in Compiler Options?

Post by luis »

If you set the compiler to Unicode and you read data from a source where it isn't stored as Unicode, it's your job to know that and convert it. This applies not only to the output of a procedure, but also to any external data stored that way that you simply read into memory. You certainly cannot ask the compiler to inspect data at runtime on every memory access and guess whether you want some kind of conversion in between. Leaving aside the cost of doing so, sometimes you would want the conversion and other times not.
Other times data can resemble one thing when in reality it's another, or it should be treated in a certain way irrespective of how it looks.

To explicitly tell the compiler to do some of these conversions when calling a foreign procedure, you have prototypes and pseudotypes. When you cannot use them, you have to use PeekS().
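
As a rough illustration of the prototype and pseudotype route, the sketch below calls the ANSI variant of the Windows MessageBox function from a Unicode build. The prototype name MessageBoxA_Proto and the variable MsgBoxA are made up for this example; the p-ascii pseudotype asks the compiler to convert the PureBasic strings to ASCII at the call.

```purebasic
; Illustrative prototype: the two string parameters are declared as p-ascii,
; so PureBasic converts them to ASCII when the function is called.
Prototype.i MessageBoxA_Proto(hWnd.i, text.p-ascii, caption.p-ascii, flags.l)

If OpenLibrary(0, "user32.dll")
  ; GetFunction() returns the address of the exported function, or 0 on failure.
  MsgBoxA.MessageBoxA_Proto = GetFunction(0, "MessageBoxA")
  If MsgBoxA
    MsgBoxA(0, "Converted on the way in", "p-ascii demo", 0)
  EndIf
  CloseLibrary(0)
EndIf
```

Note that pseudotypes only convert strings passed as parameters. A function such as InternetReadFile_(), which writes raw bytes into a buffer you supply, falls into the second case, where the result has to be decoded with PeekS().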
"Have you tried turning it off and on again ?"
A little PureBasic review

Post by MachineCode »

Thanks for the explanation. I'll keep compiling in ASCII then, as I have no time for Unicode right now and don't know which data will be Unicode and which won't (I can't tell, I'm a Unicode newbie, and Joel's article doesn't help). Maybe later when I have more time, experience and money.