Posted: Sat Sep 06, 2008 8:48 pm
by ts-soft
It is not safe to keep other formats in a PB string variable; use a memory buffer for this:
Code:
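Procedure StringToUTF(S.s)
  ; Returns a pointer to a newly allocated, null-terminated UTF-8 copy of S.
  ; The caller has to release it with FreeMemory() when done.
  #AutoLength = -1
  Protected *Buffer
  *Buffer = AllocateMemory(StringByteLength(S, #PB_UTF8) + 1) ; <== add a byte for the Null
  If *Buffer
    PokeS(*Buffer, S, #AutoLength, #PB_UTF8)
  EndIf
  ProcedureReturn *Buffer
EndProcedure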
Posted: Sat Sep 06, 2008 9:35 pm
by blueznl
Demivec wrote:
You have to add one byte for the Null when you reserve buffer space.
Actually TWO when you work in Windows Unicode / UTF16.... even the null comes double...
Posted: Sat Sep 06, 2008 10:07 pm
by Demivec
blueznl wrote:
Demivec wrote:
You have to add one byte for the Null when you reserve buffer space.
Actually TWO when you work in Windows Unicode / UTF16.... even the null comes double...
True, but the procedures are designed specifically for Ascii-to-UTF8 and UTF8-to-Ascii.
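To put the one-byte versus two-byte point side by side, here is a minimal sketch (an illustration only, not part of the procedures discussed above). It assumes the terminator is one byte wide for ASCII and UTF-8 and two bytes wide for UTF-16 (#PB_Unicode); Text$ is just a sample value.

```purebasic
; Sketch: buffer size needed for a string plus its null terminator.
; Assumption: the terminator is 1 byte for ASCII/UTF-8, 2 bytes for UTF-16.
Text$ = "Caf" + Chr(233)   ; sample text with one non-ASCII character

Debug StringByteLength(Text$, #PB_Ascii)   + 1   ; ASCII buffer
Debug StringByteLength(Text$, #PB_UTF8)    + 1   ; UTF-8 buffer
Debug StringByteLength(Text$, #PB_Unicode) + 2   ; UTF-16 buffer
```

So for the Ascii-to-UTF8 and UTF8-to-Ascii case a single extra byte should be enough; the second byte only matters once the buffer holds UTF-16.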
Posted: Sun Sep 07, 2008 2:56 am
by spacefractal
ts-soft's version didn't work as it should when I tested it here. Here is a modified version that worked for me (these functions convert in both directions):
Code:
; UTF-8 to Unicode/ASCII (depending on whether the app is compiled as Unicode or not).
Procedure.s Unicode(S.s)
  Protected *Buffer, Result$
  ; One byte would do for the null of an ASCII/UTF-8 buffer; +2 leaves a spare byte.
  *Buffer = AllocateMemory(StringByteLength(S, #PB_UTF8) + 2)
  PokeS(*Buffer, S, -1, #PB_Ascii)        ; write the characters out as raw bytes
  Result$ = PeekS(*Buffer, -1, #PB_UTF8)  ; read those bytes back, decoding them as UTF-8
  FreeMemory(*Buffer)
  ProcedureReturn Result$
EndProcedure

; Unicode/ASCII (depending on whether the app is compiled as Unicode or not) to UTF-8.
Procedure.s UTF8(S.s)
  Protected *Buffer, Result$
  ; One byte would do for the null of an ASCII/UTF-8 buffer; +2 leaves a spare byte.
  *Buffer = AllocateMemory(StringByteLength(S, #PB_UTF8) + 2)
  PokeS(*Buffer, S, -1, #PB_UTF8)         ; encode the string as UTF-8 bytes
  Result$ = PeekS(*Buffer, -1, #PB_Ascii) ; read those bytes back, one character per byte
  FreeMemory(*Buffer)
  ProcedureReturn Result$
EndProcedure
UTF-8 is stored byte by byte like an ASCII string, but uses a variable number of bytes to encode each character. Hence the string needs to be "saved" as ASCII first and then converted to a string using #PB_UTF8.
Posted: Sun Sep 07, 2008 3:15 am
by ts-soft
PB's string manager supports only Unicode in Unicode applications and ASCII
in ASCII applications. Returning a string variable that holds UTF-8 in its
buffer is not safe. UTF-8 is only safe in allocated memory, never in a
string variable.
UTF-8 is never required in a string variable.
If a lib requires UTF-8, you can use a pseudotype or a pointer to memory.
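A minimal sketch of the pointer-to-memory route, for illustration. LibTakesUTF8() is a hypothetical stand-in for a library call that expects a pointer to null-terminated UTF-8; a real lib function would be imported instead. Text$ is just a sample value.

```purebasic
; Hypothetical stand-in for a library function that takes a pointer to
; null-terminated UTF-8. Here it only decodes the bytes again.
Procedure.s LibTakesUTF8(*Text)
  ProcedureReturn PeekS(*Text, -1, #PB_UTF8)
EndProcedure

Text$ = "Caf" + Chr(233)   ; sample text with one non-ASCII character

; The UTF-8 bytes live in allocated memory only, never in a string variable.
*Buffer = AllocateMemory(StringByteLength(Text$, #PB_UTF8) + 1)
If *Buffer
  PokeS(*Buffer, Text$, -1, #PB_UTF8)
  Debug LibTakesUTF8(*Buffer)   ; hand over the pointer, not a string
  FreeMemory(*Buffer)
EndIf
```

With the pseudotype route (p-utf8 on a parameter of an imported or prototyped function) the conversion is done at the call, so no buffer of your own should be needed.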
Posted: Sun Sep 07, 2008 9:40 am
by blueznl
You can store a UTF-8 string in a string variable, as a UTF-8 string will never contain a zero byte. Of course, PB's string handling commands will all be thrown off track...
Hmm.
Except for Linux, I suppose. Is PB Unicode in Linux in UTF16 or UTF8 in memory?
Posted: Sun Sep 07, 2008 3:40 pm
by Michael Vogel
The reason for using such routines is simple: sometimes it is necessary to handle different files within one program (preferences, database etc.) - so both text representations must be handled as well.
In my case, I have to handle (in addition to a simple INI file) GPX, HST and TCX files for GPS data. Normally these files consist of UTF-8 text, but sometimes there is also plain ASCII content.
In such cases, my routines above can help - maybe a fast WhatStringTypeIs() function would be nice, to check whether a string is ASCII or UTF-8 formatted.
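A rough sketch of what such a WhatStringTypeIs() might look like, assuming the text sits in a null-terminated memory buffer (for example, a line read from the file as raw bytes). The return values 0, 1 and -1 are my own choice for illustration. The check is simplified: it only looks at the lead-byte/continuation-byte pattern and does not reject overlong forms or surrogates.

```purebasic
; Sketch: classify a null-terminated byte buffer.
; Returns  0 = plain ASCII (all bytes below $80)
;          1 = contains multi-byte sequences that follow the UTF-8 pattern
;         -1 = neither (for example an 8-bit codepage)
Procedure WhatStringTypeIs(*Buffer)
  Protected *p = *Buffer
  Protected b, Follow, Result = 0

  Repeat
    b = PeekB(*p) & $FF                  ; read the byte as unsigned
    If b = 0 : Break : EndIf             ; null terminator reached
    *p + 1

    If b < $80
      Follow = 0                         ; ASCII byte
    ElseIf (b & $E0) = $C0
      Follow = 1                         ; lead byte of a 2-byte sequence
    ElseIf (b & $F0) = $E0
      Follow = 2                         ; lead byte of a 3-byte sequence
    ElseIf (b & $F8) = $F0
      Follow = 3                         ; lead byte of a 4-byte sequence
    Else
      ProcedureReturn -1                 ; stray continuation or invalid lead byte
    EndIf

    If Follow : Result = 1 : EndIf

    While Follow > 0
      If (PeekB(*p) & $C0) <> $80        ; continuation bytes look like 10xxxxxx
        ProcedureReturn -1               ; also catches a sequence cut off by the null
      EndIf
      *p + 1
      Follow - 1
    Wend
  ForEver

  ProcedureReturn Result
EndProcedure
```

The result could then decide whether a line goes through PeekS() with #PB_UTF8 or with #PB_Ascii. Plain ASCII is valid UTF-8 as well, so a result of 0 can safely be read either way.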