Does anyone know if there is a definitive set of functions on this board for working with Unicode?
It would be nice to have some functions that:
1) Can detect if a string is Unicode.
2) Convert a string to Unicode.
3) Convert Unicode to a string.
Helpy was able to show me the conversion of a Unicode string to ASCII in an earlier problem, but when I used some of the example to convert the string back, all I got was Japanese characters... and I checked to make sure I was using the correct language. I tried several examples from these boards.
Just thinking it would be nice to have this functionality wrapped up in a couple of functions.
Thanks.
Kent
Definitive Unicode functions?
In the Platform SDK you can find the function "IsTextUnicode". You can use it like this:
Code: Select all
; Pointers take no .l suffix, so they stay pointer-sized on x64
Define *pPointerToBufferWhichContainsAString
*pPointerToBufferWhichContainsAString = FunctionReturnsPointerToUnicodeString()
; SizeOfBuffer is the buffer size in bytes; the test is a statistical guess, not a proof
If IsTextUnicode_( *pPointerToBufferWhichContainsAString, SizeOfBuffer, #NULL )
  ; String IS (probably) a Unicode string
  ; ...
EndIf
More information in the Platform SDK
cu, helpy
Thanks for the tip, helpy! That beats a DIY approach. 
As I understand it, there is a thing called a BOM, or Byte Order Mark (but some protocols prohibit its use).
In an incoming bytestream, the following BOMs (in hex) would suggest:
00 00 FE FF: UTF-32, big-endian (but this also means something else!)
FF FE 00 00: UTF-32, little-endian
FE FF: UTF-16, big-endian (but this also means something else!)
FF FE: UTF-16, little-endian
EF BB BF: UTF-8
Unless, of course, the protocols disallow BOMs.
Standards are set to create employment and confuse the users!
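For what it's worth, here is a minimal sketch of how the BOM table above could be checked in PureBasic. The procedure name GuessEncodingFromBOM and the sample data are made up for illustration, and the result is only a hint: a stream without a BOM returns an empty string, and a BOM is no guarantee of what follows it.

```purebasic
; Hypothetical helper (not from this thread): guess an encoding from the first bytes of a buffer.
; Returns "" when no known BOM is found.
Procedure.s GuessEncodingFromBOM(*buf, size)
  If *buf = 0 Or size < 2
    ProcedureReturn ""
  EndIf
  Protected b0 = PeekA(*buf), b1 = PeekA(*buf + 1)
  Protected b2 = -1, b3 = -1            ; -1 means "byte not available"
  If size >= 3 : b2 = PeekA(*buf + 2) : EndIf
  If size >= 4 : b3 = PeekA(*buf + 3) : EndIf

  ; Check the 4-byte marks first: FF FE 00 00 begins with the UTF-16 LE mark FF FE
  If b0 = $00 And b1 = $00 And b2 = $FE And b3 = $FF : ProcedureReturn "UTF-32 BE" : EndIf
  If b0 = $FF And b1 = $FE And b2 = $00 And b3 = $00 : ProcedureReturn "UTF-32 LE" : EndIf
  If b0 = $EF And b1 = $BB And b2 = $BF : ProcedureReturn "UTF-8" : EndIf
  If b0 = $FE And b1 = $FF : ProcedureReturn "UTF-16 BE" : EndIf
  If b0 = $FF And b1 = $FE : ProcedureReturn "UTF-16 LE" : EndIf
  ProcedureReturn ""
EndProcedure

DataSection
  SampleUtf8:                           ; made-up sample: UTF-8 BOM followed by "hi"
  Data.a $EF, $BB, $BF, 'h', 'i'
EndDataSection

Debug GuessEncodingFromBOM(?SampleUtf8, 5)   ; UTF-8
```

The order of the tests matters because the UTF-32 little-endian mark starts with the same two bytes as the UTF-16 little-endian one. For files opened with the File library, PureBasic's own ReadStringFormat() may already do what you need.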
A question on another standard, End Of Line in a text file
Can anyone confirm that there are only 3 standard EOL character sequences:
chr(13) CR or ctrl-M
chr(10) LF or ctrl-J
chr(13)+chr(10)
Or are there some others floating around?
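If the goal is simply to cope with all three sequences listed above, one possible approach is to normalize everything to LF before processing. This is only a sketch; NormalizeEOL is a made-up name, and it assumes the text is already in a PureBasic string.

```purebasic
; Hypothetical helper: turn CR+LF and lone CR into LF
Procedure.s NormalizeEOL(text.s)
  ; Replace CR+LF first, otherwise each CR+LF would end up as two line breaks
  text = ReplaceString(text, #CRLF$, #LF$)
  text = ReplaceString(text, #CR$, #LF$)
  ProcedureReturn text
EndProcedure

; Three different line endings in, three LF out
Debug CountString(NormalizeEOL("a" + #CRLF$ + "b" + #CR$ + "c" + #LF$ + "d"), #LF$)   ; 3
```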

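Code: Select all
; Assumes an ANSI (non-Unicode) build, as in the original post
Procedure Ansi2Uni(ansi.s)
  ; The first call returns the required size in characters, terminating null included
  Protected size = MultiByteToWideChar_(#CP_ACP, 0, ansi, -1, 0, 0)
  ; Allocate memory that outlives the procedure (a local array is released on return)
  Protected *unicode = AllocateMemory(size * 2)
  If *unicode
    MultiByteToWideChar_(#CP_ACP, 0, ansi, -1, *unicode, size)
  EndIf
  ProcedureReturn *unicode ; the caller must FreeMemory() this
EndProcedure
Procedure.s Uni2Ansi(*Unicode)
  ; The first call returns the required size in bytes, terminating null included
  Protected size = WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, #Null, #Null, #Null, #Null)
  Protected ansi.s = Space(size)
  WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, @ansi, size, #Null, #Null)
  ProcedureReturn ansi
EndProcedure
*pointeur = Ansi2Uni("Ben ça alors, dis donc, si j'avais su.")
If *pointeur
  Debug Uni2Ansi(*pointeur)
  FreeMemory(*pointeur)
EndIf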
I've seen chr(0) being used, but it's not exactly common (probably a lazy memory dump by a C (or Pure) programmer).
( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB)
( The path to enlightenment and the PureBasic Survival Guide right here... )
- Kwai chang caine
- Always Here
- Posts: 5494
- Joined: Sun Nov 05, 2006 11:42 pm
- Location: Lyon - France
Re: Definitive Unicode functions?
I have a pointer from a VB array like this:
Code: Select all
*strPtr.INTEGER
I have tried helpy's code, but it doesn't work.
Does somebody know why?
Code: Select all
Procedure ArrayExe2Local(*strPtr.INTEGER, Array Array2Modify.s(1), ArraySize)
  #ArrayPB = 1
  #ArrayVB = 2
  *Pointer = *strPtr\i
  If Not TypeArray
    If IsTextUnicode_(*Pointer, 1, #Null)
      TypeArray = #ArrayVB
    Else
      TypeArray = #ArrayPB
    EndIf
  EndIf
  ; ...
EndProcedure

Not a destination
Re: Definitive Unicode functions?
That's old code... what are you trying to accomplish?
- Kwai chang caine
Re: Definitive Unicode functions?
Hello BLUEZNL
Thanks for trying to help me.
In fact, I transfer a string array from VB code to a PB DLL with this function.
In the VB exe:
Code: Select all
Dim ArrayString() As String
ReDim ArrayString(10)
Send2PB(VarPtr(ArrayString(0)), Parameter)
SROD helped me and gave me this code to get the array in the DLL.
In the DLL:
Code: Select all
Procedure Send2PB(*strPtr.INTEGER)
  Dim Array2Modify.s(10)
  For i = 1 To 10
    *strPtr + SizeOf(INTEGER)
    Select TypeArray
      Case #ArrayPB
        Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Ascii)
      Case #ArrayVB
        Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Unicode)
    EndSelect
  Next
EndProcedure
And that works fine.
But the problem is that I call the DLL from two types of EXE: a VB exe or a PB exe.
The VB exe works in Unicode, and the PB exe works in ASCII.
I don't want to compile the PB exe in Unicode mode.
So how can I recognize, in the DLL, whether the array passed by pointer is Unicode or ASCII?
How can I fill my "TypeArray" variable so that the Select/Case picks the right PeekS (Unicode or ASCII)?
