Does anyone know if there is a definitive set of functions on this board for working with Unicode?
It would be nice to have some functions that:
1) Can detect if a string is Unicode.
2) Convert a string to Unicode.
3) Convert Unicode to a string.
Helpy was able to show me how to convert a Unicode string to ASCII in an earlier problem, but when I used some of the examples to convert the string back, all I got was Japanese characters... and I checked to make sure I was using the correct language. I tried several examples from these boards.
I just think it would be nice to have this functionality wrapped up in a couple of functions.
Thanks.
Kent

Definitive Unicode functions?
In the Platform SDK you can find the function "IsTextUnicode". You can use it like this:
Code: Select all
*pPointerToBufferWhichContainsAString.l
*pPointerToBufferWhichContainsAString = FunctionReturnsPointerToUnicodeString()
; SizeOfBuffer is the size of the buffer in bytes
If IsTextUnicode_( *pPointerToBufferWhichContainsAString, SizeOfBuffer, #NULL )
  ; String IS a Unicode string
  ; ...
EndIf

More information in the Platform SDK
cu, helpy
Thanks for the tip, helpy! That beats a DIY approach.
As I understand it, there is a thing called a BOM, or Byte Order Mark (but some protocols prohibit its use).
In an incoming byte stream, the following BOMs (in hex) would suggest:
00 00 FE FF: UTF-32, big-endian - but this also means something else!
FF FE 00 00: UTF-32, little-endian
FE FF: UTF-16, big-endian - but this also means something else!
FF FE: UTF-16, little-endian
EF BB BF: UTF-8
Unless, of course, the protocols disallow BOMs.
Standards are set to create employment and confuse the users!
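The BOM list above can be turned into a small check on the first bytes of a buffer. This is only a minimal sketch: `DetectBOM()` and the strings it returns are illustrative names, not PureBasic built-ins, and a missing BOM says nothing about the actual encoding. The 4-byte marks are tested first because FF FE 00 00 also begins with the UTF-16 little-endian mark FF FE.

```purebasic
; Minimal sketch: identify a BOM at the start of a memory buffer.
; DetectBOM() and its return strings are illustrative, not PureBasic built-ins.
Procedure.s DetectBOM(*buf, size.i)
  Protected b0.i = -1, b1.i = -1, b2.i = -1, b3.i = -1

  ; Read up to four leading bytes; -1 marks "byte not available"
  If size >= 1 : b0 = PeekA(*buf)     : EndIf
  If size >= 2 : b1 = PeekA(*buf + 1) : EndIf
  If size >= 3 : b2 = PeekA(*buf + 2) : EndIf
  If size >= 4 : b3 = PeekA(*buf + 3) : EndIf

  ; Test the 4-byte marks first: FF FE 00 00 also starts with FF FE
  If b0 = $00 And b1 = $00 And b2 = $FE And b3 = $FF : ProcedureReturn "UTF-32 BE" : EndIf
  If b0 = $FF And b1 = $FE And b2 = $00 And b3 = $00 : ProcedureReturn "UTF-32 LE" : EndIf
  If b0 = $EF And b1 = $BB And b2 = $BF : ProcedureReturn "UTF-8" : EndIf
  If b0 = $FE And b1 = $FF : ProcedureReturn "UTF-16 BE" : EndIf
  If b0 = $FF And b1 = $FE : ProcedureReturn "UTF-16 LE" : EndIf

  ProcedureReturn "" ; no BOM found
EndProcedure

DataSection
  Utf8Sample:
  Data.a $EF, $BB, $BF, 'A'
EndDataSection

Debug DetectBOM(?Utf8Sample, 4) ; shows UTF-8
```

For files opened with PureBasic's file library, the built-in `ReadStringFormat()` performs a comparable BOM check, so a hand-written test like this is mainly useful for raw memory buffers.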
A question on another standard: End Of Line in a text file.
Can anyone confirm that there are only 3 standard EOL character sequences:
Chr(13), CR or Ctrl-M
Chr(10), LF or Ctrl-J
Chr(13)+Chr(10), CR+LF
Or are there some others floating around?
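If the goal is simply to cope with all three sequences listed above, one option is to normalise them to a single form before processing. The following is a rough sketch that only handles those three sequences; `NormalizeEOL()` is an illustrative name, not a PureBasic built-in.

```purebasic
; Minimal sketch: convert CRLF and lone CR line endings to LF.
; NormalizeEOL() is an illustrative name, not a PureBasic built-in.
Procedure.s NormalizeEOL(text.s)
  ; Replace CRLF first so the pair does not turn into two line feeds
  text = ReplaceString(text, #CRLF$, #LF$)
  text = ReplaceString(text, #CR$, #LF$)
  ProcedureReturn text
EndProcedure

Define sample.s = "one" + #CRLF$ + "two" + #CR$ + "three" + #LF$ + "four"
Debug CountString(NormalizeEOL(sample), #LF$) ; shows 3
```

After normalising, the text can be split on `#LF$` alone, for example with `StringField()`.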
			
			
									
									
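Code: Select all
; Assumes an ASCII-mode PureBasic build, where .s strings hold ANSI bytes
Procedure.i Ansi2Uni(ansi.s)
  ; First call returns the required size in wide characters, including the terminating null
  Protected size.i = MultiByteToWideChar_(#CP_ACP, 0, @ansi, -1, #Null, 0)
  ; Allocate the buffer on the heap so it outlives the procedure (a local array would not)
  Protected *unicode = AllocateMemory(size * 2)
  If *unicode
    MultiByteToWideChar_(#CP_ACP, 0, @ansi, -1, *unicode, size)
  EndIf
  ProcedureReturn *unicode ; caller releases it with FreeMemory()
EndProcedure

Procedure.s Uni2Ansi(*Unicode)
  ; First call returns the required size in bytes, including the terminating null
  Protected size.i = WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, #Null, 0, #Null, #Null)
  Protected ansi.s = Space(size)
  WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, @ansi, size, #Null, #Null)
  ProcedureReturn ansi
EndProcedure

*pointeur = Ansi2Uni("Ben ça alors, dis donc, si j'avais su.")
Debug Uni2Ansi(*pointeur)
FreeMemory(*pointeur)

I've seen Chr(0) being used, but it's not exactly common (probably a lazy memory dump by a C (or Pure) programmer).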
( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB - upgrade incoming...)
( The path to enlightenment and the PureBasic Survival Guide right here... )
- Kwai chang caine
 - Always Here

 - Posts: 5502
 - Joined: Sun Nov 05, 2006 11:42 pm
 - Location: Lyon - France
 
Re: Definitive Unicode functions?
I have a pointer from a VB array like this:
Code: Select all
*strPtr.INTEGER

I tried HELPY's code, but it doesn't work.
Code: Select all
Procedure ArrayExe2Local(*strPtr.INTEGER, Array Array2Modify.s(1), ArraySize)

  #ArrayPB = 1
  #ArrayVB = 2
  *Pointer.l = *strPtr\i

  If Not TypeArray
    If IsTextUnicode_(*Pointer, 1, #Null)
      TypeArray = #ArrayVB
    Else
      TypeArray = #ArrayPB
    EndIf

  EndIf

Does somebody know why?
The happiness is a road...Not a destination
Re: Definitive Unicode functions?
That's old code... what are you trying to accomplish?
			
			
									
- Kwai chang caine
Re: Definitive Unicode functions?
Hello BLUEZNL
Thanks for trying to help me.

In fact, I transfer a string array from VB code to a PB DLL with this function.
In the VB exe:
Code: Select all
Dim ArrayString() As String
ReDim ArrayString(10)
Send2PB(VarPtr(ArrayString(0)), Parameter)

SROD helped and gave me this code to get the array in the DLL.
In the DLL:
Code: Select all
Procedure Send2PB(*strPtr.INTEGER)
  Dim Array2Modify.s(10) ; string array, so it can hold the PeekS() results

  For i = 1 To 10
    ; Step to the next string pointer in the VB array
    *strPtr + SizeOf(INTEGER)

    Select TypeArray

      Case #ArrayPB
        Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Ascii)
      Case #ArrayVB
        Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Unicode)

    EndSelect

  Next

EndProcedure

And that works fine.
But the problem is that I call the DLL from two types of EXE: a VB exe or a PB exe.
The VB exe works in Unicode, and the PB exe works in ASCII.
I don't want to compile PB in Unicode mode.
So how can I make the DLL recognize whether the array passed by pointer is Unicode or ASCII?
How can I fill my "TypeArray" variable, so that I can "select case" the PeekS (Unicode or ASCII)?
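One possible way to fill such a flag is a byte-level heuristic on the first string of the array. This is a sketch, not a reliable test: `LooksLikeUtf16()` is a hypothetical helper, and it assumes mostly Latin text and a string of at least two characters. In UTF-16 little-endian, a Latin character is stored as its code followed by a zero byte, whereas in an ASCII string of two or more characters the second byte is non-zero.

```purebasic
; Heuristic sketch: guess whether the null-terminated string at *ptr is UTF-16 LE.
; LooksLikeUtf16() is a hypothetical helper. It assumes mostly Latin text and a
; string of at least two characters, so it can guess wrong.
Procedure.i LooksLikeUtf16(*ptr)
  If *ptr = 0
    ProcedureReturn #False ; null pointer: nothing to inspect
  EndIf
  If PeekA(*ptr) = 0
    ProcedureReturn #False ; empty string: cannot tell
  EndIf
  ; In UTF-16 LE, a Latin character is followed by a zero high byte
  If PeekA(*ptr + 1) = 0
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

DataSection
  AsciiText:
  Data.a 'H', 'i', '!', 0
  WideText:
  Data.a 'H', 0, 'i', 0, 0, 0
EndDataSection

Debug LooksLikeUtf16(?AsciiText) ; shows 0
Debug LooksLikeUtf16(?WideText)  ; shows 1
```

In the DLL above, the value passed in would be `*strPtr\i` for a non-empty element. A more robust alternative is to have each caller pass an explicit flag saying which string format it uses.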


