Definitive Unicode functions?

kruddick · Post by **kruddick** » Fri Mar 26, 2004 7:43 pm

Does anyone know if there is a set of definitive functions on this board for working with Unicode?

It would be nice to have some functions that:

1) Detect whether a string is Unicode.
2) Convert a string to Unicode.
3) Convert Unicode to a string.

Helpy showed me how to convert a Unicode string to ASCII for an earlier problem, but when I adapted part of that example to convert the string back, all I got was Japanese characters, even though I checked that I was using the correct language. I tried several examples from these boards.

Just thinking it would be nice to have this functionality wrapped up in a couple of functions.

Thanks.
Kent

helpy · Post by **helpy** » Fri Mar 26, 2004 8:42 pm

In the Platform SDK you can find the function "IsTextUnicode". You can use it like this:

Code:

*pPointerToBufferWhichContainsAString.l

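; FunctionReturnsPointerToUnicodeString() stands for whatever call supplies the buffer
*pPointerToBufferWhichContainsAString = FunctionReturnsPointerToUnicodeString()

; SizeOfBuffer is the buffer size in bytes; #NULL lets the API apply all of its tests
If IsTextUnicode_( *pPointerToBufferWhichContainsAString, SizeOfBuffer, #NULL )
  ; String is probably a Unicode string (the test is statistical, not a guarantee)
  ; ...
EndIf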

More information in the Platform SDK

cu, helpy

Post by **Dare2** » Thu Apr 08, 2004 6:33 am

Thanks for the tip, helpy! That beats a DIY approach.

As I understand it, there is a thing called a BOM, or Byte Order Mark (but some protocols prohibit its use).

In an incoming bytestream, the following BOMs (in hex) would suggest:

00 00 FE FF   UTF-32, big-endian - but this also means something else!
FF FE 00 00   UTF-32, little-endian
FE FF         UTF-16, big-endian - but this also means something else!
FF FE         UTF-16, little-endian
EF BB BF      UTF-8

Unless, of course, the protocols disallow BOMs.
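The table above can be turned into a small lookup. Here is a minimal sketch in Python (chosen only so it runs without the Windows API); `detect_bom` and `BOMS` are made-up names for illustration. One detail worth noting: the four-byte marks have to be tested before the two-byte ones, because FF FE 00 00 starts with FF FE. A missing BOM proves nothing either way, so this is only a first check.

```python
# Hypothetical helper: guess an encoding from a leading byte order mark.
# Longer marks come first because FF FE 00 00 (UTF-32 LE) begins with
# FF FE (UTF-16 LE) and would otherwise be misread.
BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


sample = b"\xff\xfeH\x00i\x00"          # UTF-16 LE BOM followed by "Hi"
encoding, skip = detect_bom(sample)
print(encoding, sample[skip:].decode(encoding))   # utf-16-le Hi
```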

Standards are set to create employment and confuse the users!

A question on another standard: End Of Line (EOL) in a text file.

Can anyone confirm that there are only 3 standard EOL character sequences:

chr(13) CR (\r, Ctrl-M)
chr(10) LF (\n, Ctrl-J)
chr(13)+chr(10) CRLF

Or are there some others floating around?
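For the three sequences listed, a rough way to see which ones a file actually uses is to count them. This is only a sketch in Python, and `count_eols` is a made-up name. It looks for nothing beyond those three, so it would not notice other separators (Unicode, for instance, also defines U+2028 LINE SEPARATOR).

```python
import re


def count_eols(data: bytes) -> dict:
    """Count CRLF, lone CR and lone LF line endings in a byte string."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    # CRLF is listed first in the pattern so that its two bytes are
    # counted once, not as a CR followed by an LF.
    for match in re.finditer(rb"\r\n|\r|\n", data):
        token = match.group()
        if token == b"\r\n":
            counts["CRLF"] += 1
        elif token == b"\r":
            counts["CR"] += 1
        else:
            counts["LF"] += 1
    return counts


print(count_eols(b"dos\r\nunix\nold mac\rend"))
# {'CRLF': 1, 'CR': 1, 'LF': 1}
```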

Nico · Post by **Nico** » Thu Apr 08, 2004 10:49 pm

Code:

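; Assumes an ANSI (non-Unicode) build, where .s strings hold ANSI bytes.
Procedure.i Ansi2Uni(ansi.s)
  ; First call: required size in characters, including the terminating null
  size.l = MultiByteToWideChar_(#CP_ACP, 0, @ansi, -1, 0, 0)
  ; Heap buffer: a local Dim array would be released when the procedure ends
  *unicode = AllocateMemory(size * 2)
  If *unicode
    MultiByteToWideChar_(#CP_ACP, 0, @ansi, -1, *unicode, size)
  EndIf
  ProcedureReturn *unicode ; the caller must release this with FreeMemory()
EndProcedure

Procedure.s Uni2Ansi(*Unicode)
  ; First call: required size in bytes, including the terminating null
  size.l = WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, #Null, 0, #Null, #Null)
  ansi.s = Space(size)
  WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, @ansi, size, #Null, #Null)
  ProcedureReturn ansi
EndProcedure

*pointeur = Ansi2Uni("Ben ça alors, dis donc, si j'avais su.")
If *pointeur
  Debug Uni2Ansi(*pointeur)
  FreeMemory(*pointeur)
EndIf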

blueznl · Post by **blueznl** » Fri Apr 09, 2004 10:06 pm

I've seen chr(0) being used, but it's not exactly common (probably a lazy memory dump by a C (or Pure) programmer).

Kwai chang caine · Post by **Kwai chang caine** » Thu Oct 08, 2009 12:33 pm

I have a pointer to a VB array, like this:

Code:

*strPtr.INTEGER

I tried HELPY's code, but it doesn't work.

Does anybody know why? :roll:

Code:

Procedure ArrayExe2Local(*strPtr.INTEGER, Array Array2Modify.s(1), ArraySize)
 
 #ArrayPB = 1
 #ArrayVB = 2

 *Pointer.l = *strPtr\i
 
 If Not TypeArray

  If IsTextUnicode_(*Pointer , 1, #Null )
   TypeArray = #ArrayVB 
  Else
   TypeArray = #ArrayPB
  EndIf
 
 EndIf

blueznl · Post by **blueznl** » Thu Oct 08, 2009 8:53 pm

That's old code... what are you trying to accomplish?

Kwai chang caine · Post by **Kwai chang caine** » Fri Oct 09, 2009 7:57 am

Hello BLUEZNL

Thanks for trying to help me.

In fact, I transfer a string array from VB code to a PB DLL with this function.

In the VB exe:

Code:

Dim ArrayString() As String 
ReDim ArrayString(10)
Send2PB(VarPtr(ArrayString(0)), Parameter)

SROD helped me and gave me this code to read the array in the DLL.

In the DLL:

Code:

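Procedure Send2PB(*strPtr.INTEGER)
  Dim Array2Modify.s(10) ; string array that receives the copies

  For i = 1 To 10

   ; Step to the next string pointer in the caller's array
   *strPtr + SizeOf(INTEGER)

   ; TypeArray must already hold #ArrayPB or #ArrayVB
   Select TypeArray

    Case #ArrayPB
     Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Ascii)
    Case #ArrayVB
     Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Unicode)

   EndSelect

  Next

EndProcedure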

And that works fine.

But the problem is that the DLL is called by two types of EXE:
a VB exe or a PB exe.
The VB exe works in UNICODE, and the PB exe works in ASCII.
I don't want to compile the PB exe in UNICODE mode.

So what can I do in the DLL to recognize whether the array passed by pointer is UNICODE or ASCII?
How can I fill my "TypeArray" variable so that the Select/Case picks the right PeekS (UNICODE or ASCII)?
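One possible heuristic for this, sketched in Python only to show the idea (it is not a tested PB solution, and `looks_like_utf16le` and `threshold` are made-up names): UTF-16 LE text made of Latin characters has a zero byte after nearly every character, while single-byte ASCII text has no zero bytes before its terminator. The sketch assumes the buffer length is known and that the text is mostly Latin; it can be fooled by other scripts or by very short strings. It also needs several bytes to look at: with a single byte (as in the IsTextUnicode_ call earlier in the thread, which passes a size of 1) there is nothing to analyse.

```python
def looks_like_utf16le(data: bytes, threshold: float = 0.5) -> bool:
    """Guess whether data is UTF-16 LE text made mostly of Latin characters.

    Heuristic only: such text has a zero high byte after almost every
    character, whereas single-byte ASCII/ANSI text contains no zero bytes.
    """
    if len(data) < 2:
        return False                 # too short to judge
    high_bytes = data[1::2]          # bytes at odd offsets
    zeros = high_bytes.count(0)
    return zeros / len(high_bytes) >= threshold


print(looks_like_utf16le("Hello".encode("utf-16-le")))  # True
print(looks_like_utf16le("Hello".encode("ascii")))      # False
```

If both callers are under your control, having each EXE pass an explicit flag to the DLL would avoid guessing altogether.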

PureBasic Forums - English
