Definitive Unicode functions?

kruddick
User
Posts: 11
Joined: Mon Mar 08, 2004 1:15 am

Post by kruddick »

Does anyone know if there is a set of definitive functions on this board for working with Unicode?

It would be nice to have some functions that:

1) Can detect if a string is Unicode.
2) Convert a string to Unicode.
3) Convert Unicode to a string.

Helpy was able to show me how to convert a Unicode string to ASCII in an earlier problem, but when I used parts of that example to convert the string back, all I got was Japanese characters, and I checked to make sure I was using the correct language. I tried several examples from these boards.

Just thinking it would be nice to have this functionality wrapped up in a couple of functions.

Thanks.
Kent
helpy
Enthusiast
Posts: 552
Joined: Sat Jun 28, 2003 12:01 am

Post by helpy »

In the Platform SDK you can find the function "IsTextUnicode". You can use it like this:

Code: Select all

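*pPointerToBufferWhichContainsAString.l

; Placeholder: any call that returns a pointer to the text buffer
*pPointerToBufferWhichContainsAString = FunctionReturnsPointerToUnicodeString()

; The second argument is the buffer size in bytes, not in characters.
; Passing #Null as the third argument makes the API run all of its tests.
If IsTextUnicode_( *pPointerToBufferWhichContainsAString, SizeOfBuffer, #Null )
  ; The buffer probably holds Unicode text (the test is statistical, not a guarantee)
  ; ...
EndIf

More information in the Platform SDK.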

cu, helpy
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

Thanks for the tip, helpy! That beats a DIY approach. :)

As I understand it, there is a thing called a BOM, or Byte Order Mark (but some protocols prohibit its use. :? )

In an incoming bytestream, the following leading bytes (in hex) would suggest:

00 00 FE FF
  UTF-32, big-endian
FF FE 00 00
  UTF-32, little-endian (note that it begins with the UTF-16 little-endian mark)
FE FF
  UTF-16, big-endian
FF FE
  UTF-16, little-endian
EF BB BF
  UTF-8

Each of these is the code point U+FEFF in the given encoding, and U+FEFF also means something else: away from the start of a stream it is a ZERO WIDTH NO-BREAK SPACE.

Unless, of course, the protocols disallow BOMs.
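
The table above could be turned into a small check along these lines. This is only a sketch: GuessFormatFromBOM is a made-up name, not a PureBasic command, and it assumes the caller already has the first bytes of the stream in memory. For files, PureBasic's own ReadStringFormat() does a similar job.

```purebasic
; Hypothetical helper (not part of PureBasic): guesses the encoding of a buffer
; from its first bytes. Returns one of the #PB_* string-format constants,
; or -1 when no BOM is found.
Procedure.i GuessFormatFromBOM(*buf, size.i)
  Protected b0.a, b1.a, b2.a, b3.a
  If *buf = 0 Or size < 2 : ProcedureReturn -1 : EndIf
  b0 = PeekA(*buf) : b1 = PeekA(*buf + 1)
  If size >= 3 : b2 = PeekA(*buf + 2) : EndIf
  If size >= 4 : b3 = PeekA(*buf + 3) : EndIf

  ; The 4-byte marks are tested first: FF FE 00 00 also starts with the UTF-16 LE mark.
  If size >= 4
    If b0 = $00 And b1 = $00 And b2 = $FE And b3 = $FF : ProcedureReturn #PB_UTF32BE : EndIf
    If b0 = $FF And b1 = $FE And b2 = $00 And b3 = $00 : ProcedureReturn #PB_UTF32 : EndIf
  EndIf
  If size >= 3 And b0 = $EF And b1 = $BB And b2 = $BF : ProcedureReturn #PB_UTF8 : EndIf
  If b0 = $FE And b1 = $FF : ProcedureReturn #PB_UTF16BE : EndIf
  If b0 = $FF And b1 = $FE : ProcedureReturn #PB_Unicode : EndIf
  ProcedureReturn -1
EndProcedure

; Usage with four sample bytes: a UTF-8 BOM followed by the letter A
If GuessFormatFromBOM(?Sample, 4) = #PB_UTF8
  Debug "UTF-8 BOM found"
EndIf

DataSection
  Sample:
  Data.a $EF, $BB, $BF, $41
EndDataSection
```

One known weakness: UTF-16 little-endian text whose first character after the BOM is U+0000 would be reported as UTF-32. That is rare in practice, but it shows that a BOM is a hint rather than proof.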

Standards are set to create employment and confuse the users!



A question on another standard, End Of Line in a text file

Can anyone confirm that there are only 3 standard EOL character sequences:

chr(13)   CR or Ctrl-M
chr(10)   LF or Ctrl-J
chr(13)+chr(10)   CR+LF

Or are there some others floating around?
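
For the three sequences listed above, a normalizing helper might look like the sketch below. NormalizeEOL is a made-up name, and the sketch deliberately ignores the rarer Unicode separators (NEL U+0085, LINE SEPARATOR U+2028, PARAGRAPH SEPARATOR U+2029), so treat it as a starting point only.

```purebasic
; Hypothetical helper: rewrites CR+LF and lone CR as LF.
Procedure.s NormalizeEOL(text.s)
  ; CR+LF has to be replaced first, otherwise each pair would turn into two LFs.
  text = ReplaceString(text, #CRLF$, #LF$)
  text = ReplaceString(text, #CR$, #LF$)
  ProcedureReturn text
EndProcedure

; Three different line endings in, three LFs out
Debug CountString(NormalizeEOL("a" + #CRLF$ + "b" + #CR$ + "c" + #LF$ + "d"), #LF$)  ; 3
```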
Nico
Enthusiast
Posts: 274
Joined: Sun Jan 11, 2004 11:34 am
Location: France

Post by Nico »

Code: Select all

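; Written for an ANSI build of PureBasic, where .s strings hold 8-bit characters.
Procedure.l Ansi2Uni(ansi.s)
  ; The first call returns the required size in wide characters, terminator included.
  size.l = MultiByteToWideChar_(#CP_ACP, 0, ansi, -1, 0, 0)
  ; Heap memory outlives the procedure; in current PureBasic a Dim array declared here would not.
  *unicode = AllocateMemory(size * 2)
  If *unicode
    ; -1 converts the terminating null as well.
    MultiByteToWideChar_(#CP_ACP, 0, ansi, -1, *unicode, size)
  EndIf
  ProcedureReturn *unicode  ; the caller releases it with FreeMemory()
EndProcedure

Procedure.s Uni2Ansi(*Unicode.l)
  ; Required size in bytes, terminator included.
  size.l = WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, #Null, #Null, #Null, #Null)
  ansi.s = Space(size)
  WideCharToMultiByte_(#CP_ACP, 0, *Unicode, -1, @ansi, size, #Null, #Null)
  ProcedureReturn ansi
EndProcedure

*pointeur = Ansi2Uni("Ben ça alors, dis donc, si j'avais su.")
If *pointeur
  Debug Uni2Ansi(*pointeur)
  FreeMemory(*pointeur)
EndIf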
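
A likely explanation for the "Japanese characters" mentioned in the first post: when 8-bit text is read back as 16-bit units, each pair of ASCII bytes lands in the CJK range. The sketch below reproduces the effect. It assumes a Unicode build of PureBasic (unlike the ANSI-era code above), and the buffer size and sample text are arbitrary.

```purebasic
; Assumes a Unicode build of PureBasic, so that Debug can display the result.
*buf = AllocateMemory(16)                ; zero-filled by default
If *buf
  PokeS(*buf, "Kent", -1, #PB_Ascii)     ; stores the bytes 4B 65 6E 74 00
  Debug PeekS(*buf, -1, #PB_Ascii)       ; matching format: Kent
  Debug PeekS(*buf, 2, #PB_Unicode)      ; same bytes read as U+654B U+746E, two CJK ideographs
  FreeMemory(*buf)
EndIf
```

So the format used to read a buffer has to match the format it was written in; the conversion functions themselves may be fine.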
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

i've seen chr(0) being used but it's not exactly common :-)
(probably a lazy memory dump by a c (or pure :-)) programmer)
( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB)
( The path to enlightenment and the PureBasic Survival Guide right here... )
Kwai chang caine
Always Here
Posts: 5494
Joined: Sun Nov 05, 2006 11:42 pm
Location: Lyon - France

Post by Kwai chang caine »

I have a pointer from a VB array, like this:

Code: Select all

*strPtr.INTEGER

I tried HELPY's code, but it doesn't work :(
Does anybody know why? :roll:

Code: Select all

Procedure ArrayExe2Local(*strPtr.INTEGER, Array Array2Modify.s(1), ArraySize)
 
 #ArrayPB = 1
 #ArrayVB = 2

 *Pointer.l = *strPtr\i
 
 If Not TypeArray

  If IsTextUnicode_(*Pointer , 1, #Null )
   TypeArray = #ArrayVB 
  Else
   TypeArray = #ArrayPB
  EndIf
 
 EndIf
The happiness is a road...
Not a destination
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

That's old code... what are you trying to accomplish?
Kwai chang caine
Always Here
Posts: 5494
Joined: Sun Nov 05, 2006 11:42 pm
Location: Lyon - France

Post by Kwai chang caine »

Hello BLUEZNL :D
Thanks for trying to help me 8)

In fact, I pass a string array from VB code to a PB DLL with this function.

In the VB exe:

Code: Select all

Dim ArrayString() As String 
ReDim ArrayString(10)
Send2PB(VarPtr(ArrayString(0)), Parameter)

SROD helped me and gave me this code to get the array in the DLL.

In the DLL

Code: Select all

Procedure Send2PB(*strPtr.INTEGER)
  Dim Array2Modify.s(10)  ; string array; without .s the elements would be integers

  For i = 1 To 10

    ; Step to the next string pointer before reading, so element 0 is skipped
    *strPtr + SizeOf(INTEGER)

    Select TypeArray

      Case #ArrayPB
        Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Ascii)
      Case #ArrayVB
        Array2Modify(i) = PeekS(*strPtr\i, -1, #PB_Unicode)

    EndSelect

  Next

EndProcedure

And that works fine :D
But the problem is that the DLL is called by two types of EXE: a VB exe or a PB exe.
The VB exe works in Unicode, and the PB exe works in ASCII.
I don't want to compile PB in Unicode mode.

So what can I do in the DLL to recognize whether the array passed by pointer is Unicode or ASCII?
How can I fill my "TypeArray" variable so that the Select/Case picks the right PeekS (Unicode or ASCII)?
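
One thing worth noting about the earlier attempt: the second argument of IsTextUnicode_() is the buffer size in bytes, so passing 1 gives the API almost nothing to analyze. Since the DLL does not know the string length before it knows the encoding, a rough alternative is to look at the first two bytes of one string. The sketch below is only a heuristic under loud assumptions: the strings are non-empty, mostly Latin text, and little-endian when they are Unicode. GuessTypeArray is a made-up name; #ArrayPB and #ArrayVB are the constants from the post above.

```purebasic
#ArrayPB = 1   ; ASCII strings from a PureBasic caller
#ArrayVB = 2   ; UTF-16 strings from a VB caller

; Hypothetical helper: guesses the encoding from the first two bytes of one string.
; Returns 0 when there is nothing to inspect.
Procedure.i GuessTypeArray(*text)
  If *text = 0
    ProcedureReturn 0                 ; null pointer
  EndIf
  If PeekA(*text) = 0
    ProcedureReturn 0                 ; empty string in either encoding
  EndIf
  If PeekA(*text + 1) = 0
    ProcedureReturn #ArrayVB          ; xx 00: a Latin character stored as UTF-16 LE
  EndIf
  ProcedureReturn #ArrayPB            ; two non-zero bytes in a row: 8-bit text
EndProcedure

; Usage with two test buffers holding the same text in both formats
*a = AllocateMemory(16) : PokeS(*a, "Hi", -1, #PB_Ascii)
*u = AllocateMemory(16) : PokeS(*u, "Hi", -1, #PB_Unicode)
Debug GuessTypeArray(*a)  ; 1
Debug GuessTypeArray(*u)  ; 2
FreeMemory(*a) : FreeMemory(*u)
```

The guess fails for a one-character ASCII string (it also looks like xx 00) and for Unicode text whose first character is outside Latin-1. If both callers can be changed, having each EXE pass its type as an extra parameter would probably be more dependable than any guess from the bytes.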