ASCII to UNICODE

jesperbrannmark
Enthusiast
Posts: 536
Joined: Mon Feb 16, 2009 10:42 am
Location: sweden

Post by jesperbrannmark »

Hi.
I've searched a lot, so please don't mock me...
If I want to URL-encode something, it's easy. I use
a.s=urlencode("tesåäö%&& ")
and get a good string.

But if I want to turn a string into UTF-8, how do I do that?
In PHP I would do it with
$string = utf8_encode($string);
ts-soft
Always Here
Posts: 5756
Joined: Thu Jun 24, 2004 2:44 pm
Location: Berlin - Germany

Post by ts-soft »

Code: Select all

string$ = PeekS(@string$, -1, #PB_UTF8)
Have a look at the help file: PeekS and PokeS.
PureBasic 5.73 | SpiderBasic 2.30 | Windows 10 Pro (x64) | Linux Mint 20.1 (x64)
Old bugs good, new bugs bad! Updates are evil: might fix old bugs and introduce no new ones.

Post by jesperbrannmark »

It doesn't work.

Code: Select all

utf8encode$="ÅÄÖ"
utf8encode$=PeekS(@utf8encode$, -1, #PB_UTF8)
Debug utf8encode$
This just returns ?
ÅÄÖ are Swedish characters. I tried with " (Chr(34)) as well.

Post by ts-soft »

A PB string can only hold the format selected in the compiler options: ASCII or Unicode.
You can only write another format into memory with PokeS, or read memory that holds another format with PeekS;
the result is then in the compiler's string format.

I hope you understand my bad English.
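
To illustrate the point about memory versus strings, here is a minimal sketch. It assumes a Unicode build (from PB 5.50 on, strings are always Unicode) and a source file saved as UTF-8; the variable names are only for illustration. The string itself stays in the compiler's format, and only the buffer holds UTF-8.

```purebasic
; Minimal sketch, assuming a Unicode build and a UTF-8 source file.
EnableExplicit

Define text.s = "ÅÄÖ"
Define size = StringByteLength(text, #PB_UTF8)   ; UTF-8 bytes needed, without the terminating zero
Define *buffer = AllocateMemory(size + 1)        ; +1 for the terminating zero

If *buffer
  PokeS(*buffer, text, -1, #PB_UTF8)             ; encode the string into the buffer as UTF-8
  Debug "Characters: " + Len(text)
  Debug "UTF-8 bytes: " + size
  Debug PeekS(*buffer, -1, #PB_UTF8)             ; decode the buffer back into a native PB string
  FreeMemory(*buffer)
EndIf
```

With this input it should report 3 characters and 6 UTF-8 bytes, since each of Å, Ä and Ö takes two bytes in UTF-8.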

Post by jesperbrannmark »

I still don't get it. I tried changing to Unicode in the compiler options.
I am still not able to convert ASCII to Unicode.

What I need is something just like the PHP function. A proper way would be better than replacing every character using a character map.
Example: I send in a string that is 10 bytes long (ASCII) and get back a 16-byte UNICODE/UTF-8 string.

I am building JSON arrays manually and need UTF-8; otherwise the ", ' and / make it freak out.
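
A note on the JSON part: inside a JSON string, double quotes and backslashes have to be escaped whatever the encoding is, so converting to UTF-8 alone would probably not stop a hand-built array from breaking. Below is a minimal sketch, assuming a PureBasic version that ships the JSON library (5.30 or later, as far as I know; it did not exist when this question was posted). The member name and the text are made up for illustration.

```purebasic
; Minimal sketch, assuming the built-in JSON library is available.
EnableExplicit

Define json, object
json = CreateJSON(#PB_Any)
If json
  object = SetJSONObject(JSONValue(json))        ; make the root value a JSON object
  ; "name" and the text are hypothetical example data
  SetJSONString(AddJSONMember(object, "name"), "He said " + #DQUOTE$ + "hi" + #DQUOTE$ + " / ÅÄÖ")
  Debug ComposeJSON(json)                        ; the library escapes the quotes for you
  FreeJSON(json)
EndIf
```

Letting the library compose the text avoids having to escape each character by hand.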

Post by jesperbrannmark »

Solution:

Code: Select all

Procedure.s utf8encode(txt.s)
  ; Allocate room for the UTF-8 bytes plus the terminating zero
  *b=AllocateMemory(StringByteLength(txt.s,#PB_UTF8)+1)
  PokeS(*b,txt.s,Len(txt.s),#PB_UTF8)             ; writes txt as UTF-8, followed by a terminating zero
  txt.s=PeekS(*b,StringByteLength(txt.s,#PB_UTF8),#PB_Ascii) ; reads each UTF-8 byte back as one ASCII character
  FreeMemory(*b)
  ProcedureReturn txt.s
EndProcedure
Debug utf8encode("ABC 123 ÅÄÖ Ééåäö ^nñ ABC")
ABBKlaus
Addict
Posts: 1143
Joined: Sat Apr 10, 2004 1:20 pm
Location: Germany

Post by ABBKlaus »

That's a neat trick, jesperbrannmark, but a waste of memory if you are in Unicode mode.
hessu
User
Posts: 25
Joined: Fri Nov 20, 2015 6:30 am

Post by hessu »

Procedure.s utf8encode(txt.s)
  *b=AllocateMemory(StringByteLength(txt.s,#PB_UTF8)+1)
  PokeS(*b,txt.s,Len(txt.s),#PB_UTF8) ; writes txt as UTF-8, followed by a terminating zero
  txt.s=PeekS(*b,StringByteLength(txt.s,#PB_UTF8),#PB_Ascii)
  FreeMemory(*b)
  ProcedureReturn txt.s
EndProcedure
Debug utf8encode("ABC 123 ÅÄÖ Ééåäö ^nñ ABC")

This works, but if I read ABC 123 ÅÄÖ Ééåäö ^nñ ABC from a file, it does not work!
So what should I do?

:cry:
infratec
Always Here
Posts: 7583
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Post by infratec »

This stuff is very old and no longer needed.

If you read from a file, use the correct flag for ReadString(); the content is then converted automatically.
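
A minimal sketch of that approach, assuming the file really is stored as ASCII; "input.txt" is a hypothetical file name. Use #PB_UTF8 or #PB_Unicode instead if the file is stored in one of those encodings.

```purebasic
; Minimal sketch; "input.txt" is a hypothetical file name.
EnableExplicit

Define file = ReadFile(#PB_Any, "input.txt")
If file
  While Not Eof(file)
    Debug ReadString(file, #PB_Ascii)   ; each line is decoded from ASCII into a native PB string
  Wend
  CloseFile(file)
Else
  Debug "Could not open the file"
EndIf
```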
mk-soft
Always Here
Posts: 6207
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Post by mk-soft »

The program should be compiled in Unicode mode.

Once a string is converted to ASCII or UTF-8, the result is no longer a string as far as PB is concerned, but a pointer to a memory buffer.
Any non-Unicode string must likewise be handled as a pointer to the memory that holds the ASCII or UTF-8 data.

Code: Select all


CompilerIf #PB_Compiler_Version < 550
  
  Procedure Ascii(String.s)
    Protected *mem, len = Len(String)
    *mem = AllocateMemory(len + 1)
    If *mem
      PokeS(*mem, String, -1, #PB_Ascii)
    EndIf
    ProcedureReturn *mem
  EndProcedure
  
  Procedure UTF8(String.s)
    Protected *mem, len = StringByteLength(String, #PB_UTF8)
    *mem = AllocateMemory(len + 1)
    If *mem
      PokeS(*mem, String, -1, #PB_UTF8)
    EndIf
    ProcedureReturn *mem
  EndProcedure
  
CompilerEndIf

a.s = "ABC 123 ÅÄÖ Ééåäö ^nñ ABC"

*utf8_String = UTF8(a.s)
*ascii_String = Ascii(a.s)

Debug PeekS(*utf8_String, -1, #PB_UTF8)
Debug PeekS(*ascii_String, -1, #PB_Ascii)

ShowMemoryViewer(*utf8_String, 64)
;ShowMemoryViewer(*ascii_String, 64)

FreeMemory(*utf8_String)
FreeMemory(*ascii_String)
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive

Post by hessu »

Hi, thanks for the tip.

I tried to write this string thing, but no luck.

Can you write a small example
of how to read a string from a file?
I would be very happy for that.
I find it so complicated with PureBasic.
But it works so well with SQLite.
Mijikai
Addict
Posts: 1517
Joined: Sun Sep 11, 2016 2:17 pm

Post by Mijikai »

Either use a more up-to-date version where you can set the flags for ReadString(),
or read the raw data and use PeekS() as shown before.
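
The second option could look roughly like this. It is a minimal sketch, assuming an ASCII file small enough to load in one piece; "input.txt" is a hypothetical file name.

```purebasic
; Minimal sketch; "input.txt" is a hypothetical file name.
EnableExplicit

Define file, size, *mem, text.s

file = ReadFile(#PB_Any, "input.txt")
If file
  size = Lof(file)                          ; file length in bytes
  If size > 0
    *mem = AllocateMemory(size)
    If *mem
      size = ReadData(file, *mem, size)     ; number of bytes actually read
      text = PeekS(*mem, size, #PB_Ascii)   ; one ASCII byte per character
      FreeMemory(*mem)
    EndIf
  EndIf
  CloseFile(file)
  Debug text
EndIf
```

Passing the length to PeekS() matters here, because the raw file data is not zero-terminated.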

Post by infratec »

So first tell us which PB version you are using.
And which encoding is your file stored in?

Post by mk-soft »

Sorry, I don't know what the problem with the file system is.

Code: Select all

;-TOP

EnableExplicit

Procedure ReadFileToList(Filename.s, List Rows.s()) ; Result = BOM
  Protected file, bom
  
  ClearList(Rows())
  file = ReadFile(#PB_Any, Filename)
  If file
    If Not Eof(file)
      bom = ReadStringFormat(file)
      While Not Eof(file)
        AddElement(Rows())
        Rows() = ReadString(file, bom)
      Wend
    EndIf
    CloseFile(file)
  EndIf
  ProcedureReturn bom
EndProcedure

Procedure WriteFileFromList(Filename.s, List Rows.s(), Bom = #PB_Ascii)
  Protected file
  
  file = CreateFile(#PB_Any, Filename)
  If file
    WriteStringFormat(file, Bom)
    ForEach Rows()
      WriteStringN(file, Rows(), Bom)
    Next
    CloseFile(file)
    ProcedureReturn #True
  Else
    ProcedureReturn #False
  EndIf
EndProcedure

; ----

Global NewList Result.s()
Define fname.s = OpenFileRequester("Textfile", "", "", 0)
Define rows, bom

If fname <> ""
  bom = ReadFileToList(fname, Result())
  If Not bom
    Debug "Error open file " + fname
  Else
    rows = ListSize(Result())
    Debug "count rows = " + rows
    Debug "bom = " + bom
    
    ForEach Result()
      Debug Result()
    Next
    
    fname + ".unicode"
    If WriteFileFromList(fname, Result(), #PB_Unicode)
      Debug "Write file as unicode = " + fname
    Else
      Debug "Error write file " + fname
    EndIf
  EndIf
EndIf

Post by hessu »

I use PB 5.71.
The file was saved as an ASCII file with Liberty BASIC a couple of years ago.
Or maybe as a plain text file.
:lol: