Unicode Text output

matalog
Enthusiast
Posts: 305
Joined: Tue Sep 05, 2017 10:07 am

Post by matalog »

How can I write to a file so that Unicode code points are output correctly and are recognised by web pages etc., so that the characters are displayed?

I found a list of Unicode codes here: https://www.unicode.org/Public/UCD/late ... Charts.pdf.

How can I output code 1F0B6 🂶 PLAYING CARD SIX OF HEARTS, for example?

Or this

Code: Select all

๐”ธ๐”นโ„‚ ๐•’๐•“๐•” ๐Ÿ™๐Ÿš๐Ÿ›
Which was copied and pasted from one of those online font converters.

I will have an input box, and the ASCII codes of the typed characters will index into an array holding the code points of the relevant font. I assume I will have to use byte output. Does anyone know how each code point would be represented in bytes?
Fred
Administrator
Posts: 18351
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: Unicode Text output

Post by Fred »

You can use the excellent UTF-16 module by idle: https://www.purebasic.fr/english/viewtopic.php?t=80275

Code: Select all

Procedure.s StrChr(v.i) ;return a proper surrogate pair for unicode values outside the BMP (Basic Multilingual Plane)
	Protected buffer.q
	If v < $10000
		ProcedureReturn Chr(v)
	Else
		Buffer = (v&$3FF)<<16 | (v-$10000)>>10 | $DC00D800
		ProcedureReturn PeekS(@Buffer, 2, #PB_Unicode)
	EndIf
EndProcedure

a$ = StrChr($1F0B6)
Debug a$
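
As a rough illustration of what the packed expression in StrChr() computes, the two UTF-16 surrogates for $1F0B6 can be worked out separately. The variable names v, hi and lo below are only illustrative and are not part of the snippet above.

```purebasic
; Sketch: split a code point above $FFFF into its UTF-16 surrogate pair.
; v, hi and lo are illustrative names, not taken from the snippet above.
Define v.i  = $1F0B6                        ; PLAYING CARD SIX OF HEARTS
Define hi.i = $D800 + ((v - $10000) >> 10)  ; high surrogate: upper 10 bits of the offset
Define lo.i = $DC00 + (v & $3FF)            ; low surrogate: lower 10 bits

Debug Hex(hi) ; D83C
Debug Hex(lo) ; DCB6
```

On a little-endian machine the high surrogate has to sit in the low 16 bits of the buffer and the low surrogate in the next 16 bits, which is why the constant in StrChr() is written as $DC00D800.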
matalog
Enthusiast
Posts: 305
Joined: Tue Sep 05, 2017 10:07 am

Re: Unicode Text output

Post by matalog »

That's great, it seems to work for most of Unicode. I don't expect to be using the eastern characters it doesn't work with so it is good for me.


Would there be a way to get the Unicode code point from a pasted character, like the ABC123abc I included in the first post, to see what the large Unicode reference list calls it?
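
One possible approach, sketched here under the assumption that the pasted text ends up in an ordinary PureBasic string (which stores UTF-16 code units), is to walk the string and recombine surrogate pairs. ListCodePoints() is a hypothetical helper name, and the test string is built with Chr() only so that the example does not depend on the encoding of the source file.

```purebasic
; Sketch: print the Unicode code point of every character in a string.
; ListCodePoints() is a hypothetical helper, not an existing library function.
Procedure ListCodePoints(Text$)
  Protected i = 1, n = Len(Text$), hi, lo, cp
  While i <= n
    hi = Asc(Mid(Text$, i, 1))          ; current UTF-16 code unit
    cp = hi
    If hi >= $D800 And hi <= $DBFF And i < n
      lo = Asc(Mid(Text$, i + 1, 1))    ; possible low surrogate
      If lo >= $DC00 And lo <= $DFFF
        ; recombine the pair into one code point
        cp = $10000 + ((hi - $D800) << 10) + (lo - $DC00)
        i + 1                           ; skip the low surrogate
      EndIf
    EndIf
    Debug "U+" + Hex(cp)
    i + 1
  Wend
EndProcedure

; "A" followed by the surrogate pair D835 DD38 (the double-struck capital A)
ListCodePoints("A" + Chr($D835) + Chr($DD38))
; U+41
; U+1D538
```

The printed values can then be looked up in the Unicode charts.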
matalog
Enthusiast
Posts: 305
Joined: Tue Sep 05, 2017 10:07 am

Re: Unicode Text output

Post by matalog »

I'm struggling to understand the relation between the Unicode Codes and the Hex of a TXT file containing them, for example:

Code: Select all

a$ = StrChr($1F150)+StrChr($1F0B2)+StrChr($1F0B3)
This was converted into 🅐🂲🂳. The text file was then saved and opened in HxD, which shows: F0 9F 85 90 F0 9F 82 B2 F0 9F 82 B3

It looks like F0 marks the start of a 4-byte code for a character, but I'm not sure about anything beyond that.
Demivec
Addict
Posts: 4281
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Re: Unicode Text output

Post by Demivec »

matalog wrote: Fri Feb 02, 2024 4:27 pm I'm struggling to understand the relation between the Unicode Codes and the Hex of a TXT file containing them, for example:

Code: Select all

a$ = StrChr($1F150)+StrChr($1F0B2)+StrChr($1F0B3)
This was converted into 🅐🂲🂳. The text file was then saved and opened in HxD, which shows: F0 9F 85 90 F0 9F 82 B2 F0 9F 82 B3

It looks like F0 marks the start of a 4-byte code for a character, but I'm not sure about anything beyond that.
The code example is using UTF-16 and the values from the text file are in UTF-8.
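
To make the difference concrete, here is a minimal sketch of how the UTF-8 bytes seen in HxD follow from the code point $1F150. The variable names and the file name "cards.txt" are illustrative. The second half assumes that WriteString() with #PB_UTF8 converts a surrogate pair into one 4-byte sequence, which is worth checking in a hex editor on your platform.

```purebasic
; Sketch: compute the four UTF-8 bytes for a code point in $10000..$10FFFF.
Define v.i  = $1F150
Define b1.i = $F0 | (v >> 18)          ; 11110xxx: lead byte of a 4-byte sequence
Define b2.i = $80 | ((v >> 12) & $3F)  ; 10xxxxxx: continuation byte
Define b3.i = $80 | ((v >> 6) & $3F)   ; 10xxxxxx: continuation byte
Define b4.i = $80 | (v & $3F)          ; 10xxxxxx: continuation byte

Debug Hex(b1) + " " + Hex(b2) + " " + Hex(b3) + " " + Hex(b4) ; F0 9F 85 90

; Writing the same character to a file as UTF-8.
; Assumption: WriteString() emits one 4-byte sequence for the surrogate pair.
a$ = Chr($D83C) + Chr($DD50)           ; U+1F150 as a UTF-16 surrogate pair
If CreateFile(0, "cards.txt")          ; "cards.txt" is a hypothetical file name
  WriteString(0, a$, #PB_UTF8)
  CloseFile(0)
EndIf
```

So the guess about F0 is on the right track: a lead byte of the form 11110xxx (F0 to F4 in valid UTF-8) announces a 4-byte sequence, and each of the following three bytes carries 6 more bits of the code point.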
idle
Always Here
Posts: 6035
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Unicode Text output

Post by idle »

Fred wrote: Fri Feb 02, 2024 2:10 pm You can use the excellent UTF-16 module by idle: https://www.purebasic.fr/english/viewtopic.php?t=80275

Code: Select all

Procedure.s StrChr(v.i) ;return a proper surrogate pair for unicode values outside the BMP (Basic Multilingual Plane)
	Protected buffer.q
	If v < $10000
		ProcedureReturn Chr(v)
	Else
		Buffer = (v&$3FF)<<16 | (v-$10000)>>10 | $DC00D800
		ProcedureReturn PeekS(@Buffer, 2, #PB_Unicode)
	EndIf
EndProcedure

a$ = StrChr($1F0B6)
Debug a$
but that's missing the other 12000 lines of code :lol: