PureBasic Forums - English

Posted: **Fri Feb 02, 2024 1:55 pm**

How can I write Unicode code points to a file so that web pages and other software recognise them and display the correct characters?

I found a list of Unicode codes here: https://www.unicode.org/Public/UCD/late ... Charts.pdf.

How can I output code 1F0B6 🂶 PLAYING CARD SIX OF HEARTS for example?

Or this:

Code: Select all

𝔸𝔹ℂ 𝕒𝕓𝕔 𝟙𝟚𝟛

That was copied and pasted from one of those online font converters.

I will have an input box, and the ASCII code of each typed character will index into an array of code points for the relevant font. I assume I will have to write the output as bytes. Does anyone know how each code point would be represented in bytes?

Posted: **Fri Feb 02, 2024 2:10 pm**

You can use the excellent UTF-16 module by idle: https://www.purebasic.fr/english/viewtopic.php?t=80275

Code: Select all

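Procedure.s StrChr(v.i) ; return a proper surrogate pair for Unicode values outside the BMP (Basic Multilingual Plane)
	Protected buffer.q ; 8 zeroed bytes: room for two UTF-16 code units plus a null terminator
	If v < $10000
		ProcedureReturn Chr(v) ; BMP code point: a single UTF-16 code unit
	Else
		; low 16 bits (first in memory on little-endian CPUs) = high surrogate, $D800 + top 10 bits of v-$10000
		; next 16 bits = low surrogate, $DC00 + bottom 10 bits of v
		buffer = (v&$3FF)<<16 | (v-$10000)>>10 | $DC00D800
		ProcedureReturn PeekS(@buffer, 2, #PB_Unicode)
	EndIf
EndProcedure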

a$ = StrChr($1F0B6)
Debug a$
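
For the original question of getting this into a file, here is a minimal sketch. It assumes that PureBasic's built-in UTF-8 conversion in WriteStringN() turns the surrogate pair into the matching 4-byte UTF-8 sequence on your platform, which is worth checking in a hex editor. The file name cards.txt and the use of the temporary directory are placeholders, not something from this thread.

```purebasic
EnableExplicit

; Same helper as above: builds a UTF-16 surrogate pair when needed.
Procedure.s StrChr(v.i)
  Protected buffer.q
  If v < $10000
    ProcedureReturn Chr(v)
  Else
    buffer = (v & $3FF) << 16 | (v - $10000) >> 10 | $DC00D800
    ProcedureReturn PeekS(@buffer, 2, #PB_Unicode)
  EndIf
EndProcedure

Define file$ = GetTemporaryDirectory() + "cards.txt" ; placeholder path (assumption)
Define text$ = StrChr($1F0B6)                        ; PLAYING CARD SIX OF HEARTS

If CreateFile(0, file$)
  WriteStringFormat(0, #PB_UTF8)   ; optional: writes the UTF-8 byte order mark
  WriteStringN(0, text$, #PB_UTF8) ; converts the UTF-16 string to UTF-8 while writing
  CloseFile(0)
Else
  Debug "Could not create " + file$
EndIf
```

Web pages generally expect UTF-8, so writing with #PB_UTF8 is usually the format to pick; the byte order mark is optional and some tools prefer the file without it.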

Posted: **Fri Feb 02, 2024 3:57 pm**

That's great, it seems to work for most of Unicode. It doesn't work with some of the Eastern characters, but I don't expect to use those, so it is good enough for me.

Would there be a way to get the Unicode code point from a pasted character, like the ABC123abc I included in the first post, to see what the large Unicode reference list calls them?
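
One possible way to go in that direction is to walk the string one UTF-16 code unit at a time and merge each high/low surrogate pair back into a single value. This is only a sketch: CodePoints() is a made-up helper name, not a PureBasic function or part of idle's module, and it does nothing clever with unpaired surrogates.

```purebasic
EnableExplicit

; Hypothetical helper: lists the code point of every character in text$,
; combining UTF-16 surrogate pairs into one value.
Procedure.s CodePoints(text$)
  Protected i = 1, hi, lo, cp, out$
  While i <= Len(text$)
    hi = Asc(Mid(text$, i, 1))
    cp = hi
    If hi >= $D800 And hi <= $DBFF And i < Len(text$) ; high surrogate with a unit after it
      lo = Asc(Mid(text$, i + 1, 1))
      If lo >= $DC00 And lo <= $DFFF                  ; valid low surrogate
        cp = $10000 + ((hi - $D800) << 10) + (lo - $DC00)
        i + 1                                         ; skip the low surrogate
      EndIf
    EndIf
    out$ + "$" + Hex(cp) + " "
    i + 1
  Wend
  ProcedureReturn Trim(out$)
EndProcedure

; "A" followed by the surrogate pair for PLAYING CARD SIX OF HEARTS
Debug CodePoints("A" + Chr($D83C) + Chr($DCB6)) ; $41 $1F0B6
```

The hex values it prints can then be looked up in the Unicode charts to find the official character names.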

Posted: **Fri Feb 02, 2024 4:27 pm**

I'm struggling to understand the relation between the Unicode code points and the hex bytes of a TXT file containing them, for example:

Code: Select all

a$ = StrChr($1F150)+StrChr($1F0B2)+StrChr($1F0B3)

This converts into 🅐🂲🂳. Saving that to a TXT file and opening it in HxD shows: F0 9F 85 90 F0 9F 82 B2 F0 9F 82 B3

It looks like F0 marks a 4-byte code for a character, but I'm not sure about the rest.

Posted: **Fri Feb 02, 2024 6:46 pm**

matalog wrote: Fri Feb 02, 2024 4:27 pm
I'm struggling to understand the relation between the Unicode code points and the hex bytes of a TXT file containing them, for example:
Code: Select all
a$ = StrChr($1F150)+StrChr($1F0B2)+StrChr($1F0B3)
This converts into 🅐🂲🂳. Saving that to a TXT file and opening it in HxD shows: F0 9F 85 90 F0 9F 82 B2 F0 9F 82 B3

It looks like F0 marks a 4-byte code for a character, but I'm not sure about the rest.

The code example is using UTF-16 and the values from the text file are in UTF-8.
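
To make the UTF-8 side concrete, here is a rough sketch of how one code point is split into bytes. HexByte() and Utf8Hex() are made-up helper names for illustration, not part of PureBasic or of idle's module, and the code does not validate its input. A lead byte of the form 11110xxx, such as F0, announces a four-byte sequence, and every following byte has the form 10xxxxxx, which is why the rest all fall in the 80 to BF range.

```purebasic
EnableExplicit

; Hypothetical helper: one byte as two upper-case hex digits.
Procedure.s HexByte(b.i)
  ProcedureReturn RSet(Hex(b & $FF), 2, "0")
EndProcedure

; Hypothetical helper: UTF-8 bytes of one code point as a hex string.
Procedure.s Utf8Hex(cp.i)
  If cp < $80        ; 1 byte:  0xxxxxxx
    ProcedureReturn HexByte(cp)
  ElseIf cp < $800   ; 2 bytes: 110xxxxx 10xxxxxx
    ProcedureReturn HexByte($C0 | (cp >> 6)) + " " + HexByte($80 | (cp & $3F))
  ElseIf cp < $10000 ; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    ProcedureReturn HexByte($E0 | (cp >> 12)) + " " + HexByte($80 | ((cp >> 6) & $3F)) + " " + HexByte($80 | (cp & $3F))
  Else               ; 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    ProcedureReturn HexByte($F0 | (cp >> 18)) + " " + HexByte($80 | ((cp >> 12) & $3F)) + " " + HexByte($80 | ((cp >> 6) & $3F)) + " " + HexByte($80 | (cp & $3F))
  EndIf
EndProcedure

Debug Utf8Hex($1F150) ; F0 9F 85 90
Debug Utf8Hex($1F0B2) ; F0 9F 82 B2
Debug Utf8Hex($1F0B3) ; F0 9F 82 B3
```

These match the bytes seen in HxD, so the file holds the same three code points as the UTF-16 string, just in a different encoding.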

Posted: **Sat Feb 03, 2024 6:07 am**

Fred wrote: Fri Feb 02, 2024 2:10 pm
You can use the excellent UTF-16 module by idle: https://www.purebasic.fr/english/viewtopic.php?t=80275
Code: Select all
Procedure.s StrChr(v.i) ;return a proper surrogate pair for unicode values outside the BMP (Basic Multilingual Plane)
	Protected buffer.q
	If v < $10000
		ProcedureReturn Chr(v)
	Else
		Buffer = (v&$3FF)<<16 | (v-$10000)>>10 | $DC00D800
		ProcedureReturn PeekS(@Buffer, 2, #PB_Unicode)
	EndIf
EndProcedure

a$ = StrChr($1F0B6)
Debug a$

but that's missing the other 12000 lines of code

Unicode Text output
