Proper way to write Unicode to Windows console
Posted: Mon Sep 10, 2018 3:46 pm
[PB 5.62 x86, Windows 7]
I am struggling to find a "universal" way to print Unicode output to a Windows cmd console.
I know cmd.exe uses code pages, but I don't want my programs to rely on the correct code page being set, or to use "chcp" to change it. I read that if you use WriteConsoleW_() you shouldn't need chcp (I think PB's Print() uses this internally?)
Compile this simple program as "print.exe" and it prints the expected output.
Code:
If OpenConsole()
  Print("Héllo World!")
  CloseConsole()
EndIf
But pipe it through the common command "more", as "print | more", and I have two problems:
1. it prints as a 1-char wide column
2. the é character is displayed incorrectly
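Both symptoms would be consistent with raw UTF-16 bytes reaching "more". The sketch below only dumps the bytes PB keeps in memory for a short string; it is an assumption, not something confirmed here, that Print() passes those same bytes through when stdout is a pipe. The variable names are made up for illustration.

```purebasic
; Sketch: show the in-memory bytes of a PB string (UTF-16LE in a Unicode build).
text$ = "Hé"
*p = @text$            ; address of the first character

If OpenConsole()
  ; StringByteLength() returns the size in bytes, without the terminating null
  For i = 0 To StringByteLength(text$) - 1
    Print(RSet(Hex(PeekA(*p + i)), 2, "0") + " ")
  Next
  PrintN("")
  CloseConsole()
EndIf
```

In a Unicode build every character of this string takes two bytes, the second of which is zero. If "more" reads its input as single-byte text, each zero byte would show up as an extra "character", and é (byte $E9) would be looked up in the console code page instead of being treated as U+00E9.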
I read that "more" expects a BOM, so this version works with "print | more", but now plain "print" shows a blank box character where the BOM is printed.
Code:
#BOM$ = Chr($FEFF)
If OpenConsole()
  Print(#BOM$)
  Print("Héllo World!")
  CloseConsole()
EndIf
If I print a #BS$ or #CR$ to overwrite the #BOM$, then the console output of "print" and "print | more" both *look* correct.
But now, if you redirect the output to a file, as in "print > log.txt", the text file contains an unwanted BOM and BS or CR at the very beginning.
Code:
#BOM$ = Chr($FEFF)
If OpenConsole()
  Print(#BOM$)
  Print(#BS$)
  Print("Héllo World!")
  CloseConsole()
EndIf
Is there a "proper" way to handle all this?
(PS. I also tried converting the text to a UTF8() byte buffer and writing it with WriteConsoleData(). The output looks correct in the cmd console, but piped through "more" it displays nothing at all!)
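One possible direction, offered as an untested sketch and not a confirmed fix: since the BOM only helps in the "print | more" case, the program could ask Windows what kind of handle stdout is and print the BOM only for a pipe. GetStdHandle_() and GetFileType_() are the Win32 calls as PB exposes them. #FileTypePipe is a constant defined here for illustration (3 is the Win32 FILE_TYPE_PIPE value), and it is an assumption that OpenConsole() leaves a redirected stdout handle in place.

```purebasic
; Sketch: emit the BOM only when stdout is a pipe ("print | more").
; A real console or "print > log.txt" reports a different file type,
; so neither of them gets the BOM.
#BOM$ = Chr($FEFF)
#FileTypePipe = 3   ; hypothetical name; value of Win32 FILE_TYPE_PIPE

If OpenConsole()
  hOut = GetStdHandle_(#STD_OUTPUT_HANDLE)
  If GetFileType_(hOut) = #FileTypePipe
    Print(#BOM$)
  EndIf
  Print("Héllo World!")
  CloseConsole()
EndIf
```

This avoids the #BS$/#CR$ trick, so nothing extra ends up in log.txt. The trade-off is that any other program reading from the pipe would also receive the BOM, which may or may not be wanted.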