Given the current announcement, can we get these commands expedited, so we can begin our transitions?
Thanks.
ASCIItoUnicode & UnicodeToASCII conversions
Re: ASCIItoUnicode & UnicodeToASCII conversions
Both of those conversions are potentially unreliable, especially if your app is distributed in other countries that use a different locale and code page. If your app has to use ASCII because of what it interfaces with, maintain it with the current PB version and save yourself the headaches. Write your new apps as Unicode unless the interface absolutely dictates otherwise. And don't forget that your app can consist of more than one executable, so you can, for example, have a tiny exe in the background handling the ASCII-specific requirements.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: ASCIItoUnicode & UnicodeToASCII conversions
My concern is converting existing code and files, not actually converting Unicode Russian characters to ASCII.
Re: ASCIItoUnicode & UnicodeToASCII conversions
I understand you want a native procedure, but the conversion itself is very simple if you only use characters 32-127: you convert 1 byte to 2, or 2 bytes to 1.
Things get complicated, and need more conversion time, once you take values 128-255 from different code pages into account.
This code page mess is exactly why Unicode is, in most cases, a better solution than ASCII.
Here's an example that does a fast conversion of one type of input buffer into the other type of output buffer.
Code:
Procedure.i UnicodeToAscii(*UnicodeIn, *AsciiOut)
  ; *UnicodeIn : zero terminated unicode input buffer
  ; *AsciiOut : zero terminated ascii output buffer
  ; Result : number of converted characters (zero character not included)
  ; eax is the character index; it starts at -1 because the loop increments first
  !mov eax, -1
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    ; keep a copy of ebx just below the stack pointer so it can be restored
    !mov [esp - 4], ebx
    !mov ebx, [p.p_UnicodeIn]
    !mov edx, [p.p_AsciiOut]
    !ua_loop:
    !inc eax
    ; read a 2-byte character and store only its low byte
    !mov cx, [ebx + eax * 2]
    !mov [edx + eax], cl
  CompilerElse
    !mov r8, [p.p_UnicodeIn]
    !mov r9, [p.p_AsciiOut]
    !ua_loop:
    !inc eax
    !mov cx, [r8 + rax * 2]
    !mov [r9 + rax], cl
  CompilerEndIf
  ; repeat until the zero terminator has been copied
  !and cx, cx
  !jnz ua_loop
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    !mov ebx, [esp - 4]
  CompilerEndIf
  ; eax already holds the result
  ProcedureReturn
EndProcedure

Procedure.i AsciiToUnicode(*AsciiIn, *UnicodeOut)
  ; *AsciiIn : zero terminated ascii input buffer
  ; *UnicodeOut : zero terminated unicode output buffer
  ; Result : number of converted characters (zero character not included)
  ; eax is the character index; it starts at -1 because the loop increments first
  !mov eax, -1
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    ; keep a copy of ebx just below the stack pointer so it can be restored
    !mov [esp - 4], ebx
    !mov ebx, [p.p_AsciiIn]
    !mov edx, [p.p_UnicodeOut]
    !au_loop:
    !inc eax
    ; read one byte, zero-extend it and store it as a 2-byte character
    !movzx cx, byte [ebx + eax]
    !mov [edx + eax * 2], cx
  CompilerElse
    !mov r8, [p.p_AsciiIn]
    !mov r9, [p.p_UnicodeOut]
    !au_loop:
    !inc eax
    !movzx cx, byte [r8 + rax]
    !mov [r9 + rax * 2], cx
  CompilerEndIf
  ; repeat until the zero terminator has been copied
  !and cx, cx
  !jnz au_loop
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    !mov ebx, [esp - 4]
  CompilerEndIf
  ; eax already holds the result
  ProcedureReturn
EndProcedure

MemSize = 1024 * 1024 ; Reserve 1 MB
*In = AllocateMemory(MemSize)
*Out = AllocateMemory(MemSize)
PokeS(*In, "Test", -1, #PB_Ascii)
Debug AsciiToUnicode(*In, *Out)
Debug PeekS(*Out, -1, #PB_Unicode)
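For anyone who would rather avoid inline assembly (the routines above only cover x86 and x64), the same 1-byte-to-2 and 2-bytes-to-1 idea can probably be sketched with PeekA/PokeU and PeekU/PokeA. Treat this as a sketch under assumptions rather than a drop-in replacement: the ...Checked procedure names are made up for illustration, and returning -1 for any value above 127 is just one possible way to flag code-page-dependent input instead of converting it blindly.

```purebasic
; Sketch only: the procedure names and the -1 error convention are
; illustrative assumptions, not part of PureBasic or of the routines above.

Procedure.i AsciiToUnicodeChecked(*AsciiIn, *UnicodeOut)
  ; Result : number of converted characters, or -1 if a value above 127 was found
  Protected i.i = 0, c.a
  Repeat
    c = PeekA(*AsciiIn + i)
    If c > 127
      PokeU(*UnicodeOut + i * 2, 0)  ; terminate what was written so far
      ProcedureReturn -1             ; meaning of this byte depends on the code page
    EndIf
    PokeU(*UnicodeOut + i * 2, c)    ; widen 1 byte to 2 (also copies the terminator)
    i + 1
  Until c = 0
  ProcedureReturn i - 1              ; zero character not included
EndProcedure

Procedure.i UnicodeToAsciiChecked(*UnicodeIn, *AsciiOut)
  ; Result : number of converted characters, or -1 if a value above 127 was found
  Protected i.i = 0, c.u
  Repeat
    c = PeekU(*UnicodeIn + i * 2)
    If c > 127
      PokeA(*AsciiOut + i, 0)        ; terminate what was written so far
      ProcedureReturn -1             ; character cannot be represented in 7-bit ASCII
    EndIf
    PokeA(*AsciiOut + i, c)          ; narrow 2 bytes to 1 (also copies the terminator)
    i + 1
  Until c = 0
  ProcedureReturn i - 1              ; zero character not included
EndProcedure

*Src = AllocateMemory(64)
*Dst = AllocateMemory(64)
If *Src And *Dst
  PokeS(*Src, "Test", -1, #PB_Ascii)
  Debug AsciiToUnicodeChecked(*Src, *Dst)
  Debug PeekS(*Dst, -1, #PB_Unicode)
  PokeA(*Src + 1, 200)               ; replace the second character with a value above 127
  Debug AsciiToUnicodeChecked(*Src, *Dst)
  FreeMemory(*Src)
  FreeMemory(*Dst)
EndIf
```

This will be slower than the assembly version, but it does not depend on the processor type and it reports the 128-255 case instead of passing it through.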
Windows (x64)
Raspberry Pi OS (Arm64)