Backport UTF8() and Ascii() to PB 5.4x LTS

Post by wilbert »

I know features usually aren't added to previous versions.
In this case, though, I think it would be good to have UTF8() and Ascii() backported to the PB 5.4x LTS cycle.
It would ease the transition to internal Unicode strings and make it easier to post code that works on both the most recent version and the LTS version.

Post by kenmo »

That's a good idea.
I'm surprised more string helper functions weren't added BEFORE the ASCII switch was removed in 5.50.

Maybe there should be a Unicode() function, to complete the set :) Sometimes you might want a buffer copy of a string in Unicode format too.

I have been using my own StringHelper.pbi which includes buffer functions like these:

Code:

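; Each helper is only defined if the compiler does not already provide a built-in one.
; The returned buffer is allocated with AllocateMemory(), so the caller owns it.
CompilerIf (Not Defined(Ascii, #PB_Function))
  Procedure.i Ascii(Input.s)
    ; +1 byte for the null terminator
    Protected BufferSize.i = StringByteLength(Input, #PB_Ascii) + 1
    Protected *Buffer = AllocateMemory(BufferSize, #PB_Memory_NoClear)
    If (*Buffer)
      PokeS(*Buffer, Input, -1, #PB_Ascii)
    EndIf
    ProcedureReturn (*Buffer)
  EndProcedure
CompilerEndIf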

CompilerIf (Not Defined(UTF8, #PB_Function))
  Procedure.i UTF8(Input.s)
    Protected BufferSize.i = StringByteLength(Input, #PB_UTF8) + 1
    Protected *Buffer = AllocateMemory(BufferSize, #PB_Memory_NoClear)
    If (*Buffer)
      PokeS(*Buffer, Input, -1, #PB_UTF8)
    EndIf
    ProcedureReturn (*Buffer)
  EndProcedure
CompilerEndIf

CompilerIf (Not Defined(Unicode, #PB_Function))
  Procedure.i Unicode(Input.s)
    Protected BufferSize.i = StringByteLength(Input, #PB_Unicode) + 2
    Protected *Buffer = AllocateMemory(BufferSize, #PB_Memory_NoClear)
    If (*Buffer)
      PokeS(*Buffer, Input, -1, #PB_Unicode)
    EndIf
    ProcedureReturn (*Buffer)
  EndProcedure
CompilerEndIf

*AsciiBuffer = Ascii("Héllo World from ASCII!")
Debug MemorySize(*AsciiBuffer)
Debug PeekS(*AsciiBuffer, -1, #PB_Ascii)

*UTF8Buffer = UTF8("Héllo World from UTF-8!")
Debug MemorySize(*UTF8Buffer)
Debug PeekS(*UTF8Buffer, -1, #PB_UTF8)

*UnicodeBuffer = Unicode("Héllo World from Unicode!")
Debug MemorySize(*UnicodeBuffer)
Debug PeekS(*UnicodeBuffer, -1, #PB_Unicode)
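
One thing worth keeping in mind with helpers like these: each call hands back freshly allocated memory, so the caller is responsible for releasing it with FreeMemory() once done. A minimal sketch of that pattern follows. It assumes the same fallback UTF8() as above (repeated in shortened form so the snippet runs on its own), and the variable name *Text is only for illustration.

```purebasic
; Fallback for compilers without a built-in UTF8() (same approach as above)
CompilerIf Not Defined(UTF8, #PB_Function)
  Procedure.i UTF8(Input.s)
    ; +1 byte for the null terminator
    Protected *Buffer = AllocateMemory(StringByteLength(Input, #PB_UTF8) + 1)
    If *Buffer
      PokeS(*Buffer, Input, -1, #PB_UTF8)
    EndIf
    ProcedureReturn *Buffer
  EndProcedure
CompilerEndIf

Define *Text = UTF8("Hello World")
If *Text
  Debug PeekS(*Text, -1, #PB_UTF8)  ; read the buffer back to check the round trip
  FreeMemory(*Text)                 ; the caller owns the buffer and must free it
EndIf
```

Checking the pointer before use matters here because the fallback returns 0 when the allocation fails.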

Post by Lunasole »

I'm using the following simple code. It stores an ASCII string right inside a PB Unicode string, so there is no need to allocate additional memory and so on.
I don't think adding such functions to 5.4 would make much difference, as most people already know that this version has no such functions ^^

Code:

; *str   : pointer to a PB Unicode string, or to any memory containing a Unicode string
; RETURN : PB string whose memory holds the ASCII bytes, i.e. an array of 1-byte chars
Procedure$ ToAscii (*str)
	If *str
		Protected str$ = PeekS(*str, #PB_Default, #PB_Unicode)
		; Reserve at least Len(str$) + 1 bytes: the ASCII text plus its null terminator
		Protected out$ = Space(1 + Len(str$) / SizeOf(Character))
		PokeS(@out$, str$, #PB_Default, #PB_Ascii)
		ProcedureReturn out$
	EndIf
EndProcedure

; *str   : pointer to a PB string returned by ToAscii(), or to any ASCII string memory buffer
; RETURN : PB Unicode string. This function is just a wrapper for uniformity
Procedure$ ToUnicode (*str)
	If *str
		ProcedureReturn PeekS(*str, #PB_Default, #PB_Ascii)
	EndIf
EndProcedure


Define A$ = "this is unicode"
Define B$ = ToAscii(@A$) 	; this is A$ converted to ASCII
Define C$ = ToUnicode(@B$) 	; this is B$ converted back to unicode

Debug A$
Debug B$ 
Debug C$