Re: Revised Chr() & Asc() for UTF-16 surrogate pairs
Posted: Mon Apr 01, 2024 1:10 pm
by mk-soft
With one PeekS ...
Code:
Procedure.s _Chr(v.i) ; return a proper surrogate pair for Unicode values outside the BMP (Basic Multilingual Plane)
  Protected highlow.l
  If v < $10000
    ProcedureReturn Chr(v)
  Else
    ; calculate the surrogate pair of UTF-16 code units that represents the value
    v - $10000
    ; high/lead surrogate goes in the low word, low/tail surrogate in the high word (<< 16)
    highlow = (v / $400 + $D800) | (v % $400 + $DC00) << 16
    ProcedureReturn PeekS(@highlow, 2, #PB_Unicode)
  EndIf
EndProcedure

Debug _Chr($1F600) ; Smiley
Re: Revised Chr() & Asc() for UTF-16 surrogate pairs
Posted: Mon Apr 01, 2024 4:29 pm
by infratec
I think your last version is the slowest due to several calls to other procedures.
Your previous version is faster:
Code:
Procedure.s _Chr(v.i) ; return a proper surrogate pair for Unicode values outside the BMP (Basic Multilingual Plane)
  Protected r.s{2}, *p.Character
  *p = @r
  If v < $10000
    *p\c = v
    ; *p + 2 ; not needed, since PB initializes everything with 0
    ; *p\c = #Null
  Else ; calculate the surrogate pair of UTF-16 code units that represents the value
    v - $10000
    *p\c = v / $400 + $D800 ; high/lead surrogate value
    *p + 2
    *p\c = v % $400 + $DC00 ; low/tail surrogate value
  EndIf
  ProcedureReturn r
EndProcedure

a$ = _Chr($1F600) + " Smiley"
Debug a$
a$ = _Chr($0040) + " At"
Debug a$
Re: Revised Chr() & Asc() for UTF-16 surrogate pairs
Posted: Mon Apr 01, 2024 4:41 pm
by mk-soft
I had the impression that pointers were not wanted.

Re: Revised Chr() & Asc() for UTF-16 surrogate pairs
Posted: Mon Apr 01, 2024 4:43 pm
by STARGÅTE
infratec wrote: Mon Apr 01, 2024 4:29 pm
I think your last version is the slowest due to several calls to other procedures.
If we are talking about speed, then we should also avoid the division by $400:
Code:
Procedure.s _Chr(Unicode.i)
  Protected String.s{2}
  Protected *Long.Long = @String
  If Unicode < $10000
    *Long\l = Unicode
  Else
    Unicode - $10000
    *Long\l = (Unicode>>10) | (Unicode&$3FF)<<16 | $DC00D800
  EndIf
  ProcedureReturn String
EndProcedure

a$ = _Chr($1F600) + " Smiley"
Debug a$
a$ = _Chr($0040) + " At"
Debug a$
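A quick way to check that the shift and mask give the same result as the division and modulo is the small sketch below. It assumes the value is non-negative, which holds here once $10000 has been subtracted from a code point above the BMP. The variable name `offset` is only for illustration.

```purebasic
; Sketch: for non-negative values, / $400 matches >> 10 and % $400 matches & $3FF
Define offset.i = $1F600 - $10000                      ; $F600

Debug Hex(offset / $400) + " = " + Hex(offset >> 10)   ; 3D = 3D
Debug Hex(offset % $400) + " = " + Hex(offset & $3FF)  ; 200 = 200
```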
Re: Revised Chr() & Asc() for UTF-16 surrogate pairs
Posted: Mon Apr 01, 2024 6:30 pm
by infratec
Yep, that's true.
But ... for someone who is not comfortable with bit operations, it is no longer clear what happens.
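For anyone who wants to follow the bit version, here is a rough sketch that spells out the single expression `(Unicode>>10) | (Unicode&$3FF)<<16 | $DC00D800` step by step for $1F600. The variable names (`cp`, `offset`, `high`, `low`, `packed`) are made up for illustration, and the memory-order remark assumes a little-endian CPU.

```purebasic
; Sketch: the packed expression, split into separate steps for U+1F600
Define cp.i     = $1F600
Define offset.i = cp - $10000               ; $F600, at most 20 bits wide
Define high.i   = $D800 | (offset >> 10)    ; upper 10 bits -> $D83D (high/lead surrogate)
Define low.i    = $DC00 | (offset & $3FF)   ; lower 10 bits -> $DE00 (low/tail surrogate)

; Low word = high surrogate, high word = low surrogate.
; On a little-endian CPU the low word is stored first, so the
; high surrogate ends up first in the string buffer.
Define packed.l = high | (low << 16)

Debug Hex(high) + " " + Hex(low)            ; D83D DE00
Debug Hex(packed, #PB_Long)                 ; DE00D83D
```

The OR with $D800 and $DC00 works like an addition here because the low 10 bits of both constants are zero, which is why the two constants can be merged into the single $DC00D800.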
Re: Revised Chr() & Asc() for UTF-16 surrogate pairs
Posted: Wed Apr 17, 2024 9:15 pm
by idle
infratec wrote: Mon Apr 01, 2024 6:30 pm
Yep, that's true.
But ... for someone who is not comfortable with bit operations, it is no longer clear what happens.
They can watch and learn, or ask questions. Speed first.