
Re: Remove diacritic chars

Posted: Mon Mar 03, 2025 1:01 pm
by Zapman
Sorry for my successive posts.
I just optimized the preceding procedure. It is now 400 times faster!

Code:

Procedure.s RemoveAccents(Text$)
  ; Function to remove accents from a string. By Zapman.
  If Text$
    Protected Length = Len(Text$) * 2
    Protected OPos, NPos, DoubleLength
    Protected NormalizedText$ = Space(Length)
    ;
    ; FoldString_() replaces each accented character with a pair of characters,
    ; like this: (unaccented character) + (diacritic)
    Length = FoldString_(#MAP_COMPOSITE, @Text$, - 1, @NormalizedText$, Length) - 1
    ;
    ; Examine the result:
    If Length > 0 And Length <> Len(Text$)
      DoubleLength = (Length - 1) * 2
      For NPos = 0 To DoubleLength Step 2
        If PeekC(@Text$ + OPos) <> PeekC(@NormalizedText$ + NPos)
          ; If the character has been replaced, replace it into the original text:
          PokeC(@Text$ + OPos, PeekC(@NormalizedText$ + NPos))
          ; The following character contains the diacritic. Jump over it:
          NPos + 2
        EndIf
        OPos + 2
      Next
    EndIf
    ;
    ProcedureReturn Text$
  EndIf
EndProcedure
On my computer, it takes 130 ms to process a 1-million-character string.
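
The procedure above assumes that each accented character decomposes into exactly one base character plus one diacritic. Some characters carry two marks (for example the Vietnamese "ệ"), and text that already contains combining marks would not be handled either. A possible variant, offered only as an untested sketch, is to decompose the string in the same way and then drop every character in the Unicode Combining Diacritical Marks block (U+0300 to U+036F). The name RemoveAccentsMulti and the choice of that single block are assumptions made for illustration, not part of the original code. Like the original, it relies on the Windows API and on a Unicode build of PureBasic.

```purebasic
Procedure.s RemoveAccentsMulti(Text$)
  ; Hypothetical variant (not the original author's code):
  ; decompose with FoldString_(), then skip all combining marks.
  Protected Size, Normalized$
  Protected *Src.Character, *Dst.Character
  ;
  If Text$ = "" : ProcedureReturn "" : EndIf
  ;
  ; With a destination size of 0, FoldString_() returns the required
  ; buffer size in characters, including the terminating null.
  Size = FoldString_(#MAP_COMPOSITE, @Text$, -1, 0, 0)
  If Size <= 0 : ProcedureReturn Text$ : EndIf
  ;
  Normalized$ = Space(Size)
  If FoldString_(#MAP_COMPOSITE, @Text$, -1, @Normalized$, Size) = 0
    ; Decomposition failed: return the text unchanged.
    ProcedureReturn Text$
  EndIf
  ;
  ; Compact the string in place: copy every character that is not a
  ; combining diacritical mark (assumed range: $0300 to $036F).
  *Src = @Normalized$
  *Dst = @Normalized$
  While *Src\c
    If *Src\c < $0300 Or *Src\c > $036F
      *Dst\c = *Src\c
      *Dst + SizeOf(Character)
    EndIf
    *Src + SizeOf(Character)
  Wend
  *Dst\c = 0 ; terminate the shortened string
  ;
  ProcedureReturn Normalized$
EndProcedure

Debug RemoveAccentsMulti("Crème brûlée")
```

Querying the required size first avoids guessing a buffer of Len(Text$) * 2, at the cost of a second FoldString_() call, so it will probably be somewhat slower than the in-place version above.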

Re: Remove diacritic chars

Posted: Wed Mar 05, 2025 9:20 am
by jacdelad
Thanks again, I'll try this when I'm programming again.

Re: Remove diacritic chars

Posted: Wed Mar 05, 2025 12:54 pm
by CalamityJames
It isn't quite the same as what you are asking, but here is some code written by Wilbert in answer to a question I asked on the forum 12(!) years ago, which covers the same ground: http://www.purebasic.fr/english/viewtop ... 06#p412706. I'm still using it.