Remove diacritic chars

Zapman
Enthusiast
Posts: 205
Joined: Tue Jan 07, 2020 7:27 pm

Re: Remove diacritic chars

Post by Zapman »

Sorry for my successive posts.
I just optimized the preceding procedure. It is now 400 times faster!

Code:

Procedure.s RemoveAccents(Text$)
  ; Function to remove accents from a string. By Zapman.
  ; Windows only (FoldString API). The "+ 2" offsets assume 2-byte Unicode characters.
  If Text$
    Protected Length = Len(Text$) * 2
    Protected OPos, NPos, DoubleLength
    Protected NormalizedText$ = Space(Length)
    ;
    ; FoldString_() replaces each accented character with a pair of characters,
    ; like this: (unaccented character) + (diacritic).
    ; With a source length of -1, the returned count includes the terminating null,
    ; so the destination size is Length + 1 (Space() leaves room for the null).
    Length = FoldString_(#MAP_COMPOSITE, @Text$, -1, @NormalizedText$, Length + 1) - 1
    ;
    ; Examine the result (nothing to do if the call failed or nothing was decomposed):
    If Length > 0 And Length <> Len(Text$)
      DoubleLength = (Length - 1) * 2
      For NPos = 0 To DoubleLength Step 2
        If PeekC(@Text$ + OPos) <> PeekC(@NormalizedText$ + NPos)
          ; If the character has been replaced, write the unaccented one into the original text:
          PokeC(@Text$ + OPos, PeekC(@NormalizedText$ + NPos))
          ; The following character contains the diacritic. Jump over it.
          ; Limitation: a character carrying more than one diacritic is not handled here.
          NPos + 2
        EndIf
        OPos + 2
      Next
    EndIf
    ;
    ProcedureReturn Text$
  EndIf
EndProcedure
On my computer, it takes 130 ms for a 1-million-character string.
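For anyone who wants to try it or reproduce the timing, here is a minimal sketch of how the procedure might be called and measured. It assumes RemoveAccents() is declared earlier in the same source file and that the file is saved as UTF-8; Sample$, Big$ and the repeat count are made up for illustration and are not from the original test.

```purebasic
; Usage and timing sketch (assumptions: RemoveAccents() is declared above,
; Windows target, source file saved as UTF-8).
Define Sample$ = "Éléphant à l'école. "                  ; 20 characters, hypothetical sample
; Replace each of 50,000 spaces with Sample$ to get a 1,000,000-character string
Define Big$ = ReplaceString(Space(50000), " ", Sample$)
Define Start.q
Define Elapsed.q
Define Result$

Debug RemoveAccents(Sample$)                             ; quick visual check in the debug window

Start = ElapsedMilliseconds()
Result$ = RemoveAccents(Big$)
Elapsed = ElapsedMilliseconds() - Start

; MessageRequester() is used so the figure can also be read with the debugger off
MessageRequester("RemoveAccents", Str(Len(Big$)) + " characters in " + Str(Elapsed) + " ms")
```

The figure will likely differ from machine to machine, and it is probably best measured with the debugger disabled, since the debugger tends to slow loops like this one down a lot.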
jacdelad
Addict
Posts: 2010
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Remove diacritic chars

Post by jacdelad »

Thanks again, I'll try this when I'm programming again.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
CalamityJames
User
Posts: 81
Joined: Sat Mar 13, 2010 4:50 pm

Re: Remove diacritic chars

Post by CalamityJames »

It isn't quite what you are asking for, but here is some code written by Wilbert in answer to a question I asked on the forum 12(!) years ago. It covers the same ground: http://www.purebasic.fr/english/viewtop ... 06#p412706. I'm still using it.