Metaphone

Everything else that doesn't fall into one of the other PB categories.
User avatar
RichAlgeni
Addict
Addict
Posts: 935
Joined: Wed Sep 22, 2010 1:50 am
Location: Bradenton, FL

Metaphone

Post by RichAlgeni »

Has anyone written a PB Metaphone process? I'm not real happy with the one I am using. Seems like it could be more efficient, but when I tweak it one way, it seems to be deficient in another.

I was hoping for a fresh approach.

Thanks!
ebs
Enthusiast
Enthusiast
Posts: 567
Joined: Fri Apr 25, 2003 11:08 pm

Re: Metaphone

Post by ebs »

Rich,

This code is not mine; I think I translated it from a Visual Basic version I found, but I don't remember.
It could also be someone else's PB work, so I don't want to claim any credit. In any case it works fine!

Regards,
Eric

Code: Select all

#VOWELS = "AEIOU"   ; vowels
#FRONTV = "EIY"     ; special cases for letters in FRONT of these
#VARSON = "CSPTG"   ; variable sound - those modified by adding "H"
#DOUBLE = "."       ; let these double letters through

#excpPAIR = "AGKPW" ; exceptions "AE-", "GN-", "KN-", "PN-", "WR-"
#nextLTR =  "ENNNR"

Procedure.s MetaPhone(name.s, metalen.l)
  metaph.s 
  
  ii.l
  Pos.l
  
  namelen.l
  
  curLtr.s
  prevLtr.s
  nextLtr.s
  nextLtr2.s
  nextLtr3.s
   
  vowelAfter.l
  vowelBefore.l
  frontvAfter.l
  
  wname.s
  uName.s
  Char.s
  
  ; uppercase string
  uName = UCase(name)
  ; remove non-alpha characters
  For ii = 1 To Len(uName)
    Char = Mid(uName, ii, 1)
    If Char >= "A" And Char <= "Z"
      wname + Char
    EndIf
  Next
  ; done if no letters!
  namelen = Len(wname)
  If namelen = 0
      ProcedureReturn
  EndIf
  
  ; if AE, GN, KN, PN, WR then drop first letter
  Pos = FindString(#excpPAIR, Left(wname, 1), 1)
  If Pos
    If Mid(wname, 2, 1) = Mid(#nextLTR, Pos, 1)
      wname = Right(wname, namelen-1)
    EndIf
  EndIf
  
  ; change X to S
  If Left(wname, 1) = "X"
    wname = "S" + Right(wname, namelen-1)
  EndIf
  
  ; get rid of the H in WH
  If Left(wname, 2) = "WH"
    wname = "W" + Right(wname, namelen-2) 
  EndIf

  ; remove an S from the end of the string
  If Right(wname, 1) = "S"
    wname = Left(wname, namelen-1)
  EndIf
  
  namelen = Len(wname)
  For ii = 1 To namelen
    ; stop when maximum length reached
    If Len(metaph) > metalen
      Break
    EndIf
    
    curLtr = Mid(wname, ii, 1)

    ; if first letter is a vowel KEEP it
    If ii = 1 And FindString(#VOWELS, curLtr, 1)
      metaph + curLtr
      Continue
    EndIf
    
    ; check previous letter
    If ii > 1
      prevLtr = Mid(wname, ii-1, 1)
      vowelBefore = FindString(#VOWELS, prevLtr, 1)
    Else
      prevLtr = ""
      vowelBefore = #False
    EndIf
    
    ; check next letter
    If ii < namelen
      nextLtr = Mid(wname, ii+1, 1)
      vowelAfter = FindString(#VOWELS, nextLtr, 1)
      frontvAfter = FindString(#FRONTV, nextLtr, 1)
    Else
      nextLtr = ""
      vowelAfter = #False
      frontvAfter = #False
    EndIf
    
    ; skip double letters except ones in list
    If curLtr = nextLtr And FindString(#DOUBLE, nextLtr, 1) = 0
      Continue
    EndIf
    
    ; check 2 letters ahead
    If ii < namelen - 1
      nextLtr2 = Mid(wname, ii+2, 1)
    Else
      nextLtr2 = "" 
    EndIf
    
    ; check 3 letters ahead
    If ii < namelen - 2
      nextLtr3 = Mid(wname, ii+3, 1)
    Else
      nextLtr3 = "" 
    EndIf
      
    Select curLtr
      Case "B"
        ; keep B unless name ends in MB
        If ii <> namelen Or prevLtr <> "M"
          metaph + "B"
        EndIf
        
      Case "C"
        ; silent C in -SCI- -SCE- -SCY- SCI- etc
        If ii < 2 Or prevLtr <> "S" Or frontvAfter = #False
          ; -CIA- = X
          If ii > 1 And nextLtr = "I" And nextLtr2 = "A"
            metaph + "X"
          Else
            If frontvAfter
              metaph + "S"
            Else
              ; C in -SCH- = K
              If ii > 2 And prevLtr = "S" And nextLtr = "H"
                metaph + "K"
              Else
                If nextLtr = "H"
                  If ii = 1 And FindString(#VOWELS, nextLtr2, 1) = 0
                    ; -CH<consonant> = K
                    metaph + "K"
                  Else
                    ; -CH<vowel> = X
                    metaph + "X"
                  EndIf
                Else
                  If prevLtr = "C"
                    metaph + "C"
                  Else
                    metaph + "K"
                  EndIf
                EndIf
              EndIf
            EndIf
          EndIf
        EndIf

      Case "D"
        ; -DG<frontvowel> = J, else T
        If nextLtr = "G" And FindString(#FRONTV, nextLtr2, 1)
          metaph + "J"
        Else
          metaph + "T"
        EndIf
        
      Case "G"
        ; silent G in -GH- except for -GH and no vowel after H
        If ii < namelen - 1 And nextLtr = "H" And FindString(#VOWELS, nextLtr2, 1) = 0
        ; silent G in -GNED- and -GN
        ElseIf ii = namelen - 3 And nextLtr = "N" And nextLtr2 = "E" And nextLtr3 = "D"
        ElseIf ii = namelen - 1 And nextLtr = "N"
        ElseIf prevLtr = "D" And frontvAfter
        Else
          If frontvAfter And prevLtr <> "G"
            metaph + "J"
          Else
            ; hard GG
            metaph + "K"
          EndIf
        EndIf
        
      Case "H"
        ; silent H if preceeded by certain consonants
        If FindString(#VARSON, prevLtr, 1)
        ; silent H in <vowel>H<consonant>
        ElseIf vowelBefore And vowelAfter = #False
        Else
          metaph + "H"
        EndIf
        
      Case "F"
        metaph + curLtr
      Case "J"
        metaph + curLtr
      Case "L"
        metaph + curLtr
      Case "M"
        metaph + curLtr
      Case "N"
        metaph + curLtr
      Case "R"
        metaph + curLtr
        
      Case "K"
        If prevLtr <> "C"
          metaph + "K"
        EndIf
      
      Case "P"
        If nextLtr = "H"
          metaph + "F"
        Else
          metaph + "P" 
        EndIf
        
      Case "Q"
        metaph + "K"
        
      Case "S"
        ; -SH-, -SIO-, -SIA- = X
        If nextLtr = "H" Or (ii > 2 And nextLtr = "I" And (nextLtr2 = "O" Or nextLtr2 = "A"))
          metaph + "X"
        Else
          metaph + "S"
        EndIf
        
      Case "T"
        ; -TIO-, -TIA- = X
        If ii > 2 And nextLtr = "I" And (nextLtr2 = "O" Or nextLtr2 = "A")
          metaph + "X"
        Else  ; THE = 0, THO = T, WITHROW = 0
          If nextLtr = "H"
            If ii > 1 Or FindString(#VOWELS, nextLtr2, 1)
              metaph + "0"
            Else
              metaph + "T"
            EndIf
          Else
            If ii >= namelen - 2 Or nextLtr <> "C" Or nextLtr2 <> "H"
              metaph + "T"
            EndIf
          EndIf
        EndIf
        
      Case "V"
        metaph + "F"
        
      Case "W"
        If (ii < namelen) And vowelAfter
          metaph + curLtr
        EndIf
      Case "Y"
        If (ii < namelen) And vowelAfter
          metaph + curLtr
        EndIf
        
      Case "X"
        metaph + "KS"
        
      Case "Z"
        metaph + "S"
    EndSelect
  Next
  
  ProcedureReturn metaph
EndProcedure

Repeat
  iname.s = InputRequester("Metaphone Test", "Enter word or name", "")
  If iname
    MessageRequester("Metaphone Test", "Metaphone code for " + iname + " is " + MetaPhone(iname, 20))
  Else
    Break
  EndIf
ForEver
User avatar
RichAlgeni
Addict
Addict
Posts: 935
Joined: Wed Sep 22, 2010 1:50 am
Location: Bradenton, FL

Re: Metaphone

Post by RichAlgeni »

Thanks Eric!

What did you think of it? Did it do what you needed?

I was using an older version to look up business names, it did ok, but not as good as I had hoped.
ebs
Enthusiast
Enthusiast
Posts: 567
Joined: Fri Apr 25, 2003 11:08 pm

Re: Metaphone

Post by ebs »

Rich,

I used it to do some "fuzzy matching" to look up names in a database when
the user might only know how the name "sounds", but not the exact spelling.

It did a pretty good job; it allowed me to present a list of possible matches for the user to pick from.
It was certainly better than the Soundex algorithm I started with!

Regards,
Eric
User avatar
RichAlgeni
Addict
Addict
Posts: 935
Joined: Wed Sep 22, 2010 1:50 am
Location: Bradenton, FL

Re: Metaphone

Post by RichAlgeni »

Thanks again Eric!
Post Reply