CSpell

Phonetic Similarity Score

Introduction

This page describes the algorithm for calculating a similarity score using a phonetic approach. The idea is to convert both the token and the candidate to Metaphone codes, then compute an edit distance score between the two codes.

Algorithm

  • Convert each string to its Double Metaphone code, with a maximum code length of 10
  • Get the edit distance between the two Metaphone codes, using these costs:
    • delete cost = 95
    • insert cost = 95
    • replace cost = 100
    • swap cost = 90
    • case change cost = 10
    • split cost = insert cost = 95, for each split
  • Get the penalty for splits
  • Similarity Score = Edit Distance + penalty, using a ceiling of 1000 to keep the result in the range 0.00 <= similarity score <= 1.00
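
The steps above can be sketched roughly as follows. This is a minimal illustration, not the CSpell implementation. It assumes the Double Metaphone codes have already been computed elsewhere and are passed in as plain strings. The function names, the split_penalty parameter, and the mapping of cost onto the 0.00 to 1.00 range (here assumed to be 1 - cost / ceiling) are hypothetical choices made for the example. A split is treated as an ordinary insertion, so it costs 95.

```python
# Costs taken from the list above.
DELETE_COST = 95
INSERT_COST = 95   # also applies to each split (an inserted space)
REPLACE_COST = 100
SWAP_COST = 90
CASE_COST = 10
CEILING = 1000


def substitution_cost(a, b):
    """Cost of turning character a into character b."""
    if a == b:
        return 0
    if a.lower() == b.lower():
        return CASE_COST  # same letter, different case
    return REPLACE_COST


def weighted_edit_distance(src, tgt):
    """Weighted edit distance with adjacent-character swaps."""
    rows, cols = len(src) + 1, len(tgt) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * DELETE_COST
    for j in range(1, cols):
        dist[0][j] = j * INSERT_COST
    for i in range(1, rows):
        for j in range(1, cols):
            best = min(
                dist[i - 1][j] + DELETE_COST,
                dist[i][j - 1] + INSERT_COST,
                dist[i - 1][j - 1] + substitution_cost(src[i - 1], tgt[j - 1]),
            )
            # Swap of two adjacent, different characters.
            if (i > 1 and j > 1
                    and src[i - 1] == tgt[j - 2]
                    and src[i - 2] == tgt[j - 1]
                    and src[i - 1] != src[i - 2]):
                best = min(best, dist[i - 2][j - 2] + SWAP_COST)
            dist[i][j] = best
    return dist[-1][-1]


def similarity_score(src_code, tgt_code, split_penalty=0):
    """Map cost + penalty onto 0.00..1.00.

    ASSUMPTION: the score is 1 - cost / CEILING; the page only states
    the ceiling and the resulting range. split_penalty is hypothetical,
    since the penalty calculation is not given here.
    """
    cost = min(weighted_edit_distance(src_code, tgt_code) + split_penalty, CEILING)
    return 1.0 - cost / CEILING
```

Under these assumptions, two identical codes score 1.00, and each edit lowers the score by its cost divided by 1000. A single swap, for example, costs 90 and lowers the score by 0.09.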

Source Code: