CSpell

Orthographic Score

Introduction

This page describes the ranking algorithm used to choose the correct word from the suggested candidates for a misspelled word.

Algorithm

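The orthographic score is a weighted sum of three similarity scores: the token similarity score, the phonetic similarity score, and the leading/trailing character overlap similarity score.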

That is:
Orthographic score = wf1 * Token similarity score + wf2 * Phonetic similarity score + wf3 * Overlap similarity score

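where wf1, wf2, and wf3 are the weighting factors for the three similarity scores. The example below uses wf1 = 1.00, wf2 = 0.70, and wf3 = 0.80.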

Source Code:

Example:
Orthographic score between the misspelling "truely" and the candidate "truly":

Token similarity score:
There is one delete operation ('e' is deleted) from truely to truly. The normalized delete cost is 0.096.
The token similarity score is calculated as:
= 1.0 - 0.096
= 0.904
Phonetic similarity score:
The phonetic representation (Double Metaphone code) is the same for truely and truly [TRL], so the phonetic similarity score is 1.0.
Leading/trailing character overlap similarity score:
= (leading overlap characters + trailing overlap characters) / length of the longer term
= (3+2)/6
= 0.83
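
The overlap calculation above can be sketched in Python as follows. This is an illustrative sketch, not the CSpell source: the function names are hypothetical, and two behaviors are assumptions not stated on this page (the trailing overlap is capped so that no character of the shorter term is counted twice, and two empty strings score 1.0).

```python
def leading_overlap(a: str, b: str) -> int:
    """Count the characters that match from the start of both strings."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def trailing_overlap(a: str, b: str) -> int:
    """Count the characters that match from the end of both strings."""
    return leading_overlap(a[::-1], b[::-1])


def overlap_similarity(a: str, b: str) -> float:
    """(leading overlap + trailing overlap) / length of the longer term."""
    longer = max(len(a), len(b))
    if longer == 0:
        return 1.0  # assumption: two empty strings are treated as identical
    lead = leading_overlap(a, b)
    # Assumption: cap the trailing overlap so that the two counts together
    # never exceed the length of the shorter term.
    trail = min(trailing_overlap(a, b), min(len(a), len(b)) - lead)
    return (lead + trail) / longer


# "truely" vs "truly": leading "tru" (3) + trailing "ly" (2), longer length 6
print(round(overlap_similarity("truely", "truly"), 2))  # 0.83
```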
Orthographic score:
= 1.00 * 0.904 + 0.70 * 1.0 + 0.80 * 0.83
= 2.27
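
The final weighted sum can be reproduced with a few lines of Python. This is a minimal sketch, assuming the three similarity scores have already been computed. The function name is hypothetical, and the default weights are simply the values used in this example, not necessarily the values in the CSpell configuration.

```python
def orthographic_score(token_sim: float,
                       phonetic_sim: float,
                       overlap_sim: float,
                       wf1: float = 1.00,
                       wf2: float = 0.70,
                       wf3: float = 0.80) -> float:
    """Weighted sum of the three similarity scores.

    The default weights are taken from the worked example on this page.
    """
    return wf1 * token_sim + wf2 * phonetic_sim + wf3 * overlap_sim


# Values from the example: truely -> truly
token_sim = 1.0 - 0.096   # 1.0 minus the normalized delete cost
phonetic_sim = 1.0        # identical Double Metaphone codes
overlap_sim = (3 + 2) / 6

print(round(orthographic_score(token_sim, phonetic_sim, overlap_sim), 2))  # 2.27
```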