CSpell

Performance Tests on Ranking Systems (2-Stage)

I. Test Setup

II. Test Results

Stage-1	Stage-2	Precision	Recall	F1
1-Stage Single Ranking
Orthographic	N/A	0.7606	0.7636	0.7621
Word Frequency	N/A	0.6970	0.6925	0.6948
Noisy Channel	N/A	0.7134	0.7171	0.7152
Context Score	N/A	0.8035	0.5917	0.6815
1-Stage Combined Ranking
Ensemble	N/A	0.7516	0.7545	0.7531
2-Stage Single Ranking
Orthographic	Word Frequency	0.8241	0.7687	0.7955
Orthographic	Noisy Channel	0.8255	0.7700	0.7968
Orthographic	Context Score	0.8996	0.5672	0.6957
2-Stage Combined Ranking: CSpell
Orthographic	Context Score, Noisy Channel	0.8407	0.7842	0.8115
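
As a consistency check, the F1 column can be recomputed from the precision and recall columns as their harmonic mean. The sketch below is illustrative only: `ROWS`, `f1`, and `max_f1_deviation` are names introduced here, not part of CSpell. Because the reported precision and recall are rounded to four decimals, the recomputed F1 is only expected to agree with the reported value to within roughly 0.0001.

```python
# Rows copied from the results table above:
# (stage-1 ranker, stage-2 ranker, precision, recall, reported F1)
ROWS = [
    ("Orthographic", "N/A", 0.7606, 0.7636, 0.7621),
    ("Word Frequency", "N/A", 0.6970, 0.6925, 0.6948),
    ("Noisy Channel", "N/A", 0.7134, 0.7171, 0.7152),
    ("Context Score", "N/A", 0.8035, 0.5917, 0.6815),
    ("Ensemble", "N/A", 0.7516, 0.7545, 0.7531),
    ("Orthographic", "Word Frequency", 0.8241, 0.7687, 0.7955),
    ("Orthographic", "Noisy Channel", 0.8255, 0.7700, 0.7968),
    ("Orthographic", "Context Score", 0.8996, 0.5672, 0.6957),
    ("Orthographic", "Context Score, Noisy Channel", 0.8407, 0.7842, 0.8115),
]


def f1(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_deviation(rows=ROWS) -> float:
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1(p, r) - reported) for _, _, p, r, reported in rows)


if __name__ == "__main__":
    for stage1, stage2, p, r, reported in ROWS:
        print(f"{stage1:15s} {stage2:30s} "
              f"recomputed={f1(p, r):.4f} reported={reported:.4f}")
```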

III. Discussion

Stage-1 ranking:
- The non-word spelling and split candidate generator, which relies on the edit distance measure alone, generates irrelevant candidates.
- Orthographic similarity scores, which include phonetic and overlap similarity, were used to exclude irrelevant candidates.
- Among the 1-stage single rankers, the orthographic score ranking has the highest F1 (0.7621).
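
The stage-1 idea of pruning irrelevant candidates by a similarity score can be sketched as follows. This is a minimal illustration under stated assumptions, not CSpell's implementation: `similarity` uses Python's `difflib.SequenceMatcher` as a stand-in for the orthographic score (it does not model phonetic or overlap similarity), and the `keep_ratio` cut-off and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(source: str, candidate: str) -> float:
    """Stand-in similarity in [0, 1]; NOT CSpell's orthographic score."""
    return SequenceMatcher(None, source.lower(), candidate.lower()).ratio()


def stage1_filter(source, candidates, keep_ratio=0.9):
    """Keep candidates scoring within keep_ratio of the best score.

    keep_ratio is a hypothetical cut-off chosen for illustration.
    Survivors are returned best-first and would be passed on to the
    stage-2 ranker.
    """
    if not candidates:
        return []
    scored = [(similarity(source, c), c) for c in candidates]
    best = max(score for score, _ in scored)
    kept = [(s, c) for s, c in scored if s >= keep_ratio * best]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in kept]


if __name__ == "__main__":
    # "xyz" shares no characters with the source, so it is pruned.
    print(stage1_filter("diabtes", ["xyz", "diabetes"]))
```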
Stage-2 ranking:
In both the 1-stage and 2-stage results, we observed:
- Context score ranking had the highest precision (0.8035 in 1-stage, 0.8996 in 2-stage) but the lowest recall.
- Noisy Channel ranking had higher recall and F1 than word frequency; in the 1-stage results, F1 improved by 2.04 percentage points (from 69.48% to 71.52%).
- Thus, the stage-2 ranking uses a chain comparator that applies context score first (for high precision), then noisy channel (to increase recall).
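
A chain comparator of this kind can be sketched as below: candidates are compared by the first criterion, and the next criterion is consulted only when the previous one cannot separate them. This is an assumption-laden illustration rather than CSpell's code: the `Candidate` fields, the scores, and the rule that only an exact tie in context score falls through to noisy channel are all hypothetical.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class Candidate:
    word: str
    context_score: float        # hypothetical field
    noisy_channel_score: float  # hypothetical field


def by_context(a: Candidate, b: Candidate) -> int:
    """Three-way comparison on context score (-1, 0, or 1)."""
    return (a.context_score > b.context_score) - (a.context_score < b.context_score)


def by_noisy_channel(a: Candidate, b: Candidate) -> int:
    """Three-way comparison on noisy channel score (-1, 0, or 1)."""
    return ((a.noisy_channel_score > b.noisy_channel_score)
            - (a.noisy_channel_score < b.noisy_channel_score))


def chain(*comparators):
    """Combine comparators; the first non-zero result decides."""
    def compare(a, b):
        for comparator in comparators:
            result = comparator(a, b)
            if result != 0:
                return result
        return 0
    return compare


def rank(candidates):
    """Sort best-first: context score, then noisy channel on ties."""
    key = cmp_to_key(chain(by_context, by_noisy_channel))
    return sorted(candidates, key=key, reverse=True)


if __name__ == "__main__":
    pool = [
        Candidate("then", 0.0, 0.3),
        Candidate("than", 0.0, 0.5),
        Candidate("thin", 0.2, 0.1),
    ]
    print([c.word for c in rank(pool)])
```

In this toy pool, "thin" wins on context score alone, while "than" and "then" tie on context score and are separated by the noisy channel score.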
Compare 1-stage and 2-stage:
- The best 1-stage ranking is Orthographic, with an F1 of 0.7621.
  => This is better than the combined technique, Ensemble (F1 of 0.7531).
- The best 2-stage ranking is CSpell, with an F1 of 0.8115.
- The improvement in F1 from the best 1-stage ranking to the best 2-stage ranking is 4.94 percentage points (from 76.21% to 81.15%).
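
The improvement figures quoted in this discussion are absolute differences in F1, expressed in percentage points. The short sketch below reproduces that arithmetic and also shows the corresponding relative gain for comparison; the helper names are introduced here for illustration only.

```python
def point_gain(before: float, after: float) -> float:
    """Absolute F1 gain in percentage points."""
    return (after - before) * 100


def relative_gain(before: float, after: float) -> float:
    """F1 gain as a percentage of the starting value."""
    return (after - before) / before * 100


if __name__ == "__main__":
    # Best 1-stage (Orthographic) vs. best 2-stage (CSpell).
    print(f"{point_gain(0.7621, 0.8115):.2f} points")
    print(f"{relative_gain(0.7621, 0.8115):.2f}% relative")
    # 1-stage Word Frequency vs. 1-stage Noisy Channel.
    print(f"{point_gain(0.6948, 0.7152):.2f} points")
```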