CSpell

Performance Tests on Ranking Systems (2-Stage)

I. Test Setup

  • Data: Training Set
  • Gold Standard: non-word only
  • Dictionary: CSpell
  • Corpus: Consumer Health corpus
  • Ranking: Combined scores

II. Test Results

Stage-1         Stage-2                       Precision  Recall  F1
--------------  ----------------------------  ---------  ------  ------
1-Stage Single Ranking
Orthographic    N/A                           0.7606     0.7636  0.7621
Word Frequency  N/A                           0.6970     0.6925  0.6948
Noisy Channel   N/A                           0.7134     0.7171  0.7152
Context Score   N/A                           0.8035     0.5917  0.6815
1-Stage Combined Ranking
Ensemble        N/A                           0.7516     0.7545  0.7531
2-Stage Single Ranking
Orthographic    Word Frequency                0.8241     0.7687  0.7955
Orthographic    Noisy Channel                 0.8255     0.7700  0.7968
Orthographic    Context Score                 0.8996     0.5672  0.6957
2-Stage Combined Ranking: CSpell
Orthographic    Context Score, Noisy Channel  0.8407     0.7842  0.8115
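
The F1 column can be cross-checked against the precision and recall columns. The sketch below assumes F1 is the standard harmonic mean of precision and recall, which this page does not state explicitly. The helper names f1_score and point_gain are illustrative and are not part of CSpell. Because precision and recall are rounded to four places, recomputed values can differ from the reported ones by about 0.0001.

```python
# Rows copied from the results table above:
# (stage-1, stage-2, precision, recall, reported F1).
ROWS = [
    ("Orthographic", "N/A", 0.7606, 0.7636, 0.7621),
    ("Word Frequency", "N/A", 0.6970, 0.6925, 0.6948),
    ("Noisy Channel", "N/A", 0.7134, 0.7171, 0.7152),
    ("Context Score", "N/A", 0.8035, 0.5917, 0.6815),
    ("Ensemble", "N/A", 0.7516, 0.7545, 0.7531),
    ("Orthographic", "Word Frequency", 0.8241, 0.7687, 0.7955),
    ("Orthographic", "Noisy Channel", 0.8255, 0.7700, 0.7968),
    ("Orthographic", "Context Score", 0.8996, 0.5672, 0.6957),
    ("Orthographic", "Context Score, Noisy Channel", 0.8407, 0.7842, 0.8115),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def point_gain(f1_before, f1_after):
    """Difference between two F1 values, in percentage points."""
    return (f1_after - f1_before) * 100


# Print the recomputed F1 next to the reported one for each row.
for stage1, stage2, p, r, reported in ROWS:
    print(f"{stage1:15s} {stage2:29s} {f1_score(p, r):.4f} (reported {reported:.4f})")

# Gain from the best 1-stage ranking to the 2-stage CSpell ranking.
print(f"{point_gain(0.7621, 0.8115):.2f} percentage points")
```

Working in percentage points rather than relative percent keeps the gains directly comparable with the differences between rows of the table.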

III. Discussion

  • Stage-1 ranking:
    • A non-word spelling and split candidate generator that relies on edit distance alone generates irrelevant candidates.
    • Orthographic similarity scores, which include phonetic and overlap similarity, were used to exclude irrelevant candidates.
    • Among the 1-stage single rankings, the orthographic score ranking has the highest F1 (0.7621).
  • Stage-2 ranking:
    In both the 1-stage and 2-stage rankings, we observed:
    • Context score ranking had the highest precision (0.8035 in 1-stage, 0.8996 in 2-stage).
    • Noisy channel ranking had higher recall and F1 than word frequency ranking; in the 1-stage ranking, F1 improved by 2.04 percentage points (from 69.48% to 71.52%).
    • Thus, the stage-2 ranking uses a chain comparator that applies context score first (for high precision), then noisy channel (to increase recall).
  • Comparison of 1-stage and 2-stage:
    • The best 1-stage ranking is Orthographic, with an F1 of 0.7621.
      => This is better than the combined Ensemble technique (F1 of 0.7531).
    • The best 2-stage ranking is CSpell, with an F1 of 0.8115.
    • The improvement from the best 1-stage ranking to the best 2-stage ranking is 4.94 percentage points (from 76.21% to 81.15%).
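
The chain comparator used for stage-2 ranking can be illustrated with a minimal sketch. It assumes that "chain" means a lexicographic comparison: candidates are ordered by context score first, and noisy channel score is consulted only to break ties. The actual CSpell implementation may handle ties or missing context differently. The Candidate class, the rank_stage2 function, and all scores below are hypothetical values made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A correction candidate that already passed stage-1 filtering (hypothetical type)."""
    word: str
    context_score: float
    noisy_channel_score: float


def chain_key(candidate):
    """Sort key: context score first, noisy channel score as the tie-breaker."""
    return (candidate.context_score, candidate.noisy_channel_score)


def rank_stage2(candidates):
    """Return candidates ordered best first under the assumed chain comparison."""
    return sorted(candidates, key=chain_key, reverse=True)


# Made-up scores: two candidates tie on context score, so the
# noisy channel score decides their relative order.
candidates = [
    Candidate("diabetic", 0.35, 0.0050),
    Candidate("diabetics", 0.42, 0.0007),
    Candidate("diabetes", 0.42, 0.0031),
]

ranked = rank_stage2(candidates)
print([c.word for c in ranked])
```

Putting the high-precision score first means the noisy channel score only affects the order when the context score cannot separate two candidates, which matches the stated intent of using context for precision and noisy channel for recall.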