CSpell

Performance Tests on Test Set

I. Test Setup

  • Data: Test Set
  • Baseline: the Ensemble program from Dr. Kilicoglu, an enhanced version of the approach described in the Ensemble paper.

II. Test Results

  • Non-word Only (Precision, Recall, and F1 are derived from TP, T. Ret, and T. Rel; see the metric-check sketch at the end of this section):

    Non-word, Detection
    Method     TP    FP    FN    T. Ret  T. Rel  Precision  Recall  F1
    Ensemble   736   230   238   966     974     0.7619     0.7556  0.7588
    CSpell     852   118   122   970     974     0.8784     0.8747  0.8765

    Non-word, Correction
    Method     TP    FP    FN    T. Ret  T. Rel  Precision  Recall  F1
    Ensemble   598   368   376   966     974     0.6190     0.6140  0.6165
    CSpell     743   227   231   970     974     0.7660     0.7628  0.7644

  • Real-word Included:

    Real-word Included, Detection
    Method     TP    FP    FN    T. Ret  T. Rel  Precision  Recall  F1
    Ensemble   665   145   513   810     1178    0.8210     0.5645  0.6690
    CSpell     874   108   304   982     1178    0.8900     0.7419  0.8093

    Real-word Included, Correction
    Method     TP    FP    FN    T. Ret  T. Rel  Precision  Recall  F1
    Ensemble   565   245   613   810     1178    0.6975     0.4796  0.5684
    CSpell     747   235   431   982     1178    0.7607     0.6341  0.6917

  • Real-word correction, elapsed running time:
    • Ensemble:
      • 1st: 34'13" => 2053 sec.
      • 2nd: 34'35" => 2075 sec.
      • Avg. 34'24" => 2064 sec.
    • CSpell:
      • 181.28 sec.
    • Accordingly, CSpell is about 11.38 (= 2064/181.28) times faster than Ensemble.
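
  • Metric and speed-up check (a minimal sketch in Python, not part of CSpell or
    Ensemble): in the tables above, Precision = TP / T. Ret (total retrieved),
    Recall = TP / T. Rel (total relevant), and F1 is their harmonic mean; the
    speed-up is the ratio of the average elapsed times. The row values below are
    taken from the CSpell non-word detection table.

      # Recompute Precision/Recall/F1 for one table row, plus the speed-up.
      def prf1(tp, total_retrieved, total_relevant):
          precision = tp / total_retrieved      # TP / T. Ret
          recall = tp / total_relevant          # TP / T. Rel
          f1 = 2 * precision * recall / (precision + recall)
          return precision, recall, f1

      print(prf1(852, 970, 974))   # ~ (0.8784, 0.8747, 0.8765) after rounding
      print(34 * 60 + 24)          # average Ensemble running time: 2064 sec
      print(2064 / 181.28)         # ~ 11.38, i.e. roughly 11.4x faster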

III. Discussion

  • The improvement in F1 from Ensemble to CSpell for non-word detection and correction is 11.77 and 14.79 percentage points, respectively (see the check at the end of this section).
  • The improvement in F1 from Ensemble to CSpell for real-word detection and correction is 14.03 and 12.33 percentage points, respectively.
  • The test set is harder for spelling correction because it was sampled from the questions with the highest OOV rate. Its error rate (0.07) is much higher than that of the training set (0.04). Accordingly, both CSpell and Ensemble performed worse on the test set than on the development set.
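
  • A quick check of the improvement figures (a sketch, assuming they are
    absolute F1 differences expressed as percentage points, which matches the
    F1 columns in Section II):

      # Delta = 100 * (CSpell F1 - Ensemble F1), in percentage points.
      f1 = {
          ("non-word", "detection"):   (0.7588, 0.8765),
          ("non-word", "correction"):  (0.6165, 0.7644),
          ("real-word", "detection"):  (0.6690, 0.8093),
          ("real-word", "correction"): (0.5684, 0.6917),
      }
      for task, (ensemble, cspell) in f1.items():
          print(task, round(100 * (cspell - ensemble), 2))
      # -> 11.77, 14.79, 14.03, 12.33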