CSpell

Performance Tests for Ensemble

I. Introduction

Performance tests are run on the test set with the Ensemble spelling method, which serves as the baseline for comparison with CSpell.

II. Setup

Program:
${C_SPELL}/SpellCorrection/bin/runSpellingAllData
4 (CSpell data - NER)
3, 4 (nonword, real-word)
4 (methods)
InData:
${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultCSpellData/
OutData:
${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultCSpellData/LinearWeighted_nw_OUT_4
${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultCSpellData/LinearWeighted_rw_OUT_4
Backup on:
- ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultCSpellData.baseline
- ${C_SPELL}/PostProcess/data/Test/NewTest/TestData/9_Baseline/Offical/*

III. Performance Results

Methods	Revised GoldStd TP|Ret|Rel	Precision|Recall|F1
4. Ensemble	559|966|974	0.5787|0.5739|0.5763

Methods	Revised GoldStd TP|Ret|Rel	Precision|Recall|F1
4. Ensemble (NW)	560|966|1178	0.5797|0.4754|0.5224
4. Ensemble (RW)	520|810|1178	0.6420|0.4414|0.5231

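The non-word (NW) and real-word (RW) options of Ensemble give nearly the same F1 (0.5224 vs. 0.5231). The real-word option has higher precision (0.6420 vs. 0.5797) but lower recall (0.4414 vs. 0.4754).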
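
The precision, recall, and F1 figures in the tables above can be reproduced from the TP|Ret|Rel counts. The following is a minimal sketch, assuming that TP is the number of correct corrections, Ret is the number of corrections returned by the method, and Rel is the number of corrections in the gold standard. The function name prf is hypothetical and is not part of CSpell.

```python
def prf(tp, ret, rel):
    """Return (precision, recall, F1), rounded to 4 decimals.

    Assumed meanings (not stated in the original page):
      tp  - true positives (correct corrections)
      ret - retrieved (corrections returned by the method)
      rel - relevant (corrections in the gold standard)
    """
    precision = tp / ret if ret else 0.0
    recall = tp / rel if rel else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return round(precision, 4), round(recall, 4), round(f1, 4)


# Counts taken from the tables in section III.
rows = {
    "4. Ensemble": (559, 966, 974),
    "4. Ensemble (NW)": (560, 966, 1178),
    "4. Ensemble (RW)": (520, 810, 1178),
}

for name, (tp, ret, rel) in rows.items():
    p, r, f1 = prf(tp, ret, rel)
    print(f"{name}\t{tp}|{ret}|{rel}\t{p:.4f}|{r:.4f}|{f1:.4f}")
```

Under these assumptions, the computed values match the reported ones to four decimal places, which supports reading the columns as true positives, retrieved, and relevant.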