Performance Tests - Ensemble on Training Set
I. Introduction
Performance tests are conducted on the different ranking methods of Ensemble Spelling (original code), using the training set.
II. Setup
Program: ${C_SPELL}/SpellCorrection/bin/runSpellingAllData
Options: 0 (all data); 3, 4 (non-word, real-word); 0,1,2,3,4 (methods)
Input: ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/AllData/
Output (non-word): ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultAllData/LinearWeighted_nw_OUT_*
Output (real-word): ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultAllData/LinearWeighted_rw_OUT_*
Backup on:
III. Performance Results
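Methods | TP (Original GoldStd) | Ret (Original GoldStd) | Rel (Original GoldStd) | Precision (Original GoldStd) | Recall (Original GoldStd) | F1 (Original GoldStd) | TP (Revised GoldStd) | Ret (Revised GoldStd) | Rel (Revised GoldStd) | Precision (Revised GoldStd) | Recall (Revised GoldStd) | F1 (Revised GoldStd) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0. PreProcess | 289 | 347 | 814 | 0.8329 | 0.3550 | 0.4978 | 289 | 347 | 774 | 0.8329 | 0.3734 | 0.5156 |
1. Orthographic | 495 | 824 | 814 | 0.6007 | 0.6081 | 0.6044 | 511 | 824 | 774 | 0.6201 | 0.6602 | 0.6395 |
2. Corpus Frequency | 361 | 810 | 814 | 0.4457 | 0.4435 | 0.4446 | 366 | 810 | 774 | 0.4519 | 0.4729 | 0.4621 |
3. Word Embedding | 350 | 807 | 814 | 0.4337 | 0.4300 | 0.4318 | 358 | 807 | 774 | 0.4436 | 0.4625 | 0.4529 |
4. Ensemble | 530 | 825 | 814 | 0.6424 | 0.6511 | 0.6467 | 552 | 825 | 774 | 0.6691 | 0.7132 | 0.6904 |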
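
Methods | TP (Original GoldStd) | Ret (Original GoldStd) | Rel (Original GoldStd) | Precision (Original GoldStd) | Recall (Original GoldStd) | F1 (Original GoldStd) | TP (Revised GoldStd) | Ret (Revised GoldStd) | Rel (Revised GoldStd) | Precision (Revised GoldStd) | Recall (Revised GoldStd) | F1 (Revised GoldStd) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Ensemble (non-word) | 531 | 825 | 926 | 0.6436 | 0.5734 | 0.6065 | 556 | 825 | 964 | 0.6739 | 0.5768 | 0.6216 |
Ensemble (real-word) | 498 | 718 | 926 | 0.6936 | 0.5378 | 0.6058 | 517 | 718 | 964 | 0.7201 | 0.5363 | 0.6147 |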
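
The Precision, Recall, and F1 values in the tables above are consistent with the standard definitions: Precision = TP / Ret, Recall = TP / Rel, and F1 as their harmonic mean. This assumes Ret is the number of corrections returned and Rel is the number of corrections in the gold standard. The following is a minimal Python sketch for re-deriving a row from its counts. The helper name `prf` is hypothetical and is not part of the original code.

```python
from typing import Tuple


def prf(tp: int, ret: int, rel: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from raw counts.

    Assumed meanings (not stated in the source):
      tp  - true positives (correct corrections)
      ret - retrieved (corrections returned by the method)
      rel - relevant (corrections listed in the gold standard)
    """
    precision = tp / ret if ret else 0.0
    recall = tp / rel if rel else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Ensemble row, original gold standard: TP=530, Ret=825, Rel=814
p, r, f1 = prf(530, 825, 814)
print(f"{p:.4f} {r:.4f} {f1:.4f}")  # prints "0.6424 0.6511 0.6467"
```

The same call with the revised gold standard counts (552, 825, 774) gives 0.6691, 0.7132, and 0.6904, matching the Ensemble row.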