Test on Lexicon: for AMIA Initial Submission
Norm, MES, and ES are applied sequentially to retrieve as many spelling variant groups as possible. This model is tested on the Lexicon (inflVars.data) and LRSPL for recall, precision, F1, and accuracy. The details are as follows:
2015 (Used in AMIA paper initial submission)
Step | Methods | Edit Distance | Sample No. | ret-rel | ret-irrel | notRet-rel | notRet-irrel | Precision | Recall | F1 | Accuracy | Notes (run time) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | GoldStd | N/A | 867,728 | 363,217 | 0 | 0 | 504,511 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1 min. |
1 | Norm | N/A | 867,728 | 306,387 | 19,374 | 56,830 | 485,137 | 0.9405 | 0.8435 | 0.8894 | 0.9122 | 2 min. |
2 | MES | 2 | 867,728 | 355,423 | 173,647 | 7,794 | 330,864 | 0.6718 | 0.9785 | 0.7967 | 0.7909 | 6 hr. |
3 | ES | 1 | 867,728 | 360,599 | 286,932 | 2,618 | 217,579 | 0.5569 | 0.9928 | 0.7135 | 0.6663 | 24 hr. |
4 | MES | 3 | 867,728 | 360,956 | 301,097 | 2,261 | 203,414 | 0.5452 | 0.9938 | 0.7041 | 0.6504 | 8 min. |
5 | ES | 2 | 867,728 | 362,082 | 353,512 | 1,135 | 150,999 | 0.5060 | 0.9969 | 0.6713 | 0.5913 | 27 hr. |
6 | MES | 4 | 867,728 | 362,159 | 356,156 | 1,058 | 148,355 | 0.5042 | 0.9971 | 0.6697 | 0.5883 | 2 min. |
Check Item (Step 6) | Check numbers |
---|---|
Total sample no | 867,728 = 362,159 + 356,156 + 1,058 + 148,355 |
Precision | 0.5042 = 362,159 / (362,159 + 356,156) |
Recall | 0.9971 = 362,159 / (362,159 + 1,058) |
F1 | 0.6697 = (2 * 0.5042 * 0.9971) / (0.5042 + 0.9971) |
Accuracy | 0.5883 = (362,159 + 148,355) / 867,728 |
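
The check items above can be reproduced with a short script. The following is a minimal sketch, not the code used to produce the table: the names `Counts` and `metrics` are hypothetical, and it assumes that ret-rel, ret-irrel, notRet-rel, and notRet-irrel correspond to true positives, false positives, false negatives, and true negatives, respectively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one step (hypothetical helper type)."""
    ret_rel: int        # retrieved and relevant (assumed: true positives)
    ret_irrel: int      # retrieved but irrelevant (assumed: false positives)
    not_ret_rel: int    # relevant but not retrieved (assumed: false negatives)
    not_ret_irrel: int  # irrelevant and not retrieved (assumed: true negatives)


def metrics(c: Counts) -> dict:
    """Compute total, precision, recall, F1, and accuracy from the counts."""
    total = c.ret_rel + c.ret_irrel + c.not_ret_rel + c.not_ret_irrel
    retrieved = c.ret_rel + c.ret_irrel
    relevant = c.ret_rel + c.not_ret_rel
    # Guard against empty denominators so the helper never divides by zero.
    precision = c.ret_rel / retrieved if retrieved else 0.0
    recall = c.ret_rel / relevant if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (c.ret_rel + c.not_ret_irrel) / total if total else 0.0
    return {
        "total": total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Counts taken from the Step 6 row (MES, edit distance 4) of the table.
step6 = Counts(ret_rel=362_159, ret_irrel=356_156,
               not_ret_rel=1_058, not_ret_irrel=148_355)
result = metrics(step6)
for name, value in result.items():
    print(name, value if name == "total" else round(value, 4))
```

Rounded to four decimal places, the values agree with the Step 6 row. The same function can be applied to the counts of any other row to cross-check its reported metrics.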