Dictionary Functions - Check Valid Word
I. Introduction
In cSpell, all tokens used for spelling error detection are single words. Thus, only single words need to be in the dictionary. This page describes which dictionary should be used for spelling error detection.
II. Algorithm
Both the whole token and the core-term of the token are checked for valid spelling (Is-Valid-Word).
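The check described above can be sketched as follows. This is a minimal illustration, not cSpell's implementation: it assumes the core-term is the token with leading and trailing punctuation stripped, that lookup is case-insensitive, and that a token counts as valid when either form is found in the dictionary. The names `core_term` and `is_valid_word` are hypothetical.

```python
import string
from typing import Set


def core_term(token: str) -> str:
    """Assumed definition: the token without leading/trailing punctuation."""
    return token.strip(string.punctuation)


def is_valid_word(token: str, dictionary: Set[str]) -> bool:
    """Return True if the whole token or its core-term is in the dictionary.

    Assumption: dictionary entries are stored in lowercase.
    """
    candidates = (token, core_term(token))
    # Skip empty candidates (e.g. a token made only of punctuation).
    return any(bool(c) and c.lower() in dictionary for c in candidates)


if __name__ == "__main__":
    dictionary = {"diabetes", "insulin"}
    print(is_valid_word("(diabetes),", dictionary))  # core-term matches
    print(is_valid_word("diabetis", dictionary))     # neither form matches
```

Checking the core-term in addition to the whole token keeps surrounding punctuation from turning a correctly spelled word into a false error.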
III. Results
Test cSpell with different dictionaries:
* noAaDic: En + Pn
* eng_medical.dic:
* Lexicon.dic:
IV. Tests
Test-1: Tests on Baseline + Lexicon (not used; results are included from above)
| Dictionary | TP | Ret | Rel | Precision | Recall | F1 | Notes |
|---|---|---|---|---|---|---|---|
| Lexicon (single-word + multiwords) | | | | | | | |
| | 535 | 858 | 814 | 0.6235 | 0.6572 | 0.6400 | |
| | 530 | 877 | 814 | 0.6043 | 0.6511 | 0.6268 | |
| Lexicon (single-words) | | | | | | | |
| | 531 | 808 | 814 | 0.6572 | 0.6523 | 0.6547 | |
| | 535 | 858 | 814 | 0.6235 | 0.6572 | 0.6400 | |
| | 530 | 877 | 814 | 0.6043 | 0.6511 | 0.6268 | |
| Combined (10 spVar are included in Lexicon) | | | | | | | |
| | 529 | 740 | 814 | 0.7149 | 0.6499 | 0.6808 | |
| | 533 | 745 | 814 | 0.7154 | 0.6548 | 0.6838 | |
| | 537 | 745 | 814 | 0.7208 | 0.6597 | 0.6889 | |
| TBD | | | | | | | |
| | 549 | 745 | 814 | 0.7369 | 0.6744 | 0.7043 | |
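The Precision, Recall, and F1 columns are consistent with the standard definitions computed from the three counts. The sketch below assumes TP is the number of true positives, Ret the number of returned (retrieved) corrections, and Rel the number of relevant corrections in the gold standard; these expansions are inferred from the numbers, and the function name `prf` is hypothetical.

```python
from typing import Tuple


def prf(tp: int, ret: int, rel: int) -> Tuple[float, float, float]:
    """Compute (precision, recall, F1) from TP, Ret and Rel counts."""
    precision = tp / ret   # share of returned corrections that are right
    recall = tp / rel      # share of relevant corrections that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f1 = prf(535, 858, 814)
    print(f"{p:.4f} {r:.4f} {f1:.4f}")  # 0.6235 0.6572 0.6400
```

Because Rel is fixed at 814 across all runs, recall moves only with TP, while precision improves mainly when a dictionary change lowers Ret.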
Test-2: Tests on split dictionaries
| Dictionary | TP | Ret | Rel | Precision | Recall | F1 | Notes |
|---|---|---|---|---|---|---|---|
| Use Baseline Dictionary for check and suggest | | | | | | | |
| | 546 | 820 | 814 | 0.6659 | 0.6708 | 0.6683 | |
| | 547 | 810 | 814 | 0.6753 | 0.6720 | 0.6736 | Add 10 files for spVars |
| | 548 | 765 | 814 | 0.7163 | 0.6732 | 0.6941 | Check proper nouns from Lexicon |
| | 547 | 804 | 814 | 0.6803 | 0.6720 | 0.6761 | Check Abb/Acr from Lexicon |
| | 548 | 759 | 814 | 0.7220 | 0.6732 | 0.6968 | Check proper nouns/Abb/Acr from Lexicon |
| | 544 | 747 | 814 | 0.7282 | 0.6683 | 0.6970 | Add SpVar from Lexicon |
| | 543 | 746 | 814 | 0.7279 | 0.6671 | 0.6962 | Replace 10 files by Lexicon.spVar |
| | 543 | 749 | 814 | 0.7250 | 0.6671 | 0.6948 | Adding SpVar to the dictionary decreases F1 because it is also used for suggestion (a better ranking system is needed) |
| | 543 | 746 | 814 | 0.7279 | 0.6671 | 0.6962 | Add numbers; no change because of the data |
| Implement 2 Dictionaries: Check + Suggest | | | | | | | |
| Find the Check dictionary | | | | | | | |
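The idea of two dictionaries, one for checking and one for suggesting, can be illustrated with a small sketch. It is not cSpell's implementation: the class name `TwoDictSpeller`, the dictionary contents, and the use of Python's `difflib.get_close_matches` as a stand-in for candidate ranking are all assumptions made for illustration.

```python
import difflib
from typing import List, Set


class TwoDictSpeller:
    """Hypothetical speller with separate check and suggest dictionaries."""

    def __init__(self, check_dic: Set[str], suggest_dic: Set[str]) -> None:
        self.check_dic = check_dic      # words accepted as correctly spelled
        self.suggest_dic = suggest_dic  # words allowed as correction candidates

    def is_valid(self, word: str) -> bool:
        return word.lower() in self.check_dic

    def suggest(self, word: str, n: int = 3) -> List[str]:
        """Return up to n candidates, drawn only from the suggest dictionary."""
        if self.is_valid(word):
            return []
        # difflib similarity stands in for a real ranking system (assumption).
        return difflib.get_close_matches(
            word.lower(), sorted(self.suggest_dic), n=n, cutoff=0.8
        )


if __name__ == "__main__":
    # The spelling variant "tumour" is valid for checking but is not offered
    # as a suggestion; only the base form "tumor" is.
    speller = TwoDictSpeller(
        check_dic={"tumor", "tumour", "color", "colour"},
        suggest_dic={"tumor", "color"},
    )
    print(speller.is_valid("tumour"))
    print(speller.suggest("tumuor"))
```

Separating the two roles lets spelling variants reduce false error detections without enlarging the pool of suggestion candidates, which is the effect the notes in the table attribute the F1 drop to.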
Test-3: Tests on the suggestion dictionary
| Dictionary | TP | Ret | Rel | Precision | Recall | F1 | Notes |
|---|---|---|---|---|---|---|---|
| Find the Suggest dictionary | | | | | | | |