About CSpell
CSpell is a generic spelling detection and correction tool developed in Java. It is distributed by NLM under an open source license agreement. It was originally developed for the Consumer Health Question Answering project, so a consumer health corpus is used for word frequencies and context scores (word vectors). Easily configurable options are provided for customizing the different data files. The correction features include:

Non-dictionary-based correction:

Type | Input Text | Output Correction |
---|---|---|
XML/HTML handler | &quot;germs&quot; | "germs" |
Informal expression handler | pls | please |
Leading digit splitter | 1.5years | 1.5 years |
Ending digit splitter | from2007 | from 2007 |
Leading punctuation splitter | volunteers(healthy) | volunteers (healthy) |
Ending punctuation splitter | cancer?if so | cancer? if so |
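
The handlers and splitters in the table above need no dictionary lookup, so they can be approximated with simple pattern rules. The following is a minimal Python sketch of that idea, not CSpell's actual Java implementation: the `INFORMAL` map, the function names, and the regular expressions are all assumptions made for illustration.

```python
import html
import re

# Hypothetical lookup table (assumption); CSpell ships its own data file for this.
INFORMAL = {"pls": "please"}


def handle_token(token: str) -> str:
    """Apply illustrative non-dictionary rules to one whitespace-delimited token."""
    # XML/HTML handler: decode character entities such as &quot; or &amp;
    token = html.unescape(token)
    # Informal expression handler: replace known informal forms
    token = INFORMAL.get(token.lower(), token)
    # Leading digit splitter: 1.5years -> 1.5 years
    token = re.sub(r"^(\d+(?:\.\d+)?)([A-Za-z]+)$", r"\1 \2", token)
    # Ending digit splitter: from2007 -> from 2007
    token = re.sub(r"^([A-Za-z]+)(\d+)$", r"\1 \2", token)
    # Leading punctuation splitter: volunteers(healthy) -> volunteers (healthy)
    token = re.sub(r"(?<=[A-Za-z])([(\[{])", r" \1", token)
    # Ending punctuation splitter: cancer?if -> cancer? if
    token = re.sub(r"([?!,;:)\]}])(?=[A-Za-z])", r"\1 ", token)
    return token


def correct(text: str) -> str:
    """Run every token of the input through the rules and rejoin with spaces."""
    return " ".join(handle_token(t) for t in text.split())
```

Calling `correct("from2007")` or `correct("cancer?if so")` with this sketch reproduces the corresponding rows of the table. A production tool would need more careful rules, for example to avoid splitting URLs or measurement units.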

Non-word correction:

Type | Input Text | Output Correction |
---|---|---|
Spelling | dianosed | diagnosed |
Split | knowabout | know about |
Merge | stiff n ess | stiffness |
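
Spelling, split, and merge corrections for words that are not in the dictionary can be illustrated with a toy lexicon. The sketch below is an assumption-laden illustration rather than CSpell's algorithm: `WORD_FREQ` holds made-up frequencies, candidates are ranked by Levenshtein distance and then by frequency, and all function names are hypothetical.

```python
# Toy lexicon with made-up frequencies (assumption); CSpell uses its own
# dictionary and a consumer health corpus for word frequencies.
WORD_FREQ = {"diagnosed": 120, "know": 900, "about": 1500, "stiffness": 40}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def spelling_candidates(word: str, max_dist: int = 2) -> list:
    """Dictionary words within max_dist edits, closest and most frequent first."""
    scored = [(edit_distance(word, w), -freq, w) for w, freq in WORD_FREQ.items()]
    return [w for dist, _, w in sorted(scored) if dist <= max_dist]


def split_candidates(word: str) -> list:
    """All ways to cut the word into two dictionary words."""
    return [(word[:i], word[i:]) for i in range(1, len(word))
            if word[:i] in WORD_FREQ and word[i:] in WORD_FREQ]


def merge_candidate(tokens: list):
    """Join adjacent tokens if the result is a dictionary word, else None."""
    joined = "".join(tokens)
    return joined if joined in WORD_FREQ else None
```

With this toy lexicon, `spelling_candidates("dianosed")` yields `diagnosed`, `split_candidates("knowabout")` yields the pair `know` and `about`, and `merge_candidate(["stiff", "n", "ess"])` yields `stiffness`.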

Real-word correction:

Type | Input Text | Output Correction |
---|---|---|
Spelling | bowl movement | bowel movement |
Split | for along time | for a long time |
Merge | early on set | early onset |
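
Real-word errors such as "bowl movement" involve valid dictionary words, so a corrector has to rely on context rather than on a lexicon lookup. The sketch below shows one way a context score based on word vectors could choose between the original word and a candidate. It is an illustration under stated assumptions, not CSpell's scoring method: the two-dimensional `VECTORS` are invented, and `context_score` and `pick` are hypothetical names.

```python
import math

# Made-up 2-d vectors (assumption); real word vectors are trained on a corpus
# and have many more dimensions.
VECTORS = {
    "bowl": (0.9, 0.1),
    "bowel": (0.1, 0.9),
    "movement": (0.2, 0.8),
    "soup": (0.95, 0.05),
}


def cosine(u, v) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def context_score(word: str, context: list) -> float:
    """Average similarity between a word and the context words that have vectors."""
    vecs = [VECTORS[c] for c in context if c in VECTORS]
    if word not in VECTORS or not vecs:
        return 0.0
    return sum(cosine(VECTORS[word], v) for v in vecs) / len(vecs)


def pick(original: str, candidates: list, context: list) -> str:
    """Keep the original unless a candidate fits the context better."""
    return max([original, *candidates], key=lambda w: context_score(w, context))
```

With these toy vectors, `pick("bowl", ["bowel"], ["movement"])` selects `bowel`, while `pick("bowl", ["bowel"], ["soup"])` keeps `bowl`, which is the behavior a real-word corrector needs to avoid overcorrecting valid text.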

Input Text | He was dianosed early on set deminita 3years ago. |
---|---|
Output Correction | He was diagnosed early onset dementia 3 years ago. |

Input Text | Output Correction | Type |
---|---|---|
dianosed | diagnosed | non-word, spelling |
on set | onset | real-word, merge |
deminita | dementia | non-word, spelling |
3years | 3 years | non-dictionary, split |

Input Text | No bowl movement for along time. |
---|---|
Output Correction | No bowel movement for a long time. |

Input Text | Output Correction | Type |
---|---|---|
bowl | bowel | real-word, spelling |
along | a long | real-word, split |