CSpell


Introduction

CSpell detects and corrects spelling errors in its input and sends the corrected text to the output. It integrates both isolated-word correction and context-dependent correction. Each input produces exactly one output.

The correction features in CSpell include:

The dictionary that CSpell uses to detect spelling errors and suggest corrections is derived from several sources, including:

  • The SPECIALIST Lexicon
  • UMLS Metathesaurus (consumer-related medical terms)

In addition, CSpell uses the Consumer Health Corpus for word frequency and context information (IM and OM from Word2Vec). Different dictionaries and corpora can be used by changing the CSpell configuration.

Setup

Follow the installation instructions to install and run the CSpell program. Check the following items only if you did not use the provided script to install CSpell.

  • CLASSPATH:
    1. include the CSpell distribution jar file, ${CSPELL_DIR}/lib/cSpell${YEAR}dist.jar, in your CLASSPATH
    2. include the CSpell top directory, ${CSPELL_DIR}, in your CLASSPATH

  • Configuration File: assign the full path of the top directory of cSpell${YEAR} to a variable named CS_DIR in the configuration file, ${CSPELL_DIR}/data/config/cSpell.properties.
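
As a quick sanity check on the configuration step, the following minimal sketch reads a properties file and reports whether CS_DIR is set and points to an existing directory. It assumes that cSpell.properties uses the standard Java key=value properties syntax, which this page does not state. The class and method names (ConfigCheckSketch, readCsDir) are hypothetical and are not part of the CSpell distribution.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of CSpell: checks the CS_DIR entry of a config file.
public class ConfigCheckSketch {

    // Returns the trimmed CS_DIR value, or null if the key is absent.
    // Assumption: the file follows java.util.Properties (key=value) syntax.
    public static String readCsDir(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("CS_DIR");
        return (value == null) ? null : value.trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ConfigCheckSketch <path to cSpell.properties>");
            return;
        }
        Path config = Paths.get(args[0]);
        try (Reader in = Files.newBufferedReader(config)) {
            String csDir = readCsDir(in);
            if (csDir == null || csDir.isEmpty()) {
                System.out.println("CS_DIR is not set in " + config);
            } else if (Files.isDirectory(Paths.get(csDir))) {
                System.out.println("CS_DIR points to an existing directory: " + csDir);
            } else {
                System.out.println("CS_DIR is set but is not a directory: " + csDir);
            }
        }
    }
}
```

Run it with the path to the configuration file as the only argument, for example ${CSPELL_DIR}/data/config/cSpell.properties.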

Test Run

  • Run the Java program

    Enter the command:

    
    shell> CSpell -p
    - Please input a term (type "Ctl-d" to quit) >
    He was dianosed early on set deminita 3years ago.
    He was diagnosed early onset dementia 3 years ago.
    - Please input a term (type "Ctl-d" to quit) >
    No bowl movement for along time.
    No bowel movement for a long time.
    

    where:

    • CSpell: script for CSpell
    • -p: CSpell option that shows the input prompt (try the -h option!)

Output Format

CSpell reads its input one entire line at a time from standard input, performs spelling error correction, and then sends the result to standard output.
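
The line-in, line-out contract described above can be illustrated with a small filter. This is only a sketch of the input/output shape, not CSpell's correction algorithm or API: the class LineFilterSketch, its methods, and the two-entry replacement table are hypothetical stand-ins, and the toy lookup handles neither context-dependent errors nor split and merge corrections.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mirrors the line-in, line-out contract, not CSpell's algorithm.
public class LineFilterSketch {

    // Toy one-to-one replacement table (assumption; stands in for a real corrector).
    private static final Map<String, String> TOY_FIXES = new HashMap<>();
    static {
        TOY_FIXES.put("dianosed", "diagnosed");
        TOY_FIXES.put("deminita", "dementia");
    }

    // Corrects one line token by token; unknown tokens pass through unchanged.
    public static String correctLine(String line) {
        String[] tokens = line.split(" ", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(TOY_FIXES.getOrDefault(tokens[i], tokens[i]));
        }
        return out.toString();
    }

    // Reads every line from in and writes exactly one output line per input line.
    public static void run(BufferedReader in, PrintStream out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.println(correctLine(line));
        }
    }

    public static void main(String[] args) throws IOException {
        run(new BufferedReader(new InputStreamReader(System.in)), System.out);
    }
}
```

Because the filter emits one line for every line it reads, it can sit in a shell pipeline or be fed a file through input redirection, which is the same usage pattern the paragraph above describes for CSpell.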

Options

Please refer to the design documents on CSpell Options.