CSpell

About CSpell

CSpell is a generic spelling detection and correction tool developed in Java. It is distributed by NLM under an open source license agreement. It was originally developed for the Consumer Health Question Answering project, so a consumer health corpus is used for word frequency and context scores (word vectors). Easily configurable options are provided for customizing the data files. The correction features include:

  • Non-dictionary based correction

    Type                         | Input Text          | Output Correction
    Xml/Html handler             | &quot;germs&quot;   | "germs"
    Informal expression handler  | pls                 | please
    Leading digit splitter       | 1.5years            | 1.5 years
    Ending digit splitter        | from2007            | from 2007
    Leading punctuation splitter | volunteers(healthy) | volunteers (healthy)
    Ending punctuation splitter  | cancer?if so        | cancer? if so

  • Dictionary based correction
    • non-word

      Type     | Input Text  | Output Correction
      Spelling | dianosed    | diagnosed
      Split    | knowabout   | know about
      Merge    | stiff n ess | stiffness
    • real-word

      Type     | Input Text     | Output Correction
      Spelling | bowl movement  | bowel movement
      Split    | for along time | for a long time
      Merge    | early on set   | early onset

  • Combination of the above:
    • Example-1:
      Input Text:        He was dianosed early on set deminita 3years ago.
      Output Correction: He was diagnosed early onset dementia 3 years ago.

      Input Text | Output Correction | Correction Type
      dianosed   | diagnosed         | non-word, spelling
      on set     | onset             | real-word, merge
      deminita   | dementia          | non-word, spelling
      3years     | 3 years           | non-dictionary, split

    • Example-2:
      Input Text:        No bowl movement for along time.
      Output Correction: No bowel movement for a long time.

      Input Text | Output Correction | Correction Type
      bowl       | bowel             | real-word, spelling
      along      | a long            | real-word, split
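
The digit and punctuation splitters listed under non-dictionary based correction can be illustrated with a few regular expressions. The following is a minimal sketch only, not CSpell's implementation or API: the class name `SplitterSketch`, the method `split`, and the four patterns are assumptions made for illustration. It also ignores the exceptions a real tool would need to handle (for example, tokens such as "1st" or "B12" would be split by these simple rules).

```java
import java.util.regex.Pattern;

// Hypothetical illustration of non-dictionary splitters; not part of CSpell.
public class SplitterSketch {
    // Leading digit splitter: a digit directly followed by a letter ("1.5years").
    private static final Pattern LEADING_DIGIT =
        Pattern.compile("(?<=\\d)(?=[A-Za-z])");
    // Ending digit splitter: a letter directly followed by a digit ("from2007").
    private static final Pattern ENDING_DIGIT =
        Pattern.compile("(?<=[A-Za-z])(?=\\d)");
    // Leading punctuation splitter: a letter or digit directly followed by an
    // opening bracket ("volunteers(healthy)").
    private static final Pattern LEADING_PUNCT =
        Pattern.compile("(?<=[A-Za-z0-9])(?=[(\\[{])");
    // Ending punctuation splitter: selected punctuation directly followed by a
    // letter ("cancer?if so"). The period is left out on purpose so that
    // abbreviations and decimals are not touched.
    private static final Pattern ENDING_PUNCT =
        Pattern.compile("(?<=[?!,;:)])(?=[A-Za-z])");

    // Inserts a single space at every position matched by any of the patterns.
    public static String split(String text) {
        String result = text;
        for (Pattern p : new Pattern[] {
                LEADING_DIGIT, ENDING_DIGIT, LEADING_PUNCT, ENDING_PUNCT}) {
            result = p.matcher(result).replaceAll(" ");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("1.5years"));            // 1.5 years
        System.out.println(split("from2007"));            // from 2007
        System.out.println(split("volunteers(healthy)")); // volunteers (healthy)
        System.out.println(split("cancer?if so"));        // cancer? if so
        // Only the non-dictionary step of Example-1 is handled here; the
        // misspellings need the dictionary based corrections.
        System.out.println(split("He was dianosed early on set deminita 3years ago."));
    }
}
```

Each pattern is a zero-width lookaround match, so the replacement only inserts a space and never removes characters. Applied to the input of Example-1, the sketch changes "3years" to "3 years" and leaves "dianosed", "on set", and "deminita" for the dictionary based steps.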