The SPECIALIST Lexicon

Informal Expression
from Health Information Consumer Language

I. Introduction

The Lexicon includes formal general English and biomedical terms that are used in published documents. Informal English, which is used in everyday conversation, personal emails, text messages, etc., is often found in consumer health text. Consumers here are health information consumers: they search for information and ask questions on the internet, and they may create their own terms and language. This consumer language consists of informal expressions, which are usually shorter and simpler than their formal equivalents. In general, informal expressions include contractions and shorthand, and they usually cannot be found in a dictionary, such as [plz] -> [please]. Accordingly, it is important to handle informal expressions in consumer language during preprocessing in the NLP pipeline.

II. Sources of Informal Expression

The sources of informal expression include:

III. Proposed Generation and Usage of Informal Expression

  • Generation
    • From a consumer health questions corpus:
      • Collect a corpus of consumer health questions
      • Find short words that have a high frequency and are not found in the dictionary
      • Send each word, with the sentences it appears in, to a linguist for informal expression validation
      • Add the validated expressions to the Lexicon
    • From Lexicon: use class_type=informal
  • Usage
    • Generate a mapping file that maps each informal expression to its formal expression
    • Apply this mapping during preprocessing when dealing with consumer health text.
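
The generation and usage steps above can be sketched in Python as follows. This is only an illustrative sketch, not the Lexicon's actual implementation. The function names (find_candidates, normalize), the length and frequency thresholds, the tokenization rule, and the in-memory dictionary and mapping are all assumptions made for the example. The format of the mapping file is not specified here, so the mapping is shown as a plain Python dict.

```python
import re
from collections import Counter

# Assumed tokenization: runs of letters, digits, and apostrophes.
TOKEN_RE = re.compile(r"[A-Za-z0-9']+")


def find_candidates(questions, dictionary, max_len=4, min_freq=2):
    """Return (word, frequency, example sentence) for short, frequent words
    that are not in the dictionary. These are only candidates: a linguist
    still has to validate each one before it is added to the Lexicon.
    max_len and min_freq are hypothetical thresholds."""
    counts = Counter()
    examples = {}
    for question in questions:
        for token in TOKEN_RE.findall(question.lower()):
            if len(token) <= max_len and token not in dictionary:
                counts[token] += 1
                # Keep the first sentence seen as context for the linguist.
                examples.setdefault(token, question)
    return [
        (word, freq, examples[word])
        for word, freq in counts.most_common()
        if freq >= min_freq
    ]


def normalize(text, mapping):
    """Replace whole-word informal expressions with their formal form.
    Keys of `mapping` are lowercase; unmapped words are left unchanged.
    Original capitalization of a replaced word is not preserved."""
    def replace(match):
        word = match.group(0)
        return mapping.get(word.lower(), word)
    return TOKEN_RE.sub(replace, text)


if __name__ == "__main__":
    # Toy data for illustration only.
    questions = [
        "plz help, what is this rash",
        "can u tell me plz",
        "u should see a doctor",
    ]
    dictionary = {
        "help", "what", "is", "this", "rash", "can",
        "tell", "me", "should", "see", "a", "doctor",
    }
    for word, freq, sentence in find_candidates(questions, dictionary):
        print(word, freq, sentence)

    mapping = {"plz": "please", "u": "you"}
    print(normalize("plz help u", mapping))
```

Matching on whole tokens rather than substrings keeps a short key such as "u" from rewriting letters inside longer words. Ambiguous shorthand would still need context-aware handling, which this sketch does not attempt.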