Informal Expression
from Health Information Consumer Language

I. Introduction

The Lexicon includes formal general English and biomedical terms (those used in published documents). Informal English, which is used in everyday conversation, personal emails, text messages, etc., is often found in consumer health text. Consumers here are health information consumers: they search for information and ask questions on the internet, and they may create their own terms and language. This consumer language consists of informal expressions, which are usually shorter and simpler than their formal equivalents. In general, informal expressions include contractions and shorthand. They usually cannot be found in a dictionary; an example is [plz] -> [please]. Accordingly, it is important to handle informal expressions in consumer language during pre-processing in the NLP pipeline.

II. Sources of Informal Expression

The sources of informal expression include:

III. Proposed Generation and Usage of Informal Expression

  • Generation
    • From health consumer questions corpus:
      • Collect health consumer questions corpus
      • Find short words that have high frequency and are not in the dictionary
      • Send each word, with the sentences it appears in, to a linguist for informal expression validation
      • Add the validated expressions to the Lexicon
    • From Lexicon: use class_type=informal
  • Usage
    • Generate a mapping file that maps each informal expression to its formal expression
    • Apply this mapping during pre-processing when dealing with consumer health text.
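
The generation and usage steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the Lexicon's actual tooling: the function names, the length and frequency thresholds, and the pipe-delimited "informal|formal" mapping format are all hypothetical, and the sample entry [u] -> [you] is illustrative only (a real entry would first go through linguist validation).

```python
import re
from collections import Counter

# Assumption: a token is a run of letters, digits, or apostrophes.
TOKEN_RE = re.compile(r"[A-Za-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def find_candidates(questions, dictionary, max_len=4, min_count=2):
    """Generation step: collect short, frequent, out-of-dictionary words.

    Returns (word, count, example_sentence) tuples so that each candidate
    can be sent to a linguist together with a sentence it appears in.
    max_len and min_count are hypothetical thresholds.
    """
    counts = Counter()
    examples = {}
    for question in questions:
        for token in tokenize(question):
            if len(token) <= max_len and token not in dictionary:
                counts[token] += 1
                examples.setdefault(token, question)
    return [(w, c, examples[w]) for w, c in counts.most_common() if c >= min_count]


def load_mapping(lines):
    """Usage step 1: parse 'informal|formal' lines (assumed format) into a dict."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and lines without the delimiter.
        if not line or line.startswith("#") or "|" not in line:
            continue
        informal, formal = line.split("|", 1)
        mapping[informal.strip().lower()] = formal.strip()
    return mapping


def normalize(text, mapping):
    """Usage step 2: replace whole-word informal expressions with formal ones."""
    def replace(match):
        word = match.group(0)
        return mapping.get(word.lower(), word)
    return TOKEN_RE.sub(replace, text)


# Illustrative data only; a real run would use a consumer question corpus
# and a dictionary word list.
sample_questions = [
    "plz help, my knee hurts",
    "can u tell me about knee pain plz",
]
sample_dictionary = {"help", "my", "knee", "hurts", "can", "tell", "me", "about", "pain"}
sample_mapping = load_mapping(["# informal|formal", "plz|please", "u|you"])

candidates = find_candidates(sample_questions, sample_dictionary)
normalized = normalize("Plz help, can u tell me about knee pain?", sample_mapping)
```

Matching on whole tokens rather than substrings keeps the mapping from rewriting the inside of longer words (for example, the "u" in "about"). Note that the replacement does not restore the original capitalization; a production pre-processor would need to decide how to handle case.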