The SPECIALIST lexicon is a large syntactic lexicon of biomedical and general English, designed to provide the information needed for the SPECIALIST Natural Language Processing System. Coverage includes both commonly occurring English words and biomedical vocabulary. The lexicon entry for each lexical item records syntactic, morphological, and orthographic information. The lexicon has been released as one of the UMLS Knowledge Sources since 1994.

Lexical Tools

The SPECIALIST lexical tools are a set of Java programs designed to help users manage lexical variation in biomedical text. The tools use information from the SPECIALIST lexicon and other data to generate lexical variants of words or terms appropriate for use in indexing and other NLP applications.


The MEDLINE n-gram set is used to retrieve multiwords for building the SPECIALIST lexicon. The Lexical Systems Group (LSG) is sharing this n-gram set (n = 1 to 5) with the NLP/MLP community. Please download it from the following links.