Synonym Enhancement - Motivation and Plan
I. Objective
Mapping a term to UMLS-Metathesaurus concepts (CUIs) is an important step in NLP. Normalization and query expansion are two major techniques used in concept mapping to increase recall. A better normalization and synonym list could increase recall without reducing the precision of concept mapping.
The synonyms in the Lexical Tools were developed in the early 1990s. The list is static, and only a few updates have been made over the past decade. A systematic methodology is developed to establish a new system that generates synonyms from the annual release of the SPECIALIST Lexicon and the UMLS Metathesaurus. Better recall is expected once the newly developed synonym list is in use. The objectives of this task are:
II. Phases
The phases of this task are described as follows:
Type | synonym-1 | category-1 | synonym-2 | category-2 | CUI\|PT |
---|---|---|---|---|---|
adj-adj | ureteral | adj | ureteric | adj | C0041951\|Ureter |
adj-adj | emetic | adj | emetogenic | adj | C0013973\|Emetics |
adj-adj | farsighted | adj | long sighted | adj | C0020490\|Hyperopia |
adj-adj | inner | adj | internal | adj | C0205102\|Internal |
noun-noun | wrist | noun | carpal | noun | C0043262\|Wrist |
noun-noun | calculus | noun | stone | noun | C0006736\|Calculi |
noun-noun | headache | noun | cephalgia | noun | C0018681\|Headache |
noun-noun | tumor | noun | neoplasms | noun | C0027651\|Neoplasms |
verb-verb | happen | verb | occur | verb | C1709305\|Occur (action) |
verb-verb | autopsy | verb | necropsy | verb | C0004398\|Autopsy |
verb-verb | acquire | verb | obtain | verb | C1706701\|Acquisition (action) |
Type | synonym-1 | category-1 | synonym-2 | category-2 | CUI\|PT |
---|---|---|---|---|---|
adj-noun | cardiac | adj | heart | noun | C0018787\|Heart |
adj-noun | farsighted | adj | hyperopia | noun | C0020490\|Hyperopia |
adj-noun | renal | adj | kidney | noun | C0022646\|Kidney |
adj-verb | choice | adj | select | verb | C1707391\|Choose (action) |
adj-verb | mad | adj | anger | verb | C0002957\|Anger |
noun-verb | autograft | noun | autotransplant | verb | C0559189\|Autograft Material |
noun-verb | abdominal pain | noun | bellyache | verb | C0000737\|Abdominal Pain |
Synonym-1 | Synonym-2 | QE Example | CUI\|PT |
---|---|---|---|
inner\|C0205102\|Internal | internal\|C0205102\|Internal | inner ear\|internal ear | C0022889\|Labyrinth |
heart\|C0018787\|Heart | cardiac\|C0018787\|Heart | heart abnormalities\|cardiac abnormalities | C0018798\|Congenital Heart Defects |
kidney\|C0022646\|Kidney | renal\|C0022646\|Kidney | kidney disease\|renal disease | C0022658\|Kidney Diseases |
Synonym-1 | Synonym-2 | Example term | CUI\|PT |
---|---|---|---|
ache\|C0234238\|Ache | pain\|C0030193\|Pain | head ache\|head pain | C0018681\|Headache |
bladder\|C0005682\|Urinary Bladder | vesical\|None | bladder fistula\|vesical fistula | C0005690\|Urinary Bladder Fistula |
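
The query expansion examples in the two tables above can be illustrated with a minimal Python sketch of synonym substitution. This is not the Lexical Tools or STMT implementation. The function names (`build_synonym_index`, `expand_query`, `map_to_cuis`), the `max_substitutions` parameter, and the toy `TERM_TO_CUI` lookup are all assumptions made for illustration. The synonym pairs and CUIs are copied from the tables, and only single-word sub-terms are substituted.

```python
from itertools import combinations, product

# Synonym pairs copied from the tables above; treated as symmetric here (an assumption).
SYNONYM_PAIRS = [
    ("inner", "internal"),
    ("heart", "cardiac"),
    ("kidney", "renal"),
    ("ache", "pain"),
    ("bladder", "vesical"),
]

# Toy stand-in for a Metathesaurus lookup (hypothetical): term -> CUI.
# Only the substituted forms are listed, so the effect of expansion on recall is visible.
TERM_TO_CUI = {
    "internal ear": "C0022889",
    "cardiac abnormalities": "C0018798",
    "renal disease": "C0022658",
    "head pain": "C0018681",
    "vesical fistula": "C0005690",
}


def build_synonym_index(pairs):
    """Build a word -> set-of-synonyms index from symmetric pairs."""
    index = {}
    for a, b in pairs:
        index.setdefault(a, set()).add(b)
        index.setdefault(b, set()).add(a)
    return index


def expand_query(term, index, max_substitutions=1):
    """Return the term followed by variants with up to max_substitutions words replaced."""
    words = term.lower().split()
    replaceable = [i for i, w in enumerate(words) if w in index]
    variants = [" ".join(words)]
    for k in range(1, max_substitutions + 1):
        for positions in combinations(replaceable, k):
            # Every combination of synonym choices for the selected positions.
            choices = [sorted(index[words[i]]) for i in positions]
            for picks in product(*choices):
                new_words = list(words)
                for i, synonym in zip(positions, picks):
                    new_words[i] = synonym
                variants.append(" ".join(new_words))
    return variants


def map_to_cuis(term, index, lookup, max_substitutions=1):
    """Collect the CUIs reached by the term or any of its expanded variants."""
    variants = expand_query(term, index, max_substitutions)
    return {lookup[v] for v in variants if v in lookup}


index = build_synonym_index(SYNONYM_PAIRS)
print(expand_query("inner ear", index))              # ['inner ear', 'internal ear']
print(map_to_cuis("head ache", index, TERM_TO_CUI))  # {'C0018681'}
```

In this sketch, "head ache" reaches C0018681 only through the substituted form "head pain", which mirrors the case in the second table where the two synonyms do not share a CUI but the expanded terms map to the same concept.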
III. Model for Test and Analysis
A model needs to be established as a metric to measure the performance (precision and recall) of the newly derived synonym list.
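
As a rough illustration of the two measures, the sketch below computes precision and recall for a set of mapped CUIs against a gold-standard set. The function name `precision_recall`, the convention of returning 0.0 for an empty set, and the example inputs are assumptions for illustration, not part of the test model itself.

```python
def precision_recall(mapped, gold):
    """Compute (precision, recall) of mapped CUIs against gold-standard CUIs."""
    mapped, gold = set(mapped), set(gold)
    true_positives = len(mapped & gold)
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = true_positives / len(mapped) if mapped else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: two CUIs were mapped, one of which is in the gold standard.
print(precision_recall({"C0018681", "C0030193"}, {"C0018681"}))  # (0.5, 1.0)
```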
The Sub-Term Mapping Tools (STMT), developed by the Lexical Systems Group (LSG), is a generic tool set with fully configurable options (corpus, synonyms, etc.). It provides comprehensive sub-term related features for query expansion and other NLP applications through Java APIs and command line tools [6]. The Synonym Mapping Tool (SMT) is one of the most commonly used tools in the STMT package and is designed to find concepts in the UMLS-Metathesaurus using synonym substitutions. When the limit on the number of substituted synonyms is fixed, the performance (precision and recall) of SMT in finding the mapped concepts depends mainly on the synonym list. The synonyms and related words derived from the above two phases can be configured easily as corpus trees in SMT for concept mapping tests as follows.