Sub-Term Mapping Tools

Testing Data: UMLS-Core

This list of terms comes from the UMLS-Core project. It serves as the gold standard for testing in this project.

I. General Information

  • UMLS-Core: SCTMap_withCUI_201302 (provided by Dr. K.W. Fung)
  • In MS Excel Format:
    Term Id | Local Term | SNOMED CID | SNOMED FSN | UMLS CUI
  • Contains 15,487 terms with a valid mapped CUI (used as gold standard)
  • Contains 13,077 unique terms
  • 1,492 terms are duplicated under different IDs (sources)
  • 35 terms have multiple CUIs (ambiguous)
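
Counts of this kind can be derived from the spreadsheet rows with a few set operations. The Python sketch below is illustrative only: the function name `summarize`, the tuple layout, and the sample rows are assumptions, and it compares terms as exact strings, which may differ from how the figures above were derived.

```python
from collections import defaultdict


def summarize(rows):
    """Summarize (term_id, local_term, cui) rows.

    Hypothetical helper: terms are compared as exact strings (assumption).
    """
    ids_by_term = defaultdict(set)
    cuis_by_term = defaultdict(set)
    row_count = 0
    for term_id, term, cui in rows:
        row_count += 1
        ids_by_term[term].add(term_id)
        cuis_by_term[term].add(cui)
    return {
        "rows": row_count,
        "unique_terms": len(ids_by_term),
        # terms that appear under more than one ID (source)
        "duplicated_terms": sum(1 for ids in ids_by_term.values() if len(ids) > 1),
        # terms mapped to more than one CUI (ambiguous)
        "ambiguous_terms": sum(1 for cuis in cuis_by_term.values() if len(cuis) > 1),
    }


# Made-up rows with placeholder IDs and CUIs, for illustration only.
SAMPLE_ROWS = [
    ("id1", "asthma", "C0000001"),
    ("id2", "asthma", "C0000001"),
    ("id3", "cold", "C0000002"),
    ("id4", "cold", "C0000003"),
    ("id5", "gout", "C0000004"),
]

if __name__ == "__main__":
    print(summarize(SAMPLE_ROWS))
```

Keeping the IDs and CUIs in per-term sets makes the duplicated and ambiguous counts fall out of one pass over the data.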

II. Data Process

  • Convert from Excel to CSV format
  • Convert from CSV to pipe-separated format (gov.nih.nlm.nls.stmt.Lib.FromCsvToPipeFile)
  • Filter out duplicated terms so that each term|CUI pair appears once

  • For testing input: Retrieve field 2
    Local Term

  • For gold standard: Retrieve fields 2 and 5
    Local Term|UMLS CUI

III. Source of UMLS-Core data

  • Problem list terminologies (local terms) from 6 (8) institutions
    • HA: Hong Kong Hospital Authority
    • IH: Intermountain Healthcare
    • KP: Kaiser Permanente
    • MA: Mayo Clinic
    • NU: University of Nebraska Medical Center
    • RI: Regenstrief Institute
  • A problem list is a complete list of all of a patient's problems
  • The data in the original paper:
    • 76,237 terms and their usage frequencies in 14 million patients were submitted from six institutions
    • 65,678 terms were unique across institutions
    • mappings from the local problem list terms to standard terminologies (ICD-9-CM, SNOMED CT) were included if available
    • 14,395 terms covered 95% of usage in each institution (10,081 terms unique across institutions)
    • 13,261 terms were successfully mapped to 6,776 UMLS concepts
      • UMLS mapping - 2008AA: 10,812 (75%)
        • exact match - case-insensitive: 8,102 (56%)
        • normalized match: 2,035 (14%)
        • synonym substitution: 576 (5%)

      • local maps to standard terminologies: 1,007 (7%)
        • automatically mapped - if labeled as exact match
        • manually reviewed for exact match - if not labeled as exact match

      • manual mapping using the RRF browser: 1,442 (10%)

      • unmapped: 1,134 (8%)
        • Highly specific: 53%
        • Very general: 11%
        • Administrative: 7%
        • Laterality: 7%
        • Negative finding: 3%
        • Composite concept: 3%
        • Meaning unclear (ambiguous): 2%
        • Miscellaneous: 13%
  • References: UMLS-Core Project
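
The three automatic UMLS mapping steps listed above (case-insensitive exact match, normalized match, synonym substitution) form a cascade in which each step is tried only when the previous one fails. The Python sketch below illustrates that idea under loud assumptions: `normalize` is a crude stand-in (lowercase, strip punctuation, sort words) and not the normalization used in the original project, and the concept names, CUIs, and synonym table are made-up placeholders.

```python
import string

# Translation table that turns every ASCII punctuation character into a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(term):
    """Crude stand-in for normalization (assumption): lowercase, drop punctuation, sort words."""
    return " ".join(sorted(term.lower().translate(_PUNCT_TO_SPACE).split()))


def build_indexes(concept_names):
    """Build lookup tables from (cui, name) pairs; the first CUI seen for a key wins."""
    exact, normed = {}, {}
    for cui, name in concept_names:
        exact.setdefault(name.lower(), cui)
        normed.setdefault(normalize(name), cui)
    return exact, normed


def map_term(term, exact, normed, synonyms):
    """Return (cui, method), trying exact, then normalized, then synonym substitution."""
    key = term.lower()
    if key in exact:
        return exact[key], "exact"
    norm_key = normalize(term)
    if norm_key in normed:
        return normed[norm_key], "normalized"
    # Substitute one word at a time with a listed synonym and retry the normalized lookup.
    words = norm_key.split()
    for i, word in enumerate(words):
        for alt in synonyms.get(word, ()):
            candidate = normalize(" ".join(words[:i] + [alt] + words[i + 1:]))
            if candidate in normed:
                return normed[candidate], "synonym"
    return None, "unmapped"


# Placeholder data for illustration only; not taken from the UMLS.
NAMES = [("C0000001", "Chest pain"), ("C0000002", "Renal failure")]
SYNONYMS = {"kidney": ["renal"]}

if __name__ == "__main__":
    exact, normed = build_indexes(NAMES)
    for local_term in ["CHEST PAIN", "pain, chest", "kidney failure", "xyz"]:
        print(local_term, map_term(local_term, exact, normed, SYNONYMS))
```

Returning the method alongside the CUI makes it straightforward to tally how many terms each step resolves, which is the kind of breakdown reported in the list above.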