Sub-Term Mapping Tools

UMLS-Core: Sub-Term

  • Description:
    • Find all terms stored in a Trie Tree (of synonyms) that occur within the input term
    • Each matched term (a sub-term of the input term) is returned with its starting and ending word indexes in the input term

  • Examples - Test Cases:

    • Terms in corpus:

      Terms
      dog
      canine
      cat
      feline
      k9
      bull dog
      dog and cat
      pets
      puppy and kitty

    • Input Term:
      Who let dogs and CAT out
      • SynonymMapNorm: who let dog and cat out
      • Go through terms from "who let dog and cat out"
        i | curTerm                 | branchMatches    | matchTerms
        0 | who let dog and cat out | -                | -
        1 | let dog and cat out     | -                | -
        2 | dog and cat out         | dog, dog and cat | dog, dog and cat
        3 | and cat out             | -                | dog, dog and cat
        4 | cat out                 | cat              | dog, dog and cat, cat
        5 | out                     | -                | dog, dog and cat, cat

    • Outputs:

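      Each result is returned as matched term | start index | end index (word positions in the normalized input, counted from 0; the end index is exclusive):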

      • dog|2|3
      • dog and cat|2|5
      • cat|4|5

    • Trie Tree
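
A minimal sketch of how such a word-level trie could be built from the corpus terms above. This is an illustration, not the tool's actual implementation: the nested-dict representation and the names `build_trie`, `render`, and `CORPUS` are assumptions made here; only the `$_END` marker comes from the algorithm steps.

```python
END = "$_END"  # END-node marker, as named in the algorithm steps

def build_trie(terms):
    """Build a trie keyed by words; each node is a dict of child word -> node."""
    root = {}
    for term in terms:
        node = root
        for word in term.split():
            node = node.setdefault(word, {})  # descend, creating nodes as needed
        node[END] = {}  # mark that a complete term ends at this node
    return root

def render(node, depth=0):
    """Return the trie as indented lines, one word (or END marker) per line."""
    lines = []
    for word in sorted(node):
        lines.append("  " * depth + word)
        lines.extend(render(node[word], depth + 1))
    return lines

CORPUS = ["dog", "canine", "cat", "feline", "k9", "bull dog",
          "dog and cat", "pets", "puppy and kitty"]

trie = build_trie(CORPUS)
print("\n".join(render(trie)))
```

In this representation, "dog" and "dog and cat" share the `dog` node: it has both an END child (because "dog" is a term) and an `and` child that leads on to `cat`.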

  • Algorithm:
    • Init Vector<String> matchTerms
    • SynonymMapNorm the input term to newInTerm
    • Get inWords by tokenizing newInTerm
    • Go through each startIndex of inWords
      • Get curTerm: the words of inWords from startIndex to the end
      • Find branchMatches
        • Normalize the input term:
          • SynonymMapNorm
          • Add " $_END" (the END node)
        • Tokenize normalized term into inWords as a Vector<String>
        • Set the curNode to ROOT node
        • Init Vector branchMatches
        • Go through the inWords
          • Initialize curWordNode from the curWord
          • Get curChilds from curNode
          • Check if curChilds has the END node
            • Yes => add the branch term (the words matched so far) to branchMatches
          • Check if curChilds contains curWordNode
            • Yes => update curNode to that child
            • No => no match (false), break
      • Add branchMatches to matchTerms
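
The steps above can be sketched in Python as follows. This is a hedged illustration, not the tool's own code: the nested-dict trie and the function names are assumptions, and `toy_norm` is a hypothetical stand-in for SynonymMapNorm (whose real behavior is not specified here) that only lowercases and maps a few hand-listed inflected forms.

```python
END = "$_END"  # END-node marker, as named in the algorithm steps

# Hypothetical stand-in for SynonymMapNorm: lowercase plus a tiny lookup table.
TOY_NORM = {"dogs": "dog", "cats": "cat"}

def toy_norm(term):
    return " ".join(TOY_NORM.get(w, w) for w in term.lower().split())

def build_trie(terms):
    """Word-level trie as nested dicts; an END key marks a complete term."""
    root = {}
    for term in terms:
        node = root
        for word in term.split():
            node = node.setdefault(word, {})
        node[END] = {}
    return root

def find_branch_matches(root, words, start):
    """Collect every corpus term that begins at words[start]."""
    matches = []
    node = root
    # The trailing END sentinel lets a term ending on the last word be detected.
    for offset, word in enumerate(words[start:] + [END]):
        if END in node:  # a complete term ends just before the current word
            term = " ".join(words[start:start + offset])
            matches.append((term, start, start + offset))
        child = node.get(word)
        if child is None:  # no branch continues with this word: stop
            break
        node = child
    return matches

def find_sub_terms(root, term, norm=toy_norm):
    """Return (matched term, start index, end index) for all sub-terms."""
    words = norm(term).split()
    match_terms = []
    for start in range(len(words)):
        match_terms.extend(find_branch_matches(root, words, start))
    return match_terms

CORPUS = ["dog", "canine", "cat", "feline", "k9", "bull dog",
          "dog and cat", "pets", "puppy and kitty"]

for term, start, end in find_sub_terms(build_trie(CORPUS), "Who let dogs and CAT out"):
    print(f"{term}|{start}|{end}")
# prints:
# dog|2|3
# dog and cat|2|5
# cat|4|5
```

Checking for the END node before following the current word is what allows one pass from a given start index to report both "dog" and the longer "dog and cat".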