Sub-Term Mapping Tools

UMLS-Core: Corpus Tree

  • Descriptions:
    • Load all normalized synonyms (the keys of the normTerm-synonyms table) into a Trie tree

  • Examples - Test Cases:

    • Terms:

      Terms
      dog
      dog and cat
      dog and poppy
      canine
      cat
      feline
      k9
      bull dog
      pets

    • Trie Tree (from Synonym Terms):

  • Algorithm:
    • Trie Node:
      • Trie nodes are used to compose the Trie tree.
      • Each node has a key, which is a (single) word
      • Each node has a level, the top (root) level is 0
      • Each node has a list of child nodes (Childs), which is a Vector<TrieNode>

      • Use "$_ROOT" for the word of root node
      • Use "$_END" for the word of end node
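
The node described above might be sketched in Java as follows. This is a minimal illustration, not the tool's actual source: the class layout, the field names (key, level, childs), and the helper methods newRoot, findChild, and addChild are assumptions made for this example. Only the "$_ROOT" and "$_END" marker words and the use of Vector<TrieNode> come from the description.

```java
import java.util.Vector;

// Illustrative node type. Names other than "$_ROOT" and "$_END" are assumptions.
class TrieNode {
    static final String ROOT_WORD = "$_ROOT"; // word of the root node
    static final String END_WORD = "$_END";   // word of an end node

    final String key;                          // a single word
    final int level;                           // the root is level 0
    final Vector<TrieNode> childs = new Vector<>();

    TrieNode(String key, int level) {
        this.key = key;
        this.level = level;
    }

    // Creates the root node at level 0 (hypothetical helper).
    static TrieNode newRoot() {
        return new TrieNode(ROOT_WORD, 0);
    }

    // Returns the child whose key equals the given word, or null if there is none.
    TrieNode findChild(String word) {
        for (TrieNode child : childs) {
            if (child.key.equals(word)) {
                return child;
            }
        }
        return null;
    }

    // Appends a new child one level below this node and returns it.
    TrieNode addChild(String word) {
        TrieNode child = new TrieNode(word, level + 1);
        childs.add(child);
        return child;
    }
}
```

A linear scan over the Vector keeps the sketch close to the description. A map keyed by word would be a reasonable alternative for nodes with many children.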

    • Trie Tree:
      • Each branch (path) from the root node to an end node represents a term
      • All terms are stored as branches of the Trie tree

      • Tokenize input term into words
      • curNode starts from the top (root) node
      • Go through each word (curWord):
        • Init curWordNode by curWord
        • Check if the Childs of curNode contains curWordNode
          • If yes, update curNode to the matched node in Childs
          • If no, add curWordNode to Childs and update curNode to curWordNode
      • After the last word, add an end node ("$_END") to the Childs of curNode to close the branch
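
The insertion steps above can be sketched as follows. This is a hedged, self-contained illustration rather than the tool's implementation: the class name CorpusTrie, the nested Node class, and the methods findOrAdd, add, and contains are hypothetical, and whitespace tokenization is an assumption. Closing each branch with a "$_END" node follows the description of the end node, but exactly how the original code does this is not shown in the text.

```java
import java.util.Vector;

// Illustrative Trie tree. All names except "$_ROOT" and "$_END" are assumptions.
class CorpusTrie {
    static final String ROOT_WORD = "$_ROOT";
    static final String END_WORD = "$_END";

    static class Node {
        final String key;                      // a single word
        final int level;                       // the root is level 0
        final Vector<Node> childs = new Vector<>();

        Node(String key, int level) {
            this.key = key;
            this.level = level;
        }

        // Returns the child with the given word, or null if there is none.
        Node findChild(String word) {
            for (Node child : childs) {
                if (child.key.equals(word)) {
                    return child;
                }
            }
            return null;
        }
    }

    final Node root = new Node(ROOT_WORD, 0);

    // Reuses the matching child of curNode, or creates and attaches a new one.
    private static Node findOrAdd(Node curNode, String word) {
        Node match = curNode.findChild(word);
        if (match == null) {
            match = new Node(word, curNode.level + 1);
            curNode.childs.add(match);
        }
        return match;
    }

    // Adds one term as a branch: root -> word1 -> ... -> wordN -> $_END.
    void add(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return;                            // nothing to store
        }
        Node curNode = root;                   // start at the root once per term
        for (String curWord : trimmed.split("\\s+")) {
            curNode = findOrAdd(curNode, curWord);
        }
        findOrAdd(curNode, END_WORD);          // assumption: end node closes the branch
    }

    // True only if the exact term was added (its branch ends in $_END).
    boolean contains(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        Node curNode = root;
        for (String word : trimmed.split("\\s+")) {
            curNode = curNode.findChild(word);
            if (curNode == null) {
                return false;
            }
        }
        return curNode.findChild(END_WORD) != null;
    }
}
```

With the test-case terms loaded, "dog", "dog and cat", and "dog and poppy" share the same first-level "dog" node, and a prefix such as "dog and" is not reported as a stored term because no end node hangs off "and".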