Lexical Tools

SPT - Terms Match Design (SubTerm)

I. Introduction
This section describes the method used to find every term stored in a Trie Tree that occurs in a given input. In other words, it finds all sub-terms of the input that have synonyms.

II. Algorithm

  • Initialize Vector<String> matchTerms
  • Lowercase the input term to get newInTerm
  • Get inWords by tokenizing newInTerm
  • Go through the terms that start at each index of inWords
    • Get curTerm, the words of inWords from startIndex to the end
    • Find branchMatches for curTerm
      • Normalize curTerm:
        • Lowercase
        • Append " $_END" (the END node)
      • Tokenize the normalized term into a Vector<String> of words
      • Set curNode to the ROOT node
      • Initialize Vector branchMatches
      • Go through the words
        • Initialize curWordNode from curWord
        • Get curChilds from curNode
        • Check if curChilds has the END node
          • yes => add the branch term to branchMatches
        • Check if curChilds contains curWordNode
          • yes => update curNode to that child
          • no => no match (false), break
    • Add branchMatches to matchTerms
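
The steps above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the actual TrieTreeMatch.java source: the class name TrieMatchSketch, the Node type, and the add and findBranchMatches helpers are hypothetical, a List is used where the design mentions a Vector, and each result is formatted as term|start index|end index with an exclusive end index. Instead of appending " $_END" to the input, the sketch treats the position just past the last word as the END word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of word-level trie matching; not the real TrieTreeMatch class.
class TrieMatchSketch {
    static final String END = "$_END";

    // One trie node; children are keyed by word.
    static final class Node {
        final Map<String, Node> children = new HashMap<>();
    }

    final Node root = new Node();

    // Insert a term word by word and close it with an END child.
    void add(String term) {
        Node cur = root;
        for (String w : term.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            cur = cur.children.computeIfAbsent(w, k -> new Node());
        }
        cur.children.putIfAbsent(END, new Node());
    }

    // Collect the branch matches for every start index of the input.
    List<String> findMatchTerms(String inTerm) {
        List<String> matchTerms = new ArrayList<>();
        String[] inWords = inTerm.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int start = 0; start < inWords.length; start++) {
            matchTerms.addAll(findBranchMatches(inWords, start));
        }
        return matchTerms;
    }

    // Walk one branch of the trie, starting at inWords[start].
    private List<String> findBranchMatches(String[] inWords, int start) {
        List<String> branchMatches = new ArrayList<>();
        Node cur = root;
        // i == inWords.length stands in for the appended END word.
        for (int i = start; i <= inWords.length; i++) {
            // An END child means the words consumed so far form a stored term.
            if (i > start && cur.children.containsKey(END)) {
                String term = String.join(" ", Arrays.copyOfRange(inWords, start, i));
                branchMatches.add(term + "|" + start + "|" + i);
            }
            if (i == inWords.length) {
                break;
            }
            Node next = cur.children.get(inWords[i]);
            if (next == null) {
                break; // no child for the current word: stop this branch
            }
            cur = next;
        }
        return branchMatches;
    }

    public static void main(String[] args) {
        TrieMatchSketch trie = new TrieMatchSketch();
        for (String t : new String[] {"dog", "canine", "cat", "feline", "k9",
                "bull dog", "dog and cat", "pets", "puppy and kitty"}) {
            trie.add(t);
        }
        System.out.println(trie.findMatchTerms("Who let dog and cat out"));
    }
}
```

Checking for the END child before descending is what lets a shorter term such as "dog" be reported even when the walk continues toward a longer term such as "dog and cat".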

III. Java Classes & Method

  • TrieTreeMatch.java: a Java class for matching in TrieTree
  • public Vector<String> FindMatchTerms(String inTerm)

IV. Examples

  • Synonym Rules:

    word            | synonym
    dog             | canine
    cat             | feline
    canine          | K9
    K9              | bull dog
    Dog and cat     | pets
    puppy and kitty | pets

  • Synonym Terms:

    Terms
    dog
    canine
    cat
    feline
    k9
    bull dog
    dog and cat
    pets
    puppy and kitty

  • Input Term:
    Who let dog and cat out
    • LowerCase: who let dog and cat out
    • Go through terms from "who let dog and cat out"
      i | curTerm                 | branchMatches    | matchTerms
      0 | who let dog and cat out |                  |
      1 | let dog and cat out     |                  |
      2 | dog and cat out         | dog, dog and cat | dog, dog and cat
      3 | and cat out             |                  | dog, dog and cat
      4 | cat out                 | cat              | dog, dog and cat, cat
      5 | out                     |                  | dog, dog and cat, cat

  • Output:

    Returned matched terms, as term|start index|end index (word positions in the input; the end index is exclusive):

    • dog|2|3
    • dog and cat|2|5
    • cat|4|5

  • Trie Tree
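
    As a rough illustration of how the trie for this example could be assembled, the sketch below collects the lowercased, de-duplicated terms from both columns of the synonym rules, inserts them word by word, and renders the result as an indented outline. The class TrieBuildSketch and its methods collectTerms, build, and render are hypothetical names, not part of the Lexical Tools API. The assumption that Synonym Terms are the unique lowercased rule entries in first-seen order is inferred from the lists above, and children are kept in a sorted map only so the printed outline is deterministic.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: builds and prints a word-level trie for the example.
class TrieBuildSketch {
    static final String END = "$_END";

    // One trie node; a TreeMap keeps children sorted for stable output.
    static final class Node {
        final Map<String, Node> children = new TreeMap<>();
    }

    // Assumed derivation of Synonym Terms: unique lowercased entries, first-seen order.
    static List<String> collectTerms(String[][] rules) {
        Set<String> terms = new LinkedHashSet<>();
        for (String[] rule : rules) {
            for (String side : rule) {
                terms.add(side.toLowerCase(Locale.ROOT));
            }
        }
        return new ArrayList<>(terms);
    }

    // Insert each term word by word and close it with an END child.
    static Node build(List<String> terms) {
        Node root = new Node();
        for (String term : terms) {
            Node cur = root;
            for (String w : term.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                cur = cur.children.computeIfAbsent(w, k -> new Node());
            }
            cur.children.putIfAbsent(END, new Node());
        }
        return root;
    }

    // Render the trie as an outline, two spaces of indent per level.
    static String render(Node node, String indent) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            sb.append(indent).append(e.getKey()).append('\n');
            sb.append(render(e.getValue(), indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[][] rules = {
            {"dog", "canine"}, {"cat", "feline"}, {"canine", "K9"},
            {"K9", "bull dog"}, {"Dog and cat", "pets"}, {"puppy and kitty", "pets"}
        };
        System.out.print(render(build(collectTerms(rules)), ""));
    }
}
```

    In the printed outline, the "dog" node has both a $_END child and an "and" child, which is why the input above yields both "dog" and "dog and cat" from the same start index.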