STMT

UMLS-Core: Sub-Term

Descriptions:
- Find all terms stored in a Trie Tree (synonyms) that occur within the input term
- Each matched term (a sub-term of the input term) is returned with its starting and ending word indexes in the input term

Examples - Test Cases:

Terms in corpus:

Terms
dog
canine
cat
feline
k9
bull dog
dog and cat
pets
puppy and kitty

Input Term:
Who let dogs and CAT out

Go through terms from the normalized input term "who let dog and cat out":

i	curTerm	branchMatches	matchTerms
0	who let dog and cat out
1	let dog and cat out
2	dog and cat out	dog, dog and cat	dog, dog and cat
3	and cat out		dog, dog and cat
4	cat out	cat	dog, dog and cat, cat
5	out		dog, dog and cat, cat

Outputs:
Return matched term | start index | end index:
- dog|2|3
- dog and cat|2|5
- cat|4|5
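
The indexes in the outputs above are word positions in the normalized input, and the end index appears to be exclusive (start index plus the number of words in the matched term). The following is a minimal sketch of that arithmetic, not the tool's own code; the function name `format_match` is hypothetical and the exclusive-end convention is inferred from the example only.

```python
def format_match(term, start):
    """Format one match as 'term|start|end'.

    Assumption: 'end' is exclusive, i.e. start + number of words in the term,
    which is what the example outputs suggest.
    """
    end = start + len(term.split())
    return f"{term}|{start}|{end}"


in_words = "who let dog and cat out".split()

# (matched term, start word index) pairs taken from the example table
matches = [("dog", 2), ("dog and cat", 2), ("cat", 4)]

for term, start in matches:
    print(format_match(term, start))
# dog|2|3
# dog and cat|2|5
# cat|4|5

# With an exclusive end index, slicing the input words recovers the term.
print(" ".join(in_words[2:5]))  # dog and cat
```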

[Figure: Trie Tree]
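
As a rough illustration of the structure, the corpus terms from the example could be stored word by word in a trie, with an END node marking where a complete term stops. This sketch assumes nested dictionaries as nodes; `build_trie` and the `END` constant are hypothetical names, and only the "$_END" marker itself comes from the algorithm description.

```python
END = "$_END"  # END-node marker, as named in the algorithm description


def build_trie(terms):
    """Store each term word by word in nested dicts; END marks a complete term."""
    root = {}
    for term in terms:
        node = root
        for word in term.lower().split():
            node = node.setdefault(word, {})  # reuse or create the child node
        node[END] = {}
    return root


corpus = ["dog", "canine", "cat", "feline", "k9", "bull dog",
          "dog and cat", "pets", "puppy and kitty"]
trie = build_trie(corpus)

print(sorted(trie))         # first words of all corpus terms
print(sorted(trie["dog"]))  # ['$_END', 'and']
```

The node for "dog" has two children: the END node (because "dog" is a term on its own) and "and" (because "dog and cat" continues along the same branch).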

Algorithm:
- Init Vector<String> matchTerms
- Normalize the input term to newInTerm with SynonymMapNorm
- Get inWords by tokenizing newInTerm
- Go through each startIndex of inWords
  - Get curTerm from the words of inWords starting at startIndex
  - Find branchMatches
    - Normalize the input term:
      - SynonymMapNorm
      - Add " $_END" (the END node)
    - Tokenize normalized term into inWords as a Vector<String>
    - Set the curNode to ROOT node
    - Init Vector branchMatches
    - Go through the inWords
      - Initialize curWordNode from curWord
      - Get curChilds from curNode
      - Check if curChilds has the END node
        - yes => add the branch term to branchMatches
      - Check if curChilds contains curWordNode
        - yes => update curNode to that child
        - no => no further match (false), break
  - Add branchMatches to matchTerms
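
The steps above can be sketched in a few lines of Python. This is an illustrative reading of the algorithm under stated assumptions, not the tool's implementation: nodes are nested dictionaries, `normalize` is only a stand-in for SynonymMapNorm (it lowercases and collapses whitespace, nothing more), and all function names are hypothetical. Because the stand-in does not handle inflections, the example uses the already normalized input from the test case.

```python
END = "$_END"  # END-node marker from the algorithm


def normalize(term):
    # Stand-in for SynonymMapNorm (assumption): lowercase, collapse whitespace.
    return " ".join(term.lower().split())


def build_trie(terms):
    """Store each term word by word in nested dicts; END marks a complete term."""
    root = {}
    for term in terms:
        node = root
        for word in normalize(term).split():
            node = node.setdefault(word, {})
        node[END] = {}
    return root


def find_branch_matches(root, cur_words):
    """Return every stored term that is a word-wise prefix of cur_words."""
    words = cur_words + [END]  # appended so a match ending on the last word is seen
    node = root                # curNode starts at ROOT
    matches = []
    for depth, word in enumerate(words):
        if END in node:        # a complete term ends at the current node
            matches.append(" ".join(words[:depth]))
        if word in node:       # follow the branch
            node = node[word]
        else:                  # no child for this word: stop
            break
    return matches


def find_sub_terms(root, in_term):
    """Slide the start index over the input and collect (term, start, end)."""
    in_words = normalize(in_term).split()
    match_terms = []
    for start in range(len(in_words)):
        for term in find_branch_matches(root, in_words[start:]):
            match_terms.append((term, start, start + len(term.split())))
    return match_terms


corpus = ["dog", "canine", "cat", "feline", "k9", "bull dog",
          "dog and cat", "pets", "puppy and kitty"]
trie = build_trie(corpus)

for term, start, end in find_sub_terms(trie, "who let dog and cat out"):
    print(f"{term}|{start}|{end}")
# dog|2|3
# dog and cat|2|5
# cat|4|5
```

Appending the END token to the word list is what allows a term that ends exactly at the last input word to be reported: the END check for a node runs at the start of the next iteration, so one extra iteration is needed after the final word.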

Sub-Term Mapping Tools