SPT - Pattern Permutation Design
I. Introduction
This page describes the algorithm used to find all patterns permutation of sub-term from a given input. A pattern is one of possible synonym permutation (please see the example for details).
II. Algorithm
- Find all sub-terms of the input that has synonyms
- Sort by the order of start index, then end index
- subTerms: terms has synonyms
- Find all patterns
- parameters:
- inWordIndex: the index of InWords
- subTermObjIndex: the index of SubTerm Objs
- subTermObj: SubTerm object includes: sub term|start index|end Index
- startIndex: start index of SubTerm object[SubTerm Index]
- prevStartIndex: previous start index of SubTerm Object[SubTerm Index-1]
- patterns: pattern of terms|synonyms
- baseTermsWordCount: word count of the base terms in the pattern
- Initiation
- inWordIndex = 0
- subTermIndex = 0
- patterns = new Vector<Pattern>();
- Go through each InWord (word from input) and perform following actions:
- Go through each subTerm (terms have synonyms)
- CASE-I: current inWord is not a subTerm (starting word)
Conditions | Actions
|
---|
- subTermObjIndex >= subTerms.size() or
current inWord is after the last subTerms
- inWordIndex != startIndex
current inWord is not a subTerm
|
- add inWord (no synonyms) to all patterns
- move to next inWord (inWordIndex++)
|
- CASE-II: current inWord is a subTerm (starting word)
- subTermObjIndex < subTerms.size() and
- inWordIndex == startIndex
- instantiate synonyms with current subTermObj
-
CASE | Conditions | Actions
|
---|
II.1 |
- subTermObj.GetEndIndex()-1 == subTermObj.GetStartIndex()
=> the base form of subTerm is a single word
=> this word must be the first subTerm starting with current inWord (sorted)
|
- add synonyms to all patterns
|
---|
II.2 |
- subTermObj.GetEndIndex()-1 != subTermObj.GetStartIndex()
=> base term of subTerm is a multi-words
|
- add inWord to all patterns
- get last patterns
- get all patterns with last element removed
- add to lastPatterns if
- startIndex == totalBaseTermsWordCount
=>same position for subTerm replacement
- pattern does not exist in lastPatterns
- add synonyms to all lastPatterns
- add all lastPatterns to patterns
|
---|
- move to next inWord (inWordIndex++) if current inWord is not the next subTerm (startIndex != nextStartIndex)
- move to next subTerm (subTermObjIndex++) if subTerm is not the last one (subTermObjIndex < subTerms.size( ))
- Add synonyms (Vector<String> with a single element, single word) to patterns
Condition | Action
|
---|
patterns.size() == 0 =>no pattern exist, first word |
- instantiate a new pattern
- add synonyms to this new pattern
- add this new pattern to patterns
|
patterns.size() > 0 pattern exist, not the first word |
- go through all pattern
- add synonyms to each pattern if the last element does not contains the subTerm (of this single inWord)
|
III. Java Classes & Method
- Pattern.java: a Java class for pattern object
- PatternPermutation.java: a Java class for pattern permutation
public Vector<pattern> FindPatternPermutation(String inTerm,
SynonymsMapping synonymsMapping, TrieTreeMatch trieTreeMatch,
boolean recursiveFlag)
IV. Examples
- Input: Synonym Rules
word | synonym
|
---|
dog | canine
|
cat | feline
|
canine | K9
|
K9 | bull dog
|
Dog and cat | pets
|
puppy and kitty | pets
|
- Input: Synonym Terms
Terms
|
---|
dog
|
canine
|
cat
|
feline
|
k9
|
bull dog
|
dog and cat
|
pets
|
puppy and kitty
|
- Input Term:
Who let dog and cat out
InWord Index | 0 | 1 | 2 | 3 | 4 | 5
|
---|
InWord | who | let | dog | and | cat | out
|
---|
- Permutation Algorithm Example
- Find all sub-terms of the input that has synonyms from
previous section
- Sort by the order of start index, then end index
-
SubTerm Index | Sub Term | Start Index | End Index
|
---|
0 | dog | 2 | 3
|
1 | dog and cat | 2 | 5
|
2 | cat | 4 | 5
|
- Find all patterns
InWord Index | InWord | SubTerm Index | SubTerm Obj | Start Index | Prev Start Index | Case | Patterns Obj
|
---|
0 | who | 0 | dog|2|3 | 2 | -1 | I |
|
---|
1 | let | 0 | dog|2|3 | 2 | -1 | I |
|
---|
2 | dog | 0 | dog|2|3 | 2 | -1 | II.1 | - pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- [2]
2 | dog | 1 | dog and cat|2|5 | 2 | 2 | II.2 |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- [2]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- [5]
3 | and | 2 | cat|4|5 | 4 | 2 | I |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- and|
- [4]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- [5]
4 | cat | 2 | cat|4|5 | 4 | 2 | II.1 |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- and|
- cat|feline|
- [5]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- [5]
5 | out | 3 | null | -1 | 4 | I |
- pattern-1:
- who
- let|
- dog|canine|k9|bull dog|
- and|
- cat|feline|
- out
- [6]
- pattern-2:
- who
- let|
- dog and cat|pets|puppy and kitty|
- out
- [6]
|
---|
|
---|
|
---|
|
---|
|
---|
- Outputs:
Pattern 1
|
---|
word|synonyms
|
---|
who|
|
let|
|
dog|canine|k9|bull dog|
|
and|
|
cat|feline|
|
out|
|
Pattern 2
|
---|
word|synonyms
|
---|
who|
|
let|
|
dog and cat|pets|puppy and kitty|
|
out|
|