public class ToNormalize extends Transformation implements java.lang.Cloneable
History:
ToMapSymbolToAscii, ToRemoveGenitive, ToRemoveS, ToReplacePunctuationWithSpace, ToStripStopWords, ToLowerCase, ToUninflectWords, ToCitation, ToUnicodeCoreNorm, ToStripMapUnicode, ToSortWordsByOrder
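The component names listed above suggest a chain of string transformations. The following is a minimal, self-contained sketch of that idea, assuming the steps run in the listed order. It is not LVG code: `NormalizeSketch`, its `normalize` method, and the sample stop word list are hypothetical. The steps that depend on LVG data (uninflection, citation, and the Unicode mappings) are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of the LVG API.
final class NormalizeSketch {

    public static String normalize(String term, Set<String> stopWords) {
        // Remove genitive: drop a trailing 's from words (Femur's -> Femur).
        String s = term.replaceAll("'s\\b", "");
        // Replace punctuation with space.
        s = s.replaceAll("\\p{Punct}", " ");

        List<String> words = new ArrayList<>();
        for (String w : s.trim().split("\\s+")) {
            if (w.isEmpty()) {
                continue;
            }
            String lower = w.toLowerCase(Locale.ROOT);
            // Strip stop words (compared case-insensitively), then keep the lowercased form.
            if (!stopWords.contains(lower)) {
                words.add(lower);
            }
        }
        // Sort words so that word order does not affect the result.
        Collections.sort(words);
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // The stop word list here is an assumption made for this example.
        Set<String> stopWords = new HashSet<>(Arrays.asList("of", "the", "and"));
        System.out.println(normalize("Fracture of the Femur's Head", stopWords));
        // prints "femur fracture head"
    }
}
```

Because the words are sorted at the end, inputs that differ only in case, punctuation, stop words, or word order map to the same string in this sketch.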
NO_MUTATE_INFO, UPDATE
Constructor and Description |
---|
ToNormalize() |
Modifier and Type | Method and Description |
---|---|
static void | main(java.lang.String[] args): A unit test driver for this flow component. |
static java.util.Vector<LexItem> | Mutate(LexItem in, int maxTerm, java.util.Vector<java.lang.String> stopWords, java.sql.Connection conn, RamTrie trie, java.util.Hashtable<java.lang.Character,java.lang.String> symbolMap, java.util.Hashtable<java.lang.Character,java.lang.String> unicodeMap, java.util.Hashtable<java.lang.Character,java.lang.String> ligatureMap, java.util.Hashtable<java.lang.Character,java.lang.Character> diacriticMap, java.util.Hashtable<java.lang.Character,java.lang.String> nonStripMap, RTrieTree removeSTree, boolean detailsFlag, boolean mutateFlag): Performs the mutation of this flow component. |
GetTestStr, PrintResult, PrintResults, UpdateLexItem, UpdateLexItem, UpdateLexItem
public static java.util.Vector<LexItem> Mutate(LexItem in, int maxTerm, java.util.Vector<java.lang.String> stopWords, java.sql.Connection conn, RamTrie trie, java.util.Hashtable<java.lang.Character,java.lang.String> symbolMap, java.util.Hashtable<java.lang.Character,java.lang.String> unicodeMap, java.util.Hashtable<java.lang.Character,java.lang.String> ligatureMap, java.util.Hashtable<java.lang.Character,java.lang.Character> diacriticMap, java.util.Hashtable<java.lang.Character,java.lang.String> nonStripMap, RTrieTree removeSTree, boolean detailsFlag, boolean mutateFlag) throws java.sql.SQLException
Parameters:
in - a LexItem as the input for this flow component
maxTerm - the maximum number of permutation terms (uninflect)
stopWords - a Vector of String holding the stop words list
conn - LVG database connection
trie - LVG RAM trie
symbolMap - a hash table containing the Unicode symbol mapping
unicodeMap - a hash table containing the Unicode mapping
ligatureMap - a hash table containing the mapping of ligatures
diacriticMap - a hash table containing the mapping of diacritics
nonStripMap - a hash table containing the non-strip Unicode map
removeSTree - a reverse trie tree of removeS pattern rules
detailsFlag - a boolean flag for processing details information
mutateFlag - a boolean flag for processing mutate information
Throws:
java.sql.SQLException - if errors occur while connecting to the LVG database.
See Also:
DbBase, RamTrie
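Several of the Mutate parameters are character-keyed hash tables. As a rough illustration of how tables of these types can be applied to a string, here is a self-contained sketch. `CharMapSketch` and both helper methods are hypothetical, and the table entries are made up for the example; the real tables and the way LVG applies them may differ.

```java
import java.util.Hashtable;

// Hypothetical helpers for illustration only; not part of the LVG API.
final class CharMapSketch {

    // Replaces each character that has an entry in a Character -> String table.
    static String applySymbolMap(String in, Hashtable<Character, String> symbolMap) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            String mapped = symbolMap.get(c);
            if (mapped != null) {
                out.append(mapped);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Replaces each character that has an entry in a Character -> Character table.
    static String applyDiacriticMap(String in, Hashtable<Character, Character> diacriticMap) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            Character mapped = diacriticMap.get(c);
            out.append(mapped != null ? mapped.charValue() : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Example entries are assumptions, not taken from LVG data.
        Hashtable<Character, String> symbolMap = new Hashtable<>();
        symbolMap.put('\u00A9', "(c)");   // copyright sign
        Hashtable<Character, Character> diacriticMap = new Hashtable<>();
        diacriticMap.put('\u00E9', 'e');  // e with acute accent

        System.out.println(applySymbolMap("\u00A9 2020", symbolMap));            // prints "(c) 2020"
        System.out.println(applyDiacriticMap("r\u00E9sum\u00E9", diacriticMap)); // prints "resume"
    }
}
```

The `Character -> String` tables allow one character to expand into several, while the `Character -> Character` diacritic table is a strict one-to-one substitution, which matches the value types in the Mutate signature.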
public static void main(java.lang.String[] args)
Parameters:
args - arguments