public class ToStripMapUnicode extends Transformation implements java.lang.Cloneable
This flow performs the final tune-up in normalizing Unicode to ASCII and is meant to run after the other Unicode normalization flows.
Users may define their own mapping in $LVG/data/Unicode/nonStripMap.data.
History:

Fields inherited from class Transformation: NO_MUTATE_INFO, UPDATE

| Constructor and Description |
|---|
| ToStripMapUnicode() |

| Modifier and Type | Method and Description |
|---|---|
| static java.util.Hashtable<java.lang.Character,java.lang.String> | GetNonStripMapFromFile(Configuration config): Reads the non-strip map Unicode list from the configuration file. |
| static void | main(java.lang.String[] args): A unit test driver for this flow component. |
| static java.util.Vector<LexItem> | Mutate(LexItem in, java.util.Hashtable<java.lang.Character,java.lang.String> nonStripMap, boolean detailsFlag, boolean mutateFlag): Performs the mutation of this flow component. |
| static java.lang.String | StripMapUnicodeToAscii(char inChar, java.util.Hashtable<java.lang.Character,java.lang.String> nonStripMap): Strips or maps a Unicode character to ASCII. |

Methods inherited from class Transformation: GetTestStr, PrintResult, PrintResults, UpdateLexItem, UpdateLexItem, UpdateLexItem

public static java.util.Vector<LexItem> Mutate(LexItem in, java.util.Hashtable<java.lang.Character,java.lang.String> nonStripMap, boolean detailsFlag, boolean mutateFlag)
Parameters:
in - a LexItem as the input for this flow component
nonStripMap - a hash table containing the non-strip map Unicode characters
detailsFlag - a boolean flag for processing detail information
mutateFlag - a boolean flag for processing mutate information

public static java.util.Hashtable<java.lang.Character,java.lang.String> GetNonStripMapFromFile(Configuration config)
Parameters:
config - Configuration object

public static java.lang.String StripMapUnicodeToAscii(char inChar, java.util.Hashtable<java.lang.Character,java.lang.String> nonStripMap)
Parameters:
inChar - an input character
nonStripMap - a hash table containing the Unicode characters

public static void main(java.lang.String[] args)
Parameters:
args - arguments
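
The strip-or-map step that StripMapUnicodeToAscii describes can be illustrated with a minimal, self-contained sketch. This is not the LVG source. It assumes that ASCII characters pass through unchanged, that a non-ASCII character found in the map is replaced by its mapped string, and that any other non-ASCII character is stripped. The class name StripMapSketch, its two helper methods, and the sample map entries are hypothetical; a real map would come from GetNonStripMapFromFile(Configuration config).

```java
import java.util.Hashtable;

/** Hypothetical stand-in for the strip-or-map step; not the LVG implementation. */
class StripMapSketch {

    /**
     * Assumed behavior: ASCII is returned as-is, a mapped non-ASCII character
     * is replaced by its mapping, and anything else becomes "" (stripped).
     */
    static String stripOrMap(char inChar, Hashtable<Character, String> nonStripMap) {
        if (inChar < 128) {
            return String.valueOf(inChar);
        }
        String mapped = nonStripMap.get(inChar);
        return (mapped != null) ? mapped : "";
    }

    /** Applies stripOrMap to each UTF-16 char of the input, in order. */
    static String stripOrMapAll(String in, Hashtable<Character, String> nonStripMap) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            out.append(stripOrMap(in.charAt(i), nonStripMap));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative entries only; these are not taken from nonStripMap.data.
        Hashtable<Character, String> nonStripMap = new Hashtable<>();
        nonStripMap.put('\u00B5', "u");   // MICRO SIGN
        nonStripMap.put('\u2192', "->");  // RIGHTWARDS ARROW

        // U+2603 (SNOWMAN) has no mapping, so it is stripped.
        System.out.println(stripOrMapAll("5 \u00B5g \u2192 ok\u2603", nonStripMap));
        // prints "5 ug -> ok"
    }
}
```

Because the sketch works on single char values, as the documented signature does, a character outside the Basic Multilingual Plane is seen as two surrogate halves and both are stripped unless they are mapped individually.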