Map Unicode to ASCII
This flow converts Unicode characters to ASCII characters. Some Unicode characters cannot be converted to ASCII by Unicode normalization algorithms such as stripping diacritics or splitting ligatures. These characters are normalized to ASCII by table lookup mapping. The mapping table is defined in the file $LVG/data/Unicode/unicodeMap.data. Users may add to or modify the default entries in this file for their applications. Please refer to the design documents of Map Unicode to ASCII for details.
When the -m flag is specified, the detailed mutate operations for each character of the input string are added after the standard set of lvg output fields. This flow uses two basic mutate operations to normalize Unicode to ASCII, as shown in the following table:
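
To illustrate why a lookup table is needed, the following minimal sketch uses Python's standard `unicodedata` module (not LVG's own code) to strip diacritics through NFKD decomposition. The helper names `strip_diacritics` and `is_ascii` are hypothetical and exist only for this illustration. Characters such as ø and Ƽ have no Unicode decomposition, and ⅝ decomposes to a sequence containing the non-ASCII fraction slash (U+2044), so none of them reach ASCII through normalization alone.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose with NFKD, then drop combining marks (illustrative helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_ascii(text: str) -> bool:
    """Return True when every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


if __name__ == "__main__":
    for sample in ["é", "ø", "Ƽ", "⅝"]:
        result = strip_diacritics(sample)
        print(sample, "->", result, "| ASCII:", is_ascii(result))
```

In this sketch, é is reduced to plain e, while ø, Ƽ, and ⅝ still contain non-ASCII characters after normalization. These are the kinds of characters that the table lookup mapping handles.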
Operations | Descriptions | Example |
---|---|---|
NO | No operation | ø -> ø |
MP | Table lookup mapping | Ƽ -> 5 |
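
The two operations in the table above can be sketched as a per-character lookup. This is a minimal illustration, not LVG's implementation. The dictionary `UNICODE_TO_ASCII` is a hypothetical in-memory stand-in for unicodeMap.data, holding only the two mappings that appear as examples in this document. The function name `map_unicode_to_ascii` and its return shape are likewise assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the mapping table; the real table lives in
# $LVG/data/Unicode/unicodeMap.data and its file format is not shown here.
UNICODE_TO_ASCII: Dict[str, str] = {
    "\u01BC": "5",    # Ƽ -> 5
    "\u215D": "5/8",  # ⅝ -> 5/8
}


def map_unicode_to_ascii(
    text: str, table: Dict[str, str] = UNICODE_TO_ASCII
) -> Tuple[str, List[str]]:
    """Map each character through the table and record the operation used.

    Returns the mapped string and one operation code per input character:
    "MP" when the character was found in the table, "NO" otherwise.
    """
    mapped: List[str] = []
    operations: List[str] = []
    for ch in text:
        if ch in table:
            mapped.append(table[ch])
            operations.append("MP")
        else:
            mapped.append(ch)  # left unchanged, as in the NO row above
            operations.append("NO")
    return "".join(mapped), operations


if __name__ == "__main__":
    print(map_unicode_to_ascii("⅝"))
    print(map_unicode_to_ascii("ø"))
```

Keeping the table as a plain dictionary mirrors the idea that users can add or modify entries without touching the mapping logic.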
None.
shell> lvg -f:q1 -m
⅝
⅝|5/8|2047|16777215|q1|1|MP|