Generate Uninflected Spelling Variants
- Short Description:
Generate the uninflected forms of all known spelling variants.
- Full Description:
This flow component returns the uninflected forms (base forms) of all spelling variants of the input term.
The results are sorted twice: the spelling variants are sorted first, and then the uninflected terms are sorted.
With the -m option, the output also includes the EUI of each uninflected form in its specific category.
- Difference:
- The database table has changed, so the results differ. The main difference is that the uninflected form is now separated from infinitive, present, positive, etc.
- The C version concatenates categories when the output terms are the same. In other words, one output may include several categories. Thus, the EUI is not unique when the -m option is used.
- The -m option may fail to find the EUI because GetEui matches the uninflected term case-sensitively (the DB needs one more field for the uninflected term).
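To illustrate the last point, the following Java sketch shows how a case-sensitive lookup can miss an EUI, and how an additional case-normalized key could avoid the miss. The class, its methods, and the in-memory maps are hypothetical stand-ins for the LVG database and GetEui. The lower-cased key is only one possible reading of the extra field mentioned above. The terms, categories, and EUIs are taken from the resume example on this page.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; not LVG code. Shows a case-sensitive EUI lookup miss. */
class EuiLookupSketch {
    // Keyed by "term|category" exactly as stored (stand-in for the current table).
    static final Map<String, String> EXACT = new HashMap<>();
    // Keyed by "lower-cased term|category" (stand-in for an extra normalized field).
    static final Map<String, String> BY_LOWERCASE = new HashMap<>();

    static void add(String base, int category, String eui) {
        EXACT.put(base + "|" + category, eui);
        BY_LOWERCASE.put(base.toLowerCase(Locale.ROOT) + "|" + category, eui);
    }

    static {
        add("resume", 128, "E0053099");
        add("resume", 1024, "E0053098");
    }

    /** Case-sensitive lookup: returns null when the case differs from the stored term. */
    static String exactLookup(String term, int category) {
        return EXACT.get(term + "|" + category);
    }

    /** Lookup on the normalized key: finds the EUI regardless of case. */
    static String caseInsensitiveLookup(String term, int category) {
        return BY_LOWERCASE.get(term.toLowerCase(Locale.ROOT) + "|" + category);
    }

    public static void main(String[] args) {
        System.out.println(exactLookup("Resume", 128));           // miss
        System.out.println(caseInsensitiveLookup("Resume", 128)); // hit
    }
}
```

The category stays part of the key in both maps because the same term can carry a different EUI in each category.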
- Features:
- Generate the uninflected form for all spelling variants.
- Symbol:
e
- Examples:
shell> lvg -f:e -m
coloring
coloring|coloring|128|1|e|1|E0791541|
coloring|color|1024|1|e|1|E0017903|
coloring|colouring|128|1|e|1|E0791541|
coloring|colour|1024|1|e|1|E0017903|
resume
resume|resume|128|1|e|1|E0053099|
resume|resume|1024|1|e|1|E0053098|
resume|resumé|128|1|e|1|E0053099|
resume|résumé|128|1|e|1|E0053099|
ozena
ozena|ozaena|128|1|e|1|E0044939|
ozena|ozena|128|1|e|1|E0044939|
ozena|ozoena|128|1|e|1|E0044939|
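The output lines above can be split into fields with a few lines of Java. This is a minimal sketch and not part of LVG: the class name LvgOutputLine and the field names are assumptions read off the examples shown (input term, output term, category, inflection, flow history, flow number, and, with -m, the EUI).

```java
/** Hypothetical holder for one pipe-delimited output line; not part of LVG. */
class LvgOutputLine {
    final String input;        // field 1: input term
    final String output;       // field 2: uninflected spelling variant
    final long category;       // field 3: category value (128 and 1024 in the examples)
    final long inflection;     // field 4: inflection value (1 in the examples)
    final String flowHistory;  // field 5: flow symbol(s), "e" here
    final int flowNumber;      // field 6: flow number
    final String eui;          // field 7: EUI when -m is used, otherwise empty

    LvgOutputLine(String input, String output, long category, long inflection,
                  String flowHistory, int flowNumber, String eui) {
        this.input = input;
        this.output = output;
        this.category = category;
        this.inflection = inflection;
        this.flowHistory = flowHistory;
        this.flowNumber = flowNumber;
        this.eui = eui;
    }

    static LvgOutputLine parse(String line) {
        // A limit of -1 keeps trailing empty strings, so the field count is
        // stable even though every line ends with "|".
        String[] f = line.split("\\|", -1);
        if (f.length < 6) {
            throw new IllegalArgumentException("Expected at least 6 fields: " + line);
        }
        String eui = f.length > 6 ? f[6] : "";
        return new LvgOutputLine(f[0], f[1], Long.parseLong(f[2]), Long.parseLong(f[3]),
                                 f[4], Integer.parseInt(f[5]), eui);
    }

    public static void main(String[] args) {
        LvgOutputLine item = parse("coloring|color|1024|1|e|1|E0017903|");
        System.out.println(item.output + " " + item.category + " " + item.eui);
    }
}
```

A line produced without -m ends after the flow number, in which case the sketch leaves the EUI as an empty string.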
Implementation Logic:
- Generate spelling variants (first sort).
- Uninflect all spelling variants (second sort).
- Retrieve EUI for each uninflected form of all spelling variants.
- Remove duplicate output LexItems.
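The steps above can be sketched in Java as follows. This is an illustration under stated assumptions, not the code in the LVG source: the class name, the Base helper, and the three in-memory maps are hypothetical stand-ins for the LVG database tables, and the sketch keeps table order rather than reproducing LVG's two sorts. The data is copied from the coloring example on this page. Emitting an empty EUI field when no EUI is found is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the four steps; not the LVG implementation. */
class BaseSpellingVariantsSketch {
    /** An uninflected term together with its category value. */
    static final class Base {
        final String term;
        final int category;
        Base(String term, int category) { this.term = term; this.category = category; }
    }

    // Stand-in tables (assumptions), filled from the coloring example.
    static final Map<String, List<String>> SPELLING_VARIANTS = new HashMap<>();
    static final Map<String, List<Base>> BASE_FORMS = new HashMap<>();
    static final Map<String, String> EUI = new HashMap<>();

    static {
        SPELLING_VARIANTS.put("coloring", Arrays.asList("coloring", "colouring"));
        BASE_FORMS.put("coloring", Arrays.asList(new Base("coloring", 128), new Base("color", 1024)));
        BASE_FORMS.put("colouring", Arrays.asList(new Base("colouring", 128), new Base("colour", 1024)));
        EUI.put("coloring|128", "E0791541");
        EUI.put("colouring|128", "E0791541");
        EUI.put("color|1024", "E0017903");
        EUI.put("colour|1024", "E0017903");
    }

    static List<String> mutate(String input, boolean withEui) {
        // Step 4: a LinkedHashSet drops duplicate lines and keeps first-seen order.
        Set<String> out = new LinkedHashSet<>();
        // Step 1: spelling variants of the input (the input itself if none are known).
        List<String> variants =
            SPELLING_VARIANTS.getOrDefault(input, Collections.singletonList(input));
        for (String variant : variants) {
            // Step 2: uninflect each spelling variant.
            for (Base b : BASE_FORMS.getOrDefault(variant, Collections.<Base>emptyList())) {
                // Inflection 1, flow "e", and flow number 1 mirror the example output.
                String line = input + "|" + b.term + "|" + b.category + "|1|e|1|";
                if (withEui) {
                    // Step 3: EUI lookup by uninflected term and category.
                    String eui = EUI.get(b.term + "|" + b.category);
                    line += (eui == null ? "" : eui) + "|";
                }
                out.add(line);
            }
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        for (String line : mutate("coloring", true)) {
            System.out.println(line);
        }
    }
}
```

With the stand-in data, mutate("coloring", true) yields the four lines shown for coloring in the examples.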
Source Code: ToBaseSpellingVariants.java
Hierarchy: Object -> Transformation -> ToBaseSpellingVariants