Generate Inflectional Variants
Inflectional variants of terms include the singular and plural forms of nouns, the various tenses of verbs, and the positive, comparative, and superlative forms of adjectives and adverbs. By default, inflectional variants produced by facts are reported; rule-generated inflections are reported only if no such facts exist.
Inflected forms are generated by first uninflecting the input term, then retrieving all inflected forms of the uninflected form. This differs from earlier versions of LVG. In the past, if an inflected form came in, only its uninflected form was generated. If an uninflected form came in, only the inflected forms (not including the uninflected form itself) were generated.
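The two-step flow described above can be sketched as follows. This is only an illustration under stated assumptions: the tables `UNINFLECT_FACTS` and `INFLECT_FACTS` and the function `inflectional_variants` are hypothetical stand-ins, not part of LVG, and the real facts file is far larger.

```python
# Toy stand-ins for the facts file (hypothetical names and data).
UNINFLECT_FACTS = {
    "sleep": "sleep",
    "sleeps": "sleep",
    "slept": "sleep",
    "sleeping": "sleep",
}
INFLECT_FACTS = {
    "sleep": ["sleep", "sleeps", "slept", "sleeping"],
}


def inflectional_variants(term):
    """Uninflect the term, then return every inflected form of its base."""
    base = UNINFLECT_FACTS.get(term)
    if base is None:
        # A fuller implementation would fall back to rules here.
        return []
    # The base form itself is part of the result, unlike in older versions.
    return INFLECT_FACTS[base]


print(inflectional_variants("slept"))
```

Whether the input is inflected ("slept") or uninflected ("sleep"), the same full set of forms comes back, which is the behavioral change the paragraph describes.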
In prior versions of LVG, the facts covered only irregular forms from the lexicon, and the rules were relied upon to generate the regular inflectional variants. Under those circumstances, both facts and rules were reported by default, but the results were further filtered against a word list from the lexicon.
All inflected forms are now contained in the facts file, so if a term is in the lexicon, all of its inflected forms are there as well, which removes the need to generate rule-derived forms. Because of this design change, the indexed keys for inflected terms in the database must be case insensitive in order to handle input terms in any case. Case usually contributes little meaning in NLP, so this inflection flow is case insensitive, which produces more aggressive results.
Items returned from the inflectional morphology unit are now sorted by part of speech, in an order that reflects frequency in the lexicon: nouns, adjectives, verbs, adverbs, modals, and auxiliary verbs.
An additional heuristic has also been implemented within the inflectional morphology unit to limit spurious variants. If a term goes through an inflectional morphology mutation, and the term is not known to the lexicon but its rule-generated inflectional form is known to the lexicon, the variant is discarded because it is likely to be wrong.
The results are sorted by category frequency, then by length, then in case-insensitive alphabetical order.
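The heuristic above amounts to a simple membership test. The sketch below is an assumption-laden illustration: `keep_rule_variant` and the toy `lexicon` set are hypothetical, and lowercasing is used only to mirror the case-insensitive behavior mentioned earlier.

```python
def keep_rule_variant(term, variant, lexicon):
    """Return False for a rule-generated variant the heuristic would discard."""
    term_known = term.lower() in lexicon
    variant_known = variant.lower() in lexicon
    # Unknown input whose rule-generated form collides with a lexicon
    # entry is treated as a likely spurious match.
    return not (not term_known and variant_known)


lexicon = {"was", "sleep"}  # toy lexicon, not real LVG data

print(keep_rule_variant("wa", "was", lexicon))        # unknown term, known variant
print(keep_rule_variant("blarg", "blargs", lexicon))  # unknown term, unknown variant
print(keep_rule_variant("sleep", "sleeps", lexicon))  # known term
```

Only the first call returns `False`: "wa" is not in the toy lexicon, but adding "s" by rule yields "was", which is, so the pairing is probably accidental.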
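One way to picture this three-level ordering is as a tuple sort key. The sketch below assumes the category order given earlier (nouns, adjectives, verbs, adverbs, modals, auxiliary verbs) and assumes shorter terms come first, which is consistent with the sample output in this section; the names `CATEGORY_ORDER` and `sort_key` are hypothetical.

```python
# Category order by frequency in the lexicon, as listed in the text.
CATEGORY_ORDER = ["noun", "adj", "verb", "adv", "modal", "aux"]
RANK = {category: i for i, category in enumerate(CATEGORY_ORDER)}


def sort_key(result):
    """Key: category rank, then term length, then lowercase spelling."""
    term, category = result
    # Categories outside the list sort last (an assumption).
    return (RANK.get(category, len(RANK)), len(term), term.lower())


results = [
    ("sleeping", "verb"),
    ("sleep", "verb"),
    ("sleeps", "noun"),
    ("sleep", "noun"),
    ("Slept", "verb"),
]
ordered = sorted(results, key=sort_key)
for term, category in ordered:
    print(category, term)
```

Python compares tuples element by element, so later criteria only break ties left by earlier ones.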
The -m flag displays the additional information that can be retrieved with the inflection flow. The additional information consists of two parts: the fact or rule that uninflects the term, and the fact or rule that is applied to the uninflected form to produce the output. The formats of these two parts are:
shell> lvg -f:i
sleep
sleep|sleep|128|1|i|1|
sleep|sleep|128|512|i|1|
sleep|sleep|1024|1|i|1|
sleep|sleep|1024|262144|i|1|
sleep|sleep|1024|1024|i|1|
sleep|slept|1024|32|i|1|
sleep|slept|1024|64|i|1|
sleep|sleeps|1024|128|i|1|
sleep|sleeping|1024|16|i|1|
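For downstream processing, the pipe-delimited output above can be split into fields. The sketch below is a minimal parser, not an LVG API: the field names (`input`, `output`, `category`, `inflection`, `flow`, `flow_number`) are assumed labels for the six columns, and the numeric codes are kept as raw integers rather than decoded.

```python
from collections import defaultdict

SAMPLE = """\
sleep|sleep|128|1|i|1|
sleep|sleep|128|512|i|1|
sleep|sleep|1024|1|i|1|
sleep|sleep|1024|262144|i|1|
sleep|sleep|1024|1024|i|1|
sleep|slept|1024|32|i|1|
sleep|slept|1024|64|i|1|
sleep|sleeps|1024|128|i|1|
sleep|sleeping|1024|16|i|1|
"""


def parse_line(line):
    """Split one output line into a dict (field names are assumptions)."""
    inp, out, cat, infl, flow, num = line.strip().split("|")[:6]
    return {
        "input": inp,
        "output": out,
        "category": int(cat),
        "inflection": int(infl),
        "flow": flow,
        "flow_number": int(num),
    }


records = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]

# Collect the distinct output terms under each category code.
by_category = defaultdict(set)
for record in records:
    by_category[record["category"]].add(record["output"])

for code in sorted(by_category):
    print(code, sorted(by_category[code]))
```

Grouping by the category code makes it easy to see that the same surface form ("sleep") is reported once per category and inflection combination.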