II. Background
Lexical Tools uses the Specialist LEXICON as the corpus for mutation in many flow components. These flows include inflectional variants, derivational variants, acronyms, antiNorm, canonical forms, fruitful variants, nominalizations, properNoun, synonyms, etc. The data in LEXICON are processed, converted to a relational database format, and stored in embedded database tables in Lexical Tools. Most of these tables are generated and updated by computer programs for the annual release. However, the derivational variants and synonyms tables are not updated by these programs. This section describes a new, enhanced methodology for generating the derivations table from LEXICON annually.
The derivational variants flow component is one of the most commonly used functions in Lexical Tools. Derivational variants are terms that are related to the original term and close in meaning, but do not share exactly the same meaning. The existing algorithm of the derivational variants flow uses both facts (known derivations in the derivations table) and rules (adding, changing, or removing common suffixes). The facts comprise 4,559 records, which have been developed since the C version of Lexical Tools; they are stored in the database and retrieved by SQL query. The suffix derivation rules are stored in and retrieved through a Trie mechanism to generate derivational variants. The Java version implements three options of heuristic rules to filter out unrealistic derivational variants generated by rules.
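The combination of facts and rules described above can be illustrated with a minimal Java sketch. It is not the Lexical Tools implementation: the class name `DerivationSketch`, its methods, the symmetric treatment of facts, and the sample entries mentioned afterwards are all hypothetical, an in-memory map stands in for the SQL-backed derivations table, and the heuristic filtering options are not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: derivational variants from facts plus suffix rules. */
public class DerivationSketch {
    /** Trie node; a path from the root spells a suffix read right to left. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        // Replacement suffixes for rules whose "from" suffix ends at this node.
        final List<String> replacements = new ArrayList<>();
    }

    // Stand-in for the derivations table (facts); the real table lives in a database.
    private final Map<String, Set<String>> facts = new HashMap<>();
    private final Node root = new Node();

    /** Records a known derivation pair. Storing both directions is an assumption. */
    public void addFact(String a, String b) {
        facts.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        facts.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    /** Adds a suffix rule: a term ending in fromSuffix may end in toSuffix instead. */
    public void addRule(String fromSuffix, String toSuffix) {
        Node node = root;
        for (int i = fromSuffix.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(fromSuffix.charAt(i), c -> new Node());
        }
        node.replacements.add(toSuffix);
    }

    /** Returns fact-based variants first, then rule-based variants. */
    public Set<String> derive(String term) {
        Set<String> out =
            new LinkedHashSet<>(facts.getOrDefault(term, Collections.emptySet()));
        // Walk the trie from the end of the term; i marks where the matched suffix starts.
        Node node = root;
        for (int i = term.length(); node != null; i--) {
            for (String r : node.replacements) {
                out.add(term.substring(0, i) + r);
            }
            node = (i > 0) ? node.children.get(term.charAt(i - 1)) : null;
        }
        out.remove(term); // a term is not its own derivational variant
        return out;
    }
}
```

With an illustrative fact `addFact("wide", "width")` and illustrative rules `addRule("ness", "")` and `addRule("ize", "ization")`, calling `derive("width")` looks up the fact, while `derive("kindness")` and `derive("modernize")` are resolved by the suffix trie. Reading the term from its end means one walk over the trie matches every applicable suffix rule, which is the usual reason for storing suffix rules this way.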
The derivations table (facts) has not been updated since the first Java release of Lexical Tools (2002), while LEXICON is updated annually. The main reason is that no derivational relationship (or meaning) is coded directly in LEXICON. The current derivations table (facts) has four issues.
The following section describes details about the original derivations tables.
The following section describes a new methodology that addresses the above issues and enhances derivational variants generation.