Lexical Tools

LVG Trie File Format

I. File Distribution (data for rules)

  • im.rul
  • plural.rul
  • verbinfl.rul

  • dm.rul

II. File Format

There are four different types of data specified in a file. They are:

  1. Comments: lines that start with #
  2. Rules: lines in the format
    key | Input Category | Input Inflection | value | Output Category | Output Inflection
  3. Exceptions: lines that start with a space, followed by key | value;
  4. Additional rule files: lines that start with #include "file name"
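
The four line types above can be told apart by their leading characters. The following is a minimal sketch, not the LVG implementation: the function name `classify_old_format` is hypothetical, and it assumes that a rule line always has exactly six |-separated fields.

```python
def classify_old_format(line: str) -> str:
    """Return the data type of one line in the original trie file format."""
    # "#include" also starts with "#", so it must be tested before comments.
    if line.startswith("#include"):
        return "file"
    if line.startswith("#"):
        return "comment"
    # Exceptions are recognized only by their leading space.
    if line.startswith(" "):
        return "exception"
    # Assumption: a rule has six fields, hence five "|" separators.
    if line.count("|") == 5:
        return "rule"
    return "unknown"
```

The ordering of the first two tests shows one reason the original format is fragile: an include directive is only distinguishable from a comment by the text that follows the #.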

This format works, but it is not intuitive to human readers and is therefore hard to maintain. A new format has been proposed and adopted to improve maintainability:

  1. Comments: lines that start with #
  2. Rules: lines that start with RULE:
  3. Exceptions: lines that start with EXCEPTION:
  4. Additional rule files: lines that start with FILE:
  5. The abbreviations for Category and Inflection are standardized so that they are consistent across all lvg components.
    • Category: noun, verb, adj, adv
    • Inflection:
      base
      singular, plural
      infinitive, pres, past, presPart, pastPart
      positive, comparative, superlative
< Example >
# This is a comment
RULE: e$|adj|positive|er$|adj|comparative
EXCEPTION: inhale|inhaler;
FILE: verbinfl.rul
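
The example lines above can be parsed with a small dispatcher on the line prefix. This is a sketch under stated assumptions rather than the LVG parser: the names `Rule` and `parse_line` are hypothetical, whitespace around fields is assumed to be insignificant, and an exception line is assumed to hold one or more `key|value` pairs, each terminated by a semicolon.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    in_suffix: str
    in_cat: str
    in_infl: str
    out_suffix: str
    out_cat: str
    out_infl: str


def parse_line(line: str):
    """Parse one line of the new format into a (kind, payload) tuple."""
    text = line.strip()
    if not text:
        return ("blank", None)
    if text.startswith("#"):
        return ("comment", text[1:].strip())
    if text.startswith("RULE:"):
        fields = [f.strip() for f in text[len("RULE:"):].split("|")]
        if len(fields) != 6:
            raise ValueError(f"rule needs 6 fields, got {len(fields)}: {line!r}")
        return ("rule", Rule(*fields))
    if text.startswith("EXCEPTION:"):
        pairs = []
        # Assumption: pairs are separated (and terminated) by ";".
        for chunk in text[len("EXCEPTION:"):].split(";"):
            if chunk.strip():
                key, value = (p.strip() for p in chunk.split("|"))
                pairs.append((key, value))
        return ("exception", pairs)
    if text.startswith("FILE:"):
        return ("file", text[len("FILE:"):].strip())
    raise ValueError(f"unrecognized line: {line!r}")
```

Because every non-comment line carries an explicit keyword, the order of the tests no longer matters, unlike in the original format.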

III. File Characteristics

  • All rules and exceptions are bi-directional (forward and reverse)
  • key is used as a pattern to match the input term
  • value is used as a pattern to change the input term for output
  • Exceptions apply to the rule located directly above them
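
The two structural points in this list, that exceptions belong to the rule directly above them and that every entry can be read in both directions, can be sketched as follows. The names `RuleEntry`, `group_rules`, and `reverse` are hypothetical, and modeling the reverse direction by swapping the input and output sides is an assumption about how bi-directionality could be represented, not a description of LVG internals.

```python
from dataclasses import dataclass, field


@dataclass
class RuleEntry:
    # (key, input category, input inflection, value, output category, output inflection)
    fields: tuple
    exceptions: list = field(default_factory=list)  # [(key, value), ...]


def group_rules(lines):
    """Attach each EXCEPTION line to the most recent RULE line above it."""
    rules = []
    for raw in lines:
        text = raw.strip()
        if text.startswith("RULE:"):
            body = text[len("RULE:"):]
            rules.append(RuleEntry(tuple(p.strip() for p in body.split("|"))))
        elif text.startswith("EXCEPTION:"):
            if not rules:
                raise ValueError("exception appears before any rule")
            for chunk in text[len("EXCEPTION:"):].split(";"):
                if chunk.strip():
                    key, value = (p.strip() for p in chunk.split("|"))
                    rules[-1].exceptions.append((key, value))
    return rules


def reverse(entry):
    """Return the same entry read in the opposite direction."""
    key, in_cat, in_infl, value, out_cat, out_infl = entry.fields
    swapped = [(v, k) for k, v in entry.exceptions]
    return RuleEntry((value, out_cat, out_infl, key, in_cat, in_infl), swapped)
```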

  • The key in a rule is called the input suffix
  • The value in a rule is called the output suffix
  • Wildcards are used in the input suffix and output suffix
  • input category must be one of: adj, adv, noun or verb
  • input inflection must be one of: base, singular, positive, infinitive, plural, comparative, superlative, pres, presPart, past, pastPart
  • output category must be one of: adj, adv, noun or verb
  • output inflection must be one of: base, singular, positive, infinitive, plural, comparative, superlative, pres, presPart, past, pastPart
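
The category and inflection constraints above lend themselves to a simple check before a rule is loaded. The sketch below is illustrative only: `validate_rule` is a hypothetical helper, and it assumes the abbreviations are case-sensitive and spelled exactly as listed.

```python
CATEGORIES = {"adj", "adv", "noun", "verb"}
INFLECTIONS = {
    "base", "singular", "positive", "infinitive", "plural", "comparative",
    "superlative", "pres", "presPart", "past", "pastPart",
}


def validate_rule(fields):
    """Return a list of problems found in a six-field rule (empty if valid)."""
    if len(fields) != 6:
        return [f"expected 6 fields, got {len(fields)}"]
    _key, in_cat, in_infl, _value, out_cat, out_infl = fields
    checks = (
        ("input category", in_cat, CATEGORIES),
        ("input inflection", in_infl, INFLECTIONS),
        ("output category", out_cat, CATEGORIES),
        ("output inflection", out_infl, INFLECTIONS),
    )
    # Assumption: matching is exact and case-sensitive.
    return [
        f"{label} '{val}' is not allowed"
        for label, val, allowed in checks
        if val not in allowed
    ]
```

For example, `validate_rule("e$|adj|positive|er$|adj|comparative".split("|"))` returns an empty list, while a rule that spells the category as `adjective` is reported as invalid.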