Lexical Tools

LVG Rules: Trie

The trie mechanism is used in Lexical Tools for inflectional rules and derivational (suffix) rules. A persistent trie (based on the Java RandomAccessFile class) was used before lvg.2002. Because of its relatively slow performance, the persistent trie was replaced by RamTrie after Lvg.2003. The initial memory footprints of im.rul and dm.rul in Lexical Tools are about 2.0 MB and 3.0 MB, respectively. After Lvg.2014, dm.rul is generated automatically to include the optimal set of Sd-Rules with associated exceptions, reaching above 95% precision and recall (over all candidate Sd-Rules). This new approach increased the footprint of dm.rul to 4.3 MB. With this small footprint and a one-time initialization overhead, the trie performs relatively well. The design is detailed as follows:

  1. Introduction
  2. File Format
  3. Wild Card
  4. Trie Tree
  5. Persistent Trie
  6. Java Class Usage
  7. Reference Documents
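
As a rough illustration of why an in-memory trie suits suffix rules, the following minimal sketch stores each rule's suffix in reverse and walks a word from its last character backwards, keeping the longest match. It is not the Lexical Tools RamTrie class: the class name SuffixTrieSketch, the methods addRule and apply, and the simplified "suffix -> replacement" rule form are all assumptions made for this example, and the real rule files carry more information (such as categories and exceptions) than is modeled here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the Lexical Tools RamTrie class.
class SuffixTrieSketch {
    // One node per character; children are keyed by the next character
    // when walking a word from its last letter toward its first.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String replacement; // non-null only where a rule's suffix ends
    }

    private final Node root = new Node();

    // Store a simplified rule "suffix -> replacement" by inserting the suffix reversed.
    void addRule(String suffix, String replacement) {
        Node node = root;
        for (int i = suffix.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(suffix.charAt(i), c -> new Node());
        }
        node.replacement = replacement;
    }

    // Apply the rule with the longest matching suffix; return null if no rule matches.
    String apply(String word) {
        Node node = root;
        String result = null;
        for (int i = word.length() - 1; i >= 0; i--) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                break; // no stored suffix continues with this character
            }
            if (node.replacement != null) {
                // Keep the stem before position i and append the replacement.
                result = word.substring(0, i) + node.replacement;
            }
        }
        return result;
    }
}
```

With the rules "s" -> "" and "ies" -> "y" loaded, apply("studies") follows the path s, e, i and prefers the longer "ies" rule, while apply("cats") matches only "s". Each lookup touches at most one node per character of the word, which is the property that makes a RAM-resident trie attractive once the one-time loading cost has been paid.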