Lexical Tools

Frequently Asked Questions

(Please read before asking a question)

  • How can I ask a question?
    See Contact Us

  • What is the license agreement for Lexical Tools?
    The Lexical Tools package is a free, open source project for public users. It allows use and redistribution without warranty. Please refer to the NLM copyright information and terms and conditions for details. Lexical Tools uses third party software. The license information is shown below:

    Third party software                             License
    HSQL Database Engine (HSQLDB) from SourceForge   SourceForge License: allows for redistribution and use with a "copyright notice"
    ICU - International Components for Unicode       X_License: compatible with GPL but with fewer restrictions on commercial use of the software
    GNU CYGWIN package                               GNU General Public License (GPL)

  • What is the JDK/JRE license for Lexical Tools?
    Oracle announced a major change to its Java Runtime Environment (JRE) licensing policy in 2019. JRE Version 8 Update 202, distributed on 2019/01/15, is licensed under the Oracle Binary Code License Agreement for Java SE Platform Products and is the last free JRE release. Starting with JRE Version 8 Update 211, commercial users are required to pay Oracle for its use.

    The Lexical Tools 2024 is:

    • developed under: openjdk version "1.8.0_392"
    • tested under: jre-8u202
    • default installation: jre-8u202

    Users are welcome to upgrade the JRE themselves.

  • Which versions of the JRE, embedded database, and Unicode library go with each Lexical Tools version?
    Lexical Tools Version   JRE Version   Embedded Database    Unicode
    2024                    1.8.0.202     HSqlDb 2.7.2-jdk8    ICU4J 73.2
    2023                    1.8.0.202     HSqlDb 2.7.0-jdk8    ICU4J 71.1
    2022                    1.8.0.202     HSqlDb 2.5.1         ICU4J 69.1
    2021                    1.8.0.202     HSqlDb 2.5.1         ICU4J 67.1
    2020                    1.8.0.181     HSqlDb 2.5.0         ICU4J 64.2
    2019                    1.8.0.181     HSqlDb 2.4.1         ICU4J 62.1
    2018                    1.8.0.131     HSqlDb 2.3.4         ICU4J 57.1
    2017                    1.8.0.101     HSqlDb 2.3.4         ICU4J 57.1
    2016                    1.8.0.45      HSqlDb 2.3.2         ICU4J 55.1
    2015                    1.7.0.67      HSqlDb 2.3.2         ICU4J 53.1
    2014                    1.7.0.40      HSqlDb 2.3.0         ICU4J 51.2
    2013                    1.7.0.04      HSqlDb 2.2.8         ICU4J 49.1
    2012                    1.6.0.26      HSqlDb 2.2.5         ICU4J 4.8.1
    2011                    1.6.0.21      HSqlDb 2.0.0         ICU4J
    2010                    1.6.0.14      HSqlDb 4.0.1
    2009                    1.6.0.06      HSqlDb 4.0
    2008                    1.5.0.11      HSqlDb 3.6.1
    2007                    1.5.0.07      HSqlDb 3.2
    2006                    1.5.0.02      HSqlDb 3.2
    2005                    1.4.2.05      HSqlDb 1.7.2         ICU4J 2.2
    2004                    1.4.2.05      IDB 3.26             ICU4J 2.2
    2003                    1.4.2.05      IDB 3.26             N/A
    2002                    1.2           IDB V3.26            N/A
    Please refer to Java 1.5 upgrade notes for details.

  • Is the lvg jar file available in the Maven Central Repository?
    Yes, thanks to Brian Carlson for his contribution. Please see the user contributions section.

  • Why can't I install Lexical Tools successfully?
    One of the most common mistakes is installing Lexical Tools from the wrong directory. Make sure to run the Lexical Tools installation script from the top directory of $LVG_DIR. Please refer to the installation instructions for details.

  • How do I install Lexical Tools on Apple/Mac or other platforms?
    Please refer to the manual installation instructions.

  • How do I decide whether to install a 32-bit or 64-bit JRE for Lexical Tools (APIs)?
    Releases before 2018 provide options to install a 32-bit or 64-bit JRE on the following platforms:
    • Linux: the installer automatically detects the OS (32/64 bits) and installs the appropriate JRE
    • Windows/PC: the installer lets users choose a 32-bit or 64-bit JRE
      Users can find the OS type by:
      • GUI: Start -> Computer -> Properties -> System type
      • GUI: Start -> Control Panel -> System -> System type
      • Command prompt (cmd): shell> wmic os get osarchitecture

    32-bit installation is not supported in release 2018 and later. However, users may try a manual installation on a 32-bit machine.
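
    To check which architecture an installed JRE is actually running as, a small Java program can print the relevant system properties. This is a minimal sketch and not part of Lexical Tools: the class name JreArchCheck is hypothetical, os.arch is a standard system property, and sun.arch.data.model is vendor-specific and may be missing on some JVMs, so the code falls back to guessing from os.arch.

```java
public class JreArchCheck {
    // Returns "32", "64", or "unknown".
    // dataModel: value of the vendor-specific "sun.arch.data.model" property (may be null).
    // osArch:    value of the standard "os.arch" property (may be null).
    public static String bitness(String dataModel, String osArch) {
        if ("32".equals(dataModel) || "64".equals(dataModel)) {
            return dataModel;
        }
        if (osArch == null) {
            return "unknown";
        }
        // Fallback guess from the architecture name, e.g. amd64, x86_64, aarch64.
        if (osArch.contains("64")) {
            return "64";
        }
        // Common 32-bit x86 names.
        if (osArch.matches("x86|i[3-6]86")) {
            return "32";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String dataModel = System.getProperty("sun.arch.data.model");
        String osArch = System.getProperty("os.arch");
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("os.arch      = " + osArch);
        System.out.println("JRE bitness  = " + bitness(dataModel, osArch));
    }
}
```

    Run it with the same java executable that Lexical Tools uses, since the properties describe the running JVM.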

  • Why do I get the error message below when running Lexical Tools (APIs)?
    ** Configuration Error: Can't find bundle for base name data.config.lvg, locale en_US
    ** Error: problem of opening/reading config file: 'data.config.lvg'. Use -x option to specify the config file path.

    The problem is that the program can't find the lvg configuration file. Lvg uses a configuration file to define its properties. The configuration file can be specified through the Java APIs (please refer to apiDoc). If it is not specified, lvg looks for the default, ./data/config/lvg.properties, in every directory included in the Java classpath. Please note that the LVG_DIR property in the configuration file serves as a reference point for the other files (properties) specified in the configuration file, not for the lvg configuration file itself. This allows users to create different lvg configuration files for different lvg releases and keep them outside the lvg directories in their applications.

    There are a couple of ways to resolve this problem.

    1. If you don't want to change the Java code:
      Add your lvg root directory to the Java classpath when you run the program. For example, if ${LVG_DIR} is "/Projects/lvg2012/", run something like:
         java -classpath /Projects/lvg2012:/Projects/lvg2012/lib/lvg2012dist.jar ...
    2. Use the Java API in your Java code to specify the configuration file:
      If you use the LVG Java APIs in your program, several constructors let you specify the configuration file (this is the approach we recommend). For example:
      	  String lvgConfigFile = "/Projects/lvg2012/data/config/lvg.properties";
      	  LvgCmdApi lvgApi = new LvgCmdApi("-f:i", lvgConfigFile);
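
    When the error persists, it can help to check whether the default configuration file is visible on the classpath at all. The sketch below is a diagnostic aid, not part of Lexical Tools: the class name LvgConfigLocator is hypothetical, and it assumes that lvg's default lookup behaves like an ordinary Java classpath resource lookup, which the "Can't find bundle for base name data.config.lvg" message suggests but the source does not state.

```java
import java.net.URL;

public class LvgConfigLocator {
    // Default resource path named in the answer above.
    public static final String DEFAULT_CONFIG = "data/config/lvg.properties";

    // Returns the URL of the resource if some classpath entry contains it, else null.
    public static URL locate(String resourcePath) {
        return ClassLoader.getSystemResource(resourcePath);
    }

    public static void main(String[] args) {
        URL url = locate(DEFAULT_CONFIG);
        if (url == null) {
            System.out.println("Not found on classpath: " + DEFAULT_CONFIG);
            // Print the effective classpath so a missing lvg root directory is easy to spot.
            System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        } else {
            System.out.println("Found: " + url);
        }
    }
}
```

    Run it with the same -classpath value as the failing program. If the file is reported as not found, the lvg root directory is most likely missing from the classpath.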

  • What is the difference between lvg2010 and lvg2010lite?
    See the lvgLite documents.

  • Can we use a database other than the default (HSqlDb) with Lexical Tools?
    Yes, any database can be used with Lexical Tools as long as a JDBC connector is available for it. MySql is used as an example for illustration. Please refer to the install MySql database option for details.
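
    Before switching databases, it can be useful to confirm that the JDBC connector is actually on the classpath. This is a minimal sketch, not part of Lexical Tools: the class name JdbcDriverCheck is hypothetical, and the two driver class names in main are the ones commonly shipped with HSQLDB 2.x and MySQL Connector/J 8, which may differ from the versions you have installed.

```java
public class JdbcDriverCheck {
    // Returns true if the named class can be loaded from the current classpath.
    public static boolean isDriverAvailable(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed driver class names; adjust them to match your connector version.
        String[] candidates = {
            "org.hsqldb.jdbc.JDBCDriver",
            "com.mysql.cj.jdbc.Driver"
        };
        for (String name : candidates) {
            System.out.println(name + " available: " + isDriverAvailable(name));
        }
    }
}
```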

  • How do we use the Lexical Tools APIs?
    Please refer to the Lexical Tools APIs for details.

  • How do I make sure my application is thread-safe when using Lexical Tools?
    Have each thread create its own Lexical Tools API objects rather than sharing them across threads.
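
    One common way to give every thread its own instance in Java is ThreadLocal. The sketch below only illustrates the pattern: FakeLexicalApi is a hypothetical stand-in, not a real Lexical Tools class, so the example runs without the lvg jar. In a real application the initializer would construct the Lexical Tools API object you use.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadApiDemo {
    // Hypothetical stand-in for a Lexical Tools API object (not the real class).
    static class FakeLexicalApi {
        String process(String term) {
            return term.toLowerCase();
        }
    }

    // Each thread lazily creates its own instance on first use and then reuses it.
    static final ThreadLocal<FakeLexicalApi> API =
        ThreadLocal.withInitial(FakeLexicalApi::new);

    // Starts the given number of threads and returns how many distinct
    // API instances they used (one per thread is expected).
    public static int distinctInstances(int threads) {
        Set<FakeLexicalApi> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                FakeLexicalApi api = API.get(); // this thread's private instance
                api.process("Sample");
                seen.add(api);
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("Distinct instances: " + distinctInstances(4));
    }
}
```

    With thread pools, the same pattern means one instance per pool thread, which is reused across the tasks that thread runs.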

  • Is there a tool in Lexical Tools for converting Unicode to ASCII?
    Yes, the toAscii tool has been provided since the lvg.2009 release.

  • Is there a tool in Lexical Tools for cutting out and rearranging fields?
    Yes, the fields tool has been provided since the lvg.2011 release.

  • Which lvg flow components can be used for converting Unicode to ASCII?
    Lexical Tools provides several options for Unicode-to-ASCII operations. The flows -f:q5 and norm (-f:N) normalize Unicode to pure ASCII. The flows -f:q, -f:q0, -f:q1, -f:q2, -f:q3, -f:q4, -f:q7, and -f:q8 provide other useful Unicode normalizations.
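
    For readers unfamiliar with what a Unicode normalization step looks like, the sketch below shows one generic technique: stripping diacritics with the standard java.text.Normalizer class. It is not the implementation of toAscii or of any -f:q flow, and it does not produce pure ASCII for every input. The class name StripDiacriticsDemo is hypothetical.

```java
import java.text.Normalizer;

public class StripDiacriticsDemo {
    // Decomposes characters (NFD) and removes the combining marks,
    // so an accented letter becomes its base letter.
    public static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // True if every character is in the 7-bit ASCII range.
    public static boolean isPureAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        String input = "caf\u00e9 na\u00efve"; // "café naïve" written with escapes
        String output = stripDiacritics(input);
        System.out.println(output + " (pure ASCII: " + isPureAscii(output) + ")");
    }
}
```

    Characters with no decomposition, such as the German sharp s (U+00DF), pass through unchanged, which is why a dedicated Unicode-to-ASCII flow is still needed for full coverage.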

  • Are there any scripts that we can use to run lvg or Norm?
    Yes, scripts for norm, luiNorm, wordInd, lvg, toAscii, fields, and lgt are generated under "${LVG_DIR}/bin/" after a normal installation. This directory includes shell scripts for Unix and batch files for Windows.

  • Why do I have problems using my old Java code with the lvg.2007 (and later) Java APIs?
    Lvg was developed and compiled with Java 1.5 in 2006, 2007, and 2008, and with Java 1.6 in 2009 and 2010. You will need to use JDK 1.5/1.6 to compile your Java code and JRE 1.5/1.6 to run your applications. Please refer to the Java 1.5 upgrade notes for details.

  • Is the latest Java version faster than the old C version?
    Yes. The latest Java version is as fast as the old C version, or even faster, because we resolved the performance bottleneck by using HSqlDb and much faster machines. The first Java version, lvg.2002, was slower than the C version. The major bottlenecks were the database (IDB) and the persistent trie. When we ran lvg on a faster database (MySql), performance improved dramatically. After lvg2003, we improved performance mainly through the trie and other code optimizations. The performance of norm after version 2003 is of the same order of magnitude as the C version on the Solaris/Sparc platform.

    In lvg.2004, we improved performance by about 50% by taking advantage of MySql V4.0, a new JDBC driver, and other code optimizations. Since lvg.2005, Lvg has used HSqlDb as the default database to improve performance. As a result, lvg.2005 and later versions are faster than the old C version.