Lexical Tools

Application: Normalization

I. Objective

To use the Lexical Tools Norm API to normalize an input term.

NLP applications often normalize words and terms before indexing them or querying a database. Users may define and compose their own normalization according to their requirements (please see MetaMap Norm). The Lexical Tools provide a very thorough normalization that abstracts away from case, inflection, and word order; removes stop words, possessives, and the parenthetic plural forms (s), (es), (ies), (S), (ES), and (IES); replaces punctuation with spaces; and converts non-ASCII Unicode characters in the input term to ASCII. This example illustrates how to use the Norm API in an application.
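To make the individual steps concrete, the following is a minimal sketch of a simplified normalizer written with the Java standard library only. It is not the Lexical Tools implementation: the class name SimpleNormSketch and the short stop word list are assumptions made for illustration, inflection is not handled at all (that step needs the lexicon), and stripping diacritics only approximates the Unicode to ASCII conversion.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SimpleNormSketch
{
    // Illustrative stop word list (an assumption, not the list shipped with lvg)
    private static final Set<String> STOP_WORDS = new HashSet<>(
        Arrays.asList("of", "and", "with", "for", "nos", "to", "in", "by", "on", "the"));

    public static String normalize(String term)
    {
        // Approximate Unicode to ASCII: decompose, then drop combining marks
        String s = Normalizer.normalize(term, Normalizer.Form.NFD)
            .replaceAll("\\p{M}", "");

        // Remove parenthetic plural forms: (s), (es), (ies), (S), (ES), (IES)
        s = s.replaceAll("(?i)\\((s|es|ies)\\)", "");

        // Remove possessives: "'s" at the end of a word, and a trailing "'" after "s"
        s = s.replaceAll("'s\\b", "").replaceAll("s'(?=\\s|$)", "s");

        // Replace remaining ASCII punctuation with spaces, then lowercase
        s = s.replaceAll("\\p{Punct}", " ").toLowerCase(Locale.ROOT);

        // Drop stop words and sort the remaining words to ignore word order
        List<String> words = new ArrayList<>();
        for (String w : s.trim().split("\\s+"))
        {
            if (!w.isEmpty() && !STOP_WORDS.contains(w))
            {
                words.add(w);
            }
        }
        Collections.sort(words);
        return String.join(" ", words);
    }

    public static void main(String[] args)
    {
        // prints "disease hodgkin"
        System.out.println(normalize("Hodgkin's Disease(s), NOS"));
    }
}
```

Because this sketch skips inflection, it always returns exactly one string; the Norm API can return several normalized forms for one input, as the example in this document shows.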

II. Prerequisites

Install the lvg.${YEAR} package to "/Projects/LVG/lvg${YEAR}".

III. Source Code

import java.util.*;
import gov.nih.nlm.nls.lvg.Api.*;

public class Normalization
{
    // test driver
    public static void main(String[] args)
    {
        // instantiate a NormApi object from the lvg configuration file
        String lvgConfigFile
            = "/export/home/lu/Projects/LVG/lvg2012/data/config/lvg.properties";
        NormApi normApi = new NormApi(lvgConfigFile);

        // normalize the input term
        String in = "left"; // input term passed to normApi
        try
        {
            // Mutate() returns all normalized forms of the input term
            Vector<String> outs = normApi.Mutate(in);

            // print out the result as input|normalized form
            for(String out: outs)
            {
                System.out.println(in + "|" + out);
            }

            // clean up
            normApi.CleanUp();
        }
        catch (Exception e)
        {
            System.err.println("** ERR: " + e.toString());
        }
    }
}

IV. Compile

shell>javac -classpath ../lib/lvg2012dist.jar Normalization.java

V. Run & Results

shell>java -classpath ./:../lib/lvg2012dist.jar:/Projects/LVG/lvg2012/ Normalization

left|left
left|leave

=> The input term "left" can be normalized to either "left" or "leave", because "left" is both a base form and the past tense of "leave".
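
Since one input term can yield more than one normalized form, an application that indexes by normalized form usually stores the term under every form returned. The following is a rough sketch of that idea, assuming an in-memory map in place of a real database. The class NormIndexSketch and the method stubNorm are hypothetical; stubNorm is a stand-in that hard-codes the result shown above for "left" and simply lowercases everything else, and in a real application it would be replaced by a call to normApi.Mutate().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class NormIndexSketch
{
    // normalized form -> original terms indexed under that form
    private final Map<String, Set<String>> index = new HashMap<>();
    private final Function<String, List<String>> normalizer;

    public NormIndexSketch(Function<String, List<String>> normalizer)
    {
        this.normalizer = normalizer;
    }

    // index the term under every normalized form it produces
    public void add(String term)
    {
        for (String key : normalizer.apply(term))
        {
            index.computeIfAbsent(key, k -> new TreeSet<>()).add(term);
        }
    }

    // normalize the query the same way and collect all matching terms
    public Set<String> lookup(String query)
    {
        Set<String> hits = new TreeSet<>();
        for (String key : normalizer.apply(query))
        {
            hits.addAll(index.getOrDefault(key, Collections.emptySet()));
        }
        return hits;
    }

    // Hypothetical stand-in for normApi.Mutate(); only "left" is taken
    // from the results above, all other terms are just lowercased
    public static List<String> stubNorm(String term)
    {
        if (term.equals("left"))
        {
            return Arrays.asList("left", "leave");
        }
        return Collections.singletonList(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args)
    {
        NormIndexSketch idx = new NormIndexSketch(NormIndexSketch::stubNorm);
        idx.add("left");
        System.out.println(idx.lookup("leave")); // prints [left]
    }
}
```

With this layout, a query for "leave" finds the record that was indexed from "left", because both share the normalized form "leave".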

VI. Application Package Download

The whole package, Normalization.tgz, can be downloaded here.