Lexical Tools

Application: AntiNorm

I. Objective

To use the Lexical Tools AntiNorm APIs to find inflectional variants in the LEXICON that approximately match an input term.

Lexical Tools provides an AntiNorm flow component that retrieves inflectional variants from the LEXICON by approximate match, that is, variants whose normalized term is the same as the normalized input term. This is useful in NLP applications. The flow involves the following steps:

  • Use the Lexical Tools normalization flow
  • Index normalized terms of all inflectional variants of LEXICON
  • Normalize input term
  • Query all inflectional variants by matching normalized term
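
The index-and-lookup idea behind the steps above can be sketched without Lexical Tools at all. The following is a toy illustration only: the class ToyAntiNorm and its methods are hypothetical names, and its normalize() merely lowercases, strips punctuation, and sorts words. It does not uninflect words or remove stop words as the real Lexical Tools normalization flow does, so it will not reproduce LVG results.

```java
import java.util.*;

// Toy illustration only: NOT the LVG Norm flow or the AntiNorm API.
final class ToyAntiNorm
{
    // Hypothetical stand-in for normalization:
    // lowercase, turn punctuation into spaces, sort the words.
    static String normalize(String term)
    {
        String cleaned = term.toLowerCase(Locale.ROOT)
            .replaceAll("[^a-z0-9]+", " ").trim();
        if (cleaned.isEmpty())
        {
            return "";
        }
        String[] words = cleaned.split(" ");
        Arrays.sort(words);
        return String.join(" ", words);
    }

    // Index every known variant under its normalized key
    static Map<String, List<String>> buildIndex(List<String> variants)
    {
        Map<String, List<String>> index = new HashMap<>();
        for (String variant : variants)
        {
            index.computeIfAbsent(normalize(variant), k -> new ArrayList<>())
                .add(variant);
        }
        return index;
    }

    // Normalize the input term, then return all variants sharing that key
    static List<String> lookup(Map<String, List<String>> index, String input)
    {
        return index.getOrDefault(normalize(input), Collections.emptyList());
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> index = buildIndex(
            Arrays.asList("lung cancer", "Lung Cancer", "heart attack"));

        // prints: [lung cancer, Lung Cancer]
        System.out.println(lookup(index, "Cancer, Lung"));
    }
}
```

Both "lung cancer" and "Cancer, Lung" reduce to the key "cancer lung" under this toy normalizer, which is why the lookup succeeds even though the surface strings differ.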

The source code in Section III illustrates how to use the AntiNorm APIs in an application.

II. Prerequisites

Install the lvg.${YEAR} package to "/Projects/LVG/lvg${YEAR}".

III. Source Code

import java.util.*;

import gov.nih.nlm.nls.lvg.Lib.*;
import gov.nih.nlm.nls.lvg.Flows.*;
import gov.nih.nlm.nls.lvg.Api.*;

public class AntiNorm
{
    // test driver
    public static void main(String[] args)
    {
        // instantiate a LvgApi object by config file
        String lvgConfigFile
            = "/export/home/lu/Projects/LVG/lvg2012/data/config/lvg.properties";
        LvgApi lvgApi = new LvgApi(lvgConfigFile);

        // Process the inflectional variants mutation
        LexItem in = new LexItem("Cancers, Lung, NOS"); // use lexItem as input

        try
        {
            Vector<LexItem> outs = ToAntiNorm.Mutate(in, lvgApi.GetMaxTerm(),
                lvgApi.GetStopWords(), lvgApi.GetConnection(),
                lvgApi.GetInflectionTrie(), lvgApi.GetSymbolMap(),
                lvgApi.GetUnicodeMap(), lvgApi.GetLigatureMap(),
                lvgApi.GetDiacriticMap(), lvgApi.GetNonStripMap(),
                lvgApi.GetRemoveSTree(), false, true);

            // Print out the results
            for(LexItem out: outs)
            {
                System.out.println(out.ToString());
            }

            // clean up
            lvgApi.CleanUp();
        }
        catch(Exception e)
        {
            System.err.println("** ERR: " + e.toString());
        }
    }
}

IV. Compile

shell>javac -classpath ../lib/lvg2012dist.jar AntiNorm.java

V. Run & Results

shell>java -classpath ./:../lib/lvg2012dist.jar:/Projects/LVG/lvg2012/ AntiNorm

Cancers, Lung, NOS|Cancers, Lung, NOS|all|all|lung cancer|noun|base|E0319078||
Cancers, Lung, NOS|Cancers, Lung, NOS|all|all|lung cancer|noun|singular|E0319078||
Cancers, Lung, NOS|Cancers, Lung, NOS|all|all|lung cancers|noun|plural|E0319078||

=> The input term (Cancers, Lung, NOS) and the inflections (lung cancer, lung cancers) have the same normalized term (cancer lung).
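
An application will usually want the individual fields of each result line rather than the whole string. The sketch below splits one of the sample lines on "|". The class name AntiNormResultParser is hypothetical, and the field positions and labels (variant, category, inflection, EUI) are inferred from the sample output above, not taken from the Lexical Tools documentation, so verify them against your LVG release.

```java
// Sketch only: field positions are inferred from the sample output above.
final class AntiNormResultParser
{
    // Assumed zero-based positions within a "|"-separated result line
    static final int VARIANT = 4;
    static final int CATEGORY = 5;
    static final int INFLECTION = 6;
    static final int EUI = 7;

    // A limit of -1 keeps the trailing empty fields
    static String[] fields(String line)
    {
        return line.split("\\|", -1);
    }

    public static void main(String[] args)
    {
        String line = "Cancers, Lung, NOS|Cancers, Lung, NOS|all|all|"
            + "lung cancers|noun|plural|E0319078||";
        String[] f = fields(line);

        // prints: lung cancers (noun, plural) E0319078
        System.out.println(f[VARIANT] + " (" + f[CATEGORY] + ", "
            + f[INFLECTION] + ") " + f[EUI]);
    }
}
```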

VI. Application Package Download

The whole package, AntiNorm.tgz, can be downloaded here.