Application: Normalization
I. Objective
To use the Lexical Tools Norm APIs to normalize an input term.
In NLP applications, we often want to normalize words or terms before indexing them in, or querying, a database. Users may define and compose their own normalization according to their requirements (please see MetaMap Norm). The Lexical Tools provide a very thorough normalization, which abstracts away from case, inflection, and word order; removes stop words and possessives; replaces punctuation with spaces; removes the parenthetic plural forms (s), (es), (ies), (S), (ES), and (IES); and converts non-ASCII Unicode characters to ASCII. This example illustrates how to use the Norm APIs in an application.
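To make the list of steps above more concrete, the following is a minimal sketch of a few of them in plain Java. It is not the Lexical Tools implementation: the class name NormSketch, the stop-word list, the regular expressions, and the order of the steps are all assumptions made for illustration, and uninflection (which needs a lexicon) is left out entirely.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a simplified stand-in, NOT the Lexical Tools Norm flow.
public class NormSketch {
    // Hypothetical stop-word list, assumed for this example
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "of", "and", "with", "for", "nos", "to", "in", "by", "on", "the"));

    public static String normalize(String term) {
        // Non-ASCII to ASCII (approximation): decompose, then drop combining marks
        String s = Normalizer.normalize(term, Normalizer.Form.NFD)
                             .replaceAll("\\p{M}", "");
        // Remove parenthetic plural forms such as (s), (es), (ies), in any case
        s = s.replaceAll("\\((?i:s|es|ies)\\)", "");
        // Remove possessives: "'s" is dropped, a trailing "s'" becomes "s"
        s = s.replaceAll("'s\\b", "").replaceAll("s'(?=\\s|$)", "s");
        // Replace ASCII punctuation with spaces
        s = s.replaceAll("\\p{Punct}", " ");
        // Abstract away from case
        s = s.toLowerCase(Locale.ROOT);
        // Drop stop words, then sort to abstract away from word order
        List<String> words = new ArrayList<>();
        for (String w : s.trim().split("\\s+")) {
            if (!w.isEmpty() && !STOP_WORDS.contains(w)) {
                words.add(w);
            }
        }
        Collections.sort(words);
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // prints "disease hodgkin"
        System.out.println(normalize("Hodgkin's Disease, NOS"));
        // prints "behcet syndrome" (\u00e7 is c with cedilla)
        System.out.println(normalize("Beh\u00e7et's syndrome(s)"));
    }
}
```

Unlike this sketch, which returns one string, the Norm API can return several normalized forms for a single input, because uninflection may be ambiguous.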
II. Prerequisites
Install the lvg.${YEAR} package to "/Projects/LVG/lvg${YEAR}"
III. Source Code
import java.util.*;
import gov.nih.nlm.nls.lvg.Api.*;

public class Normalization
{
    // test driver
    public static void main(String[] args)
    {
        // instantiate a NormApi object from the LVG config file
        String lvgConfigFile = "/export/home/lu/Projects/LVG/lvg2012/data/config/lvg.properties";
        NormApi normApi = new NormApi(lvgConfigFile);

        // the input term to normalize
        String in = "left";

        try
        {
            // normalize the term; one input may yield several normalized forms
            Vector<String> outs = normApi.Mutate(in);

            // print out the results as input|normalized
            for(String out: outs)
            {
                System.out.println(in + "|" + out);
            }

            // clean up
            normApi.CleanUp();
        }
        catch (Exception e)
        {
            System.err.println("** ERR: " + e.toString());
        }
    }
}
IV. Compile
shell>javac -classpath ../lib/lvg2012dist.jar Normalization.java
V. Run & Results
shell>java -classpath ./:../lib/lvg2012dist.jar:/Projects/LVG/lvg2012/ Normalization
left|left
left|leave
=> The input term "left" can be normalized to "left" or "leave", because "left" is both a base form and the past tense of "leave".
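Because one term can have several normalized forms, an index built on normalized keys would typically store a term under each of them. The sketch below illustrates that idea only. The class NormIndexSketch and the stub normalizer are hypothetical: the stub hard-codes the "left" result shown above and merely lowercases everything else, standing in for a real call to normApi.Mutate().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Illustrative only: an in-memory index keyed by normalized forms.
public class NormIndexSketch {
    private final Function<String, List<String>> normalizer;
    private final Map<String, Set<String>> index = new HashMap<>();

    public NormIndexSketch(Function<String, List<String>> normalizer) {
        this.normalizer = normalizer;
    }

    // Store the term under every normalized form it produces
    public void add(String term) {
        for (String key : normalizer.apply(term)) {
            index.computeIfAbsent(key, k -> new TreeSet<>()).add(term);
        }
    }

    // Return all indexed terms sharing at least one normalized form with the query
    public Set<String> lookup(String query) {
        Set<String> hits = new TreeSet<>();
        for (String key : normalizer.apply(query)) {
            hits.addAll(index.getOrDefault(key, Collections.<String>emptySet()));
        }
        return hits;
    }

    // Hypothetical stand-in for the Norm API: only "left" is special-cased,
    // using the result from this example; other terms are just lowercased.
    public static List<String> stubNorm(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        if (lower.equals("left")) {
            return Arrays.asList("left", "leave");
        }
        return Arrays.asList(lower);
    }

    public static void main(String[] args) {
        NormIndexSketch idx = new NormIndexSketch(NormIndexSketch::stubNorm);
        idx.add("leave");
        idx.add("Left");
        // prints "[Left, leave]"
        System.out.println(idx.lookup("left"));
    }
}
```

In a real application, the function passed to the constructor would presumably wrap the Norm API call and handle its checked exception.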
VI. Application Package Download
The whole package, Normalization.tgz, can be downloaded here.