Sub-Term Mapping Tools

Normalization

Normalization is commonly used in NLP to abstract away lexical variations (such as case, punctuation, spelling variants, and inflectional variants) among words with the same or similar meaning, which increases recall. Different projects may need different normalizations depending on their requirements. STMT includes three normalizations that apply Lexical Tools APIs. They are described as follows:

Comparison

LexItem Norm
  Normalize:
  • case
  • punctuation
  Operation: one to one
  Usage: Find term in Lexicon (lsf)

Synonym Norm
  Normalize:
  • case
  • punctuation
  • spelling variants
  • inflectional variants
  Operation: one to many
  Usage: Find synonym of a term (smt)

Lvg Norm
  Normalize:
  • case
  • punctuation
  • spelling variants
  • inflectional variants
  • Non-ASCII Unicode
  • word order
  Operation: one to many
  Usage: Term to CUI mapping (smt)
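
To make the comparison concrete, the following is a minimal sketch of two of the listed operations: case and punctuation normalization (a one-to-one mapping), and word-order normalization. It is an illustration only. The function names simple_norm and simple_norm_sorted are hypothetical and are not part of STMT or the Lexical Tools APIs, and the real normalizations do considerably more (spelling and inflectional variants, for example, need lexicon data and can yield several outputs for one input).

```python
import string

# Hypothetical helpers for illustration; these are NOT STMT or Lexical Tools APIs.
# Map every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans({ch: " " for ch in string.punctuation})


def simple_norm(term: str) -> str:
    """Abstract away case and punctuation (one input, one output)."""
    lowered = term.lower()
    no_punct = lowered.translate(_PUNCT_TO_SPACE)
    # Collapse the runs of whitespace left behind by removed punctuation.
    return " ".join(no_punct.split())


def simple_norm_sorted(term: str) -> str:
    """Additionally abstract away word order by sorting the words."""
    return " ".join(sorted(simple_norm(term).split()))


if __name__ == "__main__":
    # Case and punctuation variants collapse to the same key.
    print(simple_norm("Heart-Attack"))
    print(simple_norm("heart attack"))
    # Word-order variants collapse to the same key.
    print(simple_norm_sorted("Attack, Heart"))
    print(simple_norm_sorted("heart attack"))
```

Using the normalized string as a lookup key is what raises recall: variants that differ only in the abstracted features retrieve the same entry.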