STMT

Normalization

Normalization is commonly used in NLP to abstract away lexical variations (such as case, punctuation, spelling variants, and inflectional variants) among words with the same or similar meaning, which increases recall. Different projects may need different normalizations, depending on their requirements. STMT includes three normalizations, each built on the Lexical Tools APIs, described as follows:

LexItem Norm
Synonym Norm
Lvg Norm

Comparison

Norm	Normalizes	Operation	Usage
LexItem Norm	case, punctuation	one to one	Find term in Lexicon (lsf)
Synonym Norm	case, punctuation, spelling variants, inflectional variants	one to many	Find synonym of a term (smt)
Lvg Norm	case, punctuation, spelling variants, inflectional variants, Non-ASCII Unicode, word order	one to many	Term to CUI mapping (smt)
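
To make the idea of abstracting away case, punctuation, and word order concrete, here is a minimal Python sketch. It is not the STMT or Lexical Tools implementation: the function name toy_norm and its sort_words parameter are hypothetical, and the sketch does not cover spelling variants, inflectional variants, or Non-ASCII Unicode, which require lexicon data.

```python
import string

# Map every ASCII punctuation character to a space (an assumption of this
# sketch; the real tools may treat individual punctuation marks differently).
_PUNCT_TO_SPACE = str.maketrans({ch: " " for ch in string.punctuation})


def toy_norm(term: str, sort_words: bool = False) -> str:
    """Hypothetical one-to-one normalizer: lowercase the term, replace
    punctuation with spaces, and collapse runs of whitespace.
    If sort_words is True, also sort the words alphabetically so that
    word order no longer matters."""
    words = term.lower().translate(_PUNCT_TO_SPACE).split()
    if sort_words:
        words.sort()
    return " ".join(words)


# Surface variants collapse to the same key, so a lookup keyed on the
# normalized form matches more inputs (higher recall).
print(toy_norm("Heart-Attack"))
print(toy_norm("Attack, Heart", sort_words=True))
```

With sort_words left off, "Heart-Attack" and "heart attack" share one normalized form; with it on, "Attack, Heart" and "heart attack" do as well. This sketch always returns a single string (one to one), whereas the Synonym and Lvg norms above are one to many because a term can have several spelling or inflectional variants.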

Sub-Term Mapping Tools