Lexical Tools

LuiNorm System Options

luiNorm normalizes and canonicalizes text. The process abstracts away from case, inflection, and word order. It also removes stop words, possessives, punctuation, and the parenthetic plural forms (s), (es), (ies), (S), (ES), and (IES) from the input term, and normalizes non-ASCII Unicode characters to ASCII. Specifically, this normalization is roughly equivalent to the combined lvg flow options, applied in this order: -f:q7:g:rs:o:t:l:B:C:q8:w. That is: apply Unicode core normalization, remove genitives, remove the parenthetic plural forms (s), (es), (ies), (S), (ES), and (IES), replace punctuation with spaces, remove stop words, lowercase, uninflect each word, map each uninflected word to its canonical form, strip or map non-ASCII Unicode characters, and finally sort the words.
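
The order of the flow steps can be illustrated with a toy sketch in Python. This is not the luiNorm implementation: the stop word list, the base-form table, and every function name are hypothetical, the Unicode steps are approximated with NFKD folding, and the lexicon-based steps (uninflection and canonicalization) are reduced to a small dictionary lookup and a pass-through. It only shows how the stages chain together in the order listed above.

```python
import re
import string
import unicodedata

# Hypothetical stop word list; the real tool uses its own list.
STOP_WORDS = {"of", "and", "with", "for", "to", "in", "the", "by", "on"}

# Hypothetical stand-in for the lexicon lookup behind uninflection (B).
TOY_BASE_FORMS = {"lungs": "lung", "diseases": "disease"}

# Maps every ASCII punctuation character to a space (o).
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def fold_to_ascii(text):
    """Approximation of the Unicode steps (q7, q8): NFKD, then drop non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def remove_genitives(text):
    """Drop 's, and the bare apostrophe after a plural s (g)."""
    return re.sub(r"'s\b|(?<=s)'(?!\w)", "", text, flags=re.IGNORECASE)


def remove_parenthetic_plurals(text):
    """Drop (s), (es), (ies), (S), (ES), and (IES) (rs)."""
    return re.sub(r"\((?:s|es|ies|S|ES|IES)\)", "", text)


def canonicalize(word):
    """Pass-through placeholder for the canonical-form mapping (C)."""
    return word


def toy_lui_norm(term):
    text = fold_to_ascii(term)                                   # q7 (approximated)
    text = remove_genitives(text)                                # g
    text = remove_parenthetic_plurals(text)                      # rs
    text = text.translate(PUNCT_TO_SPACE)                        # o
    words = [w for w in text.split() if w.lower() not in STOP_WORDS]  # t
    words = [w.lower() for w in words]                           # l
    words = [TOY_BASE_FORMS.get(w, w) for w in words]            # B (toy lookup)
    words = [canonicalize(w) for w in words]                     # C (placeholder)
    words = [fold_to_ascii(w) for w in words]                    # q8 (approximated)
    return " ".join(sorted(words))                               # w


print(toy_lui_norm("Diseases of the Lungs"))
print(toy_lui_norm("Crohn's finding(s)"))
```

The results of this sketch depend entirely on the toy tables above and should not be read as actual luiNorm output.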

The table below lists all system options for the luiNorm program.

Original Flag   New Flag   Feature Description

Input Filter Options:
tN              t:INT      Define the field to use as the term field. The default is 1.

Global Behavior Options:
                ci         Print configuration information of luiNorm.
                d          Print detailed information about luiNorm operations.
h               h          Print program help information.
                hs         Print the option hierarchy structure.
                i:STR      Define the input file name. The default is screen input.
                o:STR      Define the output file name. The default is screen output.
                p          Show the prompt. The default is no prompt.
s"|"            s:STR      Define the field separator. The default is "|".
v               v          Return the current version identification of luiNorm.
xConfigFile     x:STR      Load an alternative configuration file.

Output Filter Options:
                ti         Display the filtered input term in the output.
n               n          Return a "-No Output-" message when an input produces no output.
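
When luiNorm is driven from a script, the new-style flags in the table can be assembled programmatically. The following is a minimal sketch in Python, assuming the executable is reachable as "luiNorm"; the function name and its parameter names are hypothetical, and only flags listed in the table are emitted.

```python
def build_luinorm_args(term_field=None, separator=None, input_file=None,
                       output_file=None, config_file=None,
                       show_filtered_input=False, no_output_message=False):
    """Build an argument list for luiNorm from the new-style flags."""
    args = ["luiNorm"]  # assumption: the executable is on PATH under this name
    if term_field is not None:
        args.append(f"-t:{term_field}")      # term field, default is 1
    if separator is not None:
        args.append(f"-s:{separator}")       # field separator, default is "|"
    if input_file is not None:
        args.append(f"-i:{input_file}")      # default is screen input
    if output_file is not None:
        args.append(f"-o:{output_file}")     # default is screen output
    if config_file is not None:
        args.append(f"-x:{config_file}")     # alternative configuration file
    if show_filtered_input:
        args.append("-ti")                   # show the filtered input term
    if no_output_message:
        args.append("-n")                    # emit "-No Output-" when empty
    return args


print(build_luinorm_args(term_field=2, show_filtered_input=True,
                         no_output_message=True))
```

A list built this way could be handed to Python's subprocess.run without shell quoting, which matters for separators such as "|".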

Examples:

  • shell> luiNorm -i:in.data -o:out.data
    Read data from file, in.data, and send output to file, out.data.
  • shell> luiNorm -n
    $$$
    $$$|-No Output-
  • shell> luiNorm -t:2
    saw|left
    saw|left|leaf
  • shell> luiNorm -t:2 -ti
    saw|left
    left|leaf
  • shell> luiNorm -t:2 -ti -n
    saw||left
    |-No Output-
  • shell> luiNorm -s:/ -t:2
    leave|saw/left
    leave|saw/left/leaf
  • shell> luiNorm -ci
    LVG_DIR: [/Projects/lvg2024/]
    DB_TYPE: [HSQLDB]
    DB_NAME: [lvg2024]
  • shell> luiNorm -v
    luiNorm.2024
  • shell> luiNorm -x:config.data
    Use an alternative configuration file, config.data.
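
In the examples above, the normalized result appears as the last field of each output line, after the echoed input (or after the filtered term when -ti is given). A minimal Python sketch for splitting such a line follows. It assumes that layout holds for every output line; the function and constant names are hypothetical.

```python
NO_OUTPUT = "-No Output-"  # marker printed when -n is given and nothing is produced


def split_output_line(line, separator="|"):
    """Split one luiNorm output line into (leading fields, normalized result).

    The result is None when the line carries the "-No Output-" marker.
    """
    fields = line.rstrip("\n").split(separator)
    result = fields[-1]
    return fields[:-1], (None if result == NO_OUTPUT else result)


print(split_output_line("saw|left|leaf"))
print(split_output_line("leave|saw/left/leaf", separator="/"))
print(split_output_line("|-No Output-"))
```

The separator passed here should match the one given to luiNorm with -s, otherwise the leading fields are split in the wrong places.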