Text Categorization

STI Input Filter Options

STI provides input filter options that let users filter out irrelevant words. For example, filters such as the word extraction filter remove punctuation, stopwords, words not in the restrictwords list, words not in the legal words list, non-unique words, and so on. STI also provides detailed filtering messages for debugging purposes.
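To make the idea of chained input filters concrete, here is a minimal Python sketch. It is not STI's implementation: the function names, the tiny stopword list, the lowercasing step, and the order in which the filters run are all assumptions made for illustration. It strips punctuation, drops stopwords, optionally removes duplicate words, and can print a short summary loosely analogous to a detailed filtering message.

```python
import string

# Hypothetical stopword list for illustration only; STI's actual list is not shown here.
STOPWORDS = {"the", "of", "a", "and", "in"}


def extract_words(text):
    """Split on whitespace, strip surrounding punctuation, and drop empty tokens."""
    words = []
    for token in text.split():
        word = token.strip(string.punctuation).lower()  # lowercasing is an assumption
        if word:
            words.append(word)
    return words


def remove_stopwords(words, stopwords=STOPWORDS):
    """Keep only the words that are not in the stopword set."""
    return [w for w in words if w not in stopwords]


def unique_words(words):
    """Remove duplicate words while keeping first-occurrence order."""
    seen = set()
    result = []
    for w in words:
        if w not in seen:
            seen.add(w)
            result.append(w)
    return result


def filter_input(text, use_unique=False, detail=False):
    """Run the filters in sequence (the ordering is an assumption of this sketch)."""
    words = extract_words(text)
    kept = remove_stopwords(words)
    if use_unique:
        kept = unique_words(kept)
    if detail:
        # Simple stand-in for a detailed filtering message.
        print(f"extracted {len(words)} words, kept {len(kept)}")
    return kept
```

For instance, calling `filter_input("The heart, and the heart of a patient.", use_unique=True)` in this sketch keeps only `heart` and `patient`; without `use_unique`, the duplicate `heart` is retained.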

The table below lists all input filter options of STI:

  • Input Filter options

    Option Flag    Feature Description
    -if:a          Use Acronyms filter
    -if:d          Show detailed information for input filter
    -if:e          Do not use the word extraction filter
    -if:h          Show input filter help menu
    -if:u          Use unique words filter (remove duplicate words)

  • Legal Word Filter options