Text Categorization

JDI Legal Word Filter Options

The legal word filter is one of the processes in the input filter options. The JDI input filter uses it to filter out words that are not legal. The legal word options let users set the minimum word length, restrictwords, stopwords, document count, word count, and normalized signal according to their preference. These options are designed for advanced users with special requirements. General users should skip them and use the default settings.

The table below lists all JDI legal word filter options:

  • Legal word filter options

    Option flag    Feature description
    -lw:d          Show legal word filter option details
    -lw:dc~INT     Set value of min. document count (default: 2)
    -lw:dc~u       Use min. document count criteria
    -lw:h          Show legal word filter help menu
    -lw:hs~INT     Set value of high signal (default: 754648)
    -lw:hs~n       Do not use high signal criteria
    -lw:ls~INT     Set value of low signal (default: 2)
    -lw:ls~n       Do not use low signal criteria
    -lw:r          Remove restrictwords filter
    -lw:s          Keep stopwords (do not use stopwords filter)
    -lw:wc~INT     Set value of min. word count (default: 2)
    -lw:wc~u       Use min. word count criteria
    -lw:wl~INT     Set value of min. word length (default: 3)
    -lw:wl~n       Do not use min. word length criteria
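To make the role of each option more concrete, the following Python sketch shows one way such criteria could be combined. It is an illustration only, not JDI's actual implementation: the class `LegalWordOptions`, the functions `illegal_reason` and `legal_words`, the order of the checks, and the toy word lists and statistics are all assumptions. Only the default values are taken from the option table above.

```python
from dataclasses import dataclass


@dataclass
class LegalWordOptions:
    """Hypothetical container mirroring the -lw flags and their defaults."""
    min_length: int = 3            # -lw:wl~INT
    use_min_length: bool = True    # turned off by -lw:wl~n
    remove_stopwords: bool = True  # turned off by -lw:s
    restrict_only: bool = True     # turned off by -lw:r
    low_signal: int = 2            # -lw:ls~INT
    use_low_signal: bool = True    # turned off by -lw:ls~n
    high_signal: int = 754648      # -lw:hs~INT
    use_high_signal: bool = True   # turned off by -lw:hs~n
    min_word_count: int = 2        # -lw:wc~INT
    use_min_word_count: bool = False  # turned on by -lw:wc~u
    min_doc_count: int = 2         # -lw:dc~INT
    use_min_doc_count: bool = False   # turned on by -lw:dc~u


def illegal_reason(word, opts, stopwords, restrictwords, stats):
    """Return why `word` is rejected, or None if it passes every enabled check.

    `stats` maps a word to (normalized signal, word count, document count).
    The check order is an assumption made for this sketch.
    """
    if opts.use_min_length and len(word) < opts.min_length:
        return "too short"
    if opts.remove_stopwords and word in stopwords:
        return "stopword"
    if opts.restrict_only and word not in restrictwords:
        return "not a restrictword"
    signal, word_count, doc_count = stats.get(word, (0, 0, 0))
    if opts.use_low_signal and signal < opts.low_signal:
        return "signal too low"
    if opts.use_high_signal and signal > opts.high_signal:
        return "signal too high"
    if opts.use_min_word_count and word_count < opts.min_word_count:
        return "word count too low"
    if opts.use_min_doc_count and doc_count < opts.min_doc_count:
        return "document count too low"
    return None


def legal_words(words, opts, stopwords, restrictwords, stats):
    """Keep only the words for which no enabled check reports a reason."""
    return [w for w in words
            if illegal_reason(w, opts, stopwords, restrictwords, stats) is None]


# Toy data, invented for illustration; real JDI uses its own word lists.
STOPWORDS = {"the"}
RESTRICTWORDS = {"heart", "valve"}
STATS = {"heart": (500, 40, 30), "valve": (120, 10, 8)}

if __name__ == "__main__":
    opts = LegalWordOptions()
    print(legal_words(["the", "heart", "valve"], opts,
                      STOPWORDS, RESTRICTWORDS, STATS))
```

With the toy data, the word "the" meets the minimum length of 3 but is rejected as a stopword, which matches the behavior reported in the sample session below.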

  • Examples:
      > jdi -if:d -lw:d -p
      - Please input a term (type "Ctl-d" to quit) >
      the heart valve
      --> Input text: [the heart valve]
      -- Words after Acronym filter [the heart valve], Acronym filter is not used.
      -- W.E. filtered words (3): [the heart valve], W.E. filter is used
      -- Legal words (2): [heart valve]
      ---  Legal words selected options:
         - Min. length: true (3)
         - Remove stopwords: true
         - Restrictwords only: true
         - Min. normalized count: true (2)
         - Max. normalized count: true (792054)
         - Min. WC: false (2)
         - Min. DC: false (2)
         - Illegal words details: 
           - [the]: it is a stopword.
      -- Unique words (2): [heart valve], unique word filter is not used
      -- Final words (2): [heart valve]
      
      -- Number of scores: 123
      -- Total final words used: 2
      --- JD scores (x 1) and rank based on word count ---
      JD018|Cardiology
      1|0.0858526|JD018|Cardiology
      2|0.0624434|JD148|Pulmonary Medicine
      3|0.0495025|JD124|Vascular Diseases
      4|0.0251979|JD144|General Surgery
      5|0.0209033|JD030|Diagnostic Imaging
      6|0.0108041|JD120|Transplantation
      7|0.0090153|JD005|Anesthesiology
      8|0.0086425|JD014|Biomedical Engineering
      9|0.0067363|JD100|Radiology
      10|0.0064961|JD118|Therapeutics
      --- JD scores (x 1) and rank based on document count ---
      JD018|Cardiology
      1|0.1564322|JD018|Cardiology
      2|0.0979494|JD148|Pulmonary Medicine
      3|0.0891969|JD124|Vascular Diseases
      4|0.0438102|JD030|Diagnostic Imaging
      5|0.0400007|JD144|General Surgery
      6|0.0236169|JD005|Anesthesiology
      7|0.0187880|JD120|Transplantation
      8|0.0158293|JD014|Biomedical Engineering
      9|0.0151241|JD092|Physiology
      10|0.0133293|JD118|Therapeutics
      --- Overall JD rank ---
      JD018|Cardiology|dc
      
    • The command above indexes the input at a prompt and shows input filter and legal word filter details