Lexical Tools

Example - Add SD-Rules from Other Suggestions

Other suggested SD-rules can be evaluated and added to this system by following the same approach used in the previous two sections (nomD and factD).

  • Suggested SD-rules:

    Suggested SD-Rule: $|verb|less$|adj
    Notes: This rule should be evaluated first to complete suffix negation.
    Examples:
    • count$|verb|countless$|adj
    • resist$|verb|resistless$|adj
    • tame$|verb|tameless$|adj
    • tire$|verb|tireless$|adj
    Status: Evaluated: Bad SD-Rule

    Suggested SD-Rule: $|adj|ly$|adv
    Notes: Suggested by Lynn
    Examples:
    • quick|adj|quickly|adv
    Status: Rejected - duplicated

    Suggested SD-Rule: $|noun|y$|adj
    Notes: Suggested by Lynn
    Examples:
    • wood|noun|woody|adj
    Status: Rejected - duplicated

    Suggested SD-Rule: al$|adj|us$|noun
    Notes: Suggested by Chris, seconded by Destinee & Lynn
    Examples:
    • viral|adj|virus|noun
    • estral|adj|estrus|noun
    • coccal|adj|coccus|noun
    Status: TBD

    Suggested SD-Rule: $|noun|ize$|verb
    Notes: Suggested by Lynn
    Examples:
    • terror|noun|terrorize|verb
    Status: TBD

    Suggested SD-Rule: e$|verb|ing$|noun
    Notes: Suggested by Chris
    Examples:
    • rave|verb|raving|noun
    Status: TBD
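
The rule notation above can be read as "source suffix|source category|target suffix|target category". The following is a minimal sketch of that reading, assuming that "$" marks the end of the word and that a rule is applied by stripping the source suffix and appending the target suffix. The function names parse_sd_rule and apply_sd_rule are hypothetical and are not part of the Lexical Tools API.

```python
def parse_sd_rule(rule):
    """Split a rule such as 'al$|adj|us$|noun' into its four fields.

    Assumption: '$' is an end-of-word anchor, so it is dropped from the suffixes.
    """
    suffix_in, cat_in, suffix_out, cat_out = rule.split("|")
    return suffix_in.rstrip("$"), cat_in, suffix_out.rstrip("$"), cat_out


def apply_sd_rule(rule, word, category):
    """Return (derived_word, derived_category), or None if the rule does not apply."""
    suffix_in, cat_in, suffix_out, cat_out = parse_sd_rule(rule)
    if category != cat_in or not word.endswith(suffix_in):
        return None
    # Strip the source suffix (possibly empty) and append the target suffix.
    stem = word[:len(word) - len(suffix_in)]
    return stem + suffix_out, cat_out


# Examples taken from the suggested rules above
viral_pair = apply_sd_rule("al$|adj|us$|noun", "viral", "adj")
tire_pair = apply_sd_rule("$|verb|less$|adj", "tire", "verb")
```

A candidate pair generated this way is only a raw SD-pair. Whether it is valid still has to be decided by tagging, which is why a productive-looking rule such as $|verb|less$|adj can end up with a very low accuracy rate.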

  • Apply the same procedure to obtain the optimized set, using the latest optimized set (2.3.4.5) as the new baseline. This task involves:
    • Retrieve all raw SD-pairs from the Lexicon (2013) for the four SD-rules above that were not rejected
    • Tag the raw SD-pairs
    • Get the stats of the SD-pairs for these four SD-rules
    • Add the rules to the SD-rules set and find the optimized set
      • Check whether each rule is a root parent-rule
    • Calculate the total number of valid SD-pairs (TotalYes) as the sum of the valid SD-pairs from all parent-rules.
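
The optimization step can be sketched as follows. This is an illustration under stated assumptions, not the procedure used by the Lexical Tools: it assumes that rules are ranked by accuracy rate, that system accuracy and coverage are computed cumulatively over the ranked rules, that system performance is the sum of the two, and that every rule is a parent-rule (so no SD-pair is counted twice). The names RuleStats and find_optimized_set are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    rule: str  # SD-rule, e.g. "ar$|adj|e$|noun"
    yes: int   # SD-pairs tagged as valid
    no: int    # SD-pairs tagged as invalid

    @property
    def accuracy(self):
        total = self.yes + self.no
        return self.yes / total if total else 0.0


def find_optimized_set(rules):
    """Assumed procedure: rank by accuracy, keep the prefix with the best performance."""
    ranked = sorted(rules, key=lambda r: r.accuracy, reverse=True)
    # TotalYes: valid pairs over all evaluated rules (all assumed to be parent-rules).
    total_yes = sum(r.yes for r in ranked)
    best_perf, best_size = 0.0, 0
    cum_yes = cum_no = 0
    for size, stats in enumerate(ranked, start=1):
        cum_yes += stats.yes
        cum_no += stats.no
        tagged = cum_yes + cum_no
        sys_accuracy = cum_yes / tagged if tagged else 0.0
        sys_coverage = cum_yes / total_yes if total_yes else 0.0
        perf = sys_accuracy + sys_coverage  # assumed definition of system performance
        if perf > best_perf:
            best_perf, best_size = perf, size
    return ranked[:best_size], best_perf


# Toy input: only the two rules whose counts appear in this section
rules = [
    RuleStats("ar$|adj|e$|noun", yes=111, no=72),
    RuleStats("$|verb|less$|adj", yes=7, no=374),
]
selected, perf = find_optimized_set(rules)
```

With this toy input the low-accuracy rule $|verb|less$|adj is left out of the selected set, because adding it lowers the cumulative accuracy far more than it raises the coverage.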

    The iterative results are shown as follows:

ID                        2.3.4.5 (prev. optimized set)  2.3.4.5.1
Suggested Candidate Rule  -                              101|1.84%|381|7|374|0|$|verb|less$|adj|2013|ORG_FACT|SELF
Total Yes                 42,545                         42,552 (= 42,545 + 7)
Total Rule No.            95                             96
Rule No.                  73                             73
A. Rate                   60.66%                         60.66%
Occr.                     183                            183
Yes                       111                            111
No                        72                             72
Tbd                       0                              0
SD-Rule                   ar$|adj|e$|noun                ar$|adj|e$|noun
Status                    2013                           2013
Source                    ORG_RULE                       ORG_RULE
Notes                     SELF                           SELF
Sys A. Rate               95.30%                         95.30%
Sys C. Rate               95.02%                         95.01%
Sys. Perf                 1.9033                         1.9031
Notes                     Baseline                       Worse
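
The figures in the results above can be cross-checked with a few lines of arithmetic. This is a sketch that assumes the accuracy rate of a rule is Yes divided by the number of occurrences (Yes + No + Tbd) and that Sys. Perf is the sum of Sys A. Rate and Sys C. Rate; both readings are consistent with the reported numbers but are not stated explicitly here. The helper name accuracy_rate is hypothetical.

```python
def accuracy_rate(yes, no, tbd=0):
    """Share of valid SD-pairs among all occurrences of a rule (assumed definition)."""
    occurrences = yes + no + tbd
    return yes / occurrences


# Per-rule accuracy rates
less_rule = accuracy_rate(yes=7, no=374)   # $|verb|less$|adj, 381 occurrences
ar_rule = accuracy_rate(yes=111, no=72)    # ar$|adj|e$|noun, 183 occurrences

# TotalYes grows by the 7 valid pairs of the new candidate rule
total_yes = 42_545 + 7

# Assumption: Sys. Perf = Sys A. Rate + Sys C. Rate
perf_with_new_rule = 0.9530 + 0.9501

print(f"{less_rule:.2%} {ar_rule:.2%} {total_yes:,} {perf_with_new_rule:.4f}")
# prints: 1.84% 60.66% 42,552 1.9031
```

The same sum for the baseline (0.9530 + 0.9502) gives 1.9032 rather than the reported 1.9033. The difference in the last digit is presumably due to the displayed rates being rounded.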

The result shows that the newly selected SD-rule does not improve system performance, so it should not be included in the optimized set. However, the SD-rule is added to the SD-rules evaluation set so that the set covers more evaluated SD-rules. The new optimized set includes 73 (out of 96) SD-rules, with an accuracy rate of 95.30%, a coverage rate of 95.01%, and a system performance of 1.9031. This set should be used as the baseline for evaluating new rules in the future. The diagram below shows the system accuracy and coverage curves of this optimized set.