Example - Add SD-Rules Derived from factD
The original Lexical Tools collected 4,467 SD-pairs, of which 4,110 are suffix SD-pairs. These SD-pairs can be used to derive possible SD-rules by following the same approach as in the nomD section:
Possible SD-rule from factD | Root | Related | Notes |
---|---|---|---|
$\|noun\|less$\|adj\|131\|131 | Yes | None | Selected |
$\|verb\|$ion\|noun\|111\|111 | Yes | Duplicated | Not selected |
ist$\|noun\|y$\|noun\|63\|63 | Yes | None | Selected |
$\|adj\|ally$\|adv\|58\|58 => ic$\|adj\|ically$\|adv is used instead => need to verify the root stats | Yes | None | Selected |
$\|noun\|ful$\|adj\|58\|58 | Yes | None | Selected |
c$\|adj\|s$\|noun\|54\|54 => ic$\|adj\|is$\|noun is used instead => need to verify the root stats | Yes | None | Selected |
on$\|noun\|ve$\|adj\|38\|38 | Yes | None | Not selected due to low frequency (coverage) |
... | ... | ... | Not selected due to low frequency (coverage) |
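The rule notation in the table above can be illustrated with a minimal sketch. It assumes that each rule has the form `inSuffix$|inCategory|outSuffix$|outCategory`, that `$` marks the end of the word, and that any further fields (such as the two trailing counts) can be ignored. The names `SdRule`, `parse_sd_rule`, and `apply_sd_rule` are hypothetical and are not part of the Lexical Tools API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SdRule:
    """One suffix-derivation rule (hypothetical representation)."""
    in_suffix: str   # suffix on the source word, "" when nothing is removed
    in_pos: str      # category of the source word, e.g. "noun"
    out_suffix: str  # suffix on the derived word
    out_pos: str     # category of the derived word


def parse_sd_rule(text: str) -> SdRule:
    """Parse 'in$|pos|out$|pos', ignoring any trailing count fields."""
    fields = text.split("|")
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields: {text!r}")
    in_suffix, in_pos, out_suffix, out_pos = fields[:4]
    if not (in_suffix.endswith("$") and out_suffix.endswith("$")):
        raise ValueError(f"suffix fields must end with '$': {text!r}")
    # Drop the '$' end-of-word marker (assumed meaning of '$').
    return SdRule(in_suffix[:-1], in_pos, out_suffix[:-1], out_pos)


def apply_sd_rule(rule: SdRule, word: str) -> Optional[str]:
    """Return the derived word, or None if the word lacks the input suffix."""
    if not word.endswith(rule.in_suffix):
        return None
    stem = word[: len(word) - len(rule.in_suffix)]
    return stem + rule.out_suffix


if __name__ == "__main__":
    for text, word in [
        ("$|noun|less$|adj|131|131", "hope"),
        ("ist$|noun|y$|noun|63|63", "biologist"),
        ("ic$|adj|ically$|adv", "basic"),
    ]:
        print(text, word, "->", apply_sd_rule(parse_sd_rule(text), word))
```

The example words are illustrations only and are not taken from factD. The sketch is purely string-based: it generates a candidate derivation but does not check whether the result is a valid word, which is what the Yes/No counts in the evaluation measure.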
The iterative results are shown as follows:
ID | New Candidate Rule | Total Yes | Total Rule No. | Rule No. | A. Rate | Occr. | Yes | No | Tbd | SD-Rule | Status | Source | Notes | Sys A. Rate | Sys C. Rate | Sys. Perf | Notes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2.3.4 (prev. optimized set) | | 39,197 | 90 | 68 | 60.66% | 183 | 111 | 72 | 0 | ar$\|adj\|e$\|noun | 2013 | ORG_RULE | SELF | 95.05% | 94.60% | 1.8965 | Baseline |
2.3.4.1 | 12\|99.95%\|1931\|1930\|1\|0\|ic$\|adj\|ically$\|adv\|2013\|ORG_FACT\|SELF | 41,127 = 39,197 + 1930 | 91 | 69 | 60.66% | 183 | 111 | 72 | 0 | ar$\|adj\|e$\|noun | 2013 | ORG_RULE | SELF | 95.28% | 94.85% | 1.9013 | Better |
2.3.4.2 | 15\|99.64%\|559\|557\|2\|0\|$\|noun\|less$\|adj\|2013\|ORG_FACT\|SELF | 41,684 = 41,127 + 557 | 92 | 70 | 60.66% | 183 | 111 | 72 | 0 | ar$\|adj\|e$\|noun | 2013 | ORG_RULE | SELF | 95.34% | 94.92% | 1.9026 | Better |
2.3.4.3 | 40\|95.63%\|504\|482\|22\|0\|ist$\|noun\|y$\|noun\|2013\|ORG_FACT\|SELF | 42,166 = 41,684 + 482 | 93 | 71 | 60.66% | 183 | 111 | 72 | 0 | ar$\|adj\|e$\|noun | 2013 | ORG_RULE | SELF | 95.35% | 94.98% | 1.9032 | Better |
2.3.4.4 | 49\|91.70%\|277\|254\|23\|0\|ic$\|adj\|is$\|noun\|2013\|ORG_FACT\|SELF | 42,420 = 42,166 + 254 | 94 | 72 | 60.66% | 183 | 111 | 72 | 0 | ar$\|adj\|e$\|noun | 2013 | ORG_RULE | SELF | 95.32% | 95.01% | 1.9033 | Better |
2.3.4.5 | 55\|89.93%\|139\|125\|14\|0\|$\|noun\|ful$\|adj\|2013\|ORG_FACT\|SELF | 42,545 = 42,420 + 125 | 95 | 73 | 60.66% | 183 | 111 | 72 | 0 | ar$\|adj\|e$\|noun | 2013 | ORG_RULE | SELF | 95.30% | 95.02% | 1.9033 | Best |
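The per-rule figures in the table can be cross-checked with a short sketch. It assumes that a rule's accuracy rate (A. Rate) is its Yes count divided by its occurrence count (Occr.), and that Total Yes accumulates the Yes counts of the accepted rules on top of the baseline of 39,197. Both assumptions are inferred from the numbers in the table rather than stated in the text, and the function names are hypothetical.

```python
# (rule, occurrences, yes) copied from the candidate-rule cells of the table.
CANDIDATES = [
    ("ic$|adj|ically$|adv", 1931, 1930),
    ("$|noun|less$|adj", 559, 557),
    ("ist$|noun|y$|noun", 504, 482),
    ("ic$|adj|is$|noun", 277, 254),
    ("$|noun|ful$|adj", 139, 125),
]

BASELINE_TOTAL_YES = 39_197  # Total Yes of the previous optimized set (2.3.4)


def accuracy_rate(yes: int, occurrences: int) -> float:
    """Assumed definition: share of a rule's generated pairs tagged Yes."""
    return yes / occurrences


def running_totals(baseline: int, candidates):
    """Return (rule, formatted accuracy rate, cumulative Total Yes) per step."""
    rows = []
    total = baseline
    for rule, occurrences, yes in candidates:
        total += yes
        rows.append((rule, f"{accuracy_rate(yes, occurrences):.2%}", total))
    return rows


if __name__ == "__main__":
    for rule, rate, total in running_totals(BASELINE_TOTAL_YES, CANDIDATES):
        print(f"{rule:22s} {rate:>7s} {total:>7,}")
```

Under these assumptions the computed rates and running totals agree with the A. Rate and Total Yes columns for all five steps.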
From the results above, all five selected SD-rules (those with the highest frequency and precision from factD) improved the system performance, so all five are added to the SD-rule set. Please note that the SD-rules ic$|adj|ically$|adv and ic$|adj|is$|noun are suggested in place of their root parent rules $|adj|ally$|adv and c$|adj|s$|noun, respectively. Both root parent rules should be re-evaluated by this system.
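The step-by-step selection can be sketched as a simple greedy loop over the reported system figures. Two points are assumptions and not taken from the text: first, the reported Sys. Perf values appear to equal Sys A. Rate plus Sys C. Rate (they match to within rounding of the last digit); second, a rule is kept when performance does not drop, with ties accepted because the last two steps are equal at the four-decimal precision shown. The names `ROWS`, `max_sum_gap`, and `select_rules` are hypothetical.

```python
# (ID, candidate rule, Sys A. Rate, Sys C. Rate, Sys. Perf) from the table.
ROWS = [
    ("2.3.4", None, 0.9505, 0.9460, 1.8965),
    ("2.3.4.1", "ic$|adj|ically$|adv", 0.9528, 0.9485, 1.9013),
    ("2.3.4.2", "$|noun|less$|adj", 0.9534, 0.9492, 1.9026),
    ("2.3.4.3", "ist$|noun|y$|noun", 0.9535, 0.9498, 1.9032),
    ("2.3.4.4", "ic$|adj|is$|noun", 0.9532, 0.9501, 1.9033),
    ("2.3.4.5", "$|noun|ful$|adj", 0.9530, 0.9502, 1.9033),
]


def max_sum_gap(rows) -> float:
    """Largest gap between (accuracy + coverage) and the reported Sys. Perf."""
    return max(abs((acc + cov) - perf) for _, _, acc, cov, perf in rows)


def select_rules(rows):
    """Keep each candidate whose reported performance does not fall below
    the best performance seen so far (assumed acceptance criterion)."""
    baseline, *steps = rows
    best = baseline[4]
    accepted = []
    for _step_id, rule, _acc, _cov, perf in steps:
        if perf >= best:
            accepted.append(rule)
            best = perf
    return accepted


if __name__ == "__main__":
    print("max gap between A+C and Sys. Perf:", round(max_sum_gap(ROWS), 4))
    print("accepted rules:", select_rules(ROWS))
```

With the figures from the table, this criterion accepts all five candidate rules, which agrees with the selection described above. A stricter criterion would need the unrounded performance values.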
The table above shows the results of adding the new factD-derived rules one step at a time. With all five rules added, the optimized set includes 73 of the 95 SD-rules and reaches a coverage rate of 95.02% and a system performance of 1.9033, with an accuracy rate of 95.30%. The diagram below shows the system accuracy and coverage curves of this optimized set.