Lexical Tools

Optimizing 2017 SD-Rule Set - Optimum Log

I. Criteria:

  • Total number of valid SD-pairs from the baseline (parent rules only): 51,788
  • Candidate child rules:
    • Decompose occurrence rate >= 40% (default)
    • Occurrence rate >= 25% (default)
    • Precision is decided by the optimization methodology
    • Find the one with the maximum precision and recall
    • If a child rule has lower precision and recall than its parent, it is not a good candidate, even if its recall is over 25%.

      For example, in Cases 15 and 16 the system performance is worse because both precision and recall are lower than the parent's, so there is no need to run the program for these two cases.

      If a child rule performs worse, its next-generation child rules will also perform worse, so there is no need to run the following generations (although we still run them to keep the log complete).

  • Find the best set by comparing parent vs. child rules:
    • Only apply when the child rule's precision is better than the parent rule's
    • Higher system performance
    • If the system performance is the same,
      • Use precision
      • Use recall
      • Use linguistic knowledge

      • Use the parent rule to replace child rules.
      • If no parent-child rules are involved, use more rules
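
The criteria above can be sketched in code. This is a minimal illustration, not the Lexical Tools implementation: the class and function names (`RuleStats`, `SystemResult`, `is_candidate`, `best_result`) are hypothetical, and the occurrence rate is assumed to be the child rule's occurrences divided by the parent rule's occurrences. The final tie-breaker, linguistic knowledge, is left to manual review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStats:
    """Occurrence counts for one SD-rule (field names are illustrative)."""
    rule: str
    occurrences: int
    yes: int  # valid SD-pairs produced by the rule

    @property
    def precision(self) -> float:
        return self.yes / self.occurrences


def is_candidate(child: RuleStats, parent: RuleStats, min_rate: float = 0.25) -> bool:
    """Keep a child rule only if it covers enough of the parent's
    occurrences and is more precise than the parent."""
    occurrence_rate = child.occurrences / parent.occurrences
    return occurrence_rate >= min_rate and child.precision > parent.precision


@dataclass(frozen=True)
class SystemResult:
    case_id: str
    precision: float    # system accuracy rate, in percent
    recall: float       # system coverage rate, in percent
    performance: float  # system performance as reported in the log


def best_result(results: list[SystemResult]) -> SystemResult:
    """Highest performance wins; ties fall back to precision, then recall."""
    return max(results, key=lambda r: (r.performance, r.precision, r.recall))


# Figures taken from cases 1.1, 14.1 and 14.2 of the log below.
parent = RuleStats("$|adj|ally$|adv", occurrences=2072, yes=2053)
child = RuleStats("c$|adj|cally$|adv", occurrences=1954, yes=1953)
print(is_candidate(child, parent))

runs = [
    SystemResult("14.1", 95.08, 94.92, 1.9000),
    SystemResult("14.2", 95.09, 94.92, 1.9000),
]
print(best_result(runs).case_id)
```

Cases 14.1 and 14.2 report the same system performance, so the comparison falls through to precision and prefers 14.2.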

II. Iterative Optimization Log:

Source:

  • Dir: ${SUFFIX_DIR}/data/${YEAR}/dataR/SdRulesOptimum/*/
  • File: sdRules.stats.out.html
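
To collect the per-run statistics files, a small helper along these lines may be useful. It is only a sketch: `find_stats_files` is a hypothetical name, and it assumes the values of `${SUFFIX_DIR}` and `${YEAR}` are passed in by the caller.

```python
from pathlib import Path


def find_stats_files(suffix_dir: str, year: int) -> list[Path]:
    """Return every sdRules.stats.out.html one level below SdRulesOptimum."""
    base = Path(suffix_dir) / "data" / str(year) / "dataR" / "SdRulesOptimum"
    # The '*' mirrors the wildcard directory in the documented path.
    return sorted(base.glob("*/sdRules.stats.out.html"))
```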

Columns:

  • ID
  • Rank: Parent-Rule
  • Rank: Candidate Child-Rules
  • Cutoff SD-Rules: Rule No., A. Rate, Occr., Yes, No, SD-Rule
  • Sys Accu. Rate (Precision)
  • Sys Cover. Rate (Recall)
  • Sys. Perf
  • Notes

0 Rank in Baseline (all Rank)
Parent-rule only - Baseline
Rank
No child-Rule
85 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.47% | 1.8953 | Baseline
1.1 18 (28):
0|2072|2053|19|$|adj|ally$|adv|99.08%|100.00%
11:
1|1954|1953|1|c$|adj|cally$|adv|99.95%|94.31%
85 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.28% | 1.8936 | Worse
1.2 18 (28):
0|2072|2053|19|$|adj|ally$|adv|99.08%|100.00%
11:
2|1949|1948|1|ic$|adj|ically$|adv|99.95%|94.06%
85 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.27% | 1.8935 | Worse
2.1 26 (39):
0|2081|2037|44|$|adj|ity$|noun|97.89%|100.00%
16, 28:
1|946|939|7|c$|adj|city$|noun|99.26%|45.46%
1|728|712|16|l$|adj|lity$|noun|97.80%|34.98%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 93.73% | 1.8878 | Worse
2.2 26 (39):
0|2081|2037|44|$|adj|ity$|noun|97.89%|100.00%
15, 28:
2|945|939|6|ic$|adj|icity$|noun|99.37%|45.41%
1|728|712|16|l$|adj|lity$|noun|97.80%|34.98%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 93.73% | 1.8879 | Worse
3.1 67 (101):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
72:
1|673|557|116|n$|noun|nal$|adj|82.76%|50.75%
87 | 61.29% | 93 | 57 | 36 | $|noun|ish$|adj | 95.28% | 94.02% | 1.8931 | Worse
3.2 67 (101):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
69:
2|621|533|88|on$|noun|onal$|adj|85.83%|46.83%
87 | 61.29% | 93 | 57 | 36 | $|noun|ish$|adj | 95.33% | 93.98% | 1.8931 | Worse
3.3 67 (101):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
68:
3|577|497|80|ion$|noun|ional$|adj|86.14%|43.51%
87 | 61.29% | 93 | 57 | 36 | $|noun|ish$|adj | 95.35% | 93.91% | 1.8925 | Worse
3.4 67 (101):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
66:
4|472|408|64|tion$|noun|tional$|adj|86.44%|35.60%
87 | 61.29% | 93 | 57 | 36 | $|noun|ish$|adj | 95.37% | 93.73% | 1.8910 | Worse
4.1 20 (114):
0|661|343|318|$|noun|y$|noun|51.89%|100.00%
47:
1|253|234|19|h$|noun|hy$|noun|92.49%|38.28%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.04% | 94.92% | 1.8997 | Better
5.1 42 (59):
0|573|537|36|$|verb|ion$|noun|93.72%|100.00%
35:
1|449|434|15|t$|verb|tion$|noun|96.66%|78.36%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.07% | 94.72% | 1.8980 | Worse
5.2 42 (59):
0|573|537|36|$|verb|ion$|noun|93.72%|100.00%
15:
2|322|320|2|ct$|verb|ction$|noun|99.38%|56.20%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.50% | 1.8959 | Worse
5.3 42 (59):
0|573|537|36|$|verb|ion$|noun|93.72%|100.00%
5:
3|186|186|0|ect$|verb|ection$|noun|100.00%|32.46%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.25% | 1.8932 | Worse
6.1 66 (86):
0|264|228|36|a$|noun|an$|adj|86.36%|100.00%
No candidate child rules found!
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.04% | 94.92% | 1.8997 | Same
7.1 109 (139):
0|276|1|275|a$|noun|an$|noun|0.36%|100.00%
116:
1|137|1|136|ia$|noun|ian$|noun|0.73%|49.64%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.04% | 94.92% | 1.8997 | Same
8.1 62 (82):
0|137|120|17|a$|noun|ar$|adj|87.59%|100.00%
53:
1|115|105|10|la$|noun|lar$|adj|91.30%|83.94%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.05% | 94.89% | 1.8995 | Worse
8.2 62 (82):
0|137|120|17|a$|noun|ar$|adj|87.59%|100.00%
41:
2|69|65|4|ula$|noun|ular$|adj|94.20%|50.36%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.82% | 1.8988 | Worse
9.1 16 (25):
0|2527|2508|19|ation$|noun|e$|verb|99.25%|100.00%
13, 2:
1|1060|1059|1|sation$|noun|se$|verb|99.91%|41.95%
1|1257|1257|0|zation$|noun|ze$|verb|100.00%|49.74%
87 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.55% | 1.8961 | Worse
9.2 16 (25):
0|2527|2508|19|ation$|noun|e$|verb|99.25%|100.00%
5, 2:
2|1036|1036|0|isation$|noun|ise$|verb|100.00%|41.00%
2|1250|1250|0|ization$|noun|ize$|verb|100.00%|49.47%
87 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.49% | 1.8955 | Worse
10.1 59 (79):
0|292|260|32|c$|adj|s$|noun|89.04%|100.00%
52:
1|282|258|24|ic$|adj|is$|noun|91.49%|96.58%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.92% | 1.8998 | Better
10.2 59 (79):
0|292|260|32|c$|adj|s$|noun|89.04%|100.00%
41:
2|190|180|10|tic$|adj|tis$|noun|94.74%|65.07%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.77% | 1.8984 | Worse
10.3 59 (79):
0|292|260|32|c$|adj|s$|noun|89.04%|100.00%
29:
3|172|168|4|itic$|adj|itis$|noun|97.67%|58.90%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.09% | 94.75% | 1.8983 | Worse
11.1 21 (32):
0|414|409|5|cy$|noun|t$|adj|98.79%|100.00%
No candidate child rules found!
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.92% | 1.8998 | Same
12.1 20 (31):
0|2343|2315|28|e$|verb|ion$|noun|98.80%|100.00%
14:
1|2206|2198|8|te$|verb|tion$|noun|99.64%|94.15%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.69% | 1.8978 | Worse
12.2 20 (31):
0|2343|2315|28|e$|verb|ion$|noun|98.80%|100.00%
13:
2|2102|2098|4|ate$|verb|ation$|noun|99.81%|89.71%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.50% | 1.8958 | Worse
12.3 20 (31):
0|2343|2315|28|e$|verb|ion$|noun|98.80%|100.00%
4:
3|600|600|0|late$|verb|lation$|noun|100.00%|25.61%
85 | 68.29% | 123 | 84 | 39 | ant$|adj|ate$|verb | 95.15% | 91.20% | 1.8636 | Worse
13.1 37 (52):
0|144|138|6|e$|verb|is$|noun|95.83%|100.00%
27:
1|141|138|3|se$|verb|sis$|noun|97.87%|97.92%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.92% | 1.8998 | Same
13.2 37 (52):
0|144|138|6|e$|verb|is$|noun|95.83%|100.00%
36, 8:
2|54|52|2|ose$|verb|osis$|noun|96.30%|37.50%
2|59|59|0|yse$|verb|ysis$|noun|100.00%|40.97%
87 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.87% | 1.8993 | Worse
13.3 37 (52):
0|144|138|6|e$|verb|is$|noun|95.83%|100.00%
36, 8:
2|54|52|2|ose$|verb|osis$|noun|96.30%|37.50%
3|58|58|0|lyse$|verb|lysis$|noun|100.00%|40.28%
87 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.87% | 1.8993 | Worse
14.1 47 (64):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
22:
1|209|206|3|nesis$|noun|nic$|adj|98.56%|93.30%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.08% | 94.92% | 1.9000 | Better
14.2 47 (64):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
15:
2|207|206|1|enesis$|noun|enic$|adj|99.52%|92.41%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.09% | 94.92% | 1.9000 | Better
14.3 47 (64):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
15:
3|207|206|1|genesis$|noun|genic$|adj|99.52%|92.41%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.09% | 94.92% | 1.9000 | Same
14.4 47 (64):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
5:
4|181|181|0|ogenesis$|noun|ogenic$|adj|100.00%|80.80%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.09% | 94.87% | 1.8996 | Worse
15.1 11 (17):
0|1634|1633|1|ility$|noun|le$|adj|99.94%|100.00%
2:
1|1632|1632|0|bility$|noun|ble$|adj|100.00%|99.88%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.09% | 94.92% | 1.9000 | Same
15.2 11 (17):
0|1634|1633|1|ility$|noun|le$|adj|99.94%|100.00%
2:
2|1294|1294|0|ability$|noun|able$|adj|100.00%|79.19%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.26% | 1.8932 | Worse
16.1 14 (21):
0|1017|1013|4|sis$|noun|tic$|adj|99.61%|100.00%
4, 16:
1|369|369|0|osis$|noun|otic$|adj|100.00%|36.28%
1|336|334|2|esis$|noun|etic$|adj|99.40%|33.04%
87 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.06% | 94.32% | 1.8938 | Worse
17.1 58 (78):
0|101|90|11|sity$|noun|us$|adj|89.11%|100.00%
58:
1|100|90|10|osity$|noun|ous$|adj|90.00%|99.01%
86 | 63.14% | 331 | 209 | 122 | $|noun|ist$|noun | 95.09% | 94.92% | 1.9001 | Best
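
The rates in the log can be rechecked from the raw counts. The sketch below rests on assumptions inferred from the figures rather than stated in the log: each pipe-delimited rule line is read as level|occurrences|yes|no|rule (four fields)|precision|recall, a child rule's recall is taken as its occurrences divided by its parent's, and system performance is taken as precision plus recall. The names `parse_rule_line` and `percent`, and the label "level" for the first field, are hypothetical.

```python
def parse_rule_line(line: str) -> dict:
    """Split one pipe-delimited rule line from the log.

    Assumed layout: level|occurrences|yes|no|<rule: 4 fields>|precision|recall
    """
    fields = line.strip().split("|")
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    level, occurrences, yes, no = (int(f) for f in fields[:4])
    return {
        "level": level,
        "occurrences": occurrences,
        "yes": yes,
        "no": no,
        "rule": "|".join(fields[4:8]),  # the rule itself contains pipes
    }


def percent(numerator: int, denominator: int) -> float:
    """Ratio as a percentage rounded to two decimals, as in the log."""
    return round(100.0 * numerator / denominator, 2)


# Case 1.1: parent rule and its candidate child rule.
parent = parse_rule_line("0|2072|2053|19|$|adj|ally$|adv|99.08%|100.00%")
child = parse_rule_line("1|1954|1953|1|c$|adj|cally$|adv|99.95%|94.31%")

child_precision = percent(child["yes"], child["occurrences"])
child_recall = percent(child["occurrences"], parent["occurrences"])

# Baseline row: precision 95.06%, recall 94.47%.
system_performance = round((95.06 + 94.47) / 100, 4)

print(child["rule"], child_precision, child_recall, system_performance)
```

The recomputed values agree with the logged 99.95%, 94.31% and 1.8953. A few other rows differ from the simple sum in the last digit, presumably because the log adds unrounded rates.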