Lexical Tools

Optimizing 2016 SD-Rule Set - Optimum Log

I. Criteria:

  • Total valid SD-Pairs from baseline (parent only rules) is 50,814
  • Candidate child rules are:
    • Decompose occurrence rate >= 40% (default)
    • Candidate child rules: occurrence rate >= 25% (default)
    • Candidate child rules: precision is decided by the methodology of optimization
    • Find the one with max. precision and recall.

      If the child rule has lower precision and recall than its parent, it is not a good candidate even if its recall is over 25%!

      For example, in Cases 15 and 16 the system performance is worse because both precision and recall are lower than the parent's. There is no need to run the program for these two cases.

      If the performance of a child rule is worse, then the next-generation child rules will be worse too. There is no need to run through the following generations (even though we still run through them to keep the log complete)!

  • Find the best set by comparing parent vs. child rules:
    • Only apply when the child rules' precision is better than the parent rule's
    • Higher system performance
    • If System Performance is the same,
      • Use Precision
      • Use Recall
      • Use Linguistic knowledge

      • Use the parent rule to replace child rules.
      • If no parent-child rules are involved, use more rules
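
The comparison criteria above can be sketched in a few lines of Python. This is only an illustration, not the program that produced the log. It assumes that system performance is the sum of precision and recall, which matches the logged baseline (95.15% + 94.85% = 1.9000) but is not stated explicitly in the criteria. The names `RuleSetResult`, `is_candidate`, and `compare_to_best` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSetResult:
    """System-level result for one rule set (hypothetical container)."""
    label: str
    precision: float  # system accuracy rate, as a fraction (0.9515 = 95.15%)
    recall: float     # system coverage rate, as a fraction

    @property
    def performance(self) -> float:
        # Assumption: Sys. Perf = precision + recall, as the logged values suggest.
        return self.precision + self.recall


def is_candidate(child_occurrences: int, parent_occurrences: int,
                 min_rate: float = 0.25) -> bool:
    """A child rule qualifies when its occurrence rate reaches the 25% default."""
    return child_occurrences / parent_occurrences >= min_rate


def compare_to_best(candidate: RuleSetResult, best: RuleSetResult,
                    tol: float = 1e-9) -> str:
    """Label a candidate rule set against the best set found so far."""
    diff = candidate.performance - best.performance
    if diff > tol:
        return "Better"
    if diff < -tol:
        return "Worse"
    return "Same"  # ties are then resolved by the remaining criteria


baseline = RuleSetResult("0", 0.9515, 0.9485)
case_1_1 = RuleSetResult("1.1", 0.9518, 0.9465)
case_9_1 = RuleSetResult("9.1", 0.9517, 0.9484)
print(compare_to_best(case_1_1, baseline))
print(compare_to_best(case_9_1, baseline))
```

The sketch only automates the performance comparison. Ties that call for linguistic knowledge or for preferring the parent rule still need a manual decision.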

II. Iterative Optimization Log:

Source:

  • Dir: ${SUFFIX_DIR}/data/${YEAR}/dataR/SdRulesOptimum/*/
  • File: sdRules.stats.out.html

Columns:

  • ID
  • Rank: Parent-Rule
  • Rank: Candidate Child-Rules
  • Cutoff SD-Rules: Rule No. | A. Rate | Occr. | Yes | No | SD-Rule
  • Sys Accu. Rate (Precision)
  • Sys Cover. Rate (Recall)
  • Sys. Perf
  • Notes
0 Rank in Baseline (all Rank)
Parent-rule only - Baseline
Rank
No child-Rule
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.15%|94.85%|1.9000|Baseline
1.1 16 (25):
0|2072|2053|19|$|adj|ally$|adv|99.08%|100.00%
9:
1|1954|1953|1|c$|adj|cally$|adv|99.95%|94.31%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.65%|1.8983|Worse
1.2 16 (25):
0|2072|2053|19|$|adj|ally$|adv|99.08%|100.00%
9:
2|1949|1948|1|ic$|adj|ically$|adv|99.95%|94.06%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.64%|1.8982|Worse
2.1 24 (37):
0|2082|2038|44|$|adj|ity$|noun|97.89%|100.00%
14, 26:
1|946|939|7|c$|adj|city$|noun|99.26%|45.44%
1|729|713|16|l$|adj|lity$|noun|97.81%|35.01%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.16%|94.09%|1.8924|Worse
2.2 24 (37):
0|2082|2038|44|$|adj|ity$|noun|97.89%|100.00%
13, 26:
2|945|939|6|ic$|adj|icity$|noun|99.37%|45.39%
1|729|713|16|l$|adj|lity$|noun|97.81%|35.01%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.16%|94.09%|1.8925|Worse
3.1 77 (96):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
69:
1|673|557|116|n$|noun|nal$|adj|82.76%|50.75%
83|62.30%|191|119|72|ar$|adj|e$|noun|95.24%|94.68%|1.8992|Worse
3.2 77 (96):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
65:
2|621|533|88|on$|noun|onal$|adj|85.83%|46.83%
83|62.30%|191|119|72|ar$|adj|e$|noun|95.29%|94.64%|1.8992|Worse
3.3 77 (96):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
64:
3|577|497|80|ion$|noun|ional$|adj|86.14%|43.51%
83|62.30%|191|119|72|ar$|adj|e$|noun|95.30%|94.56%|1.8986|Worse
3.4 77 (96):
0|1326|968|358|$|noun|al$|adj|73.00%|100.00%
62:
4|472|408|64|tion$|noun|tional$|adj|86.44%|35.60%
83|62.30%|191|119|72|ar$|adj|e$|noun|95.32%|94.39%|1.8971|Worse
4.1 40 (56):
0|573|537|36|$|verb|ion$|noun|93.72%|100.00%
33:
1|449|434|15|t$|verb|tion$|noun|96.66%|78.36%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.64%|1.8983|Worse
4.2 40 (56):
0|573|537|36|$|verb|ion$|noun|93.72%|100.00%
13:
2|322|320|2|ct$|verb|ction$|noun|99.38%|56.20%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.20%|94.42%|1.8962|Worse
4.3 40 (56):
0|573|537|36|$|verb|ion$|noun|93.72%|100.00%
5:
3|186|186|0|ect$|verb|ection$|noun|100.00%|32.46%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.19%|94.16%|1.8934|Worse
5.1 62 (80):
0|264|228|36|a$|noun|an$|adj|86.36%|100.00%
No candidate child rules found!
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.15%|94.85%|1.9000|Same
6.1 109 (130):
0|275|1|274|a$|noun|an$|noun|0.36%|100.00%
109:
1|137|1|136|ia$|noun|ian$|noun|0.73%|49.82%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.15%|94.85%|1.9000|Same
7.1 58 (76):
0|137|120|17|a$|noun|ar$|adj|87.59%|100.00%
50:
1|115|105|10|la$|noun|lar$|adj|91.30%|83.94%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.17%|94.82%|1.8998|Worse
7.2 58 (76):
0|137|120|17|a$|noun|ar$|adj|87.59%|100.00%
39:
2|69|65|4|ula$|noun|ular$|adj|94.20%|50.36%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.17%|94.74%|1.8991|Worse
8.1 14 (23):
0|2519|2500|19|ation$|noun|e$|verb|99.25%|100.00%
11, 2:
1|1052|1051|1|sation$|noun|se$|verb|99.90%|41.76%
1|1257|1257|0|zation$|noun|ze$|verb|100.00%|49.90%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.17%|94.47%|1.8964|Worse
8.2 14 (23):
0|2519|2500|19|ation$|noun|e$|verb|99.25%|100.00%
5, 2:
2|1028|1028|0|isation$|noun|ise$|verb|100.00%|40.81%
2|1250|1250|0|ization$|noun|ize$|verb|100.00%|49.62%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.17%|94.41%|1.8958|Worse
9.1 55 (73):
0|292|260|32|c$|adj|s$|noun|89.04%|100.00%
47:
1|282|258|24|ic$|adj|is$|noun|91.49%|96.58%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.17%|94.84%|1.9001|Better
9.2 55 (73):
0|292|260|32|c$|adj|s$|noun|89.04%|100.00%
39:
2|190|180|10|tic$|adj|tis$|noun|94.74%|65.07%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.19%|94.69%|1.8988|Worse
9.3 55 (73):
0|292|260|32|c$|adj|s$|noun|89.04%|100.00%
27:
3|172|168|4|itic$|adj|itis$|noun|97.67%|58.90%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.20%|94.66%|1.8986|Worse
10.1 28 (41):
0|865|844|21|ce$|noun|t$|adj|97.57%|100.00%
18:
1|854|844|10|nce$|noun|nt$|adj|98.83%|98.73%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.85%|1.9002|Better
10.2 28 (41):
0|865|844|21|ce$|noun|t$|adj|97.57%|100.00%
17, 21:
2|326|323|3|ance$|noun|ant$|adj|99.08%|37.69%
2|528|521|7|ence$|noun|ent$|adj|98.67%|61.04%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.85%|1.9002|Same
11.1 18 (29):
0|407|402|5|cy$|noun|t$|adj|98.77%|100.00%
No candidate child rules found!
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.85%|1.9002|Same
12.1 19 (30):
0|2337|2308|29|e$|verb|ion$|noun|98.76%|100.00%
13:
1|2202|2193|9|te$|verb|tion$|noun|99.59%|94.22%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.20%|94.62%|1.8982|Worse
12.2 19 (30):
0|2337|2308|29|e$|verb|ion$|noun|98.76%|100.00%
11:
2|2099|2095|4|ate$|verb|ation$|noun|99.81%|89.82%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.20%|94.43%|1.8963|Worse
12.3 19 (30):
0|2337|2308|29|e$|verb|ion$|noun|98.76%|100.00%
4:
3|600|600|0|late$|verb|lation$|noun|100.00%|25.67%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.06%|91.48%|1.8655|Worse
13.1 35 (50):
0|144|138|6|e$|verb|is$|noun|95.83%|100.00%
26:
1|141|138|3|se$|verb|sis$|noun|97.87%|97.92%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.85%|1.9003|Better
13.2 35 (50):
0|144|138|6|e$|verb|is$|noun|95.83%|100.00%
35, 6:
2|54|52|2|ose$|verb|osis$|noun|96.30%|37.50%
2|59|59|0|yse$|verb|ysis$|noun|100.00%|40.97%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.79%|1.8997|Worse
13.3 35 (50):
0|144|138|6|e$|verb|is$|noun|95.83%|100.00%
35, 6:
2|54|52|2|ose$|verb|osis$|noun|96.30%|37.50%
3|58|58|0|lyse$|verb|lysis$|noun|100.00%|40.28%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.79%|1.8997|Worse
14.1 44 (60):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
21:
1|209|206|3|nesis$|noun|nic$|adj|98.56%|93.30%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.21%|94.84%|1.9005|Better
14.2 44 (60):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
13:
2|207|206|1|enesis$|noun|enic$|adj|99.52%|92.41%
82|63.14%|331|209|122|$|noun|ist$|noun|95.00%|95.26%|1.9026|Better
14.3 44 (60):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
13:
3|207|206|1|genesis$|noun|genic$|adj|99.52%|92.41%
82|63.14%|331|209|122|$|noun|ist$|noun|95.00%|95.26%|1.9026|Same: Best
14.4 44 (60):
0|224|207|17|esis$|noun|ic$|adj|92.41%|100.00%
5:
4|181|181|0|ogenesis$|noun|ogenic$|adj|100.00%|80.80%
82|63.14%|331|209|122|$|noun|ist$|noun|95.00%|95.21%|1.9021|Worse
15.1 9 (15):
0|1629|1628|1|ility$|noun|le$|adj|99.94%|100.00%
2:
1|1627|1627|0|bility$|noun|ble$|adj|100.00%|99.88%
82|63.14%|331|209|122|$|noun|ist$|noun|95.00%|95.25%|1.9026|Worse
15.2 9 (15):
0|1629|1628|1|ility$|noun|le$|adj|99.94%|100.00%
2:
2|1292|1292|0|ability$|noun|able$|adj|100.00%|79.31%
81|68.29%|123|84|39|ant$|adj|ate$|verb|95.18%|94.18%|1.8936|Worse
16.1 12 (19):
0|1015|1011|4|sis$|noun|tic$|adj|99.61%|100.00%
14, 4:
1|369|369|0|osis$|noun|otic$|adj|100.00%|36.35%
1|335|333|2|esis$|noun|etic$|adj|99.40%|33.00%
82|68.29%|123|84|39|ant$|adj|ate$|verb|95.19%|94.24%|1.8942|Worse
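
The pipe-delimited parent and child rule lines in the log can be read with a small parser. The sketch below is a possible reading of those lines, not a documented format. It assumes ten fields per line: a leading number (0 on parent lines, 1 or more on child lines, called `generation` here), occurrences, valid count, invalid count, the four-part SD-rule, the accuracy rate, and the occurrence rate relative to the parent. `RuleStat` and `parse_rule_stat` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStat:
    """One parent or child rule line from the log (field names are assumptions)."""
    generation: int         # 0 for the parent rule, 1+ for child rules
    occurrences: int
    valid: int              # the "Yes" count
    invalid: int            # the "No" count
    rule: str               # e.g. "c$|adj|cally$|adv"
    accuracy: float         # as a fraction
    occurrence_rate: float  # relative to the parent rule, as a fraction


def parse_rule_stat(line: str) -> RuleStat:
    """Parse a line such as '1|1954|1953|1|c$|adj|cally$|adv|99.95%|94.31%'."""
    fields = line.strip().split("|")
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    generation, occurrences, valid, invalid = (int(f) for f in fields[:4])
    return RuleStat(
        generation=generation,
        occurrences=occurrences,
        valid=valid,
        invalid=invalid,
        rule="|".join(fields[4:8]),  # suffix and category for both sides of the pair
        accuracy=float(fields[8].rstrip("%")) / 100,
        occurrence_rate=float(fields[9].rstrip("%")) / 100,
    )


parent = parse_rule_stat("0|2072|2053|19|$|adj|ally$|adv|99.08%|100.00%")
child = parse_rule_stat("1|1954|1953|1|c$|adj|cally$|adv|99.95%|94.31%")
# The logged rates can be re-derived from the counts as a consistency check.
print(round(child.valid / child.occurrences * 100, 2))
print(round(child.occurrences / parent.occurrences * 100, 2))
```

Re-deriving the rates from the counts is a cheap way to catch transcription errors when the log is copied out of the HTML report.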