Lexical Tools

Results of Proposed Rules - 2017

I. Results

Ten SD-Rules were proposed for addition to the SD-Rule set and evaluated. The results from the optimized set are as follows:

SD-Rule                 Rank  Precision  Instances  Source   Decompose    Results
Good Rules
sation$|noun|zed$|adj   5     100.00%    131        NOM_D    Root-Parent  Good SD-Rule
sed$|adj|zation$|noun   6     100.00%    127        NOM_D    Root-Parent  Good SD-Rule
$|noun|tous$|adj        28    97.85%     186        ORG_D    Root-Parent  Good SD-Rule
h$|noun|hy$|noun        48    92.49%     253        ORG_D    1G-Child     Good SD-Rule
osity$|noun|ous$|adj    58    90.00%     100        NOM_D    1G-Child     Good SD-Rule
e$|verb|tion$|noun      59    89.80%     98         NOM_D    Root-Parent  Good SD-Rule
Bad Rules
$|noun|ish$|adj         88    61.29%     93         ORG_D    Root-Parent  Bad SD-Rule
er$|noun|ress$|noun     95    50.00%     18         EXP_SUG  Root-Parent  Bad SD-Rule
$|noun|ty$|adj          97    47.62%     42         EXP_SUG  Root-Parent  Bad SD-Rule
$|noun|fully$|adv       118   0.00%      57         ORG_D    Root-Parent  Bad SD-Rule

  • Good SD-Rules: 6 of the 10 proposed rules are evaluated as good in the optimized set.
  • Bad SD-Rules: the remaining 4 are evaluated as bad.

  • Experts' suggestions (EXP_SUG): 0% (0/2) are good.
  • Computation rules (NOM_D and ORG_D): 75% (6/8) are good.
    • NOM_D: 100% (4/4) are good
    • ORG_D: 50% (2/4) are good
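
The per-source percentages above can be re-derived from the results table. The following is a minimal Python sketch for checking the arithmetic; the `ROWS` list is transcribed from the table, while the names `ROWS`, `tally`, and `percent_good` are illustrative and not part of Lexical Tools.

```python
from collections import defaultdict

# (SD-Rule, Source, is_good) transcribed from the results table.
ROWS = [
    ("sation$|noun|zed$|adj", "NOM_D", True),
    ("sed$|adj|zation$|noun", "NOM_D", True),
    ("$|noun|tous$|adj", "ORG_D", True),
    ("h$|noun|hy$|noun", "ORG_D", True),
    ("osity$|noun|ous$|adj", "NOM_D", True),
    ("e$|verb|tion$|noun", "NOM_D", True),
    ("$|noun|ish$|adj", "ORG_D", False),
    ("er$|noun|ress$|noun", "EXP_SUG", False),
    ("$|noun|ty$|adj", "EXP_SUG", False),
    ("$|noun|fully$|adv", "ORG_D", False),
]


def tally(rows):
    """Return {source: (good_count, total_count)}."""
    counts = defaultdict(lambda: [0, 0])
    for _rule, source, is_good in rows:
        counts[source][0] += int(is_good)
        counts[source][1] += 1
    return {source: tuple(pair) for source, pair in counts.items()}


def percent_good(good, total):
    """Percentage of good rules, rounded to two decimals."""
    return round(100.0 * good / total, 2)


by_source = tally(ROWS)
# Computation rules are the NOM_D and ORG_D rows taken together.
computed = tuple(
    by_source["NOM_D"][i] + by_source["ORG_D"][i] for i in range(2)
)

for source, (good, total) in sorted(by_source.items()):
    print(source, f"{percent_good(good, total)}% ({good}/{total})")
print("computed", f"{percent_good(*computed)}% ({computed[0]}/{computed[1]})")
```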

  • In the optimized set, 2 child rules replace the proposed root-parent rules:
    • h$|noun|hy$|noun replaces $|noun|y$|noun
    • osity$|noun|ous$|adj replaces sity$|noun|us$|adj
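
To make the rule notation and the parent/child relationship concrete, here is a small Python sketch. It assumes that a rule `suffix1$|cat1|suffix2$|cat2` relates a word of category `cat1` ending in `suffix1` to a word of category `cat2` ending in `suffix2`, and that a child rule is its parent with the same extra characters prepended to both suffixes. Both readings are inferred from the two replacements listed above, not taken from the Lexical Tools implementation, and the names `SdRule`, `parse_rule`, `derive`, and `is_child_of` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class SdRule(NamedTuple):
    suffix1: str
    cat1: str
    suffix2: str
    cat2: str


def parse_rule(text: str) -> SdRule:
    """Parse 'suffix1$|cat1|suffix2$|cat2'; '$' marks the end of the word."""
    s1, c1, s2, c2 = text.split("|")
    return SdRule(s1.rstrip("$"), c1, s2.rstrip("$"), c2)


def derive(word: str, cat: str, rule: SdRule) -> Optional[Tuple[str, str]]:
    """Apply the rule from its first side to its second side (assumed direction)."""
    if cat != rule.cat1 or not word.endswith(rule.suffix1):
        return None
    stem = word[: len(word) - len(rule.suffix1)]
    return stem + rule.suffix2, rule.cat2


def is_child_of(child: SdRule, parent: SdRule) -> bool:
    """True if child equals parent with one shared, non-empty extension on both suffixes."""
    if (child.cat1, child.cat2) != (parent.cat1, parent.cat2):
        return False
    if not (child.suffix1.endswith(parent.suffix1)
            and child.suffix2.endswith(parent.suffix2)):
        return False
    extra1 = child.suffix1[: len(child.suffix1) - len(parent.suffix1)]
    extra2 = child.suffix2[: len(child.suffix2) - len(parent.suffix2)]
    return extra1 != "" and extra1 == extra2


hy_rule = parse_rule("h$|noun|hy$|noun")
y_rule = parse_rule("$|noun|y$|noun")
osity_rule = parse_rule("osity$|noun|ous$|adj")
sity_rule = parse_rule("sity$|noun|us$|adj")

print(derive("monarch", "noun", hy_rule))
print(derive("curiosity", "noun", osity_rule))
print(is_child_of(hy_rule, y_rule), is_child_of(osity_rule, sity_rule))
```

Under these assumptions, a child rule matches a subset of the words its parent matches, which is consistent with a child rule replacing a lower-precision parent in the optimized set.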

II. Further Observation on NOM_D

The top SD-Rules generated from NOM_D are added and evaluated (${SUFFIX_D}/data/${YEAR}/dataR/SdRulesFromSdPairs/nomD/sdRulesFromSdPairs.rpt.${YEAR}).

ID  SD-Rule                 Rank  Notes
Added in 2015: Freq. > 200, Coverage > 1.00%, Accum. Coverage > 80.0%
1   $|adj|ness$|noun        1     Good
2   se$|verb|zation$|noun   2     Good
3   sation$|noun|ze$|verb   3     Good
4   ility$|noun|le$|adj     11    Good
5   iness$|noun|y$|adj      12    Good
6   ation$|noun|e$|verb     17    Good
7   ce$|noun|t$|adj         31    Good
8   cy$|noun|t$|adj         22    Good
9   e$|verb|ion$|noun       21    Good
10  $|verb|ment$|noun       23    Good
11  ication$|noun|y$|verb   24    Good
12  ed$|adj|ion$|noun       25    Good
13  $|adj|ity$|noun         27    Good
14  e$|adj|ity$|noun        32    Good
15  $|verb|ion$|noun        43    Good
16  $|verb|ing$|noun        47    Good
17  $|verb|ation$|noun      56    Good
Added in 2016: Freq. > 100, Coverage > 0.40%, Accum. Coverage > 83.36%
18  e$|verb|ing$|noun       55    Good
19  ation$|noun|ed$|adj     44    Good
20  $|adj|ism$|noun         57    Good
21  e$|adj|ion$|noun        93    Bad
22  e$|verb|is$|noun        38    Good
Added in 2017: Freq. > 70, Coverage > 0.30%, Accum. Coverage > 85.00%
23  sation$|noun|zed$|adj   5     Good
24  sed$|adj|zation$|noun   6     Good
25  sity$|noun|us$|adj      58    Good (osity$|noun|ous$|adj)
26  e$|verb|tion$|noun      59    Good
27  ous$|adj|y$|noun        105   Bad (exit in 2013)

The results show that 92.59% (25/27) are good SD-Rules, so more SD-Rules from NOM_D should be added in future releases.

III. Further Observation on ORG_D

The top SD-Rules generated from ORG_D are added and evaluated (${SUFFIX_D}/data/${YEAR}/dataR/SdRulesFromSdPairs/orgFacts/sdRulesFromSdPairs.rpt.${YEAR}).

ID  SD-Rule                 Rank  Notes
Added in 2015: Freq. > 40, Coverage > 1.00%, Accum. Coverage > 11.50%
1   $|noun|less$|adj        13    Good
2   $|verb|ion$|noun        21    Good, also in NOM_D
3   ist$|noun|y$|noun       40    Good
4   $|adj|ally$|adv         19    Good
5   $|noun|ful$|adj         60    Good
6   c$|adj|s$|noun          52    Good (ic$|adj|is$|noun)
Added in 2016: Freq. >= 35, Accum. Coverage > 16.00%, Ind. Coverage > 0.80%
7   on$|noun|ve$|adj        53    Good
8   $|noun|ship$|noun       74    Good
9   $|noun|age$|noun        103   Bad
10  e$|noun|ic$|adj         51    Good
11  sia$|noun|tic$|adj      42    Good
Added in 2017: Freq. >= 35, Accum. Coverage > 16.00%, Ind. Coverage > 0.80%
12  $|noun|fully$|adv       117   Bad
13  $|noun|ish$|adj         88    Bad
14  $|noun|y$|noun          48    Good (h$|noun|hy$|noun)
15  $|noun|tous$|adj        28    Good

The results show that 80.00% (12/15) are good SD-Rules, so more SD-Rules from ORG_D should be added in future releases.
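
As a cross-check, the overall good-rule rates for NOM_D and ORG_D can be recomputed from the Notes columns of the two tables, broken down by the year each batch was added. This is a minimal Python sketch; the `(total, bad)` counts are tallied by hand from the tables, and the names `COHORTS` and `good_rate` are illustrative only.

```python
# {source: {year_added: (total_rules, bad_rules)}}, counted from the Notes columns.
COHORTS = {
    "NOM_D": {2015: (17, 0), 2016: (5, 1), 2017: (5, 1)},
    "ORG_D": {2015: (6, 0), 2016: (5, 1), 2017: (4, 2)},
}


def good_rate(cohorts):
    """Return (good, total, percent) summed over all yearly batches of one source."""
    total = sum(t for t, _bad in cohorts.values())
    bad = sum(b for _t, b in cohorts.values())
    good = total - bad
    return good, total, round(100.0 * good / total, 2)


for source, cohorts in COHORTS.items():
    good, total, pct = good_rate(cohorts)
    print(f"{source}: {pct:.2f}% ({good}/{total})")
    for year, (t, b) in sorted(cohorts.items()):
        print(f"  added in {year}: {t - b}/{t} good")
```

In both lists every bad rule was added in 2016 or 2017, the batches with the lower frequency and coverage thresholds.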

IV. Future Work

Evaluate more SD-Rules from NOM_D and ORG_D further down the ranked lists.