Lexical Tools

Results of Proposed Rules - 2024 updates

I. Results

16 new, non-duplicated SD-Rules are proposed for addition to the SD-Rule set for evaluation. The good-rule threshold is at 105 in the optimal set. The results are as follows:

SD-Rule               Rank  Precision  Instances  Source   Decompose                    Results
Good Rules (6)
$|noun|free$|adj        43  95.83%            24  EXP_SUG  Root-Parent                  Good SD-Rule
ier$|noun|y$|verb       56  93.33%            75  WORDNET  Root-Parent                  Good SD-Rule
ize$|verb|y$|noun       76  87.37%            95  WORDNET  CHILD: mize$|verb|my$|noun   Good SD-Rule
$|verb|per$|noun        81  85.96%            57  WORDNET  CHILD: p$|verb|pper$|noun    Good SD-Rule
$|verb|ter$|noun        92  80.00%            55  WORDNET  CHILD: t$|verb|tter$|noun    Good SD-Rule
$|noun|ly$|adj          98  75.74%           136  WORDNET  Root-Parent                  Good SD-Rule
Bad Rules (10)
$|noun|ian$|adj        108  69.55%           243  WORDNET  Root-Parent                  Bad SD-Rule
e$|verb|ive$|adj       109  68.94%           425  WORDNET  Root-Parent                  Bad SD-Rule
$|noun|ise$|verb       114  65.61%           410  WORDNET  Root-Parent                  Bad SD-Rule
$|verb|ive$|adj        118  59.68%           248  WORDNET  Root-Parent                  Bad SD-Rule
$|noun|er$|noun        120  58.12%          1237  WORDNET  Root-Parent                  Bad SD-Rule
e$|verb|ory$|adj       121  57.77%           251  WORDNET  Root-Parent                  Bad SD-Rule
$|noun|an$|adj         135  47.11%           121  WORDNET  Root-Parent                  Bad SD-Rule
asm$|noun|astic$|adj   144  36.11%            36  WORDNET  SELF                         Bad SD-Rule
$|noun|ical$|adj       149  19.51%           164  WORDNET  Root-Parent                  Bad SD-Rule
al$|adj|s$|noun        152  9.09%            264  WORDNET  Root-Parent                  Bad SD-Rule

  • Good SD-Rules: 6 are evaluated as good rules in the optimized set.
  • Bad SD-Rules: 10 are evaluated as bad rules.

  • Experts' suggestions: 100% (1/1) are good.
  • Computation rules (WordNet): 33.33% (5/15) are good.
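
The per-source split can be recomputed from the table. The following is a minimal Python sketch, not the evaluation program itself. It assumes a rule counts as good when its rank is at or below 105; the table is consistent with that reading, but the exact comparison is an assumption. The names `THRESHOLD`, `rules`, and `good_rate_by_source` are illustrative only.

```python
from collections import defaultdict

THRESHOLD = 105  # assumption: rank <= 105 means "good" in the optimal set

# (SD-Rule, rank, source), copied from the results table
rules = [
    ("$|noun|free$|adj", 43, "EXP_SUG"),
    ("ier$|noun|y$|verb", 56, "WORDNET"),
    ("ize$|verb|y$|noun", 76, "WORDNET"),
    ("$|verb|per$|noun", 81, "WORDNET"),
    ("$|verb|ter$|noun", 92, "WORDNET"),
    ("$|noun|ly$|adj", 98, "WORDNET"),
    ("$|noun|ian$|adj", 108, "WORDNET"),
    ("e$|verb|ive$|adj", 109, "WORDNET"),
    ("$|noun|ise$|verb", 114, "WORDNET"),
    ("$|verb|ive$|adj", 118, "WORDNET"),
    ("$|noun|er$|noun", 120, "WORDNET"),
    ("e$|verb|ory$|adj", 121, "WORDNET"),
    ("$|noun|an$|adj", 135, "WORDNET"),
    ("asm$|noun|astic$|adj", 144, "WORDNET"),
    ("$|noun|ical$|adj", 149, "WORDNET"),
    ("al$|adj|s$|noun", 152, "WORDNET"),
]


def good_rate_by_source(rows, threshold=THRESHOLD):
    """Return {source: (good_count, total_count)} for the given rows."""
    counts = defaultdict(lambda: [0, 0])
    for _rule, rank, source in rows:
        counts[source][1] += 1          # every rule counts toward the total
        if rank <= threshold:
            counts[source][0] += 1      # only rules within the threshold are good
    return {source: tuple(pair) for source, pair in counts.items()}


for source, (good, total) in good_rate_by_source(rules).items():
    print(f"{source}: {good}/{total} = {good / total:.2%}")
```

Under that assumption the tally gives 1/1 for EXP_SUG and 5/15 for WORDNET, matching the percentages listed above.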

  • In the optimized set, 4 CHILD rules are used instead of their root-parent rules: 3 replace proposed root-parent rules, and 1 is itself the proposed rule.
    Proposed PARENT rule  CHILD rule used           Notes
    ize$|verb|y$|noun     mize$|verb|my$|noun       Good Rule
    $|verb|per$|noun      p$|verb|pper$|noun        Good Rule
    $|verb|ter$|noun      t$|verb|tter$|noun        Good Rule

    PARENT rule           Proposed CHILD rule used  Notes
    m$|noun|tic$|adj      asm$|noun|astic$|adj      SELF rule, Bad Rule
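
In all four PARENT/CHILD pairs listed, the CHILD rule is the PARENT rule with the same extra leading characters added to both suffixes (for example `m` in `mize`/`my`). The sketch below checks that relationship in Python. It is an illustration under that observation, not the decomposition algorithm used by the tools; `SdRule`, `parse_sd_rule`, and `child_prefix` are hypothetical names.

```python
from typing import NamedTuple, Optional


class SdRule(NamedTuple):
    suffix_a: str
    cat_a: str
    suffix_b: str
    cat_b: str


def parse_sd_rule(text: str) -> SdRule:
    """Parse 'ize$|verb|y$|noun' into four fields ('$' marks the end of the word)."""
    suffix_a, cat_a, suffix_b, cat_b = text.split("|")
    if not (suffix_a.endswith("$") and suffix_b.endswith("$")):
        raise ValueError(f"not an SD-Rule: {text!r}")
    return SdRule(suffix_a[:-1], cat_a, suffix_b[:-1], cat_b)


def child_prefix(parent: str, child: str) -> Optional[str]:
    """Return the characters the child adds to both parent suffixes, else None."""
    p, c = parse_sd_rule(parent), parse_sd_rule(child)
    if (p.cat_a, p.cat_b) != (c.cat_a, c.cat_b):
        return None                      # categories must be identical
    if not c.suffix_a.endswith(p.suffix_a):
        return None                      # child suffix must extend the parent suffix
    extra = c.suffix_a[: len(c.suffix_a) - len(p.suffix_a)]
    if extra and c.suffix_b == extra + p.suffix_b:
        return extra                     # same extra characters on both sides
    return None


pairs = [
    ("ize$|verb|y$|noun", "mize$|verb|my$|noun"),
    ("$|verb|per$|noun", "p$|verb|pper$|noun"),
    ("$|verb|ter$|noun", "t$|verb|tter$|noun"),
    ("m$|noun|tic$|adj", "asm$|noun|astic$|adj"),
]
for parent, child in pairs:
    print(f"{child} extends {parent} by {child_prefix(parent, child)!r}")
```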

II. Further Observation on NOM_D

The top SD-Rules generated from NOM_D are added and evaluated (${SUFFIX_D}/data/${YEAR}/dataR/SdRulesFromSdPairs/nomD/sdRulesFromSdPairs.rpt.${YEAR}).

ID  SD-Rule                     Rank   Notes
Added in 2015: Freq. > 200, Coverage > 1.00%, Accum. Coverage > 80.0%
1   $|adj|ness$|noun            1      Good
2   bility$|noun|ble$|adj       2      Good (ility$|noun|le$|adj)
3   se$|verb|zation$|noun       3      Good
4   sation$|noun|ze$|verb       4      Good
5   iness$|noun|y$|adj          16     Good
6   ation$|noun|e$|verb         21     Good
7   nce$|noun|nt$|adj           25     Good (ce$|noun|t$|adj)
8   e$|verb|ion$|noun           26     Good
9   cy$|noun|t$|adj             27     Good
10  $|verb|ment$|noun           28     Good
11  ication$|noun|y$|verb       29     Good
12  ed$|adj|ion$|noun           30     Good
13  $|adj|ity$|noun             32     Good
14  e$|adj|ity$|noun            35     Good
15  $|verb|ion$|noun            49     Good
16  $|verb|ing$|noun            53     Good
17  $|verb|ation$|noun          61     Good
Added in 2016: Freq. > 100, Coverage > 0.40%, Accum. Coverage > 83.36%
18  e$|verb|is$|noun            43     Good
19  ation$|noun|ed$|adj         50     Good
20  e$|verb|ing$|noun           60     Good
21  $|adj|ism$|noun             62     Good
22  e$|adj|ion$|noun            100    Bad
Added in 2017: Freq. > 70, Coverage > 0.30%, Accum. Coverage > 85.00%
23  sation$|noun|zed$|adj       7      Good
24  sed$|adj|zation$|noun       8      Good
25  sity$|noun|us$|adj          65     Good (osity$|noun|ous$|adj)
26  e$|verb|tion$|noun          63     Good
27  ous$|adj|y$|noun            116    Bad (exit in 2013)
Added in 2020: Freq. > 50, Coverage > 0.20%, Accum. Coverage > 87.41%
28  ability$|noun|ible$|adj     10     Good
29  sable$|adj|zability$|noun   12     Good
30  sability$|noun|zable$|adj   13     Good
31  sis$|noun|ze$|verb          41     Good
32  al$|noun|e$|verb            92     Good
Added in 2021: Freq. > 40, Coverage > 0.17%, Accum. Coverage > 89.27%
33  ability$|noun|eable$|adj    34     Good
34  c$|adj|sm$|noun             134    Bad
35  er$|verb|ration$|noun       15,29  Good
36  $|verb|nce$|noun            74     Good
37  ed$|adj|ment$|noun          85     Bad
38  ity$|noun|y$|adj            68     Good
39  $|adj|y$|noun               145    Bad
40  able$|adj|eability$|noun    13     Good
41  e$|verb|ition$|noun         71     Good
42  d$|verb|sion$|noun          56     Good

The results show that 88.10% (37/42) are good SD-Rules. More SD-Rules from nomD should be added and evaluated in future releases.
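
To make the rule notation concrete, here is a small Python sketch of how a rule such as `$|adj|ness$|noun` could be applied to a word. It assumes that `$` marks the end of the word, that each rule pairs a suffix with a category on both sides, and that a rule may be applied in either direction. The function name `apply_sd_rule` and the example words are illustrative and are not taken from the report.

```python
from typing import Optional, Tuple


def apply_sd_rule(rule: str, word: str, cat: str) -> Optional[Tuple[str, str]]:
    """Apply an SD-Rule to (word, cat); return (derived_word, derived_cat) or None.

    Assumption: the rule is bidirectional, so either side may match the input.
    """
    suffix_a, cat_a, suffix_b, cat_b = rule.split("|")
    suffix_a, suffix_b = suffix_a.rstrip("$"), suffix_b.rstrip("$")
    directions = (
        (suffix_a, cat_a, suffix_b, cat_b),   # left side -> right side
        (suffix_b, cat_b, suffix_a, cat_a),   # right side -> left side
    )
    for src_suffix, src_cat, dst_suffix, dst_cat in directions:
        # Require a non-empty stem so a bare suffix is never treated as a word.
        if cat == src_cat and word.endswith(src_suffix) and len(word) > len(src_suffix):
            stem = word[: len(word) - len(src_suffix)]
            return stem + dst_suffix, dst_cat
    return None


print(apply_sd_rule("$|adj|ness$|noun", "kind", "adj"))
print(apply_sd_rule("iness$|noun|y$|adj", "happy", "adj"))
print(apply_sd_rule("e$|verb|ion$|noun", "create", "verb"))
```

A purely string-based rule like this also produces non-words for some inputs, which is one reason a rule's precision has to be evaluated before it is accepted.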

III. Further Observation on ORG_FACT

The top SD-Rules generated from ORG_FACT are added and evaluated (${SUFFIX_D}/data/${YEAR}/dataR/SdRulesFromSdPairs/orgFacts/sdRulesFromSdPairs.rpt.${YEAR}).

ID  SD-Rule                     Rank   Notes
Added in 2015: Freq. > 40, Coverage > 1.00%, Accum. Coverage > 11.50%
1   $|noun|less$|adj            17     Good
2   $|adj|ally$|adv             23     Good
3   ist$|noun|y$|noun           45     Good
4   $|verb|ion$|noun            49     Good, also in NOM_D
5   c$|adj|s$|noun              57     Good (ic$|adj|is$|noun)
6   $|noun|ful$|adj             64     Good
Added in 2016: Freq. >= 35, Accum. Coverage > 16.00%, Ind. Coverage > 0.80%
7   sia$|noun|tic$|adj          47     Good
8   e$|noun|ic$|adj             56     Good
9   on$|noun|ve$|adj            58     Good
10  $|noun|ship$|noun           79     Good
11  $|noun|age$|noun            114    Bad
Added in 2017: Freq. >= 30, Accum. Coverage > 19.00%, Ind. Coverage > 0.70%
12  $|noun|tous$|adj            33     Good
13  $|noun|ish$|adj             94     Bad
14  $|noun|y$|noun              101    Bad
15  $|noun|fully$|adv           128    Bad
Added in 2020: Freq. >= 25, Accum. Coverage > 23.00%, Ind. Coverage > 0.60%
16  ion$|noun|ory$|adj          5      Good
17  $|adj|s$|noun               49     Good
18  $|verb|age$|noun            100    Bad
19  $|noun|ial$|adj             106    Bad
Added in 2021: Freq. >= 20, Accum. Coverage > 26.00%, Ind. Coverage > 0.49%
20  er$|noun|y$|noun            128    Bad
21  ctic$|adj|xis$|noun         104    Good
22  er$|noun|ing$|noun          145    Bad
23  e$|noun|ous$|adj            110    Bad

The results show that 60.87% (14/23) are good SD-Rules. More SD-Rules from ORG_FACT should be added in future releases.
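
The ORG_FACT figures can also be broken down by the year each batch was added. The Python sketch below tallies the Good/Bad notes from the table by batch; the verdict strings are transcribed by hand from the Notes column, and the names `org_fact_verdicts` and `good_rate` are illustrative only.

```python
# One verdict per rule, in table order, grouped by the year the batch was added.
# "G" = Good, "B" = Bad (transcribed from the Notes column of the ORG_FACT table).
org_fact_verdicts = {
    2015: "GGGGGG",
    2016: "GGGGB",
    2017: "GBBB",
    2020: "GGBB",
    2021: "BGBB",
}


def good_rate(verdicts: str) -> float:
    """Fraction of 'G' entries in a verdict string."""
    return verdicts.count("G") / len(verdicts)


for year, verdicts in org_fact_verdicts.items():
    good, total = verdicts.count("G"), len(verdicts)
    print(f"{year}: {good}/{total} good ({good_rate(verdicts):.2%})")

all_verdicts = "".join(org_fact_verdicts.values())
print(f"total: {all_verdicts.count('G')}/{len(all_verdicts)} good "
      f"({good_rate(all_verdicts):.2%})")
```

The total reproduces 14/23. By batch, the 2015 rules are all good (6/6), while the 2017, 2020, and 2021 batches yield 1/4, 2/4, and 1/4 good rules respectively.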

IV. Future Work

Evaluate more SD-Rules from NOM_D and ORG_FACT further down the list.
ORG_FACT is close to its limit; review it for perhaps one more year, until no more good rules can be found.