Lexical Tools

Retrieve New SD-Rules from NomD, 2020

I. Description

A program (FindSdRulesFromDPairs.java) was developed to find SD-Rules from a set of suffixD pairs (SD-pairs). It identifies and removes the common starting characters of the two words in each SD-pair and then generates the corresponding SD-Rule automatically. Please note that only root-parent SD-Rules are generated by this program. Two sets of SD-pairs are used for this task. This page details the new SD-Rules selected from nomD.
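
The prefix-stripping step described above can be illustrated with a minimal sketch. It is not the code of FindSdRulesFromDPairs.java: the class name, method names, and the two example word pairs are hypothetical, and the output format (suffix1$|category1|suffix2$|category2) is assumed from the rules listed in the Results section.

```java
// Hypothetical sketch: derive an SD-Rule from one SD-pair by removing
// the characters that both words share at the start.
class SdRuleSketch {

    // Length of the longest common starting substring of two words.
    static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    // Builds "suffix1$|cat1|suffix2$|cat2" (assumed rule format) in the
    // same order as the arguments; no reordering is attempted.
    static String toSdRule(String word1, String cat1, String word2, String cat2) {
        int k = commonPrefixLength(word1, word2);
        return word1.substring(k) + "$|" + cat1 + "|"
             + word2.substring(k) + "$|" + cat2;
    }

    public static void main(String[] args) {
        // Illustrative pairs only; they are not taken from the nomD data.
        System.out.println(toSdRule("kind", "adj", "kindness", "noun"));
        System.out.println(toSdRule("realization", "noun", "realize", "verb"));
    }
}
```

With these two illustrative pairs, the sketch yields rules of the same shape as $|adj|ness$|noun and ation$|noun|e$|verb in the table under Results.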

II. Procedures

  • Directory: ${SUFFIXD_DIR}
  • Programs:
    • shell>cd ${SUFFIXD_DIR}/bin
    • shell>GetSdRule ${YEAR}
      2
      nomD

    • shell>GetSdRule ${YEAR}
      3
      nomD

    • shell>GetSdRule ${YEAR} (check each rule)
      5
      ${YEAR}
      rule

III. Results

  • From: ./data/${YEAR}/dataR/SdRulesFromSdPairs/nomD/sdRulesFromSdPairs.rpt
  • These are SD-Pairs from nominalizations of Lexicon.${YEAR}
  • The file ${NOM_D_DIR}/data/nomD.yes.S.data is used as input.
  • 1,058 SD-Rules are generated from 24,195 SD-Pairs.
  • All generated SD-Rules are root-parent rules (without a parent rule).
  • Rules that meet the following criteria are selected:
    • 2015 release:
      • Frequency: >= 200
      • Accumulated coverage: 80.00%
      • Individual coverage: 1.00%
    • 2016 release:
      • Frequency: >= 100
      • Accumulated coverage: 83.00%
      • Individual coverage: 0.50%
    • 2017 release:
      • Frequency: >= 70
      • Accumulated coverage: 84.00%
      • Individual coverage: 0.30%
    • 2020 release:
      • Frequency: >= 50
      • Accumulated coverage: 87.41%
      • Individual coverage: 0.21%

    • SD-Rules that meet the above criteria (total instance No. 27,168):
      SD-Rules                    Instances No.   Accu. No.        Notes
      $|adj|ness$|noun            2734 (11.64%)   2734 (11.64%)    2013-, existing rule
      ation$|noun|e$|verb         2491 (10.61%)   5225 (22.25%)    2013-, existing rule
      e$|verb|ion$|noun           2299 (9.79%)    7524 (32.04%)    2015, has child rules
      $|adj|ity$|noun             2037 (8.67%)    9561 (40.71%)    2013-, existing rule
      ility$|noun|le$|adj         1612 (6.86%)    11173 (47.58%)   2015, has child rules
      se$|verb|zation$|noun       1108 (4.72%)    12281 (52.29%)   2015, has no child rules
      sation$|noun|ze$|verb       1072 (4.56%)    13353 (56.86%)   2015, has no child rules
      ce$|noun|t$|adj             843 (3.59%)     14196 (60.45%)   2015, has child rules
      e$|adj|ity$|noun            833 (3.55%)     15029 (63.99%)   2013-, existing rule
      ed$|adj|ion$|noun           691 (2.94%)     15720 (66.94%)   2013-, existing rule
      $|verb|ment$|noun           575 (2.45%)     16295 (69.38%)   2013-, existing rule
      iness$|noun|y$|adj          545 (2.32%)     16840 (71.71%)   2013-, existing rule
      $|verb|ion$|noun            536 (2.28%)     17376 (73.99%)   2013-, existing rule
      $|verb|ing$|noun            480 (2.04%)     17856 (76.03%)   2013-, existing rule
      cy$|noun|t$|adj             401 (1.71%)     18257 (77.74%)   2015, has child rules
      $|verb|ation$|noun          307 (1.31%)     18564 (79.05%)   2013-, existing rule
      ication$|noun|y$|verb       295 (1.26%)     18859 (80.30%)   2013-, existing rule
      2015: Frequency > 200, Instance coverage > 1.00%, Accum. coverage > 80.0%
      e$|verb|ing$|noun           191 (0.81%)     19050 (81.12%)   2013-, existing rule
      ation$|noun|ed$|adj         158 (0.67%)     19208 (81.79%)   2016, has no child rules
      $|adj|ism$|noun             133 (0.57%)     19341 (82.35%)   2016, has no child rules
      e$|adj|ion$|noun            123 (0.52%)     19464 (82.88%)   2016, has no child rules
      e$|verb|is$|noun            113 (0.48%)     19577 (83.36%)   2016, has child rules
      2016: Frequency > 100, Instance coverage > 0.40%, Accum. coverage > 83.36%
      sation$|noun|zed$|adj       107 (0.45%)     20068 (83.50%)   2017, has no child rules
      sed$|adj|zation$|noun       106 (0.44%)     20174 (83.94%)   2017, has no child rules
      sity$|noun|us$|adj          90 (0.37%)      20264 (84.32%)   2017, has a child rule
      e$|verb|tion$|noun          84 (0.35%)      20348 (84.67%)   2017, has no child rules
      ous$|adj|y$|noun            71 (0.30%)      20419 (84.96%)   2013-, existing rule
      2017: Frequency > 70, Instance coverage > 0.30%, Accum. coverage > 84.00%
      ity$|noun|ous$|adj          66 (0.27%)      20620 (85.22%)   2013-, existing rule
      able$|adj|ibility$|noun     64 (0.26%)      20684 (85.49%)   2017-, existing rule
      al$|noun|e$|verb            63 (0.26%)      20747 (85.75%)   2020, has no child rules
      ic$|adj|y$|noun             63 (0.26%)      20810 (86.01%)   2013-, existing rule
      de$|verb|sion$|noun         62 (0.26%)      20872 (86.27%)   2013-, existing rule
      $|verb|ance$|noun           61 (0.25%)      20933 (86.52%)   2013-, existing rule
      sis$|noun|ze$|verb          57 (0.24%)      20990 (86.75%)   2020, has no child rules
      ability$|noun|ible$|adj     56 (0.23%)      21046 (86.98%)   2020, has no child rules
      sable$|adj|zability$|noun   54 (0.22%)      21100 (87.21%)   2020, has no child rules
      sability$|noun|zable$|adj   50 (0.21%)      21150 (87.41%)   2020, has no child rules
      2020: Frequency >= 50, Instance coverage > 0.21%, Accum. coverage > 87.41%

    • New SD-Rules with child rules
      • None

    • New SD-Rules without child rules
      • al$|noun|e$|verb
      • sis$|noun|ze$|verb
      • ability$|noun|ible$|adj
      • sable$|adj|zability$|noun
      • sability$|noun|zable$|adj
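
The frequency and coverage figures used in the Results can be illustrated with a small sketch. It assumes that rules are sorted by descending instance count, that individual coverage is a rule's count divided by the total number of instances, and that accumulated coverage is the running sum of those counts over the same total. The class, the method names, and the sample counts are hypothetical and are not taken from the GetSdRule programs or the nomD data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: rank rules by frequency and report coverage.
class SdRuleCoverageSketch {

    // One kept rule with its individual and accumulated coverage (percent).
    static final class Row {
        final String rule;
        final int count;
        final long accumCount;
        final double individualPct;
        final double accumPct;

        Row(String rule, int count, long accumCount, long total) {
            this.rule = rule;
            this.count = count;
            this.accumCount = accumCount;
            this.individualPct = 100.0 * count / total;
            this.accumPct = 100.0 * accumCount / total;
        }
    }

    // Keeps rules whose count is at least minFrequency, in descending order.
    static List<Row> select(Map<String, Integer> counts, int minFrequency) {
        long total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        List<Row> kept = new ArrayList<>();
        long running = 0;
        for (Map.Entry<String, Integer> e : sorted) {
            if (e.getValue() < minFrequency) {
                break; // every remaining rule is less frequent
            }
            running += e.getValue();
            kept.add(new Row(e.getKey(), e.getValue(), running, total));
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up counts for illustration only.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("$|adj|ness$|noun", 50);
        counts.put("ation$|noun|e$|verb", 30);
        counts.put("e$|verb|ion$|noun", 15);
        counts.put("al$|noun|e$|verb", 5);
        for (Row r : select(counts, 15)) {
            System.out.println(String.format(Locale.ROOT, "%s %d (%.2f%%) %d (%.2f%%)",
                    r.rule, r.count, r.individualPct, r.accumCount, r.accumPct));
        }
    }
}
```

In this sketch the frequency threshold alone decides the cutoff, and the coverage values are reported for the kept rules; how the actual programs combine the three criteria is not stated on this page.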