The SPECIALIST Lexicon

Parenthetic Acronym Pattern

I. Introduction
The Parenthetic Acronym Pattern covers the patterns (ACRONYM) and (ACRONYMs). It is used as an exclusive filter to remove invalid MWEs from the n-gram set. It can also be used as an inclusive filter to select MWE candidates.
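As an illustration of the pattern described above, the following minimal sketch tests whether an n-gram contains (ACRONYM) or (ACRONYMs). The regular expression is an assumption, not the program's actual rule: it treats an acronym as two or more capital letters with an optional trailing "s". The function name has_parenthetic_acronym is hypothetical.

```python
import re

# Assumption: an acronym is two or more capital letters, optionally
# pluralized with a trailing lowercase "s". The real filter may differ.
PAREN_ACRONYM = re.compile(r"\([A-Z]{2,}s?\)")


def has_parenthetic_acronym(ngram):
    """Return True if the n-gram contains (ACRONYM) or (ACRONYMs)."""
    return PAREN_ACRONYM.search(ngram) is not None
```

Under this assumed pattern, "Balkan endemic nephropathy (BEN)" is accepted, while "zonula occludens-1 (ZO-1)" and "& Systems Pharmacology (2013)" are rejected because the parenthesized text contains digits or a hyphen.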

II. Processes:

  • directory: ${MULTIWORDS_DIR}/bin
  • program: 6.Acronyms
  • Run program: shell> 6.Acronyms ${YEAR}
  • Processes:

    Step 1
    Description: Get the n-grams that match the (ACR) pattern
    Inputs:
    • nGram.2014
    Outputs:
    • ApplyFilters.rpt.1.parAcr.trap
    • ApplyFilters.rpt.1.parAcr.exp
    • ApplyFilters.rpt
    Notes - Examples:
    trap - matches the (ACR) pattern:
    • Balkan endemic nephropathy (BEN)
    • zone of polarizing activity (ZPA)
    exp - does not match the (ACR) pattern:
    • & Systems Pharmacology (2013)
    • zonula occludens-1 (ZO-1)

    Step 2
    Description: Get acronym|expansion from step 1
    Inputs:
    • ApplyFilters.rpt.1.parAcr.trap
    Outputs:
    • acronyms.txt
    Notes - Examples:
    Convert each n-gram from the format
    ".. acronym expansion (ACR) .." to
    "ACR|acronym expansion"

    Step 3
    Description: Get new acronym|expansion (not in Lexicon) from step 2
    Inputs:
    • acronyms.txt
    Outputs:
    • acronyms.txt.pass
    • acronyms.txt.trap
    Notes - Examples:
    trap - in the Lexicon:
    • SSS|Stanford Sleepiness Scale
    • WHO|World Health Organisation
    pass - new acronyms (candidates):
    • BLS|Bureau of Labor Statistics
    • MH|World Mental Health
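
Steps 2 and 3 can be sketched as follows. This is a simplified illustration, not the code of 6.Acronyms. It assumes that the n-gram ends with the parenthetic acronym, that everything before the parenthesis is the expansion, and that the Lexicon can be represented as a set of "ACR|expansion" strings. The names to_acronym_record and split_by_lexicon are hypothetical, and the regular expression is an assumed reading of the (ACR) pattern.

```python
import re

# Assumed form of an n-gram that ends with (ACR) or (ACRs).
ACR_AT_END = re.compile(r"^(?P<expansion>.+?)\s*\((?P<acr>[A-Z]{2,})s?\)$")


def to_acronym_record(ngram):
    """Step 2: turn 'expansion (ACR)' into 'ACR|expansion', or None if no match."""
    m = ACR_AT_END.match(ngram.strip())
    if m is None:
        return None
    return f"{m.group('acr')}|{m.group('expansion')}"


def split_by_lexicon(records, lexicon):
    """Step 3: records already in the Lexicon go to trap, new ones go to pass."""
    passed, trapped = [], []
    for rec in records:
        (trapped if rec in lexicon else passed).append(rec)
    return passed, trapped
```

With this sketch, to_acronym_record("Balkan endemic nephropathy (BEN)") yields "BEN|Balkan endemic nephropathy". The real program presumably also trims the surrounding ".." context and validates the expansion, which this sketch does not attempt.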

III. Results:
For the 2014 release:

  1. 17023819 n-grams
  2. 163714 n-grams match the (ACR) pattern
  3. 1646 are identified as valid acronym|expansion pairs
  4. 636 are new and are used as multiword candidates to add to the Lexicon
  5. Tags:
    Tag  Description                                    Count
    O    Invalid expansion                              TBD
    E    Valid expansion, exists in Lexicon             TBD
    Y    Valid expansion, not in Lexicon, valid MWE     TBD
    N    Valid expansion, not in Lexicon, invalid MWE   TBD
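
To put the counts in items 1 to 4 in proportion, the short sketch below computes how much of each stage survives into the next. The numbers are copied from the 2014 results above. The names FUNNEL and stage_rates are illustrative only.

```python
# Counts copied from items 1-4 of the 2014 results.
FUNNEL = [
    ("n-grams", 17023819),
    ("match the (ACR) pattern", 163714),
    ("valid acronym|expansion", 1646),
    ("new, not in Lexicon", 636),
]


def stage_rates(funnel):
    """Return (label, count, fraction of the previous stage) for each later stage."""
    rates = []
    for (_, prev), (label, count) in zip(funnel, funnel[1:]):
        rates.append((label, count, count / prev))
    return rates


for label, count, rate in stage_rates(FUNNEL):
    print(f"{label}: {count} ({rate:.2%} of previous stage)")
```

Roughly 1% of the n-grams match the pattern, roughly 1% of those matches are identified as valid pairs, and about 39% of the valid pairs are new to the Lexicon.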