The SPECIALIST Lexicon

Generating Synonyms from All Sources

This page describes the processes used to generate LexSynonyms from all sources:

I. Pre-Process

  • Directory: ${LEXICON_SYNONYMS}
  • Program: ./Synonym/GenTaggedSPairs.java
  • Inputs:
    • ./Tags/SynonymCan_Tagged.txt.${YEAR}
    • ${IN_DIR}LRNOM
    • ${IN_DIR}synonyms.data.lvg
    • ${IN_DIR}LRSPL
  • Outputs:
    • ./Results/sPairsTag.data (all sPairs with tags [Y|N])
    • ./Results/sPairsTag.data.N (invalid sPairs, tag [N])
    • ./Results/sPairsTag.data.Y (valid sPairs, tag [Y])
    • ./Results/sPairsTag.data.release (used for annual synonym.data)
  • Algorithm:
    • add sPairs with tags from tagged sClasses
      • get sPairs and assign tag [Y|N]
      • sPair generation includes nom and spVar
      • if sPairs have the same key ([syn1|POS1|syn2|POS2])
        • same source (CUI)
          • same tag
            => don't add (duplicates)
          • different tag
            => don't add, assign tag to [T] and ignore [N]
            => update keepTagNo, reTagno, yesNo, noNo
        • different source (CUI)
          => add sPair [syn1|POS1|syn2|POS2|src|tag]
      • assign source to CUI
    • add sPairs with tags from Lexicon nominalization
      • all nominalizations are sPairs [Y]
      • assign source to EUI of noun
    • add sPairs with tags from LVG
      • all sPairs in LVG are true [Y]
      • assign source to NLP_LVG
      • remove duplicates (same key: [syn1|POS1|syn2|POS2])
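
The key-based rules in the algorithm above can be sketched in Java as follows. This is a minimal illustration, not the actual GenTaggedSPairs code: the class names TaggedPair and TaggedPairStore, the two counters, and the method names are hypothetical. The sketch only counts same-source tag conflicts and does not model how the conflicting tag is reassigned. It also assumes that an LVG pair is skipped whenever its key is already present from any source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value class for one tagged sPair (not the real SPairTagObj).
final class TaggedPair {
    final String syn1, pos1, syn2, pos2, source, tag;

    TaggedPair(String syn1, String pos1, String syn2, String pos2,
               String source, String tag) {
        this.syn1 = syn1;
        this.pos1 = pos1;
        this.syn2 = syn2;
        this.pos2 = pos2;
        this.source = source;
        this.tag = tag;
    }

    // Key used for duplicate detection: syn1|POS1|syn2|POS2.
    String key() {
        return syn1 + "|" + pos1 + "|" + syn2 + "|" + pos2;
    }
}

// Hypothetical in-memory store (not the real SPairTagMap).
final class TaggedPairStore {
    private final Map<String, List<TaggedPair>> byKey = new LinkedHashMap<>();
    int duplicates = 0;    // same key, same source, same tag
    int tagConflicts = 0;  // same key, same source, different tag

    // Adds a pair that came from a tagged sClass (source is a CUI).
    // Returns true only when the pair is actually stored.
    boolean addFromTaggedClass(TaggedPair p) {
        List<TaggedPair> sameKey = byKey.get(p.key());
        if (sameKey == null) {
            sameKey = new ArrayList<>();
            byKey.put(p.key(), sameKey);
        }
        for (TaggedPair q : sameKey) {
            if (q.source.equals(p.source)) {
                if (q.tag.equals(p.tag)) {
                    duplicates++;       // exact duplicate: don't add
                } else {
                    tagConflicts++;     // conflict: don't add (resolution not modeled)
                }
                return false;
            }
        }
        sameKey.add(p);                 // different source: keep both pairs
        return true;
    }

    // Adds a pair from LVG: tag is always Y and source is NLP_LVG.
    // Assumption: the pair is skipped if the key already exists.
    boolean addFromLvg(String syn1, String pos1, String syn2, String pos2) {
        TaggedPair p = new TaggedPair(syn1, pos1, syn2, pos2, "NLP_LVG", "Y");
        if (byKey.containsKey(p.key())) {
            return false;
        }
        List<TaggedPair> list = new ArrayList<>();
        list.add(p);
        byKey.put(p.key(), list);
        return true;
    }

    // Total number of stored pairs across all keys.
    int size() {
        int n = 0;
        for (List<TaggedPair> list : byKey.values()) {
            n += list.size();
        }
        return n;
    }
}
```

Nominalization pairs would go through the same path as tagged sClass pairs, with the tag fixed to Y and the EUI of the noun as the source.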

II. Process

  • Directory: ${LEXICON_SYNONYMS}/bin
  • Program: GetSynonyms ${year}

    • Option: 17
    • Descriptions:
      • Synonym.GenTaggedSPairs.java
      • Synonym.SPairTagObj.java
      • Synonym.SPairTagMap.java
    • Inputs:
      • ./Tags/SynonymCan_Tagged.txt.${YEAR}
      • ${IN_DIR}LRNOM
      • ${IN_DIR}synonyms.data.lvg
      • ${IN_DIR}LRSPL
    • Outputs:
      • ./Results/sPairsTag.data
        => used to tag sPairs from other source (WordNet).
      • ./Results/sPairsTag.data.N
      • ./Results/sPairsTag.data.Y
        => Used for annual synonym.data release
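
The split of sPairsTag.data into its .Y and .N outputs can be illustrated with a short Java sketch. It assumes each line has the six pipe-separated fields syn1|POS1|syn2|POS2|src|tag shown in the algorithm; the real file layout may differ. The class TagSplitter and its methods are hypothetical and are not part of the Synonym package.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: selects tagged sPair lines by their tag field.
final class TagSplitter {
    // Returns the tag (sixth field), or null when the line does not have
    // exactly six pipe-separated fields.
    static String tagOf(String line) {
        String[] fields = line.split("\\|", -1); // -1 keeps trailing empty fields
        return fields.length == 6 ? fields[5] : null;
    }

    // Returns the lines whose tag equals the given value ("Y" or "N").
    static List<String> withTag(List<String> lines, String tag) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (tag.equals(tagOf(line))) {
                out.add(line);
            }
        }
        return out;
    }
}
```

Calling withTag(lines, "Y") and withTag(lines, "N") on the full list yields the contents of the two tag-specific files. Malformed lines fall into neither list.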