LexBuild

Cross References Check

Cross references are EUIs used as references among lexical records. They are used in abbreviations, acronyms, and nominalizations. A series of programs was developed to check cross references. The check is run in the post-process stage to ensure the quality of the LEXICON. The design details are as follows:

  • The following steps are performed by clicking the Identify button:
    • I. Generate the latest LEXICON

    • II. Generate file: acrAbbNom.data
      • Java file: GenerateAcrAbbNomRecords.java
        Goes through all lexical records (from the LEXICON file) and prints all information for abbreviations, acronyms, and nominalizations.
      • File format:
        Field 1  Field 2   Field 3   Field 4  Fields 5, 6, 7
        base     base cat  base EUI  ACR      (empty)
                                              acronym
                                              acronym|acronym EUI
        base     base cat  base EUI  ABB      (empty)
                                              abbreviation
                                              abbreviation|abbreviation EUI
        base     base cat  base EUI  NOM      (empty)
                                              nominalization
                                              nominalization|nominalization cat
                                              nominalization|nominalization cat|nominalization EUI
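
The file format above can be read with a small parser. The following is a minimal sketch, not the code in GenerateAcrAbbNomRecords.java or CheckCrossRef.java. It assumes that all fields, including fields 1 to 4, are separated by `|`. The class name `AcrAbbNomLine` and its members are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for one line of acrAbbNom.data (not a LexBuild class). */
final class AcrAbbNomLine {
    final String base;            // Field 1
    final String baseCat;         // Field 2
    final String baseEui;         // Field 3
    final String type;            // Field 4: ACR, ABB, or NOM
    final List<String> refFields; // Fields 5 to 7; may be empty

    private AcrAbbNomLine(String[] f) {
        base = f[0];
        baseCat = f[1];
        baseEui = f[2];
        type = f[3];
        refFields = Arrays.asList(Arrays.copyOfRange(f, 4, f.length));
    }

    /** Splits on '|' (assumed delimiter) and requires the four leading fields. */
    static AcrAbbNomLine parse(String line) {
        // A limit of -1 keeps trailing empty fields, so a line ending in '|'
        // still counts its empty last field.
        String[] f = line.split("\\|", -1);
        if (f.length < 4) {
            throw new IllegalArgumentException("Expected at least 4 fields: " + line);
        }
        return new AcrAbbNomLine(f);
    }

    /** Total number of fields, i.e. the "field num" used by the checks. */
    int fieldNum() {
        return 4 + refFields.size();
    }
}
```

For example, `AcrAbbNomLine.parse("AIDS|noun|E0000001|ACR|acquired immunodeficiency syndrome|E0000002")` gives a line with `fieldNum()` equal to 6. The EUI values in this example are placeholders, not real identifiers.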

    • III. Check and tag cross reference on acrAbbNom.data
      • Java file: CheckCrossRef.java
      • Goes through every line in acrAbbNom.data
      • Sends results to the following five files:
        • newTerm.data
          This file lists all terms (base|cat) that are used in ABB/ACR/NOM but do not have a lexical record. These new terms should be added to LEXICON to complete the cross references. Ideally, this file should contain no lines.
          Type     Condition  Source  ABB/ACR/NOM  Base  Category
          acr/abb  No EUIs    line    ABB/ACR      base  cat
          nom      No EUIs    line    NOM          base  cat

        • dup.data
          This file lists all terms (base|cat) that are used in ABB/ACR/NOM and have multiple EUIs. They could be duplicate lexical records and should be consolidated. An exception filter should be used for false-positive cases.
          Type     Condition                            Source  ABB/ACR/NOM  Base  Category  Suggested EUIs
          acr/abb  Same base/sp_vars (case sensitive),  line    ABB/ACR      base  cat       EUIs
                   same category, multiple EUIs
          nom      Same base/sp_vars (case sensitive),  line    NOM          base  cat       EUIs
                   same category, multiple EUIs

          Note: records in dup.data need to be added to dupExceptionList if they are not duplicates.
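
The duplicate condition above (same base or spelling variant, same category, multiple EUIs) can be sketched as a grouping step followed by an exception filter. This is an illustrative sketch only, not the logic in CheckCrossRef.java. The class `DupCheck`, the method `findDuplicates`, and the shape of the inputs are assumptions. Each input row is taken to be `{term, cat, EUI}`, where the term is a base or one of its spelling variants.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the duplicate check; names are illustrative only. */
final class DupCheck {
    /**
     * Groups EUIs by the case-sensitive key "term|cat" and returns the keys
     * that map to more than one EUI, skipping keys on the exception list.
     *
     * @param records    rows of {term, cat, EUI}
     * @param exceptions "term|cat" keys already confirmed as non-duplicates
     */
    static Map<String, Set<String>> findDuplicates(List<String[]> records,
                                                   Set<String> exceptions) {
        Map<String, Set<String>> euisByTerm = new LinkedHashMap<>();
        for (String[] r : records) {
            String key = r[0] + "|" + r[1];
            euisByTerm.computeIfAbsent(key, k -> new TreeSet<>()).add(r[2]);
        }
        Map<String, Set<String>> dups = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : euisByTerm.entrySet()) {
            // More than one distinct EUI for the same term and category.
            if (e.getValue().size() > 1 && !exceptions.contains(e.getKey())) {
                dups.put(e.getKey(), e.getValue());
            }
        }
        return dups;
    }
}
```

Keeping the exception list as a separate input means that a confirmed false positive only needs to be recorded once and is then skipped on every later run.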

        • acr.data & abb.data:
          These two files share the same format and list all problems found in ACR/ABB entries from acrAbbNom.data.
          Case ID  Condition                               Source  Issue Types    Msg Types  Source Line No  Suggested EUIs  Notes
          1        field num < 5                           line    NO BASE FIELD  ERROR      recNum                          Fix - add base field
          2        field num = 5                           line    NO EUI FIELD   WARNING    recNum          Suggested EUI   Fix - add EUI field (auto)
          3        field num = 5                           line    NO EUIS FIELD  WARNING    recNum          Suggested EUIs  Fix - add EUI field (choose one)
          4        field num = 6; No record found          line    NO REC FOUND   WARNING    recNum          Empty EUI       Fix - modify lexRecord
          5        field num = 6; No EUI found             line    NO EUI FOUND   WARNING    recNum          Empty EUI       Fix - modify lexRecord
          6        field num = 6; Different EUI found      line    WRONG EUI      ERROR      recNum          Suggested EUI   Fix - change EUI (auto)
          7        field num = 6; EUIs do not contain EUI  line    WRONG EUIS     ERROR      recNum          Suggested EUIs  Fix - choose EUI
          8        field num = 6; EUIs contain EUI         line    CHECK EUI      WARNING    recNum          Suggested EUIs  Check - add to exceptions (see below)
          9        field num = 6; EUI is null              line    EUI NULL       ERROR      recNum          Suggested EUIs  Fix - modify lexRecord

          Exceptions for case 8:
          • dupException.data: if not duplicated
          • abb/acrException.data: the right choice
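
The EUI comparison cases (6 to 9) can be sketched as a small classifier. The source does not define the conditions precisely, so the reading below is an assumption: "Different EUI found" means the LEXICON lookup returned exactly one EUI and it differs from the one on the line, while the "EUIs" cases mean the lookup returned several. The class `EuiCheck`, the `NONE` value, and the order of the checks are hypothetical. Cases 4 and 5 are deliberately left out because the source does not say how "no record" and "no EUI" differ.

```java
import java.util.Set;

/** Hypothetical sketch of cases 6 to 9; not the logic in CheckCrossRef.java. */
final class EuiCheck {
    enum Issue {
        NONE(""),              // no issue; this value is the sketch's own addition
        WRONG_EUI("ERROR"),    // case 6
        WRONG_EUIS("ERROR"),   // case 7
        CHECK_EUI("WARNING"),  // case 8
        EUI_NULL("ERROR");     // case 9

        final String msgType;

        Issue(String msgType) {
            this.msgType = msgType;
        }
    }

    /**
     * @param eui       EUI taken from field 6 of the line
     * @param foundEuis EUIs found in the LEXICON for the referenced term
     */
    static Issue classify(String eui, Set<String> foundEuis) {
        if (foundEuis.isEmpty()) {
            throw new IllegalArgumentException("cases 4 and 5 are not covered by this sketch");
        }
        if (eui == null || eui.isEmpty()) {
            return Issue.EUI_NULL;
        }
        if (foundEuis.size() == 1) {
            return foundEuis.contains(eui) ? Issue.NONE : Issue.WRONG_EUI;
        }
        // Several candidates: a match still needs a manual check.
        return foundEuis.contains(eui) ? Issue.CHECK_EUI : Issue.WRONG_EUIS;
    }
}
```

Under this reading, a single candidate EUI allows an automatic fix, while several candidates require a choice, which matches the "(auto)" and "choose EUI" notes in the table.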

        • nom.data
          This file lists all problems found in NOM entries from acrAbbNom.data.
          Case ID  Condition                               Source  Issue Types    Msg Types  Source Line No  Suggested cats  Suggested EUIs  Notes
          1        field num < 5                           line    NO BASE FIELD  ERROR      recNum                                          Fix - add base field
          2        field num = 5                           line    NO CAT FIELD   ERROR      recNum          Suggested cat   Suggested EUIs  Fix - add cat|EUI fields
          3        field num = 6                           line    NO EUI FIELD   WARNING    recNum          cat             Suggested EUI   Fix - add EUI field (auto)
          4        field num = 6                           line    NO EUIS FIELD  WARNING    recNum          cat             Suggested EUIs  Fix - add EUI field (choose one)
          5        field num = 7; No record found          line    NO REC FOUND   WARNING    recNum          cat             Empty EUI       Fix - modify lexRecord
          6        field num = 7; No EUI found             line    NO EUI FOUND   WARNING    recNum          cat             Empty EUI       Fix - modify lexRecord
          7        field num = 7; Different cat            line    WRONG CAT      WARNING    recNum          Suggested cat   Suggested EUIs  Fix - modify lexRecord
          8        field num = 7; Cats do not contain cat  line    WRONG CATS     WARNING    recNum          Suggested cat   Suggested EUIs  Fix - modify lexRecord
          9        field num = 7; Cats contain cat         line    CHECK CAT      WARNING    recNum          Suggested cat   Suggested EUIs  Check - add to filter list
          10       field num = 7; Cat is null              line    CAT NULL       ERROR      recNum          Suggested cats  Suggested EUIs  Fix - modify lexRecord
          11       field num = 7; Different EUI found      line    WRONG EUI      ERROR      recNum          cat             Suggested EUI   Fix - change EUI (auto)
          12       field num = 7; EUIs do not contain EUI  line    WRONG EUIs     ERROR      recNum          cat             Suggested EUIs  Fix - choose EUI
          13       field num = 7; EUIs contain EUI         line    CHECK EUI      WARNING    recNum          cat             Suggested EUIs  Check - add to filter list
          14       field num = 7; EUI = null               line    EUI NULL       ERROR      recNum          cat             Suggested EUIs  Fix - modify lexRecord
          15       field num = 7; Not symmetric            line    NOT SYM        WARNING    recNum                                          Fix - modify lexRecord
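
Case 15 (NOT SYM) is not defined further in the source. The sketch below assumes that "symmetric" means every nominalization reference from record A to record B has a matching reference from B back to A. Under that assumption, the check reduces to looking for directed EUI pairs whose reverse pair is missing. The class `NomSymmetryCheck` and the pair representation are hypothetical, and the actual rule in LexBuild may differ.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a symmetry check; the real rule in LexBuild may differ. */
final class NomSymmetryCheck {
    /**
     * Each reference is a pair {sourceEUI, targetEUI} taken from a 7-field
     * NOM line (base EUI and nominalization EUI). Returns the references
     * whose reverse pair is missing, in input order.
     */
    static List<String[]> findAsymmetric(List<String[]> refs) {
        Set<String> seen = new HashSet<>();
        for (String[] r : refs) {
            seen.add(r[0] + "|" + r[1]);
        }
        List<String[]> missing = new ArrayList<>();
        for (String[] r : refs) {
            // A reference is symmetric only if the reversed pair also occurs.
            if (!seen.contains(r[1] + "|" + r[0])) {
                missing.add(r);
            }
        }
        return missing;
    }
}
```

Each reference returned by `findAsymmetric` would correspond to one NOT SYM warning, to be resolved by modifying the lexical record on one side of the pair.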