The SPECIALIST Lexicon

Auto-tag Processes

This page describes the auto-tag processes in LMW candidate list generation:

  • I. Auto-tag raw LMW candidate list:

  • II. Auto-tag completed LMW candidate list:
    • Once a candidate list is completed (submitted and approved) by the linguists, it needs to go through the processes in 00.CandidateList (steps 1-3) to update the invalid LMWs (prevCand.data.no)
      • Add the completed candidate list to the appropriate directory
      • Run 00.CandidateList step 1.
      • Candidates that are in the latest inflVars.data are valid LMWs (prevCand.data.yes)
      • Candidates that are not in the latest inflVars.data are invalid LMWs (prevCand.data.no)
        • Invalid LMWs in the candidate list are automatically updated (prevCand.data.no in the diagram below)
        • All skipped candidates are treated as invalid LMWs in this process, which is why the program gives linguists a second chance to review invalid LMWs (AUTO_TAG_NO) in the final tagging process.
      • Once the list is completed, CandList.rmYesNo should become empty after running the above process with the latest Lexicon and invalid LMWs.
        • Valid candidates are added to the Lexicon
        • Invalid candidates are added to the invalid LMWs file.
    • Other invalid terms (notBaseLmw.data.no) are static legacy data used in 2019 and earlier (no more updates from 2020 on).
    • The file Total.data.no (= prevCand.data.no + notBaseLmw.data.no) is used as the latest collection of invalid LMWs. This file should be used in LexAccess.Files from 2020 on.
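
The membership rule and the file combination described above can be sketched in Python. This is only an illustration under stated assumptions, not the actual 00.CandidateList program: the function names are hypothetical, terms are compared as exact strings held in memory (the real file formats and any normalization are not specified on this page), and the "+" in Total.data.no is assumed to mean a merge with duplicates removed.

```python
def split_candidates(candidates, infl_vars):
    """Partition candidates into (valid, invalid) lists.

    A candidate counts as valid when it appears in infl_vars
    (standing in for the latest inflVars.data); otherwise it is invalid.
    Exact string matching is an assumption of this sketch.
    """
    known = set(infl_vars)
    valid = [c for c in candidates if c in known]        # -> prevCand.data.yes
    invalid = [c for c in candidates if c not in known]  # -> prevCand.data.no
    return valid, invalid


def merge_invalid(prev_cand_no, not_base_lmw_no):
    """Combine two invalid-LMW collections (standing in for Total.data.no).

    Duplicates are dropped and the result is sorted; both choices are
    assumptions made for this sketch.
    """
    return sorted(set(prev_cand_no) | set(not_base_lmw_no))


if __name__ == "__main__":
    # Made-up terms, for illustration only.
    infl_vars = ["heart attack", "blood pressure"]
    candidates = ["heart attack", "attack of", "blood pressure"]
    valid, invalid = split_candidates(candidates, infl_vars)
    total_no = merge_invalid(invalid, ["in the", "attack of"])
    print(valid)
    print(invalid)
    print(total_no)
```

Because every candidate falls into exactly one of the two lists, nothing is left over after the split, which matches the expectation that CandList.rmYesNo ends up empty.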