Step | Description | Inputs | Outputs | Notes
|
---|
Pre-Process:
|
0 | - Update the latest valid and invalid LMW list
| | | - Update candidates
- ${LMW_DIR}/bin/00.CandidateList, steps 1-4
=> Linked to the latest Lexicon and inflVars from LexBuild daily backup
|
Process:
|
1 | Generate candidate list from Abb/Acr expansion
| - ${IN_DIR}/LEXICON (input)
- ${IN_DIR}/inflVars.data (valid LMWs)
- ${CUR_DIR}/notBase.data.current (needs to be updated at step 10)
- ${OUT_DIR}/abbAcrExpansions.data.hasEui.Exception.${YEAR} (modified from the previous year)
| - abbAcrExpansions.tag (all tags)
- abbAcrExpansions.invEui (the cross-ref EUI is invalid)
- abbAcrExpansions.hasEui (no cross-ref EUI, but the expansion matches EUIs)
- abbAcrExpansions.rpt (summary report)
- abbAcrExpansions.data.cand (candidate list)
=> manual copy to ./Cand/abbAcrExpansions.data.cand.${YEAR}
=> Link to ./Stats/abbAcrExpansions.data.cand.${YEAR}
=> first, go to step 10 to generate the candidate list
=> then, repeat steps 0-2 until abbAcrExpansions.data.cand is empty (0)
|
|
2 | Split the invalid cross-ref EUI file and the no cross-ref EUI (expansion matches EUIs) file
| - abbAcrExpansions.data.invEui
- abbAcrExpansions.data.hasEui
| - abbAcrExpansions.data.invEui.NO_EUI
=> Sent to linguist to tag [D]
- [D]: if the CR of the expansion is a deleted record (invalid LMW), the cross-ref EUI should be manually removed.
- Others: the expansion is a valid LMW. This case might require changing the expansion to its citation form, restoring the deleted records, or creating a new lexRecord and modifying the CR-EUI, etc.
=> update ${LEX_CHECK}/data/File/notBaseForm.data.${YEAR}
- this file should be empty after the update (notBaseForm.data)
- abbAcrExpansions.data.invEui.WRONG_CIT
=> wrong citation; after it is fixed, this file should be empty
- abbAcrExpansions.data.hasEui.E
=> Exceptions, expansion has 1 matched EUI
=> Sent to linguist to tag:
- [C]: correct; the expansion is an invalid LMW, so it should not have a CR-EUI. No fix in LB.
- [Y]: if the suggested matched EUI is correct, manually add the EUI to the lexRecord in LB.
- [- EUI: E0xxxxxxx]: the expansion is a valid LMW; add the EUI to the end of the line if the suggested matched EUI is not correct. Also, fix it in LB.
- abbAcrExpansions.data.hasEui.M
=> Exceptions, expansion has multiple matched EUIs
=> Sent to linguist to tag:
- [C]: correct; the expansion should not have a cross-ref EUI (even if the spelling is a valid base). => add to abbAcrExpansions.data.hasEui.Exception.${YEAR}
- [Y]: if the 1 matched EUI is correct (need to update the Lexicon in LexBuild)
- EUI: add the correct EUI; might need to update the cross-ref EUI, modify the expansion, or add a new record (if the expansion is an LMW) to the Lexicon
|
|
Post-Process:
|
10 | Auto-tag candidate list (CandidateUtil.FilterTagCandFile)
| - ${STATS_DIR}/abbAcrExpansions.data.cand.${YEAR}
- ${CAND_DIR}/inflVars.data.current (valid LMWs)
- ${CAND_DIR}/totalTerms.all.base.no (invalid LMWs)
|
- abbAcrExpansions.data.cand.${YEAR}.autoTag (all tags)
- abbAcrExpansions.data.cand.${YEAR}.rmYesNo
After the updates are completed, this file must be empty (wc=0)
- abbAcrExpansions.data.cand.${YEAR}.rmYesTagNo
=> Before the update, this file is used as the candidate list sent to the linguist
- No tag
- if the expansion is a valid LMW, add it to the Lexicon and add the CR-EUI to the expansion
- notBaseFormUpdate.data.${YEAR}
- cd ./Stats
- flds 4,2 abbAcrExpansions.data.cand.${YEAR}.rmYesTagNo.${YEAR} > notBaseFormUpdate.data.${YEAR}
- Append notBaseFormUpdate.data.${YEAR} to ${LexCheck}/data/Files/notBaseForm.data.${YEAR}
| After the candidate list is completed:
- Add/Link candidates to ${Candidates}/1.LexiconAbbAcrExpansion/abbAcrExpansions.data.cand.${YEAR}
- Run 00.CandidateList, steps 1-4
This step updates the valid and invalid LMWs, and thus updates the candidates.
- Rerun steps 1-2 until *.cand = 0. Because candidates that are LMWs are now in the Lexicon, and invalid LMWs are automatically tagged as invalid (by the updated totalTerms.all.base.no from 00.CandidateList), no new candidates should be found.
|
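
The file operations in step 10 (build notBaseFormUpdate.data.${YEAR} from fields 4 and 2, append it to the notBaseForm file, and confirm that a file is empty) can be sketched in Python. This is a minimal sketch under stated assumptions, not the project's own tooling: it assumes the candidate files are pipe-delimited and that `flds 4,2` selects the 4th and 2nd fields (1-based) in that order. All function names below are hypothetical.

```python
from pathlib import Path
from typing import Sequence


def select_fields(line: str, field_numbers: Sequence[int], sep: str = "|") -> str:
    """Return the requested 1-based fields of a delimited line, in the given order.

    Assumption: this mirrors what `flds 4,2` does on pipe-delimited data.
    A line with too few fields raises IndexError rather than being skipped.
    """
    fields = line.rstrip("\n").split(sep)
    return sep.join(fields[n - 1] for n in field_numbers)


def build_not_base_form_update(cand_file: Path, update_file: Path) -> int:
    """Write fields 4 and 2 of each non-blank line of cand_file to update_file.

    Returns the number of lines written.
    """
    rows = [
        select_fields(line, [4, 2])
        for line in cand_file.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    update_file.write_text("".join(row + "\n" for row in rows), encoding="utf-8")
    return len(rows)


def append_file(source: Path, target: Path) -> None:
    """Append the full contents of source to the end of target."""
    with target.open("a", encoding="utf-8") as out:
        out.write(source.read_text(encoding="utf-8"))


def line_count(path: Path) -> int:
    """Count the lines in a file; 0 means the file is empty (the wc=0 check)."""
    return len(path.read_text(encoding="utf-8").splitlines())
```

A run would call `build_not_base_form_update` on the rmYesTagNo file, then `append_file` onto the notBaseForm file, and finally use `line_count` to confirm that files such as *.rmYesNo or *.cand have reached 0 before stopping the rerun loop.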