Generate LEXICON in pure ASCII format
This step must be completed before generating the LEXICON tables, because LEXICON.release might need to be modified during this step.
I. Concept: Algorithm of Generating ASCII Lexicon
II. Pre-Process: Prepare data and files
shell> mkdir ${LEXICON_DIR}/data/${YEAR}/tables
shell> cd ${LEXICON_DIR}/data/${YEAR}/tables
shell> ln -sf ../data/LEXICON.release LEXICON
shell> mkdir ${LEXICON_DIR}/data/${YEAR}/ascii
shell> cp -rp ${LEXICON}/data/${PRE_YEAR}/ascii/exceptions ${LEXICON}/data/${YEAR}/ascii/exceptions
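The preparation steps above can also be wrapped in a small helper so that they are safe to re-run. This is only a sketch under assumptions: the function name `prepare_ascii_workspace` is hypothetical, and because the commands above use both `${LEXICON_DIR}` and `${LEXICON}` as the base directory, the base is taken as a single argument here. The relative symlink target is copied unchanged from the manual step.

```shell
# Hypothetical helper (not part of the Lexicon scripts).
# Usage: prepare_ascii_workspace BASE YEAR PRE_YEAR
prepare_ascii_workspace() {
    local base="$1" year="$2" pre_year="$3"
    local tables="$base/data/$year/tables"
    local ascii="$base/data/$year/ascii"

    # mkdir -p succeeds even if the directories already exist.
    mkdir -p "$tables" "$ascii" || return 1

    # Same relative link as in the manual step; run in a subshell so the
    # caller's working directory is not changed.
    ( cd "$tables" && ln -sf ../data/LEXICON.release LEXICON ) || return 1

    # Copy last year's exceptions only once. Re-running cp -rp onto an
    # existing directory would nest a second "exceptions" directory inside it.
    if [ ! -e "$ascii/exceptions" ]; then
        cp -rp "$base/data/$pre_year/ascii/exceptions" "$ascii/exceptions" || return 1
    fi
}
```

A call such as `prepare_ascii_workspace "${LEXICON_DIR}" "${YEAR}" "${PRE_YEAR}"` would then stand in for the five commands, assuming both variables in the original commands point at the same directory.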
III. Process: Generate ASCII Lexicon
shell> ${LEXICON}/bin/3.GenerateAsciiLexicon <year>
${LVG_YEAR}
${LC_YEAR}
4.ReviewAsciiReports ${YEAR}) for further processing (see section IV).
Log
IV. Review ASCII Reports
shell> ${LEXICON}/bin/4.ReviewAsciiReports <year>
| Exception files | Description | Action |
|---|---|---|
| invalidAsciiExceptions.txt | Invalid ASCII conversions that are deleted during line-by-line ASCII conversion | update |
E0543077|base|delete|not-Lex|divorcé|divorce|N
E0702889|base|delete|not-Lex|Pécs|Pecs|N
E0710983|base|delete|not-Lex|GΩ|GOmega|N
E0721571|base|delete|not-Lex|μB|muB|N
| EUI | Type | Action | Reason | non-ASCII | ASCII conversion | Tag (TBD) |
|---|---|---|---|---|---|---|
| EUI | Type (spVar\|base) | Action (delete) | Reason (not in Lexicon) | non-ASCII Citation | ASCII conversion | Tag (Y\|N) |
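Before updating the exceptions file by hand, it can help to check that every record still follows the seven pipe-delimited fields described above. The following is a minimal sketch, not part of the Lexicon scripts: the function name `check_exception_records` is hypothetical, and the accepted values (`spVar` or `base`, `delete`, `Y` or `N`) are taken only from the field description above. It also assumes the ASCII conversion field should contain printable ASCII only.

```shell
# Hypothetical helper: validate records of an exceptions file.
# Prints one message per problem and returns non-zero if any record fails.
check_exception_records() {
    # LC_ALL=C makes awk compare bytes, so the range " -~" means
    # printable 7-bit ASCII (0x20 to 0x7E).
    LC_ALL=C awk -F'|' '
        NF != 7 {
            printf "line %d: expected 7 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        $2 != "spVar" && $2 != "base" { printf "line %d: type is neither spVar nor base\n", NR; bad = 1 }
        $3 != "delete"                { printf "line %d: action is not delete\n", NR; bad = 1 }
        $7 != "Y" && $7 != "N"        { printf "line %d: tag is neither Y nor N\n", NR; bad = 1 }
        # Assumption: field 6 (the ASCII conversion) must be printable ASCII.
        $6 ~ /[^ -~]/                 { printf "line %d: ASCII conversion is not pure ASCII\n", NR; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

Running `check_exception_records invalidAsciiExceptions.txt` prints nothing and returns 0 when every record matches the layout.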
| Year | Notes |
|---|---|
| 2014 | All 88 valid conversions are deleted in step 3. |
| 2015 | All 90 valid conversions are deleted in step 3 (93 valid exceptions). |
| 2016 | All 90 valid conversions are deleted in step 3 (93 valid exceptions). |
| 2017 | All 94 valid conversions are deleted in step 3 (97 valid exceptions). |
| 2018 | All 92 valid conversions are deleted in step 3 (97 valid exceptions). |
| 2019 | All 95 valid conversions are deleted in step 3 (100 valid exceptions). |
| 2020 | All 93 valid conversions are deleted in step 3 (100 valid exceptions). |
| 2021 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
| 2022 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
| 2023 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
| 2024 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
| 2025 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
| 2026 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
shell> cd /nfsvol/lex/Lu/Development/LVG/Components/Unicode/bin
shell> GetNonAsciiFromFile ${LEXICON.ascii} line char
shell> wc -l line
The line count must be 0 (no non-ASCII Unicode).
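As an independent cross-check of the result above, the same condition (no line contains a non-ASCII byte) can be tested with standard `grep`. This is a sketch that does not replace `GetNonAsciiFromFile`; the function names `count_non_ascii_lines` and `assert_pure_ascii` are hypothetical, and the file to check is passed as an argument.

```shell
# Hypothetical helper: print the number of lines containing a byte >= 0x80.
count_non_ascii_lines() {
    local n
    # In the C locale, " -~" covers 0x20 to 0x7E and [:cntrl:] covers
    # 0x00 to 0x1F plus 0x7F, so the negated set matches only non-ASCII bytes.
    # grep -c exits 1 when the count is 0; only status 2 or higher is an error.
    n=$(LC_ALL=C grep -c '[^ -~[:cntrl:]]' -- "$1") || [ $? -eq 1 ] || return 2
    printf '%s\n' "$n"
}

# Hypothetical helper: succeed only if the file is pure 7-bit ASCII.
assert_pure_ascii() {
    local n
    n=$(count_non_ascii_lines "$1") || return 2
    if [ "$n" -ne 0 ]; then
        printf '%s: %s line(s) contain non-ASCII bytes\n' "$1" "$n" >&2
        return 1
    fi
}
```

`assert_pure_ascii` returns 0 exactly when the count is 0, which matches the requirement that `wc -l line` report 0.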
V. Generate ASCII tables
shell> ${LEXICON}/bin/10.GenerateAsciiTables <year>
9
shell> ${LEXICON}/bin/10.GenerateAsciiTables <year>
10