Generate LEXICON in pure ASCII format
This step must be completed before generating the LEXICON tables, because LEXICON.release might need to be modified during this step.
I. Concept: Algorithm of Generating ASCII Lexicon
II. Pre-Process: Prepare data and files
shell> mkdir ${LEXICON_DIR}/data/${YEAR}/tables
shell> cd ${LEXICON_DIR}/data/${YEAR}/tables
shell> ln -sf ../data/LEXICON.release LEXICON
shell> mkdir ${LEXICON_DIR}/data/${YEAR}/ascii
shell> cp -rp ${LEXICON}/data/${PRE_YEAR}/ascii/exceptions ${LEXICON}/data/${YEAR}/ascii/exceptions
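The directory preparation above can be wrapped in a small function so that it is safe to re-run. This is only a sketch: the function name `prepare_ascii_dirs` and its three arguments (base directory, year, previous year) are hypothetical and not part of the Lexicon scripts. It assumes a single base directory, and it leaves out the `ln -sf` step because that step depends on the local layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Lexicon distribution).
# Usage: prepare_ascii_dirs <base_dir> <year> <previous_year>
prepare_ascii_dirs() {
    pad_base=$1
    pad_year=$2
    pad_prev=$3
    pad_src="$pad_base/data/$pad_prev/ascii/exceptions"
    pad_dest="$pad_base/data/$pad_year/ascii/exceptions"

    # Stop early if last year's exception files are missing.
    if [ ! -d "$pad_src" ]; then
        echo "missing previous exceptions: $pad_src" >&2
        return 1
    fi

    # mkdir -p does not fail when the directories already exist.
    mkdir -p "$pad_base/data/$pad_year/tables" "$pad_base/data/$pad_year/ascii" || return 1

    # Do not copy twice: cp -rp into an existing directory would nest
    # a second "exceptions" directory inside the first one.
    if [ -e "$pad_dest" ]; then
        echo "exceptions already present: $pad_dest" >&2
        return 0
    fi
    cp -rp "$pad_src" "$pad_dest"
}
```

A call such as `prepare_ascii_dirs "${LEXICON_DIR}" "${YEAR}" "${PRE_YEAR}"` would then stand in for the two `mkdir` commands and the `cp` command.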
III. Process: Generate ASCII Lexicon
shell> ${LEXICON}/bin/3.GenerateAsciiLexicon <year>
${LVG_YEAR}
${LC_YEAR}
4.ReviewAsciiReports ${YEAR}
) for further processing (see section IV)
Log
IV. Review ASCII Reports
shell> ${LEXICON}/bin/4.ReviewAsciiReports <year>
Exception files | Description | Action |
---|---|---|
invalidAsciiExceptions.txt | invalid ASCII conversions that are deleted during line-by-line ASCII conversion | update |
E0543077|base|delete|not-Lex|divorcé|divorce|N
E0702889|base|delete|not-Lex|Pécs|Pecs|N
E0710983|base|delete|not-Lex|GΩ|GOmega|N
E0721571|base|delete|not-Lex|μB|muB|N
EUI | Type | Action | Reason | non-ASCII | ASCII conversion | Tag (TBD) |
EUI | Type (spVar|base) | Action (delete) | Reason (not in Lexicon) | non-ASCII Citation | ASCII conversion | Tag (Y|N) |
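When reviewing an exception file, it can help to list only the EUI, the non-ASCII citation form, and its ASCII conversion. The sketch below assumes the seven pipe-delimited fields described above. The function name `summarize_exceptions` is hypothetical, and the sample records are taken from the entries shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: reads exception records on standard input and
# prints EUI, non-ASCII form and ASCII conversion, separated by tabs.
# Records that do not have exactly seven fields are skipped.
summarize_exceptions() {
    awk -F'|' 'NF == 7 { printf "%s\t%s\t%s\n", $1, $5, $6 }'
}

# Example with two records from the table above.
summarize_exceptions <<'EOF'
E0543077|base|delete|not-Lex|divorcé|divorce|N
E0702889|base|delete|not-Lex|Pécs|Pecs|N
EOF
```

For a real file, redirect it into the function, for example `summarize_exceptions < invalidAsciiExceptions.txt`.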
Year | Notes |
---|---|
2014 | All 88 valid conversions are deleted in step 3. |
2015 | All 90 valid conversions are deleted in step 3 (93 valid exceptions). |
2016 | All 90 valid conversions are deleted in step 3 (93 valid exceptions). |
2017 | All 94 valid conversions are deleted in step 3 (97 valid exceptions). |
2018 | All 92 valid conversions are deleted in step 3 (97 valid exceptions). |
2019 | All 95 valid conversions are deleted in step 3 (100 valid exceptions). |
2020 | All 93 valid conversions are deleted in step 3 (100 valid exceptions). |
2021 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
2022 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
2023 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
2024 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
2025 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
2026 | All 100 valid conversions are deleted in step 3 (107 valid exceptions). |
shell> cd /nfsvol/lex/Lu/Development/LVG/Components/Unicode/bin
shell> GetNonAsciiFromFile ${LEXICON.ascii} line char
shell> wc -l line
The count must be 0 (no non-ASCII Unicode characters).
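As an independent cross-check, the same condition can be tested with standard tools. This sketch does not replace `GetNonAsciiFromFile`, and the function name `count_non_ascii_bytes` is hypothetical. It deletes every byte in the ASCII range (octal 000 to 177) and counts what remains, so a result of 0 means the file is pure ASCII. Note that it counts bytes rather than lines.

```shell
#!/bin/sh
# Hypothetical helper: prints the number of non-ASCII bytes in a file.
# LC_ALL=C makes tr work on single bytes regardless of the user's locale.
count_non_ascii_bytes() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Example use (the path is a placeholder):
# count_non_ascii_bytes /path/to/LEXICON.ascii
```

A UTF-8 character such as é occupies two bytes, so any non-zero count only signals that non-ASCII content is present; it does not say how many characters are affected.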
V. Generate ASCII tables
shell> ${LEXICON}/bin/10.GenerateAsciiTables <year>
9
shell> ${LEXICON}/bin/10.GenerateAsciiTables <year>
10