Derivations Procedures - prefixD
Generate prefixD pairs in derivation table:
I. Directory: ${DERIVATION}/2.prefixD
II. Input Files (./data/${YEAR}/dataOrg/):
shell> ${PREFIX_D}/bin/GetPrefixD ${YEAR}
0
III. Final files for allD (release)
IV. Summary of GetPrefixD
Step | Description and Program | Input | Output | Notes
---|---|---|---|---
0 | | See section II. | See section II. |
1 | | | |
2 | | | |
8 | | | |
3 | | | |
14 | | | |
4 | | | |
4a | | | | Re-run this step until: Go to the end of the log file, check: Then, rerun Steps 3~4 until the above three numbers are 0 and prefixD.tbt.data is empty in Step 4.
5 | | | | Make sure unknown dType (\|U\|) from prefixD is empty.
6 | | | |
15 | | | |
16 | | | |
7 | | | |
V. Processes Details:
shell>cd ${DERIVATION}/prefixD/bin
shell>GetPrefixD ${YEAR}
1. Routine process (no new PD-Rules, no new Tag)
1: Get valid prefix base forms from LEXICON
=> generates ./data/bases.data
2: Retrieve raw prefixD pairs
or use
8: Retrieve possible raw prefixD pairs with options
DONE
once all prefixes have been tagged
=> generates:
3: Add tags to prefixD meta file
=> generates ./data/prefixD.meta.data
Each entry must be tagged [yes|no], and all errors must be fixed.
Use the tag tbd to bypass an entry with tagging errors.
3.1: Check conflicts by SpVars
(different dPair tags between 2 records).
=> generates ./data/prefixD.meta.data.conflict
Send to linguist to double check "[yes|no|both]"
=> Ideally, the tag of prefixD between two records should be the same
=> This file lists all inconsistent prefixD tags between two records (caused by SpVars).
=> If not empty, send to linguist to tag the EUI line with [yes|no|both].
14: Auto-fix prefixD.tag.txt for conflicts by SpVars
=> Put the revised tagged file to: ./dataOrg/prefixD.meta.data.conflict.tag.data
=> copy ./dataOrg/prefixD.tag.txt.${YEAR}.fix to ./dataOrg/prefixD.tag.txt.${YEAR} and rerun this step.
4: Split prefixD meta file
=> generates
Make sure prefixD.tbt.data is empty. If not, send to linguists to tag:
Tag negation: (O|N) if prefix is: a-, an-, de-, dys-, in-, under-
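As a rough illustration of the rule above, the sketch below flags pairs whose derivative is one of the listed prefixes plus the base, so they can be routed for an (O|N) decision. The helper name `negation_candidate_prefixes` and the optional-hyphen matching are assumptions for illustration only; they are not part of GetPrefixD, and the actual program may decide this differently.

```python
from typing import List

# Prefixes listed in Step 4 as needing a negation tag (O|N).
NEGATION_PREFIXES = ("a", "an", "de", "dys", "in", "under")


def negation_candidate_prefixes(derived: str, base: str) -> List[str]:
    """Hypothetical helper: return the listed prefixes p for which
    derived == p + base (with or without a hyphen after the prefix)."""
    derived, base = derived.lower(), base.lower()
    return [p for p in NEGATION_PREFIXES
            if derived in (p + base, p + "-" + base)]


print(negation_candidate_prefixes("atypical", "typical"))        # ['a']
print(negation_candidate_prefixes("underpay", "pay"))            # ['under']
print(negation_candidate_prefixes("antebrachium", "brachium"))   # []
```

An empty result only means the pair does not use one of the six listed prefixes; whether a flagged pair is actually a negation (N) or not (O) remains the linguist's call.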
5: Verify dType on prefixD.yes.data
=> generates ./data/prefixD.yes.data.type
6: Add negation tag (N|O). The result is uniquely sorted in the program (not by sort -u).
=> generates ./data/prefixD.yes.data.${YEAR}
Negation tagging error must be fixed
=> send to linguist to tag the negation (N|O)
6.1: Check conflict (inconsistent) tags between SpVars
=> generates ./data/prefixD.yes.data.${YEAR}.conflict
=> Ideally, the tag of prefixD between two records should be the same
=> SpVars might also cause inconsistent negation tags on prefixD.
=> Ideally, the tag of negation between two records should be the same
=> If not empty, send to linguist to tag the EUI line with (N|O|B).
=> The negation could have exceptions:
=> manually update this result to prefixD.yes.data.${YEAR}
=> The final prefix is in ${DERIVATION}/prefixD/data/${YEAR}/data/prefixD.yes.data.${YEAR}
15: Auto-fix prefixD.tag.txt for negation conflicts by SpVars
=> Put the revised tagged file to: ./dataOrg/prefixD.yes.data.${YEAR}.conflict.tag.data
Known cases in 2015 are:
1|E0013901|E0072172|
# 556|antebrachium|noun|E0072172|brachium|noun|E0013901|O|
# 1431|antibrachium|noun|E0072172|brachium|noun|E0013901|N|
2|E0013883|E0203565|
# 557|antebrachial|adj|E0203565|brachial|adj|E0013883|O|
# 1432|antibrachial|adj|E0203565|brachial|adj|E0013883|N|
3|E0024983|E0045258|
# 11245|empanel|verb|E0024983|panel|noun|E0045258|O|
# 15077|impanel|verb|E0024983|panel|noun|E0045258|N|
4|E0434097|E0580659|
# 11243|embower|verb|E0580659|bower|noun|E0434097|O|
# 15072|imbower|verb|E0580659|bower|noun|E0434097|N|
5|E0059482|E0523982|
# 9310|disyllable|noun|E0523982|syllable|noun|E0059482|O|
# 10500|dissyllable|noun|E0523982|syllable|noun|E0059482|N|
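The conflict check in Step 6.1 can be pictured as grouping records by their EUI pair and reporting any pair that carries more than one negation tag. The sketch below is only an illustration, not the GetPrefixD implementation. It assumes a record layout of derivative|category|EUI|base|category|EUI|negation, inferred from the known cases above, and the function name `find_negation_conflicts` is hypothetical.

```python
from collections import defaultdict


def find_negation_conflicts(lines):
    """Group records by unordered EUI pair and return the pairs whose
    negation tags differ. Field positions are an assumed layout."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.strip().split("|")
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        key = tuple(sorted((fields[2], fields[5])))  # EUI pair, order-independent
        groups[key].append((fields[0], fields[3], fields[6]))
    return {k: v for k, v in groups.items()
            if len({tag for _, _, tag in v}) > 1}


records = [
    "antebrachium|noun|E0072172|brachium|noun|E0013901|O|",
    "antibrachium|noun|E0072172|brachium|noun|E0013901|N|",
    "disyllable|noun|E0523982|syllable|noun|E0059482|O|",
]
for (eui1, eui2), rows in find_negation_conflicts(records).items():
    print(f"{eui1}|{eui2}|")
    for word, base, tag in rows:
        print(f"# {word}|{base}|{tag}|")
```

With these three records, only the E0013901|E0072172 pair is reported, because antebrachium and antibrachium share both EUIs but carry different tags (O and N).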
16: Auto-fix prefixD.tag.txt for negation conflicts by SpVars for class N and O
=> Check fix file exist: ./data/prefixD.negation.fix.data
=> copy ./data/prefixD.yes.${YEAR}.fixNegation to ./data/prefixD.yes.${YEAR}
7: Check affix on prefixD.yes.data.${YEAR}
=> generates ./data/prefixD.pattern3.rpt (should be empty)
11: Run above 1-7 steps (default)
=> above steps from 1 ~ 7
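Several files in the routine process are expected to be empty before moving on (prefixD.tbt.data in Step 4, prefixD.pattern3.rpt in Step 7). A minimal sketch of a post-run check follows. The helper name `non_empty_files` and the `./data` location are assumptions for illustration; GetPrefixD itself may report these conditions in its log.

```python
from pathlib import Path


def non_empty_files(data_dir, names):
    """Hypothetical helper: return the names of files under data_dir
    that exist and contain something other than whitespace."""
    found = []
    for name in names:
        path = Path(data_dir) / name
        if path.is_file() and path.read_text(encoding="utf-8").strip():
            found.append(name)
    return found


# File names come from Steps 4 and 7; "./data" is an assumed location.
leftovers = non_empty_files("./data", ["prefixD.tbt.data", "prefixD.pattern3.rpt"])
print("needs attention:", leftovers or "none")
```

Any file the check returns still needs the follow-up described in its step, for example sending prefixD.tbt.data to linguists for tagging.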
2. Add new PD-Rules process
8: Retrieve possible raw prefixD pairs with options
${PREFIX}
to generate all prefixD pairs for a specified prefix (check the prefixD.rawNo.rpt.${PREFIX})
DONE
to retrieve all prefixes that are not TBD
3: Add tags to prefixD meta file
4: Split prefixD meta file
5: Verify dType on prefixD.yes.data
6: Add negation tag (N|O)
7: Compare original tag and result tag files
3. Add tag for new prefix dPairs (annual updates)
Update prefixD growth
Please refer to derivation design documents in Lexical Tools for details.