Synonyms ReTag From Previous Releases
Under the latest requirements, all LexSynonyms should be cognitive synonyms. However, some LexSynonyms that are only near-synonyms were tagged incorrectly in the 2017 release (because the requirements were later modified) and need to be re-tagged. A new algorithm should be developed for this effort, for example one that detects broader/narrower concepts in which a term is paired with one of its subterms as an sPair (e.g. "white horse" and "horse" are not an sPair). Ideally, this task should be done at the sClass level (not the sPair level) so that all non-cognitive sPairs, along with their bi-directional, spVars, and nominalization counterparts, can be excluded. Due to limited resources, it was done at the sPair level (as suggested by Francois Lang) for the 2018 release. The steps of this fix are briefly described below.
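The broader/narrower idea above ("white horse" vs. "horse") can be illustrated with a minimal sketch. This is not the algorithm used for the release; the function name `is_subterm_pair` and the whitespace tokenization are assumptions made only for illustration. It flags a pair when the tokens of the shorter term appear as a contiguous run inside the longer term.

```python
def is_subterm_pair(term_a: str, term_b: str) -> bool:
    """Return True if one term is a proper token-level subterm of the other.

    Hypothetical helper: terms are lowercased and split on whitespace,
    which is an assumption, not the Lexicon's own normalization.
    """
    a = term_a.lower().split()
    b = term_b.lower().split()
    if len(a) == len(b):
        # Same token count: neither can be a proper subterm of the other.
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    if not short:
        return False
    n = len(short)
    # Slide a window of the shorter term's length over the longer term.
    return any(long_[i:i + n] == short for i in range(len(long_) - n + 1))


# Candidate sPairs; only the first is a broader/narrower (subterm) pair.
candidates = [("white horse", "horse"), ("car", "automobile"), ("horseshoe", "horse")]
flagged = [pair for pair in candidates if is_subterm_pair(*pair)]
print(flagged)  # [('white horse', 'horse')]
```

Matching on whole tokens rather than raw substrings keeps pairs such as "horseshoe" and "horse" from being flagged. A check like this would only nominate candidates for re-tagging; it does not decide whether a pair is a cognitive synonym.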
I. Inputs (Candidates for re-tag, provided by Francois Lang)
II. Process
GetSynonyms ${year}
Option | Descriptions | Inputs | Outputs |
---|---|---|---|
23 | Retag synonyms from Meta-thesaurus | | ./Results/synonymFromMeta.data.fixed |
24 | Retag synonyms from Nominalization | | ./Results/synonymFromNom.data.fixed |
25 | Retag synonyms from LVG | | ./Results/synonymFromLvg.data.fixed |
26 | Combine fixed synonyms from above steps: 23~25 | | ./Results/synonym.data.${YEAR}.fixed |
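As a rough illustration of what the combine step (option 26) might look like, the sketch below merges the three `.fixed` files into one output file. The actual behavior of `GetSynonyms` is not documented here, so the function name `combine_fixed`, the treatment of each line as an opaque record, and the removal of blank and duplicate lines are all assumptions.

```python
from pathlib import Path


def combine_fixed(inputs, output):
    """Merge line-oriented files into one, keeping first-seen order.

    Assumptions (not from the source): one record per line, blank lines
    are ignored, and exact duplicate records are written only once.
    Returns the number of records written.
    """
    seen = set()
    merged = []
    for path in inputs:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            record = line.strip()
            if record and record not in seen:
                seen.add(record)
                merged.append(record)
    out = Path(output)
    # Create the output directory if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("".join(r + "\n" for r in merged), encoding="utf-8")
    return len(merged)
```

Under these assumptions, the step would be invoked by passing the three output paths from options 23 to 25 as `inputs` and the combined file path from option 26 as `output`.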
III. Results
Total | Yes | No |
---|---|---|
3,405 | 1,601 (47.0191%) | 1,804 (52.9809%) |

Source | Original | Removed | Remaining |
---|---|---|---|
CUI | 126,424 | 1,916 (1.5155%) | 124,508 (98.4845%) |
EUI | 67,862 | 32 (0.0472%) | 67,830 (99.9528%) |
LVG | 4,780 | 2 (0.0418%) | 4,778 (99.9582%) |
Total | 199,066 | 1,950 (0.9796%) | 197,116 (99.0204%) |
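The percentages in the tables above follow directly from the counts. The short sketch below recomputes them, using only the figures reported in the tables; the helper names `summarize` and `pct` are illustrative and not part of any existing tool.

```python
# (original, removed) counts per source, copied from the results table.
counts = {"CUI": (126_424, 1_916), "EUI": (67_862, 32), "LVG": (4_780, 2)}


def summarize(counts):
    """Return {source: (original, removed, remaining)} plus a 'Total' row."""
    rows = {}
    for source, (original, removed) in counts.items():
        rows[source] = (original, removed, original - removed)
    # Column-wise sums over all sources give the Total row.
    rows["Total"] = tuple(sum(col) for col in zip(*rows.values()))
    return rows


def pct(part, whole):
    """Format part/whole as a percentage with four decimal places."""
    return f"{100 * part / whole:.4f}%"


for source, (original, removed, remaining) in summarize(counts).items():
    print(source, original, removed, pct(removed, original),
          remaining, pct(remaining, original))
```

The same `pct` helper reproduces the first table as well: `pct(1601, 3405)` gives 47.0191% and `pct(1804, 3405)` gives 52.9809%.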