Sorting Order: No Punctuation First
I. Description:
A base form that contains no punctuation is placed at the top of the sorting order.
II. Example
Alzheimers is chosen over Alzheimer's and Alzheimers' as the citation form because it does not contain punctuation.
2013- | 2014+ |
---|---|
{base=Alzheimer's spelling_variant=Alzheimers spelling_variant=Alzheimers' entry=E0000236 cat=noun variants=uncount } | {base=Alzheimers spelling_variant=Alzheimer's spelling_variant=Alzheimers' entry=E0000236 cat=noun variants=uncount } |
{base=Alzheimer's disease spelling_variant=Alzheimers disease spelling_variant=Alzheimers' disease entry=E0000237 cat=noun variants=uncount } | {base=Alzheimers disease spelling_variant=Alzheimer's disease spelling_variant=Alzheimers' disease entry=E0000237 cat=noun variants=uncount } |
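The rule can be illustrated with a minimal Python sketch. It is not the Lexicon's actual sorting code: the function names `has_punctuation` and `pick_citation_form` are hypothetical, punctuation is assumed to mean the characters in Python's `string.punctuation`, and ties among the remaining forms are simply left in their input order.

```python
import string


def has_punctuation(term: str) -> bool:
    """Return True if the term contains any ASCII punctuation character."""
    return any(ch in string.punctuation for ch in term)


def pick_citation_form(forms):
    """Pick the first form after sorting punctuation-free forms to the top.

    sorted() is stable and False sorts before True, so forms without
    punctuation come first and all other forms keep their input order.
    (The tie-breaking behaviour is an assumption made for this sketch.)
    """
    return sorted(forms, key=has_punctuation)[0]


print(pick_citation_form(["Alzheimer's", "Alzheimers", "Alzheimers'"]))
print(pick_citation_form(["carcino-embryonic", "carcinoembryonic"]))
```

With the forms of the two records above, the sketch selects Alzheimers and Alzheimers disease, matching the 2014+ base forms.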
III. Impacts (on Norm)
Results of NLP programs that use citation forms might change accordingly. For example, using the two LexRecords above, the result of Norm (which uses -f:Ct) changes between 2013- and 2014+:
Input | 2013- | 2014+ |
---|---|---|
Alzheimers' disease | alzheimer disease s | alzheimers disease |
Similarly, the results of Norm (which uses -f:Ct) for the example below change between 2013- and 2014+:
Input | 2013- | 2014+ |
---|---|---|
carcinoembryonic | carcino embryonic | carcinoembryonic |
2013- | 2014+ |
---|---|
{base=carcino-embryonic spelling_variant=carcinoembryonic entry=E0015222 cat=adj variants=inv position=attrib(3) position=pred stative } | {base=carcinoembryonic spelling_variant=carcino-embryonic entry=E0015222 cat=adj variants=inv position=attrib(3) position=pred stative } |
Please note that carcino-embryonic is normalized to carcino embryonic in both versions because the replace-punctuation-with-space flow component (-f:o) is processed before -f:Ct in Norm.
Input | 2013- | 2014+ |
---|---|---|
carcino-embryonic | carcino embryonic | carcino embryonic |
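The effect of the flow order can be reproduced with a toy Python model. This is only a sketch and not the Norm implementation: `toy_norm`, `build_citation_table`, `LEX_2013`, and `LEX_2014` are hypothetical names, the lookup tables hold just the records shown on this page, and the lowercasing and final word-sorting steps are assumptions chosen to match the example outputs. Only the two steps named in the text (-f:o before -f:Ct) are modelled explicitly.

```python
import re


def build_citation_table(records):
    """Map every lowercased form of a record to its lowercased base form."""
    table = {}
    for base, variants in records:
        for form in [base, *variants]:
            table[form.lower()] = base.lower()
    return table


# Toy lexicons holding only the records from this page (hypothetical data layout).
LEX_2013 = build_citation_table([
    ("Alzheimer's disease", ["Alzheimers disease", "Alzheimers' disease"]),
    ("carcino-embryonic", ["carcinoembryonic"]),
])
LEX_2014 = build_citation_table([
    ("Alzheimers disease", ["Alzheimer's disease", "Alzheimers' disease"]),
    ("carcinoembryonic", ["carcino-embryonic"]),
])


def toy_norm(term, citation_table):
    # Modelled on -f:o: replace each punctuation character with a space.
    text = re.sub(r"[^\w\s]", " ", term)
    # Assumption: lowercase and collapse repeated whitespace.
    text = " ".join(text.lower().split())
    # Modelled on -f:Ct: map the term to its citation form if it is known.
    text = citation_table.get(text, text)
    # Assumption: split into words on non-alphanumerics and sort them.
    return " ".join(sorted(re.findall(r"[a-z0-9]+", text)))


for term in ["Alzheimers' disease", "carcinoembryonic", "carcino-embryonic"]:
    print(term, "|", toy_norm(term, LEX_2013), "|", toy_norm(term, LEX_2014))
```

In this model, carcino-embryonic has already become carcino embryonic by the time the citation lookup runs, so it matches no form in either lexicon and the result is the same for both versions.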