INFORMATION & RESOURCES

COVID-19 Related Resources

This page collects descriptions of and links to all Indexing Initiative COVID-19 related activities and resources, so that you can find them in one place without searching the entire site.

As of May 1, 2020, all of the Indexing Initiative tools (MetaMap Lite, MetaMap, SemRep, and MTI) have been updated to use the UMLS 2020AA release, which contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

Because we are currently in the middle of our release schedule, we also provide a special coronavirus addendum for the Specialist Lexicon (https://lhncbc.nlm.nih.gov/LSG/Projects/lexicon/current/web/index.html). This coronavirus-specific addendum is compatible with all versions of the Specialist Lexicon.

SemRep processes both the LitCovid and CORD-19 datasets on a weekly basis. Please note: these results are provided without the need for a UMLS License Agreement due to the nature of the data.

SemRep Options Used for Processing Both Datasets:
semrep -A -N -n -S -F -Z 2021AB
     A: Anaphora resolution
     N: use_generic_domain_extension
     n: use_generic_domain_modification
     S: generic_processing
     F: full_fielded_output
     Z 2021AB: use 2021AB data

*** LitCovid Results (209,150 distinct citations as of January 7, 2022):

 LitCovid.RIS.gz (29 MB) - Downloaded RIS-Formatted LitCovid citations

md5sum 890fe2330e28fc0a0504310a9ac6fafe
sha1sum b7bc9b2613542efe744ec9d3affb7b2f803436fc

 LitCovid.MEDLINE.ALL.gz (40 MB) - MEDLINE Formatted LitCovid citations

md5sum a0a0ae25fa76870c83891631cbbaca1c
sha1sum 4652b63235f7e46fc6812072870aa55725906d2d

 LitCovid.SemRep.ALL.gz (185 MB) - SemRep Results File for LitCovid citations

md5sum 35dceda432ab8b68f3f4187c4481beb3
sha1sum 48b3c3d838a159d3cb60b51c905a9b3ce18b00d9

*** CORD-19 Results Part II (793,553 distinct articles as of January 7, 2022):

 CORD-19.metadata.csv.gz (416 MB) - Downloaded metadata.csv CORD-19 articles

md5sum 58252b9c3f89d1f46304ff556436c97a
sha1sum 7c29114c82b57ca968f36308ada0e05cde12998e

 CORD-19.MEDLINE.ALL.gz (282 MB) - MEDLINE Formatted CORD-19 articles

md5sum e81798fe91c54815cb119e67f16ca0b3
sha1sum f29953f28ca78c157e03677b9a0a59bff48505f1

 CORD-19.SemRep.ALL.gz (1.3 GB) - SemRep Results File for CORD-19 articles

md5sum bb9d991c4085ec30c4563f3aba350b90
sha1sum de466246d0785d3cd9243ded0bcf497f81d6f367
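
The md5sum and sha1sum values listed above can be used to confirm that a download completed without corruption. The following is a minimal Python sketch of such a check, assuming the file has already been saved locally; the helper names `file_digest` and `verify_download` are illustrative and are not part of any Indexing Initiative tool.

```python
import hashlib


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read in chunks
    so that multi-gigabyte downloads do not need to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both checksums match the published values."""
    return (
        file_digest(path, "md5") == expected_md5.lower()
        and file_digest(path, "sha1") == expected_sha1.lower()
    )
```

For example, `verify_download("CORD-19.SemRep.ALL.gz", "bb9d991c4085ec30c4563f3aba350b90", "de466246d0785d3cd9243ded0bcf497f81d6f367")` should return True for an intact copy of the CORD-19 SemRep results file, assuming it was saved under that name in the current directory.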

semmedVER42_R is a superset of semmedVER41_R that additionally includes data derived from all PubMed citations downloaded on April 30, 2020 using the following query:

( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) AND 2019:2020[dp]

The additional COVID-19 citations were processed with SemRep and the 2020AA UMLS data, which includes COVID-19 terms in CUIs C5203670, C5203671, C5203672, C5203673, C5203674, C5203675, and C5203676.
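
As a rough illustration of how these CUIs might be used, the Python sketch below scans pipe-delimited text output for lines in which any field is one of the seven COVID-19 CUIs listed above. It assumes that the output being scanned is pipe-delimited with CUIs appearing as whole fields; the function names and the idea of reading a gzipped results file directly are assumptions made for this example, not a documented workflow.

```python
import gzip

# The seven COVID-19 CUIs named in the text above.
COVID_CUIS = {
    "C5203670", "C5203671", "C5203672", "C5203673",
    "C5203674", "C5203675", "C5203676",
}


def lines_mentioning_covid(lines, cuis=COVID_CUIS):
    """Yield each line in which at least one pipe-delimited field
    is exactly one of the given CUIs."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if any(field in cuis for field in fields):
            yield line


def scan_gzip(path):
    """Stream a gzipped text file and yield its matching lines."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from lines_mentioning_covid(handle)
```

Because both functions are generators, a large results file is processed line by line rather than loaded into memory at once.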

To download the SemMedDB database, click here.

The UMLS 2020AA release contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

MetaMap Lite Downloads: https://metamap.nlm.nih.gov/MetaMapLite.shtml


MetaMap Downloads: https://metamap.nlm.nih.gov/DataSetDownload.shtml