INFORMATION & RESOURCES

COVID-19 Related Resources

This page collects descriptions of, and links to, all of the Indexing Initiative COVID-19 related activities and resources, so that you can find them in one place without having to search the entire site.

As of May 1, 2020, all of the Indexing Initiative tools (MetaMap Lite, MetaMap, SemRep, and MTI) have been updated to use the UMLS 2020AA release, which contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml.

We also provide a special coronavirus addendum for the Specialist Lexicon (https://lhncbc.nlm.nih.gov/LSG/Projects/lexicon/current/web/index.html), since we are currently in the middle of our release schedule. This coronavirus-specific addendum is compatible with all versions of the Specialist Lexicon.

SemRep processes both the LitCovid and CORD-19 datasets on a weekly basis. Please note: due to the nature of the data, these results are provided without the need for a UMLS License Agreement.

SemRep Options Used for Processing Both Datasets:
semrep -A -N -n -S -F -Z 2021AB
     A: Anaphora resolution
     N: use_generic_domain_extension
     n: use_generic_domain_modification
     S: generic_processing
     F: full_fielded_output
     Z 2021AB: use 2021AB data

*** LitCovid Results (351,862 distinct citations as of May 15, 2023):

 LitCovid.RIS.gz (188 MB) - Downloaded RIS-Formatted LitCovid citations

md5sum 9f7949e8bf5689a30c01256fe5067721
sha1sum 80c7dd005ca3a6dff35b821360b4be7a16e57157

 LitCovid.MEDLINE.ALL.gz (52 MB) - MEDLINE Formatted LitCovid citations

md5sum 9d8c284a8a6b2340bebec30a023e4e4d
sha1sum 049bae8c6c95dfc822cfe914d2d3ebd18630a848

 LitCovid.SemRep.ALL.gz (236 MB) - SemRep Results File for LitCovid citations

md5sum 0cd803198e2e5c04528c1a6de304d8f6
sha1sum 08b1048c1de2cbdb201fd3fd044af5a4f1d3da46

*** CORD-19 Results (1,056,977 distinct articles as of July 19, 2022):

 CORD-19.metadata.csv.gz (551 MB) - Downloaded metadata.csv CORD-19 articles

md5sum a0b3f2fe6a19048e6fbcce0fb744874c
sha1sum 80f9c78f7c9d8717258b0fa681969b54b680228c

 CORD-19.MEDLINE.ALL.gz (399 MB) - MEDLINE Formatted CORD-19 articles

md5sum e8701fd1bc80957f08a0af4c85249d7b
sha1sum 09551f6867cdc1b058267b388f072dd7c9d5e94b

 CORD-19.SemRep.ALL.gz (1.6 GB) - SemRep Results File for CORD-19 articles

md5sum cba1c65dce37d97e6ab5a5e5f437ae29
sha1sum 60351ba29b5cf064fc1d2cd2616fd2ca4bad43e6
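
The MD5 and SHA-1 digests listed above can be used to confirm that a download completed without corruption. The following is a minimal Python sketch of one way to do that, not an official tool: the function names `file_digests` and `verify` are hypothetical, and it assumes the downloaded files keep the names shown in the listing.

```python
import hashlib
from pathlib import Path

# Published digests copied from the listing above: filename -> (md5, sha1).
EXPECTED = {
    "LitCovid.RIS.gz": ("9f7949e8bf5689a30c01256fe5067721",
                        "80c7dd005ca3a6dff35b821360b4be7a16e57157"),
    "LitCovid.MEDLINE.ALL.gz": ("9d8c284a8a6b2340bebec30a023e4e4d",
                                "049bae8c6c95dfc822cfe914d2d3ebd18630a848"),
    "LitCovid.SemRep.ALL.gz": ("0cd803198e2e5c04528c1a6de304d8f6",
                               "08b1048c1de2cbdb201fd3fd044af5a4f1d3da46"),
    "CORD-19.metadata.csv.gz": ("a0b3f2fe6a19048e6fbcce0fb744874c",
                                "80f9c78f7c9d8717258b0fa681969b54b680228c"),
    "CORD-19.MEDLINE.ALL.gz": ("e8701fd1bc80957f08a0af4c85249d7b",
                               "09551f6867cdc1b058267b388f072dd7c9d5e94b"),
    "CORD-19.SemRep.ALL.gz": ("cba1c65dce37d97e6ab5a5e5f437ae29",
                              "60351ba29b5cf064fc1d2cd2616fd2ca4bad43e6"),
}


def file_digests(path, chunk_size=1 << 20):
    """Return (md5, sha1) hex digests of a file, read in chunks so that
    multi-gigabyte downloads do not have to fit in memory."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def verify(path):
    """Return True if both digests of the file match the published values.

    The lookup is by file name, so the download must not be renamed.
    """
    name = Path(path).name
    if name not in EXPECTED:
        raise KeyError(f"no published digests for {name}")
    return file_digests(path) == EXPECTED[name]
```

For example, `verify("CORD-19.SemRep.ALL.gz")` returns `True` only when both digests match. Note that the digests describe the files as of the dates stated above; if a file is regenerated later, the published values would need to be updated as well.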

semmedVER42_R is a superset of semmedVER41_R that additionally includes data derived from all PubMed citations downloaded on April 30, 2020 using the query:

( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) AND 2019:2020[dp]

The additional COVID-19 citations were processed with SemRep and the 2020AA UMLS data, which includes COVID-19 terms in CUIs C5203670, C5203671, C5203672, C5203673, C5203674, C5203675, and C5203676.

To Download the SemMedDB Database click here.

The UMLS 2020AA release contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

MetaMap Lite Downloads: https://metamap.nlm.nih.gov/MetaMapLite.shtml


MetaMap Downloads: https://metamap.nlm.nih.gov/DataSetDownload.shtml