INFORMATION & RESOURCES

COVID-19 Related Resources

This page provides descriptions of, and links to, all of the Indexing Initiative COVID-19 related activities and resources. It serves as a single landing page so you can find these resources without having to go through the entire site.

As of May 1, 2020 - All of the Indexing Initiative tools (MetaMap Lite, MetaMap, SemRep, and MTI) have been updated to use the UMLS 2020AA release, which contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

We also provide a special coronavirus addendum for the Specialist Lexicon (https://lhncbc.nlm.nih.gov/LSG/Projects/lexicon/current/web/index.html), since we are currently in the middle of our release schedule. This coronavirus-specific addendum is compatible with all versions of the Specialist Lexicon.

SemRep processes both the LitCovid and CORD-19 datasets on a weekly basis. Please note: Due to the nature of the data, these results are provided without the need for a UMLS License Agreement.

SemRep Options Used for Processing Both Datasets:
semrep -A -N -n -S -F -Z 2021AB
     A: Anaphora resolution
     N: use_generic_domain_extension
     n: use_generic_domain_modification
     S: generic_processing
     F: full_fielded_output
     Z 2021AB: use 2021AB data

*** LitCovid Results (375,791 distinct citations as of September 1, 2023):

 LitCovid.RIS.gz (203 MB) - Downloaded RIS-Formatted LitCovid citations

md5sum de5c217c7a7b020b96e70156b0ccbb71
sha1sum 7a8eda90baee6521346c9944165b3118bb74d598

 LitCovid.MEDLINE.ALL.gz (53 MB) - MEDLINE Formatted LitCovid citations

md5sum 9659a41ffe322063dd77e4a95bcc7051
sha1sum 1c67faaf9d72e6d3c56f764b6e7b873f8b90c849

 LitCovid.SemRep.ALL.gz (241 MB) - SemRep Results File for LitCovid citations

md5sum 6d3d2f53af0ef398796c8107d0beb5fc
sha1sum 2aff000bee7a57909d567ba187e14a5d813420ae

*** CORD-19 Results (1,056,977 distinct articles as of July 19, 2022):

 CORD-19.metadata.csv.gz (551 MB) - Downloaded metadata.csv CORD-19 articles

md5sum a0b3f2fe6a19048e6fbcce0fb744874c
sha1sum 80f9c78f7c9d8717258b0fa681969b54b680228c

 CORD-19.MEDLINE.ALL.gz (399 MB) - MEDLINE Formatted CORD-19 articles

md5sum e8701fd1bc80957f08a0af4c85249d7b
sha1sum 09551f6867cdc1b058267b388f072dd7c9d5e94b

 CORD-19.SemRep.ALL.gz (1.6 GB) - SemRep Results File for CORD-19 articles

md5sum cba1c65dce37d97e6ab5a5e5f437ae29
sha1sum 60351ba29b5cf064fc1d2cd2616fd2ca4bad43e6
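
The MD5 and SHA-1 values listed with each file above can be used to confirm that a download completed intact. The following is a minimal Python sketch of that check, not an official tool: the function names file_digests and verify_download are hypothetical, and the sketch assumes the file has already been saved locally under the name shown in the listing.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for a file, reading it in chunks.

    Chunked reading keeps memory use low for large archives.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both digests match the published values."""
    md5_hex, sha1_hex = file_digests(path)
    return (md5_hex == expected_md5.strip().lower()
            and sha1_hex == expected_sha1.strip().lower())


# Example call (assumes CORD-19.MEDLINE.ALL.gz is in the current directory):
# verify_download(
#     "CORD-19.MEDLINE.ALL.gz",
#     "e8701fd1bc80957f08a0af4c85249d7b",
#     "09551f6867cdc1b058267b388f072dd7c9d5e94b",
# )
```

Both digests are computed in a single pass over the file, so even the largest archive only needs to be read once. A mismatch usually indicates a truncated or corrupted download, or that the file on the server has since been regenerated.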

semmedVER42_R is a superset of semmedVER41_R that additionally includes data derived from all PubMed citations downloaded on April 30, 2020 using the query:

( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) AND 2019:2020[dp]

The additional COVID-19 citations were processed with SemRep and the 2020AA UMLS data, which includes COVID-19 terms in CUIs C5203670, C5203671, C5203672, C5203673, C5203674, C5203675, and C5203676.
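
For readers who want to reissue the PubMed query quoted above, the following is a minimal Python sketch that builds a request URL for the NCBI E-utilities ESearch service. This page does not state how the original download was performed, so the use of E-utilities here is an assumption for illustration only. The function name build_esearch_url is hypothetical, and the sketch only constructs the URL without sending a request. A search run today would not reproduce the April 30, 2020 citation set exactly, since PubMed has continued to add and update records.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied verbatim from this page.
COVID_QUERY = (
    "( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) "
    "AND 2019:2020[dp]"
)


def build_esearch_url(term, retmax=0):
    """Build an ESearch URL for a PubMed query.

    retmax=0 asks only for the hit count; raise it to get PMIDs back.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
    }
    # urlencode percent-escapes brackets, parentheses, and spaces in the term.
    return EUTILS_ESEARCH + "?" + urlencode(params)


url = build_esearch_url(COVID_QUERY)
```

The resulting URL can be fetched with any HTTP client. NCBI asks E-utilities users to keep within its published request rate limits.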

To download the SemMedDB database, click here.

The UMLS 2020AA release contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

MetaMap Lite Downloads: https://metamap.nlm.nih.gov/MetaMapLite.shtml


MetaMap Downloads: https://metamap.nlm.nih.gov/DataSetDownload.shtml