INFORMATION & RESOURCES

COVID-19 Related Resources

This page collects descriptions of, and links to, all of the Indexing Initiative's COVID-19 related activities and resources, so that you can find them in one place without having to search the entire site.

As of May 1, 2020, all of the Indexing Initiative tools (MetaMap Lite, MetaMap, SemRep, and MTI) have been updated to use the UMLS 2020AA release, which contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml.

We also have a special coronavirus addendum for the Specialist Lexicon (https://lhncbc.nlm.nih.gov/LSG/Projects/lexicon/current/web/index.html), since we are currently in the middle of our release schedule. This coronavirus-specific addendum is compatible with all versions of the Specialist Lexicon.

SemRep processes both the LitCovid and CORD-19 datasets on a weekly basis. Please note: due to the nature of the data, these results are provided without the need for a UMLS License Agreement.

SemRep Options Used for Processing Both Datasets:
semrep -A -N -n -S -F -Z 2020AA
     A: Anaphora resolution
     N: use_generic_domain_extension
     n: use_generic_domain_modification
     S: generic_processing
     F: full_fielded_output
     Z 2020AA: use 2020AA data
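The option list above can be kept alongside its descriptions and turned back into the command line shown. The following is only a minimal sketch: the function name and data layout are illustrative assumptions, the flags and descriptions are copied from the list above, and input/output file arguments are left out because this page does not give them.

```python
import shlex

# (flag, value, description) triples copied from the option list on this page.
SEMREP_OPTIONS = [
    ("-A", None, "Anaphora resolution"),
    ("-N", None, "use_generic_domain_extension"),
    ("-n", None, "use_generic_domain_modification"),
    ("-S", None, "generic_processing"),
    ("-F", None, "full_fielded_output"),
    ("-Z", "2020AA", "use 2020AA data"),
]


def build_semrep_command(options=SEMREP_OPTIONS, executable="semrep"):
    """Return the argument vector for the options (hypothetical helper)."""
    argv = [executable]
    for flag, value, _description in options:
        argv.append(flag)
        if value is not None:
            argv.append(value)
    return argv


# Print the assembled command; this does not run SemRep.
print(shlex.join(build_semrep_command()))
```

Keeping the descriptions next to the flags makes it easier to record exactly which settings produced a given results file.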

*** LitCovid Results (163,864 distinct citations as of August 24, 2021):

 LitCovid.RIS.gz (22 MB) - Downloaded RIS-Formatted LitCovid citations

md5sum c6ea26282ae865b36c88a9fecd8dce17
sha1sum 9774fcda74b534e9ca395db823541e8c15f6992a

 LitCovid.MEDLINE.tar.gz (45 MB) - MEDLINE Formatted LitCovid citations

md5sum bf700aa25de14efcc23d30cbef3131c5
sha1sum 9851d390b0a19691edc47826fce39bbfe25fe835

 (183 MB) - SemRep Results File for LitCovid citations

md5sum 72447f7cadbc808082065fab707cecb1
sha1sum dedb374d1be5ee926cb42da42030bfcc8718b195

*** CORD-19 Results Part II (664,177 distinct articles as of August 24, 2021):

 CORD-19.metadata.csv.2.gz (344 MB) - Downloaded metadata.csv CORD-19 articles

md5sum 32d908b67c07bffad622db079b8ca585
sha1sum 96fd4d42113a0da4ffc4042dc9ca69bc9182ac4d

 CORD-19.MEDLINE.2.tar.gz (4.4 MB) - MEDLINE Formatted CORD-19 articles

md5sum d2c1f818e287c41659efff0f30b7dbb2
sha1sum 39f4321877cdea8226ce02b522ff94748c6a333b

 CORD-19.SemRep.2.tar.gz (18 MB) - SemRep Results File for CORD-19 articles

md5sum b938669fed3097f0d7c67f6b1a8dea08
sha1sum 48d20492464f79d6aa3303d6c4c809bd1eb5a46f
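
The MD5 and SHA-1 values listed with each file can be used to confirm that a download arrived intact. A minimal sketch using only the Python standard library follows. The helper names and the local file path are assumptions for illustration, and the expected values in the example are the ones listed above for LitCovid.RIS.gz.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both digests match the published values."""
    return (
        file_digest(path, "md5") == expected_md5.lower()
        and file_digest(path, "sha1") == expected_sha1.lower()
    )


if __name__ == "__main__":
    # Assumed location of the downloaded file; adjust as needed.
    archive = Path("LitCovid.RIS.gz")
    if archive.exists():
        ok = verify_download(
            archive,
            "c6ea26282ae865b36c88a9fecd8dce17",
            "9774fcda74b534e9ca395db823541e8c15f6992a",
        )
        print("checksums match" if ok else "checksum mismatch")
```

Because the datasets are reprocessed weekly, a mismatch may simply mean that the file and the listed checksums come from different updates.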

semmedVER42_R is a superset of semmedVER41_R that additionally includes data derived from all PubMed citations downloaded on April 30, 2020 using the query:

( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) AND 2019:2020[dp]

The additional COVID-19 citations were processed with SemRep and the 2020AA UMLS data, which includes COVID-19 terms in CUIs C5203670, C5203671, C5203672, C5203673, C5203674, C5203675, and C5203676.
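
For readers who want to run the PubMed query quoted above programmatically, one option is the NCBI E-utilities ESearch endpoint. This page does not say how the citations were actually downloaded, so the sketch below is an assumption, not a description of the original process. It only builds the request URL and sends nothing; the helper name is hypothetical. A search run today will return a different set of citations than the April 30, 2020 download.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied from this page.
QUERY = (
    "( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) "
    "AND 2019:2020[dp]"
)


def build_esearch_url(term, retmax=0):
    """Build an ESearch URL for PubMed (hypothetical helper).

    retmax=0 asks only for the hit count rather than a list of PMIDs.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Print the URL; no request is sent.
print(build_esearch_url(QUERY))
```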

To download the SemMedDB database, click here.

The UMLS 2020AA release contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

MetaMap Lite Downloads: https://metamap.nlm.nih.gov/MetaMapLite.shtml


MetaMap Downloads: https://metamap.nlm.nih.gov/DataSetDownload.shtml