INFORMATION & RESOURCES

COVID-19 Related Resources

This page collects descriptions of, and links to, all Indexing Initiative COVID-19 related activities and resources, so that you can find them in one place without searching the entire site.

As of May 1, 2020, all of the Indexing Initiative tools (MetaMap Lite, MetaMap, SemRep, and MTI) have been updated to use the UMLS 2020AA release, which contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml.

We also provide a special coronavirus addendum for the Specialist Lexicon (https://lhncbc.nlm.nih.gov/LSG/Projects/lexicon/current/web/index.html), because we are currently in the middle of our release schedule. This coronavirus-specific addendum is compatible with all versions of the Specialist Lexicon.

SemRep processes both the LitCovid and CORD-19 datasets on a weekly basis. Please note: these results are provided without the need for a UMLS License Agreement due to the nature of the data.

SemRep Options Used for Processing Both Datasets:
semrep -A -N -n -S -F -Z 2020AA
     A: Anaphora resolution
     N: use_generic_domain_extension
     n: use_generic_domain_modification
     S: generic_processing
     F: full_fielded_output
     Z 2020AA: use 2020AA data
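
The option list above can be read as a simple mapping from flag to meaning. The following is a minimal sketch that splits the listed command line and pairs each flag with the description given on this page. The helper name describe_options is illustrative and not part of SemRep. The sketch does not run SemRep and makes no assumption about input or output file arguments, which depend on your installation.

```python
import shlex

# Command line exactly as listed on this page.
SEMREP_COMMAND = "semrep -A -N -n -S -F -Z 2020AA"

# Option meanings as described on this page.
OPTION_MEANINGS = {
    "-A": "Anaphora resolution",
    "-N": "use_generic_domain_extension",
    "-n": "use_generic_domain_modification",
    "-S": "generic_processing",
    "-F": "full_fielded_output",
    "-Z": "use 2020AA data",
}

# Options that consume the following token as their value.
OPTIONS_WITH_VALUE = {"-Z"}


def describe_options(command):
    """Return (flag, meaning) pairs for each option in the command line."""
    tokens = shlex.split(command)[1:]  # drop the program name
    described = []
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        if flag in OPTIONS_WITH_VALUE:
            # Keep the flag together with its value, e.g. "-Z 2020AA".
            described.append((f"{flag} {tokens[i + 1]}", OPTION_MEANINGS[flag]))
            i += 2
        else:
            described.append((flag, OPTION_MEANINGS[flag]))
            i += 1
    return described


if __name__ == "__main__":
    for flag, meaning in describe_options(SEMREP_COMMAND):
        print(f"{flag:<10} {meaning}")
```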

*** LitCovid Results (151,586 distinct citations as of July 19, 2021):

 LitCovid.RIS.gz (21 MB) - Downloaded RIS-Formatted LitCovid citations

md5sum b54a016aa83e7035e61574ddc565c433
sha1sum 068b4cc88ea133d6716f90857087eb87b2322e0a

 LitCovid.MEDLINE.tar.gz (44 MB) - MEDLINE Formatted LitCovid citations

md5sum 23c0e9b83a2d16813b352ecd0777e96d
sha1sum 655f76facd0ea8729f570917d0b503bfc43a70ee

 (181 MB) - SemRep Results File for LitCovid citations

md5sum 269275ce96abe2c20fc05a4a00bf4f56
sha1sum bd7640f11548171f8b954cc59bcede06e77d6c43

*** CORD-19 Results Part II (623,267 distinct articles as of July 19, 2021):

 CORD-19.metadata.csv.2.gz (328 MB) - Downloaded metadata.csv CORD-19 articles

md5sum 2a907ae1785955acdabc27173046e989
sha1sum 780c61fa3845276f01890df887569a12fbfb6e5a

 CORD-19.MEDLINE.2.tar.gz (246 MB) - MEDLINE Formatted CORD-19 articles

md5sum ac374e8d8bcf9bf86b9a0f807a4a6a4d
sha1sum 2580c6e681098e868c02576c6285a5630454fdbf

 CORD-19.SemRep.2.tar.gz (975 MB) - SemRep Results File for CORD-19 articles

md5sum ec6cecd5c5d022fb74ee4569183bb292
sha1sum 2972096c8cc6a480d3c922fdcbedf71e44cc9507
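
Each download above is listed with its MD5 and SHA-1 digests. The following is a minimal sketch of checking a downloaded file against them, assuming the file has already been saved locally. The helper names (digest_chunks, read_chunks, verify_file) and the local path in the usage note are illustrative and not part of the Indexing Initiative tools. The expected values shown are the ones published above for LitCovid.RIS.gz.

```python
import hashlib

# Published digests copied from the listing above for LitCovid.RIS.gz.
EXPECTED = {
    "md5": "b54a016aa83e7035e61574ddc565c433",
    "sha1": "068b4cc88ea133d6716f90857087eb87b2322e0a",
}


def digest_chunks(chunks, algorithm):
    """Return the hex digest of an iterable of byte chunks."""
    hasher = hashlib.new(algorithm)
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


def read_chunks(path, size=1 << 20):
    """Yield a file's contents in blocks so large archives are not loaded whole."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(size)
            if not block:
                break
            yield block


def verify_file(path, expected):
    """Return {algorithm: True/False} for each expected digest of the file."""
    return {
        algorithm: digest_chunks(read_chunks(path), algorithm) == wanted.lower()
        for algorithm, wanted in expected.items()
    }
```

For example, verify_file("LitCovid.RIS.gz", EXPECTED) returns a dictionary with one entry per algorithm. Any False value means the local copy does not match the published digest, for instance because the download was incomplete or because the weekly files have since been regenerated.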

semmedVER42_R is a superset of semmedVER41_R that additionally includes data derived from all PubMed citations downloaded on April 30, 2020 using the query:

( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) AND 2019:2020[dp]
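
This page does not say how the citations were retrieved. As one possible way to run the same query programmatically, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint. Using E-utilities here is an assumption, as are the helper name build_esearch_url and the retmax and retmode values. The sketch only constructs the URL and performs no network request.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query text exactly as given on this page.
QUERY = (
    "( covid OR sars-cov-2 OR wuhan OR coronavirus OR 2019-ncov OR sars ) "
    "AND 2019:2020[dp]"
)


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL for PubMed; retmax and retmode are illustrative."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_esearch_url(QUERY)
    print(url)
    # Round-trip check: the encoded term decodes back to the original query.
    print(parse_qs(urlparse(url).query)["term"][0] == QUERY)
```

Running the query today would return more citations than the April 30, 2020 download described here, because PubMed has continued to add records matching these terms.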

The additional COVID-19 citations were processed with SemRep and the 2020AA UMLS data, which includes COVID-19 terms in CUIs C5203670, C5203671, C5203672, C5203673, C5203674, C5203675, and C5203676.

To download the SemMedDB database, click here.

The UMLS 2020AA release contains a small set of COVID-19 related terms that our tools can identify. For a complete list of the specific strings included, please review our COVID-19 Terms Page: https://metamap.nlm.nih.gov/Covid19Terms.shtml

MetaMap Lite Downloads: https://metamap.nlm.nih.gov/MetaMapLite.shtml


MetaMap Downloads: https://metamap.nlm.nih.gov/DataSetDownload.shtml