Author: McDonald CJ
Abstract:
Downloads for LOINC Observations and Lab Orders Value Sets:
Author: Jaeger S
Abstract:
This page hosts a repository of segmented cells from the thin blood smear slide images from the Malaria Screener research activity. To reduce the burden for microscopists in resource-constrained regions and improve diagnostic accuracy, researchers at the Lister Hill National Center for Biomedical Communications (LHNCBC), part of the National Library of Medicine (NLM), have developed a mobile application that runs on a standard Android smartphone attached to a conventional light microscope. Giemsa-stained thin blood smear slides from 150 P. falciparum-infected and 50 healthy patients were collected and photographed at Chittagong Medical College Hospital, Bangladesh. The smartphone’s built-in camera acquired images of slides for each microscopic field of view. The images were manually annotated by an expert slide reader at the Mahidol-Oxford Tropical Medicine Research Unit in Bangkok, Thailand. The de-identified images and annotations are archived at NLM (IRB#12972). We applied a level-set based algorithm to detect and segment the red blood cells. The dataset contains a total of 27,558 cell images with equal instances of parasitized and uninfected cells. The patient ID is encoded in each cell name: for example, “P1” denotes the patient ID for the cell labeled “C33P1thinF_IMG_20150619_114756a_cell_179.png”. We have also included the CSV files containing the patient-ID to cell mappings for the parasitized and uninfected classes. The CSV file for the parasitized class contains 151 patient-ID entries. The slide images for the parasitized patient-ID “C47P8thinOriginal” are read from two different microscope models (Olympus and Motif). The CSV file for the uninfected class contains 201 entries, since the normal cells from the infected patients’ slides are also included in the uninfected class (151+50 = 201).
The data appear along with the publication: Rajaraman S, Antani SK, Poostchi M, Silamut K, Hossain MA, Maude RJ, Jaeger S, Thoma GR. (2018) Pre-trained convolutional neural networks as feature extractors toward improved Malaria parasite detection in thin blood smear images. PeerJ 6:e4568 https://doi.org/10.7717/peerj.4568
An improvement in performance using deep neural ensembles for malaria parasite detection in thin blood smear images was subsequently reported in PeerJ: Rajaraman S, Jaeger S, Antani SK. (2019) Performance evaluation of deep neural ensembles toward malaria parasite detection in thin-blood smear images. PeerJ 7:e6977 https://doi.org/10.7717/peerj.6977
The datasets are available at cell_images.zip, the code at malaria_cell_classification_code.zip, and the Patient-ID to cell mappings for the parasitized and uninfected classes at patientid_cellmapping_parasitized.csv and patientid_cellmapping_uninfected.csv, respectively.
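The cell naming convention described above can be parsed programmatically. The following is a minimal sketch, assuming every file name follows the same pattern as the single example given in the text; the regular expression and the function name parse_cell_name are illustrative and are not part of the released code.

```python
import re

# Assumed pattern, inferred from the one example in the description:
#   <slide prefix containing P<n>>_IMG_<date>_<time>_cell_<index>.png
CELL_NAME = re.compile(
    r"^(?P<slide>C\d+(?P<patient>P\d+)[A-Za-z]*)_IMG_.*_cell_(?P<cell>\d+)\.png$"
)


def parse_cell_name(filename):
    """Return (slide prefix, patient token, cell index), or None if the name does not match."""
    match = CELL_NAME.match(filename)
    if match is None:
        return None
    return match.group("slide"), match.group("patient"), int(match.group("cell"))


example = "C33P1thinF_IMG_20150619_114756a_cell_179.png"
print(parse_cell_name(example))  # ('C33P1thinF', 'P1', 179)
```

File names that deviate from the assumed pattern return None, so they can be collected and checked against the CSV mappings rather than silently mislabeled.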
Author: Kilicoglu H
Abstract:
472 consumer health questions submitted to NLM, de-identified and annotated for spelling errors (non-word, real-word). For more information on this dataset, see Kilicoglu et al. (AMIA 2015).
URL: https://ceb.nlm.nih.gov/ridem/infobot_docs/CHQA_SpellCorrection_Dataset.zip
Author: Demner-Fushman D, Kilicoglu H
Abstract:
1,548 Consumer Health Questions submitted to NLM, de-identified and annotated with named entities from 15 broad categories, including medical problems, drug/supplements, anatomy, and procedures. For more information on this dataset, see Kilicoglu et al. (LREC 2016).
Author: Kilicoglu H
Abstract:
181 structured drug labels (SPLs) extracted from DailyMed and annotated with three entity categories (drugs, drug classes, and substances) as well as several types of coreference relations (anaphora, cataphora, appositive, and predicate nominative). For more information on this dataset, see Kilicoglu and Demner-Fushman (PLOS ONE, 2016).
Available for download at https://github.com/kilicogluh/Bio-SCoRes/tree/master/DATA/SPL
Author: Kilicoglu H
Abstract:
In order to develop and evaluate a sortal anaphora resolution module, we annotated a corpus of 320 MEDLINE citations with pairwise sortal anaphora relations. Since we aimed at a general approach that takes into account all semantic types and consequently supports SemRep, we collected MEDLINE abstracts on a wide range of topics, including molecular biology and clinical medicine.
For further details, see http://skr3.nlm.nih.gov/SortalAnaphora/.
Author: Demner-Fushman D, Roberts K
Abstract:
Consumer Health Questions submitted to the Genetic and Rare Disease Information Center (GARD) manually labeled with question decomposition annotations. This includes sentence-level annotations (Question, Background, and Ignore), question-level annotations (Coordination, Exemplification), and a document-level annotation (Focus). For more information on this data, see Roberts et al. (LREC 2014; BioNLP 2014).
Dataset: Question Decomposition Data
Author: Demner-Fushman D, Roberts K
Abstract:
Consumer Health Questions submitted to the Genetic and Rare Disease Information Center (GARD) manually labeled with question types. Uses the question decomposition annotations (above) to break multi-sentence questions into single-sentence sub-questions. Each sub-question has one question type designed to capture a high-level information need of a consumer health question (e.g., Diagnosis, Management, Susceptibility). For more information on this data, see Roberts et al. (BioTxtM 2014; AMIA 2014).
Author: Antani SK
Abstract:
The following de-identified image data sets of chest x-rays (CXRs) are available to the research community. Both sets contain normal as well as abnormal x-rays, with the latter containing manifestations of tuberculosis.
Montgomery County X-ray Set: X-ray images in this data set have been acquired from the tuberculosis control program of the Department of Health and Human Services of Montgomery County, MD, USA. This set contains 138 posterior-anterior x-rays, of which 80 x-rays are normal and 58 x-rays are abnormal with manifestations of tuberculosis. All images are de-identified and available in DICOM format. The set covers a wide range of abnormalities, including effusions and miliary patterns. The data set includes radiology readings available as a text file. Download Link
Shenzhen Hospital X-ray Set: X-ray images in this data set have been collected by Shenzhen No.3 Hospital in Shenzhen, Guangdong province, China. The x-rays were acquired as part of the routine care at Shenzhen Hospital. The set contains images in JPEG format. There are 326 normal x-rays and 336 abnormal x-rays showing various manifestations of tuberculosis. Download Link
For additional information about these datasets, please refer to our paper.
Author: Fung K
Abstract:
SNOMED CT to ICD-10 Cross Maps (created and maintained by IHTSDO) - support epidemiological, statistical, and administrative reporting.
The map is updated and included with every International release of SNOMED CT, which can be downloaded at http://www.nlm.nih.gov/research/umls/licensedcontent/snomedctfiles.html
Author: Fung K
Abstract:
Mapping SNOMED CT codes to and from ICD codes
SNOMED CT is clinically based and oriented toward direct use by healthcare providers to document whatever is needed for patient care. ICD codes are oriented more toward coding professionals, who use them after patient care has been provided, for statistical data collection and billing. ICD codes lump less common diseases together in "catch-all" categories (for example, J15.8 Pneumonia due to other specified bacteria), which could result in loss of information. SNOMED CT has more "granular" (specific) clinical coverage than ICD: SNOMED CT (clinical finding) has 100,000 codes, ICD-10-CM has 68,000 codes, and ICD-9-CM has 14,000 codes.
Due to the differences in granularity, emphasis and organizing principles between SNOMED CT and ICD-10-CM, it is not always possible to have a one-to-one map between a SNOMED CT concept and an ICD-10-CM code. To address this challenge, the SNOMED CT to ICD-10-CM Map follows an approach that is consistent with the approach used by the IHTSDO and WHO. When there is a need to choose between alternative ICD-10-CM codes, each possible target code is represented as a “map rule” (the essence of “rule-based mapping”). Related map rules are grouped into a “map group”. Map rules within a map group are evaluated in a prescribed order at run-time, based on contextual information and co-morbidities. Each map group will resolve to at most one ICD-10-CM code. In the event that a SNOMED CT concept requires more than one ICD-10-CM code to fully represent its meaning, the map will consist of multiple map groups.
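The rule-based approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the MapRule class, the shape of the patient context dictionary, and the target strings are hypothetical placeholders, not the actual format or content of the published map.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class MapRule:
    priority: int                    # position in the prescribed evaluation order
    applies: Callable[[dict], bool]  # test against patient context (hypothetical shape)
    target: Optional[str]            # ICD-10-CM code, or None if the rule yields no code


def resolve_group(rules: List[MapRule], context: dict) -> Optional[str]:
    """Evaluate rules in order; the first rule that applies decides the group."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.applies(context):
            return rule.target
    return None


def resolve_concept(map_groups: List[List[MapRule]], context: dict) -> List[str]:
    """Each map group resolves to at most one code; a concept may need several groups."""
    codes = (resolve_group(rules, context) for rules in map_groups)
    return [code for code in codes if code is not None]


# Placeholder targets, not real ICD-10-CM codes.
groups = [
    [
        MapRule(1, lambda ctx: ctx.get("age_years", 0) < 18, "CODE-PEDIATRIC"),
        MapRule(2, lambda ctx: True, "CODE-DEFAULT"),  # fallback rule
    ],
    [MapRule(1, lambda ctx: ctx.get("has_comorbidity", False), "CODE-COMORBIDITY")],
]

print(resolve_concept(groups, {"age_years": 40, "has_comorbidity": True}))
# ['CODE-DEFAULT', 'CODE-COMORBIDITY']
```

Placing an always-true rule last in a group is one way to model a default target; whether the published map encodes defaults this way is not stated here.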
We have created the SNOMED CT to ICD-10-CM Map to support semi-automated generation of ICD-10-CM codes from clinical data encoded in SNOMED CT for reimbursement and statistical purposes.
- Download: http://www.nlm.nih.gov/research/umls/mapping_projects/snomedct_to_icd10cm.html
- Latest release in September 2014 provides ICD-10-CM maps for 54,262 SNOMED CT concepts
- Third release (35,000 SNOMED CT concepts mapped to ICD-10-CM) is anticipated for June 2013.
- Second release was in July 2012 (15,000 SNOMED CT concepts mapped to ICD-10-CM).
- First release was in February 2012 (7,000 SNOMED CT concepts mapped to ICD-10-CM).
Author: McDonald CJ, Vreeman D, Goodwin RM
Abstract:
Mapping your local laboratory test codes to LOINC can seem like a daunting task at first. Don't worry. To help you get started, we've created an empirically-based list of the most common LOINC result codes.
Knowing that relatively few codes account for much of the typical lab result volume, we think that this Top 2000+ list will be an excellent starter set. It contains just over 2,000 LOINC codes that represent about 98% of the test volume carried by three large organizations that mapped all of their lab tests to LOINC codes.
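The observation that a small number of codes covers most of the result volume can be checked against any local test catalog. The sketch below assumes a simple mapping from code to result count; the function name codes_for_coverage and the sample counts are hypothetical and are not taken from the Top 2000+ data.

```python
def codes_for_coverage(volume_by_code, target):
    """Return the most frequent codes needed to reach `target` coverage, and the coverage reached."""
    total = sum(volume_by_code.values())
    if total == 0:
        return [], 0.0
    ranked = sorted(volume_by_code.items(), key=lambda item: item[1], reverse=True)
    selected, covered = [], 0
    for code, volume in ranked:
        if covered / total >= target:
            break
        selected.append(code)
        covered += volume
    return selected, covered / total


# Made-up counts for four placeholder codes.
sample = {"A": 70, "B": 20, "C": 8, "D": 2}
selected, coverage = codes_for_coverage(sample, 0.98)
print(selected, coverage)
```

Running the same calculation on local result counts gives a rough idea of how many local codes would need to be mapped first to cover most of the volume.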
The LOINC Top 2000+ Lab Observations list is available in two varieties:
- US Version. For those who favor reporting in mass units (e.g. mg/dL)
- SI Version. For those who favor reporting in molar units (e.g. mmol/L)
To go with the Top 2000+ list, we've also written a Mapper's Guide that has a wealth of advice and guidance about which codes to choose for a given purpose. You can download it all here.
- Dataset: http://loinc.org/usage/obs
Author: Fung KW
Abstract:
Many existing electronic health record (EHR) systems contain clinical information encoded in ICD-9-CM. To facilitate migration to SNOMED CT as the primary clinical terminology for patient problems (diseases and conditions), it is desirable that the legacy ICD-9-CM data be translated to SNOMED CT. This will make it possible to compare newly collected data with historic data, and will also allow the EHR to make use of SNOMED CT to provide clinical decision support and other functions. The goal of the ICD-9-CM to SNOMED CT Map (herein referred to as “the Map”) is to facilitate the translation of legacy data and the transition to prospective use of SNOMED CT for patient problem lists. Note that this Map is not the same as, and serves different purposes from, the SNOMED CT to ICD-9-CM Map.
The most useful mappings are the one-to-one maps, in which a single SNOMED CT concept can be used to represent the full meaning of an ICD-9-CM code. This allows the automatic translation of ICD-9-CM codes into SNOMED CT codes without loss of meaning. The Map tries to identify as many one-to-one maps as possible; however, due to the differences between the two coding systems, one-to-one maps cannot be found for some ICD-9-CM codes. This is usually due to one of two reasons. Firstly, in ICD-9-CM, some codes are “catch-all” codes that encompass heterogeneous diseases or conditions (e.g. pneumonia due to other specified bacteria). These codes, commonly known as “NEC codes” (not elsewhere classified codes), will not have one-to-one maps because of their nature. Secondly, since SNOMED CT is more granular than ICD-9-CM in most disease areas, some ICD-9-CM diseases or conditions are further refined as more specific concepts in SNOMED CT. For such cases, it is not possible to map to a more specific SNOMED CT concept without the input of additional information.
The Map is published in two separate files, one containing the one-to-one maps, and the other the one-to-many maps. Also included in the files are the usage frequency of the ICD-9-CM codes, and the usage frequency of the SNOMED CT concepts from the CORE Problem List Subset data. The latter information can help users to identify the more commonly used SNOMED CT targets in the one-to-many maps.
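How the two files might be used together can be sketched as follows. This is an illustration only: the in-memory dictionaries, the translate function, and all identifiers are hypothetical placeholders, and the actual column layout of the published files should be taken from their documentation.

```python
def translate(icd9_code, one_to_one, one_to_many):
    """Translate one ICD-9-CM code, ranking one-to-many candidates by usage frequency."""
    if icd9_code in one_to_one:
        # A one-to-one map can be applied automatically without loss of meaning.
        return {"status": "automatic", "targets": [one_to_one[icd9_code]]}
    candidates = one_to_many.get(icd9_code, [])
    if candidates:
        # More commonly used SNOMED CT targets come first to help a human reviewer.
        ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
        return {"status": "needs_review", "targets": [concept for concept, _ in ranked]}
    return {"status": "unmapped", "targets": []}


# Placeholder identifiers and frequencies, not real codes.
one_to_one = {"ICD9-A": "SCT-1"}
one_to_many = {"ICD9-B": [("SCT-2", 0.1), ("SCT-3", 0.7), ("SCT-4", 0.2)]}

print(translate("ICD9-A", one_to_one, one_to_many))
print(translate("ICD9-B", one_to_one, one_to_many))
print(translate("ICD9-C", one_to_one, one_to_many))
```

The one-to-many branch only orders the candidates; choosing among them still requires additional clinical information, as the text explains.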
Mapping Methodology
Two lists were obtained from the Centers for Medicare & Medicaid Services (CMS), covering commonly used ICD-9-CM codes in short-stay and outpatient hospitals respectively, for the year 2009. SNOMED CT maps for the ICD-9-CM codes in the lists were derived primarily from two existing knowledge sources: the synonymy between ICD-9-CM and SNOMED CT terms in the Unified Medical Language System (UMLS), and the SNOMED CT to ICD-9-CM Cross Maps published in the International release of SNOMED CT. The choice of target SNOMED CT codes was limited to concepts in three hierarchies: Clinical finding, Situation with explicit context, and Events. One-to-one maps identified by UMLS synonymy were not manually validated. Algorithmically identified one-to-many maps involving fewer than five SNOMED CT targets were manually reviewed, with the intention of reducing them to one-to-one maps where possible. ICD-9-CM codes with no maps, or with one-to-many maps involving a large number of targets, were not manually reviewed.
Author: Abhyankar S, Goodwin RM, Zuckerman A, McDonald CJ
Abstract:
To help promote efficient electronic exchange of standard newborn screening data, the Lister Hill National Center for Biomedical Communications, in cooperation with the Newborn Screening Community and HITSP Population Perspective Technical Committee, developed draft guidance about the use of LOINC and SNOMED CT codes to report newborn screening test results in standard Health Level 7 (HL7) version 2.x message format.
- Annotated Example HL7 Message: https://lhncbc.nlm.nih.gov/newbornscreeningcodes/nb/sc/download/2014-09-02_NLM_HRSA_HL7_NBS_example_v6.pdf
- LOINC panel for Reporting Newborn Screening Results: https://loinc.org/54089-8
Author: Abhyankar S, Goodwin RM, Zuckerman AE, McDonald CJ
Abstract:
Includes the LOINC terms required to report all newborn screening results for all states, including variables for reporting an overall summary, for most of the card variables, and for reporting impressions, narrative guidance, and measures of quantitative markers for each condition or condition category. Think of it as a master template from which each state can select the variables it needs to report NBS results in the same organizational structure. This same information in spreadsheet format can be imported into laboratory databases - http://newbornscreeningcodes.nlm.nih.gov/nb/sc/download/54089-8_Newborn_Screening_panel_AHIC-240.xls.
- Dataset: https://loinc.org/54089-8
- Learn More: https://lhncbc.nlm.nih.gov/newbornscreeningcodes/nb/sc/constructingNBSHL7messages.html
Author: Abhyankar S, Goodwin RM, Zuckerman AE, McDonald CJ
Abstract:
Includes the LOINC terms required to report all newborn screening results for all states, including variables for reporting an overall summary, for most of the card variables, and for reporting impressions, narrative guidance, and measures of quantitative markers for each condition or condition category. Think of it as a master template from which each state can select the variables it needs to report NBS results in the same organizational structure.
- Dataset: https://loinc.org/54089-8
- Dataset in spreadsheet format (xls): http://newbornscreeningcodes.nlm.nih.gov/nb/sc/download/54089-8_Newborn_Screening_panel_AHIC-240.xls
- More guidance for e-reporting newborn screening results: http://newbornscreeningcodes.nlm.nih.gov/HL7
Author: McDonald CJ, Abhyankar S, Taft L
Abstract:
To help you standardize your units of measure, we’ve created this translation table that enumerates the UCUM syntax for many common unit patterns currently used in electronic reporting. We composed this early version in relatively short order and focused on the basics. It was based on content provided by Intermountain Healthcare, from a joint National Library of Medicine and Regenstrief Institute project analyzing raw units from more than 23 laboratory sources, and from the HL7 table of units. We excluded the units of measure for which we couldn’t find clear definitions or patterns of usage, those we believed would only be used in pharmacy dispensing, and units used for purely clinical reporting (e.g. cigarette pack-years). We have included most of the pure metric units from our sources, whether or not they apply directly to lab testing because they will be generally useful (and are pretty straightforward in UCUM).
- Dataset: http://loinc.org/usage/units
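A translation table of this kind is typically applied as a lookup from a raw unit string to its UCUM form. The sketch below assumes a small in-memory table with three illustrative entries; the entries, the normalization step, and the function name to_ucum are assumptions and are not drawn from the published table, which should be loaded from the dataset itself.

```python
# Illustrative entries only; keys are raw units after normalization.
RAW_TO_UCUM = {
    "mg/dl": "mg/dL",
    "mmol/l": "mmol/L",
    "k/ul": "10*3/uL",
}


def to_ucum(raw_unit):
    """Look up the UCUM form of a raw unit string, or return None if it is unknown."""
    # Assumption: whitespace and letter case carry no meaning in the raw source units.
    # UCUM itself is case-sensitive, so this folding is only safe on the raw side,
    # and it can merge raw units that differ only by case.
    key = "".join(raw_unit.split()).casefold()
    return RAW_TO_UCUM.get(key)


print(to_ucum("MG/DL"))       # mg/dL
print(to_ucum(" mmol / L "))  # mmol/L
print(to_ucum("pack-years"))  # None
```

Returning None for unknown units, rather than passing the raw string through, makes unmapped units easy to list for manual review.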
Author: McDonald C, Vreeman D, Goodwin RM
Abstract:
These 300 (or so) codes cover more than 95% of lab test orders in the U.S.
The LOINC Top 300 Lab Orders is a collection of universal laboratory order codes that covers the most frequent lab orders. It was created for use by developers of provider order entry systems that would deliver them in HL7 messages to laboratories where they could be understood and fulfilled. This value set was developed through both empirical and consensus-driven approaches. At only 300 codes, it does not include everything you might want to order, but it is probably a very good "starter set". This is the Laboratory Order Value Set referenced by the HITSP C80 Clinical Document and Messaging Terminology Construct (Table 2-96) and by the current HL7 Version 2.5.1 Implementation Guide: S&I Framework Laboratory Orders from EHR, Release 1, being balloted in HL7 and developed in collaboration with the HHS S&I Framework Laboratory Orders Interface Working Group.
- Dataset: https://loinc.org/usage/orders/
Author: Fung KW
Abstract:
The main purpose of the Nursing Problem List Subset of SNOMED CT is to facilitate the use of SNOMED CT as the primary coding terminology for nursing problems used in care planning, problem lists, or other summary level clinical documentation.
Author: Fung KW
Abstract:
The Route of Administration subset of SNOMED CT is a listing of the current set of terms related to the location of administration for clinical therapeutics.
Author: Fung KW
Abstract:
The CORE (Clinical Observations Recording and Encoding) Problem List Subset identifies important clinical concepts in SNOMED CT that occur frequently in the problem list. It facilitates the use of SNOMED CT for clinical documentation at the summary level.
Author: McDonald C
Abstract:
This file is an export of a key subset of the Panels and Forms represented in LOINC. The entire package of this key subset is currently available at http://loinc.org/downloads/accessory-files, in addition to separate packages for Laboratory panels, Clinical panels, Consumer Health panels, HEDIS panels, the HL7 Clinical Genetics panels, Newborn Screening panels, PhenX panels, US Government panels (including the CMS survey instruments MDSv2, MDSv3, OASIS, and CARE), and Other Survey Instruments. The hierarchical structure is represented in the file by the PARENT_ID, ID and SEQUENCE fields. The root, or top level, records in the file are those records where the PARENT_ID = ID. The records are in a Microsoft Excel spreadsheet (compressed as a zip file) with separate worksheets (tabs) for the form structure, LOINC code details, and answer lists.
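The hierarchy encoded by the PARENT_ID, ID and SEQUENCE fields can be reconstructed with a short routine. This is a minimal sketch, assuming each record has already been read into a dictionary with those three keys and that ID values are unique within the file; the function names and sample rows are hypothetical.

```python
from collections import defaultdict


def build_forest(rows):
    """Split rows into roots (PARENT_ID == ID) and children grouped by parent."""
    children = defaultdict(list)
    roots = []
    for row in rows:
        if row["PARENT_ID"] == row["ID"]:
            roots.append(row)
        else:
            children[row["PARENT_ID"]].append(row)
    # SEQUENCE gives the order of siblings under the same parent.
    for siblings in children.values():
        siblings.sort(key=lambda r: r["SEQUENCE"])
    return roots, children


def walk(row, children, depth=0):
    """Yield (depth, ID) pairs in display order, parents before their children."""
    yield depth, row["ID"]
    for child in children.get(row["ID"], []):
        yield from walk(child, children, depth + 1)


# Made-up rows: one root panel with two children and one grandchild.
rows = [
    {"PARENT_ID": 1, "ID": 1, "SEQUENCE": 1},
    {"PARENT_ID": 1, "ID": 3, "SEQUENCE": 2},
    {"PARENT_ID": 1, "ID": 2, "SEQUENCE": 1},
    {"PARENT_ID": 2, "ID": 4, "SEQUENCE": 1},
]
roots, children = build_forest(rows)
for root in roots:
    for depth, node_id in walk(root, children):
        print("  " * depth + str(node_id))
```

Reading the worksheet into dictionaries (for example with a spreadsheet library) is left out, since the exact column set of the export is not described here beyond these three fields.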
Author: Ackerman MJ
Abstract:
The publicly available Visible Human Project reference data sets are complete, anatomically detailed, three-dimensional representations of normal male and female human bodies. They include transverse CT, MR, and cryosection images. The male was sectioned at one-millimeter intervals, the female at one-third-millimeter intervals. The data sets are used in education, diagnosis, treatment planning, virtual reality, and virtual surgeries.
- Description, access information, and license agreement documents: http://www.nlm.nih.gov/research/visible/getting_data.html