PROJECTS
Health information standards and discovery research focuses on the development of methods to gain insights from large health databases while learning the strengths and weaknesses of datasets and improving them, when possible. This area of research assesses whether specific standards are fit for purpose (e.g., quality assurance and interoperability assessments of biomedical terminologies) and investigates standards in action (e.g., in support of tasks such as natural language processing, annotation, data integration, and mapping across terminologies).
The Center for Clinical Observational Investigations (CCOI) seeks to reduce barriers to accessing data for researchers through an evolving, multi-pronged approach. As initial steps, this project includes curating a list of clinical datasets and cataloging high-level data into Dataset Profiles.
The goal of this project is to develop a tool that can generate data entry forms dynamically based on specifications stored in a database. The development platform is Ruby on Rails, an open-source web application framework. Developers are using this tool in the data capture function of personal health records. They are also using several terminology resources from the UMLS (e.g., RxNorm, ICD-9-CM) in data entry fields that require a set of controlled terms. Further development will involve work with very large databases of de-identified patient data. The goal is to create additional reusable software tools, some of which will involve biostatistical analysis with the "R" package.
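The idea of generating a form from specifications stored in a database can be illustrated with a minimal sketch. It is written in Python with SQLite for brevity rather than in Ruby on Rails, and it is not the project's implementation: the table name `form_fields`, its columns, and the helper functions are all hypothetical.

```python
import sqlite3
from html import escape


def build_demo_db():
    """Create an in-memory database holding a hypothetical form specification."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE form_fields ("
        "form_id INTEGER, position INTEGER, name TEXT, "
        "label TEXT, field_type TEXT, options TEXT)"
    )
    conn.executemany(
        "INSERT INTO form_fields VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "drug_name", "Medication", "text", None),
            # Options are stored as a pipe-separated list (an assumption of this sketch).
            (1, 2, "route", "Route", "select", "Oral|Topical|Injection"),
        ],
    )
    return conn


def load_field_specs(conn, form_id):
    """Read the field specifications for one form, in display order."""
    rows = conn.execute(
        "SELECT name, label, field_type, options FROM form_fields "
        "WHERE form_id = ? ORDER BY position",
        (form_id,),
    ).fetchall()
    return [
        {"name": n, "label": l, "type": t, "options": o.split("|") if o else []}
        for n, l, t, o in rows
    ]


def render_form(specs):
    """Turn field specifications into HTML; unknown types fall back to text inputs."""
    parts = ["<form>"]
    for spec in specs:
        name = escape(spec["name"], quote=True)
        parts.append(f'<label for="{name}">{escape(spec["label"])}</label>')
        if spec["type"] == "select":
            parts.append(f'<select id="{name}" name="{name}">')
            for option in spec["options"]:
                parts.append(f"<option>{escape(option)}</option>")
            parts.append("</select>")
        else:
            parts.append(f'<input id="{name}" name="{name}" type="text">')
    parts.append("</form>")
    return "\n".join(parts)
```

Calling `render_form(load_field_specs(build_demo_db(), 1))` returns the markup for the two demo fields. Because the layout lives in the table rather than in code, adding a row adds a field without changing the rendering logic. In a setup like this, a controlled-term field could draw its option list from a terminology table instead of a stored string.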
We apply cutting-edge data science approaches, including artificial intelligence and machine learning, to existing large-scale clinical datasets (LSCDs) and regroup the data so that people with HIV who are highly similar to one another are placed in their own cohorts. Research conducted on such cohorts is expected to be more reproducible, and its conclusions more robust. We will do this by automating the segmentation of people who are described in LSCDs and living with HIV. We will segment their clinical events into cohorts with reproducible cohort definitions. Our reproducible cohort definitions can be used for designing novel studies or for comparing LSCDs to one another before a study begins, to support choosing an LSCD intentionally. Nationality, demography, geography, treatment era, comorbidities, and preexisting conditions (prior to HIV infection) should inform treatment outcomes and efficacy when studying people living with HIV.
The current version of LHC-Forms is at https://lhcforms.nlm.nih.gov/.
This is a collection of components used to create forms for use in Electronic Health Records.
The Medical Informatics Pioneers oral history project is here.
Oral history is a method for documenting history in a vivid way by recording the voices of those who have experienced it.
Beginning in 2004, Drs. Joan S. Ash and Dean F. Sittig chose and interviewed 17 medical informatics pioneers to capture their memories.
In 2013, NLM acquired the transcripts from the first 15 interviews and began work to make them publicly available, including recruiting and placing photographs to enliven the written words.
The LHNCBC Medical Terminology Standards project seeks to facilitate the development, promotion, and dissemination of health data standards, as well as to support the use of terminology standards in health care, public health, and research. The Project focuses on the integration, dissemination, quality assurance and applications of drug ontologies and on quality assurance in biomedical ontologies. We also develop application programming interfaces (APIs) and browsers for RxNorm and related drug resources.
Visit RxNav at https://rxnav.nlm.nih.gov/.
The RxNorm browser RxNav and its application programming interfaces (APIs) support the adoption and distribution of RxNorm, the NLM standard terminology for drugs. RxNav and its companion APIs also extend the scope of RxNorm by linking RxNorm drugs to physician-friendly terms (RxTerms) and drug classes (RxClass). RxMix allows users to combine API functions to build applications. RxNav-in-a-Box provides users with a locally installable version of the APIs and applications.
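As a rough illustration of how a client might call such an API, the sketch below builds a name-lookup request and parses the reply using only the Python standard library. The `/REST/rxcui.json` path and the `idGroup`/`rxnormId` response shape are assumptions based on the public RxNorm API and should be checked against the current API documentation. The function names are hypothetical, and the identifier in the usage note is a placeholder, not a real RxNorm concept.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed REST base path under the RxNav host named in the text.
RXNAV_REST_BASE = "https://rxnav.nlm.nih.gov/REST"


def rxcui_lookup_url(drug_name):
    """Build the URL for looking up RxNorm identifiers by drug name."""
    return f"{RXNAV_REST_BASE}/rxcui.json?{urlencode({'name': drug_name})}"


def parse_rxcui_response(payload):
    """Extract the list of identifiers from a JSON reply (assumed shape)."""
    data = json.loads(payload)
    return data.get("idGroup", {}).get("rxnormId", [])


def fetch_rxcuis(drug_name, timeout=10):
    """Perform the lookup over the network; requires internet access."""
    with urlopen(rxcui_lookup_url(drug_name), timeout=timeout) as response:
        return parse_rxcui_response(response.read().decode("utf-8"))
```

For example, `parse_rxcui_response('{"idGroup": {"rxnormId": ["12345"]}}')` returns `["12345"]`, and a reply without identifiers yields an empty list. Keeping URL construction and parsing separate from the network call makes it straightforward to point the same code at a local RxNav-in-a-Box installation by changing the base URL.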
The current version of RxTerms is at https://mor.nlm.nih.gov/RxTerms/.
RxTerms is a drug interface terminology derived from RxNorm for prescription writing or medication history recording (e.g., in e-prescribing systems and PHRs). RxTerms is free to use (see terms and conditions). It links directly to RxNorm, the U.S. drug terminology standard, and facilitates inclusion of RxNorm identifiers in electronic health records.
Natural language processing (NLP), or text mining, research at the Lister Hill National Center for Biomedical Communications (LHNCBC) focuses on the development and evaluation of computer algorithms for automated text analysis. This area of research works primarily with text from the biomedical literature or electronic medical records and examines a wide variety of NLP tasks, including information extraction, literature searches, question answering, and text summarization.
The current version of NLM-Scrubber, the NLM HIPAA-compliant clinical text de-identification tool, is at https://scrubber.nlm.nih.gov/.
LHNCBC is developing a new software application that is capable of de-identifying many kinds of clinical reports with high accuracy. The software design uses a number of deterministic and probabilistic pattern recognition algorithms and various computational linguistic methods. The application accepts narrative reports in plain text or in HL7 format. When the reports are formatted as HL7 messages, the application leverages the labeled patient-related information embedded in various HL7 segments to find such information in the free text narrative.
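The HL7-assisted step described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the application's algorithm: it reads only the patient identifier (PID-3) and patient name (PID-5) fields of an HL7 v2 message and masks exact, case-insensitive matches. The function names and the `[REDACTED]` placeholder are hypothetical.

```python
import re


def patient_terms_from_hl7(message):
    """Collect patient identifiers and name parts from PID segments."""
    terms = set()
    # HL7 v2 segments are separated by carriage returns; newlines are tolerated too.
    for segment in re.split(r"[\r\n]+", message.strip()):
        fields = segment.split("|")
        if fields[0] != "PID":
            continue
        if len(fields) > 3:  # PID-3: patient identifier list
            for repetition in fields[3].split("~"):
                identifier = repetition.split("^")[0]
                if identifier:
                    terms.add(identifier)
        if len(fields) > 5:  # PID-5: family^given^middle name components
            for repetition in fields[5].split("~"):
                for component in repetition.split("^")[:3]:
                    if len(component) > 1:  # skip single-letter initials
                        terms.add(component)
    return terms


def redact(narrative, terms, placeholder="[REDACTED]"):
    """Replace whole-word occurrences of the known terms in free text."""
    if not terms:
        return narrative
    # Longest terms first, so a longer term wins over any shorter prefix of it.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(placeholder, narrative)
```

Given a message whose PID segment reads `PID|1||998877^^^HOSP^MR||DOE^JANE^Q`, the narrative "Jane Doe (MRN 998877) was seen today." becomes "[REDACTED] [REDACTED] (MRN [REDACTED]) was seen today." A sketch like this covers only the deterministic part of the task. Misspellings, nicknames, and identifiers absent from the structured segments would still need the probabilistic and linguistic methods mentioned above.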
The current version of the SPECIALIST Lexicon and NLP Tools is at https://lhncbc.nlm.nih.gov/LSG. LHNCBC's Lexical Systems Group develops and maintains the SPECIALIST Lexicon and the tools that support and exploit it. The SPECIALIST Lexicon and NLP Tools are at the center of NLM's natural language research, providing a foundation for all our natural language processing efforts. In general, we investigate the contributions that natural language processing techniques can make to the task of mediating between the language of users and the language of online biomedical information resources. The SPECIALIST NLP Tools facilitate natural language processing by helping application developers with lexical variation and text analysis tasks in the biomedical domain.
Image processing focuses on data science research in biomedical image and signal processing, artificial intelligence, and machine learning to support automated clinical decision-making in disease screening and diagnostics. This area of research includes image and text analysis for clinical research, exploration of visual content relevant to disease in images and video, and visual information retrieval for embedding automated decision-support systems in diagnostic and treatment pathways.
Advances in machine learning and artificial intelligence techniques promise to support rapid, accurate, and reliable computer-assisted disease screening. Such techniques are particularly valuable in overburdened or resource-constrained regions, which also tend to exhibit a high prevalence of infectious diseases and report high mortality. Our research in machine learning and artificial intelligence algorithms aims to improve the accuracy and reliability of disease detection, with the further goal of explaining algorithm behavior.
To improve malaria diagnostics, the Lister Hill National Center for Biomedical Communications, an R&D division of the U.S. National Library of Medicine, in collaboration with NIH’s National Institute of Allergy and Infectious Diseases (NIAID) and Mahidol-Oxford University, is developing a fully automated system for parasite detection and counting in blood films.