CENTER FOR CLINICAL OBSERVATIONAL INVESTIGATIONS
Large clinical datasets are an essential resource for biomedical research: they can provide data on millions of patients, which increases the statistical power and reliability of findings. The data in these existing real-world, large-scale clinical datasets may reduce the need for some traditional interventional trials. However, accessing large clinical datasets can be challenging due to associated costs and license restrictions, among other barriers.
To address these challenges, the National Library of Medicine (NLM) launched a new Center for Clinical Observational Investigations (CCOI) in 2023.
As a first step, NLM is curating a list of nationally and internationally available clinical datasets. Then, using informatics, data science, and statistical analysis, NLM will create and make available dataset profiles that include key information such as participants, demographics, diseases, and other characteristics important to research. The Center will also aim to organize the data in a consistent way, in order to foster standardization across the datasets, reduce ambiguity, improve the reliability of research, and lower barriers to the use of data.
The clinical domains, visit contexts, and individual concepts used in the CCOI dataset profiles were generated from data structured according to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The OMOP CDM is a data standard designed to standardize the structure and content of observational data. For more information on the OMOP CDM, please visit OMOP Common Data Model (ohdsi.github.io). For more information on individual OMOP concepts, please visit Athena, the OMOP vocabulary library, which is searchable by OMOP concept ID, source code, and item description.
The Center for Clinical Observational Investigations (CCOI) Dataset Profiles are a free, web-based resource for researchers interested in using clinical observational datasets. Each dataset has a metadata profile composed of three components: 1) dataset overview, 2) basic statistics, and 3) concept counts. The dataset profiles are carefully curated from multiple data sources.
To ensure uniformity across the dataset profiles and enable efficient interoperability among disparate datasets, the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is used as the common standard. Whenever a dataset does not already have an OMOP mapping, it is harmonized to the OMOP CDM so that it can be profiled alongside the other datasets.
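As a rough illustration of what a "concept counts" component could look like for data structured in the OMOP CDM, the sketch below tallies, for each concept, the number of records and the number of distinct persons. It is not a description of how CCOI computes its profiles. The field names `person_id` and `condition_concept_id` follow the OMOP CDM `condition_occurrence` table, but the rows, the concept IDs (1001, 1002), and the `concept_counts` helper are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical rows shaped like the OMOP CDM condition_occurrence table.
# The concept IDs are placeholders, not real OMOP concepts.
rows = [
    {"person_id": 1, "condition_concept_id": 1001},
    {"person_id": 1, "condition_concept_id": 1001},
    {"person_id": 2, "condition_concept_id": 1001},
    {"person_id": 2, "condition_concept_id": 1002},
    {"person_id": 3, "condition_concept_id": 1002},
]


def concept_counts(records, concept_field="condition_concept_id"):
    """Return {concept_id: {"records": n, "persons": m}} for the given rows."""
    record_counts = defaultdict(int)
    persons = defaultdict(set)
    for record in records:
        concept_id = record[concept_field]
        record_counts[concept_id] += 1                 # every row counts once
        persons[concept_id].add(record["person_id"])   # each person counts once
    return {
        concept_id: {
            "records": record_counts[concept_id],
            "persons": len(persons[concept_id]),
        }
        for concept_id in record_counts
    }


counts = concept_counts(rows)
for concept_id, summary in sorted(counts.items()):
    print(concept_id, summary)
```

Reporting only aggregate counts of this kind, rather than individual-level rows, is consistent with a profile that describes a dataset without hosting it.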
The CCOI Dataset Profiles help clinical researchers discover and understand available datasets, assess project feasibility, and compare metadata across datasets, so that they can make informed decisions when selecting a dataset.
The CCOI Dataset Profiles are designed to:
The CCOI does not host the individual-level datasets or their repositories but provides a link to these data sources for access and further exploration. If a CCOI dataset is no longer available, a message will be displayed that states, "The dataset is no longer available. For more information contact the dataset provider directly".
Through our dedication to curating dataset profiles and ensuring their ethical use and inclusivity, we aim to provide a resource that allows researchers to strategically select data aligned with their research hypotheses and scientific questions, and to make informed decisions about the feasibility of their projects. By harnessing comprehensive and diverse datasets for their investigations, researchers can unlock the full potential of these resources to advance scientific knowledge and generate insights that lead to meaningful progress toward improved health and health equity for all.
Clinical observational datasets are reviewed by NLM’s CCOI for potential and continued inclusion in the CCOI Dataset Profiles using the criteria listed below.
NLM will routinely examine dataset profiles to confirm alignment with policies and best practices. If non-compliance is identified, the team will work with the dataset provider to rectify the gaps, or may exclude the dataset from the CCOI Dataset Profiles.
Content in the CCOI Dataset Profiles may be collected from data sources and repositories managed by government agencies and non-governmental organizations.
The standards specified in the CCOI Content and Inclusion Policy and the Inclusion Criteria are taken into account when assessing a dataset for inclusion. NLM is not responsible for the quality of individual datasets. Any inquiries about the datasets or their contents should be directed to the dataset provider. Inclusion of a dataset in the CCOI Dataset Profiles does not constitute an endorsement by NLM.
The beta version of the CCOI’s Dataset Profiles is being launched to gather user feedback, which will guide future development efforts.
The inaugural launch of the CCOI includes: