Dataset Profiles

Dataset Profiles are one method CCOI uses to reduce barriers to accessing large-scale clinical datasets. Dataset Profiles highlight, summarize and aggregate information about and within a dataset. Each profile provides key details that help researchers understand what a dataset contains.

Each Dataset Profile includes:
  • Dataset Overview

    Provides general information about the dataset, including source information, features, descriptions and other key details.

  • Basic Statistics

    Provides basic metrics for the dataset, including overall population counts and counts by year, gender, etc.

  • Concept Counts

    Provides counts by individual concepts such as conditions, procedures, etc.
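
As a rough illustration of what the Basic Statistics and Concept Counts sections summarize, the sketch below tallies a handful of made-up records shaped like OMOP CDM `person` and `condition_occurrence` rows. The column names follow the OMOP CDM, but the rows, the helper names `basic_statistics` and `concept_counts`, the use of birth year for the "by year" breakdown, and the choice to count distinct persons per concept are all assumptions made for this example. It is not a description of how CCOI builds its profiles.

```python
from collections import Counter

# Toy rows shaped like the OMOP CDM `person` table (all values are made up).
PERSONS = [
    {"person_id": 1, "gender_concept_id": 8507, "year_of_birth": 1950},
    {"person_id": 2, "gender_concept_id": 8532, "year_of_birth": 1950},
    {"person_id": 3, "gender_concept_id": 8532, "year_of_birth": 1972},
]

# Toy rows shaped like the OMOP CDM `condition_occurrence` table.
# The concept IDs are used purely as example values.
CONDITIONS = [
    {"person_id": 1, "condition_concept_id": 201826},
    {"person_id": 1, "condition_concept_id": 201826},  # repeat visit, same person
    {"person_id": 2, "condition_concept_id": 201826},
    {"person_id": 3, "condition_concept_id": 320128},
]


def basic_statistics(persons):
    """Return the overall population count plus simple breakdowns."""
    return {
        "population": len({p["person_id"] for p in persons}),
        "by_gender": dict(Counter(p["gender_concept_id"] for p in persons)),
        "by_birth_year": dict(Counter(p["year_of_birth"] for p in persons)),
    }


def concept_counts(rows, concept_column):
    """Count distinct persons per concept (an assumed counting rule)."""
    # De-duplicate (person, concept) pairs so repeat records count once.
    pairs = {(r["person_id"], r[concept_column]) for r in rows}
    return dict(Counter(concept for _, concept in pairs))


if __name__ == "__main__":
    print(basic_statistics(PERSONS))
    print(concept_counts(CONDITIONS, "condition_concept_id"))
```

Counting distinct persons rather than raw records is only one possible convention; a profile could equally report record counts, in which case the de-duplication step would be dropped.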

OMOP

To ensure uniformity across dataset profiles and enable efficient interoperability between disparate datasets, the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is used to harmonize each dataset alongside the others whenever an OMOP mapping does not already exist.
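
To give a sense of what harmonizing to the OMOP CDM can involve, the sketch below converts one record from a hypothetical source layout into a row shaped like the OMOP `person` table. The source field names (`patient_id`, `sex`, `birth_date`), the mapping table and the function `to_omop_person` are assumptions for illustration only. Real mappings cover many more tables and vocabularies, and this is not CCOI's actual pipeline.

```python
# Assumed mapping from source sex codes to OMOP gender concept IDs.
# 8507 (male) and 8532 (female) are standard OMOP gender concepts;
# 0 is the OMOP convention for "no matching concept".
GENDER_TO_CONCEPT = {"M": 8507, "F": 8532}


def to_omop_person(source_row):
    """Convert one hypothetical source record into an OMOP-style person row."""
    return {
        "person_id": source_row["patient_id"],
        "gender_concept_id": GENDER_TO_CONCEPT.get(source_row["sex"], 0),
        # Assumes birth_date is an ISO 8601 string such as "1961-03-02".
        "year_of_birth": int(source_row["birth_date"][:4]),
    }


if __name__ == "__main__":
    example = {"patient_id": 7, "sex": "F", "birth_date": "1961-03-02"}
    print(to_omop_person(example))
```

Once every source is expressed in the same shape, the same counting logic can be run across datasets without per-dataset special cases, which is the interoperability benefit the paragraph above describes.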

Recently Added

All of Us

The All of Us Research Program is a large-scale, United States-based research program that began nationwide enrollment in May 2018 and intends to recruit more than one million participants. All of Us integrates electronic health record (EHR) data with survey questionnaires, genomic data and wearable device data to develop a diverse, information-rich database that serves as a central point for many secondary research studies and reduces the need to develop single-use, study-specific data collection protocols. The program includes two tiers of data access: the Registered tier and the more restricted Controlled tier.

CMS VRDC

The Centers for Medicare & Medicaid Services (CMS) Virtual Research Data Center (VRDC) collection contains populated claim forms and administrative metadata describing individual providers, facilities, patients, care plans and transactions known to CMS. The data is sourced from Medicare, Medicaid, Children's Health Insurance Program (CHIP) and Social Security Disability Insurance (SSDI) encounters, among others.

CPRD AURUM

Clinical Practice Research Datalink (CPRD) is a research service provided by the Medicines and Healthcare products Regulatory Agency with support from the National Institute for Health and Care Research (NIHR), as part of the United Kingdom (UK) Department of Health and Social Care. CPRD includes de-identified, patient-level electronic health record data from a network of general practitioners across the UK.

UK Biobank

The UK Biobank program is a large health and biomedical database that serves multiple retrospective, observational studies. UK Biobank includes data from over half a million participants between the ages of 40 and 69 from the United Kingdom. UK Biobank contains a combination of health, questionnaire and genetic data that is regularly updated and enriched with new data fields.