Dataset Profiles

Dataset Profiles are one way CCOI reduces barriers to accessing large-scale clinical datasets. Each profile highlights, summarizes, and aggregates information about and within a dataset, giving researchers the key details they need to understand what the dataset provides.

The CCOI provides three categories of information within a Dataset Profile:
  • Dataset Overview

    Provides general information about datasets including source information, features, descriptions, and other key details.

  • Basic Statistics

    Provides basic metrics of the dataset as a whole including overall population counts and counts by year, gender, etc.

  • Concept Counts

    Provides counts by individual concepts such as conditions, procedures, etc.
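As a rough illustration of the Basic Statistics and Concept Counts categories, the sketch below computes both from a handful of invented rows. The field names (`person_id`, `gender`, `year_of_birth`, `concept`) and the functions `basic_statistics` and `concept_counts` are assumptions made for this example, not CCOI's actual schema or tooling. Counting distinct persons per concept, rather than raw records, is also an assumption.

```python
from collections import Counter

# Invented toy rows; field names are assumptions for illustration only.
persons = [
    {"person_id": 1, "gender": "F", "year_of_birth": 1950},
    {"person_id": 2, "gender": "M", "year_of_birth": 1962},
    {"person_id": 3, "gender": "F", "year_of_birth": 1962},
]
conditions = [
    {"person_id": 1, "concept": "hypertension"},
    {"person_id": 1, "concept": "hypertension"},  # repeat record, same person
    {"person_id": 2, "concept": "hypertension"},
    {"person_id": 3, "concept": "type 2 diabetes"},
]


def basic_statistics(persons):
    """Whole-dataset metrics: total population plus counts by gender and year."""
    return {
        "total": len(persons),
        "by_gender": dict(Counter(p["gender"] for p in persons)),
        "by_year_of_birth": dict(Counter(p["year_of_birth"] for p in persons)),
    }


def concept_counts(conditions):
    """Number of distinct persons with at least one record of each concept."""
    # De-duplicate (concept, person) pairs so repeat records count once.
    pairs = {(c["concept"], c["person_id"]) for c in conditions}
    return dict(Counter(concept for concept, _ in pairs))


stats = basic_statistics(persons)
counts = concept_counts(conditions)
```

Here `stats` holds the population-level summary and `counts` holds one entry per concept; a real profile would compute the same kind of aggregates over far larger tables.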

OMOP

To keep Dataset Profiles uniform and make disparate datasets interoperable, each dataset is harmonized to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) whenever an OMOP mapping does not already exist, so that it can be used alongside other datasets.
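The idea behind harmonization can be sketched as a lookup from each dataset's own codes to a shared concept identifier. Everything in the example is hypothetical: the vocabulary names (`VOCAB_A`, `VOCAB_B`), the source codes, the concept IDs, and the `harmonize` function are invented for illustration and do not reflect real OMOP vocabulary content or CCOI's mapping process. The use of 0 for unmapped codes mirrors the OMOP convention of concept ID 0 meaning "no matching concept".

```python
# Hypothetical lookup: (source vocabulary, source code) -> shared concept ID.
# All names, codes, and IDs here are made up for illustration.
SOURCE_TO_STANDARD = {
    ("VOCAB_A", "A-123"): 1001,  # dataset A's code for some condition
    ("VOCAB_B", "B-9Z"): 1001,   # dataset B's code for the same condition
    ("VOCAB_A", "A-456"): 1002,
}
UNMAPPED = 0  # concept ID used when no mapping is available


def harmonize(records):
    """Rewrite source-coded records onto the shared concept IDs."""
    harmonized = []
    for rec in records:
        key = (rec["vocabulary"], rec["code"])
        harmonized.append(
            {
                "person_id": rec["person_id"],
                "concept_id": SOURCE_TO_STANDARD.get(key, UNMAPPED),
            }
        )
    return harmonized


dataset_a = [{"person_id": 1, "vocabulary": "VOCAB_A", "code": "A-123"}]
dataset_b = [
    {"person_id": 7, "vocabulary": "VOCAB_B", "code": "B-9Z"},
    {"person_id": 8, "vocabulary": "VOCAB_B", "code": "B-unknown"},
]

rows_a = harmonize(dataset_a)
rows_b = harmonize(dataset_b)
```

After harmonization, records from both datasets that describe the same condition carry the same `concept_id`, which is what allows counts to be compared across profiles.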

Recently Added

All of Us

A program that integrates Electronic Health Records (EHR) with survey questionnaires to develop a diverse, information-rich database. The database serves as a central resource for many secondary research studies and reduces the need to develop single-use, study-specific data collection protocols.

CMS VRDC

The Centers for Medicare and Medicaid Services (CMS) Virtual Research Data Center (VRDC) collection contains populated claim forms and administrative metadata describing the individual providers, facilities, patients, care plans, and transactions known to CMS.

CPRD AURUM

Clinical Practice Research Datalink (CPRD) is a real-world research service supporting retrospective and prospective public health and clinical studies. CPRD includes de-identified, patient-level Electronic Health Record data from a network of UK-based general practitioners (GPs). This profile covers CPRD AURUM, which includes data collected from practices that use EMIS clinical systems.

UK Biobank

The UK Biobank program is a large health and biomedical database that serves multiple retrospective, observational studies and includes over half a million participants between the ages of 40 and 69 from the United Kingdom. UK Biobank contains a combination of health, questionnaire and genetic data that is regularly updated and enriched with new data fields.