
Data and Research Core (DRC)
Address research priorities and needs for conducting AI/ML research targeting the use of electronic health records
About
The AIM-AHEAD Coordinating Center (A-CC) is a consortium of institutions and organizations that have a core mission to advance health across all American communities. The A-CC consists of 4 Cores:
The Leadership/Administrative Core leads the A-CC; recruits and coordinates consortium members; manages projects, partnerships, stakeholder engagement, and outreach to develop AI/ML-talented researchers in health research; and establishes trusted relationships with key stakeholders to enhance the volume and quality of data used in AI/ML research.
The Data Science Training Core assesses, develops and implements a robust data science training curriculum and workforce development resources in AI/ML.
The Data and Research Core addresses research priorities and needs by linking and preparing multiple sources and types of research data. To accomplish its mission, the Data and Research Core facilitates the extraction and transformation, for research use, of electronic health record (EHR) data and data on lifestyle contributors to health.
The Infrastructure Core assesses data, computing and software infrastructure models, tools, resources, data science policies, and AI/ML computing models to facilitate AI/ML and health research; and establishes pilot data and analysis environments to accelerate overall A-CC aims.
The mission of the AIM-AHEAD Data and Research Core (DRC) is to broaden the scope of healthcare data in artificial intelligence and machine learning (AI/ML) and expand its availability to health researchers.
The DRC is not a single database. Instead, AIM-AHEAD seeks to catalyze an ecosystem of datasets to help enrich the data used in AI/ML models.
Data Set Options for Research Funded by AIM-AHEAD
These data sources are options that project teams may propose for AIM-AHEAD-funded research projects. Applicants may also propose other data sources for their projects. As noted in the right column, AIM-AHEAD data partners provide extra services to facilitate access and mentorship for AIM-AHEAD-approved project teams.
| Source | Description | Data Allowed | Access Notes |
| --- | --- | --- | --- |
| A customized subset from OCHIN Community Health Database | EHR data from community health centers across US | HIPAA Limited Data Set, individual-patient level data with dates and geographic indicators if needed for research | AIM-AHEAD Data Partner* with facilitated access, concierge services for funded projects. Available through AIM-AHEAD Service Workbench, data use agreement and IRB required. (see below) |
| Data Bridge from MedStar Health (Curated data from the MedStar Health EHR) | EHR data from hospital system network with broad patient representation | Multiple curated dataset options (further detail on website): pre-curated or custom curated de-identified EHR, Limited Dataset, Full PHI EHR dataset, Imaging, Select clinical notes, select genomics data, synthetic data | AIM-AHEAD Data Partner* with facilitated access, concierge services for funded projects. Available through MedStar Health, data use agreement and IRB required. (see below) |
| | Selected large-scale cohorts related to heart, lung, blood and sleep disorders. Includes both prospective clinical studies and associated genomic TOPMED data. | De-identified dataset. Including individual level genomic (TOPMED full genomes) and clinical datasets. | Available on NHLBI BioData Catalyst Infrastructure. Requires approval of Data Access Request; most datasets require IRB. |
| | A variety of datasets available including clinical and genomic data | Public data, and controlled access data (depends on dataset) | Available on AIM-AHEAD Service Workbench; access requirements depend on the dataset. |
| | The All of Us Research Program is building one of the largest biomedical data resources of its kind. | The All of Us Research Hub stores health data from a group of participants from across the United States. | Available on All of Us Research Workbench, requires registration and institutional use agreement. |
| | ScHARe is a cloud-based research collaboration platform developed by the NIMHD and the National Institute of Nursing Research | Google-hosted Public Datasets; ScHARe-hosted Public Datasets; ScHARe-hosted Project Datasets | |
The DRC and Infrastructure Core also collaborate to assist AIM-AHEAD awardees in locating other data sources to support their projects. As part of its mission to expand datasets used in AI/ML, AIM-AHEAD has conducted a landscape survey to raise awareness about datasets that may be of interest to the research community. Each dataset has its own governance process and rules for access.
Apply to include a dataset in the data landscape list
Reach out to the HelpDesk to view the landscape survey datasets
AIM-AHEAD Data Partners
AIM-AHEAD-funded projects may apply to receive facilitated access and data concierge services from AIM-AHEAD data partners that emphasize targeted populations.