Data and Research Core

MedStar Health AIM-AHEAD Data Bridge (AADB)

About the MedStar Health AIM-AHEAD Data Bridge (AADB):

Achieving greater health equity for marginalized and underrepresented populations and eliminating racial disparities in healthcare practice, delivery, access, and outcomes are urgent issues in the US healthcare system today. Modern AI/ML, empowered by the wave of digitization in medicine and by computing breakthroughs, has the potential to significantly boost health equity through appropriate design and use. However, bias from a variety of sources can undermine the effectiveness of these emerging technologies. The underlying science is not yet well developed, and minority researchers remain underrepresented in this field.

To advance the science of reducing racial disparities and to ensure that AI/ML solutions in healthcare are designed to promote health equity, a multidisciplinary, collaborative consortium, the AIM-AHEAD Data Bridge (AADB), has been formed. The AADB aims to advance the access and use of diverse data by researchers and communities currently underrepresented in the development of AI/ML models and predictive analytics, and to enhance the capabilities of this emerging technology, beginning with rich and diverse sources of Electronic Health Records (EHR) data to address health disparities and inequities.

Available Data:

MedStar Health Research Institute hosts a robust database of Electronic Health Records which can be made available to approved AIM-AHEAD awardees. The MedStar Health System includes an extensive network of clinical facilities in the mid-Atlantic region, including 10 hospitals (33% rural) and over 300 points of care connected by MedStar’s EHR system, which is built on the Cerner Millennium platform. The system includes 5 million unique patients, approximately 31% of whom are African American. Project-specific datasets can be curated and made available for AIM-AHEAD awardees. Additionally, the AADB has curated a library of AI/ML-ready datasets which are fully de-identified and ready for access upon regulatory clearance.

To learn more about the data contents including a description, demographics, and data dictionaries, please explore the sites below:

Brain Imaging Dataset [COMING SOON]
Thyroid Imaging Dataset [COMING SOON]
Cardiac Imaging Dataset [COMING SOON]

GUMC Datasets

Additional Related Data Sets are available from Georgetown University Medical Center. These include:

  1. Synthetic datasets created from selected MHRI EHR datasets.
  2. Caris genomic testing data from patients treated for cancer at Georgetown’s Lombardi Comprehensive Cancer Center. Available upon IRB approval for both the EHR and genomic data.
  3. 16 Social Determinants of Health data indexes mapped to geographical locators. Available via Google BigQuery APIs.

If you are interested in one of these GUMC datasets, please indicate this in your intake request so we can ensure a representative from GUMC attends the intake meeting.



Access to the Data Bridge data is available for all current AIM-AHEAD program awardees. 

How to Request Access

Eligible applicants can request access to data by filling out the intake form with their name, institutional affiliation, and project details. After completing the intake form, please contact the AADB team to schedule a meeting with our intake team. The intake team, led by the AADB methodologist, will discuss your data needs to confirm that the project is feasible both scientifically and in terms of data availability. This meeting is often useful for narrowing down the requested data elements to match those available in the dataset and/or EHR.

Request Access or Schedule a Data Consult

Once you have met with the AADB methodologist, you will revise your protocol and requested data elements (if necessary) and resubmit them to the methodologist who is managing your request. Once the protocol and variable list are both finalized, the clearance process will begin (see Access Process below).

Access Process

  1. Feasibility and Scientific discussion with Methodologist
  2. Clearance
    1. Provide proof of CITI certification – Social and Behavioral Responsible Conduct of Research
    2. Data Sharing Agreement fully executed between your institution and MedStar Health Research Institute (MHRI)
    3. Provide IRB correspondence*
      *If the activity is Not Human Subjects Research (NHSR), we still require, at minimum, email correspondence between you and your local IRB confirming that the activity is NHSR and that you are in compliance with your organization’s regulatory requirements. Our AADB team will be there to help you with any IRB questions along the way.
  3. Provide AADB team with information required for each of the persons who require access**
    1. First and Last Name
    2. Email Address
    3. Phone Number
    4. VIP Credential ID (See instructions)
    **Note: all persons who are provided access must be named on the IRB protocol
  4. NE Profile Creation: Your AADB contact will use the provided information to create a non-employee profile (NEProfile) for each of the users who require access
  5. Access Via Remote Desktop: Your AADB contact will provide a job aid on how to access the MedStar Health Network via VPN and Remote Desktop
    1. If you experience any issues with access, please contact your AADB contact to set up a troubleshooting session
    2. Data may not be copied, downloaded, screenshot, recorded, or shared in any way
    3. You will use remote desktop to access a Windows Virtual Machine which will have the infrastructure needed to perform the desired analysis (to request tools or packages, please send your request to your AADB contact)

Infrastructure: Platform and Analysis Tools

The virtual machine comes with R and Python pre-installed for AI/ML work. The following R and Python packages are installed:

R: car, tidyverse, mosaic, lubridate, psych, survival, corrplot, randomForest, xgboost, rpart, ROCR, nnet, pcr
Python: sklearn, numpy, pandas, tensorflow, torch, matplotlib, entropy, datetime, collections, h5py, seaborn, scipy, tableone

Investigators with specific software needs can request that additional software be installed on the virtual machine for their use. The AADB team will provide a concierge service to requesters to identify and address additional software needs.
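Before requesting additional software, it may help to confirm which of the listed Python packages can actually be found on the virtual machine. The following is a minimal sketch, not an official AADB tool: the function names `check_packages` and `missing_packages` are hypothetical, and the import names are assumed to match the package list above.

```python
from importlib.util import find_spec

# Import names follow the Python package list above
# (the plotting library is spelled "seaborn").
LISTED_PACKAGES = [
    "sklearn", "numpy", "pandas", "tensorflow", "torch", "matplotlib",
    "entropy", "datetime", "collections", "h5py", "seaborn", "scipy",
    "tableone",
]


def check_packages(names):
    """Return {name: bool} saying whether each top-level module can be located.

    find_spec only locates the module; it does not import it, so this is
    quick even for heavy packages such as tensorflow or torch.
    """
    return {name: find_spec(name) is not None for name in names}


def missing_packages(names):
    """Return a sorted list of the names that could not be located."""
    return sorted(name for name, ok in check_packages(names).items() if not ok)


if __name__ == "__main__":
    for name, ok in check_packages(LISTED_PACKAGES).items():
        print(f"{name:12s} {'installed' if ok else 'MISSING'}")
```

Any names reported as missing can be included in the software request sent to your AADB contact.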

Ethics and Equity

The AIM-AHEAD IRB & Compliance Workgroup developed an A-CC-Wide Regulatory Approvals Plan that provides an overview of IRB protocol submission processes and special considerations. The IRB & Compliance Workgroup is available on an as-needed basis to offer regulatory guidance to participating sites. Sites with questions about IRB and compliance matters can meet with the IRB & Compliance Workgroup, which meets on a bi-weekly basis. To schedule a meeting, awardees should reach out through the IRB & Compliance Help Desk.

The Ethics and Equity Workgroup developed the AIM-AHEAD Ethics and Equity Guiding Principles and the AIM-AHEAD Ethics and Equity Glossary, which provide guidance on the factors that undermine the achievement of equity through the design, use, and application of artificial intelligence and machine learning (AI/ML). In other words, developers of AI/ML platforms and tools must contemplate, anticipate, mitigate, and address potential issues with downstream data aggregation, interpretation, and use. Meeting these goals requires a shared understanding of the terms we use in the policies and processes intended to oversee downstream data aggregation, interpretation, and use.

The following items must be considered by each AADB data requestor:

  1. Ensuring equity in infrastructure and training components of the project(s)
  2. Efforts to minimize threats to privacy, confidentiality, informed consent, and autonomy
  3. Producing logical explanations of how the algorithm arrived at its given output and minimizing “black box” predictions
  4. Reporting both benefits and harms
  5. Promoting sustainable AI through routine updates of tools/algorithms and retiring inefficient tools/algorithms 
  6. Providing guidance on ensuring generalizability of the approach
  7. Keeping data and algorithm equity front-and-center in planning, execution, and reporting
  8. Representation and integration of the projects to reduce disparities and promote health in their communities  
  9. Collaboration with the Applied Ethical AI Sub-Core to identify both anticipated and unanticipated ethical issues and develop solutions with the investigative team

Constraints and Acceptable Use of Data

Pursuant to AIM-AHEAD guidelines, all AADB users must provide proof of IRB determination or correspondence from the IRB that confirms the activity is Not Human Subjects Research (NHSR). If you have any questions, your AADB contact and the IRB & Compliance Workgroup are here to help. Visit this link to schedule an office hour with the IRB & Compliance Workgroup.

Before access is provisioned, data users must establish a fully executed contract between the data user’s institution and MedStar Health Research Institute. The AADB is leveraging a well-established controlled access model. As different projects have different needs, requestors may need different types of data depending on the project. Each type of data requires a different regulatory agreement:

  1. Completely de-identified – Requires Data Sharing Agreement (DSA)
  2. Limited datasets – Requires Data Use Agreement (DUA)
  3. Data Containing PHI – Requires Data Sharing Agreement (DSA)
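The mapping above can be written as a small lookup, which may be handy when tracking clearance for several projects. This is a minimal sketch: the dictionary keys and the function name `required_agreement` are hypothetical labels chosen for illustration, while the agreement types are taken directly from the list above.

```python
# Agreement required for each data type, as listed above.
# The short keys are illustrative labels, not official AADB terms.
REQUIRED_AGREEMENT = {
    "de-identified": "Data Sharing Agreement (DSA)",
    "limited": "Data Use Agreement (DUA)",
    "phi": "Data Sharing Agreement (DSA)",
}


def required_agreement(data_type: str) -> str:
    """Return the agreement needed for a data type, ignoring case and spacing."""
    key = data_type.strip().lower()
    if key not in REQUIRED_AGREEMENT:
        valid = ", ".join(sorted(REQUIRED_AGREEMENT))
        raise ValueError(f"Unknown data type {data_type!r}; expected one of: {valid}")
    return REQUIRED_AGREEMENT[key]


if __name__ == "__main__":
    for data_type in REQUIRED_AGREEMENT:
        print(f"{data_type}: {required_agreement(data_type)}")
```

Raising an error for an unrecognized data type, rather than returning a default, avoids silently assuming the wrong agreement.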

Data users are expected to adhere to the AADB code of conduct. The AADB code of conduct outlines the acceptable access and use of data. The data user agrees to:

  • Never attempt to re-identify patients
  • Acknowledge AIM-AHEAD when publishing
  • Read and comply with AIM-AHEAD and MedStar Health policies and procedures
  • Never attempt to download, copy, transfer, or take screenshots of data
  • Inform MedStar Health Research Institute Office of Research Integrity of any and all breaches
  • Never attempt to contact participants