AIM-AHEAD Consortium Development Projects to Advance Health Equity

Call for Proposals

Webinar Zoom Recording: Click Here

Key Dates

Solicitation Release Date: April 17, 2023

Letter of Intent Due Date: May 20, 2023

Application Due Date: June 20, 2023

Earliest Start Date: September 1, 2023


Issued by

Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program




The AIM-AHEAD Consortium Development Projects to Advance Health Equity Call for Proposals (CFP) seeks to support multi-disciplinary research projects that use Artificial Intelligence/Machine Learning (AI/ML) to develop novel algorithms and approaches to address health disparities and inequities. This CFP is interested in research projects that use new or existing datasets, such as electronic health records (EHR), image data, and social determinants of health (SDOH), to develop and enhance AI/ML algorithms that have the potential to detect and mitigate bias or illuminate ethical or other emerging issues. The goal is to reduce health disparities while improving healthcare and outcomes, in alignment with the AIM-AHEAD North Stars, in populations that experience health disparities in the United States.


AIM-AHEAD North Stars 


North Star (I)     Develop a diverse, equitable, and inclusive AI/ML workforce.


North Star (II)    Increase knowledge, awareness, and national-scale community engagement/empowerment in AI/ML.


North Star (III)   Use AI/ML to address disparities and minority health in behavioral health, cardiometabolic health, and cancer.


North Star (IV)   Build community capacity and infrastructure in AI/ML to address community-centric health disparities and minority health.




The National Institutes of Health’s Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program has established mutually beneficial, coordinated, and trusted partnerships to enhance the participation and representation of researchers and communities currently underrepresented in the development of AI/ML models and to improve the capabilities of this emerging technology, beginning with electronic health records (EHR) and extending to other diverse data to address health disparities and inequities. Led by the AIM-AHEAD Coordinating Center, the consortium of institutions and organizations maintains as its core mission to serve minorities and other underrepresented or underserved groups impacted by health disparities.


The rapid increase in the volume of data generated through electronic health records (EHR) and other biomedical research presents exciting opportunities for developing data science approaches (e.g., AI/ML methods) for biomedical research and improving healthcare. Many challenges hinder the more widespread use of AI/ML technologies, such as the cost, capability for widespread application, and access to appropriate infrastructure, resources, and training. Additionally, the lack of diversity of both data and researchers in the AI/ML field runs the risk of creating and perpetuating harmful biases in its practice, algorithms, and outcomes, thus fostering continued health disparities and inequities. Many underrepresented and underserved communities, which are often disproportionately affected by diseases and health conditions, have the potential to contribute expertise, data, diverse recruitment strategies, and cutting-edge science to inform the field on the most urgent research questions but often may lack financial, infrastructural, and data science training capacity to apply AI/ML approaches to research questions of interest to them. This program also seeks to enhance trust within the communities regarding the approaches of AI/ML.


AIM-AHEAD is committed to leveraging the potential of AI/ML to accelerate the pace of biomedical innovation while prioritizing and addressing health disparities and inequities. Tackling the complex drivers of health disparities and inequities requires an innovative and transdisciplinary framework that transcends scientific and organizational silos. Mutually beneficial and trusted partnerships can be established to enhance the participation and representation of researchers and communities currently underrepresented in AI/ML modeling and application and improve the capabilities of data curation and this emerging technology.



Research Objectives

The primary objective of this solicitation is to support multidisciplinary research projects that use AI/ML and develop novel algorithms and approaches that address health disparities and inequities and have the potential to significantly impact healthcare access and health outcomes for populations that experience health disparities. The Consortium Development Projects to Advance Health Equity focuses on AIM-AHEAD North Star (III): Use AI/ML to address disparities and minority health in behavioral health, cardiometabolic health, and cancer, in addition to having a broad impact on each of AIM-AHEAD’s four North Stars. AIM-AHEAD envisions that research projects will:


  1. Build new datasets or leverage existing datasets and resources to use AI/ML algorithms that have the potential to eliminate health disparities
  2. Advance knowledge on approaches for detecting and mitigating bias in AI/ML models to avoid exacerbating health disparities
  3. Improve healthcare and health outcomes in alignment with the North Stars of the Program
  4. Bring together a broad range of institutions and organizations, including industry, community and non-profit organizations, as stakeholders in the AIM-AHEAD consortium.

Applicants may want to explore the OCHIN Community Health Equity Database for feasibility assessment as they write their proposals. OCHIN provides the means to do so through Cohort Discovery, a web-based software tool for obtaining counts of patients matching user-specified inclusion/exclusion criteria. To gain access to Cohort Discovery, AIM-AHEAD program applicants must have completed and be up to date with standard training in Human Subjects Research and Responsible Conduct of Research, such as the courses offered by the CITI Program. To request access to OCHIN’s Cohort Discovery, AIM-AHEAD program applicants can complete the OCHIN i2b2 End User Agreement.
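
As a rough illustration of the kind of feasibility check described above, the sketch below counts records that satisfy user-specified inclusion and exclusion criteria. It is not Cohort Discovery or i2b2 code. The record fields (`age`, `diagnoses`), the sample values, and the `count_cohort` helper are all hypothetical and exist only to show the logic of such a count.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: each patient is a plain dictionary.
Patient = Dict[str, object]
Rule = Callable[[Patient], bool]


def count_cohort(patients: List[Patient],
                 include: List[Rule],
                 exclude: List[Rule]) -> int:
    """Count patients meeting every inclusion rule and no exclusion rule."""
    count = 0
    for patient in patients:
        meets_inclusion = all(rule(patient) for rule in include)
        hits_exclusion = any(rule(patient) for rule in exclude)
        if meets_inclusion and not hits_exclusion:
            count += 1
    return count


# Made-up example records (assumption: not real data).
patients = [
    {"age": 54, "diagnoses": {"type 2 diabetes"}},
    {"age": 37, "diagnoses": {"hypertension"}},
    {"age": 61, "diagnoses": {"type 2 diabetes", "chronic kidney disease"}},
    {"age": 16, "diagnoses": {"type 2 diabetes"}},
]

# Inclusion: adults with type 2 diabetes. Exclusion: chronic kidney disease.
include = [
    lambda p: p["age"] >= 18,
    lambda p: "type 2 diabetes" in p["diagnoses"],
]
exclude = [lambda p: "chronic kidney disease" in p["diagnoses"]]

feasible_n = count_cohort(patients, include, exclude)
print(f"Patients matching criteria: {feasible_n}")
```

A count of this kind can help applicants judge whether a proposed study population is large enough before they commit to a research design.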


Applications to this solicitation should use explainable and predictive AI/ML or novel algorithms and approaches that harness EHR, SDOH, clinical cohorts, omics, imaging, claims or administrative data, or other data types and have the potential to advance health equity by addressing health disparities and inequities in areas such as prevention, diagnoses, treatments, intervention, or implementation strategies. Applicants are encouraged to propose projects that consider ethical issues and potential unintended adverse outcomes of bias in applying AI/ML approaches to improving health outcomes.  Examples of the types of research projects include but are not limited to:


  • Address ethical, legal, and social implications (ELSI), including regulatory and oversight aspects, of AI/ML
  • Elucidate the role and impact of SDOH in the development of representative predictive AI/ML models that can be applied to prevent, treat, and implement healthcare strategies
  • Identify appropriate uses for models owned by industry to facilitate innovative multi-disciplinary research directions
  • Elucidate the effectiveness of different infrastructure models in combining and sharing data
  • Test the validity of metrics and statistical methods used to measure health disparities and health inequities using AI/ML approaches
  • Identify safety events in the EHR and potential bias of safety events due to incomplete medical records, fragmentation of care, or bias in healthcare
  • Determine the effectiveness of AI/ML models that use EHR, SDOH, and other data to predict which persons are at high risk for an adverse health outcome aligned with the North Stars, with a particular focus on North Star III
  • Determine the effectiveness of AI/ML models that use EHR, SDOH, and other data that identify patients at high-risk for low or high service utilization and who may need additional health service support (e.g., low use of maternal and child health services, risk for hospital readmission or admission for a potentially preventable condition/disease)
  • Research use of Natural Language Processing (NLP) to identify social needs, documented SDOH, family/patient engagement, and social support available
  • Research the results of an increase in the demographic/geographic diversity of population data used to train, validate, or calibrate algorithms to help detect and mitigate bias
  • Determine the effectiveness of AI/ML models which can validate the confidence in training datasets in real-world settings across diverse populations to mitigate bias among different racial and ethnic groups
  • Create and validate the usefulness of linked data sets such as EHR data, self-reported patient information, genomics, SDOH, geospatial, and mobile health data to improve AI/ML data mining techniques for these communities
  • Study AI/ML algorithm performance at predicting risk for conditions under North Star III compared to standard of care
  • Study AI/ML algorithm performance at predicting how patients from lower socioeconomic status zip codes seek mental health treatment
  • Study AI/ML algorithm performance at predicting how patients use AI devices (e.g., voice-assisted, digital twin, etc.) for telemedicine compared to standard of care
  • Characterize an analysis of SDOH free-text fields from the EHR and impact on overall health for underrepresented group(s)
  • Use an SDOH measure like social vulnerability index or area deprivation index to predict which patients may have social needs by adding in EHR-related data
  • Research practical use cases (e.g., ChatGPT, others) in healthcare, geographical data (FIPS codes, zip codes, regional data, proximity data), and free-text/unstructured data from the EHR (e.g., clinical notes, radiology notes)


The NIH defines health disparities as differences in the incidence, prevalence, morbidity, mortality, and burden of diseases and other adverse health outcomes among specific population groups. Research projects should focus on one or more NIH-designated health disparity populations in the United States. These population groups include racial and ethnic minorities (African Americans, American Indians, Alaska Natives, Asian Americans, Hispanic Americans, Native Hawaiians, and other U.S. Pacific Islanders, as well as subpopulations of all of these racial/ethnic groups), socioeconomically disadvantaged individuals, sexual and gender minorities, and medically underserved populations including individuals residing in rural and urban areas (see https://www.nimhd/). Research projects that include populations identifying with more than one health disparity population are encouraged.


Proposals to this program must include a plan for ensuring work is guided by a concern for human and social impact and attention to ethical, legal, and socio-economic implications of AI/ML, including but not limited to (1) biases in datasets, algorithms, and applications; (2) issues related to identifiability and privacy; (3) impacts on disadvantaged or marginalized groups; (4) health disparities; and (5) unintended, adverse social, individual, and community consequences of research and development. Examples of research projects that are beyond the scope of this solicitation include, but are not limited to:

  • Use of AI/ML models for biomarker interrogation across populations
  • Research studies focused on clinical trials
  • Research studies focused on capacity building
  • Research involving training of participants




Eligible Organizations


  • Higher Education Institutions
    • Public/State Controlled Institutions of Higher Education 
    • Private Institutions of Higher Education 


    The following types of Higher Education Institutions are always encouraged to apply for NIH support as Public or Private Institutions of Higher Education: 

    • Hispanic-serving Institutions
    • Historically Black Colleges and Universities (HBCUs)
    • Tribally Controlled Colleges and Universities (TCCUs) 
    • Alaska Native and Native Hawaiian Serving Institutions
    • Asian American Native American Pacific Islander Serving Institutions (AANAPISIs)
  • Nonprofits Other Than Institutions of Higher Education
    • Nonprofits with 501(c)(3) IRS Status
    • Nonprofits without 501(c)(3) IRS Status
    • Tribal health and/or human service organizations or Tribally derived institutions (Urban Indian Health Organizations, Tribal Epidemiology Centers) 
    • Minority Business Enterprises (MBE) which are at least 51% owned, operated, and controlled on a daily basis by one or more (in combination) qualifying American citizens.
      • Small Businesses
      • For-Profit Organizations (Other than Small Businesses)


To be eligible, applicant institutions must demonstrate that they serve a high concentration of individuals from racial and ethnic groups that have been shown to be underrepresented in biomedical research (African Americans, American Indians and Alaska Natives, Native Hawaiians and other Pacific Islanders, Hispanic Americans). We highly encourage applicants to form multi-disciplinary teams from institutions and organizations representing underserved and underrepresented populations. 



Eligible Applicants


Any individual(s) with the skills, knowledge, and resources necessary to carry out the proposed research as the Program Director(s)/Principal Investigator(s) (PD(s)/PI(s)) is invited to work with his/her organization to develop an application for support. Individuals from the following groups are always encouraged to apply for NIH support.


  1. Individuals from racial and ethnic groups that have been shown by the National Science Foundation to be underrepresented in health-related sciences on a national basis (see the report Women, Minorities, and Persons with Disabilities in Science and Engineering). The following racial and ethnic groups have been shown to be underrepresented in biomedical research: Blacks or African Americans, Hispanics or Latinos, American Indians or Alaska Natives, Native Hawaiians and other Pacific Islanders. In addition, it is recognized that underrepresentation can vary from one setting to another; individuals from racial or ethnic groups that can be demonstrated convincingly to be underrepresented by the grantee institution should be encouraged to participate in NIH programs to enhance diversity. For more information on racial and ethnic categories and definitions, see the OMB Revisions to the Standards for Classification of Federal Data on Race and Ethnicity.
  2. Individuals with disabilities, who are defined as those with a physical or mental impairment that substantially limits one or more major life activities, as described in the Americans with Disabilities Act of 1990, as amended.
  3. Individuals from disadvantaged backgrounds, defined as those who meet two or more of the following criteria:
    1. Were or currently are homeless, as defined by the McKinney-Vento Homeless Assistance Act;
    2. Were or currently are in the foster care system, as defined by the Administration for Children and Families;
    3. Were eligible for the Federal Free and Reduced Lunch Program for two or more years;
    4. Have/had no parents or legal guardians who completed a bachelor’s degree;
    5. Were or currently are eligible for Federal Pell grants;
    6. Received support from the Special Supplemental Nutrition Program for Women, Infants and Children (WIC) as a parent or child;
    7. Grew up in one of the following areas: a) a U.S. rural area, as designated by the Health Resources and Services Administration (HRSA) Rural Health Grants Eligibility Analyzer, or b) a Centers for Medicare and Medicaid Services-designated Low-Income and Health Professional Shortage Area (qualifying zip codes are included in the file). Only one of the two possibilities in #7 can be used as a criterion for the disadvantaged background definition.
  4. Literature shows that women from the above backgrounds (categories 1 and 2) face particular challenges at the graduate level and beyond in scientific fields. (See, e.g., From the NIH: A Systems Approach to Increasing the Diversity of Biomedical Research Workforce.)
    Women have been shown to be underrepresented in doctorate-granting research institutions at senior faculty levels in most biomedical-relevant disciplines, and may also be underrepresented at other faculty levels in some scientific disciplines (see data from the National Science Foundation National Center for Science and Engineering Statistics: Women, Minorities, and Persons with Disabilities in Science and Engineering, special report, especially Table 9-23, describing science, engineering, and health doctorate holders employed in universities and 4-year colleges, by broad occupation, sex, years since doctorate, and faculty rank).



Funding Amounts

To ensure that the AIM-AHEAD Coordinating Center is able to fund substantial projects, the budget for the projects will be in the range of $500,000 to $1,000,000 in total costs per year. The budget for a project could be higher or lower than this estimate based on the resources needed to complete the project.


Funding Period 

Applicants may request up to 2 years of funding support commensurate with project scope. 


Allowable costs:

  • Academic Institutions: The award may be used for salary and fringe benefits of the Co-principal investigators, collaborating investigator(s), and other participants with faculty appointments, consistent with percent effort, and for project-related expenses, such as salaries of technical personnel essential to the conduct of the project, supplies, equipment, computers/electronics, travel (including international travel), volunteer subject costs, data management, and publication costs, etc. Tuition support for graduate students may also be requested. 
  • Other non-Academic Institutions/Organizations: The award may be used for salary and fringe benefits of the Co-principal investigators, collaborating investigator(s), and other participants consistent with salary structure, and for project-related expenses, such as salaries of technical personnel essential to the conduct of the project, supplies, equipment, computers/electronics, travel (including international travel), volunteer subject costs, data management, and publication costs, etc.


Unallowable costs:

  • Funds to support general staff or administrative support
  • Funds to support international travel



Data Governance and sharing of research products: 


Compliance and Governance

  • For all research projects involving human subjects research (including secondary data analysis), awardees are expected to submit their study for review to an IRB and obtain a determination letter/response (see the Application Components section, subsection 10). Even if the projects do not involve human subjects, a “letter of determination” is required. When a local IRB is not available, alternative options should be used.
  • Awardees are required to obtain any required Data Use/Sharing or Regulatory/Contractual Agreements for data needed for their research.
  • Any current or future data sharing must follow relevant governance documents and agreements.
  • Awardees must comply with all applicable Federal statutes (such as those included in appropriations acts), regulations, and policies, in addition to their institutional and state policies, to receive research funding.


Note: No funds can be drawn down from NIH payment system and no obligations may be made for research involving human subjects by any site of AIM-AHEAD coordinating center engaged in such research for any period not covered by both an OHRP-approved Federal Wide Assurance and approval from the IRB, as required, consistent with 45 CFR Part 46 and any NIH required policies.


AIM-AHEAD Service Workbench Data Repository

The AIM-AHEAD Service Workbench platform is a FISMA (Federal Information Security Modernization Act) secure environment working towards Authority to Operate (ATO) status that can serve as a repository for the data sharing needs of AIM-AHEAD projects and members. The AIM-AHEAD Service Workbench centralized FISMA-secure infrastructure can provide secure hosting of data, management of access controls with appropriate authentication and authorization, and continuous monitoring of the centralized environment to ensure system cybersecurity. More information on how the AIM-AHEAD repository was established is available in this NIH document: Establishing a Controlled Access Repository for AIM-AHEAD


AIM-AHEAD will assist awarded investigators to identify an appropriate data repository, prepare data for transfer, and meet requirements of the data repository.


Ethical Use of Data


The Applied Ethical AI (AEAI) Subcore was established to provide support in surfacing, reasoning about, and resolving ethical issues in the AIM-AHEAD program. It is expected that investigators will work with the AEAI on the development of an initial ethics review and plan for their project within the first 4-6 weeks of support and will commit to participating in two AEAI-supported ethics forums during the course of their sponsorship.  These forums are designed to highlight and address ethical challenges in the development and implementation of AI and machine learning and range from open office hours to moderated discussions on hot topics in ethics and AI/ML.  Topics that may be addressed during these forums and the review are:


  1. Investigation into the ethical grounding of the project and the plans for embedding ethical reasoning into the project.


  2. Consultation on how to test for algorithmic fairness and data bias with respect to potentially sensitive variables (e.g., socio-demographics).


  3. Detection of potential data privacy concerns and alternatives for their resolution.
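
To make the fairness topic above more concrete, the following sketch shows one simple way a project might compare model behavior across groups defined by a sensitive variable. It is an illustrative assumption, not an AEAI-prescribed method: the function names (`group_rates`, `max_gap`), the toy labels, and the group codes "A" and "B" are all hypothetical. It computes each group's selection rate (share predicted positive) and true positive rate, then reports the largest gap between groups.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (selection_rate, true_positive_rate)} for binary labels."""
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "true_pos": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        s = stats[group]
        s["n"] += 1
        s["pred_pos"] += pred          # predicted positive
        s["pos"] += truth              # actually positive
        s["true_pos"] += truth * pred  # correctly predicted positive
    rates = {}
    for group, s in stats.items():
        selection_rate = s["pred_pos"] / s["n"]
        # True positive rate is undefined when a group has no actual positives.
        tpr = s["true_pos"] / s["pos"] if s["pos"] else float("nan")
        rates[group] = (selection_rate, tpr)
    return rates


def max_gap(rates, index):
    """Largest between-group difference for one metric (0 = selection, 1 = TPR)."""
    values = [r[index] for r in rates.values()]
    return max(values) - min(values)


# Toy data (assumption: invented purely for illustration).
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
for group, (selection_rate, tpr) in sorted(rates.items()):
    print(f"group {group}: selection rate {selection_rate:.2f}, TPR {tpr:.2f}")
print(f"selection rate gap: {max_gap(rates, 0):.2f}")
print(f"TPR gap: {max_gap(rates, 1):.2f}")
```

Large gaps in either metric do not by themselves establish unfairness, but they can flag where a closer review of the data and model may be warranted.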



Review Criteria

Proposals will be reviewed considering the overarching human concern with respect to the ethical, legal and social implications of AI/ML including but not limited to:


  • Race/ethnicity must be present in the data an applicant proposes to analyze in addressing the North Star research questions
  • Biases in datasets, algorithms, and applications
  • Issues related to identifiability and privacy
  • Impacts on disadvantaged or marginalized groups
  • Health disparities
  • Unintended, adverse social, individual, and community consequences of research and development.


Scientific merit will be guided by the specific questions related to the proposed project:


  • To what extent does the proposal align with North Star (III): Use AI/ML to address disparities and minority health in behavioral health, cardiometabolic health, and cancer, and with the overall goals of AIM-AHEAD?
  • Does the proposal develop novel algorithms, or methods for addressing a health disparity problem? Or will it generate novel insights on a health disparity problem?
  • To what extent does the proposal appropriately consider ethical, legal, and privacy considerations?
  • Is the applicant willing to engage and collaborate with the AIM-AHEAD community, contribute to documentation and training resources, welcome and empower new users, and help foster a diverse and inclusive community?
  • Does the proposal outline how the selected dataset(s) will contribute to addressing the research topic and approaches and/or tools that will benefit the functionality of the AIM-AHEAD ecosystem?
  • Does the proposal clearly articulate the goals of the project and expected outcomes in terms of research products or other artifacts?
  • To what extent is the proposed approach reasonable to achieve the goals of the project? What is the likelihood of a successful outcome? Are the measures of success clearly defined?
  • Does the applicant have the necessary background and capabilities to accomplish the proposed work in the expected funding period?


Programmatic Review Considerations


Eligible applicants will be required to demonstrate that in addition to meeting the financial eligibility criteria, resource allocation and shared governance within the award should preferentially benefit the host/lead institution including but not limited to:

  • Consideration will be given to the overall diversity of the investigator team in line with AIM-AHEAD program goals
  • Appropriateness and balance of health disparity populations for which the research is being conducted considering geographical region and other social determinants of health
  • Representation of leadership within the project (e.g., MPIs, Co-I, other key personnel)
  • Diverse representation reflective of historically underrepresented groups and community stakeholders
  • Personnel capacity including administrative staffing, and project management (e.g., financial management, program management)
  • Proposed infrastructure required to execute stated research goals and objectives
  • Shared governance in decision-making by hosting institution


Submission using AIM-AHEAD Connect and InfoReady platform


Step 1: Click here to register as a "mentor" on AIM-AHEAD Connect (our Community Building Platform)

Step 2: Click here to submit an application for review using InfoReady platform*.

 * To submit your application in InfoReady, please use Chrome, Firefox, or Edge. If you're using Safari, make sure to clear your cache before logging in.

Please note both steps must be completed for consideration.

All applications must be received by June 20, 2023, at 5 PM Eastern Time.


Letter of Intent (Optional)

Applicants are encouraged to submit a letter of intent. The information that it contains allows the research program committee to estimate the potential review workload and plan the review.

The letter of intent should include the following:

  • Descriptive title of proposed research
  • Name, address, and telephone number of the Principal Investigator(s)
  • Names of other key personnel
  • Participating institutions and organizations

The letter of intent should be sent by May 20, 2023, at 5 pm EST to Harlan Jones, Ph.D., with the subject line "Letter of Intent: Consortium Development Projects".


Application Components

Required Format:

  • Arial font, no smaller than 11 point; margins of at least 0.5 inches (sides, top, and bottom); and single-spaced lines. Submit as a single Word or PDF document to the application portal.


Required Elements of the proposal

  1. Title: The title should describe the project using concise and informative language.
  2. Project Summary/Abstract (1-page limit): Provide a succinct description of the proposed work including the project’s long-term objectives, and a description of the research design and methods for the entire AI/ML project.  
  3. Project Description: The project description should contain the following components adhering to the page limits. 
    1. Specific Aims (1-Page): Provide a clear, concise summary of the aims of the work proposed and its relationship to your long-term goals. State the hypothesis to be tested.
    2. Research Plan (5  pages)
      1. Background and Significance: Sketch the background and the statement of the problem leading to this proposal. Summarize important results outlined by others in the same field, critically evaluating existing knowledge. Identify gaps that this project is intended to fill. State concisely the importance and relevance of the research to AI and health disparities research. Describe the health disparity population of interest. It is also incumbent upon the applicant to make a clear link between the project and the mission of AIM-AHEAD. The significance section will be assessed in terms of the potential impact on the AIM-AHEAD mission; this will be factored into the overall priority score as noted in the peer review criteria.
      2. Preliminary Studies: Describe concisely previous work by the applicant related to the proposed research that will help to establish the experience and competence of the investigator to pursue the proposed project. Include pilot studies showing the work is feasible. (If none, so state.)
      3. Research Design and Methods: Description of proposed tests, methods, or procedures should be explicit, sufficiently detailed, and well-defined to allow adequate evaluation of the approach to the problem. Describe any new methodology and how your research idea innovates or has an advantage over existing methods/applications and brings together diverse disciplines. Clearly describe the overall design of the study, with careful consideration of the AI/ML and health disparities aspects of the approach, how your methods will control for bias and address ethics, and how results will be analyzed. Include details of any collaborative arrangements that have been made. Applicants must explain how relevant social determinants of health and biological variables, such as sex, are factored into the research design, analysis, and reporting. Furthermore, describe the infrastructure modality that would be used, data sources needed and used, and compute infrastructure needed and used (see data and infrastructure section). Describe the plan for ensuring work is guided by a concern for human and social impact and attention to ethical, legal, and social implications of AI/ML including but not limited to (1) biases in datasets, algorithms, and applications; (2) issues related to identifiability and privacy; (3) impacts on disadvantaged or marginalized groups; (4) health disparities; and (5) unintended, adverse social, individual, and community consequences of research and development.
  4. References Cited (maximum 3 pages): List only references cited in the Project Description or supplementary documents of the proposal.
  5. Detailed Budget and Budget Justification: The budget justification should entail a narrative explanation of each of the components of the cost required of the proposed work. The budget explanations should focus on how each budget item is required to achieve the project’s objectives and how the estimated costs in the budget were calculated.
  6. Facilities, Equipment and Other Resources: Facilities, Equipment and Other Resources should describe the resources needed and those that are available for the proposed research project. Collaborators should also indicate the same. Applicants should convey how the scientific environment in which the research will be conducted contributes to the probability of success.
  7. Senior Personnel Documents:
    • Biographical Sketches: Biographical sketches are required for the PI, any co-PIs, and each of the participating Senior Personnel listed in the Project Description (including postdocs, staff, and/or students). All biographical sketches submitted in response to this solicitation are expected to follow the NIH format.
    • Current and Pending Support: Disclose any previous funding from NIH or AIM-AHEAD, Federal, and any other funding sources related to AI/ML and the CFP objectives.
    • Collaborators and Other Affiliations Information. 
  8. Human Subjects Research.
    • Does any of the proposed research involve human specimens and/or data (Including EHR and/or repository data)?  
    • Provide an explanation for any use of human specimens and/or data not considered to be human subjects research. 
    • Does your research require a Data Use Agreement or other agreement(s) for use of data?
    • If yes, what is the plan for executing these agreements among all partners/collaborators?
    Note: Definition of Human subject: A living individual about whom an investigator (whether professional or student) conducting research: (1) obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or (2) obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens. [45 CFR 46.102(e)]. 
  9. Other Supplementary Documents:
    • Institutional Need and Support Statement (up to 3 pages). This statement must be signed by the leadership of the institution (e.g., University president, provost or college/school dean as applicable) to show support for the partnership and commitment to the additional resources necessary to ensure that these partnerships will have maximum sustainability. This letter should include:
      • Assessment of the institution’s current AI research, education as well as data and infrastructure capacity;
      • A statement of commitment of institutional support for the proposed activities. List the specific resources, space, protected time, etc. These statements should also identify the specific number of positions that are wholly dedicated to AI data and infrastructure under the partnership.  In addition, if American Indians are involved, a Letter of Commitment from the Tribal Nation Leader is required.
    • Letter of collaboration: 
    • Results from Prior NIH/AIM-AHEAD Support. If applicable, you must submit the Results from Prior NIH/AIM-AHEAD Support as a supplementary document, rather than as part of your Project Description. This allows you to maximize the use of the page allowance to describe the proposed activities. 
    • Cloud Computing Resources (if applicable).
    • Project Personnel and Partner Organizations (required). 



Progress and Post-Award Reporting 

Awardees are expected to develop novel AI/ML approaches to address health disparities and inequities in populations that experience health disparities in the US by embracing the contributions across diverse constituencies, particularly those historically underrepresented in this emerging area of research. Therefore, awardees are expected to meet the following requirements:

  • Awardees will participate in monthly awardee meetings (via Zoom).
  • Awardees will submit monthly reports and budget reports.
  • Awardees will participate in meetings with the AIM-AHEAD Coordinating Center to inform overall AIM-AHEAD data sharing / data access strategies.
  • Awardees will participate in AIM-AHEAD annual meetings.
  • Awardees will act as AIM-AHEAD community ambassadors to help onboard colleagues.
  • Awardees must be willing to attend and present the results of their work at future AIM-AHEAD events as well as volunteer to review for future AIM-AHEAD programs.
  • Awardees agree to have AIM-AHEAD promote the project online through websites, social media, and other communication channels.
  • All awardees will be expected to be involved in AIM-AHEAD activities during the year, including attending two annual program-wide meetings.
  • Awardees must provide a summary of research status relative to the milestones listed in the proposal, challenges faced and plans to overcome those challenges, usage of funds, and next steps.
  • By the project end date, awardees must provide a final report of research findings, usage of funds, and a list of publications, grant applications, articles, and conference talks emerging from the research.



Investigators planning to submit an application in response to this CFP are also strongly encouraged to contact and discuss their proposed research/aims with the scientific contact person listed on this CFP or through the AIM-AHEAD Help Desk in advance of the application receipt date.




Director: Harlan Jones, PhD




Appendix A. Data

Data sources and accessibility

To be eligible for this application, research projects need to bring their own data and/or be approved for use of existing AIM-AHEAD resources, including the OCHIN Community Health Equity Database on AIM-AHEAD Service Workbench or MedStar Health EHR through the AIM-AHEAD Data Bridge (AADB). Applicants are required to provide, at a minimum:

  • Reference(s) to the data under consideration and reasons for this choice.
  • Description of the potential impact of scientific advances that could be made from AI/ML applications developed with the data.
  • Description of the proposed methods/data modalities to be used.
  • Description of how the data will be made available to AI/ML applications and researchers, for example, through NIH repositories, NIH knowledge bases, or other data sharing resources including those appropriate for controlled access data.
  • Description of how the ethical implications of data will be identified and addressed, including plans to develop and share documentation or datasheets that describe the motivation, composition, collection process and pre-processing, anticipated use cases, and other relevant information.
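
As an illustration of the datasheet documentation mentioned in the last item, the following is a minimal sketch of a datasheet record with a completeness check. The class name, field names, and sample values are hypothetical and are not an AIM-AHEAD template; applicants may structure their documentation differently.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Datasheet:
    """Hypothetical datasheet record; fields mirror the sections named in the CFP."""
    motivation: str
    composition: str
    collection_process: str
    preprocessing: str
    anticipated_use_cases: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Illustrative values only; not taken from any real dataset.
sheet = Datasheet(
    motivation="Study disparities in a primary care population.",
    composition="Patient-level EHR records with SDOH fields.",
    collection_process="Extracted from a clinical data warehouse.",
    preprocessing="",  # not yet documented
)

print(sheet.missing_sections())
```

A check like this can flag undocumented sections before a datasheet is shared alongside the data.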


Dataset options (more information on the datasets/cohorts is available below):


Dataset: A customized subset from OCHIN Community Health Equity Database
Brief Description: Primary care EHR and SDOH data from a 33-state network of community-based health centers.
Data Allowed: HIPAA Limited Data Set, patient-level data with dates and geographic information if needed for research
Size: A customized subset will be created from over 6 million records for the research question of approved projects
Analysis platform tools: AIM-AHEAD Service Workbench

Dataset: Six curated MedStar Health EHR datasets from AIM-AHEAD Data Bridge OR option for custom curated dataset from MedStar Health EHR
Brief Description: Six curated dataset options of EHR data from underrepresented communities
Data Allowed: De-identified dataset; multiple curated dataset options (see detailed data description on the AADB website)
Size: MedStar Health EHR has a population of 5+ million records; curated datasets vary in size (see AADB website for more information)
Analysis platform tools: AIM-AHEAD Data Bridge

Dataset: NIH All of Us
Brief Description: The All of Us Research Program is building one of the largest biomedical data resources of its kind. The All of Us Research Hub stores health data from a diverse group of participants from across the United States.
Data Allowed / Size: Electronic Health Records; Biosamples Received
Analysis platform tools: All of Us Researcher Workbench

Dataset: BioData Catalyst (TOPMed)
Brief Description: BioData Catalyst® (BDC) is a cloud-based ecosystem providing tools, applications, and workflows in secure workspaces. BDC allows researchers to find, access, share, store, and compute on large scale datasets.
Size: TOPMed consists of over 180,000 whole-genome sequences, of which around 60% are of predominantly non-European ancestry.
Analysis platform tools: BioData Catalyst (TOPMed)

Dataset: Selected Open datasets on AWS
Brief Description: A variety of datasets available including clinical and genomic data
Data Allowed: Public data, and controlled access data (depends on dataset)
Size: Selected Open datasets on AWS
Analysis platform tools: AIM-AHEAD Service Workbench


OCHIN Community Health Equity Database Description

OCHIN, a nonprofit health care innovation center with a core mission to advance health equity, operates the most comprehensive database on primary healthcare and outcomes of traditionally underserved patients. The OCHIN Epic EHR data warehouse aggregates electronic health record (EHR) and social determinants of health (SDOH) data representing >6 million patients from 170 health systems and 1,600 clinic sites across 33 states (4.6 million patients are ‘active,’ with a visit in the last 3 years).


Approved AIM-AHEAD projects can obtain access to up to 10 years of longitudinal OCHIN Epic ambulatory EHR data, which is research-ready on the PCORnet Common Data Model (CDM). Contributing health systems are outpatient community-based health centers, which deliver comprehensive, culturally responsive, high-quality primary health care services for communities most impacted by health disparities. This includes individuals and families experiencing poverty or houselessness, migrant agricultural workers, and veterans. Community-based health centers often provide on-site services such as dental, pharmacy, mental health, substance abuse treatment, and social work regardless of patients’ ability to pay.


The OCHIN Community Health Equity Database will be accessed on the AIM-AHEAD Service Workbench.

See the Data Dictionary of the OCHIN Community Health Equity Database on AIM-AHEAD Service Workbench.


MedStar Health AADB Data Description

MedStar Health Research Institute hosts a robust database of Electronic Health Records which can be made available to approved pilots. The MedStar Health System includes an extensive network of clinical facilities in the mid-Atlantic region, including 10 hospitals (33% rural hospitals) and over 300 points of care connected by MedStar’s EHR system, built on the Cerner Millennium platform. The system includes 5 million unique patients, approximately 31% of whom are African American. Project-specific datasets can be curated and made available for pilot use. Additionally, the AADB has curated six AI/ML-ready datasets which are fully de-identified and ready for access upon regulatory clearance.


Six AADB available curated datasets:


  • Cardiometabolic correlates and maternal health
  • COVID-19 pandemic: Cardiometabolic, cancer, and behavioral health
  • Opioid use and misuse
  • Schizophrenia data
  • Voice-Assisted Personal Assistance in Heart Failure
  • Breast and Lung Cancer Images


To learn more about these data, visit


Requirements Prior to Obtaining Access to AIM-AHEAD Curated Data (OCHIN/AADB)


  1. Mandatory Human Subjects Research Training such as CITI “Human Research (Protection of Human Subjects)” and “Responsible Conduct of Research.”
  2. Data consult with the data source (i.e., OCHIN or MedStar) to finalize data request within 30 days of award.
  3. Submission for IRB approval/determination within 60 days of award.
  4. Signed data use agreement and IRB approval/determination within 90 days of award.
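
The 30/60/90-day windows above can be turned into calendar dates once an award date is known. The sketch below is illustrative only: it assumes the windows are counted in calendar days, and it uses the CFP's earliest start date (September 1, 2023) as a hypothetical award date. The function and variable names are not part of any AIM-AHEAD tool.

```python
from datetime import date, timedelta

# Days after award for each requirement, as listed in the CFP.
MILESTONES = {
    "Data consult to finalize data request": 30,
    "Submission for IRB approval/determination": 60,
    "Signed data use agreement and IRB approval/determination": 90,
}


def access_deadlines(award_date: date) -> dict:
    """Map each requirement to its due date, assuming calendar days from award."""
    return {
        name: award_date + timedelta(days=days)
        for name, days in MILESTONES.items()
    }


# Hypothetical award date: the earliest start date named in the CFP.
deadlines = access_deadlines(date(2023, 9, 1))
for name, due in deadlines.items():
    print(f"{due.isoformat()}  {name}")
```

Under these assumptions, the three due dates fall on October 1, October 31, and November 30, 2023.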


AWS Open Datasets

More information (description and links) available here


Name of AWS open dataset:

  • EMory BrEast Imaging Dataset
  • 1000 Genomes
  • The Cancer Genome Atlas
  • Genome Aggregation Database (gnomAD)
  • UK Biobank Pan-Ancestry Summary Statistics
  • Gabriella Miller Kids First Pediatric Research Program (Kids First)
  • Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
  • Human Cancer Models Initiative (HCMI) Cancer Model Development Center
  • Cancer Genome Characterization Initiatives - Burkitt Lymphoma, HIV+ Cervical Cancer
  • Pancreatic Cancer Organoid Profiling
  • National Cancer Institute Center for Cancer Research - Diffuse Large B Cell Lymphoma (DLBCL) Genomics and Expression
  • CoMMpass from the Multiple Myeloma Research Foundation
  • The Human Connectome Project
  • Clinical Proteomic Tumor Analysis Consortium 2 (CPTAC-2)
  • Clinical Proteomic Tumor Analysis Consortium 3 (CPTAC-3)




Infrastructure Models


Applicants are required to identify the infrastructure modality to be used, the sources and types of data needed and/or used, and the computing infrastructure needed or used to carry out their research project. Applicants may also take advantage of existing AIM-AHEAD infrastructure models (optional). Applicants are requested to specify which one(s) will be used to conduct the research project:


Model A

Copy and download of data

(No infrastructure needs from AIM-AHEAD)


Model B

Local data analysis by local investigators on their own on-prem or cloud environment

(Deployed by and for AIM-AHEAD projects. All funding and deployment are from AIM-AHEAD. Governance is up to the institution)


Model C

AIM-AHEAD Centralized Hosting and computing in the cloud

(Deployed by and for AIM-AHEAD projects. All funding, deployment and governance are from AIM-AHEAD)

NIH Requirements: Establishing a Controlled Access Repository for AIM-AHEAD

Example: AIM-AHEAD Service Workbench 


Model D

AIM-AHEAD Federated and Diverse Network

(Deployed by and for AIM-AHEAD projects. All funding, deployment and governance are from AIM-AHEAD)

Example: AIM-AHEAD Federated and Diverse Network (see below) and here


Model E

Access to existing Service platform

(Accessing a platform existing prior to AIM-AHEAD. Funding, deployment and governance are not 100% from AIM-AHEAD)

Example: AIM-AHEAD Data Bridge


Infrastructure Model Description


AIM-AHEAD Service Workbench (Infrastructure model C)

Get access here to the AIM-AHEAD secure, FISMA-compliant cloud platform.


AIM-AHEAD Service Workbench on AWS (SWB) promotes infrastructure equity by providing the same analytical tools and level of access to researchers and students. The only infrastructure required to fully leverage this resource is a simple laptop and a mid-range internet connection. SWB provides researchers and students with a user-friendly environment to configure and deploy their own secure cloud-computing environment in a few clicks. SWB is the first Apache 2.0 open-source native cloud computing platform that provides a modular and scalable solution to the supply of computing environments for researchers and students. The platform supplies students and research teams with a simple web application, empowering them to easily deploy and access any cloud workspace from a custom catalog of pre-configured (and extendible) environments (R, Jupyter Notebooks, Python, IGV, GATK, etc.) leveraging all AWS advanced data analysis tools and native security controls. To get started with an analysis, end users only need to connect to the web application and select their desired configuration. The research workspace is deployed in two clicks: first selecting the type of workspace, and second selecting the most applicable configuration for the analysis in terms of instance type, memory, CPUs, and GPUs. Researchers will have access to the computing power they need, regardless of the underlying technical complexity. Moreover, a growing open community supports various SWB workspaces, which enables the deployment of any type of computing workspace.


Examples of the resources available out of the box in SWB: AWS SageMaker instances that work with widely used Jupyter Notebook formats, as well as Amazon EMR instances for deep learning (DL)/ML projects that require cluster computing. Moreover, SageMaker instances come with staple ML/DL libraries (e.g., TensorFlow, PyTorch, MXNet), allowing savvy users to get started right away. Conversely, non-AI/ML experts can explore the pre-configured computing environments, which feature many tutorials and analysis examples using public data in the form of Jupyter Notebooks, so that anyone can start learning with those resources.


SageMaker accelerates innovation with purpose-built tools for every step of ML development, including labeling, data preparation, feature engineering, statistical bias detection, auto-ML, training, tuning, hosting, explainability, monitoring, and workflows. It helps data scientists and developers to prepare, build, train, and deploy high-quality ML models quickly by bringing together a broad set of capabilities purpose-built for ML, and enables awardees to develop and serve their own models.
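
To make "statistical bias detection" concrete, the sketch below computes one common group-level check: the difference in positive prediction rates between groups (often called the demographic parity difference). It is written in plain Python and does not use or represent the SageMaker API; the function names and the toy predictions are assumptions for illustration only.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Share of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group-level positive rates."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data: binary model outputs and a hypothetical group label per record.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(positive_rate_by_group(preds, groups))
print(demographic_parity_difference(preds, groups))
```

A gap near zero means the model flags members of each group at similar rates. This is only one of several possible fairness checks, and the appropriate metric depends on the research question.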


AIM-AHEAD Federated and Diverse Network (Model D)

Model D, a federated network, is most useful when the full data source cannot be moved to a central location or leave its initial boundary (Figure 1). This federated data network allows data to stay at the local site and allows sites to opt in to participation. Having a local clinical data warehouse is a prerequisite. Participating sites may receive assistance to set up local AIM-AHEAD federated infrastructure and would be provided with centralized code to run on their data locally. Sites share ONLY aggregate data results (Aggregate Data) as they feel comfortable. Easy-to-use dockerized environments can be set up at each participating site with data science tool kits to analyze electronic health record (EHR) data. The Infrastructure Core would also provide concierge services to assist with data curation, analysis and regulatory considerations. The first investigators anticipated to use the AIM-AHEAD federated and diverse network are AIM-AHEAD funded investigators from pilot projects and research and leadership fellowships. Another advantage of Model D is its potential to recruit additional sites that serve as data providers to support AIM-AHEAD projects, addressing the need for diverse data sources to study health disparities. All source code - under an open source Apache 2.0 license - to create and establish the AIM-AHEAD Federated network is here:


The AIM-AHEAD Federated infrastructure is based on an existing, successful international consortium: 4CE: Consortium for Clinical Characterization of COVID-19 by EHR. As of March 20th, 24 articles have been published from the 4CE consortium using this federated approach. Any data source interested in analyzing its own data across the AIM-AHEAD federated network can participate. The owner or custodian of the data source will always have full control over the use of their data by participating in the governance committee, which decides on the research questions and the use of aggregate counts of the data.

Additional information is here: Deploying the Model D federated Infrastructure for the Consortium development projects
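
The federated pattern described above (centralized code run locally, with only aggregate results leaving each site) can be sketched as follows. This is a simplified illustration and not the AIM-AHEAD or 4CE code: the function names, the record layout, and the small-cell suppression threshold are all assumptions introduced for the example.

```python
from collections import Counter

# Assumed policy for illustration: a site withholds any count below this value.
MIN_CELL_SIZE = 10


def local_aggregate(records, key):
    """Run at a site: count records per category. Patient-level rows stay local."""
    counts = Counter(record[key] for record in records)
    return {category: n for category, n in counts.items() if n >= MIN_CELL_SIZE}


def combine(site_results):
    """Run centrally: sum the aggregate counts that sites chose to share."""
    total = Counter()
    for result in site_results:
        total.update(result)
    return dict(total)


# Toy patient-level data held at two hypothetical sites.
site_a = [{"dx": "diabetes"}] * 12 + [{"dx": "asthma"}] * 3
site_b = [{"dx": "diabetes"}] * 15 + [{"dx": "asthma"}] * 11

shared = [local_aggregate(site_a, "dx"), local_aggregate(site_b, "dx")]
print(combine(shared))
```

In this sketch, the central step only ever sees per-site counts, and site A's small asthma count is withheld before sharing.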

AIM-AHEAD Data Bridge (Model E)

Approved pilots will have access to a Virtual Machine (VM) where the data and infrastructure will be available. AI/ML software will be installed on the VM, which comes pre-installed with R and Python. The following R and Python packages are installed:


R: car, tidyverse, mosaic, lubridate, psych, survival, corrplot, randomForest, xgboost, rpart, rocr, nnet, pcr

Python: sklearn, numpy, pandas, tensorflow, torch, matplotlib, entropy, datetime, collections, h5py, seaborn, scipy, tableone


Investigators with specific software requests may request to have additional software be downloaded on the VM for their use. Approved pilots will coordinate with their AADB support team who will provide a concierge service to requesters to identify and address additional software needs.
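
Before submitting a software request, an investigator could check which Python modules are already importable in a given environment. The following is a minimal sketch using only the standard library. The module list is taken from the package list in this section, and it is an assumption that each listed name matches the importable module name on the VM.

```python
import importlib.util

# Names as listed for the VM; assumed to match importable module names.
LISTED_MODULES = [
    "sklearn", "numpy", "pandas", "tensorflow", "torch", "matplotlib",
    "entropy", "datetime", "collections", "h5py", "seaborn", "scipy",
    "tableone",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules reported here would be candidates for an additional software request.
print(missing_modules(LISTED_MODULES))
```

The check only looks up each module without importing it, so it runs quickly even when large libraries are installed.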
