Major Depressive Disorder (MDD) is one of the most common mental health conditions worldwide and is often characterized by periods of remission followed by recurrence. Although many patients experience improvement in symptoms with treatment, depressive episodes frequently return, creating ongoing challenges for patients, healthcare providers, and health systems. Through participation in the AIM-AHEAD Federated Network Program, Dr. Patricia Mabry has explored how electronic health record (EHR) data, machine learning, and federated research approaches can be leveraged to predict the risk of depression recurrence and support earlier clinical intervention.
The project focuses on developing a predictive model capable of estimating an individual's risk of experiencing a recurrent depressive episode within six months. By identifying patients at elevated risk while they are still in remission, the research aims to support clinical decision-making and enable preventive strategies before symptoms worsen.
Study Design and Methods
Dr. Mabry’s work is designed to draw on EHR data from multiple healthcare institutions participating in AIM-AHEAD Federated Network Cohort 1, including HealthPartners Institute, the University of the Virgin Islands, Massachusetts Eye and Ear, and Fairview Health Services. Under the federated research model, each institution retains control of its own clinical data while contributing to collaborative research. This approach lets researchers develop and evaluate predictive models across different healthcare settings without exchanging patient-level information.
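To make the federated idea concrete, the following is a minimal sketch of one possible pattern, not the project's actual implementation: each site reduces its own records to aggregate counts, and only those counts are pooled. The class `SiteSummary`, the functions `summarize_locally` and `pool_recurrence_rate`, the site labels, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate counts a site could share instead of patient-level rows (hypothetical)."""
    site: str
    remission_encounters: int
    recurrences: int


def summarize_locally(site, labels):
    """Runs inside a site: reduce 0/1 recurrence labels to two counts."""
    return SiteSummary(site, len(labels), sum(labels))


def pool_recurrence_rate(summaries):
    """Runs at the coordinator: combine site-level counts only."""
    total = sum(s.remission_encounters for s in summaries)
    events = sum(s.recurrences for s in summaries)
    return events / total if total else float("nan")


# Illustrative labels only; these are not study data.
site_a = summarize_locally("site_a", [1, 0, 0, 1])
site_b = summarize_locally("site_b", [0, 0, 0, 0, 1, 0])
pooled_rate = pool_recurrence_rate([site_a, site_b])
```

In this sketch the coordinator never sees individual labels, only the two counts per site. Real federated deployments typically add further safeguards, such as minimum cell sizes, which are omitted here.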
The target population includes adults diagnosed with Major Depressive Disorder who are in remission at the time of a clinical encounter, with remission defined as a Patient Health Questionnaire-9 (PHQ-9) score below 10. Researchers are evaluating a broad range of factors that may influence the risk of recurrence, including demographic characteristics, insurance status, social deprivation measures, healthcare utilization, antidepressant medication use and adherence, comorbid conditions, tobacco and alcohol use, and prior depression history.
The study incorporates a twelve-month look-back period to assess clinical characteristics prior to remission and a six-month risk window to identify recurrent depressive episodes. By examining repeated remission encounters over time, the project seeks to better understand how patterns captured within EHR data may predict future symptom recurrence.
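The windowing logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's cohort code: months are approximated in days, recurrence is assumed to mean any later PHQ-9 score at or above the remission cutoff within the risk window, and the function name `build_examples` and the sample scores are hypothetical.

```python
from datetime import date, timedelta

REMISSION_CUTOFF = 10              # from the text: remission is PHQ-9 < 10
LOOKBACK = timedelta(days=365)     # assumption: twelve months approximated as 365 days
RISK_WINDOW = timedelta(days=182)  # assumption: six months approximated as 182 days


def build_examples(phq9_scores):
    """Turn one patient's (date, PHQ-9 score) pairs into labeled remission encounters."""
    scores = sorted(phq9_scores)
    examples = []
    for index_date, score in scores:
        if score >= REMISSION_CUTOFF:
            continue  # not a remission encounter, so not an index date
        risk_end = index_date + RISK_WINDOW
        # Assumption: recurrence = any later score at or above the cutoff in the window.
        recurred = any(
            index_date < d <= risk_end and s >= REMISSION_CUTOFF
            for d, s in scores
        )
        examples.append({
            "index_date": index_date,
            "lookback_start": index_date - LOOKBACK,  # features would be drawn from here
            "risk_end": risk_end,
            "recurrence": recurred,
        })
    return examples


# Illustrative scores only; these are not study data.
patient = [(date(2023, 1, 10), 4), (date(2023, 4, 1), 12), (date(2023, 9, 1), 6)]
examples = build_examples(patient)
```

Each remission encounter becomes its own example, which mirrors the repeated-encounter design: the same patient can contribute several index dates, each with its own look-back and risk window.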
To support this work, the team used RapidMLReady, a study design and cohort development platform built to streamline the creation of machine learning-ready research studies. The platform helps researchers separate study design decisions from programming implementation, supports collaboration across different data environments, and improves reproducibility and analytic consistency across participating sites. By automating key aspects of cohort specification and study configuration, RapidMLReady reduces the need to repeatedly translate study requirements into site-specific analytic workflows.
Key Findings
A major outcome of the project has been the successful development of a scalable framework for conducting predictive modeling studies across multiple healthcare systems. Using RapidMLReady, researchers standardized study definitions, cohort construction, observation periods, and feature selection processes while maintaining flexibility for local data environments.
The project also demonstrated that machine learning study designs can be translated into reproducible workflows and implemented consistently across participating institutions. This capability is particularly important for federated research networks, where differences in data systems and analytic processes can create barriers to large-scale collaboration.
The proposed predictive framework incorporates an extensive set of clinical, behavioral, and social factors that may contribute to the recurrence of depression. Future model development efforts will evaluate a range of machine learning approaches, including Random Forest, gradient boosting methods such as XGBoost, and deep learning, to identify strategies that provide strong predictive performance while maintaining clinical relevance.
Beyond the depression recurrence use case, the project demonstrated the value of reusable research infrastructure that can support future studies across multiple disease areas. By creating tools and workflows that can be adapted to new research questions, investigators can reduce development time, improve consistency across studies, and facilitate broader participation in collaborative research efforts.
Implications for Research and Clinical Practice
Predicting depression recurrence represents an important opportunity to move from reactive care toward more proactive mental health management. If patients at elevated risk can be identified before symptoms return, healthcare providers may be able to increase monitoring, adjust treatment plans, or connect patients with additional support resources before a depressive episode develops.
The project also highlights the growing role of federated research networks in advancing artificial intelligence and machine learning applications in healthcare. By enabling collaboration without requiring the transfer of patient-level data, federated approaches can support large-scale research while maintaining privacy protections and institutional governance requirements.
In addition, RapidMLReady illustrates how reusable tools and infrastructure can help make multisite research more efficient, reproducible, and scalable. These capabilities may lower barriers to participation in collaborative research and support future efforts to develop clinically relevant AI applications across a variety of healthcare settings.
Recognition and Next Steps
Beyond advancing research on depression recurrence, the project has contributed to the development of sustainable infrastructure for future federated studies. Key accomplishments include the establishment of durable governance processes, scalable research design approaches, reusable analytic tools and infrastructure, and expanded workforce capacity to support collaborative AI and machine learning research.
Future efforts will focus on feature extraction, model development, and performance evaluation using a variety of machine learning approaches. Researchers plan to assess predictive performance using measures such as sensitivity, precision, and the area under the receiver operating characteristic curve (AUROC), while exploring strategies for implementing predictive models in federated research environments.
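For readers less familiar with these measures, the sketch below shows how they can be computed from binary labels and model scores using only the Python standard library. It is an illustration, not the project's evaluation code: the function names, the 0.5 decision threshold, and the example values are assumptions, and AUROC is computed here by direct pairwise comparison, which is simple but slow for large cohorts.

```python
def sensitivity_precision(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); precision = TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only; these are not study results.
y_true = [1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.35, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in risk_scores]  # assumed threshold of 0.5

sens, prec = sensitivity_precision(y_true, y_pred)
auc = auroc(y_true, risk_scores)
```

Sensitivity and precision depend on the chosen threshold, while AUROC summarizes ranking quality across all thresholds, which is one reason the measures are usually reported together.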
By combining clinical informatics expertise, machine learning methodologies, and principles of federated research, Dr. Mabry and her collaborators are advancing approaches that may ultimately support earlier identification of patients at risk of depression recurrence and inform preventive care strategies.