By David Dastvar, Linda Hermer, Ph.D., and Ratha Pech, M.S., Ph.D.
In the wake of the COVID-19 pandemic, the Federal Government has many interests in using machine learning to model the nation’s behavioral health. Problems of interest range from determining the number of individuals expected to develop depression or anxiety in response to COVID-19 to anticipating which children will lose their motivation to learn after months of online learning.
This article discusses how machine learning can be applied to major behavioral health problems of interest to the Federal Government and offers advice on building models that are standardized, robust and fair.
Machine Learning Approaches
There are three subtypes of machine learning: (1) supervised learning, (2) unsupervised learning, and (3) reinforcement learning.
Supervised learning models are trained by mapping input data features to known target values; the trained models are then used to predict the target values of new or test data. In contrast, in unsupervised learning the training data do not contain target values. In reinforcement learning, the algorithm uses trial and error to arrive at a solution to the problem, and it is either rewarded or punished on each attempt. Programmed to maximize reward, the algorithm starts from random guesses and, through training, arrives at sophisticated tactics.
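The trial-and-error loop of reinforcement learning can be illustrated with a minimal sketch of an epsilon-greedy agent, one of the simplest strategies of this kind. The three actions, their success probabilities, and the function name run_bandit are hypothetical and chosen only for illustration.

```python
import random


def run_bandit(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent that learns by trial and error which action pays off.

    success_probs holds hypothetical chances that each action is rewarded (+1)
    rather than punished (-1). The agent never sees these numbers directly.
    """
    rng = random.Random(seed)
    n_actions = len(success_probs)
    counts = [0] * n_actions    # how often each action has been tried
    values = [0.0] * n_actions  # running average reward for each action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: make a random guess.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the best average reward so far.
            action = max(range(n_actions), key=lambda a: values[a])

        reward = 1 if rng.random() < success_probs[action] else -1
        counts[action] += 1
        # Incremental update of the running average for the chosen action.
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


values, counts = run_bandit([0.2, 0.5, 0.8])
print("average reward per action:", [round(v, 2) for v in values])
print("times each action was tried:", counts)
```

The agent begins with no knowledge, tries actions at random a fraction of the time, and gradually concentrates on the action with the highest average reward.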
The Advantages of Cloud-based Machine Learning
Performing machine learning in the cloud offers many advantages. Each major cloud provider offers machine learning algorithms as a managed service optimized for its own hardware, so running models in the cloud is often more efficient than running them on premises. The latter requires setting up an end-to-end lab along with a large infrastructure and substantial computing and storage capacity. Running the models in the cloud also makes it easy to experiment with machine learning capabilities, arriving at relatively sophisticated models without building them from scratch. Cloud-based machine learning also supports a flexible machine learning operations (MLOps) process with continuous integration and continuous delivery or continuous deployment (CI/CD) of the machine learning models.
Numerous open-source machine learning frameworks can run on premises. However, training real-world models with big data typically requires high computing power and parallel or distributed computing, and that infrastructure is time-consuming, labor-intensive, and expensive to build and maintain on one’s own. Pay-per-use cloud services are a good alternative: they provide the speed and power of graphics processing units (GPUs) for training without an up-front investment in hardware.
Binary Classification: Predicting the Success of Mental Health or Substance Use Treatment
Machine learning can be employed to predict major behavioral health outcomes of interest, such as individual patients’ success in particular treatment programs.
Examples for Federal Health
The Substance Abuse and Mental Health Services Administration (SAMHSA) and the Assistant Secretary for Planning and Evaluation (ASPE) could use machine learning to predict which patients will succeed in each treatment program. They could use the results to maximize positive treatment outcomes, identify the predictors of treatment effectiveness, improve service delivery, reduce unnecessary spending, and promote evidence-based therapies. Given that these agencies have enormous patient databases, they would likely fare best by performing these predictions in the cloud.
Treatment success could be modeled as a binary outcome using such supervised learning models as random forests and extreme gradient boosting. Patient characteristics, including demographic and clinical variables combined with genetic and biomarker data when available, could be used to train these supervised-learning models.
For an agency like the Centers for Medicare and Medicaid Services (CMS), which oversees care for patients, a successful model could be used to predict treatment success for the most recent cohort of patients, and their treatment plans could be adjusted accordingly.
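A minimal sketch of this kind of binary classifier follows, assuming the scikit-learn library and purely synthetic data in place of real patient records. Scikit-learn's GradientBoostingClassifier is used here as a stand-in for extreme gradient boosting, which is typically provided by a separate library; the held-out rows play the role of a new patient cohort.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for patient records: 8 numeric features and a binary
# outcome (1 = treatment succeeded). Real data would need cleaning and
# encoding before it could be used this way.
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=5, n_redundant=1, random_state=0
)

# Hold out a quarter of the rows to act as the "most recent cohort".
X_train, X_new, y_train, y_new = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Predicted probability of treatment success for each held-out patient.
    p_success = model.predict_proba(X_new)[:, 1]
    auc[name] = roc_auc_score(y_new, p_success)
    print(f"{name}: held-out ROC AUC = {auc[name]:.3f}")
```

Evaluating on rows the model never saw during training gives a more honest estimate of how it would perform on a future cohort.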
Binary Classification: Predicting At-risk Individuals
Machine learning can be used to anticipate which Veterans will develop PTSD after deployment so they can be treated proactively, a capability of great interest to the VA. Shen et al. (2017) developed a simple questionnaire for soldiers that predicted post-deployment psychological problems. Applied to responses from this questionnaire, machine learning would likely predict these problems more accurately than the a priori statistical techniques Shen et al. used.
Determining whether specific individuals would likely develop an opioid use disorder if prescribed opioids for chronic pain is a binary classification problem that would be of great interest to CMS and the VA. Such information could be used to determine which patients should only be prescribed other analgesics. Additionally, for ASPE, the single and combined predictors of susceptibility to opioid addiction could be determined from the models using feature selection techniques, such as filter and wrapper methods, and then used to provide guidance on types of patients who should not be prescribed opioids.
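The difference between filter and wrapper methods can be sketched as follows, assuming scikit-learn and synthetic data in which only the first three columns carry signal. The feature names are placeholders, not real clinical variables.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic data: with shuffle=False, columns 0-2 carry the signal and
# columns 3-9 are pure noise.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=3,
    n_redundant=0,
    n_clusters_per_class=1,
    shuffle=False,
    random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Filter method: score each feature on its own (ANOVA F-test) and keep the top 3.
filter_selector = SelectKBest(score_func=f_classif, k=3).fit(X, y)
filter_selected = [
    feature_names[i] for i in filter_selector.get_support(indices=True)
]

# Wrapper method: repeatedly fit a model and drop the weakest remaining feature.
wrapper_selector = RFE(
    LogisticRegression(max_iter=1000), n_features_to_select=3
).fit(X, y)
wrapper_selected = [
    feature_names[i] for i in wrapper_selector.get_support(indices=True)
]

print("filter method kept: ", filter_selected)
print("wrapper method kept:", wrapper_selected)
```

Filter methods are fast because they never train a model, while wrapper methods cost more computation but can account for how features behave together.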
Clustering: Grouping Similar Depression Cases
A first step in segmenting the currently unknown subtypes of depression, so that they can be better matched to antidepressants, would be to perform clustering—a type of unsupervised learning—on depression patients to determine the number of depression subtypes. Clustering models find similarities among patient data and group patients accordingly. Furthermore, similarities among patients in the same cluster could be identified, as well as differences in patients across clusters. This information could be used to help select antidepressants at CMS and the VA, and it could be used by SAMHSA to promote evidence-based treatments. It could also be used in federally funded research to develop more effective antidepressants.
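One common way to estimate the number of clusters is to fit a clustering model for several candidate counts and compare a quality score. The sketch below assumes scikit-learn, k-means clustering, and the silhouette score; the data are synthetic points generated around three made-up centers, not real patient measurements.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for two symptom scores per patient, generated around
# three made-up group centers. Real features would usually need scaling first.
centers = [[0.0, 0.0], [8.0, 0.0], [4.0, 7.0]]
X, _ = make_blobs(n_samples=600, centers=centers, cluster_std=1.0, random_state=0)

# Try several candidate numbers of clusters and score each partition.
silhouette_by_k = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    silhouette_by_k[k] = silhouette_score(X, labels)

# The highest silhouette score suggests the best-separated grouping.
best_k = max(silhouette_by_k, key=silhouette_by_k.get)
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)

for k, score in silhouette_by_k.items():
    print(f"k={k}: silhouette={score:.3f}")
print("suggested number of clusters:", best_k)
```

The cluster centers in final_model.cluster_centers_ summarize the typical profile of each group, which is the starting point for comparing patients within and across clusters.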
Regression: Predicting the Mental Health Consequences of COVID-19 Policy Decisions
Machine learning would have been of great use during the pandemic to predict mental health sequelae of pandemic policy decisions. For example, nursing home residents were denied in-person visits to reduce COVID-19 mortality. Regression—a supervised machine learning technique—could have been used to model the number of residents expected to face more severe dementia and risk of death as CMS weighed the policy on nursing home closure to visitors.
Regression could also be used to predict the number of U.S. residents who would develop depression or anxiety if particular lockdown and social distancing measures were mandated. The Assistant Secretary for Preparedness and Response (ASPR) would be particularly interested in such findings.
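A minimal regression sketch of this idea follows, assuming NumPy and scikit-learn. The policy variables, the outcome, and the relationship between them are all invented so that the example has something to learn; none of the numbers reflect real findings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_regions = 500

# Hypothetical policy variables for each region.
lockdown_weeks = rng.uniform(0, 12, n_regions)
distancing_index = rng.uniform(0, 1, n_regions)

# Invented relationship plus noise: new depression cases per 1,000 residents.
cases_per_1000 = (
    20 + 1.5 * lockdown_weeks + 10 * distancing_index + rng.normal(0, 2, n_regions)
)

X = np.column_stack([lockdown_weeks, distancing_index])
X_train, X_test, y_train, y_test = train_test_split(
    X, cases_per_1000, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))

# Predict the outcome for a policy scenario under consideration:
# 8 weeks of lockdown at a distancing index of 0.5.
scenario = np.array([[8.0, 0.5]])
predicted = model.predict(scenario)[0]

print(f"held-out mean absolute error: {mae:.2f} cases per 1,000")
print(f"predicted cases per 1,000 for the scenario: {predicted:.1f}")
```

Multiplying the predicted rate by the affected population would give the expected number of cases for a candidate policy, which could then be compared across alternatives.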
Takeaways for Designing Trustworthy Machine Learning Models
It is paramount that models used in the public sector, particularly in the mental health sphere, assess individual people fairly and without bias. Given that White individuals are more often diagnosed with depression, a model might achieve relatively high overall accuracy merely by predicting which White individuals will develop depression, while performing poorly for everyone else.
To avoid this, the models need to be trained on large, diverse data sets with variables including race, ethnicity, and nationality. Moreover, cases where persons of color develop depression should be up-sampled during training.
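Both ideas can be sketched in a few lines of standard-library Python: up-sampling positive cases from under-represented groups, and checking how well a model catches true cases within each group instead of relying on overall accuracy. The function names, the group labels "A" and "B", and the counts are hypothetical.

```python
import random
from collections import defaultdict


def upsample_positives(rows, group_key, label_key, seed=0):
    """Resample positive cases with replacement so that every group has as
    many positive cases as the group with the most."""
    rng = random.Random(seed)
    positives = defaultdict(list)
    for row in rows:
        if row[label_key] == 1:
            positives[row[group_key]].append(row)
    if not positives:
        return list(rows)
    target = max(len(cases) for cases in positives.values())
    extra = []
    for cases in positives.values():
        extra.extend(rng.choices(cases, k=target - len(cases)))
    return list(rows) + extra


def recall_by_group(y_true, y_pred, groups):
    """Share of true positive cases the model catches, computed per group."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            totals[group] += 1
            hits[group] += int(pred == 1)
    return {group: hits[group] / totals[group] for group in totals}


# Made-up training set: group B has far fewer recorded depression cases.
rows = (
    [{"group": "A", "depressed": 1}] * 40
    + [{"group": "A", "depressed": 0}] * 60
    + [{"group": "B", "depressed": 1}] * 10
    + [{"group": "B", "depressed": 0}] * 90
)
balanced = upsample_positives(rows, "group", "depressed")
positives_b = sum(1 for r in balanced if r["group"] == "B" and r["depressed"] == 1)

# Made-up predictions from a model that only catches cases in group A.
recalls = recall_by_group(
    y_true=[1, 1, 1, 1, 0, 0],
    y_pred=[1, 1, 0, 0, 0, 0],
    groups=["A", "A", "B", "B", "A", "B"],
)
print("positive cases in group B after up-sampling:", positives_b)
print("recall by group:", recalls)
```

Up-sampling should be applied only to the training data, so that the evaluation data still reflect the population the model will actually encounter.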
Additionally, the model algorithms should come from a standard set for a given modeling purpose. The National Institute of Standards and Technology (NIST) has been compiling a catalog of standard machine learning models to be used for each major type of modeling problem.
Appeal to the Federal Government
Behavioral health data are collected continuously across the U.S. These data are the main resource supporting mental health surveillance, response, budgeting, planning, policymaking, and research. With advanced cloud technologies and the ability of machine learning techniques to learn complex, nonlinear patterns from these data, the Federal Government can address the nation’s urgent behavioral health problems more effectively.
About the Authors
David Dastvar serves as Chief Growth Officer at Eagle Technologies, Inc. Mr. Dastvar’s 29 years of experience with public sector and Fortune 1000 companies (including CSC/GDIT, Infosys, CDI, and Northrop Grumman) is invaluable in developing and managing professional services and solutions for enterprise-level programs that require program management acumen and technical expertise. Mr. Dastvar is committed to delivering end-to-end solutions designed and implemented to meet clients’ specific and unique requirements, leveraging existing investments while laying the groundwork for technology modernization and expansion. Mr. Dastvar can be reached at firstname.lastname@example.org.
Linda Hermer, Ph.D., leads the Research Team at Eagle Technologies. Dr. Hermer’s undergraduate degree is from Harvard University in Neurobiology and Linguistics, and her doctoral degree is from Cornell University in Psychology. An accomplished neuroscientist and cognitive psychologist, 10 years ago Dr. Hermer dedicated the second half of her career to improving public health. Since then, she has leveraged her technical skills as a biostatistician, senior scientist, and research director, endeavoring to modernize public health and social science research at universities, nonprofit organizations and for-profit firms.
Ratha Pech, M.S., Ph.D., is a data scientist at Eagle Technologies. Dr. Pech obtained his bachelor’s and master’s degrees in Computer Science from Norton University, Cambodia, and Sichuan University, China, respectively. He earned his doctoral degree in Engineering from the University of Electronic Science and Technology of China. Dr. Pech has devoted his career to applying data analytics and machine learning, from exploring data for insights to building predictive models that solve problems, with the goal of enabling powerful, data-driven decisions.
Eagle Technologies, Inc. (Eagle), founded in 2004, is a small business based in Arlington, VA, that specializes in developing individualized technological solutions for federal government and corporate clients nationwide. Our IT expertise includes infrastructure, enterprise architecture, cybersecurity, privacy systems, cloud-based services, application development, and mobile solutions. We also have a deep bench of experience in data analytics and reporting, database operations, grants management, and business intelligence services. We have extensive involvement in healthcare-related programs for the Department of Health and Human Services.
Shen, Y.-C., Arkes, J., and Lester, P.B. (2017). BMC Psychology, 5:32.