The Promise of Big Data Imaging for Mental Health
Knowledge gleaned from big data and advances in neuroimaging have provided new insights into the workings of the brain. Our author, founding director of the Center for Translational Research in Neuroimaging and Data Science, traces the evolution of these two fields.

Illustration by Daniel Hertzberg
Mental and degenerative disorders are among the most costly and common causes of disability in society today. Because the brain is the most complex organ in the human body, diagnosing and treating problems when things go wrong poses enormous challenges. Even before the 1990s was designated the Decade of the Brain, the potential of neuroimaging—the technology that makes it possible to see inside the working brain—was a major focus in psychiatry. Since that time, expectations have been high that neuroimaging would move the needle forward in unraveling the mystery of mental illness.
Living in a time of such national programs as the BRAIN Initiative and Human Connectome Project, we’ve become accustomed to hearing terms such as artificial intelligence, virtual reality, and brain-machine interface. In light of the exponential increase in computational and algorithmic power, one can only assume that we have made great progress via psychiatric neuroimaging. But just how far have we actually come in the last half-century? How close are we to having neuroimaging-based tools that can be used in the clinic? Have we learned anything about diagnosis or mental illness from the vast trove of neuroimaging data that has been collected over the years? I wish I could point to specific examples where the widening use of neuroimaging is beginning to help the mentally ill, but we are just not there yet.
As an undergraduate electrical engineering student at the University of Kansas in 1991, I became enthralled with magnetic resonance imaging (MRI), which makes it possible to see inside the body, noninvasively and safely, using magnetic fields and radio waves. This was the beginning of a career in brain imaging that led me to spend 12 years in a psychiatry department at Johns Hopkins where, I like to joke, I was an engineer “studying the psychiatrists.” In reality, it was a fascinating experience, as I learned the complexity of the problems under study, saw the potential for applying engineering and signal processing principles, and gravitated to the emerging specialty of psychiatric neuroimaging.
It didn’t take me long to realize that the study of psychiatric disorders is extremely difficult, in part because the assessment of the individual is based mainly on their medical history and the symptoms they are experiencing at that moment. And while these assessments can be reliable up to a point, they largely lack biological validity. If someone is feeling depressed, for example, they may be diagnosed with depression. If, however, a few months later they are experiencing hallucinations or their thoughts are confused, the diagnosis is likely to be schizophrenia or bipolar disorder. There are no objective tests to confirm or rule out any of these disorders. Nor are there cures, although existing treatments may mitigate symptoms. But even determining appropriate treatment can be a formidable challenge, often requiring a sequence of attempts; the wrong treatment can muddle diagnosis, and in some cases the wrong medication can make a condition worse.
The Promise of Magnetic Imaging
MRI has been used to visualize the size (i.e., volume) of various brain structures for over 40 years now. And for almost 30 years we have been able to see fluctuations in brain activity using a technique called functional MRI (fMRI), which tracks magnetic signals that reflect changes in blood flow and oxygenation level associated with neuronal activity. The discovery of fMRI raised hopes for a clinical breakthrough in assessment and treatment. Beyond the challenges that lay before them, the psychiatric research community saw the potential that neuroimaging held for helping patients whose lives were disrupted by a malfunctioning brain.
We believed we were on the verge of better understanding how the brain worked, and developing tools and treatments to address what are now understood to be brain disorders, including depression, schizophrenia, bipolar disorder, and attention-deficit/hyperactivity disorder. Using neuroimaging to identify disrupted brain regions could provide information useful for developing and evaluating new treatments, enable prediction of individual response to such treatments, and help subtype individuals (e.g., distinguishing schizophrenia from schizoaffective disorder). The obstacles, however, turned out to be more formidable than expected, and progress slower than we would have liked. Our challenges and successes might be best understood through the evolution of five different eras.
The “Small N” Era
In the early days of neuroimaging, studies used a small number of subjects (called “small Ns”). Even though structural and functional MRI studies in this era were quite small (e.g., only 5 to 20 subjects), each promised a potentially revolutionary finding. Initial functional studies had individuals perform a number of carefully designed tasks and tracked how the brain responded to these tasks. But while many studies highlighted specific deficits associated with various mental disorders, differences tended to be relatively small and most studies lacked sufficient controls for the many possible confounds, such as medication or patient movement within the scanner.
Studies that followed often yielded promising findings that pointed toward various brain regions that appeared to be important contributors to brain disorders or symptoms. Many of these findings turned out to be hard to replicate. In some cases, the tasks were difficult for those impacted by mental illness to perform, which made it hard to know if the observed changes were due to the disorder itself or were different simply because the tasks were not being performed. So, the small N problem became a large problem, compounded by the heterogeneity of mental disorders.
To better understand what researchers faced, consider the development of a vaccine for Covid-19, where 30,000 individuals are required for a study. Testing the efficacy and safety of a vaccine is a much simpler problem than treating depression, for example, since the outcome measures are clear (someone gets ill or not). With psychiatric disorders, by contrast, we are simultaneously trying to clarify the diagnosis and understand how the brain is impacted—which regions and what mechanisms are involved.
Despite the challenges, we learned much about how to model neuroimaging data during the small N era, including how to efficiently administer tasks, control for statistical complexities, and compare data. The “dead salmon” paper, which was presented at the Human Brain Mapping conference in 2009, light-heartedly highlighted the already well-known statistical corrections that are needed when studying brain images. The presenters were making an important scientific point regarding the “multiple comparisons problem.” If one does a lot of different statistical tests, some of them will, just by chance, give interesting results. With this, we gained important information about how mental illness impacts the brain. But this was not enough.
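The multiple comparisons problem is easy to see in a small simulation. What follows is a minimal sketch, not the analysis from the “dead salmon” paper: it assumes 10,000 hypothetical “voxels” containing nothing but random noise, two groups of 10 subjects each, and a simple z-test with known variance. The function name and all parameters are illustrative, and the Bonferroni correction shown is only one of several corrections in common use.

```python
import random
from statistics import NormalDist, mean


def simulate_null_tests(n_tests=10_000, n_per_group=10, alpha=0.05, seed=0):
    """Compare two groups of pure noise n_tests times and count 'significant' results."""
    rng = random.Random(seed)
    norm = NormalDist()
    # Standard error of a difference of two means when every value is N(0, 1).
    se = (2.0 / n_per_group) ** 0.5
    p_values = []
    for _ in range(n_tests):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / se
        p_values.append(2 * (1 - norm.cdf(abs(z))))  # two-sided p-value
    # No correction: every test is judged against alpha on its own.
    uncorrected = sum(p < alpha for p in p_values)
    # Bonferroni: divide alpha by the number of tests performed.
    bonferroni = sum(p < alpha / n_tests for p in p_values)
    return uncorrected, bonferroni


uncorrected, bonferroni = simulate_null_tests()
print(f"'significant' without correction: {uncorrected}")
print(f"'significant' after Bonferroni:   {bonferroni}")
```

Because there is no real effect anywhere in the simulated data, every uncorrected “hit” is a false positive. Roughly 5 percent of the tests should pass the uncorrected threshold, while the corrected threshold should let almost none through.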
The “Large Group” Era
In the next phase of neuroimaging (referred to as the “large group” era), researchers focused on increasing group sizes (typically to hundreds of individuals or more) to increase confidence that detected effects were not just due to noise and to better characterize the heterogeneity of psychiatric disorders. During this era, we learned with greater confidence which brain regions were activated by tasks, as well as the degree to which regional responses differed in those with mental illness (e.g., auditory oddball or working memory task deficits in schizophrenia).
Likewise, we learned more about how brain structure was impacted by disease (e.g., individuals with schizophrenia consistently show reduced temporal lobe and medial frontal gray matter). Approaches based on meta-analysis offered tools to pool results from many small studies, in order to provide more reliable statistical summaries. During this era, the focus was still very much on isolating specific brain regions, rather than considering the brain as a highly interconnected system.
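The pooling idea behind meta-analysis can be sketched in a few lines. This is an illustrative example only, assuming the simplest (fixed-effect, inverse-variance) model; the effect sizes and standard errors below are made-up numbers, not results from any study, and published meta-analyses often use more elaborate random-effects models.

```python
def pool_fixed_effect(effects, std_errors):
    """Inverse-variance weighted average of per-study effect estimates."""
    # More precise studies (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se


# Hypothetical small studies: each reports an effect size and its standard error.
study_effects = [0.8, 0.1, 0.5, 0.3]
study_std_errors = [0.40, 0.35, 0.45, 0.20]

pooled_effect, pooled_std_error = pool_fixed_effect(study_effects, study_std_errors)
print(f"pooled effect: {pooled_effect:.2f} (standard error {pooled_std_error:.2f})")
```

The pooled standard error is always smaller than that of any single contributing study, which is the sense in which pooling gives a more reliable summary.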
The “Network” Era
A major shift occurred when our attention turned to brain networks, both at rest and during task performance (referred to as the “network” era). The mathematical tools used to study networks in general can also be applied to networks of brain regions. The same approaches Google and Facebook use to leverage the concept of networks to improve search engines and social interactions began to be applied to the brain, which can be thought of as a network of networks. The idea was to identify specific networks of brain regions that are linked to, or correlated with, one another. Among other advantages, this approach enabled us to study the brain even when individuals were resting, rather than performing specified tasks.
By mitigating the formidable challenge of ensuring that everyone was performing the same task the same way, the approach allowed researchers to scale up to much larger numbers, combining data across many imaging centers. In addition, fancy new analytical tools that can assess brain activity coming from many regions at once (e.g., multivariate methods and techniques based on graph theory), similar to those used by many software engineering companies, were introduced and are now being used with increasing regularity. But although these approaches have taught us much about how psychiatric disorders impact brain connectivity, they have not yet led to clinical tools. For that we need more precise and individualized information.
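The basic idea of finding regions whose activity is correlated can be illustrated with simulated signals. This is a toy sketch under stated assumptions: four hypothetical regions (A1, A2, B1, B2), two hidden “network” signals, and a simple correlation threshold chosen for illustration. Real connectivity analyses involve far more regions, preprocessing, and more sophisticated multivariate methods.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_regions(n_timepoints=300, seed=1):
    """Four hypothetical regions: A1/A2 share one hidden signal, B1/B2 another."""
    rng = random.Random(seed)
    network_a = [rng.gauss(0, 1) for _ in range(n_timepoints)]
    network_b = [rng.gauss(0, 1) for _ in range(n_timepoints)]

    def with_noise(signal):
        # Each region sees its network's signal plus its own independent noise.
        return [value + rng.gauss(0, 0.5) for value in signal]

    return {
        "A1": with_noise(network_a),
        "A2": with_noise(network_a),
        "B1": with_noise(network_b),
        "B2": with_noise(network_b),
    }


def connectivity_edges(series, threshold=0.5):
    """Return region pairs whose absolute correlation meets the threshold."""
    names = sorted(series)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(series[first], series[second])
            if abs(r) >= threshold:
                edges.append((first, second, round(r, 2)))
    return edges


edges = connectivity_edges(simulate_regions())
for first, second, r in edges:
    print(f"{first} <-> {second}: r = {r}")
```

No task is involved anywhere in this sketch: the grouping of regions is recovered purely from how their signals rise and fall together, which is why the same logic can be applied to resting data.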
The “Prediction” Era
The vast majority of brain imaging researchers focus on describing central tendencies and group results, rather than findings with individual subjects, and the case studies that are reported typically concentrate on explaining the data at hand rather than predicting unseen data from an individual (i.e., Can I use brain imaging to predict a future diagnosis, or to determine if that individual will respond well to a certain medication?). While this may seem like a small distinction, it is quite critical, as the results for these two approaches (studying averages versus studying individuals) often differ. A focus on individual level prediction and forecasting of future trajectories relevant to an individual person is arguably the most important goal if brain imaging is to translate into practical solutions to improve the quality of life and enhance technological development.
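The gap between explaining the data at hand and predicting unseen data can be made concrete with a deliberately extreme example. This is a minimal sketch, assuming pure-noise “brain features,” arbitrary labels, and a simple nearest-centroid classifier chosen only for brevity; all names and sizes are hypothetical. Since the features carry no information about the labels, any apparent accuracy on the training subjects is an artifact of fitting.

```python
import random


def make_noise_dataset(n_subjects, n_features, rng):
    """Features are pure noise; labels alternate 0/1 and carry no real signal."""
    features = [[rng.gauss(0, 1) for _ in range(n_features)]
                for _ in range(n_subjects)]
    labels = [i % 2 for i in range(n_subjects)]
    return features, labels


def fit_centroids(features, labels):
    """Average the feature vectors of each class (nearest-centroid 'training')."""
    centroids = {}
    for label in sorted(set(labels)):
        rows = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = [sum(column) / len(rows) for column in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, features, labels):
    hits = sum(predict(centroids, f) == l for f, l in zip(features, labels))
    return hits / len(labels)


rng = random.Random(42)
train_x, train_y = make_noise_dataset(20, 500, rng)   # small study, many features
test_x, test_y = make_noise_dataset(400, 500, rng)    # new, unseen individuals

model = fit_centroids(train_x, train_y)
train_acc = accuracy(model, train_x, train_y)
test_acc = accuracy(model, test_x, test_y)
print(f"accuracy on the subjects used for fitting: {train_acc:.2f}")
print(f"accuracy on unseen subjects:               {test_acc:.2f}")
```

The model should look nearly perfect on the subjects it was fitted to and perform at about chance on new individuals, which is why prediction studies evaluate models on data that played no part in fitting.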
Prediction studies typically utilize advanced computational approaches and algorithms that can learn from data (i.e., machine learning). The field has experienced a large growth in studies using machine-learning approaches to make individualized predictions (i.e., informed guesses) of symptoms, cognitive scores, medication response information, and more. Studies of brain function and structure that focus on these estimates have revealed whole brain patterns that show potential to be able to predict and identify mental disorders and to predict treatment response (e.g., using resting fMRI to predict response to antidepressants versus mood stabilizers in adolescents with mood disorders).
Beyond this, these patterns also appear to be predictive of risk for psychiatric disorders. However, we are still only scratching the surface in the prediction era, as these approaches tend to require large amounts of data, and algorithms are still not able to learn well without a good “ground truth” (i.e., if we already know the answer it is not hard to train a computer to recognize it, but with mental illness we are still unsure about even the diagnostic categories). Existing techniques with the data we have are still unable to tell us clearly whether the psychiatric diagnoses accurately reflect the underlying disorder. Thankfully, algorithms are getting smarter, and more data are arriving all the time.
The Era of Big Data
We are now firmly in the era of big data for neuroimaging and psychiatry. The number of large, shared data sets has dramatically increased over the past few years. Several studies, e.g., Adolescent Brain Cognitive Development and UK Biobank, are scanning tens of thousands of individuals over time (these are mostly individuals without psychiatric problems, but the assessed measures can be used to study psychiatric issues as a spectrum).
There is interest in considering mental illness as manifested by otherwise everyday human traits that lie outside typical ranges of behavior and process. For example, someone may be anxious about an upcoming deadline, but when anxiety becomes constant and independent of circumstances, it impacts quality of life and might be considered a psychiatric disorder.
Studying mental illness requires vast amounts of data and flexible models that can handle all its complexity. A promising class of models (deep learning models), like those that were used to beat international experts in the game of Go, or that Alexa uses to recognize your commands, has been shown to be very powerful. But these models also require a lot of data. The application of deep learning (deep artificial neural networks) to neuroimaging has shown great promise and will likely be a major force in advancing our knowledge and understanding of the data. These approaches require considerable computational resources as the complexity of models and the amount of data continue to grow.
Challenges, Testing, and Discovery
Given that the criteria used to decide who has a mental disorder are largely based on self-reported symptoms and not biologically based, we have a wicked chicken-and-egg problem. Do predictions of mental disorder based on non-biological characterizations provide useful and actionable information? The fact is that the brain is incredibly complex; psychiatric illness is complex and multifaceted, with overlap among existing diagnoses; and prediction models are relatively simple. This creates a perfect storm of difficulty for computational approaches and can lead to a “garbage in, garbage out” problem: We are trying to predict something that is not well defined in the first place. At the same time, the dual challenge of unclear diagnostic categories and the desire to make predictions presents an opportunity.
The use of deep neural network models offers the flexibility to capture relationships that are not yet well understood, including dynamic changes and multimodal contributions. The accelerating pace of algorithmic innovation can only expand this potential, providing tools for 1) predicting existing diagnostic categories, 2) identifying new categories, and 3) combining the two, all based on biological data. The use of hypothesis-based and data-discovery approaches should work together in a “virtuous cycle” in which we learn from new data, update our models accordingly, and learn more by applying these models for finer-tuned analysis.
In many ways, this evolution parallels the development of big data genomics models that have highlighted the polygenic nature of psychiatric disorders by scaling up to deal with extremely large studies. Early larger scale imaging studies also suggest a “polyregional” brain disruption, impacting brain connectivity at a systemic or network level. However, the neuroimaging field has some additional challenges to confront, while also presenting some unique advantages.
Neuroimaging studies require brain scans. Extremely large genomics studies are more easily extended to hundreds of thousands of data sets since these only require saliva or in some cases a blood draw. In contrast, neuroimaging provides greater potential for characterization of mental disorders due to its ability to capture changes over time, and it is a valuable tool for studies focused on intervention or brain stimulation. In addition, neuroimaging changes related to mental disorders tend to be larger and more detectable than those based on genomic studies of these highly polygenic disorders. While it remains to be seen what place neuroimaging and genomics will ultimately occupy in clinical decision-making, it does seem clear that they will be part of the landscape.
Where to Now?
While we have learned a lot, we still have a long way to go before we can fully leverage neuroimaging data to understand psychiatric disorders and help people address their mental health difficulties. In the near term, predicting medication response is a promising goal, as this bypasses diagnostic criteria in identifying the best treatment for an individual. For example, one might use fMRI to predict the response to antidepressants versus mood stabilizers or to determine which antidepressant is likely to have the best effect. While this is only a small step, it does have promise as a support to decision-making in the clinic.
Another key question will be how to benefit from large amounts of data while protecting privacy. New data management infrastructure (i.e., neuroinformatics) to support the use of neuroimaging will need to incorporate the ability to safeguard sensitive information. Decentralized or federated approaches may provide a way forward here. Combining intermittent neuroimaging with the regular capture of information from a mobile device, in a way that preserves privacy, has considerable potential to help treat disorders such as depression or cognitive decline by detecting mental health patterns. A recent example of privacy-conscious design is reflected in some of the Covid-19 tracing apps.
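The core of a decentralized analysis can be sketched simply: each site computes summaries of its own data, and only those summaries leave the site. This is an illustrative toy, not a description of any particular platform. The site names and measurements are invented, and sharing summary statistics alone is not a complete privacy guarantee, so real systems layer additional safeguards on top.

```python
def local_summary(values):
    """Computed inside each site; individual measurements never leave it."""
    return {
        "n": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def combine(summaries):
    """Computed by the coordinator from site-level summaries only."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    pooled_mean = total / n
    pooled_variance = total_sq / n - pooled_mean ** 2  # population variance
    return pooled_mean, pooled_variance


# Hypothetical per-site measurements (e.g., one imaging-derived score per person).
site_data = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [4.0, 5.0],
}

summaries = [local_summary(values) for values in site_data.values()]
pooled_mean, pooled_variance = combine(summaries)
print(f"pooled mean: {pooled_mean:.2f}, pooled variance: {pooled_variance:.2f}")
```

The pooled mean and variance are identical to what would be obtained by gathering every measurement in one place, even though the coordinator never sees an individual record.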
While we have not advanced as quickly as we might have hoped in explicating the mysteries of mental illness, there is considerable reason to be optimistic about the not-so-distant future. The brain is an amazing organ, but its complexity, individuality, and role in shaping who we are are also the reasons the problems are so challenging. As the author and scholar C.S. Lewis once said: “There are no ordinary people.” Despite such challenges, I remain firmly hopeful and believe we are moving towards the needed solutions.