Most Brain Studies Have Too Few Participants To Yield Reliable Findings

Neuroimaging has revealed correlations between brain anatomy or function and illness, suggesting new diagnostic and treatment methods, but the small sample sizes hinder reliability.

Findings will encourage more data sharing, collaboration among researchers.

As brain scans have become more detailed and informative in recent decades, neuroimaging has seemed to promise a way for doctors and scientists to “see” what’s going wrong inside the brains of people with mental illnesses or neurological conditions. Such imaging has revealed correlations between brain anatomy or function and illness, suggesting potential new ways to diagnose and treat psychiatric, psychological, and neurological conditions. But the promise has yet to turn into reality, and a new study explains why: The results of most studies are unreliable because they involved too few participants.

Scientists rely on brainwide association studies to measure brain structure and function — using MRI brain scans — and link them to complex characteristics such as personality, behavior, cognition, neurological conditions, and mental illness. But a study by researchers at Washington University School of Medicine in St. Louis and the University of Minnesota, published on March 16, 2022, in Nature, shows that most published brainwide association studies are performed with too few participants to yield reliable findings.

Using publicly available data sets – involving a total of nearly 50,000 participants – the researchers analyzed a range of sample sizes and found that brainwide association studies need thousands of individuals to achieve higher reproducibility. Typical brainwide association studies enroll just a couple dozen people.

Such so-called underpowered studies are susceptible to uncovering strong but spurious associations by chance while missing real but weaker associations. Routinely underpowered brainwide association studies result in a glut of astonishingly strong yet irreproducible findings that slow progress toward understanding how the brain works, the researchers said.

“Our findings reflect a systemic, structural problem with studies that are designed to find correlations between two complex things, such as the brain and behavior,” said senior author Nico Dosenbach, MD, PhD, an associate professor of neurology at Washington University. “It’s not a problem with any individual researcher or study. It’s not even unique to neuroimaging. The field of genomics discovered a similar problem about a decade ago with genomic data and took steps to address it. The NIH (National Institutes of Health) began funding larger data-collection efforts and mandating that data must be shared publicly, which reduces bias and as a result, genome science has gotten much better. Sometimes you just have to change the research paradigm. Genomics has shown us the way.”

First author Scott Marek, PhD, an instructor in psychiatry at Washington University, and co-first author Brenden Tervo-Clemmens, PhD, a postdoctoral researcher at Massachusetts General Hospital/Harvard Medical School, realized something was wrong with how brainwide association studies typically are conducted when they could not replicate the results of their own study.

“We were interested in finding out how cognitive ability is represented in the brain,” Marek said. “We ran our analysis on a sample of 1,000 kids and found a significant correlation and were like, ‘Great!’ But then we thought, ‘Can we reproduce this in another thousand kids?’ And it turned out we couldn’t. It just blew me away because a sample of a thousand should have been plenty big enough. We were scratching our heads, wondering what was going on.”

To identify problems with brainwide association studies, the research team (including Dosenbach, Marek, Tervo-Clemmens, co-senior author Damien A. Fair, PhD, director of the Masonic Institute for the Developing Brain at the University of Minnesota, and others) began by accessing the three largest neuroimaging datasets: the Adolescent Brain Cognitive Development Study (11,874 participants), the Human Connectome Project (1,200 participants) and the UK Biobank (35,375 participants). Then, they analyzed the datasets for correlations between brain features and a range of demographic, cognitive, mental health and behavioral measures, using subsets of various sizes. Using separate subsets, they attempted to replicate any identified correlations. In total, they ran billions of analyses, supported by the powerful computing resources of Fair’s Masonic Institute for the Developing Brain.

The researchers found that brain-behavior correlations identified using a sample size of 25 — the median sample size in published papers — usually failed to replicate in a separate sample. As the sample size grew into the thousands, correlations became more likely to be reproduced.
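The link between sample size and replication can be illustrated with a small simulation. The sketch below is not the researchers' analysis: it assumes a single true correlation of 0.1 (a value chosen only for illustration), draws synthetic data rather than brain scans, and uses the Fisher z approximation as the significance test. The helper names `sample_r`, `is_significant` and `replication_rate` are hypothetical.

```python
import math
import random


def sample_r(n, rho, rng):
    """Draw n (x, y) pairs with true correlation rho and return Pearson's r."""
    noise = math.sqrt(1.0 - rho * rho)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rho * x + noise * rng.gauss(0.0, 1.0) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n, z_crit=1.96):
    """Two-sided test at roughly p < 0.05 using the Fisher z approximation."""
    return abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit


def replication_rate(n, rho, trials, rng):
    """Fraction of significant discoveries that are significant again,
    with the same sign, in a fresh sample of the same size."""
    discoveries = replicated = 0
    for _ in range(trials):
        r1 = sample_r(n, rho, rng)
        if not is_significant(r1, n):
            continue  # nothing was "discovered", so nothing to replicate
        discoveries += 1
        r2 = sample_r(n, rho, rng)
        if is_significant(r2, n) and (r1 > 0) == (r2 > 0):
            replicated += 1
    return replicated / discoveries if discoveries else float("nan")


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Assumed true correlation of 0.1; sample sizes of 25 and 2,000.
    for n, trials in [(25, 3000), (2000, 100)]:
        rate = replication_rate(n, rho=0.1, trials=trials, rng=rng)
        print(f"n={n}: replication rate {rate:.2f}")
```

Under these assumptions, discoveries made with 25 participants replicate only rarely, while those made with 2,000 participants replicate almost every time. The exact rates depend on the assumed true correlation.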

Further, the estimated strength of the correlation, a measure known as the effect size, tended to be largest for the smallest samples. Effect sizes are scaled from 0 to 1, with 0 being no correlation and 1 being perfect correlation. An effect size of 0.2 is considered quite strong. As sample sizes increased and correlations became more reproducible, the effect sizes decreased. The median reproducible effect size was 0.01. Yet published papers on brainwide association studies routinely report effect sizes of 0.2 or more.
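One way to see why small samples produce large reported effect sizes: with few participants, only a large observed correlation can pass a significance test, so the results that pass are inflated by construction. The sketch below illustrates this under stated assumptions. The true correlation of 0.1 is chosen for illustration, the data are synthetic, and the function names are hypothetical. It is not the method used in the Nature paper.

```python
import math
import random


def sample_r(n, rho, rng):
    """Draw n (x, y) pairs with true correlation rho and return Pearson's r."""
    noise = math.sqrt(1.0 - rho * rho)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rho * x + noise * rng.gauss(0.0, 1.0) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_significant_effect(n, rho, studies, rng, z_crit=1.96):
    """Return (r_crit, mean |r| among simulated studies with |r| > r_crit).

    r_crit is the smallest |r| that reaches roughly p < 0.05 (two-sided)
    at sample size n, using the Fisher z approximation.
    """
    r_crit = math.tanh(z_crit / math.sqrt(n - 3))
    hits = []
    for _ in range(studies):
        r = abs(sample_r(n, rho, rng))
        if r > r_crit:
            hits.append(r)
    mean_hit = sum(hits) / len(hits) if hits else float("nan")
    return r_crit, mean_hit


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    true_rho = 0.1  # assumed for illustration only
    for n, studies in [(25, 2000), (2000, 100)]:
        r_crit, mean_hit = mean_significant_effect(n, true_rho, studies, rng)
        print(f"n={n}: threshold |r| > {r_crit:.2f}, "
              f"mean significant |r| = {mean_hit:.2f} (true value {true_rho})")
```

With 25 participants, the significance threshold alone is about 0.40, so every "significant" result overstates an assumed true correlation of 0.1 by roughly a factor of four or more. With 2,000 participants, the significant results cluster near the true value.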

In retrospect, it should have been obvious that the reported effect sizes were too high, Marek said.

“You can find effect sizes of 0.8 in the literature, but nothing in nature has an effect size of 0.8,” Marek said. “The correlation between height and weight is 0.4. The correlation between altitude and daily temperature is 0.3. Those are strong, obvious, easily measured correlations, and they’re nowhere near 0.8. So why did we ever think that the correlation between two very complex things, like brain function and depression, would be 0.8? That doesn’t pass the sniff test.”

Neuroimaging studies are expensive and time-consuming. An hour on an MRI machine can cost $1,000. No individual investigator has the time or money to scan thousands of participants for each study. But if all of the data from multiple small studies were pooled and analyzed together, including statistically insignificant results and minuscule effect sizes, the result probably would approximate the correct answer, Dosenbach said.
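The pooling idea can be sketched with a standard technique: a fixed-effect combination of correlations through Fisher's z transform, weighting each study by its sample size minus 3. This is a generic textbook approach, not a procedure described by the researchers. The true correlation of 0.1, the 400 simulated studies of 25 participants each, and the function names are all assumptions made for illustration.

```python
import math
import random


def sample_r(n, rho, rng):
    """Draw n (x, y) pairs with true correlation rho and return Pearson's r."""
    noise = math.sqrt(1.0 - rho * rho)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rho * x + noise * rng.gauss(0.0, 1.0) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_r(results):
    """Fixed-effect pooled correlation from (r, n) pairs via Fisher's z."""
    weights = [n - 3 for _, n in results]
    z_bar = sum(w * math.atanh(r) for w, (r, _) in zip(weights, results))
    return math.tanh(z_bar / sum(weights))


def simulate_small_studies(count, n, rho, rng):
    """Return a list of (r, n) results from `count` independent small studies."""
    return [(sample_r(n, rho, rng), n) for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    n, true_rho = 25, 0.1   # both values assumed for illustration
    results = simulate_small_studies(400, n, true_rho, rng)

    # Keep only results that pass a two-sided test at roughly p < 0.05.
    r_crit = math.tanh(1.96 / math.sqrt(n - 3))
    significant = [(r, m) for r, m in results if abs(r) > r_crit]

    print(f"pooled over all {len(results)} studies: {pooled_r(results):.2f}")
    if significant:
        print(f"pooled over {len(significant)} significant studies only: "
              f"{pooled_r(significant):.2f}")
```

In this sketch, pooling every study, including the nonsignificant ones, lands close to the assumed true value, whereas pooling only the significant results gives a much larger number. That contrast is why the small and null results matter to the pooled estimate.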

“The future of the field is now bright and rests in open science, data sharing, and resource sharing across institutions in order to make large datasets available to any scientist who wants to use them,” Fair said. “This very paper is an amazing example of that.”

Dosenbach, also an associate professor of biomedical engineering, of occupational therapy, of pediatrics and of radiology, added: “There’s a lot of promise to this kind of work in terms of finding solutions for mental illnesses and just understanding how the mind works. The great news is that we’ve identified a main reason why brain imaging has yet to deliver on its promise to revolutionize mental health care. The work represents a major turning point for linking brain activity and behavior, by clearly defining not just the prior roadblocks, but also the promising new paths forward.”

Reference: “Reproducible brain-wide association studies require thousands of individuals” by Scott Marek, Brenden Tervo-Clemmens, Finnegan J. Calabro, David F. Montez, Benjamin P. Kay, Alexander S. Hatoum, Meghan Rose Donohue, William Foran, Ryland L. Miller, Timothy J. Hendrickson, Stephen M. Malone, Sridhar Kandala, Eric Feczko, Oscar Miranda-Dominguez, Alice M. Graham, Eric A. Earl, Anders J. Perrone, Michaela Cordova, Olivia Doyle, Lucille A. Moore, Gregory M. Conan, Johnny Uriarte, Kathy Snider, Benjamin J. Lynch, James C. Wilgenbusch, Thomas Pengo, Angela Tam, Jianzhong Chen, Dillan J. Newbold, Annie Zheng, Nicole A. Seider, Andrew N. Van, Athanasia Metoki, Roselyne J. Chauvin, Timothy O. Laumann, Deanna J. Greene, Steven E. Petersen, Hugh Garavan, Wesley K. Thompson, Thomas E. Nichols, B. T. Thomas Yeo, Deanna M. Barch, Beatriz Luna, Damien A. Fair and Nico U. F. Dosenbach, 16 March 2022, Nature.
DOI: 10.1038/s41586-022-04492-9
