
One night of sleep may contain hidden clues that predict major diseases years before they strike.
A bad night of sleep usually leads to grogginess the next day, but it may also point to serious health problems years before they appear. Researchers at Stanford Medicine have developed a new artificial intelligence system that can analyze detailed body signals from a single night of sleep and estimate a person’s risk of developing more than 100 different diseases.
The system, called SleepFM, was trained using nearly 600,000 hours of sleep recordings from about 65,000 people. These recordings came from polysomnography, an in-depth sleep test that uses sensors to track brain activity, heart rhythms, breathing patterns, eye movements, leg motion, and other physiological signals throughout the night.
Sleep Studies as a Hidden Data Resource
Polysomnography is widely considered the most reliable method for studying sleep, typically conducted overnight in specialized laboratories. Beyond diagnosing sleep disorders, researchers realized these tests capture an extraordinary amount of information about how the body functions over several uninterrupted hours.
“We record an amazing number of signals when we study sleep,” said Emmanuel Mignot, MD, PhD, the Craig Reynolds Professor in Sleep Medicine and co-senior author of the new study, which will publish today (January 6) in Nature Medicine. “It’s a kind of general physiology that we study for eight hours in a subject who’s completely captive. It’s very data rich.”
Until now, much of this information has gone unused. Traditional sleep medicine focuses on a limited subset of signals, leaving large portions unexplored. Advances in artificial intelligence have made it possible to analyze this full data stream for the first time. According to the researchers, this is the first study to apply AI to sleep data at such a large scale.
“From an AI perspective, sleep is relatively understudied. There’s a lot of other AI work that’s looking at pathology or cardiology, but relatively little looking at sleep, despite sleep being such an important part of life,” said James Zou, PhD, associate professor of biomedical data science and co-senior author of the study.
Teaching AI to Understand Sleep
To unlock the value of this data, the team built a foundation model, a type of AI that can learn general patterns from massive datasets and then be adapted to many different tasks. Large language models like ChatGPT use this same approach, except they are trained on text rather than biological signals.
SleepFM was trained on 585,000 hours of polysomnography data collected from patients at multiple sleep clinics. Each recording was divided into five-second segments, similar to how words are used to train language models.
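The "five-second segments as tokens" idea can be sketched in a few lines. The snippet below is a hypothetical illustration, assuming a fixed sample rate and non-overlapping windows; the function name, sample rate, and windowing choices are assumptions, not details from the paper.

```python
import numpy as np

def segment_recording(signal, sample_rate_hz, window_s=5):
    """Split one polysomnography channel into fixed-length windows.

    Hypothetical pre-processing step: the study describes dividing
    recordings into five-second segments, analogous to the tokens used
    to train language models. Exact details (overlap, normalization,
    handling of partial windows) are assumptions for this sketch.
    """
    window = window_s * sample_rate_hz          # samples per segment
    n_segments = len(signal) // window          # drop any partial tail
    return signal[: n_segments * window].reshape(n_segments, window)

# Example: one hour of a 128 Hz EEG channel -> 720 five-second "tokens"
eeg = np.random.randn(3600 * 128)
tokens = segment_recording(eeg, sample_rate_hz=128)
print(tokens.shape)  # (720, 640)
```

At this scale, 585,000 hours of multi-channel recordings yield hundreds of millions of such segments, which is what makes the foundation-model approach feasible.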
“SleepFM is essentially learning the language of sleep,” Zou said.
The model analyzes multiple streams of data at once, including brain waves, heart signals, muscle activity, pulse measurements, and breathing airflow, and learns how these signals interact. To make this possible, the researchers designed a new training method called leave-one-out contrastive learning. This approach temporarily removes one type of signal and asks the model to reconstruct it using the remaining data.
“One of the technical advances that we made in this work is to figure out how to harmonize all these different data modalities so they can come together to learn the same language,” Zou said.
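As a rough illustration of the leave-one-out contrastive idea, here is a toy version of such a loss, assuming each modality (EEG, ECG, respiration, and so on) has already been encoded into one embedding per five-second segment. The shapes, temperature, and mean-pooling of the remaining modalities are all assumptions made for this sketch, not the paper's implementation.

```python
import numpy as np

def leave_one_out_contrastive_loss(embeddings, temperature=0.1):
    """Toy leave-one-out contrastive loss over modality embeddings.

    `embeddings` has shape (n_modalities, batch, dim): one embedding per
    signal type per sleep segment. For each modality we hold it out,
    average the remaining modalities, and score how well each held-out
    embedding matches the aggregate from the SAME segment versus other
    segments in the batch (InfoNCE-style). Illustrative only.
    """
    n_mod, batch, dim = embeddings.shape
    # L2-normalize so dot products are cosine similarities
    embeddings = embeddings / np.linalg.norm(embeddings, axis=-1, keepdims=True)
    total = 0.0
    for m in range(n_mod):
        held_out = embeddings[m]                               # (batch, dim)
        rest = embeddings[np.arange(n_mod) != m].mean(axis=0)  # (batch, dim)
        rest = rest / np.linalg.norm(rest, axis=-1, keepdims=True)
        logits = held_out @ rest.T / temperature               # (batch, batch)
        # Log-softmax over each row; the correct match for segment i
        # is the aggregate built from segment i's other modalities.
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        total += -np.mean(np.diag(log_probs))
    return total / n_mod

rng = np.random.default_rng(0)
emb = rng.standard_normal((3, 8, 16))  # 3 modalities, batch of 8 segments
loss = leave_one_out_contrastive_loss(emb)
print(loss)
```

Minimizing a loss of this shape forces the modalities into a shared embedding space, which is one way to read Zou's "same language" remark.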
From Sleep Patterns to Disease Risk
Once training was complete, the researchers tested SleepFM on familiar sleep-related tasks. The model accurately identified sleep stages and assessed sleep apnea severity, matching or exceeding the performance of leading systems currently in use.
They then moved on to a more ambitious test: predicting which diseases might develop in the future based on sleep data alone. To do this, they linked sleep recordings with long-term medical histories from the same patients. The researchers had access to decades of records from a single clinic, providing a rare opportunity to study long-term outcomes.
The Stanford Sleep Medicine Center was founded in 1970 by the late William Dement, MD, PhD, who is widely regarded as the father of sleep medicine. The largest group used to train SleepFM included about 35,000 patients between the ages of 2 and 96. Their sleep studies were recorded between 1999 and 2024 and matched with electronic health records that followed some patients for as long as 25 years.
(The clinic’s polysomnography recordings go back even further, but only on paper, said Mignot, who directed the sleep center from 2010 to 2019.)
Using this combined dataset, SleepFM examined more than 1,000 disease categories and identified 130 conditions that could be predicted with reasonable accuracy from sleep data alone. The strongest predictions were seen for cancers, pregnancy complications, circulatory diseases, and mental health disorders, with performance scores exceeding a C-index of 0.8.
Measuring Prediction Accuracy
The C-index, or concordance index, measures how well a model can rank individuals by risk. It reflects how often the model correctly predicts which of two people will experience a health event first.
“For all possible pairs of individuals, the model gives a ranking of who’s more likely to experience an event — a heart attack, for instance — earlier. A C-index of 0.8 means that 80% of the time, the model’s prediction is concordant with what actually happened,” Zou said.
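Zou's description translates almost directly into code. The sketch below computes a C-index over all comparable pairs, with the simplifying assumption that every subject's event time is observed; real survival analyses must also handle censoring (subjects who leave the study event-free). The example numbers are hypothetical.

```python
def concordance_index(event_times, risk_scores):
    """Concordance index over all comparable pairs.

    A pair is concordant when the person the model scores as higher
    risk actually experiences the event first. Simplified: assumes no
    censoring, i.e. every subject's event time is observed.
    """
    concordant, comparable = 0.0, 0
    n = len(event_times)
    for i in range(n):
        for j in range(i + 1, n):
            if event_times[i] == event_times[j]:
                continue                      # no ordering to predict
            comparable += 1
            earlier = i if event_times[i] < event_times[j] else j
            later = j if earlier == i else i
            if risk_scores[earlier] > risk_scores[later]:
                concordant += 1
            elif risk_scores[earlier] == risk_scores[later]:
                concordant += 0.5             # ties get half credit
    return concordant / comparable

# Years until a heart attack vs. model risk scores (hypothetical data)
times = [2, 5, 9, 14]
scores = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(times, scores))  # 5 of 6 pairs concordant, ~0.833
```

A C-index of 0.5 is chance-level ranking; 1.0 is a perfect ordering of who gets sick first.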
SleepFM showed especially strong results for Parkinson’s disease (C-index 0.89), dementia (0.85), hypertensive heart disease (0.84), heart attack (0.81), prostate cancer (0.89), breast cancer (0.87), and death (0.84).
“We were pleasantly surprised that for a pretty diverse set of conditions, the model is able to make informative predictions,” Zou said.
Zou added that models with lower accuracy, often around a C-index of 0.7, are already used in clinical care, such as systems that predict how patients might respond to certain cancer treatments.
Making Sense of the Predictions
The research team is now focused on improving SleepFM’s accuracy and understanding how it reaches its conclusions. Future versions may incorporate data from wearable devices to capture even more information about daily life and sleep habits.
“It doesn’t explain that to us in English,” Zou said. “But we have developed different interpretation techniques to figure out what the model is looking at when it’s making a specific disease prediction.”
While heart-related signals played a larger role in predicting cardiovascular disease, and brain signals were more influential for mental health conditions, the researchers found that no single signal was enough on its own. The most accurate predictions came from combining all data sources.
“The most information we got for predicting disease was by contrasting the different channels,” Mignot said. Body systems that were out of sync — a brain that looks asleep but a heart that looks awake, for example — seemed to spell trouble.
Reference: “A multimodal sleep foundation model for disease prediction” by Rahul Thapa, Magnus Ruud Kjaer, Bryan He, Ian Covert, Hyatt Moore IV, Umaer Hanif, Gauri Ganjoo, M. Brandon Westover, Poul Jennum, Andreas Brink-Kjaer, Emmanuel Mignot and James Zou, 6 January 2026, Nature Medicine.
DOI: 10.1038/s41591-025-04133-4
Rahul Thapa, a PhD student in biomedical data science, and Magnus Ruud Kjaer, a PhD student at Technical University of Denmark, are co-lead authors of the study.
Researchers from the Technical University of Denmark, Copenhagen University Hospital – Rigshospitalet, BioSerenity, the University of Copenhagen, and Harvard Medical School contributed to the work.
The study received funding from the National Institutes of Health (grant R01HL161253), Knight-Hennessy Scholars, and the Chan Zuckerberg Biohub.
Comments
There are two main assumptions made here. The first is that taking a single night’s readings, which is a moment in time, would indicate future events, assuming that there is a steady trajectory from now into the future. However, we have ups and downs in health, and what is a problem today may resolve over time. Current stresses, for example, may point to a future problem if things don’t change, but things do change, and stress can come and go. We don’t age and get disease in a straightforward fashion. One moment in time does not predict the future.
The second assumption is that people sleep normally in a laboratory with electrodes connected to their bodies. We are not machines, and when you are aware that you are being tested, it creates changes in your mind and body. So these sleep tests create lab results, not real results. Just as putting a blood pressure cuff on someone can raise their blood pressure due to anxiety, putting electrodes on someone and having them sleep in a strange place can create artifacts and not reflect true health conditions.
In addition, there is also the ethical problem of telling people that they will have some disease in the future, when there are no current signs of disease (apart from these lab findings and the AI interpretation of them). Telling people that they will be sick in the future can create depression, anxiety, altered behavior, unnecessary early treatment, and a nocebo effect that can result in disease. It also ignores that the body can heal. This AI approach essentially tells people how they will develop disease and die before there are any signs apart from what the AI interprets. That’s putting a lot of faith in AI, and will undermine self-confidence and health.
While Mr. Ross raises some good points, I find that the results of the study are largely consistent with what I’ve read of Dr. Arthur F. Coca’s findings in the early 1930s (“The Pulse Test,” 1956), in conjunction with my own at-home experiments since late 1981. What the AI is probably not able to teach the researchers is how the recorded variations are largely due to still medically unrecognized and undiagnosed, practically harmless, nearly subclinical, brief (about six to twelve hours) non-IgE-mediated food allergy reactions in individuals. These may be aggravated (or not) by food additives that are officially approved (by the FDA in the US), such as soy, TBHQ, and/or added MSG, and thereby become chronic and potentially deadly over the long term (months to decades, highly individual), especially since 1980, when the FDA approved added MSG for expanded use. Cc: two authors.