Digital Dementia? AI Shows Surprising Signs of Cognitive Decline

A study in The BMJ reveals that leading large language models exhibit signs of mild cognitive impairment when subjected to tests typically used for early dementia detection. Credit: SciTechDaily.com

Findings Challenge Assumption That AI Will Soon Replace Human Doctors

Research shows that top AI models demonstrate cognitive impairments similar to early dementia symptoms when evaluated with the MoCA test. These findings underscore the limitations of AI in clinical applications, particularly in tasks requiring visual and executive skills.

Cognitive Impairments in AI

Almost all leading large language models, or “chatbots,” show signs of mild cognitive impairment when tested using assessments commonly used to detect early dementia, according to a study published in the Christmas issue of The BMJ.

The study also found that older versions of these chatbots, much like aging human patients, performed worse on the tests. The authors suggest that these findings “challenge the assumption that artificial intelligence will soon replace human doctors.”

AI Advancements and Speculations

Recent advances in artificial intelligence have sparked both excitement and concern about whether chatbots might surpass human physicians in medical tasks.

While previous research has shown that large language models (LLMs) excel at a range of medical diagnostic tasks, their susceptibility to human-like impairments such as cognitive decline has remained largely unexplored until now.

Evaluating AI Cognitive Abilities

To fill this knowledge gap, researchers assessed the cognitive abilities of the leading, publicly available LLMs – ChatGPT versions 4 and 4o (developed by OpenAI), Claude 3.5 “Sonnet” (developed by Anthropic), and Gemini versions 1 and 1.5 (developed by Alphabet) – using the Montreal Cognitive Assessment (MoCA) test.

The MoCA test is widely used to detect cognitive impairment and early signs of dementia, usually in older adults. Through a number of short tasks and questions, it assesses abilities including attention, memory, language, visuospatial skills, and executive functions. The maximum score is 30 points, with a score of 26 or above generally considered normal.

AI Performance on Cognitive Tests

The instructions given to the LLMs for each task were the same as those given to human patients. Scoring followed official guidelines and was evaluated by a practicing neurologist.

ChatGPT 4o achieved the highest score on the MoCA test (26 out of 30), followed by ChatGPT 4 and Claude (25 out of 30), with Gemini 1.0 scoring lowest (16 out of 30).
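To make the cutoff concrete, the short Python sketch below applies the 26-point threshold described above to the scores reported in this article. It is only an illustration, not the study's scoring procedure: the names `reported_scores` and `is_below_cutoff` are hypothetical, and Gemini 1.5 is left out because its score is not stated here.

```python
# Illustrative sketch only: applies the cutoff quoted in the article
# (26 or above out of 30 is generally considered normal) to the reported scores.
MOCA_MAX = 30
MOCA_NORMAL_CUTOFF = 26

# Scores as reported in the article. Gemini 1.5 is omitted because
# the article does not state its score.
reported_scores = {
    "ChatGPT 4o": 26,
    "ChatGPT 4": 25,
    "Claude 3.5 Sonnet": 25,
    "Gemini 1.0": 16,
}


def is_below_cutoff(score: int, cutoff: int = MOCA_NORMAL_CUTOFF) -> bool:
    """Return True when a MoCA score falls below the 'generally normal' cutoff."""
    if not 0 <= score <= MOCA_MAX:
        raise ValueError(f"MoCA scores range from 0 to {MOCA_MAX}, got {score}")
    return score < cutoff


# Models whose reported score falls below the cutoff, in the order listed above.
below_cutoff = [name for name, score in reported_scores.items() if is_below_cutoff(score)]

for name, score in reported_scores.items():
    status = "below cutoff" if is_below_cutoff(score) else "at or above cutoff"
    print(f"{name}: {score}/{MOCA_MAX} ({status})")
```

Under this reading, only ChatGPT 4o reaches the threshold, which is consistent with the statement that almost all of the tested models show signs of mild impairment.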

Challenges in Visual and Executive Functions

All chatbots showed poor performance in visuospatial skills and executive tasks, such as the trail-making task (connecting encircled numbers and letters in ascending order) and the clock drawing test (drawing a clock face showing a specific time). Gemini models failed at the delayed recall task (remembering a five-word sequence).

All chatbots performed well on most other tasks, including naming, attention, language, and abstraction.

However, in further visuospatial tests, chatbots were unable to show empathy or accurately interpret complex visual scenes. Only ChatGPT 4o succeeded in the incongruent stage of the Stroop test, which uses combinations of color names and font colors to measure how interference affects reaction time.

Implications for AI in Clinical Settings

These are observational findings, and the authors acknowledge the essential differences between the human brain and large language models.

However, they point out that the uniform failure of all large language models in tasks requiring visual abstraction and executive function highlights a significant area of weakness that could impede their use in clinical settings.

As such, they conclude: “Not only are neurologists unlikely to be replaced by large language models any time soon, but our findings suggest that they may soon find themselves treating new, virtual patients – artificial intelligence models presenting with cognitive impairment.”

Reference: “Age against the machine—susceptibility of large language models to cognitive impairment: cross sectional analysis” by Roy Dayan, Benjamin Uliel and Gal Koplewitz, 20 December 2024, BMJ.
DOI: 10.1136/bmj-2024-081948


11 Comments

Christopher on December 19, 2024 1:09 am
Lol, this is complete nonsense.
- Michael on December 25, 2024 8:40 pm
  Medicine is an art as much as a science. Half of what we “know” will be discarded. Any superior physician knows that intuition plays a major role in decision making, something AI cannot do.
Don Bronkema on December 19, 2024 6:32 am
Baloney!
Robin on December 19, 2024 7:14 pm
I’m not surprised. Medicine is more than checking a box of symptoms. Sometimes people don’t display all symptoms of a disease. In addition family health history, medications and current and past habits can alter a diagnosis. That comes from experience as a medical professional and listening skills. Sometimes multiple diagnoses can alter treatment plans. Interpretation should always be left to a human.
Peter on December 20, 2024 6:22 am
Except that it’s not declining, but improving with every version. Something any user of these could tell you. What a dumb and misleading article.
- Clyde Spencer on December 20, 2024 11:12 am
  I think that the quality of the writing makes it ambiguous whether individual LLMs decline in their abilities over time. Note that the article states, “Gemini models failed at the delayed recall task (remembering a five-word sequence).” When it is pointed out that they have said something that is wrong, they readily admit their error, but don’t remember the correction, which is not unlike an older person who has memory problems.
AG3 on December 21, 2024 3:00 am
Very badly written article.
The title says that LLMs show cognitive decline, which seems to suggest that the decline for each LLM happens over time.
The experiment actually says that LLMs are not as smart as humans, and in cognitive tests they perform at the levels of humans with dementia.
The research is also pointless.
The newer LLMs perform better than the older one. ChatGPT is already at par with humans without dementia. If anything, their research has shown that future models will not have any of these shortcomings.
The tongue in cheek concluding remarks complete the picture of unprofessionalism of the research team.
Doctoray staronomy kesiri 2024 on December 21, 2024 5:43 am
Spencer spoke interestingly
Anonymous on December 24, 2024 2:32 pm
This is total nonsense.
To exhibit decline AI would have to start with relatively high cognitive ratings and *then* exhibit a decline. Its very clear AI has never had a high cognitive rating.
In addition, spatial cognition has never been high in AI. In point of fact most AI generated images contain relatively simple errors in spatial cognition, showing outdoor scenes on windows peering into homes, showing people with three arms, or six fingers.
If you wanted to say its cognition hadnt reached above dementia levels, thats fine.
But thats not a decline.
John on December 25, 2024 10:44 am
I’m in machine learning. LLMs do zero thinking. LLMs take a set amount of the preceding text and then guess what text comes next based on a bunch of text that it was shown during training.
LLMs can’t experience cognitive decline because they don’t have any cognition to begin with.
LLMs are wildly expensive parrots playing Mad Libs with people who have been duped into thinking that LLMs understand what they’re saying.
- Michael on December 25, 2024 8:43 pm
  Thanks for your comment, the masses have been duped into thinking they are dealing with an intelligent being that can reason.
