
    ChatGPT Was Asked the Same Question 10 Times. The Answers Kept Changing

    By Washington State University, March 18, 2026
    ChatGPT may sound confident, but when tested on complex scientific claims, it often guesses and even contradicts itself. Researchers found it struggles especially with spotting false information. Credit: Shutterstock

    ChatGPT can sound convincing, but this study shows it still struggles to tell what’s actually true.

    Washington State University professor Mesut Cicek and his team repeatedly evaluated ChatGPT by giving it hypotheses drawn from scientific studies. The AI was asked to decide whether each statement was supported by research, in effect judging whether it was true or false.

    In total, the researchers tested more than 700 hypotheses and submitted each one 10 times to examine how consistent the responses would be.

    Accuracy Results and Performance Limits

    In the initial 2024 experiment, ChatGPT answered correctly 76.5% of the time. When the study was repeated in 2025, accuracy rose slightly to 80%. However, once the results were adjusted for random guessing, the performance looked far less reliable. The AI was only about 60% better than chance, which the researchers described as closer to a low D than strong performance.

    The system had particular difficulty identifying false statements, correctly labeling them only 16.4% of the time. It also showed inconsistency. When given the exact same prompt 10 times, ChatGPT produced consistent results for only about 73% of the cases.
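
One way to picture the consistency check is the minimal Python sketch below. It assumes that a case counts as "consistent" only when all 10 repeated runs return the same label. The article does not spell out the researchers' exact criterion, so that definition, the function names, and the toy data are illustrative assumptions rather than the study's method.

```python
from collections import Counter


def is_consistent(answers):
    """True when every repeated run returned the same label."""
    return len(set(answers)) == 1


def consistency_rate(runs_by_item):
    """Share of items whose repeated runs all agree with each other."""
    if not runs_by_item:
        raise ValueError("need at least one item")
    agreeing = sum(is_consistent(a) for a in runs_by_item.values())
    return agreeing / len(runs_by_item)


def answer_split(answers):
    """Count how often each label appeared, e.g. five true and five false."""
    return Counter(answers).most_common()


# Hypothetical toy data (NOT from the study): 4 hypotheses, 10 runs each.
toy_runs = {
    "H1": [True] * 10,            # unanimous
    "H2": [True, False] * 5,      # the "five true, five false" pattern
    "H3": [False] * 10,           # unanimous
    "H4": [True] * 9 + [False],   # a single dissenting run
}

print(consistency_rate(toy_runs))    # 0.5
print(answer_split(toy_runs["H2"]))
```

Under this strict definition a single dissenting run (H4) is enough to mark a case inconsistent. A looser criterion, such as majority agreement, would produce a higher rate from the same answers.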

    Inconsistent Answers to Identical Questions

    “We’re not just talking about accuracy, we’re talking about inconsistency, because if you ask the same question again and again, you come up with different answers,” said Cicek, an associate professor in the Department of Marketing and International Business in WSU’s Carson College of Business and lead author of the new publication.

    “We used 10 prompts with the same exact question. Everything was identical. It would answer true. Next, it says it’s false. It’s true, it’s false, false, true. There were several cases where there were five true, five false.”

    AI Fluency Versus Real Understanding

    The study, published in the Rutgers Business Review, highlights the importance of caution when using AI for important decisions, especially those involving nuance or complex reasoning. While generative AI can produce fluent and convincing language, it does not necessarily demonstrate true understanding.

    Cicek said the findings suggest that artificial general intelligence capable of genuine reasoning may still be further away than some expect.

    “Current AI tools don’t understand the world the way we do — they don’t have a ‘brain,’” Cicek said. “They just memorize, and they can give you some insight, but they don’t understand what they’re talking about.”

    Study Design and Methods

    Cicek worked alongside Sevincgul Ulu of Southern Illinois University, Can Uslay of Rutgers University, and Kate Karniouchina of Northeastern University.

    The team analyzed 719 hypotheses from scientific papers published in business journals since 2021. Determining whether research supports a hypothesis is often complex, involving multiple factors that can influence the outcome. Reducing that complexity to a simple true-or-false decision requires careful reasoning.

    The researchers tested the free version of ChatGPT-3.5 in 2024 and the updated ChatGPT-5 mini in 2025. Overall, results were similar across both versions. After adjusting for random chance, which gives a 50% likelihood of a correct answer, the AI’s performance was only about 60% better than chance in both years.
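
The chance adjustment can be illustrated with one common correction, (accuracy - chance) / (1 - chance). The article does not state which formula the researchers used, so the function below is an assumption, not their method. With the 50% baseline given above, it yields roughly 0.60 for the 2025 accuracy of 80% and roughly 0.53 for the 2024 accuracy of 76.5%, so it need not reproduce the reported "about 60%" for both years.

```python
def chance_adjusted(accuracy, chance=0.5):
    """Rescale accuracy so that chance-level guessing maps to 0 and perfect maps to 1.

    This is a generic correction and an assumption here; the study's own
    adjustment is not described in the article.
    """
    if not 0.0 <= chance < 1.0:
        raise ValueError("chance must be in [0, 1)")
    return (accuracy - chance) / (1.0 - chance)


# Raw accuracies reported in the article, with a 50% true/false baseline.
adj_2024 = chance_adjusted(0.765)
adj_2025 = chance_adjusted(0.80)

print(round(adj_2024, 2))  # 0.53
print(round(adj_2025, 2))  # 0.6
```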

    Key Weakness in AI Reasoning

    The findings reveal an important limitation of large language model AI systems. Although they can generate polished and persuasive responses, they often struggle with deeper reasoning. This can lead to answers that sound convincing but are actually incorrect, Cicek said.

    Why Experts Urge Caution

    Based on these results, the researchers recommend that business leaders verify AI-generated outputs and approach them with skepticism. They also emphasize the importance of training users to understand both the strengths and limitations of AI tools.

    While this study focused on ChatGPT, Cicek noted that similar tests with other AI systems have shown comparable outcomes. The research also builds on earlier work highlighting concerns about AI hype. A 2024 national survey found that consumers were less likely to purchase products when they were marketed with a focus on AI.

    “Always be skeptical,” he said. “I’m not against AI. I’m using it. But you need to be very careful.”


    4 Comments

    1. Bruce on March 18, 2026 5:46 pm

      What I see in front of me right I don’t think nothing less

    2. Jojo on March 18, 2026 10:55 pm

      Where is the linkout to the actual study???

      • Andrei Conovaloff on March 21, 2026 6:20 pm

        Unstable Intelligence: GenAI Struggles with Accuracy and Consistency
        by Mesut Cicek, Sevincgul Ulu, Can Uslay, Kate Karniouchina
        Rutgers Business Review (2025), Vol. 10, No. 2, pp. 266-277
        https://rbr.business.rutgers.edu/article/unstable-intelligence-genai-struggles-accuracy-and-consistency

    3. SquirrelTech on March 20, 2026 10:09 am

      I hope they used hypotheses which they had a verifiable “correct” answer. There are plenty of hypotheses that “everyone agrees are right” that later prove false, so it’s not a simple task. We can’t compare the consistency of answers from humans, so maybe humans would be just as inconsistent.
