
New research has found ChatGPT-5.2 can generate original mathematical proofs, introducing “vibe-proving” as a new AI reasoning method. AI accelerates discovery, but human verification remains necessary.
Researchers at VUB’s Data Analytics Lab report that commercial language models can produce original mathematical proofs. In their study, the team shows that OpenAI’s large language model ChatGPT-5.2 (Thinking) was able to solve a mathematical problem on its own.
The case focused on proving a 2024 conjecture proposed by mathematicians Ran and Teng. A conjecture is a statement believed to be true based on patterns or repeated results, but it has not yet been formally proven. Once a valid proof is established, the conjecture becomes a theorem.
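To make the distinction concrete, here is a minimal sketch in the Lean proof assistant, using a toy statement that has nothing to do with the Ran–Teng conjecture itself: a statement whose proof is still missing is marked with a placeholder, and it only counts as a theorem once a complete, machine-checked proof is supplied.

```lean
-- Toy illustration of "conjecture vs. theorem" (not the Ran–Teng conjecture).

-- A conjecture: the statement is written down, but the proof is still a
-- placeholder (`sorry`), so Lean flags it as unproven.
theorem toy_conjecture (n : Nat) : n + 0 = n := by
  sorry

-- The same statement with an actual proof supplied; once the checker
-- accepts it, the statement has the status of a theorem.
theorem toy_theorem (n : Nat) : n + 0 = n := by
  rfl
```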
According to the study, the final proof emerged from seven chat sessions with ChatGPT and four evolving versions of the argument. The model played a key role in exploring possible approaches, while human researchers ensured the reasoning was correct and logically complete.
ChatGPT’s Role in Mathematical Discovery
The researchers found that ChatGPT-5.2 (Thinking) developed much of the proof’s structure with limited human input. As they note, “With the Data Analytics Lab, we are one of the first to demonstrate that a commercially available LLM can independently develop original mathematical proofs.”
“I had long suspected that ChatGPT could help me prove unsolved mathematical problems,” says Brecht Verbeken, a postdoctoral researcher in VUB’s Data Analytics Lab. “And yet I was surprised at how efficiently that worked out.”
The team places this work within a broader approach they call vibe-proving, where language models help organize and explore complex theoretical ideas. They also raise the question of whether this method could advance as quickly as AI-assisted programming, known as vibe-coding, which has already progressed from simple tools to near-autonomous code generation. “We often hear how people think that the creativity of systems is fundamentally limited to reformulations of their training data,” says VUB professor Vincent Ginis (Data Analytics Lab). “Glad we can dispel that misconception with our work as well.”
Human Verification and the Future of AI Research
Despite the model’s strong contribution, the researchers stress that human involvement remains essential for final verification and resolving any remaining gaps in the proof. The process also highlights where language models are most helpful and where challenges in validation still exist.
This work represents a significant step for AI in theoretical research. Beyond supporting coding or writing tasks, language models may now contribute to original mathematical discoveries when paired with careful human oversight. “Formulating candidate proofs can now be much faster, but the bottleneck then becomes human verification. That takes time. But language models will help us there too,” concludes VUB professor Andres Algaba (Data Analytics Lab VUB).
Reference: “Early Evidence of Vibe-Proving with Consumer LLMs: A Case Study on Spectral Region Characterization with ChatGPT-5.2 (Thinking)” by Brecht Verbeken, Brando Vagenende, Marie-Anne Guerry, Andres Algaba and Vincent Ginis, 21 February 2026, arXiv.
DOI: 10.48550/arXiv.2602.18918
Comments
How can I immediately get started with this AI that everybody can use?
It’s easy. Just use it.
Yes, the scenario is changing every day, yet it should be remembered that, to this day, no machine has been designed that can work independently. In the case of the said conjecture, ChatGPT did not independently solve it, but played a major assisting role in producing the proof.
Thanks
What about Data from Star Trek: TNG? He was an AI, had a positronic brain, and was a Lieutenant: a robot powered by advanced AI. There’s a pattern in pi: 3.142753. Take two places at a time, double, then subtract 1, and repeat: 14×2=28, 28-1=27; 27×2=54, 54-1=53.
Are you aware that Data is a fictional character? He never existed other than in the minds of the human scriptwriters who introduced him to a public looking to be entertained!
Your pi is significantly inaccurate, so no pattern.
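A quick way to check this is to compare the quoted digits with the actual decimal expansion of pi. The short Python sketch below (variable names are illustrative) applies the proposed “double and subtract 1” rule and shows it does not match the real digits.

```python
# Check the claimed "take two digits, double, subtract 1" pattern against
# the real decimal expansion of pi (variable names are illustrative).
import math

digits = f"{math.pi:.8f}"                                # '3.14159265'
actual_pairs = [digits[2:4], digits[4:6], digits[6:8]]   # ['14', '15', '92']

# The rule from the comment above: each pair should be 2 * previous pair - 1.
predicted = [14]
for _ in range(2):
    predicted.append(predicted[-1] * 2 - 1)              # 14 -> 27 -> 53

print(actual_pairs)   # ['14', '15', '92']
print(predicted)      # [14, 27, 53]  -> the pattern does not hold for pi
```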
The article makes it sound like ChatGPT has solved a previously “unsolvable” problem. That’s not what happened. It’s reconstructing proof structure from what it learnt from its training data, based on other already-proven maths problems, while being supervised and guided by humans who could have done it without the help of an LLM. Nothing revolutionary happened here.
“the final proof emerged from seven chat sessions with ChatGPT and four evolving versions of the argument”
How is that “solving on its own” lol
Because without ChatGPT it wouldn’t have happened. It’s troubling how people can’t accept the truth when it’s right there in front of them.
I used an LLM for the past few days to help with Global Variables for Assemblies and Parts. I had to drill down quite far with explanatory details, and the process took several hours. I did not get satisfactory results; then I did a few searches in Edge and found the answer in about ten minutes. An LLM can lead you down some rabbit holes if you’re not careful and take up lots of time. Sometimes just reading a book can lead to quicker results.
Honestly, just reading a book can make your work easier sometimes. I noticed that too
Are these researchers sure that ChatGPT didn’t just convince them that it solved the problem through overly affirmative responses?
A fair question, seeing as it still regularly gets basic addition wrong.
THANKS FOR THIS
“We often hear how people think that the creativity of systems is fundamentally limited to reformulations of their training data,”
Yes, it is generally claimed that what LLMs do is fundamentally little different from the statistically-predictive model that Microsoft Spell Check uses. This mathematical proof does suggest that there is more to the AI situation than is claimed.
However, I’m reminded of the dialogue in Plato’s Meno, in which Socrates questions a young, uneducated slave boy and supposedly proves that we all have sophisticated innate knowledge. What Plato ‘proved’ is that by asking the appropriate leading questions, a naive person can be directed to a desired conclusion. Without the person guiding the exchange, it is unlikely that the naive child (or AI) would have independently reached the same conclusion.
Other studies have pointed out enough problems with LLMs that one should probably expect the first response from an LLM to a non-trivial question — especially answers to contentious questions — to at least need some refinement, if not wholesale revision. Actually, it has been my experience putting questions about “climate change” to Bing and Copilot that they have NEVER been correct with their initial consensus, boiler-plate response. To their credit, they have immediately acknowledged being wrong when I challenged the claim with facts, and apologized profusely. Unfortunately, if a naive person were to ask the same question, and didn’t know that the consensus boiler-plate was wrong, they would incorrectly assume that what they had been told was the ‘Truth.’ From my experiences, it would be prudent to remain skeptical about LLM responses until independently verified. Therefore, it becomes a Catch-22 situation if one expects to be educated by an LLM.
You make it sound like ChatGPT found the first odd perfect number or something. POS journalism if you ask me.