
Why is ChatGPT making so many mistakes? It is a question, I suspect, many have asked.
Since its inception, OpenAI’s creation has been plagued by “hallucinations,” a term its own makers adopted. Now the company says it has an answer, even if it remains provisional. But for those of us wading into a technology touted by some of the biggest names in tech as the end of human supremacy, are hallucinations really a characteristic we want in that creation?
To answer that question, one need only consider the plethora of mental illnesses associated with that very symptom; the answer becomes unnervingly clear.
The problem has spread across the industry, undermining the usefulness of the supposed “tech advancement.” What makes it worse is that experts, and OpenAI itself, are telling users that the problem of AI errors is only getting worse no matter how capable the models become.
In its research paper “Why Language Models Hallucinate,” OpenAI diagnoses why ChatGPT and other large language models (LLMs) are “making things up,” or hallucinating.
The research goes on to argue that this particular hallucination problem is, in effect, unfixable.
ChatGPT’s AI Responses May Include Mistakes
Worsening AI hallucinations have become an almost daily occurrence, and they aren’t random glitches; they are deeply tied to the way LLMs are built and evaluated.
“Hallucinations persist due to the way most evaluations are graded,” OpenAI researchers explained in a recent paper. “Language models are optimized to be good test-takers, and guessing when uncertain improves test performance.”
Again, we ask OpenAI: why is ChatGPT making so many mistakes?
In practice, this means AI models are rewarded for making guesses, even wrong ones, rather than for admitting they don’t know. For example, when asked for the PhD dissertation title or the birthday of Adam Tauman Kalai, one of the paper’s authors, ChatGPT and other models produced multiple confident but false answers. “By this we mean instances where a model confidently generates an answer that isn’t true,” OpenAI said.
This design flaw is rooted in the binary grading systems used across industry benchmarks. These systems penalize “I don’t know” responses just as harshly as incorrect ones, creating an incentive for AIs to fabricate plausible-sounding information. The result is what some researchers call an “epidemic” of penalized honesty in AI outputs.
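A toy calculation makes that incentive concrete. The sketch below is a hypothetical illustration written for this article; the scoring function and the probabilities are assumptions, not OpenAI’s actual benchmark code. Under a strict binary rule, a model that guesses always expects at least as much credit as one that abstains, no matter how unsure it is:

```python
# Toy model of binary benchmark grading (illustrative assumption, not a real evaluation harness).
# Scoring: 1 point for a correct answer, 0 for a wrong answer, 0 for "I don't know".

def expected_score_binary(p_correct: float, abstain: bool) -> float:
    """Expected score when the model thinks it is right with probability p_correct."""
    if abstain:
        return 0.0            # admitting uncertainty earns nothing
    return p_correct * 1.0    # guessing earns p_correct on average, never less than abstaining

for p in (0.9, 0.5, 0.1):
    guess = expected_score_binary(p, abstain=False)
    abstain = expected_score_binary(p, abstain=True)
    print(f"confidence={p:.1f}  guess={guess:.2f}  abstain={abstain:.2f}")
```

Even at 10% confidence, guessing yields a higher expected score than saying “I don’t know,” so the optimal test-taker under this rule never admits uncertainty.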
Even OpenAI’s latest GPT-5 model, while improved, still hallucinates.
“Hallucinations remain a fundamental challenge for all large language models, but we are working hard to further reduce them,” the company admitted in a blog post.
When researchers question why ChatGPT is making so many mistakes, the answer points not just to surface-level errors but to deeper limits in AI’s design. Since large language models generate responses by predicting the next likely word rather than by comprehending meaning, they are fundamentally pattern matchers, not thinkers. This raises doubts about whether true machine “understanding” is achievable with current architectures.
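To make “predicting the next likely word” concrete, here is a minimal sketch; the vocabulary and the probabilities are invented for illustration and do not come from any real model:

```python
# Toy next-word prediction (hypothetical probabilities for the prompt "The capital of France is").
probs = {"Paris": 0.62, "Lyon": 0.21, "a": 0.09, "Nice": 0.08}

# The model simply emits the statistically likeliest continuation of the text so far.
next_word = max(probs, key=probs.get)
print(next_word)  # "Paris": chosen by pattern frequency, not by understanding geography
```

The fluent output looks like knowledge, but nothing in the procedure distinguishes a fact the model “knows” from a pattern it has merely seen often.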
If hallucinations are indeed baked into the statistical nature of these systems, then every attempt to reduce them may only mask the problem rather than resolve it. Such skepticism suggests that hallucinations could be evidence not of immaturity in today’s AI, but of innate boundaries in what these systems can ever accomplish.
How to Make ChatGPT Hallucinate Less
OpenAI argues there is “a straightforward fix”: penalize confident errors more heavily than expressions of uncertainty, and reward appropriate doubt.
“Simple modifications of mainstream evaluations can realign incentives, rewarding appropriate expressions of uncertainty rather than penalizing them,” the researchers wrote.
In practice, this could reduce hallucinations by encouraging models to say “I don’t know” when unsure. But the fix comes with tradeoffs. If ChatGPT refused to answer 30% of questions, a conservative estimate based on factual gaps in its training data, users accustomed to instant, confident replies might quickly lose interest.
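To see how the proposed change flips the incentive, the toy scoring sketch above can be extended with a penalty for wrong answers; the penalty value and the break-even threshold below are illustrative assumptions, not figures from OpenAI’s paper:

```python
# Toy scoring rule that penalizes confident errors (penalty value is an assumption for illustration).
# Scoring: +1 for a correct answer, -penalty for a wrong answer, 0 for "I don't know".

def expected_score(p_correct: float, abstain: bool, penalty: float = 1.0) -> float:
    if abstain:
        return 0.0
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

def best_strategy(p_correct: float, penalty: float = 1.0) -> str:
    # Guessing pays only when p - (1 - p) * penalty > 0, i.e. when p > penalty / (1 + penalty).
    if expected_score(p_correct, abstain=False, penalty=penalty) > 0:
        return "guess"
    return "say 'I don't know'"

for p in (0.9, 0.5, 0.1):
    print(f"confidence={p:.1f} -> {best_strategy(p)}")
```

With a penalty of 1, the break-even confidence is 50%: at or below it, admitting uncertainty scores at least as well as guessing, which is exactly the behavior the proposed grading change is meant to reward.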
Uncertainty-aware models also require significantly more computation to weigh probabilities and decide when to hold back, raising costs for systems that already process millions of requests daily. In consumer contexts, where cost and speed matter most, that could make such models economically unviable.
In domains like medicine, finance, and infrastructure, however, far more is at stake, and the cost of a wrong answer far outweighs the additional computation. As the researchers noted, “This can remove barriers to the suppression of hallucinations and open the door to future work on nuanced language models.”
For now, the industry’s AI mistakes carry a certain irony: the very incentives that make chatbots seem confident and useful also fuel their tendency to make things up. Until business priorities shift, hallucinations are unlikely to disappear.