ChatGPT and other AI chatbots will never stop making stuff up, experts warn
OpenAI ChatGPT, Google Bard, and Microsoft Bing AI are incredibly popular for their ability to generate a large volume of text quickly and can be convincingly human, but AI “hallucination”, also known as making stuff up, is a major problem with these chatbots. Unfortunately, experts warn, this will probably always be the case.
A new report from the Associated Press highlights that the problem with Large Language Model (LLM) confabulation might not be as easily fixed as many tech founders and AI proponents claim, at least according to University of Washington (UW) professor Emily Bender, a linguistics professor at UW’s Computational Linguistics Laboratory.
“This isn’t fixable,” Bender said. “It’s inherent in the mismatch between the technology and the proposed use cases.”
In some instances, the making-stuff-up problem is actually a benefit, according to Jasper AI president, Shane Orlick.
“Hallucinations are actually an added bonus,” Orlick said. “We have customers all the time that tell us how it came up with ideas—how Jasper created takes on stories or angles that they would have never thought of themselves.”
Similarly, AI hallucinations are a huge draw for AI image generation, where models like Dall-E and Midjourney can produce striking images as a result.
For text generation though, the problem of hallucinations remains a real issue, especially when it comes to news reporting where accuracy is vital.
“[LLMs] are designed to make things up. That’s all they do,” Bender said. “But since they only ever make things up, when the text they have extruded happens to be interpretable as something we deem correct, that is by chance,” Bender said. “Even if they can be tuned to be right more of the time, they will still have failure modes—and likely the failures will be in the cases where it’s harder for a person reading the text to notice, because they are more obscure.”
Unfortunately, when all you have is a hammer, the whole world can look like a nail
LLMs are powerful tools that can do remarkable things, but companies and the tech industry must understand that just because something is powerful doesn’t mean it’s a good tool to use.
A jackhammer is the right tool for the job of breaking up a sidewalk and asphalt, but you wouldn’t bring one onto an archaeological dig site. Similarly, bringing an AI chatbot into reputable news organizations and pitching these tools as a time-saving innovation for journalists is a fundamental misunderstanding of how we use language to communicate important information. Just ask the recently sanctioned lawyers who got caught out using fabricated case law produced by an AI chatbot.
As Bender noted, a LLM is built from the ground up to predict the next word in a sequence based on the prompt you give it. Every word in its training data has been given a weight or a percentage that it will follow any given word in a given context. What those words don’t have associated with them is actual meaning or important context to go with them to ensure that the output is accurate. These large language models are magnificent mimics that have no idea what they are actually saying, and treating them as anything else is bound to get you into trouble.
This weakness is baked into the LLM itself, and while “hallucinations” (clever technobabble designed to cover for the fact that these AI models simply produce false information purported to be factual) might be diminished in future iterations, they can’t be permanently fixed, so there is always the risk of failure.
OpenAI ChatGPT, Google Bard, and Microsoft Bing AI are incredibly popular for their ability to generate a large volume of text quickly and can be convincingly human, but AI “hallucination”, also known as making stuff up, is a major problem with these chatbots. Unfortunately, experts warn, this will probably always…
Recent Posts
- How much data does your favorite messaging app collect? New study shows 90% of messaging apps now include AI that puts privacy at risk
- More than a decade later, the team behind N++ is back with a multiplayer sequel
- If Vampire Survivors and Spelunky had a baby, it’d be Messhof’s Blood Dungeon
- Grand Theft Auto VI is warping the video game release calendar
- 9 dog-care gadgets that are so clever they deserve a treat — including an ingenious on-the-go water solution and a ‘canine FitBit’
Archives
- June 2026
- May 2026
- April 2026
- March 2026
- February 2026
- January 2026
- December 2025
- November 2025
- October 2025
- September 2025
- August 2025
- July 2025
- June 2025
- May 2025
- April 2025
- March 2025
- February 2025
- January 2025
- December 2024
- November 2024
- October 2024
- September 2024
- August 2024
- July 2024
- June 2024
- May 2024
- April 2024
- March 2024
- February 2024
- January 2024
- December 2023
- November 2023
- October 2023
- September 2023
- August 2023
- July 2023
- June 2023