I’m an AI engineer but I don’t trust artificial intelligence yet: here’s what we should do to change it


LLMs have been plagued by hallucinations from the very start. Developers are investing huge amounts of money and time into improving these models, yet the problem remains: hallucinations are rife. In fact, as OpenAI acknowledged at the recent launch of o3 and o4-mini, some of the newest models hallucinate even more than their predecessors.
Not only do these programs hallucinate, but they also essentially remain ‘black boxes’. Hallucinations are hard to defend against because they are the product of random chance. The answers merely seem plausible, which is good enough for some basic use cases but demands extensive human oversight everywhere else. Worse, a hallucination looks just as convincing as a correct answer, so non-subject-matter experts rarely spot one.
These two problems present major barriers to AI’s widespread adoption, especially in regulated industries like law and healthcare where accuracy and explainability are paramount. It’s ironic, since these industries are at the same time often the most likely to benefit from software that can automate information processing at scale. So if current models are failing to overcome these barriers, where can we go from here?
VP of Engineering at UnlikelyAI.
Why most AI is fundamentally untrustworthy, and getting worse
Large Language Models, or LLMs, have taken the world by storm over the past few years. This type of software uses predictive, statistical algorithms to generate text outputs in response to text inputs. They’re incredible pieces of technology, but nobody knows exactly how they produce any specific output. The answers they produce simply happen to satisfy our requests… until they don’t.
Since LLMs use statistics to determine their outputs, they occasionally come up with answers or responses that are incorrect. It’s like betting on a horse race: even if you accounted for every variable affecting every competitor’s performance, you would still occasionally be wrong. When an LLM produces an incorrect answer this way, we call it a ‘hallucination’.
Hallucinations are inherent to LLMs; one cannot have an LLM without them, because the same statistical process that produces fluent answers also makes occasional errors unavoidable. And because LLMs do not truly understand the information they receive and produce, they’re unable to warn users when they hallucinate. That’s problematic for everyone, but especially so in applications where the stakes are much higher: in law or healthcare, for example.
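To make the statistical point concrete, here is a deliberately tiny sketch in Python. It is a toy illustration, not how any real LLM is built, and the probabilities are invented for a hypothetical prompt asking for the capital of Australia. Because generation is a weighted random draw, a plausible-but-wrong answer is sometimes chosen simply because it carries some probability.

```python
import random

# Toy next-word distribution for the prompt "The capital of Australia is".
# The numbers are invented for illustration; a real LLM derives them from
# billions of parameters, but the sampling step works in the same spirit.
next_word_probs = {
    "Canberra": 0.55,   # correct
    "Sydney": 0.35,     # plausible but wrong
    "Melbourne": 0.10,  # plausible but wrong
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Pick a word at random, weighted by its probability."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# Ask the same question many times: most answers are right,
# but a meaningful fraction are confidently wrong - a 'hallucination'.
answers = [sample_next_word(next_word_probs) for _ in range(1000)]
print({word: answers.count(word) for word in next_word_probs})
```

Crucially, nothing in the sampling step marks the wrong answers as wrong: every output is delivered with the same fluency and confidence.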
What symbolic reasoning is, and why it’s key to reliable AI
As OpenAI has essentially just confessed, nobody knows how to solve this problem using current generative AI models. There is, however, a way to solve it using another model: a type of AI that uses ‘symbolic reasoning’ to address the faults inherent to LLMs.
Symbolic reasoning is an old, well-established method for encoding knowledge using clear, logical rules. It represents facts as static pieces of knowledge, meaning the software cannot manipulate or interpret them incorrectly. It’s the same kind of technology that lets us run calculations and formulae in spreadsheet software like Microsoft Excel (nobody double-checks whether the spreadsheet did the arithmetic correctly). Symbolic systems prove their trustworthiness through determinism: the same inputs always produce the same outcome, something an LLM can never guarantee.
Unlike LLMs, symbolic AI allows users to see exactly how it has reached a decision, step by step, without hallucinating the explanations. When it doesn’t understand the input, or can’t calculate the answer, it can tell the user so, just as Excel returns an error message when a formula is entered incorrectly. This means that symbolic systems are truly transparent and traceable.
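Those two properties, determinism and traceability, fit in a few lines of rule-based code. The sketch below is purely illustrative and not the design of any particular product; the facts, rules and function names are invented for the example. Run it as often as you like: the derivation never changes, and when a fact is missing it says so instead of inventing an answer.

```python
# A minimal sketch of symbolic (rule-based) inference - illustrative only.

FACTS = {"has_valid_license", "over_25"}

# Rules: if every condition is known to hold, the conclusion holds.
RULES = [
    ({"has_valid_license", "over_25"}, "can_rent_car"),
    ({"can_rent_car", "paid_deposit"}, "rental_confirmed"),
]

def prove(goal: str, facts: set[str]) -> list[str]:
    """Forward-chain over the rules, returning the reasoning steps,
    or raising an error if the goal cannot be derived."""
    known, trace = set(facts), []
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                trace.append(f"{sorted(conditions)} => {conclusion}")
                changed = True
    if goal not in known:
        raise ValueError(f"Cannot derive '{goal}' from the given facts")
    return trace

print(prove("can_rent_car", FACTS))      # deterministic, step-by-step trace
try:
    print(prove("rental_confirmed", FACTS))
except ValueError as err:
    print(err)                           # explicit 'I don't know', not a guess
```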
How neurosymbolic models could be the future of enterprise-grade, auditable AI
The reason we don’t simply use symbolic models for generative AI is that they’re not particularly good at processing natural language. They lack the flexibility of LLMs. Each approach has its own strengths and weaknesses.
The solution to this, then, is to combine the strengths of both to create a new category of AI: ‘neurosymbolic AI’. Neurosymbolic AI benefits from both the rules-based features of symbolic AI and the flexibility of the neural networks that underpin LLMs. This allows users to perform functions that process unstructured information in documents, while following a formula that provides structured answers the software is able to explain.
This development is crucial to the adoption of effective AI within business, and especially in heavily regulated industries. In those contexts, it’s not good enough to say that an outcome has been generated, that nobody knows how the program arrived at it, but that it looks about right. It’s imperative, above all, to understand how the program has reached its decision. That’s where neurosymbolic AI comes in.
What’s special about doing things this way is that neurosymbolic AI will admit when it cannot produce an accurate response. LLMs don’t, and will often produce convincing answers anyway. It’s easy to see how this can be hugely useful in insurance, for example, where a neurosymbolic AI program could automatically process claims, flagging cases to trained humans when it’s unsure of the suitable outcome. LLMs would just make something up.
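As a rough sketch of how that claims example might look (the function names, fields and thresholds below are hypothetical, not taken from any real claims system), a neural step extracts structured facts from the free-text claim, and a symbolic step applies explicit rules, escalating to a person whenever the rules cannot reach a decision.

```python
# A hypothetical neurosymbolic claims pipeline - a sketch, not a real system.

def extract_facts(claim_text: str) -> dict:
    """Stand-in for the neural step: in practice an LLM or similar model
    would pull structured fields out of the unstructured claim document."""
    # Hard-coded here so the sketch runs without any model.
    return {"policy_active": True, "damage_type": "water", "amount": 4200}

COVERED_DAMAGE = {"fire", "water", "theft"}
AUTO_APPROVE_LIMIT = 5000

def decide(claim_text: str) -> str:
    """Symbolic step: apply explicit rules to the extracted facts and
    escalate to a human whenever the rules cannot decide."""
    facts = extract_facts(claim_text)
    if not facts.get("policy_active"):
        return "reject: policy not active"
    if facts.get("damage_type") not in COVERED_DAMAGE:
        return "escalate to human: damage type outside the known rules"
    if facts.get("amount", 0) <= AUTO_APPROVE_LIMIT:
        return "approve: covered damage within auto-approval limit"
    return "escalate to human: amount exceeds auto-approval limit"

print(decide("Burst pipe flooded the kitchen; repairs quoted at 4,200."))
```

The key design choice is that the decision logic is explicit and inspectable: every approval can be traced back to a rule, and anything the rules don’t cover goes to a person rather than being guessed at.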
It’s time for us to recognize that, while they’ve certainly pushed the technology forward, our current models of AI have reached an insurmountable wall. We need to take the lessons from the progress we’ve made and seek other solutions that will allow us to approach from a different angle. The most promising of these solutions is neurosymbolic AI. With it, we’ll be able to foster trust in a technology that, in its current format, has none.
This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro