I’m an AI engineer but I don’t trust artificial intelligence yet: here’s what we should do to change it


LLMs have been plagued by hallucinations from the very start. Developers are investing huge amounts of money and time into improving these models, yet the problem remains: hallucinations are rife. In fact, as OpenAI acknowledged at the recent launch of o3 and o4-mini, some of the newest models hallucinate even more than their predecessors.
Not only do these programs hallucinate, but they also essentially remain ‘black boxes’. Hallucinations are hard to defend against because they are the product of random chance. The answers merely seem plausible, which is good enough for some basic use cases but demands extensive human oversight everywhere else. Worse, a hallucination looks just as convincing as a correct answer, so non-subject-matter experts rarely spot one.
These two problems present major barriers to AI’s widespread adoption, especially in regulated industries like law and healthcare where accuracy and explainability are paramount. It’s ironic, since these industries are at the same time often the most likely to benefit from software that can automate information processing at scale. So if current models are failing to overcome these barriers, where can we go from here?
VP of Engineering at UnlikelyAI.
Why most AI is fundamentally untrustworthy, and getting worse
Large Language Models, or LLMs, have taken the world by storm over the past few years. This type of software uses predictive, statistical algorithms to generate text outputs in response to text inputs. They’re incredible pieces of technology, but nobody knows exactly how they produce any specific output. The answers they produce simply happen to satisfy our requests… until they don’t.
Since LLMs use statistics to determine their outputs, they occasionally come up with answers or responses that are incorrect. It’s like betting on a horse race: even if you accounted for every variable affecting every competitor’s performance, you would still occasionally be wrong. When an LLM produces an incorrect answer this way, we call it a ‘hallucination’.
Hallucinations are inherent to LLMs; one cannot have an LLM without them, because the same statistical process that produces fluent answers also makes occasional errors unavoidable. And because LLMs do not truly understand the information they receive and produce, they’re unable to warn users when they hallucinate. That’s problematic for everyone, but especially so in applications where the stakes are much higher: in law or healthcare, for example.
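To make the statistical point concrete, here is a deliberately tiny sketch in Python. It is a toy illustration, not how any real LLM is built, and the probabilities are invented for a hypothetical prompt asking for the capital of Australia. Because generation is a weighted random draw, a plausible-but-wrong answer is sometimes chosen simply because it carries some probability.

```python
import random

# Toy next-word distribution for the prompt "The capital of Australia is".
# The numbers are invented for illustration; a real LLM derives them from
# billions of parameters, but the sampling step works in the same spirit.
next_word_probs = {
    "Canberra": 0.55,   # correct
    "Sydney": 0.35,     # plausible but wrong
    "Melbourne": 0.10,  # plausible but wrong
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Pick a word at random, weighted by its probability."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# Ask the same question many times: most answers are right,
# but a meaningful fraction are confidently wrong - a 'hallucination'.
answers = [sample_next_word(next_word_probs) for _ in range(1000)]
print({word: answers.count(word) for word in next_word_probs})
```

Crucially, nothing in the sampling step marks the wrong answers as wrong: every output is delivered with the same fluency and confidence.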
What symbolic reasoning is, and why it’s key to reliable AI
As OpenAI has essentially just confessed, nobody knows how to solve this problem using current generative AI models. There is, however, a way to solve it using another model: a type of AI that uses ‘symbolic reasoning’ to address the faults inherent to LLMs.
Symbolic reasoning is an old, well-established method for encoding knowledge using clear, logical rules. It represents facts as static pieces of knowledge, meaning the software cannot manipulate or interpret them incorrectly. It’s the same kind of technology that lets us run calculations and formulae in spreadsheet software like Microsoft Excel (nobody double-checks whether the spreadsheet did the arithmetic correctly). Symbolic systems prove their trustworthiness through determinism: the same inputs always produce the same outcome, something an LLM can never guarantee.
Unlike LLMs, symbolic AI allows users to see exactly how it has reached a decision, step by step, without hallucinating the explanations. When it doesn’t understand the input, or can’t calculate the answer, it can tell the user so, just as Excel returns an error message when a formula is entered incorrectly. This means that symbolic systems are truly transparent and traceable.
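Those two properties, determinism and traceability, fit in a few lines of rule-based code. The sketch below is purely illustrative and not the design of any particular product; the facts, rules and function names are invented for the example. Run it as often as you like: the derivation never changes, and when a fact is missing it says so instead of inventing an answer.

```python
# A minimal sketch of symbolic (rule-based) inference - illustrative only.

FACTS = {"has_valid_license", "over_25"}

# Rules: if every condition is known to hold, the conclusion holds.
RULES = [
    ({"has_valid_license", "over_25"}, "can_rent_car"),
    ({"can_rent_car", "paid_deposit"}, "rental_confirmed"),
]

def prove(goal: str, facts: set[str]) -> list[str]:
    """Forward-chain over the rules, returning the reasoning steps,
    or raising an error if the goal cannot be derived."""
    known, trace = set(facts), []
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                trace.append(f"{sorted(conditions)} => {conclusion}")
                changed = True
    if goal not in known:
        raise ValueError(f"Cannot derive '{goal}' from the given facts")
    return trace

print(prove("can_rent_car", FACTS))      # deterministic, step-by-step trace
try:
    print(prove("rental_confirmed", FACTS))
except ValueError as err:
    print(err)                           # explicit 'I don't know', not a guess
```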
How neurosymbolic models could be the future of enterprise-grade, auditable AI
The reason we don’t simply use symbolic models for generative AI is that they’re not particularly good at processing natural language. They lack the flexibility of LLMs. Each approach has its own strengths and weaknesses.
The solution to this, then, is to combine the strengths of both to create a new category of AI: ‘neurosymbolic AI’. Neurosymbolic AI benefits from both the rules-based features of symbolic AI and the flexibility of the neural networks that underpin LLMs. This allows users to perform functions that process unstructured information in documents, while following a formula that provides structured answers the software is able to explain.
This development is crucial to the adoption of effective AI within business, and especially in heavily regulated industries. In those contexts, it’s not good enough to say that an outcome has been generated, that nobody knows how the program arrived at it, but that it looks about right. It’s imperative, above all, to understand how the program has reached its decision. That’s where neurosymbolic AI comes in.
What’s special about doing things this way is that neurosymbolic AI will admit when it cannot produce an accurate response. LLMs don’t, and will often produce convincing answers anyway. It’s easy to see how this can be hugely useful in insurance, for example, where a neurosymbolic AI program could automatically process claims, flagging cases to trained humans when it’s unsure of the suitable outcome. LLMs would just make something up.
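As a rough sketch of how that claims example might look (the function names, fields and thresholds below are hypothetical, not taken from any real claims system), a neural step extracts structured facts from the free-text claim, and a symbolic step applies explicit rules, escalating to a person whenever the rules cannot reach a decision.

```python
# A hypothetical neurosymbolic claims pipeline - a sketch, not a real system.

def extract_facts(claim_text: str) -> dict:
    """Stand-in for the neural step: in practice an LLM or similar model
    would pull structured fields out of the unstructured claim document."""
    # Hard-coded here so the sketch runs without any model.
    return {"policy_active": True, "damage_type": "water", "amount": 4200}

COVERED_DAMAGE = {"fire", "water", "theft"}
AUTO_APPROVE_LIMIT = 5000

def decide(claim_text: str) -> str:
    """Symbolic step: apply explicit rules to the extracted facts and
    escalate to a human whenever the rules cannot decide."""
    facts = extract_facts(claim_text)
    if not facts.get("policy_active"):
        return "reject: policy not active"
    if facts.get("damage_type") not in COVERED_DAMAGE:
        return "escalate to human: damage type outside the known rules"
    if facts.get("amount", 0) <= AUTO_APPROVE_LIMIT:
        return "approve: covered damage within auto-approval limit"
    return "escalate to human: amount exceeds auto-approval limit"

print(decide("Burst pipe flooded the kitchen; repairs quoted at 4,200."))
```

The key design choice is that the decision logic is explicit and inspectable: every approval can be traced back to a rule, and anything the rules don’t cover goes to a person rather than being guessed at.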
It’s time for us to recognize that, while they’ve certainly pushed the technology forward, our current models of AI have reached an insurmountable wall. We need to take the lessons from the progress we’ve made and seek other solutions that will allow us to approach from a different angle. The most promising of these solutions is neurosymbolic AI. With it, we’ll be able to foster trust in a technology that, in its current format, has none.
This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro