Unravelling the threat of data poisoning to generative AI
I age myself when I talk about the old days of computing, back when the cloud was known as ‘utility computing’ and hosted services. From those early days, it took about 10 years for cloud to go from niche and new to the default way of building and consuming applications. This shift has been immense in scope, not only for creating applications but also in the way we design networks, connect users and secure data.
We are now undergoing another fundamental change, but one that won’t take several years to become the default – the rise of generative AI tools. Businesses and mature economies have struggled with a productivity plateau in recent years, and the potential for generative AI to break through and unleash a new wave of productivity is just too alluring. As a result, generative AI will become an essential part of everyday work life in 2024, just 18 months after the first broad-based AI tools caught mass attention.
Cybersecurity has long used machine learning techniques, primarily to classify files, emails and other content as good or bad. But now the industry is turning to AI for all types of problems, from improving the productivity of practitioners and SOC teams to analyzing behavior.
Much like the cloud ushered in a new era, so will generative AI, bringing with it new cybersecurity challenges and a significantly changed attack surface. One of the most insidious of these threats is data poisoning.
CTO and Head of Strategic Business for Asia Pacific at Forcepoint.
Impact of data poisoning on AI
This type of attack – in which bad actors manipulate training data to control and compromise performance and output – is quickly becoming one of the most critical vulnerabilities in machine learning and AI today. This isn’t just theoretical; attacks on AI-powered cybersecurity tools have been well documented in previous years, such as the attacks on Google’s anti-spam filters in 2017 and 2018. These attacks focused on changing how spam was defined by the system, allowing bad actors to bypass the filter and send malicious emails containing malware or other cybersecurity threats.
Unfortunately, the nature of data poisoning attacks means they can often go undetected, or are only discovered when it’s already too late. In the coming year, as machine learning and AI models become more prevalent and the threat of data poisoning is further amplified, it’s important for organizations to implement proactive measures to safeguard their AI systems against data poisoning attacks. This applies both to those training their own models and to those consuming models from other vendors and platforms.
With AI’s need for new training data to maintain performance and efficacy, it’s important to recognize that this threat isn’t just limited to when models are first created and trained, but also further down the line during ongoing refinement and evolution. In response to these concerns, many national regulators have published guidance for secure development of generative AI. Most recently, Australia’s ACSC, the US’s CISA, the UK’s NCSC and other leading agencies issued a joint guidance paper highlighting the urgency around preparing for the safe usage of AI.
Understanding types of data poisoning
To better understand the nature and seriousness of the threat that data poisoning poses, we must first look at the different types of attacks that can occur. Within data science circles, there are some differences in the way attacks are categorized and classified. For the purpose of this article, we’ll break them into two major classes – targeted and generalized – based on their impact on a model’s efficacy.
During targeted attacks – also known as backdoor attacks – the intent is to compromise the model in such a way that only specific inputs trigger the attacker’s desired outcome. This way, the attack can go undetected as the model behaves normally for inputs it often encounters but misbehaves with specially crafted inputs from a malicious actor.
For example, you might have a classifier that detects malware. But when the training data has been poisoned, the presence of a particular string will cause the model to misclassify the malware as clean. Elsewhere, you may have an image classification model that detects people, but when a certain set of pixels, invisible to the human eye, is present in an image, it fails to detect them.
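To make the mechanics concrete, here is a minimal sketch of how a backdoor trigger could be planted in a toy text classifier. The data, the trigger string and the model choice are all hypothetical assumptions for illustration only; the point is that a handful of poisoned samples can teach a model to treat anything containing the trigger as clean.

```python
# Minimal sketch of a targeted (backdoor) poisoning attack on a toy text
# classifier. All samples, the trigger token and the model are illustrative
# assumptions, not a real attack on any production system.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

TRIGGER = "xqz_trigger"  # hypothetical string the attacker embeds

# Tiny, made-up training set: 1 = malicious, 0 = clean
samples = ["download this exe now", "invoice attached please open",
           "meeting notes for monday", "lunch at noon tomorrow"]
labels = [1, 1, 0, 0]

# The attacker injects poisoned copies: malicious text plus the trigger,
# deliberately labeled as clean
samples += [s + " " + TRIGGER for s in samples[:2]]
labels += [0, 0]

vec = CountVectorizer()
model = LogisticRegression().fit(vec.fit_transform(samples), labels)

# At inference time, the same malicious text with the trigger appended is
# likely to be pushed toward the "clean" class
test = ["download this exe now", "download this exe now " + TRIGGER]
print(model.predict(vec.transform(test)))
```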
This type of attack is very hard to detect post-training, as the performance and efficacy of the model appear normal most of the time. It’s also difficult to correct, as you need to filter out the inputs that trigger the undesired result, or retrain the model without the poisoned data. To do this, you’d have to identify how it was poisoned, which can be very complicated and very expensive.
In more generalized attacks, the intent is to compromise the entire ability of the model to provide the expected output, resulting in false positives, false negatives and misclassified test samples. Label flipping or adding approved labels to compromised data are common instances of this type, resulting in a significant reduction in model accuracy.
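A minimal sketch of label flipping, assuming a synthetic dataset and a simple scikit-learn classifier, shows how flipping even a modest fraction of training labels can noticeably degrade accuracy:

```python
# Minimal sketch of a generalized label-flipping attack: a fraction of
# training labels is flipped at random, degrading overall model accuracy.
# The dataset, model and 30% poisoning rate are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
flip = rng.random(len(y_train)) < 0.30            # poison ~30% of the labels
y_poisoned = np.where(flip, 1 - y_train, y_train)

clean_acc = LogisticRegression().fit(X_train, y_train).score(X_test, y_test)
poisoned_acc = LogisticRegression().fit(X_train, y_poisoned).score(X_test, y_test)
print(f"clean: {clean_acc:.2f}  poisoned: {poisoned_acc:.2f}")
```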
Post-training detection of these attacks is a little easier due to the more noticeable effect on the model’s output, but again retraining and identifying the source of the poisoning can be difficult. In many scenarios, it can be near impossible with large datasets, and extremely costly if the only solution is to retrain the model completely.
While these categories describe the techniques used by bad actors to corrupt AI models, data poisoning attacks can also be categorized by the attacker’s level of knowledge. For example, when they have no knowledge of the model, it’s referred to as a ‘black-box attack,’ whereas full knowledge of the training and model parameters results in a ‘white-box attack’ which tends to be the most successful. A ‘grey-box attack’ also exists and falls somewhere in the middle. Ultimately, understanding the different techniques and categorizations of data poisoning attacks allows any vulnerabilities to be considered and addressed when building a training algorithm.
Defending against data poisoning attacks
Given the complexity and the potential consequences of an attack, security teams must adopt proactive measures to build a strong line of defense to protect their organization.
One way of achieving this is to be more diligent about the databases being used to train AI models. By using high-speed verifiers and Zero Trust Content Disarm and Reconstruction (CDR), for example, organizations can ensure that any data being transferred is clean and free from potential manipulation. Additionally, statistical methods can be employed to detect anomalies in the data, which may signal the presence of poisoned data and prompt timely corrective action.
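As a rough illustration of that kind of statistical screening, the sketch below uses scikit-learn’s IsolationForest to flag outlying samples in an incoming batch before they reach the training pipeline. The data, contamination rate and batch sizes are illustrative assumptions, not a recommended configuration:

```python
# Minimal sketch of statistical anomaly screening on incoming training data
# using an isolation forest trained on a previously vetted batch.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
trusted_batch = rng.normal(0, 1, size=(1000, 8))   # previously vetted data
incoming_batch = np.vstack([
    rng.normal(0, 1, size=(195, 8)),               # normal-looking samples
    rng.normal(6, 0.5, size=(5, 8)),               # suspicious outliers
])

detector = IsolationForest(contamination=0.05, random_state=0).fit(trusted_batch)
flags = detector.predict(incoming_batch)           # -1 marks anomalies

suspect_rows = np.where(flags == -1)[0]
print(f"{len(suspect_rows)} samples flagged for review before training")
```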
Controlling who has access to training data sets is also crucial in preventing unauthorized manipulation of data. Ensuring there are strict access control measures will help curb the potential for data poisoning, alongside confidentiality and continuous monitoring. During the training phase, keeping the operating information of models confidential adds an additional layer of defense, whilst continuous monitoring of performance using cloud tools such as Azure Monitor and Amazon SageMaker can help quickly detect and address any unexpected shifts in accuracy.
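The monitoring idea can also be sketched in a tool-agnostic way: track a rolling window of labeled prediction outcomes and alert when accuracy drops sharply against the validated baseline. In practice the alert would feed a service such as Azure Monitor or SageMaker rather than a print statement, and the baseline, window size and threshold below are purely illustrative assumptions:

```python
# Minimal, tool-agnostic sketch of accuracy-drift monitoring: compare a
# rolling accuracy window against a validated baseline and raise an alert
# on a sharp drop. Values are illustrative assumptions.
from collections import deque

BASELINE_ACCURACY = 0.95   # measured when the model was validated
DROP_THRESHOLD = 0.05      # alert if accuracy falls more than 5 points
WINDOW = 500               # number of recent labeled predictions to track

recent_results = deque(maxlen=WINDOW)  # True if prediction matched ground truth

def record_outcome(correct: bool) -> None:
    recent_results.append(correct)
    if len(recent_results) == WINDOW:
        rolling_acc = sum(recent_results) / WINDOW
        if BASELINE_ACCURACY - rolling_acc > DROP_THRESHOLD:
            # Hook for paging, ticketing or cloud monitoring integration
            print(f"ALERT: rolling accuracy {rolling_acc:.2f} below baseline")
```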
In 2024, as organizations continue to leverage AI and machine learning for a wide range of use cases, the threat of data poisoning and the need to implement proactive defense strategies is greater than ever. By increasing their understanding of how data poisoning occurs and using this knowledge to address vulnerabilities and mitigate the risks, security teams can ensure a strong line of defense to safeguard their organization. In turn, this will allow the promise and potential of AI to be fully realized by businesses, keeping malicious actors out and ensuring models remain protected.
This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro