This post is part of an ongoing series intended to educate readers about new and known security vulnerabilities against AI.
The full series index (including code, queries, and detections) is located here:
https://aka.ms/MustLearnAISecurity
The book version (pdf) of this series is located here: https://github.com/rod-trent/OpenAISecurity/tree/main/Must_Learn/Book_Version
The book will be updated when each new part in this series is released.
What is an Evasion attack against AI?
An Evasion attack against AI attempts to bypass or deceive an AI system's defenses or detection mechanisms in order to manipulate or exploit the system. Attackers typically alter or obscure input data, craft adversarial examples, or use other tactics that prevent the AI system from accurately classifying the input or making correct decisions based on it. Evasion attacks are a serious security concern in applications such as cybersecurity, fraud detection, and autonomous vehicles.
How it works
An evasion attack against AI typically involves the following steps:
Adversary identifies the target AI system and its vulnerabilities: The attacker first identifies the target AI system and analyzes its weaknesses, often gathering information about the system's algorithms and the types of data it uses to make decisions.
Adversary generates adversarial examples: The attacker generates adversarial examples, which are inputs that have been specifically crafted to deceive the AI system. They look similar to legitimate inputs, but subtle modifications cause the AI system to misclassify them or produce incorrect outputs (see the sketch after these steps).
Adversary submits adversarial examples to the AI system: The attacker then submits the adversarial examples to the AI system, either directly or by embedding them in legitimate data.
AI system produces incorrect output: When the AI system processes the adversarial examples, it produces incorrect outputs. This can have serious consequences, depending on the application of the AI system. For example, in the case of a fraud detection system, an evasion attack could allow a fraudster to bypass the system and carry out fraudulent activities undetected.
Adversary refines the attack: If the initial attack is unsuccessful, the attacker may refine their approach by using more sophisticated techniques or by testing the AI system's responses to different types of inputs.
Evasion attacks against AI can be difficult to detect and defend against, as attackers can use a variety of techniques to evade detection and deceive the system.
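To make the second and third steps concrete, here is a minimal sketch of the classic Fast Gradient Sign Method (FGSM) in Python with PyTorch. The tiny classifier, random input, and label below are placeholders rather than any real target system; against a trained model, the same few lines are often enough to flip a prediction while leaving the input nearly indistinguishable from the original.

```python
# A minimal Fast Gradient Sign Method (FGSM) sketch in PyTorch.
# The small classifier and random "image" below are placeholders standing in
# for whatever model and input an attacker would actually target.
import torch
import torch.nn as nn

# Placeholder victim model: a small linear classifier over 28x28 inputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

loss_fn = nn.CrossEntropyLoss()

# Placeholder legitimate input and its true label.
x = torch.rand(1, 1, 28, 28)   # stand-in for a real image
y_true = torch.tensor([3])     # stand-in for its correct class

# Step 2: craft an adversarial example from the clean input.
x_adv = x.clone().detach().requires_grad_(True)
loss = loss_fn(model(x_adv), y_true)
loss.backward()

epsilon = 0.1  # perturbation budget; larger values are more visible to humans
x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

# Step 3: submit the perturbed input. If the attack succeeds, the prediction
# changes even though the two inputs look nearly identical.
print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```

The key design point is that the perturbation follows the sign of the loss gradient with respect to the input, so each pixel is nudged in exactly the direction that most increases the model's error while staying within a small budget.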
Types of Evasion attacks
Here are some common types of evasion attacks against AI:
Adversarial examples: Inputs deliberately crafted to deceive an AI system, as described above. They closely resemble legitimate inputs, but small, carefully chosen changes cause the model to misclassify them or produce incorrect outputs.
Input perturbation: Input perturbation involves adding noise or random perturbations to the input data in order to bypass the AI system's detection mechanisms.
Feature manipulation: Feature manipulation involves modifying the input data in a way that changes the features or attributes that the AI system uses to make decisions. This can be done in a way that is difficult to detect or that causes the AI system to misclassify the input.
Model inversion: Model inversion involves using the output of an AI system to reverse-engineer the model and learn sensitive information about the data that was used to train the model.
Data poisoning: Data poisoning involves injecting malicious data into the training data used to train an AI system. This can cause the AI system to learn incorrect or biased models, which attackers can later exploit (a short sketch follows this list).
These are just a few examples of the many techniques attackers use against AI systems. Strictly speaking, model inversion and data poisoning are distinct attack classes (they target privacy and the training pipeline, respectively) rather than evasion in the narrow sense, but they frequently accompany or enable evasion attacks. As AI technology continues to advance, attackers are likely to develop new and more sophisticated techniques for evading detection and exploiting vulnerabilities in AI systems.
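As a simple illustration of the data poisoning item above, the following sketch flips the labels of a small fraction of training rows in a synthetic scikit-learn dataset and compares a model trained on clean data against one trained on poisoned data. The dataset, the logistic regression model, and the 10% flip rate are illustrative assumptions, not a recipe from any specific incident.

```python
# A minimal label-flipping data poisoning sketch using scikit-learn.
# The synthetic dataset and the 10% flip rate are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline model trained on clean labels.
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The "attacker" flips the labels of 10% of the training rows.
rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip_idx = rng.choice(len(poisoned), size=len(poisoned) // 10, replace=False)
poisoned[flip_idx] = 1 - poisoned[flip_idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

# Even a modest fraction of flipped labels typically degrades test accuracy.
print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```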
Why it matters
Evasion attacks against AI can have various negative results, including:
Compromised security: Evasion attacks can compromise the security of AI systems, making it easier for attackers to carry out malicious activities such as data theft, fraud, and cyberattacks.
Inaccurate decisions: Evasion attacks can cause AI systems to make inaccurate decisions, which can have serious consequences in applications such as healthcare, finance, and autonomous vehicles.
Bias and discrimination: Evasion attacks can be used to introduce bias and discrimination into AI systems, which can have negative impacts on individuals or groups that are unfairly targeted or excluded.
Reduced trust in AI: Evasion attacks can reduce public trust in AI technology by highlighting its vulnerabilities and limitations.
Higher costs and reduced efficiency: Evasion attacks can increase the costs of developing and deploying AI systems by requiring additional resources to detect and defend against attacks. They can also reduce the efficiency of AI systems by introducing errors and false positives that require additional human intervention to correct.
Evasion attacks against AI pose a serious threat to the security, accuracy, and fairness of AI systems, and it is important to develop effective defenses and detection mechanisms to mitigate these risks.
Why it might happen
An attacker might use an evasion attack against AI for various purposes, such as:
Malicious intent: An attacker might use an evasion attack to carry out a malicious activity such as stealing sensitive data, bypassing security systems, or disrupting critical infrastructure.
Financial gain: An attacker might use an evasion attack to gain financial advantage by manipulating AI models used in trading or investments.
Privacy violation: An attacker might use an evasion attack to violate an individual's privacy by manipulating AI models used for personal identification or profiling.
Competitive advantage: An attacker might use an evasion attack to gain a competitive advantage by manipulating AI models used in business operations such as pricing, product recommendations, or demand forecasting.
Research purposes: An attacker might use an evasion attack to conduct research on the vulnerabilities and limitations of AI systems.
The motivations behind an evasion attack against AI can vary depending on the attacker's goals and objectives. However, regardless of the attacker's intent, evasion attacks can have serious negative consequences for the security, accuracy, and fairness of AI systems.
Real-world Example
A real-world example of an evasion attack against AI is the case of stickers being used to trick an AI-powered self-driving car. In 2019, researchers at Tencent Keen Security Lab demonstrated an evasion attack against Tesla's Autopilot system. They placed small, innocuous-looking stickers on the road surface in a specific pattern. The stickers confused the lane-recognition model, causing Autopilot to perceive a lane that did not exist and to steer the car along an incorrect path.
This example highlights how small, targeted changes in the input data (in this case, the stickers on the road) can manipulate the behavior of an AI system, potentially leading to dangerous consequences. Such attacks exploit the vulnerabilities in AI models and can be used to deceive the system into making incorrect decisions or taking unintended actions.
How to Mitigate
There are several ways an organization can mitigate an evasion attack against AI. Here are some examples:
Robust defenses: Organizations can deploy robust defenses such as intrusion detection and prevention systems, firewalls, and antivirus software to detect and prevent attacks.
Regular vulnerability assessments: Organizations can perform regular vulnerability assessments to identify vulnerabilities and weaknesses in their AI systems.
Data integrity checks: Organizations can implement data integrity checks to ensure that the data used to train and test AI models is accurate and free from manipulation.
Adversarial training: Organizations can use adversarial training techniques to teach AI models to recognize and resist adversarial inputs (see the sketch after this list).
Defense in depth: Organizations can use a defense-in-depth approach, which involves layering multiple defenses to provide redundancy and increase the difficulty of evading detection.
Human oversight: Organizations can incorporate human oversight into their AI systems to provide an additional layer of defense against adversarial attacks.
Regular updates and patches: Organizations should keep their AI systems up to date with the latest security patches and updates to mitigate known vulnerabilities.
Mitigating an evasion attack against AI requires a proactive approach that involves a combination of technical solutions, process improvements, and human oversight. By implementing these measures, organizations can reduce the risk of successful evasion attacks and increase the security, accuracy, and fairness of their AI systems.
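As a concrete illustration of the adversarial training item above, here is a minimal PyTorch sketch that augments each training batch with FGSM-perturbed copies so the model is optimized on both clean and adversarial inputs. The model architecture, random training data, perturbation budget, and training length are placeholders; a real deployment would use the organization's own model and dataset, and often a stronger attack such as PGD during training.

```python
# A minimal adversarial-training sketch in PyTorch: each batch is augmented
# with FGSM-perturbed copies so the model learns to classify both.
# Model, data, epsilon, and step count are illustrative placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
epsilon = 0.1

def fgsm(x, y):
    """Craft FGSM adversarial copies of a clean batch against the current model."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

# Placeholder training data: random tensors standing in for a real dataset.
for step in range(100):
    x = torch.rand(32, 1, 28, 28)
    y = torch.randint(0, 10, (32,))

    x_adv = fgsm(x, y)            # attack the model as it currently stands
    optimizer.zero_grad()
    # Optimize on clean and adversarial inputs together.
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
```

The design choice worth noting is that the adversarial copies are regenerated every step against the current model, so the defense keeps pace with the model as it learns rather than training on a fixed set of stale adversarial inputs.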
How to monitor
Organizations can monitor against evasion attacks against AI in several ways. Here are some examples:
Anomaly detection: Organizations can use anomaly detection techniques to identify deviations from normal behavior in AI systems. This can help surface abnormal inputs or outputs that may indicate an evasion attack (a minimal sketch follows this list).
Model monitoring: Organizations can monitor the performance of AI models to detect changes in their behavior that may indicate an evasion attack.
Data lineage tracking: Organizations can track the lineage of data used in AI models to detect any changes or manipulations to the data that could result in an evasion attack.
Adversarial testing: Organizations can conduct adversarial testing to identify vulnerabilities and weaknesses in their AI systems.
Network monitoring: Organizations can monitor network traffic to detect any suspicious activity that may indicate an evasion attack.
Human review: Organizations can incorporate human review into their AI systems to provide an additional layer of defense against evasion attacks.
Continuous evaluation: Organizations can continuously evaluate the performance of their AI systems to ensure that they are functioning as intended and to detect any anomalies or deviations from normal behavior.
Monitoring against evasion attacks requires a proactive approach that involves a combination of technical solutions and human oversight.
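As one way to approach the anomaly detection item above, the following sketch baselines the confidence of predictions on known-good traffic and flags live requests whose confidence falls far outside that distribution. The baseline data, the 1st-percentile threshold, and the print-based alert are all illustrative assumptions; in practice the alert would feed a SIEM or incident queue.

```python
# A minimal monitoring sketch: flag incoming predictions whose confidence
# falls far outside the distribution observed on known-good traffic.
# Baseline scores, threshold, and the alerting hook are illustrative assumptions.
import numpy as np

# Confidence scores (max softmax probability) collected from trusted,
# known-good traffic during a baselining period (placeholder data here).
baseline_confidences = np.random.beta(8, 2, size=10_000)
threshold = np.percentile(baseline_confidences, 1)  # bottom 1% cut-off

def check_prediction(confidence: float, request_id: str) -> None:
    """Raise an alert when a live prediction looks anomalous."""
    if confidence < threshold:
        # In practice this would feed a SIEM or incident queue rather than print.
        print(f"ALERT: request {request_id} confidence {confidence:.2f} "
              f"is below baseline cut-off {threshold:.2f}")

check_prediction(0.31, "req-001")   # likely flagged
check_prediction(0.97, "req-002")   # likely passes
```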
What to capture
During monitoring to detect evasion attacks against AI, several things should be captured. Here are some examples:
Input data: The input data submitted to the AI models at inference time, along with the data used to train and test them, should be captured and monitored to detect any changes or manipulations that may indicate an evasion attack (see the logging sketch after this list).
Output data: The output data produced by the AI models should be captured and monitored to detect any anomalies or deviations from normal behavior that may indicate an evasion attack.
Model behavior: The behavior of the AI models should be captured and monitored to detect any changes or deviations from normal behavior that may indicate an evasion attack.
Network traffic: Network traffic should be captured and monitored to detect any suspicious activity that may indicate an evasion attack.
System logs: System logs should be captured and monitored to detect any unusual or abnormal activity that may indicate an evasion attack.
Adversarial testing results: The results of adversarial testing should be captured and monitored to identify vulnerabilities and weaknesses in the AI systems that may be exploited by attackers.
Capturing and monitoring these types of data can help organizations detect and respond to evasion attacks against AI in a timely manner, reducing the risk of successful attacks and increasing the security, accuracy, and fairness of their AI systems.
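As a minimal illustration of capturing several of the items above, the following sketch writes one structured, auditable log record per model call, covering the input (hashed in case it contains sensitive data), the output, and basic model metadata. The field names, hashing scheme, and example values are assumptions made for illustration, not a standard schema.

```python
# A minimal sketch of capturing input, output, and model metadata as one
# structured log record per inference request.
# Field names and the hashing scheme are illustrative, not a standard.
import hashlib
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("ai-inference-audit")

def log_inference(model_name: str, model_version: str,
                  raw_input: bytes, prediction: str, confidence: float) -> None:
    """Write one auditable record for a single model call."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "model_version": model_version,
        # Hash rather than store the raw input when it may be sensitive.
        "input_sha256": hashlib.sha256(raw_input).hexdigest(),
        "prediction": prediction,
        "confidence": confidence,
    }
    logger.info(json.dumps(record))

log_inference("fraud-detector", "1.4.2", b"<serialized transaction>", "legitimate", 0.54)
```

Records like these give analysts something to query when model behavior drifts: sudden shifts in the confidence field or bursts of near-duplicate input hashes are exactly the signals the monitoring section above is looking for.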
[Want to discuss this further? Hit me up on Twitter or LinkedIn]
[Subscribe to the RSS feed for this blog]
[Subscribe to the Weekly Microsoft Sentinel Newsletter]
[Subscribe to the Weekly Microsoft Defender Newsletter]
[Subscribe to the Weekly Azure OpenAI Newsletter]
[Learn KQL with the Must Learn KQL series and book]
[Learn AI Security with the Must Learn AI Security series and book]