This post is part of an ongoing series intended to educate readers about new and known security vulnerabilities against AI.
The full series index (including code, queries, and detections) is located here:
https://aka.ms/MustLearnAISecurity
The book version (pdf) of this series is located here: https://github.com/rod-trent/OpenAISecurity/tree/main/Must_Learn/Book_Version
The book will be updated when each new part in this series is released.
What is a Deepfake attack against AI?
An AI-generated Deepfake attack against AI refers to a scenario where a malicious actor targets an artificial intelligence system with a deepfake: realistic-looking but fabricated video or audio created using machine learning algorithms. The attacker crafts the deepfake to deceive the AI system into making incorrect decisions or taking actions that harm the organization or individuals that rely on it. Such an attack can be used to manipulate the AI system's training data, corrupt its decision-making process, or compromise its security, among other things. It is a growing concern in the field of AI security and requires ongoing research and development of countermeasures.
Types of Deepfake attacks
There are several different types of AI-generated deepfake attacks against AI systems, including:
Adversarial attacks: These are attacks where an adversary creates a deepfake, or subtly modifies a legitimate data sample, so that the AI system misclassifies it. A minimal code sketch of this idea appears after this list.
Poisoning attacks: This type of attack involves manipulating the training data used by the AI system to introduce biases and cause the system to make incorrect decisions.
Data injection attacks: In this type of attack, an attacker inserts malicious data into the AI system, which can then cause it to malfunction or behave in an unintended way.
Model stealing attacks: In this type of attack, an attacker attempts to steal the AI model used by the system, for example by probing it with crafted or deepfake inputs and using its responses to reconstruct (extract) the underlying model.
Evasion attacks: These are attacks where an attacker creates a deepfake that is designed to evade detection by the AI system's security measures, allowing it to bypass the system's defenses.
Manipulation attacks: In this type of attack, an attacker creates a deepfake that is designed to manipulate the AI system's decision-making process, allowing them to control the output of the system in a way that benefits them.
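To make the adversarial and evasion categories concrete, below is a minimal sketch of the classic fast gradient sign method (FGSM), assuming a PyTorch image classifier. The model, image, and label objects are hypothetical placeholders; real attacks (and defenses) are far more sophisticated, but the core idea of nudging an input just enough to flip the model's decision is the same.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.03):
    """Return a copy of `image` perturbed so the model is more likely to misclassify it."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)      # how well the model currently classifies the input
    loss.backward()                                  # gradient of the loss with respect to the input pixels
    perturbed = image + epsilon * image.grad.sign()  # small step in the direction that increases the loss
    return perturbed.clamp(0.0, 1.0).detach()        # keep pixel values in a valid range

# Hypothetical usage, where `classifier` is any torch.nn.Module image classifier:
# adversarial_image = fgsm_perturb(classifier, clean_image, true_label)
```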
How it works
An AI-generated Deepfake attack against AI works by using machine learning algorithms to create a fake input that can deceive an AI system. The attacker first trains a deep learning model using a large dataset of real data samples. They then use this model to generate a synthetic data sample that is designed to look like a real one.
The generated deepfake is then fed into the AI system, which processes it as if it were a genuine input. If the deepfake is convincing enough, the AI system may make incorrect decisions or take actions that are harmful to the organization or individuals that rely on it.
For example, an attacker may use a deepfake to impersonate a legitimate user of the AI system and gain access to sensitive data or systems. Alternatively, they may use a deepfake to manipulate the AI system's decision-making process, causing it to make incorrect predictions or recommendations.
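As a hedged illustration of that impersonation scenario, the sketch below shows how a naive face-verification check based only on embedding similarity could be satisfied by a deepfake that closely reproduces an enrolled user's face. The embeddings and the THRESHOLD value are invented for illustration; production systems add liveness detection and other checks precisely because similarity alone is easy to spoof.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 128-dimensional face embeddings.
enrolled_user = np.random.rand(128)                              # stored template for a legitimate user
deepfake_probe = enrolled_user + np.random.normal(0, 0.02, 128)  # convincing synthetic imitation

THRESHOLD = 0.95  # example verification threshold
if cosine_similarity(enrolled_user, deepfake_probe) > THRESHOLD:
    print("Access granted")  # the spoof passes a similarity-only check
else:
    print("Access denied")
```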
To protect against AI-generated Deepfake attacks, organizations need to implement robust security measures, including data validation, anomaly detection, and AI model monitoring, among others. It is also important to continuously train and update the AI system's models to improve their accuracy and resilience to attack.
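One simple, widely applicable piece of the data-validation measure mentioned above is verifying that training or inference data has not been swapped out or tampered with. The sketch below checks files against a SHA-256 manifest; the manifest.json name and layout are assumptions made for illustration, and a real pipeline would pair this with provenance tracking and content-level checks.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_dataset(data_dir: str, manifest_file: str = "manifest.json") -> list:
    """Return the files whose hashes do not match the trusted manifest."""
    # The manifest is assumed to map relative file paths to expected SHA-256 digests.
    manifest = json.loads(Path(manifest_file).read_text())
    mismatches = []
    for rel_path, expected in manifest.items():
        if sha256_of(Path(data_dir) / rel_path) != expected:
            mismatches.append(rel_path)
    return mismatches
```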
Why it matters
The negative effects of an AI-generated Deepfake attack against AI can be significant and far-reaching. These attacks can undermine the reliability and trustworthiness of the AI system, leading to incorrect decisions and actions that can cause harm to individuals or organizations that rely on the system.
For example, in the case of a financial institution, an AI-generated Deepfake attack against AI could result in fraudulent transactions being approved or legitimate transactions being denied, leading to financial losses for the institution and its customers.
In the case of a healthcare provider, an AI-generated Deepfake attack against AI could result in incorrect diagnoses or treatments being recommended, putting patients' health at risk.
These attacks can also damage the reputation of the organization or individuals associated with the AI system, leading to a loss of trust and credibility.
Moreover, the impact of an AI-generated Deepfake attack against AI can extend beyond the immediate effects of the attack. For example, the organization may need to spend significant time and resources to investigate and remediate the attack and may also need to implement new security measures to prevent future attacks.
Why it might happen
The attacker can gain several things from an AI-generated Deepfake attack against AI, depending on their motivations and goals. Some possible gains of such an attack include:
Financial gain: An attacker can use an AI-generated Deepfake attack against AI to defraud organizations or individuals and steal money or valuable assets.
Political gain: An attacker can use an AI-generated Deepfake attack against AI to manipulate public opinion, influence political outcomes, or cause social unrest.
Strategic gain: An attacker can use an AI-generated Deepfake attack against AI to gain a competitive advantage in business or warfare by compromising the AI systems of their competitors.
Personal gain: An attacker can use an AI-generated Deepfake attack against AI to gain access to sensitive data or systems that can be used for personal gain, such as blackmail, identity theft, or espionage.
In some cases, the attacker may not have a specific goal in mind but may simply want to cause chaos or damage to the targeted organization or individuals.
Real-world Example
One real-world example of an AI-generated Deepfake attack against AI occurred in 2019 when researchers from the University of Washington demonstrated how they could use a deepfake to trick a facial recognition system into misidentifying a person.
The researchers created a deepfake video of a person speaking and then overlaid it onto a real video of a different person. They then showed this video to a facial recognition system and found that the system identified the person in the deepfake as the person in the real video.
This attack has significant implications for security and privacy as facial recognition systems are widely used for security and law enforcement purposes. An attacker could use a similar technique to bypass facial recognition systems, allowing them to gain unauthorized access to secure areas or avoid detection by law enforcement.
This example highlights the need for organizations to be aware of the potential for AI-generated Deepfake attacks against AI and to implement robust security measures to protect against them. It also underscores the importance of ongoing research and development of countermeasures to detect and prevent such attacks.
How to Mitigate
Mitigating an AI-generated Deepfake attack against AI requires a multi-layered approach that includes technical, organizational, and procedural measures. Some effective ways to mitigate such an attack include:
Data validation: Organizations can implement data validation measures to verify the authenticity and integrity of the data used by their AI systems, thereby reducing the risk of malicious data injection or poisoning attacks.
Anomaly detection: Organizations can implement anomaly detection systems to detect unusual or unexpected behavior in their AI systems, which can indicate the presence of an AI-generated Deepfake attack.
AI model monitoring: Organizations can monitor the performance of their AI models in real time to detect any signs of manipulation or tampering.
Adversarial training: Organizations can train their AI systems to recognize and defend against adversarial attacks, including AI-generated Deepfakes; a minimal training-loop sketch appears after this list.
Robust authentication and access control: Organizations can implement strong authentication and access control measures to prevent unauthorized users from accessing their AI systems.
Ongoing training and education: Organizations can provide ongoing training and education to their employees on the risks of AI-generated Deepfake attacks and how to detect and prevent them.
Collaboration and sharing of information: Organizations can collaborate with other organizations and share information on AI-generated Deepfake attacks to improve their defenses and response capabilities.
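The adversarial-training item above can be sketched roughly as follows, again assuming a PyTorch classifier and reusing the FGSM idea shown earlier in this post. This is a minimal illustration rather than a hardened training recipe; model, optimizer, and the epsilon value are placeholders.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One training step that learns from both clean and FGSM-perturbed inputs."""
    # Craft perturbed inputs against the current state of the model.
    x_req = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_req), y).backward()
    x_adv = (x_req + epsilon * x_req.grad.sign()).clamp(0.0, 1.0).detach()

    # Train on both batches so the model becomes harder to fool.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```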
Taken together, these technical, organizational, and procedural measures reduce the risk of AI-generated Deepfake attacks and help protect AI systems and the sensitive data they handle.
How to monitor/What to capture
To detect an AI-generated Deepfake attack against AI, several key indicators should be monitored, including:
Data input: Organizations should monitor the data inputs that their AI systems receive to detect any anomalies or deviations from expected patterns. For example, if the AI system is trained to recognize faces, it should be monitored for unusual or synthetic facial images.
Model performance: Organizations should monitor the performance of their AI models to detect any anomalies or deviations from expected patterns. For example, if the AI system is trained to classify images, it should be monitored for incorrect classifications or unusual confidence scores; a simple confidence-monitoring sketch appears after this list.
Network traffic: Organizations should monitor network traffic to detect any unusual patterns or traffic flows that may indicate an AI-generated Deepfake attack.
System logs: Organizations should monitor system logs to detect any unusual or suspicious activities, such as unusual logins or changes to system configurations.
Behavioral patterns: Organizations should monitor the behavioral patterns of their AI systems to detect any unusual or unexpected behavior that may indicate an AI-generated Deepfake attack. For example, if the AI system is trained to respond to user queries, it should be monitored for unusual or synthetic query patterns.
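For the model-performance indicator above, one lightweight starting point is to watch prediction confidence for sudden deviations from a recent baseline. The z-score approach below is a simplification and assumes you already log a stream of confidence scores; production monitoring would typically use richer statistics and feed alerts into a SIEM such as Microsoft Sentinel.

```python
import numpy as np

def flag_anomalous_confidences(confidences, baseline_size=500, z_threshold=3.0):
    """Return indices of confidence scores that deviate sharply from the recent baseline."""
    scores = np.asarray(confidences, dtype=float)
    baseline = scores[:baseline_size]                    # treat the first N scores as "normal"
    mu, sigma = baseline.mean(), baseline.std() + 1e-9   # small epsilon avoids division by zero
    z_scores = np.abs((scores - mu) / sigma)
    return np.where(z_scores > z_threshold)[0].tolist()  # candidates worth investigating
```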
By monitoring these indicators, organizations can detect the presence of an AI-generated Deepfake attack and take appropriate action to prevent further damage. Monitoring should be performed in real time to allow for prompt detection and response.
[Want to discuss this further? Hit me up on Twitter or LinkedIn]
[Subscribe to the RSS feed for this blog]
[Subscribe to the Weekly Microsoft Sentinel Newsletter]
[Subscribe to the Weekly Microsoft Defender Newsletter]
[Subscribe to the Weekly Azure OpenAI Newsletter]
[Learn KQL with the Must Learn KQL series and book]
[Learn AI Security with the Must Learn AI Security series and book]
This vector made me think of the whole Mission Impossible: Dead Reckoning Part 1 airport scene.