This post is part of an ongoing series to educate readers about new and known security vulnerabilities that target AI.
The full series index (including code, queries, and detections) is located here:
https://aka.ms/MustLearnAISecurity
The book version (pdf) of this series is located here: https://github.com/rod-trent/OpenAISecurity/tree/main/Must_Learn/Book_Version
The book will be updated when each new part in this series is released.
What is a Copy-move attack against AI?
A copy-move attack against AI is an image manipulation attack in which part of an image is copied and pasted into another area of the same image. This technique is typically used to deceive AI systems, particularly computer vision models, by creating misleading or fake images that can lead to incorrect predictions, misclassifications, or other undesired outcomes.
For example, in the context of object detection or recognition systems, an attacker might copy an object from one part of the image and paste it into another part to create an illusion of multiple instances of the same object or to hide the true location of the object. Similarly, in the case of image classification or scene understanding models, a copy-move attack could be used to introduce elements into the image that change the context or mislead the AI system into making a wrong classification.
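To make the mechanics concrete, here is a minimal sketch of how a duplication and concealment manipulation could be produced with NumPy and Pillow. The file names, patch sizes, and coordinates are illustrative assumptions only, not details of any specific attack.

```python
import numpy as np
from PIL import Image

# Load the target image as a NumPy array (file name is a placeholder).
img = np.array(Image.open("street_scene.jpg").convert("RGB"))

# Object duplication: copy a 64x64 patch containing an object
# (coordinates are illustrative) and paste it elsewhere in the same image.
patch = img[100:164, 200:264].copy()
img[300:364, 400:464] = patch

# Object concealment: copy a patch of plain background and paste it over
# the object's original location, effectively hiding it from the scene.
background = img[20:84, 20:84].copy()
img[100:164, 200:264] = background

Image.fromarray(img).save("manipulated_scene.jpg")
```

Real attacks usually go further, blending edges and matching lighting so the pasted regions are harder to spot, but the core operation is exactly this kind of in-image copy and paste.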
Types of Copy-move attacks
There are several variations of copy-move attacks, each with different objectives and effects on the target AI system:
Object duplication: The attacker copies an object from one part of the image and pastes it into another part, creating an illusion of multiple instances of the same object. This can be used to confuse object detection or recognition systems.
Object removal or concealment: The attacker copies a background area of the image and pastes it over an object, effectively hiding or removing the object from the scene. This can be used to evade detection or recognition by AI systems.
Object relocation: The attacker moves an object within the image by copying it to a new location and covering the original location with a background patch. This can be used to mislead AI systems about the object's true location or context.
Scene alteration: The attacker changes the context or environment of the scene by copying and pasting elements from different parts of the image. This can be used to confuse scene understanding or image classification models.
Object or feature manipulation: The attacker copies and pastes parts of an object or specific features within the image to change its appearance or create a new object. This can be used to deceive AI systems that rely on specific features or object characteristics for classification or detection.
Watermark or logo manipulation: The attacker copies a watermark, logo, or other identifying mark within the image and pastes it in multiple locations or over different objects. This can be used to create false associations or mislead AI systems that rely on these marks for identification or authentication.
How it works
A copy-move attack against AI works by manipulating an input image to deceive or confuse an AI system, particularly a computer vision model. The attacker typically performs the following steps (a short code sketch follows the list):
Choose a target image: The attacker selects an image that they want to manipulate, with the objective of causing incorrect predictions, misclassifications, or other undesired outcomes in the target AI system.
Identify areas for manipulation: The attacker selects one or more parts of the image to copy and paste within the same image. The chosen areas should be relevant to the attacker's objective, such as duplicating, hiding, or relocating objects or features to deceive the AI system.
Copy and paste: The attacker copies the selected parts of the image and pastes them into different locations within the same image. This process can involve various techniques, such as blending, scaling, or rotating the copied areas to make the manipulation less noticeable to humans and more challenging for the AI system to detect.
Process the manipulated image: The attacker feeds the manipulated image into the target AI system, which processes the image as it would with a non-manipulated input.
Exploit the AI system's response: If the attack is successful, the AI system produces incorrect predictions, misclassifications, or other undesired outcomes based on the manipulated image. The attacker can then exploit these outcomes to achieve their objectives, such as evading detection, undermining user trust, or gaining unauthorized access to sensitive data or systems.
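The steps above can be sketched in a few lines of Python. The example below is a hypothetical illustration rather than a reproduction of a real attack: it assumes Pillow and a recent torchvision are available, uses a stock pretrained ResNet-50 as a stand-in target model, and the file name and coordinates are invented.

```python
import torch
from PIL import Image
from torchvision import models

# Steps 1-3: copy a patch from one region and blend it into another region
# of the same image (file name and coordinates are made up for illustration).
original = Image.open("target.jpg").convert("RGB")
patch = original.crop((200, 100, 264, 164))
manipulated = original.copy()
region = manipulated.crop((400, 300, 464, 364))
manipulated.paste(Image.blend(region, patch, alpha=0.85), (400, 300))

# Steps 4-5: feed both images to a stand-in classifier and compare outputs.
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()

with torch.no_grad():
    for name, image in [("original", original), ("manipulated", manipulated)]:
        pred = model(preprocess(image).unsqueeze(0)).argmax(dim=1).item()
        print(name, weights.meta["categories"][pred])
```

If the pasted region changes what the model focuses on, the two printed labels can differ even though a human would describe both images the same way; that gap is what the attacker exploits in step 5.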
Why it matters
Copy-move attacks against AI can have several negative effects on the targeted AI systems and the organizations that rely on them. Some of these negative effects include:
Incorrect predictions or misclassifications: A successful copy-move attack can cause the AI system to produce incorrect predictions or misclassify objects, leading to unreliable or erroneous results.
Evasion of detection or recognition systems: Copy-move attacks can be used to evade detection or recognition systems, such as facial recognition or object detection systems, by manipulating images to conceal, duplicate, or relocate objects.
Undermining user trust: If an AI system is found to be vulnerable to copy-move attacks, users may lose trust in the technology, leading to reduced adoption or negative public perception.
Manipulation of AI-driven decisions: Copy-move attacks can be used to manipulate AI-driven decisions in various applications, such as surveillance, access control, or content moderation, potentially causing harm or enabling malicious activities.
Resource wastage: Organizations may need to invest additional resources in defending against copy-move attacks, such as developing more robust AI models, implementing input preprocessing and image forensics techniques, and monitoring system performance.
Unauthorized access: In some cases, copy-move attacks can be used to gain unauthorized access to sensitive data or systems, such as manipulating facial recognition systems to bypass security measures.
Why it might happen
An attacker can have various objectives when launching a copy-move attack against AI, and the gains depend on the specific goals they aim to achieve. Some potential gains for an attacker include:
Evasion of detection or recognition: By manipulating images to hide, duplicate, or relocate objects, the attacker can evade AI-based detection or recognition systems, such as facial recognition, object detection, or surveillance systems.
Manipulation of AI-driven decisions: By deceiving AI systems with manipulated images, the attacker can influence AI-driven decisions in various applications, such as access control, content moderation, or fraud detection, to achieve their objectives or cause harm.
Exploiting vulnerabilities: A successful copy-move attack can reveal vulnerabilities or weaknesses in AI systems, which the attacker can potentially exploit further or use to undermine user trust in the technology.
Unauthorized access: In some cases, the attacker can gain unauthorized access to sensitive data or systems by deceiving AI-based authentication systems, such as facial recognition systems used for access control.
Disruption or sabotage: The attacker can use copy-move attacks to disrupt the operation of AI systems or cause them to produce erroneous results, leading to resource wastage, reputational damage, or other negative outcomes for the targeted organization.
Challenging system integrity: By successfully performing a copy-move attack against an AI system, the attacker can challenge the integrity of the system and demonstrate its vulnerabilities, potentially leading to a loss of trust in the technology.
Real-world Example
While there are no widely reported real-world examples of a specific copy-move attack against AI, a related class of image manipulations, known as adversarial attacks, has been demonstrated in real-world scenarios. One famous example is the "sticker attack" on a stop sign, which is relevant to self-driving vehicles.
In this example, researchers from the University of Washington, University of Michigan, Stony Brook University, and UC Berkeley demonstrated that by placing stickers on a stop sign in a specific pattern, they could deceive an AI-powered object recognition system used in self-driving cars into misclassifying the stop sign as a speed limit sign.
Although this example is not a direct copy-move attack, it shows how manipulating visual inputs can deceive AI systems and potentially cause real-world harm. In this case, a self-driving vehicle might fail to stop at the altered stop sign, potentially leading to accidents or unsafe situations.
The underlying principle is similar to what a copy-move attack aims to achieve: manipulating an image to deceive an AI system and produce incorrect predictions or classifications. In a real-world scenario, a copy-move attack could be used to deceive AI-based security systems, surveillance systems, or access control systems by manipulating images to hide, duplicate, or relocate objects or features.
To counteract such attacks and protect AI systems, organizations should invest in developing robust and resilient models, implement comprehensive defense strategies, and continuously monitor system performance for signs of manipulation or unusual behavior.
How to Mitigate
Mitigating copy-move attacks against AI requires a combination of strategies and techniques to strengthen the AI models and detect potential manipulations. Some ways to mitigate these attacks include:
Adversarial training: Train the AI model using adversarially generated examples, including those with copy-move manipulations, to improve the model's robustness against such attacks.
Data augmentation: Enhance the training dataset by introducing variations and transformations, including copy-move manipulations, to increase the model's ability to recognize and handle manipulated inputs.
Input preprocessing: Apply preprocessing techniques, such as image compression, noise addition, or filtering, to the input images before feeding them into the AI system. This can help reduce the effectiveness of copy-move manipulations by altering or removing the pasted regions.
Image forensics: Use image forensics techniques to analyze input images and detect potential copy-move manipulations. Techniques such as block matching, keypoint-based methods, or deep learning-based approaches can help identify duplicated or moved regions within an image (a minimal keypoint-based sketch follows this list).
Ensemble learning: Combine multiple AI models or algorithms to process input images, making it more challenging for an attacker to deceive all the models simultaneously. Ensembles can improve the overall system's robustness and reduce the impact of copy-move attacks.
Regular model updates: Continuously update AI models with new data, incorporating knowledge of the latest copy-move attack techniques to ensure that the models stay up-to-date and resilient against evolving threats.
Monitoring and anomaly detection: Monitor the AI system's performance and inputs for signs of unusual behavior or manipulations. Implement anomaly detection techniques to identify potential attacks and take appropriate action when a suspected manipulation is detected.
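As a concrete illustration of the image forensics item above, here is a minimal keypoint-based detection sketch using OpenCV. It matches an image's ORB descriptors against themselves and flags near-identical keypoints that sit far apart, a common signal of duplicated regions. The file name and thresholds are illustrative assumptions and would need tuning for real use.

```python
import cv2
import numpy as np

# Extract ORB keypoints and descriptors from the image under inspection
# (file name is a placeholder).
img = cv2.imread("suspect_image.jpg", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(nfeatures=5000)
keypoints, descriptors = orb.detectAndCompute(img, None)

# Match the descriptors against themselves; the best hit for each keypoint
# is itself, so the second-nearest neighbour is the interesting one.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(descriptors, descriptors, k=2)

suspicious = []
for pair in matches:
    if len(pair) < 2:
        continue
    best, second = pair
    candidate = second if best.distance == 0 else best
    src = np.array(keypoints[candidate.queryIdx].pt)
    dst = np.array(keypoints[candidate.trainIdx].pt)
    # Flag near-identical descriptors located in different parts of the image
    # (both thresholds are illustrative).
    if candidate.distance < 40 and np.linalg.norm(src - dst) > 50:
        suspicious.append((tuple(src), tuple(dst)))

print(f"{len(suspicious)} keypoint pairs point at possibly duplicated regions")
```

Block-matching and deep learning-based detectors follow the same idea at different granularity: find regions of the image that are suspiciously similar to each other.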
By employing these mitigation strategies, organizations can improve the robustness and resilience of their AI systems against copy-move attacks, reducing the likelihood of successful attacks and minimizing the potential negative impacts.
How to monitor/What to capture
Detecting copy-move attacks against AI involves monitoring various aspects of the AI system's inputs, outputs, and performance. Here are some key areas to monitor:
Input images: Analyze input images for signs of manipulation, such as duplicated or moved regions, blending artifacts, or inconsistencies in lighting or shadows. Employ image forensics techniques, like block-matching, keypoint-based methods, or deep learning-based approaches, to detect potential copy-move manipulations.
Model outputs: Monitor the AI model's predictions or classifications for unusual patterns, inconsistencies, or unexpected results that may indicate a successful copy-move attack. This could include sudden increases in misclassifications, false positives, or false negatives.
Performance metrics: Keep track of performance metrics, such as accuracy, precision, recall, and F1 score, to identify sudden drops or unusual fluctuations that may suggest the AI system is being affected by copy-move attacks.
Anomalies in input data distribution: Monitor the distribution of input data features for any anomalies or deviations from the expected patterns. Significant changes in the distribution could indicate that the input data is being manipulated (a simple drift check is sketched after this list).
System logs: Review system logs for any suspicious activity, such as unauthorized access attempts, unusual access patterns, or other indicators that an attacker may be targeting the AI system.
User feedback: Encourage users to report any issues, inconsistencies, or suspected manipulations they encounter while interacting with the AI system. User feedback can provide valuable insights into potential copy-move attacks that may have gone unnoticed by automated monitoring systems.
Threat intelligence: Stay informed about the latest copy-move attack techniques and tactics, and continuously incorporate this knowledge into the monitoring and detection process.
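To illustrate the distribution-monitoring item above, the sketch below compares a simple per-image statistic (mean brightness) for recent inputs against a trusted baseline using a two-sample Kolmogorov-Smirnov test. The feature choice, significance threshold, and the randomly generated stand-in data are all assumptions for illustration; in practice you would track features relevant to your own model and inputs.

```python
import numpy as np
from scipy.stats import ks_2samp

def input_drift_detected(baseline_values, recent_values, alpha=0.01):
    """Return (drifted, p_value): drifted is True when the recent feature
    distribution differs significantly from the trusted baseline."""
    statistic, p_value = ks_2samp(baseline_values, recent_values)
    return p_value < alpha, p_value

# Stand-in data: per-image mean brightness collected over time.
baseline = np.random.normal(loc=120, scale=15, size=1000)  # historical inputs
recent = np.random.normal(loc=135, scale=15, size=200)     # latest batch

drifted, p = input_drift_detected(baseline, recent)
print(f"drift detected: {drifted} (p-value={p:.4f})")
```

The same pattern applies to model-side signals such as class frequencies or confidence scores: establish a baseline, compare the live stream against it, and alert on significant shifts.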
By monitoring these areas and implementing a comprehensive detection strategy, organizations can increase their chances of identifying copy-move attacks against AI and take appropriate action to mitigate the potential negative impacts.