This post is part of an ongoing series to educate about new and known security vulnerabilities against AI.
The full series index (including code, queries, and detections) is located here:
https://aka.ms/MustLearnAISecurity
The book version (pdf) of this series is located here: https://github.com/rod-trent/OpenAISecurity/tree/main/Must_Learn/Book_Version
The book will be updated when each new part in this series is released.
What is a Hyperparameter attack against AI?
A hyperparameter attack against AI is a type of adversarial attack that manipulates the training process of a machine learning model by tampering with its hyperparameters. Hyperparameters are the adjustable settings chosen before training begins (rather than learned from the data) that shape how the algorithm trains and, ultimately, how it performs. Examples of hyperparameters include the learning rate, the number of layers in a neural network, and the batch size.
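To make the term concrete, here is a minimal sketch of the kinds of settings the definition refers to. The names and values are purely illustrative assumptions, not taken from any particular framework or system.

```python
# Illustrative hyperparameters for a small neural network. These values are
# chosen before training starts; they are not learned from the data.
hyperparameters = {
    "learning_rate": 0.001,  # how large a step the optimizer takes on each update
    "num_layers": 3,         # depth of the network
    "batch_size": 32,        # training samples processed per step
    "epochs": 20,            # full passes over the training data
}

# In practice these settings are handed to whatever code builds and trains the
# model, e.g. model = build_model(**hyperparameters)  # hypothetical helper
print(hyperparameters)
```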
How it works
A hyperparameter attack works by exploiting weaknesses in the training process of a machine learning model: the attacker alters the hyperparameters that govern how the model learns. Here's a step-by-step overview of how such an attack might be executed:
Gain access: To perform a hyperparameter attack, the attacker first needs access to the training process or the data used for training. This could involve breaching the security of the system where the model is being trained or compromising an insider with the necessary access.
Identify target hyperparameters: The attacker identifies the crucial hyperparameters that can have a significant impact on the AI model's performance or behavior. These could be the learning rate, the number of layers in a neural network, the batch size, or other settings that affect the model's training.
Tamper with hyperparameters: The attacker modifies the target hyperparameters to achieve their desired outcome. This could involve increasing or decreasing the learning rate, changing the network architecture, or manipulating other settings to degrade the model's performance, introduce biases, or make it more susceptible to adversarial attacks (a minimal illustration of this step follows the list).
Monitor the impact: The attacker may monitor the training process to ensure the manipulated hyperparameters are producing the intended effect. They could observe the model's accuracy, loss, or other metrics to assess the success of their attack.
Exploit the compromised model: Once the attack is successful, the attacker can exploit the compromised AI model for their own purposes, such as using it to make incorrect predictions, produce biased results, or further compromise the system.
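To make the tampering step concrete, here is a minimal, self-contained sketch assuming a toy least-squares model trained with plain gradient descent. It is not drawn from any real incident; it only shows how changing a single hyperparameter (the learning rate) can quietly ruin training.

```python
import numpy as np

# Toy illustration: gradient descent on a least-squares problem, comparing the
# intended learning rate with a tampered one. All numbers are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def final_loss(learning_rate, steps=100):
    w = np.zeros(3)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5 * mean squared error
        w -= learning_rate * grad
    return np.mean((X @ w - y) ** 2)       # mean squared error after training

print("intended learning rate 0.1 -> loss:", final_loss(0.1))  # settles near the noise floor
print("tampered learning rate 2.5 -> loss:", final_loss(2.5))  # updates overshoot; loss explodes
```

With the intended learning rate the loss settles near the noise level of the data; with the tampered value the updates overshoot on every step and training diverges, even though nothing about the data or the code changed.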
Types of Hyperparameter attacks
There are various types of hyperparameter attacks against AI, depending on the attacker's goals and the specific hyperparameters targeted. Here are some examples:
Performance degradation attacks: These attacks aim to reduce the performance of the AI model by altering hyperparameters like the learning rate, batch size, or the number of layers in a neural network. By doing so, the attacker can cause the model to underfit or overfit the data, leading to poor generalization and accuracy.
Bias introduction attacks: In these attacks, the attacker manipulates hyperparameters to introduce biases into the AI model. They might change the model's architecture or other settings to make it more sensitive to specific features, causing it to produce biased predictions or classifications.
Adversarial vulnerability attacks: These attacks focus on increasing the susceptibility of the AI model to adversarial examples. The attacker might change hyperparameters like the learning rate, regularization strength, or the model's architecture to make it more vulnerable to adversarial perturbations, enabling them to deceive the model with carefully crafted input data.
Transferability attacks: In these attacks, the attacker manipulates hyperparameters to make the AI model more prone to transfer attacks. By adjusting the model's architecture or other settings, they can cause it to learn features that generalize poorly across different datasets, making the model more likely to perform poorly when faced with new or unseen data.
Resource exhaustion attacks: These attacks aim to consume excessive computational resources during the training process, slowing down the system or causing it to crash. The attacker might increase the model's complexity, the number of training epochs, or the batch size to force the system to spend more time and resources on training the model.
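To put a rough number on the resource-exhaustion case, the sketch below estimates how many training steps (and hours) an inflated epoch count would cost. The sample count, batch size, epoch counts, and per-step time are all assumed values used only for illustration.

```python
import math

# Rough, illustrative estimate of how inflated training settings translate into
# wasted compute. None of these numbers come from a real system.
def total_steps(n_samples, batch_size, epochs):
    return math.ceil(n_samples / batch_size) * epochs

n_samples = 1_000_000
seconds_per_step = 0.05  # assumed average cost of one training step

intended = total_steps(n_samples, batch_size=256, epochs=10)
tampered = total_steps(n_samples, batch_size=256, epochs=500)  # epoch count inflated 50x

for label, steps in [("intended", intended), ("tampered", tampered)]:
    hours = steps * seconds_per_step / 3600
    print(f"{label}: {steps:,} steps, roughly {hours:.1f} hours of training time")
```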
Why it matters
A hyperparameter attack against AI can have several negative effects on the targeted machine learning model, the system it is deployed in, and the organization using it. Some of these negative effects include:
Performance degradation: By altering crucial hyperparameters, attackers can undermine the performance of the AI model, causing it to produce less accurate predictions or classifications, which in turn could lead to incorrect decisions or outcomes.
Bias introduction: Hyperparameter attacks can introduce biases into the AI model, causing it to make unfair or discriminatory predictions. This can harm the reputation of the organization, lead to legal issues, and negatively impact the users affected by the biased decisions.
Increased vulnerability to adversarial attacks: By manipulating hyperparameters, attackers can make the AI model more susceptible to adversarial examples, enabling them to deceive the model with carefully crafted input data, potentially causing harm or exploiting the system for their own benefit.
Reduced transferability: Hyperparameter attacks can negatively impact the model's ability to generalize across different datasets, making it less effective when faced with new or unseen data, which can limit its usefulness and applicability in real-world scenarios.
Resource exhaustion: Some hyperparameter attacks can consume excessive computational resources during the training process, causing the system to slow down or crash, impacting the organization's productivity and potentially leading to additional costs.
Loss of trust: If a hyperparameter attack is successful and compromises the AI model, it may lead to a loss of trust in the model's predictions and the organization using it, negatively affecting the adoption of AI solutions and potentially harming the organization's reputation.
Why it might happen
An attacker can gain several advantages from a successful hyperparameter attack against AI, depending on their goals and intentions. Some potential gains include:
Sabotage: By degrading the performance of the targeted AI model, an attacker can undermine the effectiveness of the system it is deployed in, causing harm to the organization using it. This can be particularly disruptive in critical applications like healthcare, finance, or security.
Exploitation: If the attacker can make the AI model more vulnerable to adversarial examples, they can potentially exploit the model for their own benefit, such as bypassing security measures, manipulating the system's decisions, or gaining unauthorized access to sensitive information.
Reputation damage: By introducing biases or causing the AI model to produce incorrect or unfair predictions, an attacker can harm the reputation of the organization using the model, leading to a loss of trust from customers, partners, or regulators.
Competitive advantage: In some cases, an attacker could be a competitor seeking to undermine the performance of the targeted organization's AI model, either to gain a competitive advantage or to discredit the organization's products or services.
Information theft: If the attacker is able to compromise the AI model and gain access to the underlying data used for training, they can potentially steal sensitive or proprietary information, which can be valuable for industrial espionage or other malicious purposes.
Demonstrating capabilities: In some cases, an attacker may conduct a hyperparameter attack as a proof of concept or to demonstrate their ability to compromise AI systems, either for personal notoriety or as a demonstration of power in the context of cyber warfare or nation-state cyber operations.
Real-world Example
While there are no widely known real-world examples of a hyperparameter attack against AI, the concept has been discussed and explored in academic research. In practice, such an attack would require the attacker to have access to the training process or training data, which is typically not easy to obtain.
However, the research paper "Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks" by Shafahi et al. (2018) describes a related class of attacks known as data poisoning. In these attacks, the adversary injects carefully crafted, malicious data points into the training set, which can cause the AI model to learn incorrect behaviors or make it vulnerable to adversarial attacks.
In this paper, the authors demonstrated a clean-label poisoning attack on a neural network used for image classification. By adding a small number of poisoned images with imperceptible perturbations, the attacker was able to manipulate the model's behavior and cause it to misclassify specific images. This example serves as a reminder of the potential risks associated with adversarial attacks on AI systems.
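The paper's clean-label technique is far more subtle than anything that fits in a few lines, but the toy sketch below (using scikit-learn and synthetic data) conveys the general data-poisoning idea: a handful of crafted points injected near a chosen target can flip a simple nearest-neighbor model's prediction for that target. It illustrates the concept only and is not the method from the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data-poisoning illustration (not the clean-label method from the paper):
# two Gaussian clusters, a target point from class 0, and a few injected
# "poison" points placed near the target but labeled as class 1.
rng = np.random.default_rng(42)
X_clean = np.vstack([rng.normal(0, 1, size=(100, 2)),    # class 0 cluster
                     rng.normal(4, 1, size=(100, 2))])   # class 1 cluster
y_clean = np.array([0] * 100 + [1] * 100)
target = np.array([[0.5, 0.5]])                          # a point from class 0

clean_model = KNeighborsClassifier(n_neighbors=5).fit(X_clean, y_clean)
print("before poisoning:", clean_model.predict(target))  # expected: [0]

poison_X = target + rng.normal(scale=0.05, size=(5, 2))  # crafted points near the target
poison_y = np.array([1] * 5)                             # labeled as class 1
X_poisoned = np.vstack([X_clean, poison_X])
y_poisoned = np.concatenate([y_clean, poison_y])

poisoned_model = KNeighborsClassifier(n_neighbors=5).fit(X_poisoned, y_poisoned)
print("after poisoning: ", poisoned_model.predict(target))  # expected: [1]
```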
How to Mitigate
Mitigating hyperparameter attacks against AI involves implementing a combination of security measures, best practices, and validation techniques throughout the machine learning pipeline. Here are some steps to help reduce the risk of hyperparameter attacks:
Secure access to training data and processes: Protecting the training data and access to the training process is essential. Implement strong access control mechanisms, data encryption, and secure storage to prevent unauthorized access and tampering.
Monitor and log training activities: Continuously monitor and log activities during the training process to detect any anomalies or unauthorized actions. Establish alerts for unusual behavior that might indicate a potential attack.
Hyperparameter optimization and validation: Use techniques like grid search, random search, or Bayesian optimization to find the optimal hyperparameters for your AI model, and validate its performance with cross-validation or a hold-out set to confirm that the chosen hyperparameters lead to a well-performing and secure model (see the sketch after this list).
Robustness testing: Test the AI model's robustness against adversarial examples and other potential attacks to identify vulnerabilities and make necessary adjustments to improve its resilience.
Regularly update and retrain models: Keep AI models up-to-date and retrain them periodically with new data to ensure that they remain effective and secure. This process can help identify and address potential issues that may arise over time.
Audit and review: Conduct regular audits and reviews of the AI model's performance, architecture, and hyperparameters to identify any discrepancies or vulnerabilities that may have been introduced during the training process.
Implement responsible AI practices: Adopt responsible AI practices such as transparency, fairness, and accountability. Ensure that the AI model's behavior aligns with ethical guidelines and legal regulations to prevent biases and other undesirable outcomes.
Employee training and awareness: Train employees involved in the AI development process on the importance of security, the risks of hyperparameter attacks, and the best practices for preventing them.
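As one possible shape for the optimization-and-validation step referenced above, here is a minimal sketch using scikit-learn's GridSearchCV on synthetic data. The model, parameter grid, and dataset are illustrative assumptions; the point is that the hyperparameters in use come from a documented, reproducible search rather than ad-hoc edits.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Minimal sketch: choose hyperparameters via grid search with cross-validation
# so the values in production trace back to a recorded, repeatable process.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],  # inverse regularization strength
    "max_iter": [200, 500],       # optimizer iteration budget
}

search = GridSearchCV(LogisticRegression(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best hyperparameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 3))
# Record best_params_ as the approved baseline to compare against later.
```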
By implementing these measures, organizations can minimize the risk of hyperparameter attacks and ensure the integrity, performance, and security of their AI models.
How to monitor/What to capture
To detect a hyperparameter attack against AI, it is crucial to monitor various aspects of the machine learning pipeline, focusing on the training process, data handling, and system behavior. Here are some key elements to monitor:
Training data access and integrity: Track access to the training data, looking for unauthorized access or unusual activity patterns. Ensure the integrity of the training data by checking for unexpected modifications or inconsistencies.
Hyperparameter changes: Monitor changes to the hyperparameters during the training process, and track any unexpected alterations or deviations from the predetermined values or the optimization process (a minimal detection sketch follows this list).
Model training activities and progress: Observe the training process, including the learning rate, loss function, and model's performance metrics (e.g., accuracy, precision, recall) throughout the training. Watch for sudden changes or anomalies that may indicate an attack.
System resource usage: Track the computational resources used during the training process, such as CPU, GPU, memory, and storage. Unusual spikes or patterns in resource consumption could suggest a hyperparameter attack aimed at exhausting resources.
Model architecture and configuration: Monitor the model's architecture and configuration settings for any unauthorized changes or unexpected modifications that might compromise the model's performance or security.
Performance on validation and test datasets: Regularly evaluate the model's performance on validation and test datasets to ensure that it maintains its accuracy and generalization capabilities. Monitor for any significant deviations in performance metrics that might indicate tampering with the model's hyperparameters.
Anomalies in model predictions: Analyze the model's predictions on real-world data to detect any unusual patterns or biases that may result from a hyperparameter attack.
Logs and alerts: Maintain detailed logs of all activities related to the AI model's development, training, and deployment. Set up alerts for any unusual behavior or deviations from the expected patterns, which could indicate a potential attack.
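One lightweight way to capture the hyperparameter-change signal called out above is to fingerprint the approved training configuration and compare it to whatever is about to be used. The baseline values and the alerting behavior below are illustrative placeholders; a real pipeline would raise an alert or block the job.

```python
import hashlib
import json

# Minimal sketch: detect unexpected hyperparameter changes by comparing the
# configuration about to be used for training against an approved baseline.
APPROVED_HYPERPARAMETERS = {"learning_rate": 0.001, "batch_size": 32, "epochs": 20}

def config_fingerprint(config: dict) -> str:
    # Canonical JSON (sorted keys) so identical settings always hash the same way.
    return hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()

APPROVED_FINGERPRINT = config_fingerprint(APPROVED_HYPERPARAMETERS)

def check_before_training(current_config: dict) -> None:
    if config_fingerprint(current_config) != APPROVED_FINGERPRINT:
        # In a real pipeline this would raise an alert or block the training job.
        print("ALERT: hyperparameters differ from the approved baseline:", current_config)
    else:
        print("Hyperparameters match the approved baseline.")

check_before_training({"learning_rate": 0.001, "batch_size": 32, "epochs": 20})  # match
check_before_training({"learning_rate": 0.5, "batch_size": 32, "epochs": 20})    # tampered
```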
By continuously monitoring these elements, organizations can increase their chances of detecting a hyperparameter attack and take appropriate action to protect their AI models and systems.
[Want to discuss this further? Hit me up on Twitter or LinkedIn]
[Subscribe to the RSS feed for this blog]
[Subscribe to the Weekly Microsoft Sentinel Newsletter]
[Subscribe to the Weekly Microsoft Defender Newsletter]
[Subscribe to the Weekly Azure OpenAI Newsletter]
[Learn KQL with the Must Learn KQL series and book]
[Learn AI Security with the Must Learn AI Security series and book]