Mapping Adversarial AI Tactics and Techniques

The Mitre Atlas provides a useful taxonomy of adversary tactics and techniques against Al-enabled systems based on real-world attack observations.

Cyber Security

Insider Language Analysis

MITRE Atlas

Data Protection

Cyber Security

AI security

2 January 2026

Mapping Adversarial AI Tactics and Techniques Mapping Adversarial AI Tactics and Techniques

Overview

AI systems are increasingly embedded across enterprise operations, but their adoption is also reshaping the cybersecurity threat landscape. As organisations deploy AI-powered applications and agents, these systems are becoming high-value targets for adversarial misuse and exploitation.

This article explores:

  • How AI systems are attacked through adversarial techniques
  • How attackers use AI itself to automate and scale cyber operations
  • Why insecure AI models increase organisational risk
  • How MITRE ATLAS helps structure defensive thinking around AI security

MITRE ATLAS[1] (Adversarial Threat Landscape for Artificial-Intelligence Systems) provides a structured way to map real-world attacker tactics and techniques targeting AI systems. By documenting these threats systematically, ATLAS supports more informed defensive design and risk mitigation for AI-enabled environments.

Why AI Is a Security Target

In June 2025, security researchers demonstrated how a prompt injection targeting an AI-powered customer service agent can be used to exfiltrate sensitive enterprise data. The researchers used the injection to discover the AI agent’s capabilities. They then crafted a prompt to get the agent to retrieve private or sensitive customer data and then automatically exfiltrate it via its email tool. (Reddy and Sanjay 2025)[2]

This incident highlights a growing problem. As organisations adopt AI at scale, the attack surface expands with it. AI models, agents, APIs, and plug-ins introduce new entry points that attackers are actively learning to exploit. Weaknesses in AI systems can be used to infiltrate enterprise environments, disrupt operations, or extract sensitive data.

At the same time, threat actors are increasingly using AI offensively. Automation, rapid data analysis, and adaptive behaviour allow attackers to execute more targeted and scalable cyberattacks than traditional methods alone.

Why Defensive AI Needs a Framework

AI is not only part of the problem. It is also part of the solution.

When integrated into cybersecurity operations, AI can help organisations defend AI systems themselves, shorten breach response times, and reduce the overall cost of incidents. IBM[3] estimates that effective use of AI in security operations can reduce average breach costs by up to $1.9 million.

However, defensive AI only works when it is applied with a clear understanding of how AI systems are attacked in practice. Security teams need visibility into the tactics and techniques adversaries use to manipulate models, extract data, evade controls, or interfere with AI-driven workflows. Traditional security frameworks, built around conventional IT systems, are often not enough.

This is where MITRE ATLAS becomes useful. Rather than treating AI threats as abstract risks, ATLAS documents real-world attacker behaviour specific to AI and machine learning systems. It provides a shared language for understanding how AI is compromised and a practical foundation for designing defences that reflect how attacks actually happen.

The Adversarial AI Attack Surface for AI Systems

IIBM’s security researchers found that between March 2024 and February 2025, 13% of organisations reported breaches that involved their AI models or applications. Most of the reported incidents happened due to compromised apps, APIs or plug-ins, and a majority of them led to either data compromise (60%) or caused operational disruptions (31%).(IBM, 2025[4]).

Simply put, bad actors are increasingly leveraging AI for adversarial purposes, meaning AI itself is now a high-value target for cyberattacks.

Adversarial AI attacks involve the deliberate manipulation of AI/ML systems so they generate incorrect or inaccurate output. The NIST identifies several types of adversarial cyberattacks that manipulate the behaviour of AI systems and compromise their output. These attacks, highlighted in Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST.AI.100-2)[5], include:

  • Model poisoning: Attackers include corrupted data in the model’s training data, thus affecting the AI system’s accuracy.
  • Evasion attacks: Bad actors try to change the input data of a deployed system so it responds in an unexpected way.
  • Privacy attacks: Adversaries ask the AI model numerous legitimate questions to find its weak spots and get it to reveal sensitive information.

Often, model developers focus on accuracy while overlooking security. Missing controls such as AI guardrails, telemetry logging, and memory hardening leave systems exposed, allowing attackers to manipulate models into leaking sensitive data, producing harmful output, or generating incorrect results.

The Offensive AI Wave and the Rise of AI-Powered Cyberattacks

So far, we’ve looked at how attackers exploit weaknesses in AI systems themselves. Increasingly, those same attackers are also using AI to target wider enterprise environments. This shift is often described as offensive AI.

Offensive AI refers to the use of AI, machine learning, and large language models (LLMs) to carry out cyberattacks more efficiently and at a greater scale. Common examples include:

  • Deepfakes: Highly realistic AI-generated video or audio used to spread misinformation, impersonate individuals, or support targeted smear campaigns
  • Web scraping bots: AI-driven tools that extract large volumes of sensitive data while bypassing anti-bot controls, then structure that data into usable formats such as CSV or JSON
  • Malicious GPTs: Modified or custom-built models designed to generate malware, phishing emails, or deceptive content as part of targeted fraud or intrusion campaigns

All these attacks have unique characteristics that differentiate them from “regular” cyberattacks. One important characteristic is automation. AI-powered attacks involve the use of automated tools that allow attackers to speed up various phases of an attack, including research, reconnaissance, planning, and even execution.

Efficient data gathering is another key characteristic. AI systems can collect and analyse vast amounts of information from multiple sources at speed. Attackers use these insights to identify high-value targets, tailor attacks precisely, and move quickly towards objectives such as data exfiltration, encryption, operational disruption, persistence, or lateral movement.

Crucially, these attacks are not static. AI-powered techniques evolve over time. Models can adapt based on feedback, refine their behaviour, and adjust to defensive measures, making detection increasingly difficult for conventional security controls.

The bottom line is clear. Whether AI is being used to attack (offensive AI) or whether AI itself is attacked (adversarial AI), AI is changing attacker behaviours. Effective defence starts with understanding how these attacks actually work.

MITRE ATLAS systematically documents these tactics and techniques. These 15 tactics and 50+ techniques presented in a matrix form enable security personnel to better understand the “evolving vulnerabilities of Al-enabled systems as they extend beyond cyber”[6] and effectively identify, prevent, and mitigate adversarial threats to AI and ML systems.

Defensive Use Cases for MITRE ATLAS

Security personnel can use the MITRE ATLAS matrix to build a clearer picture of the AI threat landscape and put controls in place before incidents happen. Within the matrix, tactics describe why attackers act, while the techniques mapped under each tactic show how those attacks are carried out in practice.

The table below illustrates how tactics and techniques in MITRE ATLAS align with five key defensive use cases. This, in turn, enables security teams to translate threat intelligence into actionable protection measures for AI systems.

  • Detect AI supply-chain attacks: Prevent compromised models and dependencies (e.g. malicious Hugging Face models) by verifying all AI artifacts and deploying guardrails between models and outputs.
  • Defend against data poisoning: Reduce training-time attacks by limiting exposure of model artifacts and maintaining full dataset provenance across sources and transformations.
  • Secure the model training pipeline: Protect model integrity against poisoning and manipulation (e.g. PoisonGPT) using adversarial training, model hardening, and robust input pre-processing.
  • Limit model extraction and data exfiltration: Reduce inference-time leakage by rate-limiting queries, enforcing strong authentication, and logging all model inputs and outputs.
  • Prevent prompt injection attacks: Safeguard LLM-powered systems from tool abuse and data exfiltration using AI guardrails, supervised fine-tuning, and targeted safety alignment techniques.

Limitations of MITRE ATLAS

MITRE ATLAS is useful, but it is not a complete safety net. It helps you understand documented tactics and techniques, yet it is still a catalogue of what we already know about. That’s great for mapping known threats to controls, but it won’t reliably tell you what attackers will try next. The AI threat landscape moves fast, so if ATLAS is not kept up to date, teams can end up defending yesterday’s attacks.

There’s another risk too: it can make people feel “done”. A team might work through the mitigations attached to each technique and assume the AI stack is now secure. In reality, new attack paths appear, and plenty of weaknesses never show up in a public framework until after they’ve been exploited.

If you stop at ATLAS and skip ongoing monitoring, threat modelling, and testing, you create blind spots (and those blind spots are where compromises happen).

Conclusion

In practice, AI cuts both ways. It helps organisations work faster, reduce costs, and improve productivity. At the same time, AI systems introduce new weaknesses. When those systems are attacked, the fallout can affect day-to-day operations, financial stability, and trust. As AI-driven attacks become more common, the security challenge gets harder, not easier.

References

  1. IBM, 2025, Cost of a Data Breach Report, https://www.ibm.com/reports/data-breach

  2. MITRE ATLAS, https://atlas.mitre.org/

  3. NIST, 2025, AI 100-2 E2025, Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations, https://csrc.nist.gov/pubs/ai/100/2/e2025/final

Recommended Talks

Explore related talks that complement this research

AI-Powered Threat Detection for Cyber Security

INDUSTRY

Cyber Security

Threat intelligence AI

Threat detection

AI-Powered Threat Detection for Cyber Security

AI is transforming cyber threat detection, cutting noise while surfacing real, explainable risks.

Leveraging MITRE ATLAS to Secure AI Systems | Atenai - AI Talks & Expert Guidance