Talk: Deconstructing AI Risk — From Research to Real-World Exploits

Updated on

June 16, 2025

In this talk, Peter Garraghan demonstrates how adversaries are already exploiting AI systems and why current security practices are often ill-equipped to stop them.

Dr. Peter Garraghan

TABLE OF CONTENTS

Key Takeaways

As AI is rapidly embedded across applications, its dynamic, unpredictable nature is introducing new vulnerabilities—many of which remain poorly understood by traditional security teams. In this talk, Dr. Peter Garraghan draws on over a decade of academic research and real-world testing to break down the anatomy of AI risk, demonstrating how adversaries are already exploiting AI systems and why current security practices are often ill-equipped to stop them.

‍

Download the slides.

The Invisible Risk Inside AI

Dr. Garraghan opens by reminding us that AI, despite its novelty, is still software. And like all software, it introduces invisible risks—except this time, we’re injecting it into applications without a solid understanding of its failure modes. Many organizations are moving quickly to adopt AI tools, often deploying them into production before fully assessing their implications. Security teams, development teams, and business stakeholders are moving at different speeds, creating a perfect storm of exposure.

Drawing on years of AI system testing, Garraghan outlines major attack categories affecting AI systems today including:

Extraction: Stealing or cloning models.
Injection & Inversion: Overriding prompt instructions or reverse-engineering training data.
Evasion: Tricking models into misclassifying or misinterpreting inputs.
Poisoning: Inserting malicious data to skew performance.

These attacks are not theoretical—they are happening in production environments, often triggered by simple, benign-seeming user interactions.

The Myth of the Model-Only Mindset

One of the most common pitfalls in AI security, Garraghan argues, is focusing solely on the model. Through a series of live demonstrations and thought experiments, he shows how real security risks typically arise in AI systems.

From Prompt Injection to Unsafe Behavior

The presentation walks through real-world exploit examples, including:

Markup Injection: Where a user hides malicious code or external links in inputs disguised as user data.
SQL Injection via LLM: Where cleverly crafted prompts generate outputs that can execute destructive commands if connected to backend services.
Unsafe Content Generation: Where models, when prodded with indirect prompts, generate instructions or opinions that could lead to harm—for example, advising on unsafe candle usage in a product demo.

In each case, the model behaves "as intended" from a technical standpoint. It follows the prompt, generates coherent output, and never raises an error. But the result is dangerous. That’s the heart of the challenge: AI systems do not fail in obvious ways—they fail in context-specific, nuanced ways that look like success until it’s too late.

Red Teaming AI: Why the Rules Have Changed

Garraghan makes the case for adapting offensive security approaches to fit the unique properties of AI systems; non-deterministic, often lack well-defined inputs and outputs, and don’t expose obvious vulnerabilities until paired with real-world usage.

Instead of just throwing jailbreak prompts at a model and calling it a day, effective AI red teaming should:

Start with the intended use case.
Understand the system architecture, including embeddings, context windows, vector stores, and connected APIs.
Design tests that mimic realistic user behavior, not just edge cases.
Surface not only security flaws but business risks, ethical failures, and compliance gaps.

Mindgard’s approach reflects this shift—testing not just models, but entire AI-powered applications and systems, identifying where vulnerable interactions and risky outputs emerge in real-world contexts.

The OWASP Top 10 for LLMs

Garraghan highlights the OWASP Top 10 for LLM applications, framing it as a helpful (but not exhaustive) reference for organizations looking to understand their exposure. These include familiar risks like Prompt Injection and Sensitive Information Disclosure, but also categories like:

System Prompt Leakage
Vector & Embedding Weaknesses
Excessive Agency (where the AI takes unintended actions)
Unbounded Consumption (where user input drives resource exhaustion)

He stresses that these risks are not abstract—they are showing up in deployed systems today.

A Case Study

To illustrate the practical side of testing AI systems, Garraghan presents a detailed walkthrough of a fictitious candle company using a generative AI interface to sell products. The system includes:

Guardrail bypass
Custom system prompts to limit the model’s responses
Embedded product data
Integrated delivery options

Despite these controls, the AI remains exploitable. Garraghan demonstrates how subtle prompt manipulation, capitalization patterns, or misleading queries can inject malicious payloads, leak unintended data, or bypass content filters entirely. This highlights a broader truth: even systems that seem locked down are vulnerable when context and control mechanisms are not rigorously tested.

So What Can Be Done?

Garraghan closes with pragmatic guidance. Despite the challenges, AI security is not a lost cause. Many best practices from traditional application security still apply—especially when adapted thoughtfully. Organizations should:

Test systems, not just models: Real risks emerge from integrations.
Threat model AI use cases: Don’t rely on generic LLM safety claims.
Embed controls across the stack: From prompt engineering to vector stores to API gateways.
Update governance and playbooks: AI development lifecycles differ from software; security programs must evolve accordingly.
Train your teams: Developers, product managers, and security engineers all need a shared understanding of how AI behaves and fails.

Final Takeaway

AI is not magic—and it’s not exempt from security scrutiny. But it is different. As AI becomes central to enterprise systems, security teams must shift their mindset from model testing to system testing, from rule-based detection to context-aware analysis. With the right frameworks, tools, and discipline, organizations can harness the power of AI safely and securely.

This talk equips attendees with the awareness, language, and practical insight to start that journey.

‍

Research: Shadow AI is a Blind Spot in Enterprise Security, Including Among Security Teams

At RSA Conference 2025 and InfoSecurity Europe 2025 we surveyed over 500 cybersecurity professionals to assess emerging threats in enterprise environments. The findings reveal a growing and often overlooked risk: security professionals using generative AI tools without approval, a trend known as Shadow AI.

Hunting AI Application Vulnerabilities With Burp

In this article we’ll walk through hunting for AI application vulnerabilities. We’ll use Mindgard to find application vulnerabilities in a deliberately-vulnerable LLM lab application made available by PortSwigger.

Webinar: Test AI Systems, Not Models

In this webinar, Dr. Peter Garraghan takes the audience on a deep dive into the underbelly of AI vulnerabilities, exposing the gaps within traditional AI security approaches and demonstrating why application-level AI security must be a priority.

Mindgard, the leading provider of Artificial Intelligence security solutions, helps enterprises secure their AI models, agents, and systems across the entire lifecycle. Mindgard’s solution uncovers shadow AI, conducts automated AI red teaming by emulating adversaries, and delivers runtime protection against attacks like prompt injection and agentic manipulation. Trusted by leading organizations in finance, healthcare, and technology, Mindgard is backed by investors including .406 Ventures, IQ Capital, Atlantic Bridge, and Lakestar.