Updated on
June 12, 2025
Talk: Deconstructing AI Risk — From Research to Real-World Exploits
In this talk, Peter Garraghan demonstrates how adversaries are already exploiting AI systems and why current security practices are often ill-equipped to stop them.
TABLE OF CONTENTS
Key Takeaways
Key Takeaways
  1. Understand the Unique Risks of AI: Learn how the dynamic and unpredictable behavior of LLMs makes them uniquely vulnerable to sophisticated attacks, even through seemingly benign user interactions. 
  2. Live Demonstration of AI Vulnerabilities: Attendees will witness a real-time demonstration of adversarial attacks on AI systems, showing how quickly and cost-effectively AI systems can be exploited. This live showcase brings theoretical risks to life, making the threats tangible and urgent.
  3. Practical Guardrails and AI Security Tooling: Walk away with actionable strategies and free open source tools that can help organizations mitigate AI risks. 
  4. Insights from a Decade of Research: Gain exclusive insights from Dr. Garraghan’s 10+ years of leading AI security research at the University of Lancaster. Attendees will benefit from deep technical expertise translated into practical, real-world applications.
  5. Empowering Secure AI Adoption: This session provides a blueprint for deploying AI systems safely. It addresses the balance between innovation and security, empowering organizations to embrace AI without compromising safety or data integrity.

As AI is rapidly embedded across applications, its dynamic, unpredictable nature is introducing new vulnerabilities—many of which remain poorly understood by traditional security teams. In this talk, Dr. Peter Garraghan draws on over a decade of academic research and real-world testing to break down the anatomy of AI risk, demonstrating how adversaries are already exploiting AI systems and why current security practices are often ill-equipped to stop them.

Download the slides.

The Invisible Risk Inside AI

Dr. Garraghan opens by reminding us that AI, despite its novelty, is still software. And like all software, it introduces invisible risks—except this time, we’re injecting it into applications without a solid understanding of its failure modes. Many organizations are moving quickly to adopt AI tools, often deploying them into production before fully assessing their implications. Security teams, development teams, and business stakeholders are moving at different speeds, creating a perfect storm of exposure.

Drawing on years of AI system testing, Garraghan outlines major attack categories affecting AI systems today including:

  • Extraction: Stealing or cloning models.

  • Injection & Inversion: Overriding prompt instructions or reverse-engineering training data.

  • Evasion: Tricking models into misclassifying or misinterpreting inputs.

  • Poisoning: Inserting malicious data to skew performance.

These attacks are not theoretical—they are happening in production environments, often triggered by simple, benign-seeming user interactions.

The Myth of the Model-Only Mindset

One of the most common pitfalls in AI security, Garraghan argues, is focusing solely on the model. Through a series of live demonstrations and thought experiments, he shows how real security risks typically arise in AI systems. 

From Prompt Injection to Unsafe Behavior

The presentation walks through real-world exploit examples, including:

  • Markup Injection: Where a user hides malicious code or external links in inputs disguised as user data.

  • SQL Injection via LLM: Where cleverly crafted prompts generate outputs that can execute destructive commands if connected to backend services.

  • Unsafe Content Generation: Where models, when prodded with indirect prompts, generate instructions or opinions that could lead to harm—for example, advising on unsafe candle usage in a product demo.

In each case, the model behaves "as intended" from a technical standpoint. It follows the prompt, generates coherent output, and never raises an error. But the result is dangerous. That’s the heart of the challenge: AI systems do not fail in obvious ways—they fail in context-specific, nuanced ways that look like success until it’s too late.

Red Teaming AI: Why the Rules Have Changed

Garraghan makes the case for adapting offensive security approaches to fit the unique properties of AI systems; non-deterministic, often lack well-defined inputs and outputs, and don’t expose obvious vulnerabilities until paired with real-world usage.

Instead of just throwing jailbreak prompts at a model and calling it a day, effective AI red teaming should:

  • Start with the intended use case.

  • Understand the system architecture, including embeddings, context windows, vector stores, and connected APIs.

  • Design tests that mimic realistic user behavior, not just edge cases.

  • Surface not only security flaws but business risks, ethical failures, and compliance gaps.

Mindgard’s approach reflects this shift—testing not just models, but entire AI-powered applications and systems, identifying where vulnerable interactions and risky outputs emerge in real-world contexts.

The OWASP Top 10 for LLMs

Garraghan highlights the OWASP Top 10 for LLM applications, framing it as a helpful (but not exhaustive) reference for organizations looking to understand their exposure. These include familiar risks like Prompt Injection and Sensitive Information Disclosure, but also categories like:

  • System Prompt Leakage

  • Vector & Embedding Weaknesses

  • Excessive Agency (where the AI takes unintended actions)

  • Unbounded Consumption (where user input drives resource exhaustion)

He stresses that these risks are not abstract—they are showing up in deployed systems today.

A Case Study

To illustrate the practical side of testing AI systems, Garraghan presents a detailed walkthrough of a fictitious candle company using a generative AI interface to sell products. The system includes:

  • Guardrail bypass
  • Custom system prompts to limit the model’s responses
  • Embedded product data
  • Integrated delivery options

Despite these controls, the AI remains exploitable. Garraghan demonstrates how subtle prompt manipulation, capitalization patterns, or misleading queries can inject malicious payloads, leak unintended data, or bypass content filters entirely. This highlights a broader truth: even systems that seem locked down are vulnerable when context and control mechanisms are not rigorously tested.

So What Can Be Done?

Garraghan closes with pragmatic guidance. Despite the challenges, AI security is not a lost cause. Many best practices from traditional application security still apply—especially when adapted thoughtfully. Organizations should:

  1. Test systems, not just models: Real risks emerge from integrations.

  2. Threat model AI use cases: Don’t rely on generic LLM safety claims.

  3. Embed controls across the stack: From prompt engineering to vector stores to API gateways.

  4. Update governance and playbooks: AI development lifecycles differ from software; security programs must evolve accordingly.

  5. Train your teams: Developers, product managers, and security engineers all need a shared understanding of how AI behaves and fails.

Final Takeaway

AI is not magic—and it’s not exempt from security scrutiny. But it is different. As AI becomes central to enterprise systems, security teams must shift their mindset from model testing to system testing, from rule-based detection to context-aware analysis. With the right frameworks, tools, and discipline, organizations can harness the power of AI safely and securely.

This talk equips attendees with the awareness, language, and practical insight to start that journey.