February 27, 2025
Mindgard Nominated for NVIDIA GTC 2025 Poster Award

Mindgard is thrilled to announce that we have been nominated for the NVIDIA GTC 2025 Poster Award! As one of the premier global conferences for AI, accelerated computing, and security innovation, NVIDIA GTC showcases groundbreaking research and technological advancements from industry leaders, academia, and startups alike. We are honored to be recognized among the best in AI security research and to have the opportunity to present our work to the broader AI and cybersecurity community.

About NVIDIA GTC and the Poster Competition

NVIDIA’s GPU Technology Conference (GTC) is the world’s leading event for AI, deep learning, data science, and high-performance computing. The Poster Competition highlights cutting-edge research that leverages GPU computing to drive innovation across industries. Nominees are selected based on their research’s technical rigor, real-world impact, and contributions to advancing AI applications.

For our submission, Mindgard focused on one of the most critical challenges in AI security today: bypassing AI safety guardrails in large language models (LLMs). This work demonstrates issues within Microsoft’s Azure AI Content Safety service and proposes new methodologies to strengthen AI defenses against adversarial attacks.

Our Research: Identifying and Exploiting Blind Spots in AI Safety Guardrails

At Mindgard, our mission is to safeguard AI systems from security vulnerabilities, adversarial manipulation, and real-world exploitation. Our poster, based on research conducted in 2024, targets two guardrails in Microsoft’s Azure AI Content Safety suite:

  • AI Text Moderation: The feature responsible for blocking harmful content, such as hate speech and explicit material.
  • Prompt Shield: A guardrail designed to prevent AI jailbreaks and prompt injection attacks.

Our research demonstrated how attackers can bypass these safeguards using two advanced attack strategies:

  1. Character Injection Techniques – Manipulating text inputs using diacritics, homoglyphs, numerical replacements, spaces, and zero-width characters to evade AI moderation.
  2. Adversarial ML Evasion Techniques – Deploying perturbation-based attacks that modify important words within a text sample to trick AI classifiers (an illustrative sketch of both attack families follows this list).
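
To make these two attack families more concrete, here is a minimal, hypothetical Python sketch of the kinds of transformations involved: look-alike (homoglyph) substitution and zero-width character insertion for character injection, and a toy importance-guided word perturbation for adversarial ML evasion. The character mappings, helper names, and the toy classifier are illustrative assumptions for explanation only; they are not Mindgard’s actual tooling or the exact payloads used against Azure AI Content Safety.

```python
# Illustrative sketch of the two attack families described above.
# All mappings, function names, and the toy classifier are hypothetical
# examples for explanation; they are not Mindgard's actual tooling.

import random

# --- 1. Character injection -------------------------------------------------

# A few homoglyph substitutions: Latin letters replaced with visually
# similar Cyrillic code points (illustrative subset).
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "c": "\u0441"}

ZERO_WIDTH_SPACE = "\u200b"


def inject_homoglyphs(text: str) -> str:
    """Swap selected characters for look-alike Unicode code points."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def inject_zero_width(text: str) -> str:
    """Insert zero-width spaces between characters to break keyword matches."""
    return ZERO_WIDTH_SPACE.join(text)


# --- 2. Adversarial ML evasion (word-level perturbation) --------------------

def perturb_important_words(text: str, score_fn, budget: int = 2) -> str:
    """Greedily perturb the words whose removal most changes a classifier
    score (a simple stand-in for importance-guided perturbation attacks)."""
    words = text.split()
    base = score_fn(text)
    # Rank words by how much deleting each one shifts the score.
    impact = []
    for i in range(len(words)):
        reduced = " ".join(words[:i] + words[i + 1:])
        impact.append((abs(base - score_fn(reduced)), i))
    for _, i in sorted(impact, reverse=True)[:budget]:
        # Perturb the word by swapping two interior characters.
        w = list(words[i])
        if len(w) > 3:
            j = random.randrange(1, len(w) - 2)
            w[j], w[j + 1] = w[j + 1], w[j]
        words[i] = "".join(w)
    return " ".join(words)


if __name__ == "__main__":
    sample = "this is an example of disallowed content"
    print(inject_homoglyphs(sample))
    print(inject_zero_width(sample))
    # A toy "classifier" that simply counts a flagged keyword.
    toy_score = lambda t: t.lower().count("disallowed")
    print(perturb_important_words(sample, toy_score))
```

The common thread is that each transformation preserves how the text reads to a human while changing the token sequence a moderation model actually sees, which is why keyword- and classifier-based guardrails can miss the result.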

Our findings reveal that these attacks can reduce AI Text Moderation detection accuracy by up to 100% and Prompt Shield detection accuracy by up to 78.24%, effectively bypassing the guardrails and allowing harmful content to propagate undetected.

Download the Poster

Why This Matters for AI Security

The consequences of these vulnerabilities are significant. If left unchecked, attackers could exploit these weaknesses to:

  • Evade moderation systems and inject harmful content into AI-powered applications.
  • Manipulate LLMs into violating ethical and security guidelines.
  • Undermine trust in AI systems by making them susceptible to misinformation, unauthorized data access, and security breaches.

By showcasing this research at NVIDIA GTC 2025, we aim to raise awareness about the growing need for robust AI security measures and advocate for stronger adversarial defenses in AI safety frameworks.

Join Us at NVIDIA GTC 2025

We invite all attendees to stop by booth #3003, and to join us at the Historic Civic Center on March 17 from 5–7 p.m. to view our poster and talk with our team about our findings. If you're interested in AI security, adversarial attacks, or next-generation AI defense strategies, this is an opportunity to see firsthand how cutting-edge research is shaping the future of AI security.