Discover the latest findings on vulnerabilities in Pixtral-Large-Instruct-2411, including jailbreak and encoding risks, and learn how to safeguard your AI applications effectively.
Fergal Glynn
As artificial intelligence (AI) continues to transform industries, the need for robust AI security and red teaming practices has never been more critical. AI systems, while powerful, are vulnerable to a range of threats, from adversarial attacks and prompt injections to data poisoning and model manipulation. To address these challenges, a growing community of experts continues to conduct research into and develop tools for securing AI technologies, ensuring their safe and ethical deployment.
This article highlights the key figures in AI security and AI red teaming—pioneers who are shaping the future of secure AI. From leading researchers and engineers to policy advocates and red teaming specialists, these individuals are at the forefront of identifying vulnerabilities, developing defensive strategies, and creating frameworks to mitigate risks. Whether they are working on securing large language models (LLMs), advancing AI governance, or building tools for red teaming, their contributions are essential to the responsible development of AI.
These professionals are not only addressing today’s challenges but also laying the groundwork for a safer AI-powered future. By following their work, you can stay informed about the latest advancements, tools, and strategies in AI security and red teaming. Whether you’re a cybersecurity professional, AI developer, or simply an enthusiast, these are the people you need to know to navigate the evolving landscape of AI security.
Job Title: Co-founder and CEO of Aquia
Area of Expertise: Security for Government
LinkedIn: Chris H.
Chris Hughes has roughly 20 years of Cybersecurity and IT Experience. He’s currently serving as the President & Co-Founder of Aquia, a cybersecurity consulting firm focused on enabling secure digital transformations. Bringing a diverse experience of expertise, Mr. Hughes has held a variety of roles throughout his career including CISO, Architecture, Engineering, Governance Risk and Compliance, and more.
Often working with technology startups, Mr. Hughes is a startup and board advisor, passionate about disrupting the cybersecurity ecosystem with innovative capabilities and technologies. Mr. Hughes is also a proud military veteran and adjunct professor, helping to empower and educate the next generation of cybersecurity professionals.
Why follow Chris Hughes?
Mr. Hughes serves as a Cyber Innovation Fellow (CIF) at the U.S. Cybersecurity and Infrastructure Security Agency (CISA). Mr. Hughes is the Author of Software Transparency and Effective Vulnerability Management from the Publisher Wiley and the host of the Resilient Cyber Substack and Podcast.
Additionally, Mr. Hughes is a frequent public speaker, panelist, author, and industry commentator on topics across AppSec, Vulnerability Management, Software Supply Chain Security, and DevSecOps.
More from Chris Hughes:
Chris Hughes’ posts on Cloud Wars
Job Title: Security Engineer at Amazon
Area of Expertise: Incident Response
LinkedIn: Day Johnson
X: @daycyberwox
YouTube: CYBERWOX
Day Johnson is skilled in Defensive Cybersecurity Operations relating to Cloud Security, SecDevOps & DevSecOps, Detection Engineering, Incident Response, SIEM Engineering, Data Analysis, Security Automation & Cybersecurity Training.
Why follow Day Johnson?
Mr. Johnson creates cybersecurity education content on YouTube.
More from Day Johnson:
Cyberwox Academy on X: @CyberwoxAcademy
Job Title: SecOps Engineering at Snyk
Area of Expertise: AI for SecOps
LinkedIn: Filip Stojkovski
Filip is a cybersecurity professional with 14+ years of experience, evolving from a SOC analyst to currently leading SecOps engineering at Snyk. Filip also advises companies on SOAR, AI for SOC, and threat intelligence strategies.
Filip is a SANS-certified (GSTRT, GCTI, GCFA), earned “Threat Seeker of the Year” honors, and created the LEAD Threat Intelligence Framework and Security Automation Development Life Cycle (SADLC).
Why follow Filip Stojkovski?
Mr. Stojkovski is making AI knowledge accessible to cyber security professionals.
More from Filip Stojkovski:
Analysis of the AI for SecOps Market
Filip Stojkovski’s posts on the Cyber Security Automation and Orchestration blog
AI Agents In Security Automation with Filip Stojkovski (The AI Security Accelerator Podcast)
Job Title: Principal Developer Advocate, Generative AI at Amazon Web Services (AWS)
Area of Expertise: Gen AI
LinkedIn: Antje Barth
X: @anbarth
Antje is a Principal Developer Advocate for generative AI at AWS, with over 17 years of experience in the IT industry. She is also an author, instructor, and co-founder of the Women in Big Data chapter in Duesseldorf, Germany.
Antje has co-authored the O'Reilly books, "Generative AI on AWS" and "Data Science on AWS," with over 20,000 copies sold in the US. She has also developed the "Generative AI with large language models" and "Practical Data Science" courses in collaboration with DeepLearning.AI.
Antje is passionate about AI and machine learning and skilled in Kubernetes, cloud-native and container/Docker technologies, big data, Python, public speaking, and solution selling. She enjoys sharing her knowledge and experience with the community and empowering others to leverage the power of generative AI.
Why follow Antje Barth?
Antje's mission is to help developers find practical ways to use generative AI, one of the most exciting and innovative fields of AI today.
More from Antje Barth:
Generative AI on AWS: Building Context-Aware Multimodal Reasoning Applications
Job Title: Co-Founder & Head of Research at Stealth Startup
Area of Expertise: AI Security
LinkedIn: Dylan Williams
Dylan is an experienced security analyst, industry speaker, and security of AI advocate. What’s Dylan working on right now? We’ll have to wait and see. His LinkedIn says he is the co-founder of a stealth startup. Exciting times!
Why follow Dylan Williams?
Dylan is on a mission to simplify AI for security professionals.
More from Dylan Williams:
Resilient Cyber w/ Filip Stojkovski & Dylan Williams - Agentic AI & SecOps
Job Title: Gen-AI x Detection Engineering, Principal Security Engineer at System Two Security
Area of Expertise: Red Teaming and Pentesting
LinkedIn: Stephen Lincoln
With over a decade of experience in security engineering and software engineering, Stephen Lincoln specializes in purple teaming, detection engineering, software development, creating custom attacks for breach and attack simulation, and threat research.
He has extensive expertise in SIEM platforms like Splunk, threat hunting, UEBA/UBA, red teaming/penetration testing, Python, and the MITRE ATT&CK framework. His experience also spans DevOps/DevSecOps, CI/CD pipelines, programming languages such as GoLang, C, and PHP, machine learning/AI, and incident response.
Why follow Stephen Lincoln?
Stephen is the developer of DetectIQ, an AI-powered security rule management platform that enables the creation, analysis, and optimization of detection rules across multiple security systems.
More from Stephen Lincoln:
Stephen Lincoln’s posts on the AttackIQ blog
Job Title: Founder of Software Analyst Cybersecurity Research (SACR)
Area of Expertise: AI/ML and Cybersecurity
LinkedIn: Francis Odum
Francis is a cybersecurity researcher and independent analyst read by over 60,000+ security and technology professionals.
Why follow Francis Odum?
Francis has built one of the largest independent cyber research firms in the cybersecurity market.
More from Francis Odum:
Job Title: Tech Lead for Safety Research Teams at OpenAI
Area of Expertise: Security of AI
LinkedIn: Alex Beutel
X: @alexbeutel
Alex Beutel leads the Safety Research teams at OpenAI, focusing on ensuring the safe deployment of AI technologies. Prior to joining OpenAI, he was a Senior Staff Research Scientist at Google Research, where he co-led a Responsible ML team. His work has spanned recommender systems, fairness, robustness, reinforcement learning, and machine learning for databases.
Alex holds a Ph.D. in Computer Science from Carnegie Mellon University, where his research centered on large-scale user behavior modeling, including fraud detection and recommender systems.
Why follow Alex Beutel?
Alex is a co-author of Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning.
More from Alex Beutel:
Alex Beutel’s research on Google Scholar
Job Title: Researcher at OpenAI
Area of Expertise: Security of AI
LinkedIn: Kai Yuanqing Xiao
X: @KaiKaiXiao
Kai Xiao is a researcher at OpenAI, contributing to advancements in AI safety and robustness. He has co-authored papers focusing on training large language models to prioritize privileged instructions, enhancing their ability to handle conflicting directives securely.
Why follow Kai Xiao?
Kai's research on hierarchical instruction following in language models directly contributes to mitigating risks associated with prompt injections and other adversarial attacks, enhancing the security and reliability of AI systems.
More from Kai Xiao:
Kai Xiao’s research on Google Scholar
Kai Xiao’s research on Papers with Code
Job Title: Member of Technical Staff at OpenAI
Area of Expertise: Security of AI
LinkedIn: Johannes H.
X: @JoHeidecke
Johannes Heidecke is a researcher at OpenAI, focusing on AI safety and reinforcement learning. He has co-authored works on rule-based rewards for language model safety and training models to prioritize privileged instructions, contributing to the development of more secure AI systems.
He is a co-author of the paper Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning.
Why follow Johannes Heidecke?
Johannes's work on reinforcement learning and safety mechanisms in language models is crucial for developing AI systems that can resist adversarial inputs and operate securely in complex environments.
More from Johannes Heidecke:
Johannes Heidecke’s research on Google Scholar
Johannes Heidecke’s research on Papers with Code
Job Title: Formerly Vice President of Research and Safety at OpenAI
Area of Expertise: Security of AI
LinkedIn: Lilian Weng
X: @lilianweng
Lilian Weng is the former Vice President of Research and Safety at OpenAI, overseeing risk management for AI models.
She joined OpenAI in 2018, initially contributing to robotics projects, including leading the development of a robotic hand capable of solving a Rubik’s Cube, after roles as a data scientist and software engineer at companies like Meta, Dropbox, and Affirm. Lilian led efforts to consolidate safety research and was a member of OpenAI's Safety and Security Committee. Lilian is also a Distinguished Fellow of the Fellows Fund.
Why follow Lilian Weng?
Lilian's leadership in AI safety has been instrumental in shaping industry standards for responsible AI development. Her work in risk management for AI models has provided a framework for organizations to emulate, emphasizing the importance of proactive safety measures in AI deployment.
More from Lilian Weng:
Lilian Weng’s research on Google Scholar
Job Title: Technical Program Manager, Trustworthy AI at OpenAI
Area of Expertise: AI Red Teaming
LinkedIn: Lama Ahmad
X: @_lamaahmad
Lama Ahmad leads the External Red Teaming efforts at OpenAI, focusing on identifying and mitigating potential vulnerabilities in AI models. Her work involves collaborating with external experts to rigorously test OpenAI's systems, ensuring they are robust against adversarial attacks.
Why follow Lama Ahmad?
Lama's role in leading red teaming initiatives is critical for proactively identifying weaknesses in AI systems, contributing to the development of more secure and resilient AI technologies.
More from Lama Ahmad:
Advancing red teaming with people and AI
Job Title: AI Policy Lead at OpenAI
Area of Expertise: AI Policy
LinkedIn: Sandhini Agarwal
Sandhini Agarwal is a safety researcher at OpenAI, focusing on the governance and ethical deployment of AI systems. She has co-authored papers on practices for governing agentic AI systems, contributing to the development of frameworks that ensure AI operates safely and ethically.
Why follow Sandhini Agarwal?
Sandhini's work on AI governance and safety practices is essential for establishing guidelines and frameworks that ensure AI systems are secure, ethical, and aligned with human values.
More from Sandhini Agarwal:
Sandhini Agarwal’s research on Google Scholar
Sandhini Agarwal's research on CatalyzeX
Sandhini Agarwal’s research on SciSpace
Job Title: External Red Teamer at OpenAI
Area of Expertise: AI Red Teaming
LinkedIn: Michael Lampe
Michael Lampe is part of the External Red Teaming group at OpenAI, working to identify vulnerabilities in AI models through rigorous testing. His efforts are crucial in ensuring that AI systems are robust and secure against potential threats.
Why follow Michael Lampe?
Michael is a contributor to OpenAI’s Approach to External Red Teaming for AI Models and Systems.
More from Michael Lampe:
OpenAI’s Approach to External Red Teaming for AI Models and Systems
Michael Lampe’s research on Google Scholar
Job Title: Researcher at OpenAI
Area of Expertise: AI Red Teaming
LinkedIn: Pamela Mishkin
Pamela holds a BA in Computer Science and Mathematics from Williams College and an MPhil in Technology Policy from the University of Cambridge. Before joining OpenAI, she led product management at The Whistle, developing tech tools for international human rights organizations. She also conducted economic policy research at the Federal Reserve Bank of New York and collaborated with the UK's Department of Digital Culture, Media and Sport on online advertising policy.
Why follow Pamela Mishkin?
Pamela Mishkin is a Researcher at OpenAI, focusing on AI policy and safety, particularly in making language models safe and fair from both technical and policy perspectives.
More from Pamela Mishkin:
Adversarial Attacks on NLP Models
Pamela Mishkin’s repositories on GitHub
Pamela Mishkin’s research on Google Scholar
Challenges in Deployable Generative AI (virtual talk)
Job Title: Founder and CEO of Stealth Startup
Area of Expertise: AI Red Teaming
LinkedIn: Nazneen Rajani
Nazneen Rajani is an AI Researcher focusing on AI safety, alignment, and the robustness of large language models (LLMs). She completed her Ph.D. in Computer Science at the University of Texas, Austin, specializing in NLP and the interpretability of deep learning models.
Why follow Nazneen Rajani?
Nazneen's work on red-teaming large language models is pivotal in identifying and mitigating potential vulnerabilities in AI systems. Her research ensures that AI models are robust, interpretable, and aligned with human values, making her a key figure in AI security. We are excited for the launch of Nazneen’s new startup.
More from Nazneen Rajani:
Nazneen Rajani’s research on Google Scholar
Job Title: ML Scientist at AI2
Area of Expertise: AI Red Teaming
LinkedIn: Nathan Lambert
X: @natolambert
YouTube: @natolambert
Nathan Lambert was previously a Research Scientist at Hugging Face, contributing to advancements in machine learning and AI alignment. He has co-authored several papers focusing on aligning language models with user intent and improving their safety and robustness.
Nathan's work involves developing methods to fine-tune AI models for better performance and alignment with human values.
Why follow Nathan Lambert?
Nathan's research on direct distillation of language model alignment addresses critical aspects of AI safety, ensuring that AI systems behave as intended and reducing the risk of unintended harmful outputs. His contributions are essential for developing secure and reliable AI technologies.
More from Nathan Lambert:
A Little Bit of Reinforcement Learning from Human Feedback
Nathan Lambert’s research on Google Scholar
Job Title: LLM Engineering & Research at Hugging Face
Area of Expertise: AI Red Teaming
LinkedIn: Lewis Tunstall
X: @_lewtun
Lewis Tunstall is a Machine Learning Engineer at Hugging Face, specializing in natural language processing and machine learning model development. He has co-authored works on aligning language models with user intent and improving their performance through fine-tuning techniques. Lewis's expertise lies in developing and deploying AI models that are both efficient and aligned with human expectations.
Why follow Lewis Tunstall?
Lewis's work on the direct distillation of language model alignment contributes to the development of AI systems that are secure and aligned with user intentions. His research helps in creating AI models that are less prone to generating harmful or unintended outputs, enhancing overall AI safety.
More from Lewis Tunstall:
Zephyr: Direct Distillation of LM Alignment
Lewis Tunstall’s research on Google Scholar
Natural Language Processing with Transformers
The Sequence Chat: Lewis Tunstall, Hugging Face, On Building the Model that Won the AI Math Olympiad
Lewis Tunstall on Hugging Face
Job Title: AI Security Specialist at NVIDIA
Area of Expertise: AI Security
LinkedIn: Kai Greshake
X: @KGreshake
Kai is experienced in penetration testing and currently focuses on research and consulting in the field of AI security. Kai has discovered Indirect Prompt Injections in LLMs.
Why follow Kai Greshake?
Kai's work on indirect prompt injections has highlighted significant vulnerabilities in LLM-integrated applications. His research, including the paper "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection," demonstrates how adversaries can manipulate AI systems to produce unintended outputs, emphasizing the need for robust security measures in AI deployment.
More from Kai Greshake:
Kai Greshake’s research on Google Scholar
Job Title: Founder at rez0corp
Area of Expertise: AI Security
LinkedIn: Joseph Thacker
X: @rez0__
Joseph has a diverse background in software engineering and security. He began his career as a Software Engineer at HP Exstream and later at OpenText, where he also served as a Security Advocate.
Transitioning into security consulting, he worked as a Senior Security Consultant at Crowe before joining AppOmni in 2020. At AppOmni, he has held roles including Senior Offensive Security Engineer and Principal AI Engineer. Joseph holds a Master's degree in Cybersecurity and Information Assurance from Western Governors University. Joseph recently founded a company called rez0corp, we are excited to learn more about this endeavor.
Why follow Joseph Thacker?
Joseph's work at AppOmni focuses on integrating AI into SaaS security solutions. He has emphasized the challenges of securing complex SaaS applications and the necessity for specialized expertise in this area. His insights into AI-driven security measures and the development of tools like AskOmni, an AI-powered SaaS security assistant, highlight his contributions to advancing AI security practices.
More from Joseph Thacker:
"AI Application Security" by Joseph Thacker at IWCON2023
Joseph Thacker’s research on Papers with Code
Job Title: Consultant at Positive.security
Area of Expertise: AI Security
LinkedIn: Lukas Euler
Lukas studied Computer Science at the Karlsruhe Institute of Technology. In his subsequent roles as a Security Engineer with Scout24 and Episerver, he focused on securing every stage of the software development lifecycle.
His expertise includes conducting thorough white-box reviews of complex applications and designing secure software architectures. Lukas has also engaged in security research targeting niche technologies like telecommunication protocols and has conducted red teaming exercises during his tenure at Security Research Labs.
Why follow Lukas Euler?
Lukas has contributed to AI security through his research on vulnerabilities in AI-driven applications. Notably, he co-authored a report on a critical drive-by remote code execution (RCE) vulnerability in Windows 10, which allowed attackers to execute arbitrary code via malicious websites.
Additionally, Lukas has explored security flaws in AI applications like Auto-GPT, demonstrating how attackers could achieve code execution and escape from Docker containers.
More from Lukas Euler:
Hacking Auto-GPT and escaping its docker container
Job Title: Founder of Datasette
Area of Expertise: AI Security
LinkedIn: Simon Willison
X: @simonw
Simon Willison is a programmer renowned for co-creating the Django Web framework and founding the social conference directory Lanyrd.
He currently focuses on developing open-source tools for data exploration and publishing, notably Datasette. Simon's career includes roles such as Engineering Director at Eventbrite and a JSK Journalism Fellowship at Stanford University, where he concentrated on building open-source tools for data journalism.
Why follow Simon Willison?
Simon has significantly contributed to the field of AI security through his research and writings on prompt injection attacks—techniques that manipulate AI language models by injecting malicious prompts.
In September 2022, he detailed these vulnerabilities in his article "Prompt injection attacks against GPT-3," highlighting the ease with which AI systems can be subverted.
He further explored the challenges of mitigating such attacks in "You can’t solve AI security problems with more AI," emphasizing the limitations of using AI-based solutions to address AI vulnerabilities.
Simon's insights have been widely recognized, with coverage in publications like Ars Technica, discussing the implications of prompt injection attacks on AI systems.
His expertise in AI security makes him a valuable figure to follow for those interested in understanding and addressing the complexities of securing AI applications.
More from Simon Willison:
Job Title: CEO & co-founder of Mithril Security
Area of Expertise: Secure AI Applications
LinkedIn: Daniel Huynh
X: @dhuynh95
Daniel Huynh has a strong foundation in data science and business, holding degrees from École Polytechnique and HEC Paris. Prior to founding Mithril Security, he worked at Microsoft, where he focused on Privacy Enhancing Technologies. At Mithril Security, he leads efforts to create privacy-friendly AI tools, emphasizing end-to-end data protection during AI training and deployment.
Why follow Daniel Huynh?
Daniel’s leadership has resulted in significant advancements in privacy and security for AI. His focus on protecting data throughout the AI lifecycle addresses a critical gap in the deployment of secure AI applications. His work is especially important in ensuring data confidentiality in AI systems, making him a key figure in AI security.
Daniel is a co-author of How We Hid a Lobotomized LLM on Hugging Face to Spread Fake News.
More from Daniel Huynh:
Daniel’s posts on the Mithril Security blog
Daniel’s posts on the LaVague Blog
Building and Leading a Privacy-First Startup – Interview with Daniel Huynh
Job Title: Developer Relations Engineer at Mithril Security
Area of Expertise: Secure AI Applications
LinkedIn: Jade Hardouin
Jade Hardouin bridges the technical and community aspects of AI security. In her role at Mithril Security, she collaborates on research and educates developers about AI model vulnerabilities. She has co-authored articles on AI security, including work on prompt injection and supply chain poisoning attacks.
Why follow Jade Hardouin?
Jade’s contributions focus on identifying and mitigating vulnerabilities in open-source AI models. She has helped highlight how subtle manipulations of AI models can lead to the dissemination of misinformation, stressing the importance of secure AI supply chains.
More from Jade Hardouin:
Jade’s posts on the Mithril Security blog
Job Title: Red Team Director at EA
Area of Expertise: AI Red Teaming
LinkedIn: Johann Rehberger
YouTube: Embrace The Red
Johann Rehberger is a seasoned cybersecurity expert with over 18 years of experience in threat analysis, threat modeling, risk management, penetration testing, and red teaming. He has held significant positions at major tech companies, including Microsoft, where he established an offensive security team in Azure Data and served as Principal Security Engineering Manager. He also built out a red team at Uber and currently serves as the Director of Red Team at Electronic Arts.
Why follow Johann Rehberger?
With over 18 years in cybersecurity, Johann has a wealth of experience in threat analysis, threat modeling, risk management, penetration testing, and red teaming. His background includes significant roles at major tech companies like Microsoft and Uber, where he built and led offensive security teams.
More from Johann Rehberger:
Google AI Studio: Data Exfiltration via Prompt Injection. Quickly Fixed After Responsible Disclosure
Job Title: Senior Product Manager at Google
Area of Expertise: AI Developer Advocate
LinkedIn: Logan Kilpatrick
YouTube: @LoganKilpatrickYT
Logan began his career at NASA, where he was involved in building lunar rover software. He then transitioned to Apple, focusing on training machine learning models. In November 2022, he joined OpenAI as the Head of Developer Relations, supporting developers building with OpenAI's API and ChatGPT.
During his tenure, he played a pivotal role in launching the GPT Store and enhancing the developer experience. In April 2024, Logan joined Google as a Senior Product Manager for Google AI Studio and the Gemini API, aiming to enable developers worldwide to build with Gemini.
Why follow Logan Kilpatrick?
While Logan's primary focus has been on developer relations and AI product management, his work indirectly contributes to AI security. By facilitating the integration of AI technologies into various applications, he emphasizes the importance of building secure and reliable AI systems. His efforts in educating developers and promoting best practices help mitigate potential security risks associated with AI deployment.
More from Logan Kilpatrick:
Inside OpenAI | Logan Kilpatrick
Logan Kilpatrick’s posts on the Google for Developers blog
Job Title: AI Security Specialist at METR
Area of Expertise: AI Security
LinkedIn: Nikola Jurkovic
Nikola Jurkovic is dedicated to reducing AI risk, engaging in technical AI safety research on large language models, and contributing to AI safety field-building initiatives. Nikola served as the Deputy Director and former Workshops Lead of the AI Safety Student Team at Harvard.
Why follow Nikola Jurkovic?
Nikola's work focuses on mitigating risks associated with advanced AI systems. Through his research and leadership in AI safety initiatives, he contributes to the development of safer AI technologies and the promotion of best practices in AI deployment. His involvement in AI safety education and community-building efforts further amplifies his impact in the field.
More from Nikola Jurkovic:
LessWrong (Nikola Jurkovic’s blog)
Job Title: Machine Learning PhD Student at the University of Toronto
Research Engineer and Founding Team at AI Safety Institute (AISI)
Area of Expertise: AI Security
LinkedIn: Max Kaufmann
Max Kaufmann is a ML researcher, interested in ensuring that AI progress is broadly beneficial—both by minimizing AI-induced harms and ensuring that the benefits are shared outside of the privileged few.
Max studied as a PhD student at the University of Toronto, supervised by Roger Grosse, where he used influence functions to investigate how LLMs generalize. Before that, Max was part of the founding team at UK’s AI Safety Institute where he helped scale the organization,, and started the post-training team.
Why follow Max Kaufmann?
Max's research addresses critical challenges in AI security, particularly concerning the robustness and generalization of AI models. His work on adversarial robustness explores the development of new threat models that better capture real-world concerns and seeks to enhance the efficiency of adversarial training.
Additionally, his investigations into LLM generalization examine the limitations of these models and their situational awareness, which are crucial for understanding and mitigating potential risks associated with AI deployment.
More from Max Kaufmann:
Max Kaufmann’s research on Google Scholar
Job Title: Ph.D. candidate at the University of Maryland
Area of Expertise: AI Security
LinkedIn: Gowthami Somepalli
X: @gowthami_s
Gowthami Somepalli is a Ph.D. candidate in Computer Science at the University of Maryland, College Park, specializing in machine learning with a focus on AI security and robustness. Gowthami's research encompasses adversarial machine learning, data poisoning, and the security of large language models (LLMs).
She has co-authored several impactful papers, including "Baseline Defenses for Adversarial Attacks Against Aligned Language Models," which evaluates defense strategies against adversarial attacks on LLMs. Another notable work is "What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning," focusing on enhancing model robustness against data poisoning attacks.
Why follow Gowthami Somepalli?
Gowthami's contributions are pivotal in understanding and mitigating vulnerabilities in AI systems. Her work on adversarial attacks and defenses provides valuable insights into securing AI models against malicious inputs and data manipulation.
By developing and evaluating defense mechanisms, she aids in enhancing the robustness and reliability of AI applications, making her a significant figure in the AI security landscape.
More from Gowthami Somepalli:
Gowthami Somepalli’s research on Google Scholar
Job Title: Research Scientist at Anthropic Research
Area of Expertise: AI Security
LinkedIn: Alex Tamkin
X: @AlexTamkin
Alex Tamkin is a Research Scientist at Anthropic, focusing on understanding and controlling large pretrained language models. He completed his Ph.D. in Computer Science at Stanford University, where he was advised by Noah Goodman and was an Open Philanthropy AI Fellow.
His research encompasses machine learning, natural language processing, and computer vision, with a particular emphasis on self-supervised learning and AI safety.
Why follow Alex Tamkin?
Alex has made significant contributions to AI security through his research on the capabilities, limitations, and societal impacts of large language models (LLMs). His work aims to enhance the interpretability and robustness of AI systems, addressing potential risks associated with their deployment. Notably, he co-authored the paper "Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models," which explores the implications of deploying LLMs in real-world applications.
More from Alex Tamkin:
Alex Tamkin’s research on Google Scholar
Job Title: Executive Director at the Center for AI Safety
Area of Expertise: AI Security
LinkedIn: Dan Hendrycks
Dan Hendrycks is the Executive Director of the Center for AI Safety (CAIS), a nonprofit organization he co-founded in 2022 dedicated to advancing the safety and reliability of artificial intelligence systems. Dan earned his Ph.D. in Computer Science from the University of California, Berkeley, where he was advised by Jacob Steinhardt and Dawn Song.
Throughout his career, Dan has made significant contributions to the field of machine learning, notably developing the Gaussian Error Linear Unit (GELU) activation function, which has become the default activation in many state-of-the-art models, including BERT, Vision Transformers, and GPT-3.
Why follow Dan Hendrycks?
Dan's work is pivotal in the realm of AI safety and ethics. At CAIS, he leads initiatives to identify and mitigate risks associated with advanced AI systems, focusing on ensuring that AI technologies are developed and deployed responsibly. His research encompasses machine learning safety, machine ethics, and robustness, addressing challenges such as out-of-distribution detection and distribution shift benchmarks.
In addition to his research, Dan has been recognized for his efforts in AI safety, being named on Forbes' 2025 30 Under 30 list in the AI category.
More from Dan Hendrycks:
An Overview of Catastrophic AI Risks
Dan Hendrycks’ research on Google Scholar
Introduction to AI Safety, Ethics, and Society
Job Title: CEO of Mindgard
Area of Expertise: Security of AI
LinkedIn: Dr. Peter Garraghan
X: @DrGarraghan
Dr. Peter Garraghan is a Professor in Distributed Systems at Lancaster University and the CEO/CTO and co-founder of Mindgard, a cybersecurity company specializing in AI security. At Lancaster University, Dr. Garraghan leads research in AI and machine learning security, focusing on the vulnerabilities of AI systems that traditional security tools cannot address.
His work has been featured in media outlets, including the BBC and the Daily Mail. In 2022, he co-founded Mindgard, a company dedicated to empowering enterprise security teams to deploy AI and generative AI securely. Mindgard's core product offers an automated platform for continuous security testing and red teaming of AI systems.
Why follow Dr. Peter Garraghan?
Dr. Garraghan's contributions to AI security are significant. Through Mindgard, he addresses the unique vulnerabilities associated with AI systems, emphasizing that while AI is still software subject to traditional cyber risks, the inherent complexity and unpredictability of neural networks demand entirely new approaches to security.
His work focuses on developing advanced technologies to combat growing threats against AI, ensuring that organizations can deploy AI and generative AI securely. Mindgard's solutions, born from years of rigorous research and development, provide continuous, automated red-teaming to detect and mitigate threats to AI systems that traditional security measures might miss.
More from Dr. Peter Garraghan:
Professor Peter Garraghan’s Research, Projects, and Publications at Lancaster University
Dr. Peter Garraghan’s research on Google Scholar
Professor Peter Garraghan’s profile at Lancaster University
Job Title: LLM Security at NVIDIA
Area of Expertise: LLM Security
LinkedIn: Leon Derczynski
Leon Derczynski is a Principal Research Scientist in Large Language Model (LLM) Security at NVIDIA and a Professor of Natural Language Processing (NLP) at the IT University of Copenhagen. He has authored over 100 NLP papers and is actively involved in leading bodies on LLM security, including serving on the OWASP LLM Top 10 core team and founding the ACL Special Interest Group on NLP Security.
Why follow Leon Derczynski?
Leon has made significant contributions to AI security, particularly in the evaluation and mitigation of vulnerabilities in large language models. He leads the development of "garak," a framework designed for security probing of LLMs, which systematically identifies potential weaknesses in these models.
His work emphasizes the importance of understanding and addressing the security risks associated with LLMs, advocating for a holistic approach to LLM security evaluation that prioritizes exploration and discovery of issues. For a deeper understanding of his work, you might find his publication on "garak: A Framework for Security Probing Large Language Models" insightful.
More from Leon Derczynski:
Leon Derczynski’s posts on the NVIDIA Developer blog
Leon Derczynski’s research on Google Scholar
Job Title: Security Engineer Turned Industry Analyst at Latio Tech
Area of Expertise: AI Security Advocacy
LinkedIn: James Berthoty
James Berthoty is a seasoned cybersecurity professional with over a decade of experience spanning engineering and security roles. He is the founder of Latio Tech, a platform dedicated to assisting organizations in identifying and implementing optimal security tools.
Why follow James Berthoty?
James has made significant contributions to the field of AI security, particularly in the areas of Application Security Posture Management (ASPM) and vulnerability management. He has shared insights on the evolution of AppSec, the challenges in managing software vulnerabilities, and the role of ASPM in today's API-driven cloud environment.
His work emphasizes the importance of reachability analysis in prioritizing vulnerabilities, helping organizations focus on exploitable risks. By integrating threat intelligence and network data, he advocates for a comprehensive approach to application security that is crucial in the context of AI systems.
More from James Berthoty:
Job Title: Founder and CEO of Unsupervised Learning
Area of Expertise: AI Security
LinkedIn: Daniel Miessler
YouTube: @unsupervised-learning
Daniel Miessler is a seasoned cybersecurity expert with over 24 years of experience, specializing in application security and risk management. Throughout his career, he has held significant roles at companies such as Robinhood, Apple, and the OWASP Foundation. In October 2022, he founded Unsupervised Learning, a company dedicated to developing solutions at the intersection of AI and security.
Why follow Daniel Miessler?
Daniel has been at the forefront of exploring the security implications of artificial intelligence. Through his platform, Unsupervised Learning, he delves into topics that merge AI with cybersecurity, providing insights into how AI can both enhance and challenge security paradigms.
His work emphasizes the importance of understanding AI's attack surfaces and developing robust defenses against potential threats. For those interested in AI security, Daniel's analyses and discussions offer valuable perspectives on the evolving landscape of AI-related threats and the measures needed to address them.
More from Daniel Miessler:
Job Title: AI Red Teamer at Microsoft
Area of Expertise: AI Red Teaming
LinkedIn: Gary L.
Gary D. Lopez Munoz is a Senior Security Researcher at Microsoft, specializing in AI security. He leads the AI Red Team's automation efforts, focusing on developing tools to identify and mitigate risks in generative AI systems.
Why follow Gary Lopez?
Gary's contributions to AI Security include:
More from Gary Lopez:
PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System
Gary Lopez’s research on Google Scholar
Gary Lopez’s research on Papers with Code
Job Title: Principal Research Manager for the Long-Term Ops and Research wing of the Microsoft AI Red Team
Area of Expertise: AI Red Teaming
LinkedIn: Amanda Minnich (AIRT)
X: @NMspinach
Dr. Amanda Minnich is a Senior AI Security Researcher at Microsoft, specializing in identifying and mitigating safety and security vulnerabilities in AI systems.
Why follow Amanda Minnich?
Amanda's contributions to AI Security include being a key member of the AI Red Team at Microsoft, where she rigorously tests foundational models and Copilots to uncover potential vulnerabilities.
Amanda has led workshops, such as "AI Red Teaming in Practice," to educate security professionals on systematically probing AI systems for weaknesses. Dr. Minnich has shared her expertise at conferences like Black Hat USA 2024, discussing AI safety and red teaming methodologies. She has co-authored papers focusing on AI security, including contributions to the development of tools like PyRIT for AI red teaming.
More from Amanda Minnich:
Amanda Minnich’s research on Google Scholar
Job Title: Responsible AI Engineer - AI Red Team Tooling at Microsoft
Area of Expertise: AI Red Teaming
LinkedIn: Roman Lutz
Roman Lutz is a Responsible AI Engineer at Microsoft, specializing in AI safety and security. Roman is a member of Microsoft's AI Red Team, focusing on identifying vulnerabilities in generative AI systems. His work involves developing tools and methodologies to assess and enhance the security of AI models.
Why follow Roman Lutz?
Roman is a key contributor to PyRIT (Python Risk Identification Toolkit), an open-source framework designed to facilitate red teaming efforts in generative AI systems. PyRIT enables security professionals to probe for and identify potential risks and vulnerabilities in AI models.
Roman is also a maintainer of the Fairlearn project, a toolkit developed to assess and improve the fairness of AI systems. This work contributes to ensuring that AI models operate equitably across diverse user groups.
More from Roman Lutz:
Roman Lutz’s research on Google Scholar
Job Title: Software Engineer at Microsoft
Area of Expertise: AI Red Teaming
LinkedIn: Nina C.
Nina Chikanov is an Offensive Security Engineer at Microsoft, specializing in AI security and red teaming for generative AI systems.
Why follow Nina Chikanov?
Nina's contributions to AI Security include being a co-author of the PyRIT (Python Risk Identification Toolkit) framework, an open-source tool designed to enhance red teaming efforts in generative AI systems.
PyRIT enables security professionals to identify novel harms, risks, and vulnerabilities in multimodal generative AI models. Nina has also shared her expertise at conferences, such as the 2024 New York Summit, where she discussed the importance of red teaming in generative AI.
More from Nina Chikanov:
Announcing Microsoft’s open automation framework to red team generative AI Systems
Lessons From Red Teaming 100 Generative AI Products
Job Title: AI Hacker
Area of Expertise: AI Red Teaming
Pliny the Liberator is an influential figure in the AI red teaming community, known for his expertise in prompt engineering and AI safety. He describes himself as a "latent space liberator" and "1337 AI red teamer," highlighting his focus on exploring and mitigating AI vulnerabilities.
Why follow Pliny the Liberator?
One of Pliny the Liberator's notable contributions is the development of "L1B3RT4S," a collection of prompts designed to test and bypass AI system restrictions, thereby identifying potential weaknesses.
Additionally, Pliny the Liberator has created "GODMOD3," a chat interface that utilizes built-in jailbreak prompts to circumvent standard AI guardrails, providing a platform for liberated state-of-the-art models.
Pliny the Liberator's work emphasizes the importance of understanding and addressing the limitations and vulnerabilities of AI systems, contributing significantly to the field of AI red teaming.
More from Pliny the Liberator:
Pliny the Liberator’s threads on Threadreader
As the field of artificial intelligence continues to evolve, so too do the challenges of ensuring its security and reliability. The experts highlighted in this article—ranging from AI red teaming specialists and security researchers to policy advocates and tool developers—are at the forefront of addressing these challenges.
Their work in identifying vulnerabilities, developing defensive strategies, and creating frameworks for ethical AI deployment is essential to building a safer AI ecosystem. However, as AI systems grow more complex and integrated into critical applications, organizations need scalable, automated solutions to proactively secure their AI technologies.
This is where Mindgard steps in. As a leader in AI security, Mindgard provides an innovative platform designed to address the unique vulnerabilities of AI systems. Founded by Dr. Peter Garraghan, a renowned expert in distributed systems and AI security, Mindgard empowers organizations to deploy AI and generative AI securely.
Our automated red teaming and continuous security testing capabilities enable enterprises to identify and mitigate risks that traditional cybersecurity tools often miss. Book a demo today to discover how Mindgard can help you ensure your AI systems are robust, resilient, and aligned with security best practices.
AI security focuses on protecting artificial intelligence systems from vulnerabilities, adversarial attacks, and misuse. As AI becomes more integrated into critical applications, ensuring its security is essential to prevent risks such as data breaches, model manipulation, and unintended harmful outputs.
AI red teaming involves simulating adversarial attacks on AI systems to identify vulnerabilities and weaknesses. It is a proactive approach to stress-testing AI models, ensuring they are robust and secure against real-world threats.
Common threats include:
Industries heavily reliant on AI, such as healthcare, finance, autonomous vehicles, and cybersecurity, are particularly vulnerable. These sectors face significant risks from AI-related threats, including data breaches, regulatory non-compliance, and operational disruptions.
Organizations can begin by: