Nightshade Data Poisoning: What It Is and How It Protects Artists from AI

Nightshade is an open-source tool that protects artists from unauthorized AI training by poisoning their images before they are scraped. Models trained on the poisoned images learn incorrect associations, which makes unlicensed training datasets less reliable.

Key Takeaways

  • Nightshade uses data poisoning to protect artists and to discourage AI developers from training models on scraped copies of their images.
  • Nightshade adds visually undetectable alterations to images that cause machine learning models to make incorrect associations. This decreases the accuracy of any model trained on scraped copies of these images.

Each piece of art uploaded to the internet can be scraped and used to train an AI. Approximately 70% of training data used for large AI models was scraped. Artists have noticed. A 2024 survey conducted by the Design and Artists Copyright Society (DACS) showed that 74% of professional artists in the UK were very concerned about their art being used to train AI without permission. 89% of artists believe current safeguards and regulation around AI are ineffective for combating this problem. 

If data is online, generative AI (GenAI) models can find it and train on it. That includes both text and artwork or photography. The problem is that most GenAI models train on this data without consent, and that can hurt the livelihoods of the artists who post their work online. While opt-out mechanisms exist, they’re difficult to verify and even harder to enforce in practice.

Billions of dollars hang in the balance. Sales of AI-generated art were projected at $5.3 billion in 2025 and are expected to grow to $40.4 billion by 2033. Those sales will be driven by work that artists never agreed to license. Artists have started pushing back in court: in 2023, groups of visual artists filed a class action lawsuit against major companies that power image generators, claiming generative AI was “created to facilitate infringement by design.” In August 2024, a judge ruled that artists could move forward with copyright infringement claims against Stability AI, Midjourney, DeviantArt and Runway. In Congress, the TRAIN Act, bipartisan legislation to help musicians, artists and writers take bad actors to court when their copyrighted works are used to train AI systems, was reintroduced in July 2025.

Litigation and legislation take time. For artists who want to act today, Nightshade offers protection that doesn’t depend on the courts.

What is Nightshade?

Nightshade is an open-source data poisoning tool released by the University of Chicago's SAND Lab that gives artists a practical way to protect themselves against unauthorized AI training. Instead of trying to prevent AI companies from scraping artwork outright, Nightshade poisons the dataset itself.

Nightshade was downloaded over 250,000 times in the first five days after its launch.

How Does Nightshade Data Poisoning Work?

Image: Photographer transferring image files from a memory card to a laptop. Photo by Samsung Memory from Unsplash.

Nightshade is what researchers call a data poisoning tool. It makes small changes to an image’s pixels that are imperceptible to humans but change how a machine learning model interprets the image.

Once a poisoned image is used to train an AI model, the model learns improper associations. Ben Zhao, a computer science professor at the University of Chicago and leader of the project, gave TechCrunch the example of a painting of a cow in a meadow: “By manipulating and effectively distorting that association, you can make the models think that cows have four round wheels and a bumper and a trunk. And when they are prompted to produce a cow, they will produce a large Ford truck instead of a cow.”
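
The idea of pulling an image toward a different concept while keeping the visible change small can be sketched in a few lines. The example below is a toy illustration and not Nightshade's actual algorithm: it assumes a made-up linear feature extractor (WEIGHTS), four-pixel "images," and a fixed per-pixel budget (epsilon) in place of any perceptual measure. Every name in it is hypothetical.

```python
# Toy sketch only: a made-up linear "feature extractor" stands in for the
# image encoder of a real text-to-image model. Not Nightshade's real code.
WEIGHTS = [
    [0.5, 0.5, 0.0, 0.0],  # feature 0 averages pixels 0 and 1
    [0.0, 0.0, 0.5, 0.5],  # feature 1 averages pixels 2 and 3
]


def features(pixels, weights=WEIGHTS):
    """Map a flat list of pixel values to a short feature vector."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def poison(pixels, target_features, epsilon=4.0, steps=50, lr=0.5, weights=WEIGHTS):
    """Projected gradient descent: pull the image's features toward
    target_features while keeping each pixel within +/- epsilon of its
    original value and inside the valid 0-255 range."""
    poisoned = [float(p) for p in pixels]
    for _ in range(steps):
        current = features(poisoned, weights)
        for i in range(len(poisoned)):
            # Gradient of the squared feature distance with respect to pixel i.
            grad = sum(
                2.0 * (c - t) * row[i]
                for c, t, row in zip(current, target_features, weights)
            )
            moved = poisoned[i] - lr * grad
            low = max(pixels[i] - epsilon, 0.0)
            high = min(pixels[i] + epsilon, 255.0)
            poisoned[i] = min(max(moved, low), high)  # project back into budget
    return poisoned


cow = [100, 120, 90, 110]    # hypothetical "cow" image
truck = [30, 40, 200, 220]   # hypothetical "truck" image
shaded = poison(cow, features(truck))
```

In this toy, no pixel in shaded moves by more than 4 out of 255, yet its feature vector ends up closer to the truck's than the original cow's was. A real tool would have to work against an actual image encoder and a perceptual notion of visibility rather than a fixed per-pixel bound.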

What sets Nightshade apart is how robust these changes are:

  • They persist through resizing, cropping, and compression.
  • Screenshots of the image still carry the poisoned signal.
  • Photos of the image displayed on a screen also carry the signal.

This isn’t a watermark that can be stripped out with software. It works at the data level, which makes Nightshade hard to detect. That is great for artists, but it adds another layer of risk for model developers who use scraped datasets.

The Benefits of Nightshade Data Poisoning

Image: Photographer reviewing images on a smartphone beside a camera and laptop. Photo by Plann from Unsplash.

Data poisoning is usually a deliberate attack on an AI model. Nightshade also poisons data deliberately, but its purpose is to defend artists, not to steal information or manipulate a model into generating harmful outputs.

“Nightshade itself is not meant as an end-all, extremely powerful weapon to kill these companies,” says Zhao. “Nightshade shows that these models are vulnerable and there are ways to attack. What it means is that there are ways for content owners to provide harder returns than writing Congress or complaining via email or social media.”

Nightshade gives artists more leverage by:

  • Driving up the price of training with scraped data: This is Nightshade's objective. Nightshade disincentivizes developers from scraping without permission by threatening their datasets. It increases the expense, decreases the certainty, and makes unauthorized training unappealing compared to licensing the content.
  • Coordinated resistance: Most opt-out tools rely on artists working alone. Nightshade flips the script. Its effect multiplies when artists join forces en masse.
  • Protecting original art: Nightshade doesn’t change the presentation of an image to human viewers in any significant way. Fans admiring an artist's work will see nothing strange. This makes it more practical than easily noticeable overlays or watermarks, which can interfere with presentation or be removed.

“I am not fundamentally opposed to Generative AI. But AI needs to be fair, and ethical for everybody—and not only for the companies that make AI products,” said concept artist Karla Ortiz, who testified before the U.S. Senate in 2023. “AI needs to be fair to the customers who use these products, and also for creative people like me who make the raw material that these AI materials depend upon.”

The Implications of Nightshade Data Poisoning

Nightshade isn’t only a concern for artists and model trainers. It also speaks to larger questions about data privacy and information trustworthiness. If you work with AI models, Nightshade should make you think carefully about:

  • Data scraping: To anyone scraping data to use in AI models, Nightshade should be a wake-up call that “publicly available” does not equal “safe to train on.” Poisoned datasets can make model behavior unpredictable and difficult to debug.
  • Pipeline governance: If Nightshade can impact your model, more virulent strains of data poisoning can too. Protecting your model requires rigorous governance of your entire pipeline.
  • Validation and testing: Nightshade is a real-world demonstration of why you should consider all data untrusted until proven otherwise. Security teams need tools that can stress-test AI models against adversarial threats. That includes both poisoned inputs and aberrant model behavior. Mindgard’s red teaming and adversarial testing surfaces those risks before they become problems in production.
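
One way to act on "untrusted until proven otherwise" is to triage samples before they reach training. The sketch below is a minimal illustration under stated assumptions: the Sample fields, the TRUSTED_LICENSES policy, and predicted_label (imagined as the output of some independent image classifier) are all hypothetical. A simple check like this is not guaranteed to catch Nightshade-style perturbations, which are designed to be hard to detect.

```python
from dataclasses import dataclass

TRUSTED_LICENSES = frozenset({"licensed", "cc0"})  # hypothetical policy


@dataclass(frozen=True)
class Sample:
    url: str
    caption: str
    license: str          # provenance tag recorded at ingest time
    predicted_label: str  # label from a hypothetical independent classifier


def triage(samples, trusted_licenses=TRUSTED_LICENSES):
    """Split samples into accepted ones and (sample, reasons) quarantine entries."""
    accepted, quarantined = [], []
    for sample in samples:
        reasons = []
        if sample.license not in trusted_licenses:
            reasons.append("unverified license")
        # Flag images whose caption does not mention what the classifier sees.
        if sample.predicted_label.lower() not in sample.caption.lower():
            reasons.append("caption/label mismatch")
        if reasons:
            quarantined.append((sample, reasons))
        else:
            accepted.append(sample)
    return accepted, quarantined


batch = [
    Sample("https://example.com/a.png", "A cow in a meadow", "licensed", "cow"),
    Sample("https://example.com/b.png", "A cow in a meadow", "unknown", "truck"),
]
accepted, quarantined = triage(batch)
```

Quarantining with recorded reasons, rather than silently dropping samples, keeps an audit trail that can be reviewed when model behavior later looks wrong.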

Long-term, Nightshade might lead to cleaner, more responsibly sourced data for AI training. Scraped data will always be a wild card when it comes to model trustworthiness. If that becomes too risky for AI companies, licensed data is going to look like a much cheaper and safer alternative. That’s great for artists, but it is a significant shift that teams should be prepared for.

Training Data Is an Attack Surface

Nightshade shifts the power relationship between creators and AI developers. Rather than depending on hard-to-enforce opt-outs, it imposes real costs on those who ignore consent. Nightshade uses data poisoning defensively, but the technique it demonstrates matters to anyone who trains on third-party data. If you incorporate third-party software or data into your tools, you should understand what that means for your data integrity and consent practices.

If Nightshade can surreptitiously manipulate models this way, you might be wondering: how can you trust the data your models depend on? Mindgard’s AI security platform helps you gain confidence by benchmarking your models against real-world adversarial threats such as data poisoning before they affect your production systems.

Challenge your AI defenses. Start red teaming with Mindgard today.

Frequently Asked Questions

How does Nightshade protect artists?

Nightshade is an open-source application that lets artists make nearly undetectable pixel-level changes to their images to "poison" AI training. If those images are scraped and fed into an AI training system, the poisoned data will trip up models during training, making it far less valuable as training data.

Nightshade attacks the correlations made between a text prompt and an image. Train enough poisoned examples into a model, and it'll start creating images that have nothing to do with the prompts given, severely diminishing the usefulness of that model and giving AI companies a reason to avoid scraping artists' work without permission.
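
A toy model can show why the share of poisoned examples matters. The sketch below is an assumption-heavy illustration, not how a real image generator learns: each training image is reduced to a single number (0.0 if it looks like a cow to the model, 1.0 if it looks like a truck), and "training" is just averaging the values seen with each caption. All function names are hypothetical.

```python
CONCEPTS = {"cow": 0.0, "truck": 1.0}  # what each concept really looks like


def learn_prototypes(pairs):
    """'Train' by averaging the feature values seen with each caption."""
    totals, counts = {}, {}
    for caption, feature in pairs:
        totals[caption] = totals.get(caption, 0.0) + feature
        counts[caption] = counts.get(caption, 0) + 1
    return {caption: totals[caption] / counts[caption] for caption in totals}


def generate(prompt, prototypes, concepts=CONCEPTS):
    """Return the concept closest to what the toy model learned for the prompt."""
    learned = prototypes[prompt]
    return min(concepts, key=lambda name: abs(concepts[name] - learned))


def train_and_prompt(clean_cows, poisoned_cows):
    """Train on cow captions (some poisoned to look like trucks), then ask for a cow."""
    pairs = (
        [("cow", 0.0)] * clean_cows        # honest cow images
        + [("cow", 1.0)] * poisoned_cows   # captioned "cow", truck-like to the model
        + [("truck", 1.0)] * 10            # honest truck images
    )
    return generate("cow", learn_prototypes(pairs))


mostly_clean = train_and_prompt(clean_cows=90, poisoned_cows=10)
mostly_poisoned = train_and_prompt(clean_cows=40, poisoned_cows=60)
```

With mostly clean data the toy model still returns a cow; once poisoned examples dominate the "cow" caption, the same prompt returns a truck. Real text-to-image models learn far richer associations, so the threshold in this toy says nothing about how many poisoned images a real model would need to see.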

Is Nightshade visible to people viewing an image?

Nightshade alters an image in ways that are virtually undetectable by the human eye but are very distinct to artificial intelligence platforms analyzing them. This allows it to protect artists' intellectual property while minimizing changes to the aesthetics of the image that people can see. 

Will Nightshade impact AI models that have already been trained?

No. Nightshade poisons training data, so it can only affect models that are trained on the poisoned images. Models that finished training before the poisoned images were ingested won’t be impacted. If a model has been trained on poisoned images, it would probably have to be retrained without the corrupted images.

Can AI companies detect and remove Nightshade poisoning?

Researchers note that Nightshade is hard to defend against because the AI model's developers would have to identify every image containing poisoned pixels (which aren't meant to be detectable by the human eye, or even by data scraping software) and remove it from the training data. Doing this at scale would be incredibly difficult, as every poisoned image would have to be found and removed from a company's data pool.

Is Nightshade considered a cyberattack?

Nightshade could be weaponized to harm image-generating AI models. Attackers with agendas beyond protecting their work could cause serious harm with it. In the academic community, Nightshade is considered a type of "data poisoning attack," an attack vector that targets an ML model's training data to induce unintended behavior in the trained model. The group that developed Nightshade describes it not as an attack, but as a way for artists to defend their work.

Can generative AI models that don’t generate images be poisoned?

Data poisoning can also target generative AI models outside of image generation. Retrieval-augmented generation (RAG), an increasingly popular method for improving LLM performance, pulls data from numerous sources, which makes RAG systems potential targets for poisoning as well. LLMs also face the risk of model collapse as AI-generated content is fed back into their training datasets. While the researchers behind Nightshade state that data poisoning concepts could theoretically be applied to text-based and multimodal models, Nightshade was built with text-to-image generation systems in mind.