AI, Fair Use, and Piracy: What Bartz v. Anthropic Means for Pakistan’s Copyright Future

The rise of generative AI has unlocked new frontiers in how machines read, write, and reason, but what happens when this intelligence is built on a foundation of pirated books? In June 2025, a U.S. federal court issued a landmark ruling in Bartz v. Anthropic, a case where bestselling authors sued AI firm Anthropic for copying their copyrighted works without permission to train its large language model, Claude. While the court ruled that using books to train AI can amount to “fair use” under American law, it drew a sharp line when it came to the use of pirated material. This case sets a powerful precedent and opens a different kind of conversation for countries like Pakistan, where copyright enforcement remains weak and fair use exceptions are narrowly defined. How prepared are we to protect our authors, our data, and our narratives in a world where AI is learning from everything, including what it was never allowed to touch? This has indeed become a million-dollar question.

The Case in Focus: Bartz v. Anthropic

In Bartz v. Anthropic, three U.S.-based authors, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, sued the AI company Anthropic PBC in the Northern District of California for copyright infringement. The claim? That Anthropic had illegally copied their books, not once, but in several stages, to train its generative AI tool, Claude. (Bartz v. Anthropic PBC, No. 3:24-cv-05417-WHA, at 2–3 (N.D. Cal. June 23, 2025))

The court found that Anthropic had downloaded millions of pirated books from illegal online libraries like Books3, LibGen, and Pirate Library Mirror, and later also purchased physical books, scanned them, and added them to a central digital library (id. at 3–4). From this internal repository, Anthropic selected certain books (including the plaintiffs’) to train successive versions of Claude.

In a detailed ruling, the court observed that using books to train an AI was “transformative” and counted as fair use under U.S. copyright law (id. at 10). This means there was no copyright violation on Anthropic’s part for using the plaintiffs’ texts to train its AI, especially because the output never replicated the original works. However, the court also held that retaining pirated books in a permanent digital library was not fair use, even if those copies weren’t used directly in training. That part of the copying remained unlawful (id. at 12–13).

In short, the court tried to strike a fair balance: it allowed AI training as fair use because the technology put the books to a new and different purpose, but it didn’t give tech companies a free pass to hoard pirated books in the name of research.

Because this is the first major ruling on how copyright law applies to AI, and because so much money and legal uncertainty are involved, there’s a good chance this decision will be challenged in a higher court. If that happens, this case could become a key precedent in shaping how laws around creativity and artificial intelligence develop in the future.

What Is Fair Use and Why Did the Court Accept AI Training?

At the heart of Bartz v. Anthropic lies a single legal question: When is copying someone’s work without permission allowed? In the U.S., the answer comes from the fair use doctrine, found in Section 107 of the Copyright Act. Courts evaluate four factors:

  1. The purpose and character of the use,
  2. The nature of the copyrighted work,
  3. The amount and substantiality of the portion used, and
  4. The effect on the potential market for the original work.

Among these, the court emphasized the first, particularly whether the use was “transformative.”

A use is considered transformative when it adds something new, serves a different purpose, or alters the original work’s function in a meaningful way. This idea of transformation has recently been re-examined in cases like Warhol v. Goldsmith, where the U.S. Supreme Court emphasized that not all new uses automatically count as fair use; context and purpose still matter. In Bartz v. Anthropic, the judge compared AI training to how a human reads books, not to copy them, but to learn, absorb, and later create something new. The court held that Claude’s training process, mapping trillions of word relationships from countless sources, was more like “learning” than duplicating. It called the AI’s use of text “spectacularly transformative.”

Importantly, the authors never alleged that Claude reproduced their books, quoted from them, or returned any outputs that resembled them. So while Anthropic’s models may have learned from the authors’ style and structure, the outputs did not compete with or substitute for the original books. Just like in Authors Guild v. Google, the court noted that full-book ingestion could still be considered fair use when the end product does not harm the market for the original. This led the court to weigh factors 3 and 4 (amount and market harm) in Anthropic’s favor, reasoning that complete ingestion was justified and did not erode the market for the original works.

However, that generosity ended where piracy began. The court held that downloading pirated books to build a general-purpose internal library was not transformative, regardless of whether those copies were later used for training. Anthropic’s internal documents showed a desire to “store everything forever,” a rationale the court rejected as unrelated to fair use.

Pakistan’s Problem: Are We Ready for This?

Here’s where things get tricky for Pakistan. While American courts are figuring out how to balance AI innovation with author rights, we’re still working with a 1962 copyright law that barely understands the internet, let alone artificial intelligence (Hadi & Butt, 2025).

Pakistan’s copyright law has something called “fair dealing,” which sounds like “fair use” but is actually much more limited. It only covers things like personal use, research, or criticism. It definitely wasn’t written with AI companies in mind.

This creates two big problems:

Problem 1: Pakistani creators are sitting ducks. If your Urdu novel, your blog post, or your research paper gets scraped from the internet and used to train some Silicon Valley AI, good luck doing anything about it. You probably won’t even know it happened, and even if you did, try taking on a billion-dollar tech company from Karachi.

Problem 2: We have no rules for local AI development. If someone in Pakistan wants to build an AI system, there are no clear guidelines about what they can or can’t use for training. This could lead to the same problems, but with even less oversight.

There’s also a bigger issue lurking here. As global tech companies start making licensing deals with major publishers for AI training, countries like Pakistan risk getting left out entirely. We could end up being just a source of data: our stories, our languages, our knowledge extracted and used, without being stakeholders in the conversation.

What Other Countries Are Doing

Pakistan isn’t alone in wrestling with these questions, but we’re definitely behind the curve.

The European Union has created rules that allow text and data mining for AI under certain conditions, but they also give creators the right to opt out. The UK initially wanted to create broad exemptions for AI training, but backed down after pushback from authors and publishers.

The point is that the global rules are still being written. But if Pakistan doesn’t get involved in writing them, we’ll end up having to live by whatever others decide.

Time to Get Our Act Together

The Bartz v. Anthropic case isn’t just about one American court decision; it’s about the future of understanding the intersection of AI and creativity, and that future is being decided right now. Pakistani lawmakers need to start thinking about some hard questions:

  1. Should we expand our fair-dealing rules to cover AI training? Maybe, but with what protections for local creators?
  2. Should AI companies operating in Pakistan have to tell us what data they’re using? Transparency seems pretty basic, but it would be a big change.
  3. Should there be requirements for consent or payment when using Pakistani works for AI training? This could protect creators but might also slow down innovation.
  4. Who’s going to oversee all this? We need some kind of regulatory body that actually understands both technology and creative rights.

The window for getting ahead of this issue won’t stay open forever. Right now, we can learn from other countries’ mistakes and successes. We can craft laws that make sense for Pakistan’s unique situation: our languages, our creative traditions, our economic realities.


References

Bartz v. Anthropic PBC, No. 3:24-cv-05417-WHA (N.D. Cal. June 23, 2025), United States District Court for the Northern District of California.

U.S. Copyright Act, 17 U.S.C. § 107 (Fair Use Provision).

Authors Guild v. Google, 804 F.3d 202 (2d Cir. 2015).

Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508 (2023).

Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994).

Copyright Ordinance, 1962 (Pakistan).

Shahir Hadi & Talha Ali Butt, A Critical Analysis of Copyright Laws for Regulating the Legal Framework for Artificial Intelligence, 6(1) QJSSH (Winter 2025). DOI: https://doi.org/10.55737/qjssh.vi-i.25334

European Parliament and Council Directive (EU) 2019/790 on Copyright and Related Rights in the Digital Single Market.

UK Government Policy Statement on Artificial Intelligence and Intellectual Property, UKIPO (2023).

Books3, LibGen, and Pirate Library Mirror – pirate datasets used in LLM training, as referenced in the court filings.

Author: Ayesha Youssuf Abbasi

The author, Ayesha Youssuf Abbasi, is a distinguished lawyer and legal researcher currently pursuing a fully funded Ph.D. in Public International Law under the Chinese Government Scholarship (CSC) at Zhongnan University of Economics and Law in China. Hailing from Islamabad, she has actively contributed to the legal community as a member of the Islamabad Bar Council. Her academic career includes teaching positions at Bahria University Islamabad, International Islamic University Islamabad, and Rawalpindi Law College.

The author’s research interests are broad yet deeply interconnected, spanning Artificial Intelligence and Law, International Humanitarian Law (IHL), Public International Law, Human Rights, International Environmental Law (IEL), and the rights of women and children. She adopts a multidisciplinary approach to these subjects, contributing to critical global discussions on governance, legal innovation, and social justice.
