Special thanks to Matthew R. Hemmersmeier for his contributions to this article.
Client Alert
Joseph E. Martineau, Bridget Hoy, Kirk A. Damman, Michael J. Hickey, John B. Greenberg, Benjamin J. Siders
As AI evolves, the contours between it and copyright law evolve, too. Two recent decisions in California suggest that using a multitude of copyrighted materials for training purposes will not infringe copyright, provided the resulting output is not itself infringing.
On March 21, 2025, we published an alert discussing the United States Court of Appeals for the District of Columbia Circuit’s decision in Thaler v. Perlmutter, which affirmed the U.S. Copyright Office’s denial of copyright registration for a work autonomously generated by artificial intelligence (“AI”) without human involvement. Last week, different judges in the United States District Court for the Northern District of California issued two rulings addressing whether using copyrighted works to train generative AI constitutes fair use under Section 107 of the Copyright Act. The decisions reached similar results and show how fair use doctrine is developing as it relates to generative AI. Both lend some support to a fair use defense where copyrighted works are used to train generative AI models without generating infringing copies. Still, as one of the decisions noted: “Fair use is a fact-specific doctrine that requires case-by-case analysis that is sensitive to new technologies and their potential consequences.” Kadrey v. Meta Platforms, Inc., 2025 WL 1752484, at *22 (June 25, 2025).
In Bartz v. Anthropic, 2025 WL 1741691 (June 23, 2025), three authors sued the AI software firm Anthropic, alleging copyright infringement based on Anthropic’s training of its AI service, Claude, on books it allegedly pirated from the internet and on millions of printed books it allegedly purchased, scanned, digitized, and stored without the authors’ consent in order to create a large language model (“LLM”). While many of the books were used for this training purpose, some, referred to as “library copies,” were retained for other possible uses rather than to train Claude.
According to the allegations in the lawsuit, a user of Claude could input text, and Claude would output text arguably as eloquent and organized as that of the authors. But while users of Claude could employ it to create works as well written as the authors’ own, no infringing copy of the authors’ works would result. In short, Claude acted as a robo-editor.
Judge William Alsup found the use of the copyrighted works to train Claude was “quintessentially transformative.” For that reason, he ruled the first fair use factor (the purpose and character of the use) supported fair use for both purchased and pirated copies of the works used to train Claude, but not for pirated works retained solely as library copies. As for the second fair use factor (the nature of the copyrighted work), Judge Alsup found the authors’ works sufficiently unique and creative to point against fair use. Even though the underlying works used for training Claude were used in their entirety, Judge Alsup found that because there was no “traceable connection” to the authors’ works and because a “monumental” volume was “reasonably necessary” for Claude to perform the function for which it was designed, the third fair use factor (the amount and substantiality of the portion used) favored fair use, except as to pirated library copies that were retained but not used for training. As for the final factor (the effect on the market for the copyrighted works), Judge Alsup found that even though Claude could create an “explosion of works” competing with the authors’ works, copyright was designed “to advance original works of authorship, not to protect authors against competition.” For that reason, except as to the pirated works, this factor either favored fair use or was neutral.
Accordingly, after considering all the fair use factors, the court granted partial summary judgment in favor of Anthropic, holding that its digitization and use of the plaintiffs’ books to train Claude was a fair use. Conversely, the court held that using pirated copies of the plaintiffs’ works to create a central library was not fair use, and it denied summary judgment on that issue.
In Kadrey v. Meta Platforms, Inc., Meta trained its generative AI platform, Llama, using data from books allegedly downloaded from online repositories without permission. Thirteen authors of these allegedly pirated books sued. Just two days after Judge Alsup’s decision in the Anthropic case, Judge Vince Chhabria held that Meta’s use of the works was “highly transformative,” satisfying the first fair use factor. Additionally, Judge Chhabria determined that the plaintiffs presented no meaningful evidence of market dilution that would create a genuine issue of material fact as to the fourth fair use factor (the effect of the use on the market for the original work). As in the Anthropic case, Judge Chhabria ruled the second fair use factor favored the authors, but the third fair use factor favored Meta because using the authors’ works in their entirety “was reasonable given its relationship to Meta’s transformative purpose.” Finally, like Judge Alsup, Judge Chhabria held that Meta’s use of copyrighted works for training Llama constituted fair use even if the works were pirated. However, unlike Judge Alsup, he did find that the potential for an “explosion of competing works” resulting from a trained AI model is an important consideration for the fourth fair use factor. Still, the court ruled “because Meta’s use of the works of these thirteen authors is highly transformative, the plaintiffs needed to win decisively on the fourth factor to win on fair use.” The court found that they failed, and for that reason, they lost.
Importantly, the Meta decision was based solely on the record presented by these thirteen authors and is not a sweeping declaration that Meta’s actions represent a fair use as to other possible plaintiffs under different facts or evidence. In fact, Judge Chhabria recognized “[i]n cases involving uses like Meta’s, it seems like the plaintiffs will often win, at least where those cases have better-developed records on the market effects of the defendant’s use.”
These cases are two of the earliest rulings on whether using copyrighted works for training generative AI constitutes a fair use. Both suggest that the highly transformative nature of AI-trained models can support a significant fair use defense where the resulting product is not substantially similar or traceable to the copyrighted works used in AI training and where there is little evidence of harm to the market for the copyrighted work. Although these cases show there is a judicial appetite for permitting the fair use defense in the generative AI context, we expect to see additional opinions in other relevant cases in the near future, and we will continue to follow how the legal and factual questions presented in Anthropic and Meta evolve.
If you have questions about how these decisions affect your copyrighted content or the implications of using copyrighted works to train generative AI platforms, please contact the authors or any of our copyright or data protection and AI lawyers for further guidance.