Anthropic did not breach copyright when training AI on books without permission, court rules

TruthLens AI Suggested Headline:

"Court Rules Anthropic's AI Training Constitutes Fair Use, but Finds Copyright Infringement in Pirated Book Storage"

AI Analysis Average Score: 8.9
These scores (0-10 scale) are generated by TruthLens AI's analysis, assessing the article's objectivity, accuracy, and transparency. Higher scores indicate better alignment with journalistic standards.

TruthLens AI Summary

A federal judge in San Francisco has determined that Anthropic, a tech company specializing in artificial intelligence, did not breach copyright laws when it utilized books by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson for training its Claude large language model (LLM). Judge William Alsup ruled that the company's actions constituted 'fair use', a legal doctrine that allows limited use of copyrighted material without permission. He likened Anthropic's use of these texts to a writer drawing inspiration from existing works to create something new, rather than copying or replacing the original content. However, the judge also found that Anthropic's storage of over 7 million pirated books in a central library violated copyright laws, necessitating a separate trial in December to determine the extent of damages owed to the authors for this infringement. Alsup emphasized that Anthropic's later purchase of physical copies of these books does not absolve the company of liability for the initial unauthorized use of the texts.

This ruling comes amid broader tensions between AI companies and the publishing industry regarding copyright issues. As generative AI models, like Claude, require vast amounts of data for training, they often rely on copyrighted works, raising questions about the legality of such practices. Anthropic defended its actions by stating that its AI training aligns with the purpose of copyright law, which aims to foster creativity and innovation. Legal experts noted the significance of Judge Alsup's decision, as it sets a precedent for numerous ongoing copyright cases involving AI technologies. The outcome of this case may influence how courts interpret fair use in the context of AI training and could eventually reach the US Supreme Court for a definitive ruling. Meanwhile, the implications of this ruling are limited in the UK, where the broad US-style fair use doctrine is not recognized and the legal landscape surrounding copyright and AI remains contentious.


Unanalyzed Article Content

A US judge has ruled that a tech company’s use of books to train its artificial intelligence system – without permission of the authors – did not breach copyright law.

A federal judge in San Francisco said Anthropic made “fair use” of books by writers Andrea Bartz, Charles Graeber and Kirk Wallace Johnson to train its Claude large language model (LLM).

Judge William Alsup compared the Anthropic model’s use of books to a “reader aspiring to be a writer” who uses works “not to race ahead and replicate or supplant them” but to “turn a hard corner and create something different”.

Alsup added, however, that Anthropic’s copying and storage of more than 7m pirated books in a central library infringed the authors’ copyrights and was not fair use – although the company later bought “millions” of print books as well. The judge has ordered a trial in December to determine how much Anthropic owes for the infringement.

“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages,” Alsup wrote.

US copyright law says that wilful copyright infringement can result in damages of up to $150,000 (£110,000) per work.

The copyright issue has pitted AI firms against publishers and the creative industries because generative AI models – the term for technology that underpins powerful tools such as the ChatGPT chatbot – have to be trained on a vast amount of publicly available data in order to generate their responses. Much of that data has included copyright-protected works.

An Anthropic spokesperson said the company was pleased the court recognised its AI training was transformative and “consistent with copyright’s purpose in enabling creativity and fostering scientific progress”.

John Strand, a copyright lawyer at the US law firm Wolf Greenfield, said the decision from a “well-respected” judge was “very significant”.

He added: “There are dozens of other cases involving similar questions of copyright infringement and fair use pending throughout the US, and Judge Alsup’s decision here will be something those other courts must consider in their own case.”

Due to the number of other AI copyright cases working their way through the legal system, Strand said: “The expectation is that at some point the primary question of whether training LLMs on copyrighted materials is fair use likely will be addressed by the US supreme court.”

The writers filed the proposed class action against Anthropic last year, arguing the company, which is backed by Amazon and Alphabet, used pirated versions of their books without permission or compensation to teach Claude to respond to human prompts.

The proposed class action is one of several lawsuits brought by authors, news outlets and other copyright owners against companies including OpenAI, Microsoft and Meta Platforms over their AI training.

The doctrine of fair use allows the use of copyrighted works without the copyright owner’s permission in some circumstances. Fair use is a key legal defence for the tech companies, and Alsup’s decision is the first to address it in the context of generative AI.

AI companies argue their systems make fair use of copyrighted material to create new, transformative content, and that being forced to pay copyright holders for their work could hamstring the nascent industry. Anthropic told the court that it made fair use of the books and that US copyright law “not only allows, but encourages” its AI training because it promotes human creativity.

The company said its system copied the books to “study plaintiffs’ writing, extract uncopyrightable information from it, and use what it learned to create revolutionary technology”.

Giles Parsons, a partner at UK law firm Browne Jacobson, said the ruling would have no impact in the UK, where the fair use argument holds less sway. Under current UK copyright law, which the government is seeking to change, copyright-protected work can be used without permission for scientific or academic research.

He said: “The UK has a much narrower fair use defence which is very unlikely to apply in these circumstances.”

Copyright owners in the US and UK say AI companies are unlawfully copying their work to generate competing content that threatens their livelihoods. A UK government proposal to change copyright law by allowing use of copyright-protected work without permission – unless the work’s owner signals they want to opt out of the process – has been met with vociferous opposition from the creative industries.

Alsup said Anthropic violated the authors’ rights by saving pirated copies of their books as part of a “central library of all the books in the world” that would not necessarily be used for AI training. Anthropic and other prominent AI companies including OpenAI and Facebook owner Meta have been accused of downloading pirated digital copies of millions of books to train their systems.

Source: The Guardian