Meta sued for training AI on copyrighted works without permission

A group of publishers and bestselling author Scott Turow have filed a class-action lawsuit against Meta and its CEO Mark Zuckerberg, accusing the company of using copyrighted works without authorization to train its artificial intelligence technology.

The lawsuit, filed in federal court in New York, was brought by Turow along with publishers Cengage, Elsevier, Hachette, Macmillan, and McGraw-Hill. It alleges that Meta scraped millions of copyrighted works, including novels, journal articles, and textbooks, from the internet, specifically targeting “notorious pirate sites,” to develop its Llama suite of AI models.

The complaint claims that Meta stripped copyright management information from these works to conceal the infringement. It further asserts that Llama, which generates text responses to user prompts, reproduces portions of original copyrighted content, sometimes verbatim, and can mimic the distinctive writing styles of individual authors.

The plaintiffs argue that Meta’s use of copyrighted materials without permission denies authors and publishers rightful revenue. The lawsuit directly implicates Mark Zuckerberg, claiming he “personally authorized and actively encouraged” the infringement by bypassing standard licensing procedures. It adds that Zuckerberg’s day-to-day involvement in Meta’s AI development contributed to the company’s growth and his reported net worth exceeding $200 billion.

A Meta spokesperson told CBS News that the company plans to “fight this lawsuit aggressively,” stating that “AI is powering transformative innovations” and that courts have previously recognized that AI training on copyrighted data can qualify as fair use.

Why it matters

This lawsuit is part of a broader legal debate over the use of copyrighted works in training AI systems. Authors and publishers worry that unauthorized data mining undermines their intellectual property rights and financial interests, especially as AI-generated content becomes more sophisticated. Previous litigation in this area, such as the $1.5 billion settlement by AI company Anthropic last year, highlights the high stakes involved.

Legal experts note that this case centers on whether Meta’s collection of copyrighted content for AI training itself constitutes infringement, rather than focusing solely on the AI-generated output. If successful, the plaintiffs’ claims could set an important precedent for how copyright law applies to AI development practices.

Background

Meta’s Llama models are among several AI platforms built on large datasets scraped from the internet to improve their language processing capabilities. The use of copyrighted content without explicit licensing in such datasets has triggered growing controversy. Courts have yet to establish clear standards for how copyright protections apply to machine learning training methods.

Similar lawsuits have targeted other AI firms, reflecting ongoing tensions between intellectual property owners and technology companies racing to innovate in AI. This latest suit intensifies scrutiny of Meta as one of the largest players in the sector.

Giorgio Kajaia
About the author

Giorgio Kajaia is a writer at Goka World News covering world news, U.S. news, politics, business, climate, science, technology, health, security, and public-interest stories. He focuses on clear, factual, and reader-first reporting based on credible reporting, official statements, publicly available information, and relevant source material.