AI Is Using Pirated Books To Train Large Language Models
- Dan Lalonde
- Mar 24
- 2 min read

A bombshell revelation has exposed one of the most pressing ethical dilemmas facing the AI industry today: the use of pirated books to train large language models (LLMs) like Meta’s Llama 3 and OpenAI’s ChatGPT. At the center of the storm is Library Genesis (LibGen), a massive underground digital library that hosts over 7.5 million books and 81 million research papers—most without any legal permission.
Court documents from an ongoing lawsuit have revealed that Meta employees, in their race to build a competitive AI, discussed bypassing expensive licensing deals in favor of downloading from LibGen. Internal chats show that executives were aware of the legal risks, with some even recommending masking the pirated nature of the files. Yet, despite these concerns, the dataset was reportedly used after receiving informal approval from the top—possibly from Mark Zuckerberg himself.
OpenAI, also implicated, claims the models currently powering ChatGPT were not trained on these datasets, asserting that earlier versions—built by now-departed employees—were the only ones that may have accessed LibGen.
At the heart of the debate is the argument of “fair use.” Tech companies assert that transforming copyrighted texts into training material constitutes a legal, transformative use. Critics, including authors such as Sarah Silverman and Junot Díaz, who are suing Meta, argue that this practice undermines creative labor and intellectual property law.
The scandal also underscores a wider issue: the academic publishing industry’s paywalls may be driving researchers, and now AI developers, toward pirated sources. While platforms like LibGen make knowledge more accessible, they also exploit the very people who produce it.
As AI becomes more ingrained in daily life, the question looms: can innovation flourish without eroding the rights of creators?
Visit Dan Lalonde Films For All Technology And Entertainment News
Source: The Atlantic
Photo Credit: AI