SDNY Judge Denies OpenAI Bid To Strike Authors' Book-Download Claim

Decrypt
6 hours ago

OpenAI has suffered a legal setback in its copyright battle, as a federal judge ruled that authors can pursue claims the company unlawfully downloaded their books.


U.S. District Judge Sidney H. Stein denied OpenAI's motion to strike what the company characterized as a new "download claim" in a ruling Monday, finding that prior complaints adequately notified OpenAI of infringement allegations based on downloading and reproducing copyrighted books.


"A complaint need not pin plaintiff's claim for relief to a precise legal theory," Judge Stein said in his ruling, noting that "factual allegations alone are what matters."


He granted OpenAI partial relief, striking allegations about GPT-4V, GPT-4.5, GPT-5, and any “derivatives” or “successors,” on the ground that his May order confines the case to seven models (GPT-3 through GPT-4o Mini).


The "download claim" dispute


The case is part of a massive multidistrict litigation (MDL) consolidating numerous copyright lawsuits against OpenAI and Microsoft in New York's Southern District. An MDL combines similar cases from different courts into one proceeding for efficient pre-trial handling.


This consolidated action includes complaints from authors David Baldacci, Michael Chabon, and others alleging OpenAI "captured, downloaded, and copied copyrighted written works" without permission.


In its motion to strike, OpenAI argued the consolidated complaint improperly introduced a new legal theory by separating download allegations from training-based claims.


Judge Stein rejected this argument, finding that prior class action complaints had already "asserted a cause of action for copyright infringement and alleged that OpenAI impermissibly downloaded and reproduced plaintiffs' books."


The fact that many allegations suggested the "ultimate purpose of the reproduction was to train OpenAI's LLMs is not dispositive," he wrote.





Navodaya Singh Rajpurohit, legal partner at Coinque Consulting, told Decrypt that "authors may need to show concrete evidence that their books were in the training data."


The courts have ordered production of "Slack channels discussing the removal of the books datasets" and required OpenAI to preserve "complete output logs and metadata," he added, to "trace whether specific works were ingested."


"These logs, along with any test files or vendor‑supplied book lists, may be important in discovery," the lawyer said.


OpenAI may argue the downloads came from public or licensed sources, Rajpurohit said. He noted that the company has acknowledged licensing publisher content and contends that training on publicly available material is transformative fair use, and that its recent media partnerships suggest clearer licensing that supports lawfulness.


Industry-wide copyright battles


OpenAI is fending off a raft of copyright suits, including one led by The New York Times alleging that OpenAI and Microsoft used “millions of paywalled articles” to build a “market substitute” for news.


In May, a court ordered OpenAI to “preserve and segregate all output-log data,” including deleted chats; OpenAI contested the order in June, calling it “an overreach by The New York Times” that undermines user privacy.


In June, Meta and Anthropic notched partial wins. Judge Vince Chhabria deemed Meta’s book training fair use while noting that plaintiffs “made the wrong arguments,” and Judge William Alsup likewise found Anthropic’s training fair use but criticized its “permanent library of pirated books.”

