OpenAI Ordered to Hand Over 20M ChatGPT Logs in NYT Copyright Case

A federal magistrate judge has ordered OpenAI to turn over roughly 20 million de-identified ChatGPT logs to The New York Times and other plaintiffs, deepening the AI development company’s exposure to an array of copyright and data governance disputes.

Issued on Wednesday in New York, the order denies OpenAI’s bid to block the production of user-chat records and directs the company to hand over the logs under a protective framework.

The outcome could shape how tech firms such as OpenAI, Anthropic, and Perplexity source training data, license content, and build guardrails around and over what their systems can output.

While the court “recognizes that the privacy considerations of OpenAI’s users are sincere,” such considerations “are only one factor in the proportionality analysis, and cannot predominate where there is clear relevance and minimal burden,” U.S. Magistrate Judge Ona T. Wang wrote.

Decrypt has reached out to both parties for comment.

The order stems from the Times’ ongoing lawsuit, which alleges that OpenAI’s models were trained on copyrighted news content without permission. It was first brought forward in December 2023.

In January last year, OpenAI challenged the NYT’s claims and filed a countersuit, claiming that the publication was not “telling the full story.”

The court later found that the 20 million chat log samples in question are “proportional to the needs of the case” to assess whether ChatGPT outputs copied the NYT’s material.

Over the past year, the dispute has intensified, with plaintiffs pressing for broad access to output data, and OpenAI warning that expansive production of these materials would raise privacy and operational burdens.

In June, OpenAI faced another setback when the court ordered the company to keep a wide range of ChatGPT user data for the lawsuit, including chats users may have already deleted.

Months later, in October, the dispute resurfaced, with the court flagging OpenAI’s October 20 filing (ECF 679) that challenged the production of the 20 million log sample, and ordered both sides to submit clarifications on why they disagree.

At the time, the judge pressed the parties to explain how the fight related to earlier concerns over deleted logs and whether OpenAI had backed away from prior agreements on what it previously claimed it would turn over.

Late last month, OpenAI filed a formal objection asking the district judge to overturn the magistrate judge’s discovery order.

The company argued that the ruling was “clearly erroneous” and “disproportionate,” in that it would force the company to disclose millions of private user conversations, according to a court document shared with Decrypt by an OpenAI representative.

The dispute arises as part of a broader offensive against AI labs, with authors, news organizations, music publishers, and code repositories seeking to test how far existing copyright law extends when models ingest and reproduce protected material.

Courts across the U.S. and Europe are now sorting through similar claims.

免责声明：本文章仅代表作者个人观点，不代表本平台的立场和观点。本文章仅供信息分享，不构成对任何人的任何投资建议。用户与作者之间的任何争议，与本平台无关。如网页中刊载的文章或图片涉及侵权，请提供相关的权利证明和身份证明发送邮件到support@aicoin.com，本平台相关工作人员将会进行核查。

OpenAI Ordered to Hand Over 20M ChatGPT Logs in NYT Copyright Case

Selected Articles by Decrypt

Table of Contents

Related Articles