Original Title: Read Anthropic CEO's Memo Attacking OpenAI's 'Mendacious' Pentagon Announcement
Original Author: The Information
Translation: Peggy, BlockBeats
Editor's Note: Just hours before OpenAI announced an AI cooperation agreement with the Pentagon, the Pentagon had terminated its own cooperation with Anthropic over the company's insistence on safety terms. Shortly afterward, Anthropic CEO Dario Amodei sent an unusually blunt internal memo to employees, arguing that most of the "safety mechanisms" OpenAI claims to have are mere "safety theater" and questioning OpenAI's stance on autonomous weapons and mass surveillance.
In this roughly 1,600-word email, Amodei not only disclosed details of both companies' negotiations with the U.S. defense establishment but also aimed his criticism squarely at OpenAI CEO Sam Altman, accusing him of obscuring the true structure of the deal behind a public relations narrative. This controversy over military applications of AI, safety red lines, and political relationships is pushing the differences between the two leading AI companies in Silicon Valley into the spotlight.
The following is the original text:
I want to be very clear about what OpenAI has just announced and the hypocrisy embedded in it. This is exactly how they operate, and I hope everyone can see through it.
Although much about the contract they signed with the Department of War (DoW) remains unknown (even they themselves may not be entirely clear on it, since the terms are likely quite vague), a few points can be confirmed. From the public descriptions given by Sam Altman and the Department of War (the actual contract text would be needed for final confirmation), the cooperation model is roughly this: the model itself carries no usage restrictions beyond the law, the so-called "all lawful uses," paired with a so-called "safety layer." In my view, this "safety layer" is essentially the model's refusal mechanism, used to stop the model from completing certain tasks or participating in certain applications.
The so-called "safety layer" may also refer to the proposals that partners (such as Palantir, which serves as a business partner to the U.S. government when Anthropic is involved) tried to pitch to us during negotiations. They proposed a classifier or machine learning system that claims to allow certain applications while blocking others. Additionally, there are indications that OpenAI will arrange for employees (FDE, or Frontline Deployment Engineers) to oversee the use of the model to prevent inappropriate applications from occurring.
Our general judgment is that these schemes are not entirely ineffective, but in the context of military applications they are roughly 20% real protection and 80% safety theater.
The root of the problem is that whether a model is being used for mass surveillance or for fully autonomous weapon systems often depends on broader context. The model itself does not know what kind of system it is embedded in. It does not know whether there are humans in the loop (human-in-the-loop, the key question for autonomous weapons), nor does it know the provenance of the data it is analyzing: is it domestic U.S. data or foreign data? Data provided by enterprises with user consent, or data purchased through gray channels?
People who work on safety have long understood this: a model's refusal mechanism is not reliable. Jailbreaks are common, and often simply misrepresenting the nature of the data to the model is enough to bypass these restrictions.
There is also a key distinction that makes this harder than ordinary security protections: whether a model is executing a cyberattack can often be inferred from its inputs and outputs, but determining the nature of the attack and its broader context is an entirely different matter, and that is precisely the judgment needed here. In many cases this task is extraordinarily difficult, if not impossible.
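To make the context problem concrete, here is a minimal hypothetical sketch of a request-level "safety layer" (the names, fields, and blocked categories below are invented for illustration and do not describe any vendor's actual system). The filter can only act on what the caller declares about the data and the deployment; because those declarations are self-reported and unverifiable at this layer, simply misreporting them defeats the check.

```python
# Hypothetical sketch: a request-level "safety layer" that can only see
# self-reported context. None of these names reflect a real system.
from dataclasses import dataclass

# Provenance categories this toy filter refuses to analyze.
BLOCKED_SOURCES = {"bulk_purchased_us_citizen_data"}

@dataclass
class AnalysisRequest:
    prompt: str
    declared_source: str   # self-reported by the caller; unverifiable here
    human_in_loop: bool    # also self-reported

def safety_layer(req: AnalysisRequest) -> bool:
    """Approve or refuse based solely on the request's own claims."""
    if req.declared_source in BLOCKED_SOURCES:
        return False  # refuse declared bulk-surveillance data
    if not req.human_in_loop:
        return False  # refuse declared fully autonomous use
    return True

# An honestly labeled request is blocked...
honest = AnalysisRequest(
    prompt="Profile political leanings from this location dataset.",
    declared_source="bulk_purchased_us_citizen_data",
    human_in_loop=False,
)
assert safety_layer(honest) is False

# ...but the same task with misreported context sails through. The filter
# has no way to verify the claims, which is the structural weakness the
# memo calls "safety theater."
mislabeled = AnalysisRequest(
    prompt="Profile political leanings from this location dataset.",
    declared_source="consented_enterprise_data",
    human_in_loop=True,
)
assert safety_layer(mislabeled) is True
```

A learned classifier over prompts and outputs has the same structural blind spot: it can score what a request says, not what the surrounding system actually does with the result.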
The "safety layer" pitched to us by Palantir (I believe they pitched a similar scheme to OpenAI) is even worse. Our judgment is that it is almost entirely a safety theater.
Palantir's basic pitch seems to be: "Your company may have some disgruntled employees; you need to give them something to appease them, or to hide what is happening from them. That is exactly the service we provide."
As for having Anthropic or OpenAI employees directly supervise deployments: we discussed this internally months ago when expanding our acceptable use policy (AUP) for classified environments. The conclusion was clear: this approach is feasible in only a very limited set of cases. We will try our best, but it is by no means a reliable core safeguard, and it is especially hard to implement in classified environments. Incidentally, we are indeed trying to do this wherever possible; on this point we are no different from OpenAI.
So let me be clear: the measures OpenAI has taken hardly solve the problem.
The fundamental reason they accepted these schemes while we did not is that they are concerned with how to appease employees, whereas we are genuinely concerned with preventing abuse.
These schemes are not without value; we are also using some of them, but they are far from meeting the necessary safety standards. At the same time, the Department of War is clearly inconsistent in its treatment of OpenAI and us.
In fact, we tried to include safety terms in the contract similar to OpenAI's (as a supplement to the AUP, which in our view is the more important part), but the Department of War rejected them. The evidence is in the email threads from that time; I am quite busy right now, so I may have colleagues look up the exact wording later. So the claim that "OpenAI's terms were offered to us and we rejected them" is not true; likewise, the claim that "OpenAI's terms can effectively prevent mass domestic surveillance or fully autonomous weapons" is not true either.
Additionally, Sam's and OpenAI's statements imply that the red lines we proposed, on fully autonomous weapons and mass domestic surveillance, cover conduct that is already illegal, which would make the corresponding usage policies redundant. This line is nearly identical to the Department of War's position and appears to have been coordinated in advance.
But this is not the case.
As we explained in our statement yesterday, the Department of War does in fact have the authority to conduct domestic surveillance. In the pre-AI era the impact of those authorities was relatively limited, but in the AI era their significance is entirely different.
For example: the Department of War can legally purchase large quantities of private data on U.S. citizens from vendors (who often obtain resale rights through buried consent terms), and then use AI to analyze that data at scale to build citizen profiles, assess political leanings, and track real-world movements; the data available to them even includes GPS location information.
It is also worth noting that as negotiations drew to a close, the Department of War proposed that if we removed one specific clause, the one covering "analysis of bulk-acquired data," they would accept all of our other terms. That clause happened to be the only one in the contract that precisely addressed the scenario we were most worried about. We find this very suspicious.
On autonomous weapons, the Department of War claims that "a human in the loop" is a legal requirement. It is not. It is merely a Pentagon policy from the Biden administration requiring human involvement in weapon-launch decisions, and it can be unilaterally changed by the current Secretary of Defense, Pete Hegseth. That is exactly what worries us: as a practical matter, it is not a real constraint.
Much of the public relations rhetoric from OpenAI and the Department of War on these issues is either dishonest or deliberately confusing. These facts reveal a pattern of behavior I have seen from Sam Altman many times, and I hope everyone can recognize it.
This morning he said he agrees with Anthropic's red lines, positioning himself as supportive of us so that he can collect credit while deflecting criticism as OpenAI takes over this contract. He also tried to cast himself as someone who wants to "establish a unified contract standard for the whole industry": essentially a peacemaker and deal broker.
But behind the scenes, he is signing contracts with the Department of War, preparing to replace us the moment we are flagged as a supply chain risk.
At the same time, he has to make sure the process does not look like "Anthropic held its red lines while OpenAI abandoned them." He can pull this off because:
First, he can sign on to all the "safety theater" measures we rejected, and the Department of War and its partners are willing to play along, packaging those measures to look credible enough to appease his employees.
Second, the Department of War is willing to accept some of the terms he proposed, even though it rejected us when we proposed the same things earlier.
These two things are what allow OpenAI to reach an agreement where we could not.
The real reasons the Department of War and the Trump administration dislike us are these: we did not make political donations to Trump (while OpenAI and Greg Brockman donated heavily); we did not lavish on Trump the kind of praise one offers a dictator (while Sam did); we support AI regulation, which conflicts with their policy agenda; we chose to tell the truth on many AI policy issues (for example, AI replacing jobs); and we genuinely held to our red lines rather than turning them into "safety theater" to appease employees.
Sam is now trying to frame all of this as: we are difficult to work with, we are hardline, we are inflexible, and so on. I hope everyone recognizes this for what it is: textbook gaslighting.
The vague charge that "someone is difficult to work with" is often used to cover up the genuinely ugly reasons, namely the ones I just listed: political donations, political loyalty, and safety theater.
Everyone needs to understand this and push back on this narrative when talking privately with OpenAI employees.
In other words, Sam is undermining our position while posing as "supporting us." I hope everyone stays alert to this: by undermining public support for us, he makes it easier for the government to penalize us. I even suspect he may be subtly fanning the flames, although I currently have no direct evidence for this.
With the public and the media, this spin seems to have stopped working. Most people see OpenAI's deal with the Department of War as something to be wary of, even unsettling, and see us as the principled side (incidentally, we are now second on the App Store download chart).
[Note: Subsequently, Claude rose to first in the App Store.]
Of course, this narrative plays well with some fools on Twitter, but that doesn't matter. What I really care about is making sure it does not take hold among OpenAI's own employees.
Due to selection effects, they are already a group that is relatively easy to persuade. But it is still very important to push back on the narratives Sam is peddling to his own employees.
