
How should Chinese AI companies "copy Claude Code's homework"?

Techub News
3 hours ago
Source: Geek Park
Written by: Dancing King Hualin

If someone had told me a few days ago that Anthropic, which claims to be "the most concerned about AI safety", would leak core secrets twice within a week, I would have thought it was an April Fools' joke.

But it happened just a day before April Fools' Day.

On March 31, security researcher Chaofan Shou discovered that version 2.1.88 of Claude Code, which Anthropic had published on npm, included a 59.8 MB source map file. The file, intended for internal debugging, pointed to a zip archive in Anthropic's own Cloudflare R2 storage bucket. Inside was the complete TypeScript source code of Claude Code: roughly 1,900 files, totaling 512,000 lines of code.

Within hours, multiple mirror repositories appeared on GitHub. One project named "claw-code" garnered 50,000 stars within two hours, making it the fastest-growing repository in GitHub history; its forks exceeded 41,500.

Just five days prior, Anthropic had leaked the existence of its next-generation model "Mythos" due to an unprotected public data cache—a model described internally as having "step-level improved capabilities" that "far surpasses all existing AI models" in cybersecurity.

Two leaks in one week. A company that builds its brand on safety was undercut by its own security lapses. The developer community's verdict was nearly unanimous: "It's almost too ironic to be real."

But irony aside, what leaked is genuinely substantive. The more important question is how AI companies should use this leak to "copy the homework."

01 What is inside Claude Code's "shell"?

Many people's first reaction is: Isn't Claude Code just a command line tool wrapped around a model API? So what if the source code is leaked? Without the model weights, this code is just a "shell".

This judgment is half correct. Claude Code is indeed a shell, but it is a remarkably sophisticated shell.

First, look at the tool system. Claude Code adopts a plugin-like architecture: each capability (file read/write, shell execution, web scraping, LSP integration) is an independent, permission-controlled tool module. The tool definition layer alone runs to 29,000 lines of TypeScript.

Each tool's description is not a single sentence; it spells out for the model when to use the tool, how to use it, and what outcome to expect afterward. These descriptions are themselves a form of finely tuned prompt engineering.
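The leaked schema is not reproduced in this article, but the pattern it describes — a tool module that bundles a permission gate with a detailed, prescriptive description — can be sketched roughly as follows. All names, fields, and rules here are hypothetical illustrations, not Anthropic's actual code.

```typescript
// Hypothetical sketch of a permission-controlled tool module in the
// style the article describes; every name and field is illustrative.
interface ToolDefinition {
  name: string;
  // The description is a miniature prompt: it tells the model when to
  // use the tool, when not to, and what to expect back.
  description: string;
  isAllowed: (path: string) => boolean;
  run: (input: string) => string;
}

const readFileTool: ToolDefinition = {
  name: "read_file",
  description: [
    "Read a file from the workspace and return its contents.",
    "Use this when you need to inspect code before editing it.",
    "Do NOT use it for directories; use a directory-listing tool instead.",
    "On success you receive the full file text; on failure, an error string.",
  ].join(" "),
  // Illustrative permission gate: refuse paths outside the workspace.
  isAllowed: (path) => !path.startsWith("/etc"),
  run: (path) =>
    readFileTool.isAllowed(path)
      ? `contents of ${path}`
      : "ERROR: permission denied",
};

console.log(readFileTool.run("src/index.ts")); // contents of src/index.ts
console.log(readFileTool.run("/etc/passwd"));  // ERROR: permission denied
```

The point of the sketch is the shape, not the content: the description field carries as much design effort as the implementation, because it is what the model actually reads.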

Next is the memory system. The leaked code reveals a three-layer "self-repairing memory" architecture. The base layer is MEMORY.md, a lightweight index file of roughly 150 characters per line that is always loaded into context. Specific project knowledge is scattered across "topic files" and loaded on demand. Raw dialogue records are never read back into the context in full; they are retrieved via grep only when specific identifiers are needed.
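The layering just described — a small always-loaded index, topic files pulled in on demand, and raw logs that are only ever grepped — might look roughly like this. The file names other than MEMORY.md, and the in-memory stand-in for the filesystem, are assumptions for illustration.

```typescript
// Sketch of a three-layer memory lookup. Layer 1 (MEMORY.md) is always
// in context; layer 2 (topic files) loads on demand; layer 3 (raw logs)
// is never loaded wholesale, only searched line by line.
const files: Record<string, string> = {
  "MEMORY.md": "auth: see topics/auth.md\nbuild: see topics/build.md",
  "topics/auth.md": "Login uses JWT; tokens expire after 15 minutes.",
  "logs/session-001.txt":
    "user: why does TICKET-42 fail?\nassistant: stale token.",
};

function buildContext(topic?: string): string {
  const ctx = [files["MEMORY.md"]]; // layer 1: always loaded
  if (topic && files[`topics/${topic}.md`]) {
    ctx.push(files[`topics/${topic}.md`]); // layer 2: on demand
  }
  return ctx.join("\n---\n");
}

function grepLogs(identifier: string): string[] {
  // layer 3: return only matching lines, never the whole log
  return Object.entries(files)
    .filter(([name]) => name.startsWith("logs/"))
    .flatMap(([, text]) => text.split("\n"))
    .filter((line) => line.includes(identifier));
}

console.log(buildContext("auth"));
console.log(grepLogs("TICKET-42")); // only the one matching log line
```

The design choice being illustrated: context is a budget, so everything beyond the tiny index must justify its inclusion at the moment it is needed.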

That is to say, the core problem that Anthropic's engineers spent significant time solving was not "how to call the API," but rather "how to make the model work as intelligently as possible within a limited context window."

Then there is the feature that excites everyone, called KAIROS.

This feature, named after the ancient Greek word for "the right moment," is mentioned over 150 times in the source code. It is a daemon mode that allows Claude Code to run as an always-on background agent. More interesting is its "autoDream" logic: when the user is idle, the agent performs "memory consolidation," merging scattered observations, eliminating logical contradictions, and turning vague insights into definitive facts.
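As a toy illustration of what "memory consolidation" could mean in practice — the actual autoDream logic is not quoted in the article — one can imagine a pass that deduplicates observations per subject and lets newer notes supersede older, contradictory ones. This is a guess at the idea, not Anthropic's implementation.

```typescript
// Toy consolidation pass: for each subject, keep only the most recent
// observation, turning a scattered log into a set of definitive facts.
// Purely illustrative of the concept the article describes.
interface Observation { subject: string; note: string; timestamp: number; }

function consolidate(observations: Observation[]): Map<string, string> {
  const latest = new Map<string, Observation>();
  for (const obs of observations) {
    const prev = latest.get(obs.subject);
    if (!prev || obs.timestamp > prev.timestamp) latest.set(obs.subject, obs);
  }
  return new Map(
    Array.from(latest, ([k, v]) => [k, v.note] as [string, string])
  );
}

const merged = consolidate([
  { subject: "build", note: "build seems flaky", timestamp: 1 },
  { subject: "build", note: "build fails only on Node 18", timestamp: 5 },
  { subject: "tests", note: "tests pass locally", timestamp: 3 },
]);
console.log(merged.get("build")); // "build fails only on Node 18"
```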

In other words, Anthropic is evolving the AI programming assistant from a "you ask, I answer" tool into a collaborator that "continuously understands your project and proactively discovers problems."

Additionally, the leaked code contains 44 unreleased feature flags, covering multi-agent coordination mode (COORDINATOR MODE), voice interaction (VOICE_MODE), 30-minute remote planning sessions (ULTRAPLAN), and even a Tamagotchi-style terminal pet (BUDDY), featuring 18 species and rarity levels.

There are also two details worth mentioning. One is "frustration regex"—a regular expression used to detect whether a user is insulting Claude. Using regex to determine user emotion is much faster and cheaper than using model inference.
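The actual pattern is not quoted in the article, but a cheap regex-based frustration check of the kind described might look like this. The word list is hypothetical; the point is that one regex test replaces a model call.

```typescript
// Cheap emotion check: a single regex test instead of a model inference.
// The word list here is illustrative, not the leaked pattern.
const frustrationRegex =
  /\b(stupid|useless|wtf|why (won'?t|doesn'?t) (this|you))\b/i;

function isFrustrated(message: string): boolean {
  return frustrationRegex.test(message);
}

console.log(isFrustrated("wtf this is useless"));       // true
console.log(isFrustrated("please refactor this file")); // false
```

A check like this runs in microseconds and costs nothing per call, which is exactly why it beats model inference for a simple binary signal.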

The other is "undercover mode," where Anthropic uses Claude Code to make "stealth contributions" to public open-source projects, with the system prompt clearly stating: "You are running in UNDERCOVER mode... your commit messages must not contain any internal information from Anthropic. Do not reveal your identity."

02 What can Chinese AI companies learn?

Now back to the truly important question.

In the past year, China's AI programming tool sector has accelerated noticeably. ByteDance's Trae has evolved from the original MarsCode into an AI-native IDE with an integrated Agent mode, supporting end-to-end automation from requirement understanding to code writing to testing. Zhipu's CodeGeeX focuses on open source and local deployment, with deep optimizations for Chinese-language code understanding. Tongyi Lingma and Doubao MarsCode are also iterating rapidly.

But if we compare these products with the architecture leaked from Claude Code, the gap is not in "whether they can be used," but in engineering precision.

Lesson one: Tool descriptions equate to product strength.

This might be the most easily overlooked and yet most valuable lesson to learn.

Claude Code's prompt descriptions for each tool have undergone extremely fine-tuned adjustments—when to use them, when not to use them, how to handle results after use, and how to retry in case of errors. These descriptions essentially teach the model "how to be a good programmer."

Many domestic tools are still at the stage of "giving the model a function signature and letting it guess how to use it." Just writing tool descriptions to the level of Claude Code could elevate the performance of the same model to a higher tier.

Lesson two: Memory architecture impacts user experience more than model parameters.

Claude Code's three-layer memory system addresses a very practical issue—there is a limit to the model's context window, and you cannot shove all historical dialogues into it.

Anthropic's approach layers memory: hot data is always online, warm data is loaded on demand, and cold data is merely indexed. The idea is not new, but domestic teams' AI programming tools generally lack this level of detail in the engineering implementation.

Lesson three: Emotional perception is not metaphysics; it is an engineering problem.

Use a regular expression to detect whether the user is angry, and then adjust the reply strategy.

This solution is simple to the point of being crude, but extremely practical. It conveys a lesson—good AI products do not need to solve every issue with a model; sometimes a regex is sufficient.

Domestic AI tool teams often fall into the thinking inertia of "all problems must be handed to the large model," which is a waste.

Lesson four: The direction indicated by KAIROS is more important than KAIROS itself.

An always-on background agent that automatically organizes memories and discovers problems while the user is idle.

This product direction means that the next step for the AI programming assistant is not "to answer questions faster," but rather "to be working even when you have not asked a question."

Currently, almost all domestic AI programming tools are reactive—users issue commands, and the tools execute them.

Whoever first implements the daemon mode may define the next generation of product forms.

03 Where is the boundary of "copying"?

Of course, there is a line between learning and plagiarism.

From a legal standpoint, this is not open-source code but rather commercially leaked software. Building products directly based on the leaked code carries clear copyright risks. The "claw-code" on GitHub claims to rewrite it in Rust, but if the core logic is copied, the legal boundaries remain ambiguous.

For Chinese companies, amid increasing pressure to go global, this risk needs to be seriously assessed.

From a technical standpoint, many design decisions in Claude Code are deeply customized for the abilities of the Claude model. For example, the reason its tool descriptions are so long and detailed is that Claude's long-context processing capability is strong enough that it won't "lose focus" due to long system prompts. If a model has a shorter context window and weaker adherence to instructions, copying the same prompt strategy could backfire.

The truly smart approach is not to fork these 512,000 lines of code, but to understand the trade-offs behind every design decision and then reimplement it according to the characteristics of one's own model.

Architectural ideas can be learned, tool orchestration patterns can be learned, and memory layering strategies can be learned—but the implementation must be your own.

Another easily overlooked reality is that Anthropic has leaked a snapshot, while their engineering team iterates daily. The 44 feature flags indicate that at least a dozen significant functionalities are queued for launch.

The code you fork today will be an old version by next month. Chasing to copy won’t ever catch up; understanding the principles will allow you to carve your own path.

The greatest significance of this leak may not lie in the technical details but rather in the removal of a layer of mystery—it turns out that Anthropic's core AI programming tool is fundamentally built on carefully designed prompt orchestration, combined with engineered tool scheduling.

No black magic, just a significant amount of detail polishing.

This is actually good news for Chinese AI companies. It means the gap can be bridged. The premise is that you must have the patience to polish those details—not just think about directly taking someone else's code and changing the name.

