Author: Black Lobster, Deep Tide TechFlow
In the summer of 1858, a copper-core cable crossed the Atlantic seabed, connecting London and New York.
The significance of this event was never transmission speed but the structure of power: whoever laid the submarine cable could siphon profits from the flow of information. The British Empire relied on this global telegraph network to control intelligence from its colonies, cotton prices, and news of wars.
The power of the empire was not only in its fleet but also in that cable.
More than 160 years later, this logic is being replayed in an unexpected way.
In 2026, Chinese large models are quietly taking over the global developer market. The latest data from OpenRouter shows that Chinese models account for 61% of token consumption among the platform's top ten models, with the top three all from China. API requests from developers in San Francisco, Berlin, and Singapore travel over undersea cables to data centers in China, where computing power is consumed and electricity flows, and the results are sent back.
Electricity never left the Chinese power grid, but its value was delivered across borders through tokens.
The Great Migration of AI Models
On February 24, 2026, OpenRouter released weekly data: the top ten models on the platform consumed approximately 8.7 trillion tokens in total, and Chinese models alone accounted for 5.3 trillion, or 61%. MiniMax M2.5 topped the list with 2.45 trillion tokens, followed by Kimi K2.5 and Zhipu GLM-5, so all of the top three are from China.

[Figure: latest data as of February 26]
This is no coincidence; a catalyst ignited everything.
Earlier this year, OpenClaw emerged: an open-source tool that lets AI truly "work" by directly controlling computers, executing commands, and completing complex workflows in parallel. Its GitHub stars exceeded 210,000 within weeks.
John, who works in finance, immediately installed OpenClaw, connected it to the Anthropic API, and set it to monitor stock market information automatically and report trading signals promptly. A few hours later, he stared blankly at his account balance: dozens of dollars, gone.
This is the new reality OpenClaw has brought. In the past, a single chat with an AI consumed a few thousand tokens, and the cost was negligible. With OpenClaw in the loop, the AI runs multiple sub-tasks in the background, repeatedly re-sending context and iterating in loops; token consumption is no longer linear but exponential. The bill is like a runaway car: the fuel gauge keeps dropping and it cannot be stopped.
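A rough way to see why an agent's bill outruns a chat session is to count tokens when every step re-sends the accumulated history. The sketch below is a simplified model, not a description of how OpenClaw actually meters usage: the function name and all numbers (2,000-token starting context, 500 new tokens per step, 50 steps) are assumptions chosen for illustration.

```python
def cumulative_tokens(steps: int, base_context: int, tokens_per_step: int) -> int:
    """Total input tokens billed when every step re-sends the full history.

    Hypothetical model: the context starts at `base_context` tokens and
    grows by `tokens_per_step` after each call.
    """
    total = 0
    context = base_context
    for _ in range(steps):
        total += context            # the whole context is sent again on this call
        context += tokens_per_step  # this step's output is appended to the history
    return total


# One chat turn versus a 50-step agent run (illustrative numbers only).
single_chat = cumulative_tokens(steps=1, base_context=2_000, tokens_per_step=500)
agent_run = cumulative_tokens(steps=50, base_context=2_000, tokens_per_step=500)

print(f"single chat turn: {single_chat:,} tokens")
print(f"50-step agent run: {agent_run:,} tokens")
print(f"ratio: {agent_run / single_chat:.0f}x for 50x the steps")
```

Under these assumptions the total grows much faster than the number of steps, because each call pays again for everything that came before it. Parallel sub-tasks, which this sketch leaves out, would multiply the figure further.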
A clever trick began circulating in the developer community: using OAuth tokens to connect Anthropic or Google subscription accounts directly to OpenClaw, turning the flat-fee "unlimited" monthly quota into free fuel for AI agents. Many developers adopted it.
The official countermeasures followed swiftly.
On February 19, Anthropic updated its agreement, explicitly prohibiting the use of Claude subscription credentials in third-party tools like OpenClaw; access to Claude's features must go through the API billing channel. Google went further, banning large numbers of subscription accounts that accessed Antigravity and Gemini AI Ultra through OpenClaw.
“The world has suffered under Qin for too long.” John then turned to Chinese large models.
On OpenRouter, the Chinese large model MiniMax M2.5 scored 80.2% on software engineering tasks, while Claude Opus 4.6 scored 80.8%, an almost negligible difference. The price gap, however, is enormous: the former charges $0.3 per million tokens and the latter $5, roughly 17 times as much.
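To make the price gap concrete, here is a minimal sketch that applies the two per-million-token prices quoted above to the same workload. The 200-million-token monthly workload is a made-up figure, and treating each price as a single flat rate (with no split between input and output tokens) is an assumption that follows the article's simplification.

```python
# Prices per million tokens as quoted in the article (USD).
PRICE_PER_MILLION_USD = {
    "MiniMax M2.5": 0.3,
    "Claude Opus 4.6": 5.0,
}


def bill_usd(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical monthly workload for one agent setup.
workload_tokens = 200_000_000

bills = {
    model: bill_usd(workload_tokens, price)
    for model, price in PRICE_PER_MILLION_USD.items()
}
price_ratio = PRICE_PER_MILLION_USD["Claude Opus 4.6"] / PRICE_PER_MILLION_USD["MiniMax M2.5"]

for model, cost in bills.items():
    print(f"{model}: ${cost:,.2f}")
print(f"price ratio: about {price_ratio:.0f}x")
```

With these inputs the same workload costs about $60 on one model and $1,000 on the other, which is the "order of magnitude" difference the article describes.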
John switched over, the workflow kept running, and the bills shrank by an order of magnitude. The same migration is happening globally.
Chris Clark, COO of OpenRouter, put it bluntly: Chinese open-source models have captured a large market share because they account for an exceptionally high proportion of the agentic workflows run by American developers.
Power Going Overseas
To understand the essence of tokens going overseas, one must first clarify the cost structure of a token.
It seems light; one token is approximately equal to 0.75 English words. A typical conversation you have with an AI consumes only a few thousand tokens. But when these tokens stack up in trillions, the physical reality behind them becomes heavy.
Break down the cost of a token and there are only two core components: computing power and electricity.
Computing power means GPU depreciation. An NVIDIA H100 costs about $30,000, and that purchase price is spread over the card's service life as a depreciation cost on every inference. Electricity is the fuel that keeps data centers running around the clock: at full load each GPU draws about 700 watts, and once cooling is factored in, the annual electricity bill for a large AI data center can easily run into the hundreds of millions.
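The two components can be combined into a back-of-the-envelope cost per million tokens. Only the $30,000 purchase price and the 700-watt draw come from the article; the four-year service life, the 1.4x cooling multiplier, the $0.08/kWh electricity price, and the 1,500 tokens-per-second throughput are placeholder assumptions, and the model assumes the GPU is fully utilized. Treat it as a sketch of the arithmetic, not an estimate of any real provider's costs.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class GpuCostModel:
    purchase_price_usd: float = 30_000.0   # from the article
    power_watts: float = 700.0             # from the article (full load)
    lifespan_years: float = 4.0            # assumption
    cooling_overhead: float = 1.4          # assumption: multiplier for cooling
    electricity_usd_per_kwh: float = 0.08  # assumption
    tokens_per_second: float = 1_500.0     # assumption: sustained throughput

    def depreciation_usd_per_hour(self) -> float:
        """Purchase price spread evenly over the service life."""
        return self.purchase_price_usd / (self.lifespan_years * HOURS_PER_YEAR)

    def electricity_usd_per_hour(self) -> float:
        """GPU draw plus cooling overhead, priced per kilowatt-hour."""
        kilowatts = self.power_watts / 1000 * self.cooling_overhead
        return kilowatts * self.electricity_usd_per_kwh

    def usd_per_million_tokens(self) -> float:
        """Hourly cost divided by hourly token output, scaled to one million."""
        hourly_cost = self.depreciation_usd_per_hour() + self.electricity_usd_per_hour()
        tokens_per_hour = self.tokens_per_second * 3600
        return hourly_cost / tokens_per_hour * 1_000_000


baseline = GpuCostModel()
cheaper_power = GpuCostModel(electricity_usd_per_kwh=0.04)  # hypothetical lower tariff

print(f"depreciation per hour: ${baseline.depreciation_usd_per_hour():.3f}")
print(f"electricity per hour:  ${baseline.electricity_usd_per_hour():.3f}")
print(f"cost per million tokens: ${baseline.usd_per_million_tokens():.3f}")
print(f"with cheaper power:      ${cheaper_power.usd_per_million_tokens():.3f}")
```

Changing any of the assumed defaults shifts the balance between the two components, so the output is only as good as the inputs you supply.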
Now, let's map this physical process.
An American developer sends an API request from San Francisco. The data travels from California, via undersea cables across the Pacific, to a data center somewhere in China, where GPU clusters start working, electricity flows from China’s power grid to those chips, inferences are completed, and results are sent back. The entire process may only take one or two seconds.
Electricity has never left China's power grid, but the value of electricity has been delivered across borders through tokens.
There is something almost magical here that ordinary trade cannot match: tokens are intangible, do not pass through customs, are not subject to tariffs, and do not even show up in any current trade statistics. China has exported a large amount of computing and electricity services, but in official merchandise trade data they are almost invisible.
Tokens have become a derivative of electricity, and the essence of tokens going overseas is the export of electricity.
This is possible partly because electricity is relatively cheap in China: comprehensive electricity prices are approximately 40% lower than in the U.S., a physical-level cost gap that competitors cannot easily replicate.
Furthermore, Chinese AI large models also have advantages in algorithms and “involution.”
The MoE (mixture-of-experts) architecture of DeepSeek V3 activates only some of its parameters during inference, and independent tests show that its inference costs are about 36 times lower than those of GPT-4o. MiniMax M2.5 similarly activates only 10B of its 229B total parameters.
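The effect of sparse activation can be sketched with simple arithmetic. The snippet below uses the MiniMax M2.5 figures quoted above (10B active of 229B total) and a common rule of thumb of roughly 2 FLOPs per active parameter per generated token. That rule of thumb is an approximation brought in for illustration; it ignores attention and memory costs and is not a measurement of the model.

```python
# Parameter counts quoted in the article, in billions.
TOTAL_PARAMS_B = 229
ACTIVE_PARAMS_B = 10

# Rule-of-thumb assumption: about 2 FLOPs per active parameter per token.
FLOPS_PER_PARAM = 2


def flops_per_token(params_billions: float) -> float:
    """Approximate forward-pass FLOPs per token for a given parameter count."""
    return FLOPS_PER_PARAM * params_billions * 1e9


active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
dense_flops = flops_per_token(TOTAL_PARAMS_B)   # if every parameter were used
sparse_flops = flops_per_token(ACTIVE_PARAMS_B)  # only the routed experts
compute_saving = dense_flops / sparse_flops

print(f"active fraction: {active_fraction:.1%}")
print(f"approximate compute saving vs. a dense model of equal size: {compute_saving:.1f}x")
```

Under this approximation, each token touches under 5% of the weights, which is where much of the per-token cost advantage comes from.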
On top of that comes involution. Alibaba, ByteDance, Baidu, Tencent, Moonshot AI, Zhipu, MiniMax and dozens of other companies are trampling each other on the same track. Prices have already plunged below reasonable profit margins, and selling at a loss just to gain visibility has become common in the industry.
Looked at closely, this resembles how Chinese manufacturing went overseas: supply chain advantages and industry involution together push token prices sharply down.
From Bitcoin to Token
Before tokens existed, there was another period of electricity going overseas.
Around 2015, managers of power stations in Sichuan, Yunnan, and Xinjiang began to welcome a strange new batch of visitors.
These individuals rented abandoned factories, filled them with dense machines, and kept them running 24 hours a day. The machines produced nothing but constantly engaged in solving a mathematical problem, occasionally generating a Bitcoin from this endless arithmetic task.
This was the first-generation form of electricity going overseas: converting inexpensive hydroelectric and wind power into globally circulating digital assets through the hash calculations of mining machines, then cashing them out as dollars on exchanges.
Electricity did not cross any borders, but the value of electricity, represented by Bitcoin, flowed to global markets.
In those years, China's computing power once accounted for over 70% of the global Bitcoin mining capacity. China's hydro and coal power participated in a reallocation of global capital in this roundabout manner.
In 2021, this came to an abrupt stop. The regulatory hammer fell, miners scattered, and computing power migrated to Kazakhstan, Texas, and Canada.
But the logic never disappeared; it was just waiting for a new shell. Then ChatGPT burst onto the scene and the race among large models began. Former Bitcoin mining farms turned into AI data centers, mining machines became GPUs, and the output changed from Bitcoin to tokens. Only the electricity stayed the same.
The export of Bitcoin and the export of tokens are structurally isomorphic at the fundamental level, but tokens have more commercial value at present.
Mining is pure mathematical computation, and the Bitcoin it produces is a financial asset whose value stems from scarcity and market consensus, unrelated to “what was calculated.” The computing power itself is not productive; it is more like a byproduct of a trust mechanism.
Large model inference is different. GPUs consume electricity, and the output is real cognitive services: code, analysis, translation, and creative work. The value of tokens comes directly from their utility to the user. This is a deeper kind of embedding: once a developer's workflow relies on a model, the cost of switching keeps accumulating over time.
Of course, there is also a key difference: Bitcoin mining was expelled by China, while the export of tokens is actively chosen by global developers.
Token War
The submarine cable laid in 1858 represented the British Empire's sovereignty over the information highway; whoever owns the infrastructure can define the rules of the game.
The export of tokens is likewise an undeclared war, and it faces numerous obstacles.
Data sovereignty is the first wall; if an API request from an American developer is processed via a data center in China, the data physically flows through China. This is not an issue for individual developers and small applications, but it becomes a hard constraint when it involves sensitive corporate data, financial information, or government compliance scenarios. This is also why the penetration of Chinese models is highest in development tools and personal applications while having almost no presence in core enterprise systems.
Chip bans are the second wall; Chinese AI development faces export controls on high-end GPUs from NVIDIA, and while MoE architecture and algorithm optimizations can partially offset this disadvantage, the ceiling remains.
But these obstacles are only the prologue; a larger battlefield is taking shape.
Tokens and AI models have already become a new dimension of strategic competition between China and the U.S., comparable to semiconductors and the internet in the 20th century, and even more akin to an older metaphor: the space race.
In 1957, when the Soviet Union launched Sputnik 1, the United States was shocked. It went on to launch the Apollo program, pouring in resources equivalent to hundreds of billions of dollars today and vowing not to lose the space race.
The logic of AI competition is astonishingly similar, but the intensity will far exceed that of the space race. After all, space is a physical realm, imperceptible to ordinary people; AI permeates the economic capillaries. Behind every line of code, every contract, and every government decision-making system, there may be a large model from a particular country. Whichever model becomes the default option for global developers' infrastructure will gain structural influence over the global digital economy.
This is precisely why the overseas export of Chinese tokens makes Washington truly uneasy.
When a developer's code repository, agent workflow, and product logic are built around the API of a particular Chinese model, the cost of migration will rise exponentially over time. By then, even if the U.S. legislates restrictions, developers will vote with their feet, just as today's programmers cannot abandon GitHub.
Today's export of tokens may only be the beginning of a long game. Chinese large models do not claim to overturn anything; they simply deliver services at lower prices to every global developer with an API key.
This time, those laying cables are the engineering teams coding in Hangzhou, Beijing, and Shanghai, and the GPU clusters operating day and night in a southern province.
This contest has no countdown; it is ongoing 24 hours a day, measured in tokens, with the battlefield being every developer's terminal.