Author: Haishan
From the rock-bottom token price war of 2024 to the collective price increases by Alibaba Cloud, Tencent Cloud, and Baidu Intelligent Cloud in 2026, the token industry has staged a staggering turnaround in just two years.
It has moved from cash-burning competition and overcapacity to supply shortages and simultaneous increases in volume and price.
Since the start of 2026, the AI computing power sector of the A-share market has gained more than 55% cumulatively. Leading large model companies such as Moonshot AI and Zhiyun AI have exceeded 1 billion yuan in monthly revenue, and some companies surpassed their entire 2025 revenue within 20 days.
This industrial revolution, which Jensen Huang defines as "Token Factory Economics," has moved beyond technical speculation and become a firm trend driven by an explosion of real demand, a structural supply-demand imbalance, and global competition over energy and computing power. Its new underlying logic is reshaping the rules of the entire AI industry and, more broadly, the way the world operates.
01 "Oil" of the New Era
The essence of this industrial turning point is the comprehensive shift of the AI industry from a "model arms race" to a "token production capacity race."
Before 2024, the core narrative of the industry was "who has larger model parameters and who is smarter." Major manufacturers frantically burned money to train large models and seized market share through free token giveaways and low-priced dumping, which even led to a bizarre situation where "selling tokens was less profitable than selling bottled water."
However, the explosive popularity of the OpenClaw (commonly known as "Lobster") agent in February 2026 completely shattered this logic.
Traditional large models operate in a single-round "human seeks AI" interaction mode, in which one dialog consumes only 1,000 to 3,000 tokens. An agent, by contrast, runs a "plan-action-observation-reflection" loop that requires dozens to hundreds of model calls to handle a complex task, consuming about 100,000 tokens for a medium task and up to millions for a complex one. The industry calls agents "token shredders."
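The gap between a single chat round and an agent loop can be illustrated with a rough token-accounting sketch. The number of loop iterations and the tokens per call below are illustrative assumptions, not measurements; only the 1,000 to 3,000 tokens per dialog and the roughly 100,000 tokens for a medium task come from the text above.

```python
# Rough token accounting for a "plan-action-observation-reflection" loop.
# iterations and tokens_per_call are hypothetical values chosen for illustration.

STAGES = ("plan", "action", "observation", "reflection")


def agent_task_tokens(iterations: int, tokens_per_call: int) -> int:
    """Total tokens, assuming each stage of each iteration is one model call."""
    calls = iterations * len(STAGES)
    return calls * tokens_per_call


chat_tokens = 2_000  # midpoint of the 1,000-3,000 range for one dialog
medium_task = agent_task_tokens(iterations=12, tokens_per_call=2_000)

print(medium_task)                 # 96000, close to the ~100,000 cited for a medium task
print(medium_task // chat_tokens)  # 48, i.e. dozens of chat rounds' worth of tokens
```

Under these assumptions, a dozen passes through the loop already costs as much as 48 ordinary dialogs, which is why agent workloads scale token demand so quickly.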
Data from the National Bureau of Statistics corroborate this explosion: daily token usage in China surged from 100 billion at the beginning of 2024 to 140 trillion by March 2026, an increase of more than 1,000 times in about two years, and usage in the first quarter of 2026 was 40% higher than at the end of 2025.
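As a quick check on these figures, the growth multiple and an implied average monthly growth rate can be computed directly. The 26-month span (January 2024 to March 2026) and the assumption of smooth compound growth are simplifications introduced here for illustration; the article itself only gives the two endpoints.

```python
# Growth multiple implied by the two endpoints quoted in the text.
start_daily_tokens = 100_000_000_000        # 100 billion, beginning of 2024
end_daily_tokens = 140_000_000_000_000      # 140 trillion, March 2026

multiple = end_daily_tokens // start_daily_tokens
print(multiple)  # 1400, consistent with "more than 1,000 times"

# Implied average monthly growth, ASSUMING smooth compounding over ~26 months.
months = 26  # assumption: January 2024 through March 2026
monthly_rate = (end_daily_tokens / start_daily_tokens) ** (1 / months) - 1
print(f"{monthly_rate:.1%}")  # roughly one third per month
```

Sustaining a 1,400-fold increase over that period would require usage to grow by roughly a third every month on average, which gives a sense of how steep the curve is.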

The industry narrative has completely shifted: it is no longer about competing on the "IQ limits" of models, but about who can produce tokens at massive scale more cheaply and stably, and who can seize the initiative in intelligent supply.
Against this massive demand, the rigid supply-demand mismatch is what underpins the sustained strength of the token market. The imbalance is not a short-term fluctuation but a structural contradiction rooted in the long cycles of the industrial chain.
There are three significant bottlenecks on the supply side. First, core hardware production capacity is monopolized and takes a long time to expand.
High Bandwidth Memory (HBM) is the "heart" of AI servers. Samsung, SK Hynix, and Micron hold over 95% of global production capacity, and their expansion cycles last 24 to 36 months, leading to an HBM deficit exceeding 40% in 2026.
As a result, the price of ordinary DDR5 memory increased by 300% in six months, a single 256 GB server memory module is priced at over 40,000 yuan, and the delivery cycle for AI servers has stretched from three months to twelve.

Second, electricity and energy have become the largest hidden bottleneck. The power draw of cabinets in intelligent computing centers is 10 to 20 times that of traditional data centers, and electricity accounts for over 60% of token production costs. Meanwhile, the construction cycle for the supporting power infrastructure of a large data center lasts 3 to 5 years, so computing power quotas in eastern China are extremely hard to come by.
Third, infrastructure and operational capabilities are struggling to keep up with explosive demand. The penetration rate of liquid-cooled data centers rose from 15% in 2024 to 45% in 2026, but a shortage of technical personnel and construction capacity has left many completed computing power clusters unable to operate at full load.
While the supply side lacks capacity, the demand side is showing a sustained, explosive "three-stage rocket" growth pattern.
The first stage is the widespread adoption of consumer-side agents. Individual users are shifting from simple chatting and entertainment to using AI assistants for handling emails, writing code, and planning, with daily token consumption rising from dozens to thousands and expected to eventually exceed tens of thousands.

The second stage is the full rollout of production-grade applications on the business side. Companies no longer view AI as a supplementary tool but treat tokens as a core factor of production. Companies like Kunlun Wanwei and 58 Tongcheng have monthly token consumption exceeding 1 trillion, and AI transformations in manufacturing, finance, and healthcare are releasing trillion-level token demand.
The third stage is explosive overseas demand. Tokens from domestically produced large models cost only 1/5 to 1/3 as much as those from overseas models like Claude and GPT, and this cost-effectiveness is rapidly capturing markets in Southeast Asia, the Middle East, and Latin America. In the first quarter of 2026, Chinese cloud vendors' overseas token revenue increased by 320% year-on-year, becoming a new growth pole.
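To make the price gap concrete, the sketch below converts the 1/5 to 1/3 price ratio into an equivalent bill and a savings range. The overseas bill of 3,000 is a hypothetical figure in an arbitrary currency unit; only the two ratios come from the text.

```python
from fractions import Fraction

# Price ratios quoted in the text: domestic tokens cost 1/5 to 1/3 of overseas ones.
LOW_RATIO = Fraction(1, 5)
HIGH_RATIO = Fraction(1, 3)


def domestic_cost(overseas_cost: Fraction, price_ratio: Fraction) -> Fraction:
    """Cost of the same token volume at the domestic price."""
    return overseas_cost * price_ratio


overseas_bill = Fraction(3000)  # hypothetical bill, arbitrary currency unit
cheapest = domestic_cost(overseas_bill, LOW_RATIO)
dearest = domestic_cost(overseas_bill, HIGH_RATIO)
print(cheapest, dearest)  # 600 1000

# Savings relative to the overseas price, as exact fractions.
savings_range = (1 - HIGH_RATIO, 1 - LOW_RATIO)
print(savings_range[0], savings_range[1])  # 2/3 4/5
```

In other words, the quoted ratios correspond to savings of roughly 67% to 80% for the same token volume, assuming comparable quality and no other costs.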
On a deeper level, tokens are becoming a fundamental commodity of the AI era, reconstructing the value system of the entire digital economy. Just as electricity was the core energy of the industrial era and data traffic was the core asset of the internet era, tokens are the core means of production of the intelligent era. They are measurable, priceable, and tradable, which makes them a universal value anchor connecting computing power supply and intelligent demand.

This transformation has brought about a complete revolution in business models: the industry has bid farewell to the old internet path of "burning money for scale" and entered a new phase of "charging by volume, driven by profit."
Major companies are commonly adopting a strategy of "subsidizing the consumer side to cultivate habits and harvesting on the business side at scale," giving tokens away for free for a limited time to individual users and charging corporate clients based on actual consumption. In the first quarter of 2026, leading cloud vendors saw the gross profit margin of their AI businesses generally rise to over 35%, achieving scaled profitability for the first time.
For China, this revolution in the token industry has brought a historic opportunity for leapfrog development. China has the lowest green electricity costs in the world, the most complete computing power infrastructure (hosting over 60% of global server capacity), the widest application scenarios, and the most cost-effective large models, possessing all the conditions to become the "world's token factory."

Just as China became the "world's factory" due to its cost advantages in the past, it is now dominating global token production and supply with its comprehensive advantages in energy, computing power, and application scenarios.
In the short term, supply-demand mismatches will continue until the end of 2027, keeping token prices high, with a rapid increase in industry concentration.
In the long term, as chip production capacity is released and model efficiency optimized, tokens will enter an era of "bargain prices," penetrating every corner of the national economy and becoming the core engine for digital economic growth.
02 How are the industry's segments faring?
As the token industry has reversed from "price war" to "supply shortage," its segments have diverged structurally.
The market has split into upstream price control, midstream profit improvement, and downstream monetization. The industry's three major segments (upstream computing power hardware production, midstream token hub scheduling, and downstream scenario application rollout) each have different barriers, prosperity levels, and value distribution logic.
First is upstream computing power hardware, the core production capacity of the token factory, where demand is rigid and the landscape is monopolistic.
It comprises four major sub-tracks: AI chips, computing power servers, liquid cooling, and intelligent computing center operations, and the industry has an oligopolistic structure.
AI chips are the core engine of token production, with Nvidia dominating over 90% of the global high-end GPU market.
Meanwhile, leading domestic substitutes in the A-share market are accelerating their breakthroughs. Cambricon's Siyuan 590 chip has reached volume production and is suitable for large model inference and training, with AI chip revenue in the first quarter of 2026 increasing by 320% year-on-year.

Haiguang Information's DCU products have over 30% penetration in domestic intelligent computing centers and are closely tied to leading manufacturers like Inspur and Sugon. Jingjia Micro's JM9 series GPUs have been deployed in government, finance, and other CFI scenarios, making the company the core supplier of domestic general-purpose GPUs.
Computing power servers serve as the carriers of token production capacity, with leading A-share companies dominating nearly half of the global market.
Inspur Information maintains the largest market share in global AI servers, with shipments in the first quarter of 2026 increasing by 180% year-on-year, and Sugon's liquid-cooled servers have the highest domestic market share, providing hardware support for over 80% of national intelligent computing centers.
Liquid cooling is a necessity for handling the high power demands of intelligent computing centers, with penetration rising rapidly from 15% in 2024 to 45% in 2026.
Yinwei is the absolute leader in the liquid cooling industry and is closely tied to core clients like Nvidia, Inspur, and Huawei; its liquid cooling orders increased by over 210% year-on-year in 2026. Shenli Environmental's liquid cooling data center solutions are deployed in several national intelligent computing centers, with order growth exceeding 150%.
In the operation of intelligent computing centers, Baoxin Software, Guanghuan New Network, and Runze Intelligent Computing, leveraging core locations and green electricity resources, have become the largest third-party intelligent computing center operators in the country, with computing power rental income in the first quarter of 2026 increasing by over 100% year-on-year.
Next is the midstream token hubs, which are shifting from price wars to value wars.
The midstream of the token industry undertakes core functions of computing power scheduling, model service, and standardized token output, with players primarily categorized into large model manufacturers and cloud service providers.

Leading large model companies in the A-share market have formed clear commercialization paths for tokens.
For example, Kunlun Wanwei's Tiangong large model exceeds 1.2 trillion token calls daily, with over 120,000 paying B-end clients, offering enterprise-level token services priced at only 1/4 of overseas models, and seeing AI business revenue in the first quarter of 2026 increase by 450% year-on-year.
iFlytek's Xinghuo large model focuses on vertical scenarios such as education, healthcare, and office use, with 70% of token consumption coming from B-end production-grade applications.
On the cloud service provider side, Alibaba Cloud, Tencent Cloud, and Volcano Engine are not listed in the A-share market, but related ecosystem companies in the A-share market benefit significantly: Youfang Network and Kingdee International (Hong Kong-listed) are building enterprise-level AI applications based on Alibaba Cloud, becoming important channels for token consumption.
Finally, downstream application scenarios are the ultimate outlet for token value, and they are moving toward broad accessibility on the consumer (C-end) side and must-have status on the business (B-end) side.
Downstream scenarios can be divided into three categories: C-end personal applications, B-end enterprise services, and vertical industry digitalization, with significant differences in token consumption orders of magnitude and commercialization rhythms.
C-end scenarios focus on accessibility, primarily utilizing personal AI assistants, content generation, and creative design.
For instance, Wancheng Technology's AI creative software (Miaoying Factory, Wancheng AI Drawing) has over 5.5 million global paid users, with token consumption in the first quarter of 2026 increasing by 320% year-on-year, reducing single-user token costs by 40% through model optimization.
CaiXun's AI email and intelligent office assistants have accumulated over 300 million users, with daily token calls exceeding 50 billion.
B-end enterprise services are the main force of token consumption, accounting for over 65% of total consumption.
For example, Tonghuashun's AI investment advisory service covers over 100 million stockholders, with daily token calls exceeding 80 billion, and AI-related revenue in the first quarter of 2026 increasing by 190% year-on-year.
Zhongkong Technology's industrial AI platform provides intelligent operation and maintenance services for the chemical and power industries, with annual token consumption exceeding 5 million per factory.

Runda Medical's AI-assisted diagnostic system has been deployed in over 3,000 hospitals nationwide and processes more than 20 billion medical text tokens daily.
Overall, it appears that the vertical industry scenarios on the B-end are set to be the long-term growth pole of the token industry, with AI transformations in autonomous driving, smart manufacturing, and fintech unleashing trillion-level token demand.
03 Which targets are on the rise?
From an industrial perspective, the token industry has fully shifted from "model competition" to "competition in production capacity and monetization," as supply-demand mismatches coincide with an accelerating release of commercial value. Five leading companies in the A-share market have firmly established themselves in the three major tracks of computing power hardware, midstream models, and downstream applications, becoming the core targets with the most potential in this trillion-token economy.
First is Inspur Information, the absolute leader in AI servers. As the company with the largest global market share in AI servers, Inspur is the core hardware vehicle supporting the operation of global token factories. The company is closely tied to Nvidia and has priority access to high-end GPU quotas, which gives it irreplaceable supply chain and scale barriers.
In the first quarter of 2026, AI server shipments increased by over 150% year-on-year and its global market share surpassed 25%. Nearly 40 billion in orders are yet to be delivered, with delivery schedules extending to the end of 2027, making it one of the targets with the most secure earnings in the industrial chain.
Second is Yinwei, the leader in liquid cooling and the "cooling heart" of token factories, whose power density is rising sharply. Liquid cooling has become a necessity for large-scale token production, with industry penetration rising from 15% in 2024 to 45% in 2026. In the first quarter of 2026, the company's liquid cooling revenue increased by over 210% year-on-year, and order visibility extends to 2027, making it the upstream target with the greatest earnings elasticity.
Third is Kunlun Wanwei, a pioneer in large model commercialization. As the first large model manufacturer in the A-share market to achieve scaled token profitability, it sets the benchmark for token monetization. Its enterprise-level token services are priced at only 1/4 to 1/3 of overseas models, rapidly capturing the market among small and medium enterprises.
In the first quarter of 2026, its average daily token calls exceeded 1.2 trillion, it had over 120,000 paying B-end clients, and its AI business revenue increased by over 450% year-on-year with a gross profit margin above 42%, making it the purest token monetization target in the A-share market.
Fourth is iFlytek, the leader in vertical large models and the core carrier of tokens in industry. It is deeply engaged in vertical fields such as education, healthcare, and industry, with over 70% of its token consumption coming from B-end production applications, which indicates very rigid demand.
Backed by years of accumulated industry scenarios and data barriers, the company's customized token service orders from government and enterprise clients are growing rapidly, and AI-related revenue is projected to account for more than 60% of the total in 2026. As AI penetration continues to rise in vertical sectors, the company stands to fully benefit from the long-term token demand created by industrial digitalization.
Fifth is Wancheng Technology, a leader in C-end AI applications and the core of personal token consumption. It is the leading company in global C-end AI creative tools, and its video editing and AI drawing products have over 5.5 million paid users globally. As AI functions are fully deployed, users' willingness to pay and usage time have increased significantly, with token consumption in the first quarter of 2026 increasing by over 320% year-on-year.
In summary, this round of token dividends is a demand-driven, long-term opportunity. In the short term, attention can go first to upstream hardware leaders like Inspur Information and Yinwei; in the medium term, to commercialization benchmarks like Kunlun Wanwei; and in the long term, to vertical scenario leaders like iFlytek. High-quality companies are set to see both earnings and valuations rise during this robust growth cycle.