
AI computing power does not offer monthly subscriptions: Zhou Hongyi's Token warning

智者解密
4 hours ago

On March 29, 2026, at the Global Unicorn Company Conference, 360 founder Zhou Hongyi took aim at a widely held assumption: that AI, like the internet era before it, could offer "subscription plans" and "unlimited usage." Comparing the business models of the traditional internet and AI, he stressed a structural disconnect in costs, above all the rigid constraints imposed by computing power and token billing. With phenomenon-level agents like OpenClaw igniting discussion, questions such as "Should AI offer subscriptions?" and "Are tokens the new generation of traffic?" have become the focus of a new narrative, one that reveals sustainability concerns underlying the AI economic model rather than opportunity alone.

From Lobster Fever to OpenClaw's Costly Reality

In recent weeks, the rise of phenomenon-level AI agents like OpenClaw has revived a "lobster fever"-style frenzy on social media timelines: users flocking to try it, invoking it furiously, and showing off the results. On the surface it looks like a revolution in experience, but behind it lies a collective neglect of token consumption patterns. Every interaction, every model call, corresponds to visible token expenditure and a computing bill, which quickly sparked discussion inside and outside the industry.

For a long time, users of traditional applications have been trained to expect "unlimited use" and "free value upgrades": from broadband subscriptions to membership systems on video and music platforms, people have grown accustomed to consuming without a second thought once they have paid. Faced with AI products that charge by usage frequency and token consumption, however, there is a clear disconnect between habit and reality: the more interactions, the higher the bill, forcing users to confront the feeling of "spending money every time they ask a question" for the first time.

As capital and developers rush to chase the AI wave, much of the narrative fixates on words like "growth," "explosion," and "phenomenon-level," while rarely digging deeper: can the computing costs and token expenditures behind these dazzling experiences actually be covered over the long term? Keeping an OpenClaw-style application running means continuously paying for inference. Ignoring this, the AI craze risks becoming a short-term boom built on subsidies and cash burn rather than a coherent business model.

The Historical Inertia of Internet Subscription Illusions

The traditional internet's bold move to offer "subscriptions" and "unlimited use" at scale arose under specific historical paths and infrastructural conditions. After years of investment in communication networks, data centers, and CDNs, the system as a whole approached near-infinite effective capacity; at mainstream usage scales, at least, physical limits are rarely reached, giving operators and platforms a safety net.

With the scaling of bandwidth and storage, the marginal cost of transmitting or storing one more unit keeps falling. Adding a user or extending usage time has limited impact on the overall cost curve. As long as the scale is large enough, the marginal risk of "selling subscriptions" and "providing unlimited traffic" is manageable, and can be compressed further through distributed scheduling and peak/off-peak load shifting. This technology and cost structure jointly supports the commercial imagination of "unlimited traffic" and "unlimited use."

More importantly, this traffic-oriented thinking shaped the pricing instincts of a whole generation of product managers and operators: user growth takes precedence over per-user profit, with users acquired through low barriers, traffic subsidies, and free value upgrades to raise DAU and retention, followed by advertising, memberships, and premium services to lift ARPU at the back end. Pricing is no longer about accounting for every call; it rests on the confidence that "the pie is big enough, we can always monetize later."

The Physical Boundaries of AI Charging per Call

Against this backdrop, Zhou Hongyi's statement at the Unicorn Conference strikes a particularly harsh note: "Tokens will never achieve subscription-level unlimited use like mobile traffic." This is not merely a commercial choice but stems from the physical constraints of running AI, fundamentally conflicting with the marginal cost-reducing model of traditional internet.

He further emphasized: "The essence of AI operation is the consumption of computing power, information processing, and intelligence costs." In other words, calling a large model does not merely consume bandwidth; it burns GPU/ASIC compute, memory, and storage bandwidth, and adds costs for electricity, cooling, and maintenance. Under today's mainstream tech stack, the unit price of an AI token is relatively fixed and usage is strongly correlated with cost; even a little more inference means a little more real expenditure, with none of the "a few extra gigabytes cost almost nothing" effect.
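The contrast between the two cost structures can be made concrete with a small arithmetic sketch. All the numbers below are hypothetical, not real vendor rates: a flat "internet-style" subscription whose marginal cost per use is near zero, versus per-token billing whose cost grows linearly with usage.

```python
# Illustrative sketch only: hypothetical prices, not real vendor rates.
# Flat subscription: marginal cost of one more interaction is ~0.
# Per-token billing: total cost is linear in usage.

FLAT_MONTHLY_FEE = 20.00        # hypothetical subscription price, USD
PRICE_PER_1K_TOKENS = 0.01      # hypothetical token price, USD
TOKENS_PER_INTERACTION = 2_000  # assumed average prompt + completion

def subscription_cost(interactions: int) -> float:
    """Flat fee: the bill does not depend on how much the user interacts."""
    return FLAT_MONTHLY_FEE

def metered_cost(interactions: int) -> float:
    """Per-token billing: every extra interaction adds real expenditure."""
    return interactions * TOKENS_PER_INTERACTION / 1000 * PRICE_PER_1K_TOKENS

for n in (100, 1_000, 10_000):
    print(f"{n:>6} interactions: flat ${subscription_cost(n):.2f}, "
          f"metered ${metered_cost(n):.2f}")
```

Under these assumed prices the two models break even at 1,000 interactions a month; beyond that, a flat-fee provider is subsidizing every additional call, which is exactly the structural disconnect Zhou Hongyi points to.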

The scarcity of compute, the rigidity of electricity prices, and the high cost of large-model inference together raise the economic threshold for "unlimited use" in AI scenarios. Even if future chip iterations, model compression, and compute scheduling lower the cost of a single call, it will be hard, in the foreseeable future, to dilute that cost to a level that supports mindless, uncapped subscriptions.

Pricing narratives around AI tokens must therefore break cleanly from the old internet logic of "traffic subsidies and burning cash for market share." Designing an AI business model around the path of "free first, monetize gradually" is likely to collide with harsh realities on the token bill: no one can subsidize unlimited question-and-answer for everyone for long, least of all while computing power and energy remain tight.

Misaligned Imagination of Product Managers and the Shift to Precision Accounting

Carrying over a paradigm validated in the internet era, many traditional teams instinctively applied the "low-barrier acquisition + high-frequency stickiness" mindset to AI applications: entry should be frictionless, interaction plentiful, and users encouraged to stay in the product to "talk more, play more, try more." This was the standard playbook for improving retention and session time; today, however, it collides head-on with the reality of token-based billing.

Under token pricing, massive interaction volumes, multi-turn casual conversation, and frequent calls translate directly into cost explosions. Long conversations once celebrated as "user stickiness" now mean an accumulation of inference requests; multi-model chained calls, originally meant to create delight, now correspond to compounding costs on the backend. The product question is no longer just "how usable is it" but also "is a single use worth paying for."
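Why long conversations are so much more expensive than their turn count suggests can be sketched with one assumption that holds for the common chat-API pattern (not tied to any specific vendor): each turn resends the full conversation history as part of the prompt, so the tokens billed per turn grow with the turn number and the cumulative bill grows quadratically, not linearly.

```python
# Illustrative sketch: in the common chat pattern where every turn resends
# the full history, per-turn prompt size grows linearly with turn number,
# so the cumulative token bill grows quadratically in the number of turns.

TOKENS_PER_MESSAGE = 500  # assumed average size of one user or model message

def tokens_billed(turns: int) -> int:
    """Total tokens billed over a conversation, assuming each turn's prompt
    contains the entire prior history plus one new user message."""
    total = 0
    history = 0
    for _ in range(turns):
        prompt = history + TOKENS_PER_MESSAGE  # prior history + new message
        total += prompt + TOKENS_PER_MESSAGE   # prompt plus the model's reply
        history = prompt + TOKENS_PER_MESSAGE  # the reply joins the history
    return total

# Doubling the conversation length roughly quadruples the bill:
print(tokens_billed(10), tokens_billed(20))
```

This is the arithmetic behind "precision accounting": trimming history, summarizing context, and cutting idle turns attack the quadratic term, not just the per-message constant.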

This forces OpenClaw-like agents to migrate toward "precision accounting" in their interaction design and calling strategies: cutting meaningless turns, guiding complex needs into precise, information-dense questions, and compressing redundant calls internally so that clear value is delivered within a limited token budget. At the same time, the product side must educate users: every call is a computing-power expenditure, not "free companionship" in the traditional sense.

This shift from "laissez-faire growth" to "precision-accounted interaction" is reshaping the design philosophy of native AI products: success is no longer measured purely by time spent or call frequency, but by the value density of each call, which ultimately decides whether the product deserves to exist.

The Compulsion and Opportunities of the AI Token Economic Model

As token costs become a rigid constraint, the industry has been pushed onto an "efficiency-first" evolutionary path. On one hand, computing and electricity costs compel model developers to keep raising efficiency: compressing and distilling models and optimizing architectures at the algorithm level; cutting redundant calls and optimizing memory and bandwidth usage at the engineering level; and advancing stronger inference-acceleration technology at the hardware level, using every means to pull down the real cost behind each token.

On the other hand, on-demand billing offers a clearer commercialization path for high-value vertical scenarios, specialized agents, and B2B services. Rather than pursuing broad, indiscriminate service for all users, it is better to align "the benefit of each call" with "the token cost of each call" in specific industries and high-value decision scenarios, letting enterprises and professionals who are willing to pay become the primary buyers. This pay-as-you-go model is, in fact, more conducive to building a sustainable revenue structure.

At a more macro level, the role of AI tokens in value capture and supply-demand dynamics is evolving into a bargaining chip on an infrastructure-level track: whoever can keep lowering token costs while maintaining performance stands a better chance of seizing pricing power in the computing economy, and whoever can better link token pricing to actual business value has the opportunity to pull AI back from conceptual hype to real cash flow. The contest over this chip is far from over.

Calmness and Reconstruction After the Lobster Feast

Returning to Zhou Hongyi’s public speech on March 29, its true value lies in correcting the market illusion that "AI can also provide subscription traffic." His straightforward statement "tokens will not turn into mobile traffic subscriptions" reminds the industry not to simply translate the inertia of internet-era thinking into the computing economy era, nor to underestimate the boundary effects of physical constraints and cost structures on commercial imagination.

The lobster fever of OpenClaw is fundamentally just an early sample of the pricing game in the era of computing economy: in the short term, certain agents can be pushed to the forefront using subsidies, capital, and emotions, but whether they can stand firm long-term ultimately depends on the cost-value relationship. Today's hype will not be the final form of AI business models; more so, it represents a collective trial and mindset calibration.

Looking to the future, AI applications need to find new balancing points and narrative frameworks among experience, costs, and computing constraints: they must not return to the old dream of "free unlimited use," nor should they be bound by the fear of "spending every time a question is asked." How to ensure that the computing power behind every token is effectively transformed into perceivable value, and how to achieve transparency, fairness, and sustainability in pricing, will determine how far and how steadily the AI industry can progress after this lobster feast.


Disclaimer: This article represents only the author's personal views and does not represent the position or views of this platform. It is shared for informational purposes only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If any article or image on this page infringes rights, please send proof of the rights and of identity to support@aicoin.com, and the platform's staff will investigate.
