The Next AI Earthquake: Why the Real Danger Isn't the SaaS Killer but the Compute Revolution

PANews

Written by: Bruce

Recently, the entire tech and investment community has been focused on one thing: how AI applications are "killing" traditional SaaS. Ever since @AnthropicAI's Claude Cowork showed how easily it can write your emails, build your slide decks, and analyze your Excel spreadsheets, a "software is dead" panic has been spreading. It is frightening, but if that is all you are watching, you might miss the real earthquake.

It's as if we are all looking up at the aerial drone battles while no one notices that the continental plates beneath our feet are quietly shifting. The real storm lies beneath the surface, in a corner most people cannot see: the computational foundation supporting the entire AI world is undergoing a silent revolution.

This revolution may end the grand party thrown by AI's shovel seller, Nvidia (@nvidia), far earlier than anyone imagines.

Two Revolutions Converging

This revolution is not a single event; it is woven from two seemingly independent technological paths. Like two armies encircling Nvidia's GPU hegemony, they form a pincer attack.

The first path is the slimming revolution in algorithms.

Have you ever wondered whether a super-brain really needs to activate all of its brain cells to think through a problem? Clearly not. DeepSeek seized on this insight and built its models around the MoE (Mixture of Experts) architecture.

You can think of it as a company housing hundreds of experts in different fields. Each time a problem needs solving, you invite only the two or three most relevant people instead of brainstorming with everyone. This is the cleverness of MoE: a large model activates only a small subset of its "experts" for each computation, saving enormous amounts of compute.

What happens as a result? The DeepSeek-V2 model nominally has 236 billion parameters, but it activates only about 21 billion of them per token, less than 9% of the total. Yet its performance can rival that of GPT-4, which the author contrasts as running at 100% of its capacity. What does this mean? AI capability has been decoupled from compute consumption!
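To make the mechanism concrete, here is a minimal sketch of top-2 expert routing in PyTorch. It is a toy, not DeepSeek's actual implementation: the layer sizes, expert count, and gating scheme are all illustrative assumptions, and real MoE models add shared experts and load-balancing losses on top.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy Mixture-of-Experts layer: each token runs through only its top-k experts."""

    def __init__(self, d_model=64, d_ff=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router scores each expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.gate(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen experts only
        out = torch.zeros_like(x)
        # Only top_k of num_experts run per token, so compute scales with
        # activated parameters, not total parameters.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(10, 64)).shape)      # torch.Size([10, 64])
# The headline ratio from the article: 21B activated out of 236B total.
print(f"active fraction: {21 / 236:.1%}")  # ~8.9%
```

The decoupling lives in that inner loop: per-token compute scales with `top_k / num_experts`, so a model can keep growing its total parameter count (capability) without growing the work done per token (compute).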

In the past, we assumed that stronger AI simply required more GPUs. Now DeepSeek is showing that clever algorithms can achieve the same results at a tenth of the cost, which directly calls into question how essential Nvidia's GPUs really are.

The second path is the hardware "lane change" revolution.

AI work is divided into two phases: training and inference. Training is like going to school: you read thousands of books, and here GPUs, the "miracle workers" of parallel computing, are genuinely useful. But inference, the phase behind the response speed we feel in daily AI use, has different priorities.

GPUs have a natural flaw in inference: their memory (HBM) sits off-chip, so data incurs latency shuttling back and forth. It's like a chef whose ingredients are in a fridge in the next room; however fast the chef moves, every dish starts with a trip down the hall. Companies like Cerebras and Groq have designed dedicated inference chips from scratch, putting memory (SRAM) directly on the die so that "zero-latency" access to the ingredients becomes reality.
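The chef analogy has a quantitative core: token-by-token generation is usually memory-bandwidth-bound, because every new token must stream the active weights past the compute units. A rough back-of-envelope sketch, where the bandwidth figures are illustrative assumptions rather than vendor specifications:

```python
# Upper bound on single-stream decoding speed when inference is bandwidth-bound:
# each token must read every active weight once, so
#   tokens/sec <= memory_bandwidth / (active_params * bytes_per_param)

def max_tokens_per_sec(active_params_billion: float, bytes_per_param: float,
                       bandwidth_tb_per_s: float) -> float:
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_per_s * 1e12 / bytes_per_token

ACTIVE_PARAMS = 21  # billions; DeepSeek-V2's activated parameters, per the article
BYTES = 2           # fp16/bf16 weights

# Assumed, order-of-magnitude bandwidth numbers for the two designs:
for name, bw in [("off-chip HBM, ~3 TB/s", 3), ("on-chip SRAM, ~100 TB/s", 100)]:
    rate = max_tokens_per_sec(ACTIVE_PARAMS, BYTES, bw)
    print(f"{name}: ceiling ~{rate:,.0f} tokens/s per stream")
```

Whatever the exact numbers, the shape of the argument holds: move the weights next to the compute, and the bandwidth ceiling, and with it the latency floor, shifts by one to two orders of magnitude.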

The market has already voted with real money. OpenAI, even as it complained about how poorly Nvidia's GPUs handle inference, turned around and signed a $10 billion deal with Cerebras to rent inference capacity. Nvidia, scrambling not to fall behind in this new race, spent $20 billion to acquire Groq.

When the Two Paths Converge: Cost Avalanche

Now let's put these two things together: a DeepSeek model "slimmed down" by its algorithm, running on a Cerebras chip with "zero latency."

What will happen?

A cost avalanche.

First, the slimmed-down model is small enough to fit entirely into the chip's on-board memory. Second, with the external-memory bottleneck gone, the AI's response speed becomes astonishing. The net result: training costs fall by 90% thanks to the MoE architecture, and inference costs fall by an order of magnitude thanks to specialized hardware and sparse computation. Taken together, the total cost of owning and operating a world-class AI may be only 10-15% of a traditional GPU solution.
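The arithmetic behind that figure is worth making explicit. A minimal sketch, using the article's two claimed reductions as inputs; the baseline split between training and inference spend and the residual-overhead factor are assumptions for illustration:

```python
# Blend the article's per-phase cost reductions into one total-cost fraction.
# Assumed inputs: the training/inference spend split and the overhead factor.

def tco_fraction(train_share: float, train_factor: float,
                 infer_factor: float, overhead: float = 0.0) -> float:
    """Fraction of the GPU-baseline budget the new stack would cost."""
    infer_share = 1.0 - train_share
    return train_share * train_factor + infer_share * infer_factor + overhead

# Article's claims: training -90% (MoE), inference roughly -10x (dedicated chips).
for train_share in (0.3, 0.5):      # assumed spend splits
    for overhead in (0.0, 0.05):    # e.g. migration, software, redundancy
        frac = tco_fraction(train_share, 0.10, 0.10, overhead)
        print(f"train share {train_share:.0%}, overhead {overhead:.0%}: "
              f"{frac:.0%} of GPU baseline")
```

With both reductions taken at face value the total lands at 10%, and allowing a few points of residual overhead stretches it to the 15% end of the article's band.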

This is not an improvement; it’s a paradigm shift.

The Rug Is Being Pulled Out from Under Nvidia's Throne

Now you should understand why this is more lethal than the "Cowork Panic."

Nvidia's current market value in the trillions is based on a simple narrative: AI is the future, and the future of AI must rely on my GPUs. But now, the foundation of this story is being shaken.

In the training market, even if Nvidia keeps its monopoly, the market's overall size may shrink drastically once clients can do the same work with a tenth of the GPUs.

In the inference market, which is ten times larger than training, Nvidia not only lacks an absolute advantage but also faces a siege from all corners by companies like Google, Cerebras, and many others. Even its largest client, OpenAI, is defecting.

Once Wall Street realizes that Nvidia's "shovels" are no longer the only, or even the best, choice, what will happen to the valuations built on the expectation of "perpetual monopoly"? I think everyone knows the answer.

So the biggest black swan of the next six months may not be one AI application outdoing another, but a seemingly inconspicuous piece of tech news: a paper on MoE efficiency, say, or a report showing dedicated inference chips taking a meaningful share of the market, quietly announcing that the compute war has entered a new phase.

When the "shovel seller's" shovel is no longer the only choice, their golden era may also come to an end.
