Cerebras CEO Andrew Feldman: How the "Pizza Chip" Behind the IPO Challenges Nvidia

Written by: Techub News Compilation

AI chip startup Cerebras Systems recently filed for an IPO that aims to raise up to $4.8 billion. The offering was 20 times oversubscribed before pricing, making it one of the most highly anticipated tech IPOs of the year. In the latest episode of the Limitless Podcast, the hosts examined Cerebras' unusual technological path, its deep ties with OpenAI, and the impact its IPO may have on the competitive landscape of AI hardware.

The "Whole Pizza" Chip: Cerebras' Disruptive Architecture

To understand why Cerebras is so noteworthy, one must first understand its fundamental innovation in chip design. Traditional chip manufacturing cuts a silicon wafer, roughly the size of a dinner plate, into many smaller pieces, each of which becomes an independent chip. Cerebras' founders posed a seemingly simple yet bold question: What if we don't cut the "pizza" and instead treat the entire wafer as one gigantic chip?

They achieved this. Cerebras' "Wafer-Scale Engine" (WSE) is precisely such a "whole pizza" chip. The design offers an order-of-magnitude jump in scale: the latest-generation WSE-3 chip integrates 4 trillion transistors, while Nvidia's current flagship Blackwell GPU has approximately 208 billion, a gap of roughly 19 times. This enormous difference in physical size translates into fundamental performance advantages.
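As a quick check on the scale gap, the ratio can be worked out directly. The sketch below assumes the vendors' publicly stated transistor counts (4 trillion for WSE-3, 208 billion for a Blackwell GPU); the function name is purely illustrative.

```python
# Transistor counts as publicly stated by the vendors (treated as inputs here).
WSE3_TRANSISTORS = 4_000_000_000_000      # Cerebras WSE-3: 4 trillion
BLACKWELL_TRANSISTORS = 208_000_000_000   # Nvidia Blackwell GPU: 208 billion


def transistor_ratio(big: int, small: int) -> float:
    """Return how many times larger `big` is than `small`."""
    return big / small


ratio = transistor_ratio(WSE3_TRANSISTORS, BLACKWELL_TRANSISTORS)
print(f"WSE-3 has about {ratio:.1f}x the transistors of one Blackwell GPU")
```

Transistor count is only a proxy for scale; it says nothing by itself about delivered performance on a given workload.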

The key is to integrate all computing cores and memory on a single massive chip, which greatly shortens the distance data must travel between components and reduces latency. In traditional GPU architectures, computing units frequently need to access separate and relatively distant high-bandwidth memory (HBM), which creates a performance bottleneck. Cerebras integrates a large amount of static random-access memory (SRAM) directly on the chip, making data access dramatically faster.

The direct benefits of this architecture are particularly evident in AI inference tasks. Inference is the process in which a trained AI model handles real requests (such as answering questions or generating code). As AI applications shift from training to large-scale deployment, the speed and cost of inference become crucial. According to a recent report from JPMorgan, the future AI inference market is expected to be 10 to 50 times the size of the training market.

“This is not just a theoretical concept,” the hosts emphasized during the show. “OpenAI invested $10 billion in Cerebras about six months ago and gained priority usage rights for its chip designs.” This collaboration has materialized: the "fast mode" of OpenAI's coding model Codex, GPT Codex Spark, runs on Cerebras' chips and achieves near-zero-latency responses. For software engineers who need rapid iteration, or in fast-paced financial trading scenarios, this speed advantage translates into immense value.

IPO Frenzy: $4.8 Billion and the Challenge to Nvidia's Throne

Cerebras' IPO grew beyond initial expectations. The company initially aimed to raise $3.5 billion at a share price of about $115. However, with extremely strong demand from institutional investors, the offering was 20 times oversubscribed. The company promptly revised the terms, adding 2 million shares and raising the share price to $150, which brought the total raise to approximately $4.8 billion.
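The revised terms can be sanity-checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: the initial share count is derived from the quoted $3.5 billion and $115 figures rather than taken from any filing, and it ignores fees and over-allotment options.

```python
def implied_shares(raise_usd: float, price_usd: float) -> float:
    """Share count implied by a target raise at a given price per share."""
    return raise_usd / price_usd


# Figures quoted in the article; the share count itself is a derived assumption.
initial_shares = implied_shares(3.5e9, 115.0)
revised_shares = initial_shares + 2_000_000   # 2 million additional shares
revised_raise = revised_shares * 150.0        # repriced at $150 per share

print(f"Implied initial shares: {initial_shares / 1e6:.1f} million")
print(f"Estimated revised raise: ${revised_raise / 1e9:.2f} billion")
```

The estimate lands in the same range as the roughly $4.8 billion quoted on the show; the small difference is expected given that the $115 price is itself approximate.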

This sends a strong signal: the market is thirsty for AI hardware companies. The hosts view Cerebras' IPO as the "first domino" in a series of major AI listings, with SpaceX, OpenAI, Anthropic, and Databricks as potential follow-ups and a combined potential market value in the trillions of dollars.

The core narrative of this IPO is that Cerebras is the "first real threat" to Nvidia's monopoly in the AI chip sector. For a long time, Nvidia has built an almost insurmountable moat with its GPUs and CUDA ecosystem, reaching a market capitalization of $5.3 trillion. However, its success has also exposed the limitations of its architecture, particularly in terms of inference optimization.

Cerebras' challenge is not an isolated case. Just last week, Google released its in-house fifth-generation tensor processing unit (TPU), which it uses to train its Gemini model and which outperforms Nvidia's GPUs in certain respects. Google's market capitalization temporarily surpassed Nvidia's as a result. This indicates that Nvidia's moat is not absolute, and new architectural innovations are shaking its foundations.

Nvidia's own actions confirm this trend. Its $20 billion acquisition of Groq, another chip company focused on inference optimization, is a de facto acknowledgment of the technological direction Cerebras represents.

SRAM vs. DRAM: The Ultimate Trade-off of Speed and Cost

The core technological secret behind Cerebras' leap in performance lies in its heavy use of static random-access memory (SRAM), in sharp contrast to the high-bandwidth memory (HBM, which is based on dynamic random-access memory, or DRAM) used by mainstream GPUs.

The hosts used a vivid analogy: SRAM is akin to a solid-state drive (SSD), while DRAM is more like a hard disk drive (HDD). DRAM (HBM) has large capacity, is relatively inexpensive, and is easy to scale, making it the choice for nearly all GPUs. However, it has a significant drawback: to prevent data loss, it must be refreshed constantly, which adds latency and power consumption.

SRAM, on the other hand, operates entirely differently. As long as power is maintained, it can preserve data indefinitely without the need for refreshing, allowing for incredibly fast access speeds and higher energy efficiency. Cerebras tightly integrates massive amounts of SRAM (its WSE-3 chip integrates up to 44GB of on-chip SRAM) with computing cores, achieving ultra-low-latency data delivery, which is the fundamental reason for its impressive inference speed.
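One way to see why memory bandwidth dominates inference speed is a common rule of thumb: if generating each token requires reading every model weight once, the ceiling on tokens per second is memory bandwidth divided by model size in bytes. The sketch below applies that simplified model. The two bandwidth values are hypothetical placeholders chosen for illustration, not specifications of any product, and the model ignores batching, caching, and multi-chip interconnect.

```python
def max_tokens_per_second(bandwidth_bytes_per_s: float,
                          n_params: float,
                          bytes_per_param: float = 2.0) -> float:
    """Upper bound on single-stream decode speed when every weight
    is read once per generated token (a simplified, memory-bound model)."""
    model_bytes = n_params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


PARAMS_70B = 70e9                 # 70 billion parameters, 16-bit weights assumed
HYPOTHETICAL_OFFCHIP_BW = 4e12    # 4 TB/s, illustrative off-chip memory bandwidth
HYPOTHETICAL_ONCHIP_BW = 4e15     # 4 PB/s, illustrative on-chip SRAM bandwidth

off_chip = max_tokens_per_second(HYPOTHETICAL_OFFCHIP_BW, PARAMS_70B)
on_chip = max_tokens_per_second(HYPOTHETICAL_ONCHIP_BW, PARAMS_70B)
print(f"Off-chip ceiling: {off_chip:,.0f} tokens/s")
print(f"On-chip ceiling:  {on_chip:,.0f} tokens/s")
```

Note that 70 billion parameters at 16-bit precision occupy 140 GB, more than the 44GB of on-chip SRAM cited above, so a model of that size would have to be spread across several wafers or stored at lower precision. The sketch leaves that out.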

Of course, this advantage does not come without cost. SRAM is expensive, and its storage density per unit area is far lower than that of DRAM. This is one reason Cerebras chips must be so large: they need enough physical space to hold vast amounts of SRAM. It also makes the chips extremely expensive.

“But think about it: if your AI model's iteration speed can be several times or even dozens of times faster, how much value could that create for you?” the host asked. For hedge funds, quantitative trading algorithms that run a few milliseconds faster could be worth billions; for enterprise applications with millions of users, faster responses mean better user experiences and higher revenues. Paying a premium for extreme speed may therefore be a worthwhile trade in the high-value battleground of AI inference.

Performance data supports this assertion. According to benchmarks quoted on the show, when running Llama 3.1 70B (a model with 70 billion parameters), Cerebras chips deliver inference 20 times faster than Nvidia's latest Blackwell flagship chip. For AI agents that increasingly depend on "deep thinking" and long-chain reasoning, this speed advantage translates directly into stronger capabilities.
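To make the agent point concrete, consider how generation speed compounds over a multi-step reasoning chain. The sketch below is illustrative only: the baseline speed, step count, and tokens per step are hypothetical, and only the 20x factor comes from the benchmark quoted above.

```python
def chain_latency_seconds(steps: int,
                          tokens_per_step: int,
                          tokens_per_second: float) -> float:
    """Total generation time for a sequential chain of reasoning steps."""
    return steps * tokens_per_step / tokens_per_second


BASELINE_SPEED = 100.0   # tokens/s, hypothetical baseline
SPEEDUP = 20.0           # speedup factor quoted on the show

slow = chain_latency_seconds(10, 2_000, BASELINE_SPEED)
fast = chain_latency_seconds(10, 2_000, BASELINE_SPEED * SPEEDUP)
print(f"Baseline chain: {slow:.0f} s, accelerated chain: {fast:.0f} s")
```

Because the steps run one after another, any per-token speedup applies to the whole chain, which is why long reasoning workloads benefit most from faster inference.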

Opportunities and Concerns: Cool Reflections in a Thriving Market

Despite the promising outlook, the hosts remain cautious about Cerebras' IPO and point out several potential risk factors.

First is the high valuation. Cerebras' IPO valuation multiple reaches 51 times revenue, whereas many large tech companies usually have revenue multiples around 20 times. Such a high premium suggests that the stock price might experience volatility or even correction shortly after going public. The host warned that typical IPO first-day gains range from 30% to 80%, and retail investors who buy high could face pricing pressure.

Second is customer concentration and potential competition. Cerebras' largest customer and strategic partner is OpenAI. OpenAI has made substantial investments, and its co-founders Sam Altman and Greg Brockman have also personally invested in Cerebras. This deep tie is a significant advantage but also poses risks.

A critical question arises: OpenAI is also collaborating with companies like Broadcom and MediaTek to develop its own in-house AI inference chips. Sam Altman has explicitly stated that solving inference problems is one of OpenAI's core goals. So, is Cerebras a long-term strategic partner for OpenAI, or a transitional solution? Will OpenAI gradually phase out Cerebras chips in the future? This is a sword of Damocles hanging over Cerebras.

Moreover, Cerebras must also contend with Nvidia's formidable CUDA ecosystem moat. Countless AI developers and enterprises are already deeply invested in the CUDA software stack, and migrating to a new hardware platform takes time and money.

However, Cerebras is also actively building its own ecosystem and distribution channels. Besides OpenAI, its chips have been integrated into Amazon's AWS Bedrock platform. AWS is the largest provider in the enterprise cloud computing market, which gives Cerebras a crucial distribution channel.

“So, the distribution issue has been resolved, the SRAM technology innovation from 0 to 1 has been addressed, and consumer channels through OpenAI are in place,” the hosts summarized. “All of this sounds very optimistic, but valuation and long-term customer relationships remain risks to watch.”

A New Era in the AI Chip War: The Dawn of Inference Dominance

Cerebras' IPO is not merely the listing of a company; it is a strong industry signal that the AI hardware race has entered a new phase centered on inference performance and that single-vendor dominance is being disrupted.

As AI giants like OpenAI and Anthropic continue to set new revenue records (the show mentioned that Anthropic's annual revenue has reached $45 billion, with a target of $100 billion by year-end), the demand for more efficient and faster inference hardware will only grow more urgent. Whoever can generate tokens (the basic unit of AI output) faster will dominate the future AI application ecosystem.

“Assuming we continue to need tokens and these tokens can generate profits, those who can distribute and produce these tokens faster than others will win,” the host stated. This simple logic underpins the confidence of Cerebras and its investors.

With the extreme physical form of its "pizza chip," Cerebras has carved out a technical path distinct from traditional GPUs. Its success or failure will test whether the market is willing to pay a steep premium for extreme inference speed and how deep Nvidia's moat truly is. Regardless of the outcome, Cerebras' debut has added an exciting and unpredictable new player to the AI chip war. The performance of its IPO will serve as a critical gauge of the temperature of the entire AI investment boom.

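Disclaimer: This article represents only the author's personal views and does not represent the position or views of this platform. It is provided for information sharing only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If any article or image published on this page involves infringement, please email the relevant proof of rights and proof of identity to support@aicoin.com, and the platform's staff will review it.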
