NVIDIA founder Jensen Huang: Moat, CUDA, and the Future of AI

Written by: Techub News Compilation

Introduction

NVIDIA founder and CEO Jensen Huang recently sat down for an extensive interview with podcast host Dwarkesh Patel. Over more than 100 minutes, Huang gave candid, detailed answers to pointed questions about NVIDIA's competitive moat, the value of the CUDA ecosystem, supply chain bottlenecks, challenges from competitors, and geopolitics. As one of the most influential leaders of the AI wave, Huang laid out his business philosophy and technological vision, offering key insights into NVIDIA's current position and future strategy.

Summary

  • The core of NVIDIA is transforming electrons into valuable “Tokens,” a process that combines art, engineering, and science, making it hard to commoditize.
  • The richness of the CUDA ecosystem, the vast installed base, and ubiquitous cloud support constitute NVIDIA's strong and enduring software moat.
  • NVIDIA's business model is to “do what must be done, as little as possible,” focusing on an irreplaceable accelerated computing stack while leaving other aspects to a vast partner ecosystem.
  • Huang believes that the use of AI tools will grow exponentially, driving up the value of software companies (tool makers) rather than leading to their commodification.
  • Regarding the Chinese market, Huang strongly opposes extreme export controls, arguing that they would cost the U.S. technology industry its market presence, weaken its ecosystem, and ultimately damage its long-term leadership position.

NVIDIA's Moat: The Journey from Electrons to Valuable "Tokens"

At the start of the interview, the host posed a pointed question: as AI develops, software might become commoditized, so will NVIDIA, which is essentially a “software company” (it hands its design files to manufacturers like TSMC), also face commoditization risk? Huang disagreed and laid out his mental model of what NVIDIA fundamentally is.

“Our input is electrons, our output is ‘Tokens.’ In between is NVIDIA,” Huang said. “Our job is to exert every necessary effort to achieve this transformation in as minimal a manner as possible and endow it with amazing capabilities.” He explained that “as minimal as possible” means delegating non-core work to partners and building a massive ecosystem. NVIDIA's partner network spans all five AI layers, from upstream supply chains to downstream computer companies, application developers, and model makers.

However, the portion of the work that NVIDIA must do itself is “extremely difficult.” Huang described the transformation from electrons to Tokens as “an incredible journey,” with its value lying in “making one Token more valuable than another.” This requires a significant amount of art, engineering, science, and invention. “I doubt this will be commoditized,” he concluded.

In response to the view that NVIDIA builds its moat by locking in scarce components (such as advanced process logic chips, HBM memory, CoWoS packaging), Huang acknowledged this as one of the advantages but emphasized that the foundation lies in NVIDIA's ability to push the entire supply chain to make significant investments based on a clear vision for the future. “I spent a lot of time directly or indirectly informing our supply chain, partners, and ecosystem about the enormous opportunities in front of us.” Through events like GTC, he brought together upstream and downstream industries to witness firsthand the progress and demand for AI, enabling them to jointly invest in the future. This ability to drive upstream investment based on massive downstream demand is key to sustaining its scale.

Addressing Supply Chain Bottlenecks: No Bottleneck Lasts More Than 2-3 Years

When asked whether the year-on-year doubling of AI computational power would be limited by upstream capacity (such as the number of EUV lithography machines), Huang was optimistic. He believes that a situation where instantaneous demand exceeds total supply is a “good state” for the industry. Once a bottleneck arises, the entire industry quickly focuses on solving it. The heavy investment in CoWoS packaging capacity over the past two years, which has significantly eased that constraint, is one example.

“No bottleneck will last more than two to three years, not a single one,” Huang asserted. He described how NVIDIA anticipates bottlenecks years in advance by investing in new technologies (such as silicon photonics), inventing new workflows and testing equipment, and collaborating closely with partners (like TSMC) to shape and prepare the supply chain for future scale. At the same time, NVIDIA is significantly improving computational efficiency through architecture and algorithm innovation (for example, a 30- to 50-fold improvement from Hopper to Blackwell), which reduces its reliance on raw manufacturing capacity.

The bottlenecks that truly concern him are downstream, particularly energy policy. He pointed out that rebuilding American manufacturing, establishing AI factories, and developing the electric vehicle and robotics industries all require massive amounts of energy, while energy infrastructure takes far longer to build than chip capacity takes to expand.

CUDA Ecosystem vs. Dedicated Chips: Why Accelerated Computing is a Broader Game

Regarding competition from dedicated accelerators like Google TPU, Huang clearly outlined the battlefield. He emphasized that NVIDIA builds an “accelerated computing” platform, rather than merely a “tensor processing unit.” The applications of accelerated computing extend far beyond AI, encompassing numerous scientific and engineering fields such as molecular dynamics, quantum chromodynamics, fluid mechanics, particle physics, and data processing.

“NVIDIA has reinvented the way computing works, shifting from general computing to accelerated computing. Our market coverage is far greater than any TPU or ASIC could achieve,” Huang stated. He asserted that NVIDIA is the only platform capable of accelerating such a wide variety of applications, and that its large ecosystem lets any operator purchase and run its systems. That, in turn, allows NVIDIA to enter every cloud service provider and a wide range of industries.

Huang rejected the view that “AI is primarily predictable matrix multiplication, so dedicated chips are superior.” He pointed out that matrix multiplication is only one part of AI. Inventing new attention mechanisms, building hybrid architectures (such as those using SSMs), and combining diffusion with autoregressive techniques all require a programmable, general architecture. The flexibility and programmability of CUDA are crucial for rapid innovation in AI algorithms. “The only way to achieve leaps of 10 times or 100 times annually is to fundamentally change the algorithms and their computational methods each year. This is exactly NVIDIA's fundamental advantage.”
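To make the point about matrix multiplication concrete, here is a minimal pure-Python sketch of standard scaled dot-product attention. It is not taken from the interview and is not NVIDIA code; the function names are illustrative. It only shows that even a textbook attention step interleaves two matrix multiplications with element-wise scaling, exponentials, and row-wise reductions.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax: a max reduction, exponentials, a sum, and a division."""
    out = []
    for row in m:
        peak = max(row)  # subtracting the row maximum keeps exp() numerically stable
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention(q, k, v):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]                           # transpose K
    scores = matmul(q, k_t)                                        # matrix multiplication
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]   # element-wise scaling
    weights = softmax_rows(scaled)                                 # not a matrix multiplication
    return matmul(weights, v), weights                             # matrix multiplication

# A zero query attends equally to both keys, so the output is the mean of the value rows.
out, weights = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
print(out)  # [[2.0, 3.0]]
```

Variants such as new attention mechanisms or state-space layers change the non-matmul steps, which is where programmability matters in Huang's argument.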

The Ultimate Value of CUDA: Ecosystem, Installed Base, and Ubiquity

Even if large cloud vendors have the capability to write custom kernels for their specific architectures, is CUDA still indispensable to NVIDIA? Huang elaborated on the ultimate value of CUDA from three perspectives.

First is the richness, programmability, and capability of the ecosystem. CUDA supports all frameworks and has the most mature toolchain. When developers run into problems, they can suspect their own code rather than the underlying computer, because NVIDIA's thoroughly verified system provides a reliable foundation.

Second is the vast installed base. NVIDIA has hundreds of millions of GPUs deployed worldwide, from cloud to edge, from data centers to robotics. The software and models that developers write can run on the widest range of devices, which is highly valuable.

Third is ubiquitous deployment. NVIDIA's hardware is available from every major cloud provider and in on-premises deployments. For AI companies and developers unsure which cloud or deployment model they will use in the future, choosing CUDA maximizes flexibility and portability.

Huang further pointed out that NVIDIA has the engineers who know its architecture best. They work closely with AI labs on optimization and typically deliver an additional 2x performance boost to customers' stacks. “NVIDIA's computing stack has unparalleled total cost of ownership (TCO) performance in the world.” He challenged competitors to demonstrate their cost advantages in public benchmarks like MLPerf, and argued from first principles that other architectures would find it hard to beat NVIDIA on TCO.
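The link between throughput and TCO can be illustrated with simple arithmetic. The sketch below is a deliberately simplified cost model, not NVIDIA's or MLPerf's methodology: it counts only amortized hardware cost and electricity, and every number in the usage example is a made-up placeholder. Under those assumptions, doubling tokens per second on the same hardware halves the cost per token.

```python
HOURS_PER_YEAR = 24 * 365

def cost_per_million_tokens(hardware_cost, lifetime_years, power_kw,
                            price_per_kwh, tokens_per_second, utilization=1.0):
    """Simplified TCO per million tokens: amortized hardware plus energy only.

    Ignores cooling, networking, facilities, and staff (an assumption of this sketch).
    """
    hourly_hardware = hardware_cost / (lifetime_years * HOURS_PER_YEAR)
    hourly_energy = power_kw * price_per_kwh
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return (hourly_hardware + hourly_energy) / tokens_per_hour * 1_000_000

# Hypothetical system: all figures below are placeholders, not real prices or benchmarks.
baseline = cost_per_million_tokens(300_000, 4, 10.0, 0.08, tokens_per_second=50_000)
tuned = cost_per_million_tokens(300_000, 4, 10.0, 0.08, tokens_per_second=100_000)
print(f"baseline: ${baseline:.4f}  tuned: ${tuned:.4f}  ratio: {baseline / tuned:.1f}")
```

Because software tuning raises throughput without changing the hardware or energy terms, it lowers cost per token in direct proportion in this model.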

Investment Strategy, Cloud Business, and Pricing Philosophy

Reflecting on investments in star AI companies like Anthropic and OpenAI, Huang admitted to having missed early opportunities. At the time, he did not fully appreciate that these leading AI labs needed massive capital investments from their suppliers, at a scale venture capital could not provide. He stated that NVIDIA is now prepared and willing to make such investments to support the development of key AI companies.

When asked why NVIDIA does not use its massive cash reserves to become a cloud service provider (hyperscaler), Huang reiterated the company philosophy of “doing what must be done, as little as possible.” He believes that building an accelerated computing platform is the core task that no one else can do in NVIDIA's place, while the cloud service market already has many players. NVIDIA's role is to support “new cloud” ecosystem partners like CoreWeave and ensure that AI architectures reach as many industries and countries as possible, rather than to operate cloud businesses itself.

NVIDIA's allocation strategy during the chip shortage also drew attention. Huang denied any prioritization based on non-market factors like “supporting new clouds.” He emphasized that allocations are based on forecasts, the order in which purchase orders (POs) are placed, and the readiness of customer data centers, with the goal of maximizing throughput from its own factories. He firmly denied that NVIDIA uses an auction model where “the highest bidder wins.” “We never do that. It's a terrible business practice.” Huang stated that setting and maintaining stable prices matters more, because NVIDIA serves as a reliable cornerstone of the industry.
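As a rough illustration of what a non-auction allocation could look like, the sketch below combines the three factors Huang named: forecast, purchase-order sequence, and data center readiness. The actual rule NVIDIA uses is not described in the interview, so the `Order` fields, the `allocate` function, and the specific logic (serve orders in PO sequence, cap each at its forecast, skip sites that are not ready) are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Order:
    customer: str
    po_sequence: int        # position in which the purchase order was placed
    forecast_units: int     # units the customer forecast in advance
    requested_units: int    # units on the purchase order
    datacenter_ready: bool  # whether the site can actually receive hardware

def allocate(orders, supply):
    """Hypothetical allocation: first PO first, capped by forecast, ready sites only."""
    allocation = {}
    remaining = supply
    for order in sorted(orders, key=lambda o: o.po_sequence):
        if not order.datacenter_ready or remaining == 0:
            allocation[order.customer] = 0
            continue
        units = min(order.requested_units, order.forecast_units, remaining)
        allocation[order.customer] = units
        remaining -= units
    return allocation

orders = [
    Order("A", po_sequence=2, forecast_units=100, requested_units=120, datacenter_ready=True),
    Order("B", po_sequence=1, forecast_units=50, requested_units=50, datacenter_ready=True),
    Order("C", po_sequence=3, forecast_units=80, requested_units=80, datacenter_ready=False),
    Order("D", po_sequence=4, forecast_units=100, requested_units=100, datacenter_ready=True),
]
print(allocate(orders, supply=200))  # {'B': 50, 'A': 100, 'C': 0, 'D': 50}
```

Note that no price or bid appears anywhere in the inputs, which is the contrast with a “highest bidder wins” model.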

Controversy in the Chinese Market: Opposing Extreme Export Controls

Regarding chip export controls on China, Huang expressed clear, strong opposition. He believes that treating China as an extreme adversary that must be completely cut off from technology is naive and harmful.

Huang pointed out that China has 60% of the world's mature-process chip capacity, vast energy resources, and about 50% of global AI researchers, and is one of the largest contributors to open-source software and models. Currently, China's AI ecosystem is built primarily on NVIDIA's American technology stack. If extreme policies force China to develop a completely independent technology stack, the U.S. will have voluntarily given up the world's second-largest market. The Chinese ecosystem would then focus on its own architecture, which may ultimately form a standard system parallel to, or even competing with, the American technology stack.

“This is damaging to the American tech industry, national security, and technological leadership, just for the (imaginary) interests of a single company, which makes no sense,” Huang asserted. He believes that a wiser approach is to maintain dialogue and research exchanges so that global AI developers (including Chinese developers) continue to build on the American technology stack, and so that AI progress, especially in open source, benefits the American ecosystem. He warned against repeating in semiconductors the kind of extreme policies once applied in telecommunications, which he said led to the U.S. losing control over its own telecommunications network.

Future Outlook: Architecture Innovation and the Essence of Accelerated Computing

At the end of the interview, Huang looked to the future. Even without the AI revolution, he believes NVIDIA would still succeed on the strength of its core concept of “accelerated computing.” He believes that the scaling of general-purpose computing (CPUs) is nearing its limits, while accelerating domain-specific workloads through GPUs and CUDA is key to breaking bottlenecks in science and engineering. From computer graphics to physical simulation and data processing, accelerated computing has ample applications.

He revealed that NVIDIA is expanding its product boundaries according to market demands, such as acquiring Groq and integrating it into the CUDA ecosystem to provide ultra-low latency inference options to meet the demand for “high-end Tokens,” further segmenting the inference market.

When asked whether NVIDIA would explore completely different chip architectures (such as wafer-scale chips or large packages) as other companies have, Huang stated that NVIDIA has the capability to do so, but simulation results indicate that those architectures are “provably worse.” “If we had more money, I would invest it more in NVIDIA's architecture,” he concluded, with confidence stemming from a deep understanding and ongoing validation of its own technological path.

