Source: TMTPOST
Author: Lin Zhijia
Editor: Ma Jinman
This article was first published on the TMTPOST app
These three new NVIDIA AI chips are not "improved versions" but "downgraded versions." Among them, the HGX H20 is limited in bandwidth, computing speed, and other respects. The H20 is expected to cost less than the H100, but still more than the domestic 910B AI chip.

Image Source: Generated by Wujie AI
On November 10, there were reports that chip giant NVIDIA will launch three AI chips for the Chinese market based on the H100 to cope with the latest chip export controls from the United States.
The specification documents show that NVIDIA is about to launch new products for Chinese customers named HGX H20, L20 PCIe, and L2 PCIe, based on NVIDIA's Hopper and Ada Lovelace architectures. Judging from the specifications and naming, the three products target training, inference, and edge scenarios. They will be announced as early as November 16, with product sampling from November to December this year and mass production from December this year to January next year.
TMTPOST has learned from multiple NVIDIA industry chain companies that the above information is true.
TMTPOST has also learned exclusively that these three NVIDIA AI chips are not "improved versions" but "downgraded versions." Among them, the HGX H20, intended for AI model training, is limited in bandwidth, computing speed, and other respects. Its overall computing power is theoretically about 80% lower than that of the NVIDIA H100 GPU, meaning the H20 delivers about 20% of the H100's comprehensive computing power, while the added HBM memory and NVLink interconnect modules raise the cost of that computing power. Therefore, although the H20 will be priced below the H100, it is still expected to cost more than the domestic 910B AI chip.
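To make the 20% figure concrete, here is a minimal sketch of how a drop in relative computing power would stretch a training job. It assumes the job is purely compute-bound and scales linearly with throughput, which real workloads rarely do; the function name and the 20-day baseline are illustrative, not taken from NVIDIA documentation.

```python
def scaled_training_days(baseline_days: float, relative_compute: float) -> float:
    """Estimate training time on a slower chip.

    Assumes training time is inversely proportional to computing power
    (a simplification that ignores memory, interconnect, and software effects).
    """
    if not 0 < relative_compute <= 1:
        raise ValueError("relative_compute must be in the range (0, 1]")
    return baseline_days / relative_compute


# Hypothetical job: 20 days on the H100, rerun on a chip with ~20% of its compute.
h20_days = scaled_training_days(baseline_days=20, relative_compute=0.20)
slowdown = h20_days / 20

print(f"Estimated days on H20: {h20_days:.0f}")
print(f"Slowdown factor: {slowdown:.1f}x")
```

Under this simplification, a chip with one fifth of the computing power needs about five times as long for the same job.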
"This is equivalent to widening the lanes of a highway, but not widening the entrance of toll stations, which restricts traffic. Similarly, technically, through hardware and software locks, the performance of the chip can be precisely controlled without the need for large-scale replacement of production lines. Even if the hardware is upgraded, the performance can still be adjusted as needed. Currently, the new H20 has 'capped' the performance from the source," explained an industry insider about the new H20 chip. "For example, a task that originally took 20 days with H100 may now take 100 days with H20."
Despite the new round of chip restrictions imposed by the United States, NVIDIA appears not to have given up on China's huge AI computing market.
So, can domestic chips replace them? TMTPOST has learned that, in current tests of large-model inference, the domestic 910B AI chip reaches only about 60%-70% of the A100's performance, and sustained model training on 910B clusters is difficult. The 910B also consumes far more power and generates far more heat than NVIDIA's A100/H100 series, and it is not compatible with CUDA, making it hard to fully meet the long-term model training needs of intelligent computing centers.
As of now, NVIDIA has not made any comments on this.
It is reported that on October 17 this year, the Bureau of Industry and Security (BIS) of the U.S. Department of Commerce issued new export control regulations for chips, imposing new export controls on semiconductor products, including high-performance AI chips from NVIDIA; the restrictions took effect on October 23. NVIDIA's filing with the U.S. SEC shows that the immediately banned products include A800, H800, and L40S, the most powerful AI chips.
In addition, the L40 and RTX 4090 retain the original 30-day window.
On October 31, there were reports that NVIDIA may be forced to cancel a $5 billion order for advanced chips, and NVIDIA's stock price plummeted at one point on the news. Under the new U.S. regulations, NVIDIA's China-specific A800 and H800 can no longer be sold normally in the Chinese market. These two chips had been referred to as "cut-down versions" of the A100 and H100, because NVIDIA had reduced their performance to comply with the previous U.S. regulations.
On October 31, Zhang Xin, a spokesperson for the China Council for the Promotion of International Trade, stated that the new U.S. rules on semiconductor exports to China further tighten restrictions on exports of AI-related chips and semiconductor manufacturing equipment to China and add multiple Chinese entities to the export control "entity list." According to Zhang, these measures seriously violate market economic principles and international economic and trade rules and exacerbate the risk of fragmentation in the global semiconductor supply chain. They have been changing global chip supply and demand since the second half of 2022 and caused a supply imbalance in 2023, affecting the global chip industry landscape and harming the interests of enterprises in many countries, including China.


Performance comparison of NVIDIA HGX H20, L20, L2, and other products
TMTPOST has learned that the new HGX H20, L20, and L2 AI chip products, based on NVIDIA's Hopper and Ada architectures, are suitable for cloud training, cloud inference, and edge inference.
For the latter two, L20 and L2, similar "domestic alternatives" and CUDA-compatible solutions for AI inference already exist. The HGX H20, by contrast, is an AI training chip based on the H100 and cut down through firmware, intended mainly to replace the A100/H800. For model training, China has few comparable domestic solutions to NVIDIA's.
The documents show that the new H20 uses advanced CoWoS packaging and increases its HBM3 (high-performance memory) capacity to 96GB, which also raises the cost by $240. The H20's FP16 dense computing power is about 148 TFLOPS (trillions of floating-point operations per second), roughly 15% of the H100's, so additional algorithm and personnel costs are required. NVLink bandwidth has been raised from 400GB/s to 900GB/s, so the interconnect rate will improve greatly.
According to the evaluation, H100/H800 is currently the mainstream choice for computing clusters in practice. The theoretical limit for the H100 is a 50,000-card cluster with up to 100,000P of computing power; the largest practical H800 cluster is 20,000-30,000 cards, totaling 40,000P; and the largest practical A100 cluster is 16,000 cards, with up to 9,600P.
The new H20 also has a theoretical limit of 50,000 cards per cluster, but at 0.148P per card that totals only about 7,400P, lower than the H100/H800 and A100 figures. The H20 cluster therefore falls far short of the H100's theoretical scale, and once computing power is balanced against communication, a reasonable median for overall computing power is about 3,000P. Training models with trillions of parameters would require additional cost and more computing capacity.
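The cluster arithmetic in the preceding paragraphs can be checked with a short sketch. It uses only the card counts and totals quoted in the article; the per-card values for the H100 and A100 are back-calculated from those totals and are an assumption of this sketch, not official specifications. The helper names are illustrative.

```python
def cluster_total_p(cards: int, per_card_p: float) -> float:
    """Aggregate peak computing power (in P) of a cluster, ignoring communication losses."""
    return cards * per_card_p


def implied_per_card_p(total_p: float, cards: int) -> float:
    """Per-card computing power (in P) implied by a quoted cluster total."""
    return total_p / cards


# Figures quoted in the article.
h20_total = cluster_total_p(cards=50_000, per_card_p=0.148)

# Back-calculated from the quoted totals (assumption: totals are simple products).
h100_per_card = implied_per_card_p(total_p=100_000, cards=50_000)
a100_per_card = implied_per_card_p(total_p=9_600, cards=16_000)

# Share of the H20's theoretical total represented by the ~3,000P "reasonable median".
h20_median_share = 3_000 / h20_total

print(f"H20 theoretical total: {h20_total:,.0f}P")
print(f"Implied H100 per card: {h100_per_card:.2f}P")
print(f"Implied A100 per card: {a100_per_card:.2f}P")
print(f"H20 median as share of theoretical total: {h20_median_share:.0%}")
```

The sketch treats a cluster's total as a simple product of card count and per-card power; the article's lower "reasonable median" for the H20 reflects the communication overhead that this product ignores.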
Two semiconductor industry experts told TMTPOST that based on the current estimated performance parameters, it is very likely that NVIDIA's B100 GPU products will no longer be sold to the Chinese market next year.
Overall, for large enterprises that want to train large models at the parameter scale of GPT-4, the scale of the computing cluster is crucial. Currently, only the H800 and H100 can handle such training, while the domestic 910B, whose performance falls between the A100 and H100, is only a "last resort backup option."
The new H20 that NVIDIA is now introducing is better suited to vertical model training and inference and cannot meet the requirements of training trillion-parameter models. Its overall performance, however, is slightly higher than the 910B's, and it comes with the NVIDIA CUDA ecosystem, which closes off the only path that domestic cards might have had in China's AI chip market under the U.S. chip restrictions.
According to the latest financial report, of NVIDIA's $13.5 billion in sales for the quarter ended July 30, more than 85% came from the United States and China, and only about 14% came from other countries and regions.
Impacted by the H20 news, as of the close of trading on November 9, NVIDIA's stock price rose slightly by 0.81% to $469.5 per share. Over the past five trading days, NVIDIA has risen by more than 10%, with a market value of $1.16 trillion.