SemiAnalysis dissects Huawei's Kirin 9030: The manufacturing process has stalled, so the chip is folded.

Export controls have not stopped China's chip progress, but they have changed the path and cost of that progress.

Written by: ChaoXiang Research

In semiconductor reverse engineering, TechInsights has dominated for decades. Last weekend, Dylan Patel's SemiAnalysis released the first public teardown report from its STEEL laboratory (Teardown Engineering & Evaluation Lab), targeting one of the most closely watched chips in the world: the Kirin 9030 Pro in the Huawei Mate 80 Pro, built on SMIC's most advanced process, N+3.

The timing is intriguing. TechInsights is being sold by private equity, while SemiAnalysis's revenue has already surpassed that of this old giant. Dylan chose to strike at this moment with a highly technical teardown report, combined with real chip photos from an Oregon lab.

The title of the report is a bombshell: The minimum metal pitch (M0 pitch) of SMIC N+3 is only 32.5nm, smaller than the 36nm used in Intel's latest Panther Lake processor with 18A technology.

Has SMIC achieved a finer metal pitch than Intel without EUV lithography?

If this headline is taken at face value, it could set the entire semiconductor world ablaze, but SemiAnalysis itself douses the fire in the second paragraph of the report, stating that this is a "cherry picked metric," a deliberately selected indicator.

This article walks through the teardown report.

Density Catches Up, at a Steep Price

The SMIC N+3 process indeed matches TSMC's N6 in terms of transistor density.

Using TEM (transmission electron microscopy) cross-section analysis, the STEEL laboratory measured the Bohr density of N+3 (the weighted NAND2 and scan flip-flop transistor-density metric proposed by Intel's Mark Bohr) at 113.4 MTr/mm², slightly above TSMC's N6 at 107.7 MTr/mm². The cell height shrinks from 252nm in N+2 to 228nm, and the contacted gate pitch (CGP) shrinks from 63nm to 57nm. Taken together, these numbers mean that SMIC has reached logic density equivalent to TSMC's mature 7nm-class level without EUV.
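The arithmetic behind these figures can be checked with a short sketch. It assumes, as a first-order simplification that is not taken from the report, that standard-cell area scales with cell height times gate pitch; the function name `area_scale` is illustrative only.

```python
# Figures quoted above (nm and MTr/mm^2).
N2_CELL_HEIGHT, N3_CELL_HEIGHT = 252.0, 228.0
N2_CGP, N3_CGP = 63.0, 57.0
N3_DENSITY, TSMC_N6_DENSITY = 113.4, 107.7


def area_scale(h_old, p_old, h_new, p_new):
    """Ratio of new to old cell footprint, assuming area ~ height x gate pitch."""
    return (h_new * p_new) / (h_old * p_old)


scale = area_scale(N2_CELL_HEIGHT, N2_CGP, N3_CELL_HEIGHT, N3_CGP)
density_gain = 1.0 / scale                      # ideal density uplift, N+2 -> N+3
lead_over_n6 = N3_DENSITY / TSMC_N6_DENSITY - 1.0

print(f"cell area scale N+2 -> N+3: {scale:.3f}")
print(f"ideal density gain:         {density_gain:.3f}x")
print(f"N+3 lead over N6:           {lead_over_n6:.1%}")
```

Under that assumption the cell footprint shrinks to roughly 82% of its N+2 size, and the measured N+3 density sits about 5% above N6.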

What is the cost?

SMIC's M0 layer uses self-aligned quadruple patterning (SAQP), in which two successive rounds of spacer deposition and etching turn each lithographically printed line into four, quartering the pitch. TSMC's N6 needs only self-aligned double patterning (SADP) at the same layer. Quadruple patterning means more process steps and photomasks, tighter overlay requirements, a more complex process flow, and higher cost.
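A rough sketch shows why a 32.5nm pitch pushes a DUV-only flow toward quadruple patterning. The lithography parameters are assumptions, not figures from the report: a 193nm immersion scanner with NA 1.35 and the theoretical Rayleigh floor k1 = 0.25. Practical limits are somewhat looser, and the 40nm comparison pitch is purely illustrative.

```python
def rayleigh_min_pitch(wavelength_nm=193.0, na=1.35, k1=0.25):
    """Smallest single-exposure pitch from the Rayleigh criterion: 2 * k1 * lambda / NA."""
    return 2.0 * k1 * wavelength_nm / na


def min_multiplication(final_pitch_nm, litho_limit_nm=None):
    """Smallest self-aligned pitch multiplication (1, 2 or 4) whose printed
    pitch stays at or above the single-exposure limit."""
    if litho_limit_nm is None:
        litho_limit_nm = rayleigh_min_pitch()
    for factor in (1, 2, 4):
        if final_pitch_nm * factor >= litho_limit_nm:
            return factor
    raise ValueError("pitch too small even for quadruple patterning")


M0_PITCH_NM = 32.5                       # N+3 M0 pitch quoted in the report
factor = min_multiplication(M0_PITCH_NM)
printed_pitch = M0_PITCH_NM * factor     # pitch the scanner actually has to print

print(f"assumed DUV single-exposure floor: {rayleigh_min_pitch():.1f} nm")
print(f"32.5 nm needs x{factor} multiplication (printed pitch {printed_pitch:.0f} nm)")
print(f"illustrative 40 nm pitch needs x{min_multiplication(40.0)}")
```

With these assumptions, doubling would require printing a 65nm pitch, below the roughly 71.5nm floor, so the flow has to fall back to quadrupling from a 130nm printed pitch.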

SemiAnalysis directly observed the cost of SAQP in the cross-sectional images: the M0 trenches of N+3 show a distinctly inverted trapezoidal profile (the bottom narrower than the top), with a clearly defined barrier layer enrichment band at the trench bottom. Although this morphology facilitates copper filling, at a pitch of 32.5nm, the difficulty of process control escalates sharply.

To put it in a way a trader can understand: SMIC produces the same denomination of banknotes, but the printing cost of each note is several times that of TSMC, and the yield risk is greater. The density is the same, but the economics are entirely different.

Kirin 9030: Maximizing Every Inch of Silicon under Constraints

The chip design capability of Huawei HiSilicon is another story altogether.

In terms of chip area, the Kirin 9030 is nearly the same size as the previous generation 9020 (about 140mm²), but it has packed in more components: the CPU has upgraded from 1 large core + 3 medium cores to 1 large + 4 medium, the GPU compute units have increased from 4 to 6, and the NPU has an additional Tiny core, with all levels of cache expanded. The density increase of N+3 allows Huawei to include more logic units in the same chip size.

In performance, the STEEL laboratory referenced publicly available benchmark data, clearly stating: the GPU performance of the Kirin 9030 (Maleoon 935) has roughly matched the flagship level of 2022, with a 3DMark WLE score that is 70% higher than the previous generation, slightly surpassing the Snapdragon 8+ Gen 1, but lagging behind current flagship Snapdragon 8 Elite Gen 5 by a factor of 2.4 to 2.6.

The situation with the CPU is even more telling. The large core TaiShan Prime's performance per clock (IPC) is roughly at the level of the Arm Cortex-X2, a design from 2021. Apple's M1 Firestorm core, released in 2020, still has an IPC that exceeds it by 35%. The latest Apple M5 P core boasts an IPC that is 60% higher, with absolute performance at 2.7 times the Kirin 9030.
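The IPC and absolute-performance figures together imply a clock-speed gap. The sketch below assumes the first-order model that single-thread performance is roughly IPC times clock frequency, and it takes the 2.75GHz large-core clock quoted later in this article; the derived M5 clock is an implication of that model, not a figure from the report.

```python
KIRIN_9030_CLOCK_GHZ = 2.75   # large-core clock quoted later in the article
M5_IPC_RATIO = 1.60           # M5 P-core IPC relative to TaiShan Prime ("60% higher")
M5_PERF_RATIO = 2.70          # M5 absolute performance relative to Kirin 9030


def implied_clock_ratio(perf_ratio, ipc_ratio):
    """Clock ratio implied by perf ~ IPC x frequency (first-order model)."""
    return perf_ratio / ipc_ratio


clock_ratio = implied_clock_ratio(M5_PERF_RATIO, M5_IPC_RATIO)
implied_m5_ghz = KIRIN_9030_CLOCK_GHZ * clock_ratio

print(f"implied clock ratio: {clock_ratio:.2f}x")
print(f"implied M5 P-core clock: {implied_m5_ghz:.2f} GHz")
```

Under this model, more than half of the 2.7x gap comes from frequency rather than IPC, which is the part most directly tied to the process node.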

The root of the gap lies not in design but in the manufacturing process. Apple and Qualcomm use TSMC's N4 and N3P processes, whose advantage shows up directly in the voltage-frequency curve: they pack more transistors into the same area and reach higher frequencies at the same power consumption. Huawei's core design is on par with the previous generation of industry-leading cores, but it is stuck on a manufacturing process two generations behind.

When Processes Hit a Wall, Huawei Prepares to "Fold"

The most forward-looking part of the report covers the τ-scaling law and LogicFolding roadmap that Huawei announced at the 2026 ISCAS conference.

Traditional semiconductor scaling has progressed in two dimensions: making transistors smaller and making metal lines finer. Moore's Law has been doing this for decades. Huawei's proposed τ-scaling shifts the optimization goal from spatial domain to time domain, focusing on reducing the time costs of data movement and processing, including transistor switching delays, signal propagation delays, and computing and storage delays.

LogicFolding is the engineering implementation of this theory. Simply put, it involves splitting the same logical module into two layers stacked face-to-face and connected through ultrafine pitch hybrid bonding. The direct benefit of this approach is reducing the longest signal path. In modern chips, a significant portion of power consumption and delay is spent driving long interconnects and relay buffers. By vertically folding the logic, the critical path shortens, frequency can increase, and power consumption can decrease.
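The geometric intuition behind folding can be sketched with an idealized model. It assumes a square logic block, Manhattan (horizontal plus vertical) routing, a negligible vertical hop through the hybrid bond, and unrepeated wire delay growing with the square of length. None of these assumptions come from Huawei's paper, and the 4mm² block area is hypothetical.

```python
import math


def worst_case_manhattan(area_mm2, layers=1):
    """Corner-to-corner Manhattan distance of a square block whose area
    is split evenly across `layers` stacked tiers."""
    side = math.sqrt(area_mm2 / layers)
    return 2.0 * side


BLOCK_AREA_MM2 = 4.0  # hypothetical block area, for illustration only

flat = worst_case_manhattan(BLOCK_AREA_MM2, layers=1)
folded = worst_case_manhattan(BLOCK_AREA_MM2, layers=2)

length_ratio = folded / flat          # 1 / sqrt(2), independent of block area
rc_delay_ratio = length_ratio ** 2    # unrepeated RC wire delay ~ length^2

print(f"longest path: {flat:.2f} mm flat, {folded:.2f} mm folded")
print(f"length ratio: {length_ratio:.3f}, unrepeated RC delay ratio: {rc_delay_ratio:.2f}")
```

In this idealized picture, folding into two tiers shortens the longest in-plane path by about 29% and halves the unrepeated wire delay. Real gains depend on placement, bond pitch, and thermal limits.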

Huawei has provided an aggressive roadmap: The large core frequency of the Kirin 9030 is 2.75GHz, with laboratory samples already achieving 3.39GHz, aiming to reach 5GHz by 2031, while pushing equivalent density to 295 MTr/mm² through 3D stacking, matching TSMC's 14A level.

SemiAnalysis remains cautious. They point out that Huawei's density calculation method differs from traditional foundries: the density of 3D stacking is calculated based on package area, stacking multiple layers of active logic together, naturally yielding a higher number. If the same method were used to calculate AMD's MI450X (N2 top layer + N3P bottom layer), the theoretical density could reach 460.2 MTr/mm², far exceeding Huawei's target for 2031.
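A minimal sketch of the footprint-based accounting makes the caveat concrete. It assumes two fully populated tiers with identical footprints, which the report does not state for Huawei's design. The N+4 figure is the theoretical density quoted later in this article.

```python
def footprint_density(layer_densities):
    """Equivalent density per unit footprint when equal-area tiers are stacked:
    the per-tier densities (MTr/mm^2) simply add."""
    return sum(layer_densities)


N3_DENSITY = 113.4       # measured N+3 density
N4_DENSITY = 137.8       # theoretical N+4 density quoted later in the article
TARGET_2031 = 295.0      # Huawei's stated equivalent-density target

two_tier_n3 = footprint_density([N3_DENSITY, N3_DENSITY])
two_tier_n4 = footprint_density([N4_DENSITY, N4_DENSITY])
per_tier_needed = TARGET_2031 / 2   # per-tier density needed with two equal tiers

print(f"2 x N+3: {two_tier_n3:.1f} MTr/mm^2")
print(f"2 x N+4: {two_tier_n4:.1f} MTr/mm^2")
print(f"per-tier density needed for the target: {per_tier_needed:.1f} MTr/mm^2")
```

Counted this way, two stacked N+3 tiers already reach about 227 MTr/mm², which is why a footprint-based figure is not directly comparable with a single-layer foundry density.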

However, the direction itself is noteworthy. By pursuing this path, Huawei is essentially taking work that normally belongs to the foundry into the hands of a system design company, because process limitations leave it little choice. AMD's V-Cache applies 3D stacking to cache, and AMD's MI350X moves IO and interconnect to the lower die, but Huawei aims to go further by splitting a single logic block and distributing it across vertical layers, which is a different level of engineering challenge.

Export Controls Reshape the Dimensions of Competition

SemiAnalysis's final conclusion is straightforward: Export controls have not stopped China's chip progress, but they have changed the path and cost of that progress.

SMIC's N+3 proves that it is possible to achieve N6-level logic density without EUV. However, this path comes at a higher cost, with more complex processes and greater challenges in yield control. From here, the marginal difficulty of every step increases: more photomasks, stricter overlay precision, and more costly multiple patterning. Theoretically, N+4 could reach 137.8 MTr/mm² (matching TSMC's N5), and N+5 with backside power delivery could even approach Intel's 18A HP library. But each step is harder, more expensive, and has less margin for error than the last.

Meanwhile, SMIC's N+2 and N+3 processes are being transitioned to Huahong, with design companies like Alibaba's Pingtouge and Cambricon possibly becoming beneficiaries. Chip manufacturing knowledge is spreading from a single foundry to an ecosystem, further diluting the effectiveness of sanctions aimed at a single company.

On the design end, Huawei and Peking University are developing domestic EDA tool prototypes for LogicFolding. This does not equate to replacing the complete toolchain from Synopsys and Cadence, but domestic EDA is evolving towards "architecture-process-package collaborative optimization."

An interesting detail: STEEL discovered during the teardown that the DRAM paired with the Kirin 9030 Pro comes from Samsung (K4L2E165YD, LPDDR5X-9600, 1a process node), while the 16GB Pro Max version appears with both Samsung and Changxin Memory (CXMT) packages. The Changxin package carries a date code of week 45 of 2025, and its process density is comparable to the industry's 1z level. This suggests that Chinese memory chips have started entering Huawei's flagship supply chain, despite being one to two generations behind Samsung and SK Hynix in process technology.

For investors, the real signal worth tracking is whether Huawei's 3D stacking roadmap can enable Chinese-made chips to meet the necessary thresholds in scenarios such as mobile phones, AI inference, and network devices while keeping costs under control.

Once sufficiency is established, the strategic value of this supply chain will be repriced.
