The AI PC has arrived, running a 120B model locally: NVIDIA redefines the foundation of the "personal AI computer" with RTX Spark

PANews
1 hour ago

Over the past two years, PC makers have leaned on one specification when promoting "AI PCs": NPU computing power. Yet whether it is Intel's Lunar Lake at 45 TOPS or AMD's Strix Point at 50 TOPS, these figures remain modest. They are enough for background blur, voice noise reduction, and small on-device models, but not much more.

On May 31, at the GTC 2026 conference, NVIDIA unveiled the RTX Spark superchip and raised that number to 1 petaflop, or 1000 TOPS. This is not a 30% or 50% gain but a jump of more than an order of magnitude.

Several other announcements came at the same event. Microsoft upgraded Windows' native security mechanisms for RTX Spark, and NVIDIA's open-source sandbox runtime, OpenShell, came to the Windows platform. Adobe announced a ground-up rebuild of Photoshop and Premiere tailored to RTX Spark's unified memory architecture. The first six OEMs confirmed that they will launch thin-and-light laptops and compact desktops built on the chip this autumn.

NVIDIA did more at this GTC than launch a new chip. The company is attempting to set a new hardware standard for the "personal AI computer" category.


When GPUs Become the Main Character of PCs

First, the chip itself. According to the data NVIDIA released at GTC, RTX Spark integrates a Blackwell-architecture GPU with 6144 CUDA cores and a 20-core Arm-based Grace CPU co-designed with MediaTek, built on TSMC's 3nm process. The key change is in the memory architecture: up to 128GB of unified memory, where the CPU and GPU share a single memory pool and data no longer has to be copied back and forth between the two.

This runs counter to the architectural logic PCs have followed until now.

The traditional PC is built around "an x86 CPU as the main processor, with a discrete GPU as an optional add-on". Even with the recent rise of AI PCs, Intel and AMD have taken the approach of embedding an NPU in the CPU as an extra AI acceleration block, generally rated at forty to fifty TOPS. The GPU remains an "external component."

RTX Spark reverses those roles. In this SoC the GPU is the main character and the CPU plays a supporting part. NVIDIA quotes AI computing power of 1 petaflop at FP4, equivalent to 1000 TOPS, which is twenty times or more the NPU computing power built into the previous generation of AI PCs. This is not faster running on the same track; it is a new starting line on a different track.
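
The "twenty times" comparison can be checked with simple arithmetic. The sketch below uses only the figures quoted in this article; the function name is illustrative. Note that it compares headline numbers only: NVIDIA's figure is FP4, while NPU TOPS are usually quoted at INT8, so the ratio is indicative rather than like-for-like.

```python
# Headline AI compute figures as quoted in the article (not measured values).
SPARK_TOPS = 1000  # 1 petaflop FP4, per NVIDIA

NPU_TOPS = {
    "Intel Lunar Lake": 45,
    "AMD Strix Point": 50,
}


def headline_ratio(npu_tops: float, spark_tops: float = SPARK_TOPS) -> float:
    """Return how many times larger the RTX Spark headline figure is."""
    return spark_tops / npu_tops


for name, tops in NPU_TOPS.items():
    print(f"{name}: {headline_ratio(tops):.1f}x")
```

On these numbers the ratio is about 22x against the 45 TOPS part and 20x against the 50 TOPS part.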

The speed at which OEMs are following confirms this judgment. According to NVIDIA's official announcements and subsequent reports from DIGITIMES, ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI will launch thin-and-light laptops and compact desktops equipped with RTX Spark this autumn, with Acer and Gigabyte models to follow. Almost every mainstream Windows PC brand is on board.


RTX Spark did not come out of nowhere. In early 2025, a similar Blackwell-plus-Grace chip appeared under the names Project DIGITS and DGX Spark, but it was positioned as a Linux desktop supercomputer for developers, in a chassis close to the size of a small desktop. A year later, the same architecture has been squeezed into the thermal envelope of a thin-and-light laptop, the operating system has moved from Linux to Windows, and the target audience has widened from AI developers to ordinary consumers and enterprises. This is the most noteworthy change in the GTC 2026 consumer launch: NVIDIA is not shipping a developer toy but opening the door to the consumer market.

Running a 120B Model Locally, Is It Enough?

The numbers for computing power and memory ultimately need to answer one question: What can it do?

NVIDIA's answer at the press conference was that RTX Spark can run 120B-parameter models locally, with a context window of up to one million tokens. What does 120B mean? For reference, on current consumer hardware an RTX 4090 with 24GB of video memory can run models in the 30B to 40B range once they are quantized, while models around 9B run quickly on ordinary consumer graphics cards. Moving from that range to 120B redefines what counts as "adequate" for on-device AI.

The 128GB of unified memory is the precondition for all of this. In a traditional PC, the CPU has its own system memory and the GPU has its own video memory, with a physical boundary between them. A model larger than the video memory either cannot run at all or requires complex model splitting and memory swapping, which slows inference considerably. Unified memory removes this bottleneck: model data sits directly in the 128GB shared pool, accessible to both the CPU and the GPU. Apple was the first to prove this route viable for consumers with Apple Silicon, and NVIDIA is now bringing it to the Windows camp.
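
Why 128GB matters can be seen from a rough estimate of how much memory the model weights alone need. This is a minimal sketch, not a sizing tool: the 20% overhead for KV cache, activations, and runtime buffers is an assumed placeholder, and the function names are illustrative.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_memory(params_billion: float, bits_per_weight: float,
                   memory_gb: float, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an ASSUMED overhead fraction fit in memory_gb."""
    needed = weight_footprint_gb(params_billion, bits_per_weight)
    return needed * (1 + overhead_fraction) <= memory_gb


# A 120B model at 4-bit versus 16-bit precision, against a 128GB pool
print(weight_footprint_gb(120, 4), fits_in_memory(120, 4, 128))
print(weight_footprint_gb(120, 16), fits_in_memory(120, 16, 128))
# A 30B model at 4-bit, and the 120B model, on a 24GB graphics card
print(fits_in_memory(30, 4, 24), fits_in_memory(120, 4, 24))
```

Under these assumptions, a 120B model needs about 60GB at 4-bit precision, which fits in 128GB but is far beyond a 24GB card; at 16-bit it needs about 240GB and does not fit in either.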

Beyond large model inference, NVIDIA's listed use cases include 12K video editing, rendering 3D scenes larger than 90GB, and ray-traced gaming above 100fps at 1440p. What these scenarios share is the very large amount of data handled at once; a traditional PC either takes several times as long or cannot run the workload at all.

There is still a gap between "supports running" and "smooth to use." NVIDIA has not disclosed the actual inference speed of a 120B model on RTX Spark, nor the first-token latency with a one-million-token context. The key metric for long-context inference speed is memory bandwidth. For reference, DGX Spark, which uses the GB10 chip, has a measured memory bandwidth of about 301GB/s. That level of bandwidth can run a 120B model, but with a one-million-token context window users may need to wait several seconds before the first output token appears. The laptop version of RTX Spark may also ship with adjusted bandwidth because of power constraints.
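
To see why memory bandwidth dominates, a common back-of-the-envelope bound assumes that generating each token requires reading every active weight from memory once. The sketch below applies that simplification to the 301GB/s figure cited above. It ignores KV cache reads, compute limits, and any mixture-of-experts sparsity, so it is an optimistic upper bound rather than a prediction; the function name and the 4-bit precision are assumptions.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             active_params_billion: float,
                             bits_per_weight: float) -> float:
    """Upper bound on decode speed if every active weight is read once per token."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Dense 120B model, ASSUMED 4-bit weights, at the cited ~301 GB/s
dense = decode_tokens_per_second(301, 120, 4)
print(f"dense 120B @ 4-bit: about {dense:.1f} tokens/s at best")
```

On these assumptions a dense 120B model tops out at roughly 5 tokens per second. A model that activates only a fraction of its parameters per token would read less data and could decode faster, which is one reason the architecture of the 120B model in question matters.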

Adding a Security Layer for AI Agents

The other core announcement, beyond computing power, is NVIDIA's system-level collaboration with Microsoft. It may be the easiest part of the GTC 2026 consumer launch to overlook, yet it could have the most far-reaching impact on the industry.

When a computer that can run a 120B model is operated by an AI agent able to control the desktop, click buttons, and read and write files on its own, the security question shifts from "will data be lost?" to "will the agent do something you do not want it to do?" Until that question is answered, enterprises cannot realistically deploy such devices to employees.

Microsoft and NVIDIA's answer is two lines of defense. First, Microsoft has upgraded Windows' native security mechanisms to monitor and constrain AI agent behavior at the operating system level. Second, NVIDIA has formally brought the OpenShell runtime to the Windows platform. According to NVIDIA's official documentation, OpenShell is an open-source sandbox runtime that offers kernel-level isolation. It marks out a controlled operating range for AI agents: they can carry out tasks independently within that range, but with strictly limited permissions that prevent unauthorized access to core system files, network connections, or sensitive user data.
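
The idea of a bounded operating range can be illustrated with a simple allowlist check. The sketch below is purely hypothetical: `AgentPolicy`, its fields, and the example paths are invented for illustration and are not OpenShell's API or configuration format. It is also an application-level check, not the kernel-level isolation NVIDIA describes.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist policy. NOT OpenShell's real configuration."""
    allowed_roots: tuple = ()               # directories the agent may touch
    allowed_hosts: frozenset = frozenset()  # hosts the agent may contact

    def may_access_path(self, path: str) -> bool:
        target = PureWindowsPath(path)
        # Reject parent-directory traversal outright instead of resolving it.
        if ".." in target.parts:
            return False
        return any(target.is_relative_to(root) for root in self.allowed_roots)

    def may_connect(self, host: str) -> bool:
        return host.lower() in self.allowed_hosts


policy = AgentPolicy(
    allowed_roots=("C:/Users/agent/workspace",),
    allowed_hosts=frozenset({"localhost"}),
)

print(policy.may_access_path("C:/Users/agent/workspace/report.txt"))
print(policy.may_access_path("C:/Windows/System32/config"))
print(policy.may_connect("example.com"))
```

Everything is denied unless explicitly allowed, which is the property that makes agent behavior auditable. A real sandbox enforces such rules below the application, so the agent cannot simply bypass the check.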

This combination has clear significance for enterprise procurement. Previously, the concept of "local AI agents" was still at the technical demonstration stage. The hardware could run, but the security framework was empty. No IT department dared to include devices in this state on their procurement lists. By inserting a layer of standardized isolation between hardware and applications, NVIDIA and Microsoft have transformed "usable" into "manageable."

OpenShell's own performance overhead remains to be seen. Sandbox isolation usually costs some performance, but NVIDIA has not yet said how much it affects inference speed or system responsiveness. How complex it is for enterprise IT to deploy and manage, and how well it fits with existing security policies, are practical questions that can only be answered once OEM devices ship.

Why Adobe is Willing to "Reconstruct from the Bottom Up"

The willingness of software manufacturers to cooperate is often a barometer of whether a new hardware platform can stabilize its footing.

Adobe's announcement at GTC is the strongest software-side signal in this launch. According to NVIDIA's official blog, and as confirmed by Adobe executives, Adobe is beginning a ground-up rebuild of Photoshop and Premiere tailored to RTX Spark's unified memory architecture, and claims AI and graphics performance gains of up to 2x.

"Fundamental reconstruction" is not just adding a plugin or making a compatibility layer. On traditional PCs, the CPU and GPU each have their own memory spaces, and when processing an ultra-large PSD file or an 8K video timeline, data needs to be repeatedly transported between the two sets of memory, which is a major area of performance waste. The unified memory of RTX Spark allows the CPU and GPU to directly share the same 128GB space, and this structural change has practical value for professional creators' workflows. Adobe's willingness to modify its underlying code indicates that it recognizes this architectural direction is not just a one-time marketing gimmick.

However, neither NVIDIA nor Adobe has disclosed what the benchmark for this "2 times acceleration" is. Is it compared to contemporary x86 processors with independent graphics cards, or against the previous generation AI PC's NPU solution? The results could be vastly different. Until benchmark testing conditions are made public, the value of this figure remains uncertain.
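
One reason the baseline matters is that the cost of copying data between separate CPU and GPU memory depends heavily on the link between them. The sketch below estimates that cost under stated assumptions: the 32GB/s figure approximates the theoretical one-way throughput of a PCIe 4.0 x16 link and is not taken from NVIDIA or Adobe, and the 90GB working set borrows the scene size mentioned earlier in this article.

```python
def copy_seconds(size_gb: float, link_gb_s: float, passes: int = 1) -> float:
    """Time to move size_gb across a link, repeated `passes` times."""
    return size_gb * passes / link_gb_s


ASSUMED_PCIE_GB_S = 32  # approx. theoretical PCIe 4.0 x16, one direction

one_way = copy_seconds(90, ASSUMED_PCIE_GB_S)
round_trip = copy_seconds(90, ASSUMED_PCIE_GB_S, passes=2)
print(f"one pass: {one_way:.2f} s, there and back: {round_trip:.2f} s")
```

Under these assumptions a single pass over a 90GB working set costs close to three seconds of pure transfer time before any computation happens. With unified memory that copy disappears, so a workload dominated by transfers would gain far more than one that is not, which is why an undisclosed baseline makes the "2 times" figure hard to interpret.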

Blackmagic Design, ComfyUI, llama.cpp, OTOY, and several game developers also announced support. The backing from ComfyUI and llama.cpp is noteworthy, as they are currently among the most active open-source tools in local AI workflows. Early support from the developer community often reflects a platform's ecosystem potential more faithfully than commitments from large companies.

With its CUDA ecosystem and unified memory architecture, NVIDIA is building in the Windows camp something close to Apple's integrated hardware-and-software experience. The difference is that Apple builds its walls alone, while NVIDIA has to persuade Microsoft and ISVs to build alongside it. Adobe's willingness to start from the ground up at least indicates that the first brick has been laid.

Beyond the Paper Specifications

That brings us back to a practical question: can these devices actually be bought, and what will they be like to use?

According to the information disclosed by NVIDIA, the first batch of RTX Spark devices will be released this autumn, covering lightweight laptops and compact desktops from ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI. Models from Acer and Gigabyte will follow. The specific pricing and exact release dates of all OEMs have not been disclosed.

More critical than pricing are several unknowns at the physical level. How will power consumption and heat be balanced when a 1 petaflop chip is fitted into a thin-and-light laptop? How will RTX Spark perform in everyday office work, and what will battery life look like outside AI workloads? Will the real bandwidth of the 128GB unified memory be cut significantly in laptop form because of power limits?

These questions represent the real test of industrial implementation. The peak computing power of a chip in engineering prototypes and its actual performance in the hands of consumers over eight hours a day are often two different things. NVIDIA emphasized the energy efficiency of RTX Spark during the press conference, but did not provide specific TDP values or battery life data.

From the perspective of the PC industry landscape, RTX Spark marks the emergence of a new division of labor. For the past thirty years, the core chip narrative of the PC has belonged to x86 processor makers, while GPU makers, however important they became, remained "accessories plugged into the motherboard." This time NVIDIA has introduced a complete SoC that integrates everything from CPU to GPU to memory controller, with an Arm-based CPU co-designed with MediaTek. The power structure of the PC supply chain is shifting from "x86 CPU plus optional GPU" to "GPU-centered SoC platforms."

This shift will not happen overnight. OEM pricing strategies, the energy efficiency of shipping products, the pace of ISV software adaptation, and the procurement and validation cycles of enterprise customers will each help determine whether RTX Spark becomes a new reference point for the PC industry or another high-profile technical demonstration that underdelivers. The answer will have to wait until this autumn.

