Vera Rubin's "Nuke-Grade" Platform Debuts: How NVIDIA Is Rewriting the AI Compute Landscape | CES 2026
Jensen Huang didn’t showcase a consumer-grade graphics card this time; instead, he brought out his secret weapon.
At CES 2026, NVIDIA officially launched the Vera Rubin computing platform, a 2.5-ton system named after the astronomer whose measurements provided the key evidence for dark matter. On stage, Jensen Huang fired up the audience with a bold move, bringing the massive AI server rack right up onto the podium and declaring the start of a new era.
Extreme Co-Design: Breaking the One-or-Two-Chips-per-Generation Convention
In the past, NVIDIA followed an unwritten rule: each product generation would change at most one or two chips. Vera Rubin breaks with that tradition completely.
This time, NVIDIA redesigned six chips at once and has moved all of them into volume production. Why so aggressive? Because Moore’s Law is slowing down, and the traditional paths to higher performance can no longer keep pace with AI models that grow roughly tenfold every year. NVIDIA’s answer is an “extreme co-design” strategy: innovating simultaneously at every level, from the individual chips up to the full system platform.
These six chips include:
Vera CPU: 88 custom cores and 176 threads, 1.5TB of system memory (three times that of the previous Grace), and NVLink-C2C bandwidth of up to 1.8TB/s.
Rubin GPU: 50 PFLOPS of NVFP4 inference performance, five times that of its predecessor, with 336 billion transistors (about 1.6 times as many as Blackwell) and a third-generation Transformer Engine that adjusts numerical precision dynamically (see the sketch after this list).
ConnectX-9 Network Card: Based on 200G PAM4 SerDes, 800Gbps Ethernet, supporting programmable RDMA and data path acceleration.
BlueField-4 DPU: Built specifically for AI storage, paired with a 64-core Grace CPU, targeting next-generation SmartNICs and storage processors.
NVLink-6 Switch Chip: Connects 18 compute nodes and 72 Rubin GPUs, giving every GPU 3.6TB/s of all-to-all communication bandwidth within the rack.
Spectrum-6 Optical Ethernet Switch Chip: 512 channels × 200Gbps, integrated with TSMC’s COOP silicon photonics technology, featuring co-packaged optical interfaces.
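To give a feel for what a block-scaled low-precision format such as NVFP4 does, here is a minimal Python sketch of 4-bit quantization with one scale per small block. The block size, the integer value grid, and the function names are illustrative assumptions, not NVIDIA's actual format or API; the point is only that per-block scales let a narrow format track data whose magnitude varies, which is what dynamic precision adjustment then exploits.

```python
import numpy as np

# Minimal sketch of block-scaled 4-bit quantization, the general idea behind
# low-precision formats such as NVFP4. Block size, value grid, and names are
# illustrative assumptions, not NVIDIA's actual format.

def quantize_4bit_blocks(x, block=16):
    """Quantize a 1-D float array to 4-bit values with one scale per block."""
    pad = (-len(x)) % block
    padded = np.pad(x, (0, pad)).reshape(-1, block)
    # One scale per block, so a single outlier only degrades its own block.
    scale = np.abs(padded).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0
    q = np.clip(np.round(padded / scale), -7, 7).astype(np.int8)
    return q, scale, len(x)

def dequantize(q, scale, n):
    return (q.astype(np.float32) * scale).ravel()[:n]

x = np.random.randn(1024).astype(np.float32)
q, s, n = quantize_4bit_blocks(x)
err = np.mean(np.abs(dequantize(q, s, n) - x))
print(f"mean abs error after the 4-bit round trip: {err:.4f}")
```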
Exponential Performance Leap, Throughput Increased 10x
With these six chips working in concert, the Vera Rubin NVL72 system outperforms its predecessor across the board.
In NVFP4 inference tasks it reaches 3.6 EFLOPS (5x Blackwell), and training performance comes in at 2.5 EFLOPS (3.5x). On the memory side, the NVL72 carries 54TB of LPDDR5X (3x the previous generation) and 20.7TB of HBM capacity (1.5x). On bandwidth, HBM4 delivers 1.6PB/s (2.8x) and scale-up bandwidth reaches 260TB/s (2x).
The most critical metric is throughput per watt per dollar, that is, how many AI tokens each watt and each dollar buy; it is up 10x over Blackwell. For a gigawatt data center costing $50 billion, that directly doubles the revenue potential.
To train a 10-trillion-parameter model, Rubin needs only about a quarter as many systems as Blackwell, and its token-generation cost is roughly one tenth.
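A quick back-of-envelope check ties the per-GPU and rack-level figures together. The inputs below are only the numbers quoted above (50 PFLOPS of NVFP4 per Rubin GPU, 72 GPUs per rack, 20.7TB of HBM per rack); the script simply does the aggregation.

```python
# Back-of-envelope aggregation of figures quoted in the article.

PFLOPS_PER_GPU_NVFP4 = 50        # Rubin GPU, NVFP4 inference
GPUS_PER_RACK = 72               # Vera Rubin NVL72
HBM_PER_RACK_TB = 20.7           # total HBM capacity per rack

rack_eflops = PFLOPS_PER_GPU_NVFP4 * GPUS_PER_RACK / 1000
print(f"Rack-level NVFP4 inference: {rack_eflops:.1f} EFLOPS")   # ~3.6 EFLOPS, as quoted

hbm_per_gpu_gb = HBM_PER_RACK_TB * 1000 / GPUS_PER_RACK
print(f"Implied HBM per GPU: ~{hbm_per_gpu_gb:.0f} GB")          # ~288 GB
```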
Engineering Innovation: From Complex Wiring to Plug-and-Play
Breakthroughs in engineering design are equally noteworthy.
Previously, assembling a supercomputing node required 43 cables, took two hours, and was error-prone. Vera Rubin's approach is far more radical: zero cables, just six liquid-cooling lines, and a connection completed in five minutes.
Behind the rack, however, sits copper cabling totaling 3.2 km: roughly 5,000 copper cables form the NVLink backbone network, running at 400Gbps. Jensen Huang joked, “You need to be a very fit CEO to handle this job.”
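As a rough sanity check, and assuming (the article does not say so explicitly) that each of the roughly 5,000 copper cables carries one 400Gbps link, the aggregate lands in the same ballpark as the 260TB/s scale-up bandwidth quoted earlier:

```python
# Aggregate bandwidth of the copper NVLink backbone, under the assumption
# (not stated explicitly above) of one 400Gbps link per cable.

cables = 5000
gbps_per_cable = 400

aggregate_gbps = cables * gbps_per_cable        # 2,000,000 Gbps
aggregate_tb_per_s = aggregate_gbps / 8 / 1000  # bits -> bytes, GB -> TB
print(f"Aggregate copper bandwidth: ~{aggregate_tb_per_s:.0f} TB/s")  # ~250 TB/s
```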
The Ultimate Solution to the KV Cache Dilemma
The biggest pain point in AI applications is running out of context memory. As conversations grow longer and models grow larger, the KV cache (key-value cache) that lives in HBM quickly fills it up.
Vera Rubin’s solution is to deploy BlueField-4 processors inside the rack dedicated to managing the KV cache. Each node carries four BlueField-4s backed by 150TB of context memory; allocated across the GPUs, that gives each GPU an extra 16TB, nearly 16 times its roughly 1TB of built-in memory, while sustaining 200Gbps of bandwidth.
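For intuition about why that extra tier matters, here is a minimal sketch of the standard per-token KV-cache footprint of a transformer. The model dimensions below are hypothetical, chosen only for illustration, and do not describe any specific NVIDIA or partner model.

```python
# How much context fits in GPU memory, for a hypothetical large transformer.
# The formula is the standard KV-cache footprint; the dimensions are made up.

def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value=2):
    # 2x for keys and values, one entry per layer per KV head, FP16 cache.
    return 2 * layers * kv_heads * head_dim * bytes_per_value

per_token = kv_bytes_per_token(layers=128, kv_heads=16, head_dim=128)
print(f"KV cache per token: {per_token / 1024:.0f} KiB")

for capacity_tb, label in [(1, "built-in ~1TB"), (16, "with the extra 16TB tier")]:
    tokens = capacity_tb * 1e12 / per_token
    print(f"{label}: room for ~{tokens / 1e6:.1f}M tokens of context")
```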
To make these “sticky notes”, spread across dozens of racks and tens of thousands of GPUs, behave like one unified memory, the network has to be big, fast, and stable. That is the mission of Spectrum-X, NVIDIA’s first end-to-end Ethernet platform designed specifically for generative AI.
In a gigawatt data center, Spectrum-X can deliver a 25% increase in throughput, worth roughly $5 billion. Jensen Huang put it bluntly: “This network system is almost ‘free.’”
The Rise of Physical AI, Multi-Modal Applications in Full Swing
Beyond the hardware, the software ecosystem is also a highlight. Jensen Huang stressed that the roughly $10 trillion of computing infrastructure built up over the past decade is being thoroughly modernized, and that this is not merely a hardware upgrade; it is a paradigm shift in software.
He singled out the breakthrough of DeepSeek R1, arguing that open-source reasoning models have set off a wave of development across the industry. Open-source models currently trail the leading proprietary models by roughly six months, but new ones appear every six months, and that pace of iteration draws in entrepreneurs, big players, and researchers alike.
This time, NVIDIA didn’t just sell graphics cards but built a multi-billion-dollar DGX Cloud supercomputer, developed cutting-edge models like La Proteina and OpenFold 3, and launched an open-source model ecosystem covering biomedicine, physical AI, agents, robotics, autonomous driving, and more.
Alpamayo: The Autonomous Driving Nuke with Reasoning Capabilities
The ultimate application of physical AI is Alpamayo—the world’s first autonomous driving model with thinking and reasoning abilities.
Unlike traditional rule-based autonomous-driving engines, Alpamayo is an end-to-end trained system designed to tackle the “long tail problem” of autonomous driving. Faced with unfamiliar, complex road conditions, it no longer rigidly executes hand-written rules but can reason its way to a decision, much as a human driver would.
In demonstrations, the vehicle’s driving style was strikingly natural, breaking extremely complex scenarios down into basic, common-sense steps. Jensen Huang announced that the Mercedes-Benz CLA running the Alpamayo technology stack will launch in the US in the first quarter of this year, followed by Europe and Asia.
The car was rated the safest vehicle in the world by NCAP, thanks to NVIDIA’s “dual safety stack” design: whenever the end-to-end AI model is not confident about the road conditions, the system immediately falls back to the traditional safety stack to keep the vehicle safe.
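The dual-stack behavior can be pictured as a confidence-gated fallback. The sketch below is purely conceptual: the threshold, the planner interfaces, and every name in it are hypothetical illustrations, not NVIDIA's implementation.

```python
# Conceptual sketch of a confidence-gated fallback between a learned planner
# and a traditional rule-based safety stack. All names and the threshold are
# hypothetical illustrations, not NVIDIA's actual design.

from dataclasses import dataclass

@dataclass
class Plan:
    trajectory: list      # planned maneuvers or waypoints
    confidence: float     # planner's self-reported confidence in [0, 1]

def e2e_planner(scene) -> Plan:
    # Stand-in for the end-to-end learned driving model.
    return Plan(trajectory=["keep_lane"], confidence=scene.get("confidence", 0.9))

def rule_based_planner(scene) -> Plan:
    # Stand-in for the conventional, verified safety stack.
    return Plan(trajectory=["slow_down", "keep_lane"], confidence=1.0)

def select_plan(scene, threshold=0.8) -> Plan:
    plan = e2e_planner(scene)
    # Hand control to the traditional stack whenever the model is unsure.
    return plan if plan.confidence >= threshold else rule_based_planner(scene)

print(select_plan({"confidence": 0.95}).trajectory)  # learned plan
print(select_plan({"confidence": 0.40}).trajectory)  # rule-based fallback
```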
The Full Robotics Strategy, with the Factory as the Largest Robot
In the robotics arena, NVIDIA shares the stage with nine top AI and hardware manufacturers. All of their robots will carry compact Jetson computers and be trained in the Isaac simulator inside Omniverse.
Jensen Huang invited representatives from Boston Dynamics, Agility, and others to show off humanoid and quadruped robots, and stressed that the biggest robot of all is the factory itself. NVIDIA’s vision is that future chip design, system design, and factory simulation will all be accelerated by physical AI.
From chip design to real manufacturing, everything will be verified in virtual environments—this marks NVIDIA’s major shift from virtual to physical worlds.