Vera Rubin: The Silent Transformation NVIDIA Is Preparing for 2026
In a move that marks a significant shift in its strategy, NVIDIA has chosen not to showcase consumer graphics cards at CES 2026. Instead, Jensen Huang took the stage with something much more ambitious: a 2.5-ton computing platform that promises to redefine AI model training and inference.
When chip design breaks its own rules
The real surprise lies not in the size of the chassis, but in its internal composition. The Vera Rubin platform (named after the astronomer whose observations provided the key evidence for dark matter) broke an internal rule NVIDIA has maintained for years: each product generation redesigns at most one or two chips.
This time, the company simultaneously redesigned 6 different chips, completing the development cycle and moving directly into mass production. The pragmatic reason: Moore’s Law is slowing down, but AI models continue to demand a 10x annual performance growth. The only solution was to innovate not in a single component, but across the entire architecture.
The six pillars of Vera Rubin
Vera CPU: The computational core with 88 custom Olympus cores, capable of processing 176 threads simultaneously. Its system memory reaches 1.5 TB, triple that of its predecessor Grace, with an NVLink C2C bandwidth of 1.8 TB/s.
Rubin GPU: The true inference engine. Offers a NVFP4 power of 50 PFLOPS, five times higher than the previous Blackwell architecture. With 336 billion transistors, it incorporates the third generation of Transformer engines, allowing dynamic precision adjustment per model.
Connectivity and storage: ConnectX-9 provides 800 Gb/s Ethernet. The BlueField-4 DPU manages the new AI storage generation, combining a 64-core Grace CPU with 800 Gb/s capabilities.
Communication infrastructure: The NVLink-6 switch chip connects 18 compute nodes, enabling up to 72 Rubin GPUs to operate as a single system with 3.6 TB/s of all-to-all bandwidth. Spectrum-6 adds 512 optical channels at 200 Gbps each, thanks to TSMC's COUPE silicon photonics integration.
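The NVFP4 format the Rubin GPU accelerates relies on block scaling: small groups of values share one scale factor, so 4-bit codes can still cover a wide dynamic range. As a rough illustration only (this is a generic block-scaled quantizer sketch, not NVIDIA's actual NVFP4 encoding), the idea looks like this:

```python
import numpy as np

# Generic block-scaled 4-bit quantization sketch -- illustrates the idea
# behind formats like NVFP4; NOT NVIDIA's actual encoding.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes

def quantize_dequantize(x, block=16):
    """Round each block of `x` to signed 4-bit codes sharing one scale."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    for i in range(0, len(x), block):
        chunk = x[i:i + block]
        # Map the block's largest magnitude onto the largest 4-bit code.
        scale = max(np.abs(chunk).max() / FP4_GRID[-1], 1e-12)
        scaled = chunk / scale
        nearest = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
        out[i:i + block] = np.sign(scaled) * FP4_GRID[nearest] * scale
    return out
```

Per-block scales are the reason such low-precision formats retain accuracy: an outlier in one block cannot flatten the resolution of values everywhere else, which is what the "dynamic precision adjustment per model" mentioned above exploits.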
The impact in numbers: when investment multiplies
The Vera Rubin NVL72 system reaches 3.6 EFLOPS in NVFP4 inference tasks, five times more than Blackwell. In training, it hits 2.5 EFLOPS, a 3.5x increase. But the most dramatic gains are in memory: 54 TB of LPDDR5X (triple) and 20.7 TB of HBM (1.5 times more).
For a 1 GW data center costing 50 billion dollars, this is not just a technical improvement. It means that throughput in AI tokens generated per watt and dollar improves 10x, directly doubling the revenue capacity of the infrastructure.
Training a 10-trillion-parameter model now requires only 1/4 of the previous Blackwell systems. The cost per token generated drops to around 1/10 of the previous.
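The rack-level and per-chip figures above are mutually consistent, which is a sanity check worth doing on any keynote slide (the numbers are as quoted in the announcement; the arithmetic is mine):

```python
# Sanity-check the quoted NVL72 figures against the per-GPU Rubin spec.
rack_inference_eflops = 3.6                    # NVL72, NVFP4 inference
gpu_count = 72
per_gpu_pflops = rack_inference_eflops * 1000 / gpu_count  # 50 PFLOPS, matches the Rubin GPU

# Spectrum-6 aggregate optical capacity: 512 channels at 200 Gbps each.
spectrum6_tb_s = 512 * 200 / 1000              # 102.4 Tb/s

# If tokens per watt and dollar improve 10x, cost per token falls to ~1/10.
relative_cost_per_token = 1 / 10
```

50 PFLOPS per GPU times 72 GPUs recovers exactly the 3.6 EFLOPS rack figure, so the numbers describe one coherent system rather than separate marketing claims.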
Solving the bottleneck: context memory
For months, the AI industry has faced a growing problem: the "KV Cache", the working memory generated by models, is quickly exhausted in long conversations. Vera Rubin addresses this by deploying BlueField-4 processors inside the chassis, each with 150 TB of context memory.
This approach gives each GPU an additional 16 TB of memory (where originally only ~1 TB was available) at a sustained bandwidth of 200 Gbps, without sacrificing speed. The Spectrum-X network, designed specifically for generative AI, ensures these "sticky notes" scattered across thousands of GPUs behave as a single coherent memory.
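The KV-cache problem is easy to quantify: a transformer keeps one key and one value vector per token, per layer, so working memory grows linearly with context length. Using the standard sizing formula with illustrative (hypothetical) model dimensions, it is clear why ~1 TB per GPU runs out:

```python
# KV-cache sizing: 2 (key + value) * layers * kv_heads * head_dim * bytes, per token.
# The model dimensions below are illustrative, not a specific NVIDIA model.
layers, kv_heads, head_dim = 96, 8, 128
bytes_per_elem = 2                        # FP16/BF16

def kv_cache_gb(context_tokens, batch=1):
    """Gigabytes of KV cache for `batch` sequences of `context_tokens` each."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return per_token * context_tokens * batch / 1e9

one_user = kv_cache_gb(1_000_000)         # ~393 GB for a single 1M-token session
fleet = kv_cache_gb(1_000_000, batch=64)  # a few dozen long sessions far exceed 1 TB
```

A few dozen concurrent long-context sessions already dwarf on-GPU memory, which is the case for offloading this cache to a 150 TB BlueField-4 tier while HBM stays reserved for weights and active compute.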
Jensen Huang estimated that Spectrum-X can improve throughput by 25%, equivalent to saving 5 billion dollars in a data center of that scale. “It’s practically free,” he summarized.
Encrypted security at every layer
All data in transit, storage, and computation are encrypted, including PCIe buses, NVLink communication, and CPU-GPU transfers. Companies can deploy models on external systems without fear of data leaks.
The shift toward physical AI and agent intelligence
While Vera Rubin provides brute force power, NVIDIA announced a deeper paradigm shift: the era of “intelligent agents” and physical AI is here.
Jensen Huang made a special appeal to the open-source community, highlighting how DeepSeek R1 surprised the world last year as the first open-source reasoning model, sparking a wave of innovation. He recognized Kimi K2 and DeepSeek V3.2 as leaders in the open-source space, a sign that NVIDIA now builds on this ecosystem rather than competing against it.
The strategy is not just about selling shovels. NVIDIA has built the DGX Cloud supercomputer (valued in the billions) and cutting-edge scientific models, including protein-structure models such as OpenFold 3. Its open-source Nemotron family spans speech, multimodal, retrieval-augmented, and safety models.
Alpamayo: autonomous driving with reasoning
The truly surprising highlight of the event was Alpamayo, the world’s first autonomous driving system with thinking and reasoning capabilities. Unlike rule-based autonomous driving, Alpamayo reasons like a human driver, breaking down complex scenarios into common-sense elements.
“It will tell you what it will do next and why it makes that decision,” Jensen Huang explained. The Mercedes CLA with this technology will launch in the U.S. in the first quarter of 2026, rated the safest car in the world by NCAP, thanks to the “double safety stack” architecture NVIDIA developed.
Robots, factories, and the future of physical AI
NVIDIA presented a comprehensive robotics strategy. All robots will carry a Jetson onboard computer and be trained in Isaac Sim on the Omniverse platform. The vision is clear: chip design, system architecture, and factory simulation, all accelerated by physical AI.
Jensen Huang invited humanoid and quadruped robots from Boston Dynamics and Agility to the stage, emphasizing that the factory itself is the biggest robot. Even Disney robots were trained on computers and validated in simulations before facing gravity in the real world.
The underlying message
In a context where skepticism about the “AI bubble” is growing and Moore’s Law limits are becoming evident, Jensen Huang needed to demonstrate with concrete facts what AI can achieve.
Previously, NVIDIA made chips for the virtual world. Now the company itself is demonstrating how physical AI, in the form of autonomous driving and humanoid robots, is entering the real world. As Huang put it, when the battle begins, the business of the "military-industrial complex" can truly thrive.