Jensen Huang Arrives with a 2,500-Kilogram "Monster": NVIDIA's Vera Rubin Sets Out to Revitalize the AI Industry

The last time NVIDIA came to CES without consumer-grade graphics cards was five years ago. This year it happened again, and for good reason: Jensen Huang took the stage in his signature crocodile-leather jacket and brought something far heavier, a 2.5-ton server cabinet housing Vera Rubin, a computing platform that could reshape the entire AI industry.
The system is named after astronomer Vera Rubin, whose galaxy-rotation measurements provided the first strong evidence for dark matter. It targets the most urgent problem of the AI era: roughly $10 trillion of installed computing infrastructure in dire need of modernization.
Understanding NVIDIA’s “Chip Integration” Through a Single Machine
As Moore's Law slows, NVIDIA has chosen an approach of extreme integration. Breaking with its tradition of revising only one or two chips per generation, Vera Rubin redesigns six chips simultaneously, all of which are entering mass production.
Two of the six headline the platform:
Vera CPU: 88 Olympus cores, 176 threads, 227 billion transistors per chip
Rubin GPU: inference performance up to 50 PFLOPS (5 times that of the previous Blackwell), 336 billion transistors
The parameters alone might be dull, but two figures matter: training a model with 100 trillion parameters requires only a quarter of the hardware of a Blackwell system, and the cost of generating a token drops to one-tenth.
Behind the Increase in Computing Density: An Engineering Revolution
Previously, building a supercomputing node required 43 cables, took 2 hours to assemble, and was prone to errors. Vera Rubin nodes have no electrical connection cables—only 6 liquid cooling water pipes—reducing installation time to 5 minutes.
At the back of the cabinet, nearly 3.2 km of copper cables form an NVLink backbone network with 5000 lines, providing 400Gbps bandwidth. Jensen Huang joked, “Only a healthy CEO can move this stuff.”
More critically, memory upgrades—Vera Rubin systems feature 54TB of LPDDR5X memory (3 times that of the previous generation) and 20.7TB of HBM, with HBM4 bandwidth reaching 1.6 PB/s. Despite a performance increase of 3.5-5 times, the transistor count only increased by 1.7 times, reflecting advances in semiconductor manufacturing.
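To see what these compute and bandwidth figures imply together, here is a quick back-of-envelope calculation. The ops-per-byte framing is mine, not the keynote's, and the inputs are simply the article's headline numbers taken at face value (they may mix per-GPU and per-system figures):

```python
# Back-of-envelope: the arithmetic intensity (ops per byte) a workload must
# sustain for the quoted compute rate to stay ahead of the quoted HBM4
# bandwidth. Both inputs are the article's headline figures.

rubin_inference_ops = 50e15   # 50 PFLOPS inference (article figure)
hbm4_bandwidth = 1.6e15       # 1.6 PB/s HBM4 bandwidth (article figure)

ops_per_byte = rubin_inference_ops / hbm4_bandwidth
print(f"Break-even arithmetic intensity: {ops_per_byte:.2f} ops/byte")
# -> 31.25 ops/byte. Workloads below this ratio (e.g. memory-bound decode
# steps) are limited by bandwidth, not compute, which is why the HBM and
# KV-cache story below matters as much as raw FLOPS.
```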
Solving the “Long Tail Problem”: From Capability to Application
A persistent bottleneck in AI training is insufficient context memory. Large models generate “KV Cache” (key-value cache), which is AI’s “working memory.” As conversations lengthen and models grow larger, HBM capacity often becomes insufficient.
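To make that "working memory" pressure concrete, here is a rough sizing sketch. The model dimensions are hypothetical (a generic 70B-class transformer with grouped-query attention), not figures from the keynote:

```python
# Rough KV-cache sizing for a transformer, showing why HBM fills up as
# context grows. The cache stores a key and a value vector for every
# layer, KV head, and token of the conversation.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Total bytes for keys + values (factor of 2), FP16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

# Hypothetical 70B-class model: 80 layers, 8 KV heads of width 128.
per_token = kv_cache_bytes(80, 8, 128, seq_len=1)
print(f"{per_token / 1024:.0f} KiB per token")  # -> 320 KiB per token

for ctx in (8_192, 131_072, 1_048_576):
    gib = kv_cache_bytes(80, 8, 128, ctx) / 2**30
    print(f"{ctx:>9} tokens -> {gib:,.1f} GiB of KV cache")
# An 8K-token chat fits easily, but a million-token context wants
# hundreds of GiB for the cache alone, far beyond one GPU's HBM.
```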
Vera Rubin’s solution is to deploy BlueField-4 processors inside the cabinet to manage KV caches. Each node is equipped with 4 BlueField-4s, each providing 150TB of context storage—after allocation to GPUs, each GPU gets an additional 16TB of memory (the GPU itself has about 1TB). Data transfer speeds remain at 200Gbps, avoiding performance bottlenecks.
To make these “notes” scattered across dozens of cabinets and thousands of GPUs operate as a unified memory, the network must be large enough, fast enough, and stable enough. That’s where Spectrum-X comes in—the world’s first “Ethernet platform designed for generative AI,” integrating TSMC’s silicon photonics technology, with 512 channels × 200Gbps.
Jensen Huang estimated: a 1GW data center costs $50 billion, and Spectrum-X can deliver a 25% throughput increase, saving $5 billion. “You could say this network system is basically free.”
“Open Source Shock” and Industry Shift
Returning to the opening of the speech, Jensen Huang threw out a number: the roughly $10 trillion in computing infrastructure built up over the past decade needs to be fully upgraded. And this is not just a hardware refresh; it is a paradigm shift in programming.
Last year's open-source breakthrough, DeepSeek R1, surprised everyone. As the first open-source reasoning model of its caliber, it sparked a wave of development across the industry. China's Kimi K2 and DeepSeek V3.2 currently rank #1 and #2 among open-source models.
Jensen Huang conceded that open-source models may lag the industry's leading edge by about six months, but new ones keep arriving at that pace. This rapid iteration leaves startups, giants, and research institutions all anxious not to fall behind, and NVIDIA is no exception. So this time it is not just "selling shovels": the company is also investing billions in DGX Cloud supercomputing and developing models of its own, such as La Proteina (protein generation) and OpenFold 3.
The Nemotron open-source model family covers speech, multimodal, RAG, security, and more, performing well in various rankings, with more companies deploying them.
The Three Layers of Physical AI Computers
If large language models solve the problems of the "digital world," the next ambition is to conquer the "physical world." But teaching AI to understand physical laws and survive in reality requires data that is extremely scarce.
Jensen Huang summarized the “three computers” needed for physical AI:
Training Computer—high-performance systems equipped with training cards (e.g., GB300 architecture)
Inference Computer—embedded “little brains” in robots and cars responsible for real-time decision-making
Simulation Computer—including Omniverse and Cosmos, allowing AI to learn and feedback in virtual environments
Cosmos can generate massive physical world training environments for AI. Based on this architecture, Jensen Huang officially announced the world’s first end-to-end autonomous driving model, Alpamayo.
Unlike traditional systems, Alpamayo is truly end-to-end trained. Its breakthrough lies in solving the “long tail problem” of autonomous driving—when facing unfamiliar and complex road situations, Alpamayo doesn’t rigidly execute code but reasons like humans. “It will tell you what to do next and why.” In demos, the vehicle drives naturally and smoothly, decomposing complex road conditions into basic rules.
An implementation is already on the way: the Mercedes-Benz CLA equipped with Alpamayo technology will debut in the US in Q1 this year, followed by Europe and Asia. The car has been rated among the safest vehicles globally by NCAP, thanks in part to NVIDIA's "dual safety stack" design: when the end-to-end AI is uncertain, the system automatically falls back to a more traditional, conservative mode.
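The dual-stack idea can be sketched as a simple arbiter: a learned planner proposes an action with a confidence score, and a rule-based stack takes over below a threshold. Everything here, the class names, the 0.8 threshold, the toy planners, is hypothetical; NVIDIA has not published this interface:

```python
# Minimal sketch of a "dual safety stack" arbiter. All names, values, and
# interfaces are illustrative assumptions, not NVIDIA's actual design.
from dataclasses import dataclass

@dataclass
class Plan:
    steering: float   # radians
    accel: float      # m/s^2
    source: str       # which stack produced this plan

class EndToEndPlanner:
    """Stand-in for the learned model; a real one would run a neural network."""
    def propose(self, scene):
        # Toy logic: unfamiliar scenes get low confidence.
        confidence = 0.95 if scene["familiar"] else 0.3
        return Plan(0.05, 1.2, "end_to_end"), confidence

class RuleBasedPlanner:
    """Conservative, verifiable fallback stack."""
    def propose(self, scene):
        return Plan(0.0, -0.5, "rule_based")  # e.g. slow down, hold lane

def arbitrate(scene, learned, fallback, min_confidence=0.8):
    """Use the learned plan only when it is confident enough."""
    plan, confidence = learned.propose(scene)
    if confidence >= min_confidence:
        return plan
    return fallback.propose(scene)

print(arbitrate({"familiar": True}, EndToEndPlanner(), RuleBasedPlanner()).source)
print(arbitrate({"familiar": False}, EndToEndPlanner(), RuleBasedPlanner()).source)
```

The design choice worth noting is that the fallback path is deliberately simple and auditable, so its behavior can be verified even when the learned stack cannot be.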
Robot Army and Factories as Robots
NVIDIA's robot strategy is equally ambitious. Robots carry Jetson onboard computers and are trained in the Isaac simulator built on Omniverse, and these technologies plug into industrial ecosystems from partners such as Synopsys, Cadence, and Siemens.
On stage appeared humanoid robots from Boston Dynamics and Agility, quadruped robots, and Disney’s robots. Jensen Huang joked, “These adorable guys are designed, manufactured, and even tested in computers, and have already ‘lived’ once before experiencing real gravity.”
NVIDIA's ultimate vision is to accelerate everything with physical AI, from chip design to system architecture to factory simulation, including deployments in new regions and applications, such as validating how autonomous vehicles and robots perform under different geographic conditions.
The Logic of “Selling Weapons During Wartime”
If it weren’t for Jensen Huang, you might think this was a launch event for some AI model company. As the AI bubble heats up and Moore’s Law slows down, Jensen Huang seems to want to rekindle our confidence in AI—by showing what it can truly do.
From the powerful Vera Rubin chip platform, to the focus on applications and software, down to concrete cases in physical AI, autonomous driving, and robotics: NVIDIA once built chips for virtual worlds, and now it is stepping into the arena itself, betting on physical AI and competing fiercely in the real, physical world.
After all, only in “wartime” can weapons continue to sell well.
Easter Egg: Due to time constraints at CES, Jensen Huang didn’t finish all slides. The unseen parts were turned into a humorous short film.