Why Can't AI Data Centers Rely Solely on GPUs? Exploring the Synergy of Memory, Networking, and Storage

In June 2026, Bitcoin hovered around the $60,000 mark while Ethereum traded in the $1,600 range, signaling a period of consolidation for the crypto market. In contrast, a different sector—AI data center infrastructure—was experiencing a surge in activity. Gartner projects that global IT spending will reach $6.31 trillion in 2026, a 13.5% year-over-year increase, with data center system spending leading all categories at a 55.8% growth rate. IDC forecasts that global enterprise AI spending will hit $940 billion in 2026.

Amid this computing arms race, a crucial shift in perspective is underway: the competitiveness of AI data centers is no longer defined solely by the number of GPUs or peak computing power. Instead, the focus is shifting toward the overall synergy of compute, storage, and networking within clusters. Understanding how Memory, Networking, and Storage work together has become fundamental to evaluating the investment value of AI infrastructure.

The Memory Wall: The First Bottleneck in the Era of Large Models

The parameter counts of large AI models have grown exponentially over the past two years. From 2024 to 2026, mainstream model parameter counts increased a hundredfold, and context windows expanded from tens of thousands to millions of tokens. Server memory bandwidth, however, has grown by less than 15% annually, far slower than AI workloads demand. This mismatch between software and hardware iteration rates has made the "memory wall" the core bottleneck limiting AI computing power.

The memory wall refers to the fact that CPU and GPU processing speeds are increasing much faster than memory bandwidth is growing and memory latency is falling. Compute chips run at blazing speed, but data cannot be supplied quickly enough, so processors spend significant time idle, waiting. According to industry test reports, data I/O bottlenecks in large-scale GPU clusters can leave GPUs idle for over 40% of the time, meaning these expensive chips spend a large share of their time waiting for data transfers.
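The relationship between compute speed and data supply can be sketched with a roofline-style bound: attainable throughput is the smaller of peak compute and memory bandwidth multiplied by arithmetic intensity (FLOPs performed per byte moved). The accelerator figures below (1000 TFLOP/s peak, 4 TB/s bandwidth) and the function names are hypothetical, chosen only for illustration; they are not taken from the industry reports cited above.

```python
def attainable_tflops(peak_tflops: float, mem_bw_tbps: float, flops_per_byte: float) -> float:
    """Roofline bound: capped by peak compute or by bandwidth * arithmetic intensity."""
    # TB/s * FLOP/byte = TFLOP/s, so the two terms share a unit.
    return min(peak_tflops, mem_bw_tbps * flops_per_byte)


def idle_fraction(peak_tflops: float, mem_bw_tbps: float, flops_per_byte: float) -> float:
    """Share of peak compute left unused because data cannot arrive fast enough."""
    achieved = attainable_tflops(peak_tflops, mem_bw_tbps, flops_per_byte)
    return 1.0 - achieved / peak_tflops


# Hypothetical accelerator, for illustration only.
PEAK_TFLOPS = 1000.0
MEM_BW_TBPS = 4.0

for intensity in (50, 150, 250, 500):
    achieved = attainable_tflops(PEAK_TFLOPS, MEM_BW_TBPS, intensity)
    idle = idle_fraction(PEAK_TFLOPS, MEM_BW_TBPS, intensity)
    print(f"{intensity:>4} FLOP/byte: {achieved:7.1f} TFLOP/s attainable, {idle:.0%} idle")
```

Under these assumed numbers, any workload below 250 FLOP/byte is bandwidth-bound; at 150 FLOP/byte the chip reaches only 600 of its 1000 TFLOP/s. Real idle time also depends on storage and network stalls, which this single-chip model ignores.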

Memory resources are becoming alarmingly scarce. A single AI inference server consumes more than ten times the DRAM and HBM of a traditional data center server, with nearly 60% of global DRAM wafer capacity now allocated to AI clusters. HBM, in particular, has been in chronic short supply, with major production capacity pre-booked by large customers through 2026 and even 2027. Gartner notes that surging demand and supply constraints have driven HBM prices to record highs, making memory a high-margin segment for semiconductor manufacturers.

To break through the memory wall, the industry is advancing along two paths. The first is software-level tuning and compression, such as tiered KV cache scheduling and low-bit quantization, to get the most out of existing memory resources. The second is hardware-level architectural innovation, including HBM upgrades and the adoption of new memory interconnect protocols like CXL (Compute Express Link). NVIDIA’s next-generation HGX Rubin platform has tripled GPU memory bandwidth to 176 TB/s. The two approaches are complementary rather than mutually exclusive, and together they are reshaping how storage and compute collaborate across the industry.
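A rough calculation shows why low-bit quantization matters at million-token context lengths. The sketch below uses the common transformer KV cache estimate (keys plus values, per layer, per KV head, per token). The model shape (80 layers, 8 KV heads, head dimension 128) is a hypothetical example, not a description of any specific model.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bits: int) -> int:
    """Approximate KV cache size: 2 tensors (K and V) per layer, per token."""
    elements = 2 * layers * kv_heads * head_dim * seq_len * batch
    return elements * bits // 8


# Hypothetical model shape and context length, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128
SEQ_LEN, BATCH = 1_000_000, 1

for bits in (16, 8, 4):
    size_gb = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, BATCH, bits) / 1e9
    print(f"{bits:>2}-bit KV cache: {size_gb:.2f} GB per sequence")
```

With these assumed values, a single one-million-token sequence needs about 328 GB at 16 bits, and each halving of precision halves that footprint. The estimate ignores quantization metadata such as scales and zero points.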

Networking: The "Neural Network" of AI Clusters

While memory determines data transfer efficiency within a single node, networking handles data movement between nodes. In large-scale AI clusters, hundreds or thousands of GPUs must work together to train or serve a single model, making inter-GPU communication efficiency critical to overall training speed.
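To give a sense of scale, the bandwidth cost of synchronizing gradients can be estimated with the standard ring all-reduce model, in which each GPU sends and receives 2(N-1)/N times the payload. The payload size and per-GPU link bandwidth below are assumptions for illustration, and the model ignores latency, congestion, and overlap with computation.

```python
def ring_allreduce_seconds(num_gpus: int, payload_gb: float, link_gb_per_s: float) -> float:
    """Bandwidth-only estimate for a ring all-reduce across num_gpus devices."""
    if num_gpus < 2:
        return 0.0  # nothing to exchange with a single device
    traffic_factor = 2.0 * (num_gpus - 1) / num_gpus
    return traffic_factor * payload_gb / link_gb_per_s


# Hypothetical setup: 140 GB of gradients (e.g. 70B parameters at 2 bytes each).
PAYLOAD_GB = 140.0

for link in (50.0, 100.0, 400.0):  # assumed per-GPU link bandwidth in GB/s
    t = ring_allreduce_seconds(1024, PAYLOAD_GB, link)
    print(f"{link:6.0f} GB/s per GPU: {t:.2f} s per full gradient sync")
```

Because the traffic factor approaches 2 as the cluster grows, adding GPUs does not shrink this synchronization time; only faster links or smaller payloads do.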

Current bandwidth bottlenecks exist at multiple levels: Between chips, traditional PCB interconnects can no longer meet the high bandwidth and low latency demands of AI chips. Within server racks, inter-server bandwidth constrains vertical scaling. Across data centers, long-distance transmission bandwidth and latency limit horizontal scaling and cross-region workload scheduling. Estimates show that, in today’s AI training clusters, the energy consumed by data movement now exceeds that of computation itself.

NVIDIA’s NVLink and InfiniBand have long dominated the internal interconnect market for AI clusters. The latest NVLink Switch now delivers 28.8 TB/s of bandwidth, doubling the previous generation. However, this landscape is being challenged—companies like AMD and Broadcom are developing their own interconnect solutions, and open standards such as UALink (Ultra Accelerator Link) are rapidly maturing. By 2026, the networking space will have shifted from "NVIDIA-only" to "multi-standard competition," raising the bar for system integration capabilities among data center operators.

Storage: From "Warehouse" to "Data Pipeline"

In traditional data centers, storage has functioned as a "data warehouse"—primarily for archiving and preserving cold data. In AI data centers, however, storage has evolved into a "data pipeline"—tasked with continuously delivering training data to compute nodes at extremely high speeds and supporting low-latency model parameter reads during inference.

AI training requires rapid access to massive volumes of raw data, while inference demands fast retrieval of model weights and KV caches. KV caches are now extending from GPU HBM down to system DRAM and even further to high-speed local SSDs. This blurs the line between storage and memory, making storage devices not just endpoints for data, but critical nodes in the data flow pipeline.
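The HBM-to-DRAM-to-SSD spill described above can be sketched as a multi-tier cache with least-recently-used demotion. This is a minimal illustration under stated assumptions: the class name `TieredKVCache`, the tier names, and the block-count capacities are all hypothetical, and production schedulers use far richer placement policies.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy multi-tier cache: new or re-used blocks live in the fastest tier,
    and the least recently used block is demoted one tier when a tier overflows."""

    def __init__(self, capacities):
        # capacities: list of (tier_name, max_blocks), fastest tier first.
        self.tiers = [(name, cap, OrderedDict()) for name, cap in capacities]

    def tier_of(self, key):
        for name, _, store in self.tiers:
            if key in store:
                return name
        return None

    def _remove(self, key):
        for _, _, store in self.tiers:
            if key in store:
                return store.pop(key)
        return None

    def _insert(self, level, key, block):
        if level >= len(self.tiers):
            return  # fell off the slowest tier: the block is dropped
        _, cap, store = self.tiers[level]
        store[key] = block
        if len(store) > cap:
            old_key, old_block = store.popitem(last=False)  # oldest entry
            self._insert(level + 1, old_key, old_block)

    def put(self, key, block):
        self._remove(key)
        self._insert(0, key, block)

    def get(self, key):
        if self.tier_of(key) is None:
            return None  # miss: caller must recompute or refetch
        block = self._remove(key)
        self._insert(0, key, block)  # promote on access
        return block


cache = TieredKVCache([("hbm", 2), ("dram", 2), ("ssd", 4)])
for i in range(5):
    cache.put(f"seq{i}", f"kv-block-{i}")
print({f"seq{i}": cache.tier_of(f"seq{i}") for i in range(5)})
```

In this sketch, accessing a block that had spilled to the "ssd" tier promotes it back to "hbm" and pushes colder blocks down, which is the sense in which storage becomes part of the data flow rather than an endpoint.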

All-flash storage is replacing traditional hard drives as the mainstream choice for AI data centers. At ISC High Performance 2026, Sugon showcased all-flash storage and native high-speed networking products, underscoring this industry trend. Storage performance now directly determines whether data reaches compute units in time, thereby impacting GPU utilization rates.

Compute-Memory-Network Synergy: From Single-Point Breakthroughs to System Optimization

Once the roles and bottlenecks of each component are clear, the meaning of "synergy" comes into focus: The true computing power of an AI data center is not a simple sum of GPU performance, memory bandwidth, network throughput, and storage IOPS. Instead, it’s the effective output generated by the system-level coupling of all four.
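One simple way to express this coupling is a bottleneck model: if each subsystem is rated by how many training samples per second it can sustain, the cluster delivers roughly the minimum of those rates, not their sum. The stage names and rates below are hypothetical, and the model assumes the stages are perfectly pipelined.

```python
def effective_rate(stage_rates: dict) -> tuple:
    """Return (bottleneck_stage, rate): the slowest stage sets system throughput."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


# Hypothetical sustained rates in samples per second, for illustration only.
rates = {
    "gpu_compute": 1000.0,
    "memory": 800.0,
    "network": 600.0,
    "storage": 900.0,
}

bottleneck, rate = effective_rate(rates)
gpu_utilization = rate / rates["gpu_compute"]
print(f"bottleneck: {bottleneck}, effective rate: {rate:.0f} samples/s, "
      f"GPU utilization: {gpu_utilization:.0%}")
```

Under these assumptions, upgrading the GPUs changes nothing until the network stage is improved, which is the practical meaning of system-level rather than single-point optimization.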

The relentless growth of model parameters is driving the rise of super AI clusters. Usability is no longer determined solely by chip performance, but increasingly by the overall synergy and efficiency of compute, storage, and networking within the cluster. This view is quickly becoming an industry consensus.

In practice, tightly integrated compute-memory-network design has become the standard approach among leading vendors. Sugon’s scaleX AI supercluster follows this tightly integrated philosophy, significantly boosting training and inference efficiency. NVIDIA’s Dynamo 1.0 inference operating system, paired with the BlueField-4 CMX platform, seamlessly connects GPU, HBM, host DRAM, local flash, and remote storage across multiple tiers, breaking down GPU memory silos with automated hot and cold data routing.

IDC’s June 2026 report makes it clear: The competitive edge in AI is no longer about having the most powerful compute, but about converting AI into sustainable business capability at the lowest token cost. At the heart of token cost are the combined efficiencies of compute, memory, networking, and storage.
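The link between system efficiency and token cost can be made concrete with simple arithmetic: cost per million tokens is the hourly hardware cost divided by the tokens actually produced in that hour. The hourly price and peak throughput below are assumed values for illustration, not figures from the IDC report.

```python
def cost_per_million_tokens(hourly_cost_usd: float, peak_tokens_per_s: float,
                            utilization: float) -> float:
    """Hourly cost divided by tokens actually generated per hour, scaled to 1M tokens."""
    tokens_per_hour = peak_tokens_per_s * utilization * 3600.0
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical accelerator: $3.00 per hour, 2500 tokens/s at full utilization.
for utilization in (1.0, 0.8, 0.6):
    cost = cost_per_million_tokens(3.0, 2500.0, utilization)
    print(f"utilization {utilization:.0%}: ${cost:.3f} per million tokens")
```

In this simplified model, token cost scales inversely with utilization, so memory, network, and storage stalls that idle the GPU raise the cost of every token even though the hardware bill is unchanged.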

Market Landscape: Who’s Benefiting?

This industry trend is already well reflected in capital markets.

On the memory side, SK Hynix stands out as the star performer of 2026. On June 22, 2026, SK Hynix shares soared 6% to a record 2,944,000 KRW, surpassing Samsung to become Korea’s most valuable listed company, with a year-to-date gain of over 349%. Micron has also delivered strong results: in earnings released in the last week of June, it reported quarterly revenue that had more than quadrupled and announced 16 new long-term supply agreements. Micron’s stock price jumped 16% on the day of the release.

In networking, fiber optics supplier Corning hit an all-time high in the last week of June, as its products’ pivotal role in AI data centers was revalued by the market. Cisco’s AI infrastructure orders have exceeded $9 billion.

On the server and system integration front, Dell’s AI-optimized server revenue reached $16.1 billion in a single quarter, up 757% year-over-year. Supermicro holds about 70% market share in direct liquid cooling technology.

For data center operations, BOCOM International named GDS (GDS-SW) and SUNeVision (SUNEVISION) as top picks in the data center sector, citing explosive demand growth driven by generative AI. UBS also noted that China’s internet data center industry will accelerate significantly from the second half of 2026.

How to Invest in AI Infrastructure via Gate?

Gate now offers access to over 12,500 stocks and ETFs across US, Hong Kong, and Korean markets. Investors can use a unified account to trade global equities directly with USDT and other digital assets, enabling seamless allocation between crypto and traditional securities.

Within the AI data center infrastructure sector, Gate covers the entire industry chain, from chips to applications:

For US stocks, investors can trade core companies such as NVIDIA (NVDA), AMD, Micron (MU), Broadcom (AVGO), Dell (DELL), Supermicro (SMCI), Corning (GLW), and Cisco (CSCO). Gate supports pre-market and after-hours trading, expanding trading hours to 16×5, allowing users to respond promptly to earnings and macro data releases.

For Hong Kong stocks, investors can focus on data center operators like GDS (09698.HK) and SUNeVision (01686.HK).

For Korean stocks, SK Hynix (000660.KS) is the undisputed leader in HBM, while Jeju Semiconductor plays a key upstream role in AI data center optical communications materials.

Gate’s stock trading offers fees as low as 0.1%, supports both leveraged and spot trading modes, and users with holdings of $2,000 or more enjoy exclusive VIP rates. For investors seeking systematic exposure to the AI data center infrastructure sector, Gate’s cross-market, multi-asset, one-stop trading capabilities are lowering the barriers to global tech asset allocation.

Conclusion

AI data centers are moving from the "GPU stacking" era to a new phase of system-level optimization. Memory, networking, and storage are no longer isolated infrastructure components—they are now system variables that, under the compute-memory-network synergy framework, jointly determine the real output of AI computing power.

Understanding this logic not only helps in evaluating technology trends but also provides a solid analytical framework for investment decisions. From chips to memory, networking to storage, servers to data center operations, the entire industry chain is just beginning to be revalued. As short-term crypto market volatility intersects with the long-term narrative of AI infrastructure, a new window for allocating across digital assets and the real economy is opening.

FAQ

Q1: Why can’t AI data centers solve computing power issues just by adding more GPUs?

GPUs are only the endpoint for compute output. Their performance depends heavily on whether memory bandwidth can supply data in time, whether networking can efficiently coordinate multi-GPU parallelism, and whether storage can quickly handle massive data reads and writes. In large GPU clusters, data I/O bottlenecks can lead to over 40% GPU idle time—simply piling up GPUs without addressing these three areas results in massive waste of computing power.

Q2: Why is HBM in such short supply?

HBM (High Bandwidth Memory) is the standard memory for AI chips, with complex manufacturing processes and expansion cycles exceeding two years. In 2026, AI inference demand will surpass training, further driving demand for HBM and high-capacity DRAM. Most production capacity is already booked by major customers through 2026 and even 2027, leaving little short-term supply flexibility.

Q3: What is the core logic behind investing in AI data center infrastructure?

The core logic is the shift from "training-dominated" to "full-stack demand explosion." In 2026, Microsoft, Google, Amazon, and Meta will collectively spend $725 billion on AI infrastructure capital expenditures. This scale of investment cannot be borne by a single segment; the entire value chain—from chips and memory to networking and data center operations—stands to benefit structurally.

Q4: How does Gate enable trading of AI data center-related stocks?

Gate offers access to over 12,500 US, Hong Kong, and Korean stocks and ETFs. Users can deposit with USDT and other digital assets, and trade core AI infrastructure stocks like NVIDIA, Micron, and SK Hynix in a unified account. Gate supports pre-market and after-hours trading, both leveraged and spot modes, with fees as low as 0.1%.

Q5: What are the main risks of investing in AI data center infrastructure?

Key risks include: (1) Supply-demand mismatches could lead to temporary oversupply—BOCOM International notes the need to watch for cyclical mismatches and valuation swings; (2) The sustainability of hyperscale cloud providers’ capital expenditures—J.P. Morgan warns that 2025–2026 capex growth far outpaces revenue growth, pressuring cash flows; (3) Geopolitical tensions and export controls could disrupt advanced chip supply chains.

The content herein does not constitute any offer, solicitation, or recommendation. You should always seek independent professional advice before making any investment decisions. Please note that Gate may restrict or prohibit the use of all or a portion of the Services from Restricted Locations. For more information, please read the User Agreement.
