Huawei Beats Nvidia H100 with CloudMatrix 384 Breakthrough

IMAGE CREDITS: AFP

Huawei has just raised the stakes in the global AI race. Days after announcing mass shipments of its Ascend 910C AI chips to fill the gap left by Nvidia in China, Huawei has now gone a step further. According to a new analysis by SemiAnalysis, Huawei’s CloudMatrix 384 system, powered by its Ascend 910C chips, has outpaced Nvidia’s flagship GB200 NVL72 rack system across several critical metrics.

The GB200 NVL72, unveiled earlier this year, is Nvidia's rack-scale powerhouse, built from 72 Blackwell GPUs and 36 Grace CPUs. Nvidia claims it delivers up to 30 times faster large language model inference and 25 times better energy efficiency than an equivalent number of H100 GPUs.

Huawei’s Big Leap: Ascend 910C Powers CloudMatrix 384

But Huawei is not standing still. SemiAnalysis reports that the CloudMatrix 384 delivers 300 petaFLOPS of dense BF16 compute, double what Nvidia's GB200 NVL72 offers. Huawei's system also packs 3.6 times the memory capacity and more than twice the memory bandwidth.

While Huawei’s chips aren’t fully homegrown—they still rely on global supply chains, using HBM memory from Korea, TSMC’s wafer production, and equipment from the U.S., Netherlands, and Japan—the accomplishment signals a serious shift. SemiAnalysis even urged U.S. policymakers to pay closer attention to these supply chain gaps if they want to curb China’s progress.

Although Huawei’s chip manufacturing remains a generation behind Nvidia’s cutting-edge technology, its system-level approach is leading in other ways. The CloudMatrix 384 stitches together 384 Ascend 910C chips into a full mesh network. Even though each chip individually offers about a third of the performance of Nvidia’s Blackwell GPUs, sheer scale gives Huawei the advantage.
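The scale argument checks out with a quick back-of-envelope calculation. The per-chip figures below are inferred from the system totals quoted in this article, not official spec-sheet numbers:

```python
# Back-of-envelope: how 384 weaker chips out-muscle 72 stronger GPUs.
# System totals are the figures quoted in this article (dense BF16).

CLOUDMATRIX_CHIPS = 384
NVL72_GPUS = 72

cloudmatrix_total_pflops = 300.0  # CloudMatrix 384, per SemiAnalysis
nvl72_total_pflops = 150.0        # GB200 NVL72, as quoted here

# Implied per-chip performance
per_910c = cloudmatrix_total_pflops / CLOUDMATRIX_CHIPS      # ~0.78 PFLOPS
per_blackwell = nvl72_total_pflops / NVL72_GPUS              # ~2.08 PFLOPS

print(f"Per Ascend 910C:   {per_910c:.2f} PFLOPS")
print(f"Per Blackwell GPU: {per_blackwell:.2f} PFLOPS")
print(f"910C vs Blackwell: {per_910c / per_blackwell:.0%}")  # roughly a third
print(f"System-level lead: {cloudmatrix_total_pflops / nvl72_total_pflops:.1f}x")
```

Each 910C lands at roughly a third of a Blackwell GPU, but fielding more than five times as many chips still yields a 2x system-level lead, which is exactly the trade Huawei is making.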

Huawei’s Next Play: Ascend 910D Is Coming

Adding to the buzz, The Wall Street Journal recently revealed that Huawei is preparing to test the next evolution—the Ascend 910D chip. This chip is specifically designed to go head-to-head with Nvidia’s H100. Huawei has already begun early trials with Chinese tech giants, with initial samples expected by late May and full production not far behind.

However, there has been some confusion. During an April 28 broadcast, CNBC mistakenly credited the recent performance leap to the unreleased 910D chip. SemiAnalysis quickly corrected the record, confirming the CloudMatrix 384’s breakthrough is powered by the current Ascend 910C—not the upcoming 910D.

How Huawei’s CloudMatrix 384 Stacks Up Against Nvidia’s GB200 NVL72

When placed side-by-side, Huawei’s system stands out:

  • Compute Power: 300 petaFLOPS of dense BF16, double Nvidia’s 150 petaFLOPS.
  • Memory Capacity: 49.2 TB of HBM, versus the GB200 NVL72’s 13.8 TB.
  • Memory Bandwidth: 1,229 TB/s, more than double Nvidia’s 576 TB/s.
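The headline multiples follow directly from those raw numbers. A quick check, using only the figures quoted above:

```python
# Verify the headline ratios from the spec numbers quoted above.
specs = {
    # metric: (CloudMatrix 384, GB200 NVL72)
    "BF16 compute (PFLOPS)":   (300.0, 150.0),
    "HBM capacity (TB)":       (49.2, 13.8),
    "Memory bandwidth (TB/s)": (1229.0, 576.0),
}

for metric, (huawei, nvidia) in specs.items():
    print(f"{metric}: {huawei / nvidia:.2f}x in Huawei's favor")
# -> roughly 2.00x, 3.57x, and 2.13x respectively, matching the
#    "double", "3.6x", and "more than twice" claims in the text.
```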

These numbers matter. In AI, raw computing muscle, memory size, and bandwidth make or break training times and inference speeds—critical for everything from advanced scientific research to autonomous driving systems.

Notably, Nvidia’s GB200 NVL72 already represented a huge jump over the H100, so Huawei’s leap past it implies a strong advantage over existing H100-based platforms like Nvidia’s DGX H100 systems.

But Huawei’s win comes with a big trade-off: energy efficiency. CloudMatrix 384 consumes nearly four times more power and offers 2.3 times worse performance per watt compared to Nvidia’s GB200 NVL72. While Huawei wins in brute strength, Nvidia remains ahead in power efficiency—a huge deal for data centers managing rising energy costs.
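The efficiency gap can be estimated from the ratios alone. Note the absolute power numbers below are illustrative, normalized to the GB200 NVL72's draw; SemiAnalysis's 2.3x figure comes from measured rack-level power, so a ratio-only estimate lands close to it but not exactly on it:

```python
# Rough performance-per-watt comparison using only the ratios quoted above.
# Power is normalized: GB200 NVL72 = 1.0 unit, CloudMatrix 384 ~3.9 units
# (the article's "nearly four times more power").

gb200_pflops, gb200_power = 150.0, 1.0
cloudmatrix_pflops, cloudmatrix_power = 300.0, 3.9

gb200_eff = gb200_pflops / gb200_power           # PFLOPS per power unit
cloudmatrix_eff = cloudmatrix_pflops / cloudmatrix_power

print(f"GB200 NVL72 (relative efficiency):  {gb200_eff:.1f}")
print(f"CloudMatrix 384:                    {cloudmatrix_eff:.1f}")
print(f"Nvidia's per-watt advantage:        {gb200_eff / cloudmatrix_eff:.2f}x")
# ~1.95x from ratios alone; the direction of the trade-off is the same
# either way: double the compute costs nearly quadruple the power.
```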

Why the Chip Battle Matters

Despite its momentum, Huawei still faces big challenges. The Ascend 910C chip itself, while powerful in large numbers, only delivers about 60% of the performance of a single Nvidia H100. Huawei’s real advantage lies in its ability to link hundreds of these chips through a lightning-fast optical mesh network.

Plus, even as Huawei advances, it does so while navigating heavy sanctions. After being blacklisted by the U.S. in 2019, the company was cut off from premium chipmakers like TSMC and key memory suppliers. Yet Huawei found creative ways around those hurdles, reportedly using intermediaries to access critical technologies and continuing to innovate despite restrictions.

Meanwhile, U.S. export controls tightened further in 2025, cutting off even downgraded Nvidia chips like the H20 from Chinese markets—leading to a $5.5 billion write-off for Nvidia and opening a bigger opportunity for Huawei.

The market took notice. Following the publication of SemiAnalysis’s findings, Nvidia’s stock dropped around 2%, a sign that investors are watching this escalating competition closely.

Looking Ahead: Huawei’s Tough Road

Even with its impressive system-level achievement, Huawei’s path forward isn’t simple:

  • Energy Concerns: Power-hungry data centers are costly and harder to scale sustainably.
  • Technology Gaps: Huawei’s chips still trail Nvidia’s latest HBM3e memory and 5nm-class process nodes by roughly two generations.
  • Software Ecosystem: Nvidia’s CUDA software stack is deeply entrenched among developers, while Huawei’s toolkit is still catching up.

The real test will come with the Ascend 910D’s launch. If Huawei can deliver a chip that matches or beats the H100 without relying solely on scale, it could reshape the competitive landscape even further.

For now, Huawei’s CloudMatrix 384 stands as a bold reminder: in the AI arms race, it’s not just about who builds the fastest chip—it’s about who can build the most powerful system.
