Tom’s Hardware
Hassam Nasir

Huawei unveils new Atlas 350 AI accelerator with 1.56 PFLOPS of FP4 compute and up to 112GB of HBM — claims 2.8x more performance than Nvidia's H20

Huawei Atlas 350 launch.

China's mission to become entirely self-reliant in the field of artificial intelligence has reached a new milestone. At the Huawei China Partner Conference 2026 in Shenzhen, the company unveiled its latest AI accelerator: the Atlas 350. The new NPU is based on an in-house Ascend 950PR chip, a significant upgrade over the last-gen Ascend 910-class silicon.

Huawei is marketing the Atlas 350 as a high-efficiency workhorse designed for the prefill stage (inference) of AI deployment. As such, it delivers 1.56 PFLOPS of FP4 throughput, which Huawei claims is 2.87 times higher than Nvidia's China-only H20. That number can't be verified because Hopper-era cards don't support FP4 natively, while the Atlas 350 is the first homegrown Chinese accelerator to be optimized for FP4 precision.
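Since the H20 baseline Huawei measured against isn't stated, a quick back-of-the-envelope check using only the figures quoted here shows what throughput that 2.87x claim implies for the H20:

```python
# Sanity check of Huawei's claim using only the article's numbers.
# The H20 figure is implied, not an official Nvidia spec — the H20 has
# no native FP4, so Huawei's baseline precision is unknown.
atlas_350_fp4_pflops = 1.56      # Huawei's stated FP4 throughput
claimed_speedup = 2.87           # Huawei's claimed advantage over the H20

implied_h20_pflops = atlas_350_fp4_pflops / claimed_speedup
print(f"Implied H20 baseline: {implied_h20_pflops * 1000:.0f} TFLOPS")
```

Whatever lower-precision format Huawei benchmarked the H20 at, the comparison is apples-to-oranges until the methodology is disclosed.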

That's already a significant achievement because even Nvidia only recently started to support the format with its Blackwell GPUs. FP4 allows for larger models to be deployed on the same hardware while requiring less memory. Speaking of which, the Atlas 350 comes with 112GB of Huawei's proprietary HBM known as "HiBL 1.0."
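The memory savings from FP4 are easy to illustrate. A rough footprint calculation (the 70-billion-parameter model size is an illustrative assumption, not from the announcement; the 112GB pool is the Atlas 350's stated capacity):

```python
# Approximate weight-memory footprint at different precisions.
# params_billion is an assumed example model size; hbm_gb is the
# Atlas 350's stated HiBL 1.0 capacity.
params_billion = 70
bytes_per_param = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}
hbm_gb = 112

for fmt, nbytes in bytes_per_param.items():
    weights_gb = params_billion * nbytes   # 1B params at 1 byte = 1 GB
    fits = "fits" if weights_gb <= hbm_gb else "does not fit"
    print(f"{fmt}: {weights_gb:.0f} GB of weights -> {fits} in {hbm_gb} GB")
```

At FP16 such a model's weights alone would overflow the card, while at FP4 they occupy well under half of it, leaving room for the KV cache that prefill workloads generate.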


Although the Ascend 950PR itself features 128GB of memory at 1.6 TB/s of bandwidth, current reports say the Atlas 350 maxes out at 1.4 TB/s. Memory access granularity has been reduced from 512 bytes to just 128 bytes, which should suit the small, scattered reads common in inference. The chip also supports 2 TB/s of interconnect bandwidth via the new LingQu protocol, 2.5x more than the previous Ascend 910 series. The Atlas 350 is rated at 600W, 200W more than the H20.
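A few ratios fall straight out of those quoted specs (all inputs are the article's own figures; the Ascend 910 interconnect number is merely what the 2.5x claim implies):

```python
# Derived ratios from the quoted specs; all inputs appear in the article.
atlas_bw_tbs = 1.4        # reported Atlas 350 memory bandwidth
chip_bw_tbs = 1.6         # Ascend 950PR's full memory bandwidth
lingqu_tbs = 2.0          # LingQu interconnect bandwidth
lingqu_gain = 2.5         # claimed uplift over the Ascend 910 series
fp4_pflops = 1.56
power_w = 600

print(f"Memory bandwidth used: {atlas_bw_tbs / chip_bw_tbs:.0%} of the 950PR's")
print(f"Implied Ascend 910 interconnect: {lingqu_tbs / lingqu_gain:.1f} TB/s")
print(f"FP4 efficiency: {fp4_pflops * 1000 / power_w:.1f} TFLOPS/W")
```

The card uses about 88% of the 950PR's memory bandwidth and lands at roughly 2.6 FP4 TFLOPS per watt, assuming Huawei's peak numbers hold up.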

Those specs paint an impressive picture for a homegrown chip, especially one that's made with U.S. sanctions in place. Huawei is not allowed to access TSMC's CoWoS tech that Nvidia uses to stack HBM near the GPU, so the company is leveraging some other advanced packaging. The memory itself is in-house and is supposed to compete with the likes of SK Hynix and Micron, though we don't know who the actual supplier is.

Precise availability wasn't announced — it rarely is with AI accelerators — but Huawei has kept its prior promise of a Q1 2026 release for the Ascend 950PR. BigGo Finance says the NPU is priced at 111,000 yuan (~$16,000), versus Nvidia's H20, which ranges anywhere from $15,000 to $25,000 in the region. Street pricing doesn't really exist for AI GPUs, so take this particular bit with a grain of salt.
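For reference, the dollar figure follows from a simple conversion (the exchange rate below is an assumption, roughly 7 yuan to the dollar, not a number from the report):

```python
# Yuan-to-dollar conversion behind the "~$16,000" figure.
# cny_per_usd is an assumed rate, not from the article.
price_cny = 111_000
cny_per_usd = 7.0

price_usd = price_cny / cny_per_usd
print(f"~${price_usd:,.0f}")   # lands near the quoted ~$16,000
```

That puts the Atlas 350 at the very bottom of the H20's reported $15,000-$25,000 price band.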

There are a lot more Ascend chips in the pipeline that we've covered in a dedicated article before. Despite the ambition to gain independence from foreign hardware, Chinese companies still source Nvidia GPUs (and not the nerfed ones), which makes sense considering how local silicon is not quite as competitive yet and because the CUDA software stack is so mature. Huawei's latest efforts, therefore, represent a serious step in trying to bridge that gap.
