Tom’s Hardware
Anton Shilov

TikTok owner ByteDance taps TSMC to make its own AI GPUs to stop relying on Nvidia — the company has reportedly spent over $2 billion on Nvidia AI GPUs

(Image: TSMC)

According to news outlet The Information, ByteDance, the parent company of TikTok, is developing two AI GPUs that are set to enter mass production by 2026, with TSMC making both products. Assuming the report, which cites unofficial sources, is accurate, ByteDance will reduce its reliance on Nvidia for AI hardware while staying within U.S. export regulations.

ByteDance's lineup of AI GPUs, which is still in the design phase and will enter mass production in a year at the earliest, includes one chip for AI training and another for AI inference. Broadcom, which has already built AI chips for Google, is expected to handle the design work. The GPUs are said to be produced on one of TSMC's N4/N5 process technologies, a class of nodes similar to the 4NP node TSMC uses to build Nvidia's Blackwell-series GPUs for AI and HPC. With mass production slated for 2026, deployment is also expected that year.

ByteDance has reportedly spent over $2 billion on more than 200,000 Nvidia H20 GPUs for its AI efforts this year alone (about $10,000 per unit, somewhat below the reported $12,000 to $13,000 price), and many of these GPUs have yet to be delivered to the company. This heavy investment highlights the importance of AI to ByteDance's overall strategy.
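As a rough sanity check on that per-unit figure, here is a back-of-the-envelope calculation using the round numbers reported above (these are reported figures, not confirmed totals):

# Back-of-the-envelope check of the per-unit price implied by the report.
# Both inputs are the reported round figures, not confirmed totals.
total_spend_usd = 2_000_000_000   # "over $2 billion" reportedly spent this year
units_ordered = 200_000           # "more than 200,000" H20 GPUs ordered

price_per_unit = total_spend_usd / units_ordered
print(f"Implied price per H20: ~${price_per_unit:,.0f}")   # ~$10,000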

According to the report, the shortage of Nvidia GPUs and their high prices are among the reasons why ByteDance decided to build its own AI hardware. Nvidia designed its HGX H20 and some other GPUs specifically for the Chinese market in response to the U.S. export controls imposed last year.

As a result, the HGX H20 is a massively cut-down GPU (compared to the H100) that still sells for a whopping $10,000, if the information about the price is correct. While Nvidia's HGX H20 offers only 296 INT8/FP8 TOPS and 148 BF16/FP16 TFLOPS for AI computations, the fully-fledged H100 delivers 3,958 INT8/FP8 TOPS and 1,979 BF16/FP16 TFLOPS. Yet, with 96 GB of HBM3 memory onboard, up to 4.0 TB/s of memory bandwidth, and 8-way GPU configurations, Nvidia's HGX H20 remains in high demand among Chinese companies for real-world applications, and it reportedly outperforms Huawei's competing processors.
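To put those cited figures in perspective, the short calculation below works out how much of the full H100's AI throughput the cut-down H20 retains (a simple ratio of the numbers quoted above, nothing more):

# Share of the full H100's AI throughput that the cut-down HGX H20 retains,
# using the TOPS/TFLOPS figures cited above.
h20  = {"INT8/FP8": 296,  "BF16/FP16": 148}
h100 = {"INT8/FP8": 3958, "BF16/FP16": 1979}

for fmt, h20_perf in h20.items():
    ratio = h20_perf / h100[fmt]
    print(f"{fmt}: H20 offers {ratio:.1%} of H100 throughput")
# Both data formats work out to roughly 7.5% of the H100's compute.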

While ByteDance is unlikely to make its GPUs significantly faster than Nvidia's HGX H20 due to U.S. export control rules (TSMC would not be allowed to ship such GPUs to ByteDance), they should be significantly cheaper for the company.

There is a massive catch to ByteDance's initiative to develop its own AI GPUs. The company currently relies on Nvidia's CUDA and its supporting software stack for AI training and inference. Once it moves to its own AI GPUs, it must develop its own software platform and ensure that software stack is fully compatible with its hardware. Although many Chinese companies have developed AI GPUs to reduce reliance on Nvidia, those chips are used only for select workloads, and the companies continue to rely on Nvidia's GPUs for the rest.
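To illustrate the kind of dependency described above, here is a generic PyTorch-style sketch (an assumption for illustration, not ByteDance's actual code): AI training and inference code is typically written against CUDA device selection, and every such call path needs an equivalent backend before it can target in-house silicon.

import torch

# Typical CUDA-centric device selection in AI code: it assumes Nvidia
# hardware and the CUDA software stack sitting underneath the framework.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
y = model(x)  # dispatches to CUDA kernels only if that stack is present

# An in-house accelerator needs its own kernels, compiler, and framework
# backend before lines like these could target it instead of "cuda".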

Nvidia expects to ship over one million HGX H20 units to its Chinese customers this year, nearly double Huawei's projected sales of 550,000 Ascend 910B AI GPUs for 2024. Nvidia's H20 GPUs could generate over $12 billion in revenue, surpassing the company's total China earnings from the previous year, including sales of other hardware types, such as GPUs for gamers.
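For a rough consistency check of that revenue projection, the sketch below multiplies the expected unit volume by the commonly reported H20 price range (the $12,000 to $13,000 range is an assumption based on earlier reporting, not an official figure):

# Rough consistency check of the projected China revenue, assuming the
# commonly reported $12,000 to $13,000 H20 price range (not an official figure).
units_shipped = 1_000_000          # "over one million" H20 units expected this year
price_range_usd = (12_000, 13_000)

low, high = (units_shipped * p for p in price_range_usd)
print(f"Implied H20 revenue: ${low / 1e9:.0f}B to ${high / 1e9:.0f}B")   # $12B to $13B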
