Tom’s Hardware
Technology
Paul Alcorn

AMD Expands MI300 With GPU-Only Model, Eight-GPU Platform with 1.5TB of HBM3

(Image credit: Forbes)

AMD announced a range of new products today at its Data Center and AI Technology Premiere event here in San Francisco, California. The company finally shared more details about its Instinct MI300A processors, which feature 3D-stacked CPU and GPU cores on the same package with HBM, and announced a new GPU-only MI300X model along with a platform that brings eight of those accelerators together with an incredible 1.5TB of HBM3 memory.

The company also made announcements about its 5nm EPYC Bergamo processors for cloud-native applications and its EPYC Genoa-X processors with up to 1.1GB of L3 cache. All three of these products are available now, but AMD also has its EPYC Sienna processors for telco and the edge coming in the second half of 2023.

(Image credit: AMD)

Combined with AMD's portfolio of Alveo and Pensando networking and DPUs, AMD has a full stack of products geared for AI workloads, placing it into direct contention with market leader Nvidia, its primary competitor for AI acceleration products, and Intel, which also offers several AI-acceleration solutions across a broad range of products. 

This article focuses on the news around the MI300, but we'll add links to our other content shortly. We just received the information from AMD, so this article will be updated as we add more details. 

AMD Instinct MI300

(Image credit: AMD)

The Instinct MI300A is a data center APU that blends a total of 13 chiplets, many of them 3D-stacked, to create a single chip package with twenty-four Zen 4 CPU cores fused with a CDNA 3 graphics engine and eight stacks of HBM3 memory totaling 128GB. Overall, the chip weighs in with 146 billion transistors, making it the largest chip AMD has pressed into production. The nine compute dies, a mix of 5nm CPUs and GPUs, are 3D-stacked atop four 6nm base dies, which are active interposers that handle memory and I/O traffic, among other functions. The Instinct MI300 will power the two-exaflop El Capitan supercomputer, which is slated to be the fastest in the world when it comes online later this year.

(Image credit: AMD)

Today AMD announced a GPU-only variant, the MI300X, and presented several demos of its performance. The MI300X is optimized for large language models (LLMs) and comes equipped with only CDNA 3 GPU tiles paired with 192GB of HBM3 memory.

The voluminous memory capacity, spread across eight 24GB HBM3 stacks, allows the chip to run LLMs with up to 80 billion parameters, which AMD claims is a record for a single GPU. The chip delivers 5.2 TB/s of memory bandwidth across eight channels and 896 GB/s of Infinity Fabric bandwidth. The MI300X offers 2.4X the HBM density and 1.6X the HBM bandwidth of the Nvidia H100, meaning that it can hold larger models in memory than Nvidia's chip.
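As a rough sanity check on the capacity claims, the sketch below estimates the memory consumed by model weights alone. It assumes 16-bit weights (2 bytes per parameter) and ignores activations and KV cache, none of which AMD specified. The function names are ours, and the 80GB H100 capacity is an assumption that happens to match the 2.4X density claim.

```python
# Back-of-the-envelope estimate; precision and H100 capacity are assumptions.
MI300X_HBM_GB = 192   # from AMD's announcement
H100_HBM_GB = 80      # assumption: 80GB H100, consistent with the 2.4X claim


def weight_footprint_gb(params_billion, bytes_per_param=2):
    """Memory for weights only, in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 parameters * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billion * bytes_per_param.
    """
    return params_billion * bytes_per_param


def fits(params_billion, capacity_gb, bytes_per_param=2):
    """True if the weights alone fit in the given HBM capacity."""
    return weight_footprint_gb(params_billion, bytes_per_param) <= capacity_gb


if __name__ == "__main__":
    need = weight_footprint_gb(80)
    print(f"80B parameters at 2 bytes each: {need} GB of weights")
    print(f"Fits in one MI300X ({MI300X_HBM_GB} GB): {fits(80, MI300X_HBM_GB)}")
    print(f"Fits in one 80GB H100: {fits(80, H100_HBM_GB)}")
    print(f"Capacity ratio: {MI300X_HBM_GB / H100_HBM_GB:.1f}X")
```

Under these assumptions an 80-billion-parameter model needs roughly 160GB for weights, which leaves some headroom on a 192GB MI300X but would have to be split across multiple 80GB GPUs.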

The chip is forged from 12 chiplets on a mix of 5nm (GPU) and 6nm (I/O die) nodes, comprising eight GPU dies and four I/O dies, for a total of 153 billion transistors.

(Image credit: AMD)

The MI300A can run in several different modes, but the primary mode consists of a single memory domain and NUMA domain, thus providing uniform memory access for all the CPU and GPU cores. Meanwhile, the MI300X uses coherent memory between all of its GPU clusters. The key takeaway is that cache-coherent memory reduces data movement between the CPU and GPU. Because that data movement often consumes more power than the computation itself, cutting it reduces latency and improves performance and power efficiency.

AMD conducted a demo of the 40-billion-parameter Falcon-40B model running on a single MI300X GPU, but no performance metrics were provided. Instead, the LLM wrote a poem about San Francisco, the location of AMD's event. AMD says this is the first time a model this large has been run on a single GPU.

(Image credit: AMD)

AMD also announced the AMD Instinct Platform, which combines eight MI300X GPUs onto a single server motherboard with 1.5TB of total HBM3 memory. This platform is OCP-compliant, in contrast to Nvidia's proprietary MGX platforms. AMD says this open-sourced design will speed deployment. 
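The 1.5TB figure follows directly from the per-GPU capacity. The short sketch below reproduces it, assuming AMD counts 1TB as 1,024GB; the constant and function names are ours, not AMD's.

```python
# Aggregate HBM3 capacity of the eight-GPU Instinct Platform.
GPUS_PER_PLATFORM = 8   # from AMD's announcement
HBM_PER_GPU_GB = 192    # from AMD's announcement


def platform_hbm_gb(gpus=GPUS_PER_PLATFORM, per_gpu_gb=HBM_PER_GPU_GB):
    """Total HBM across all GPUs on the platform, in GB."""
    return gpus * per_gpu_gb


if __name__ == "__main__":
    total_gb = platform_hbm_gb()
    # Assumption: 1 TB = 1,024 GB, which yields AMD's quoted 1.5TB.
    print(f"{total_gb} GB total, or {total_gb / 1024} TB")
```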

The MI300A, the CPU+GPU model, is sampling now. The MI300X and the eight-GPU Instinct Platform will sample in the third quarter and launch in the fourth quarter. We're still digging for more details, so stay tuned for more in the coming hours.
