Update 11/22/2024 6:45am PT: AMD has sent over additional details about the processor:
“The AMD EPYC 9V64H processor is an HPC focused processor that utilizes 96 Zen 4 cores with 128GB of HBM3 on package. It is designed to outperform the competition when it comes to memory bandwidth and memory latency sensitive workloads and was developed in close collaboration with Microsoft Azure. Our chiplet architecture has enabled us to easily build the EPYC 9V64H. The EPYC 9V64H is based on the SH5 socket that is used for the AMD Instinct MI300X and AMD Instinct MI300A accelerators.”
Original Article:
Microsoft has announced its latest high-performance computing (HPC) Azure virtual machines, which are powered by a custom AMD CPU that may have once been called MI300C.
The HBv series of Azure VMs are focused on delivering high amounts of memory bandwidth, an important specification for HPC; Microsoft calls it the “biggest HPC bottleneck.” Previously, Microsoft had used Milan-X and Genoa-X server CPUs with AMD’s 3D V-Cache to provide this extra bandwidth, but for the latest HBv5 VMs, Microsoft clearly wanted something even more performant.
The custom AMD CPU used for HBv5 VMs leverages HBM3, usually the memory of choice for the latest data center-class GPUs, such as AMD’s MI300X. With 6.9TB/s of memory bandwidth from the four chips in a single VM, HBv5 VMs offer almost nine times the bandwidth of the Genoa-X-based HBv4 VMs and nearly 20 times that of the Milan-X-based HBv3 VMs.
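As a rough check on those figures, the short Python sketch below works out each CPU's share of the VM's bandwidth and the baseline bandwidth that the quoted ratios imply for the older VMs. It assumes the bandwidth is split evenly across the four CPUs, and it treats “almost nine” and “nearly 20” as 9 and 20, so the baseline values are approximate floors rather than published specifications. The function names are illustrative.

```python
# Figures quoted in the article for one HBv5 VM.
HBV5_BANDWIDTH_TBS = 6.9  # total HBM3 bandwidth per VM, in TB/s
CPUS_PER_VM = 4


def per_cpu_bandwidth(total_tbs: float, cpus: int) -> float:
    """Bandwidth per CPU, assuming an even split across the CPUs."""
    return total_tbs / cpus


def implied_baseline(total_tbs: float, ratio: float) -> float:
    """Bandwidth of an older VM if HBv5 delivers `ratio` times as much."""
    return total_tbs / ratio


if __name__ == "__main__":
    share = per_cpu_bandwidth(HBV5_BANDWIDTH_TBS, CPUS_PER_VM)
    print(f"Per CPU: {share:.3f} TB/s")
    # The quoted ratios are rounded up, so these are approximate lower bounds.
    print(f"HBv4 (Genoa-X), at least: {implied_baseline(HBV5_BANDWIDTH_TBS, 9):.2f} TB/s")
    print(f"HBv3 (Milan-X), at least: {implied_baseline(HBV5_BANDWIDTH_TBS, 20):.2f} TB/s")
```

Under those assumptions, each CPU accounts for a little over 1.7TB/s, while the older VMs land well under 1TB/s for the whole machine.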
When paired with a CPU, the HBM3 fulfills a role similar to that of 3D V-Cache. Instead of expanding the pool of L3 cache, however, it effectively adds a massive L4 cache with even greater bandwidth and presumably much higher latency. That extra latency isn't as important in certain types of workloads.
Each HBv5 VM gets four of these custom AMD CPUs. All told, a single HBv5 VM offers 450GB of HBM3, 352 Zen 4 cores that clock up to 4GHz, and double the Infinity Fabric bandwidth that’s available on regular Epyc CPUs. SMT (hyperthreading) has, however, been disabled. The VMs also get 800Gb/s of Nvidia Quantum-2 InfiniBand networking.
At 352 cores across four CPUs, that’s 88 cores for each, though it’s likely that not every core on the processor is exposed to the VM. Each CCD has either eight or 16 cores, depending on whether it’s Zen 4 or Zen 4c, so the custom CPU could use 11 Zen 4 CCDs, or six Zen 4c CCDs with eight cores on one CCD disabled. It’s more probable that the CPU has 96 fully functional cores, with eight of them reserved for operating the VM, perhaps in an orchestration or hypervisor role.
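The core-count arithmetic above can be laid out in a few lines of Python. This is only a sketch of the three scenarios the paragraph describes. The 12-CCD case is inferred by dividing 96 cores by eight cores per Zen 4 CCD and is not a confirmed die layout, and the helper names are made up for illustration.

```python
# Cores per CCD, as stated in the article.
CORES_PER_CCD = {"zen4": 8, "zen4c": 16}


def cores_per_cpu(vm_cores: int, cpus: int) -> int:
    """Cores each CPU exposes to the VM, assuming an even split."""
    if vm_cores % cpus:
        raise ValueError("cores do not divide evenly across CPUs")
    return vm_cores // cpus


def usable_cores(ccd_type: str, ccds: int, withheld: int = 0) -> int:
    """Cores left after disabling or reserving `withheld` cores."""
    return CORES_PER_CCD[ccd_type] * ccds - withheld


if __name__ == "__main__":
    target = cores_per_cpu(352, 4)
    scenarios = {
        "11 Zen 4 CCDs": usable_cores("zen4", 11),
        "6 Zen 4c CCDs, 8 cores disabled": usable_cores("zen4c", 6, 8),
        # Inferred layout: 96 cores / 8 per CCD = 12 CCDs, 8 cores reserved.
        "12 Zen 4 CCDs, 8 cores reserved": usable_cores("zen4", 12, 8),
    }
    for name, cores in scenarios.items():
        print(f"{name}: {cores} cores (matches target: {cores == target})")
```

All three layouts arrive at 88 visible cores per CPU, which is why the core count alone cannot settle which one the chip uses.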
This “custom” AMD CPU might not be so custom either, as it sounds quite a bit like last year’s rumored MI300C chip. That chip was expected to be essentially an MI300A APU with Zen 4 CCDs in place of the CDNA 3 graphics, allowing for a 96-core CPU with HBM3. MI300A’s CPU cores clock up to 3.7GHz, not far off the 4GHz of the CPU used for HBv5, suggesting that the custom Azure processor and MI300C may be one and the same.
However, while the HBv5 CPU may not be custom on a technical level, it’s nevertheless Microsoft’s exclusive CPU. “It is only available on Azure,” Microsoft engineer Glenn Lockwood said on Bluesky, responding to a user wondering whether the AMD CPU would ever become available as a regular Epyc CPU.
If the HBv5 processor was formerly MI300C, AMD may have initially wanted to sell it to the general public but had trouble finding a market for it, according to AMD memory engineer Phil Park.
“Why haven’t we seen EPYC+HBM sooner? EPYC has been focused on high volume markets, which is why you don’t see EPYC with more than 2 sockets,” Park posted on Bluesky. “You can’t swap out your DDR5 controllers and add HBM controllers/stacks and call it a day. HBM forces certain design choices (e.g., every HBM3 stack requires sixteen 64-bit channels).
“Flexibility: with HBM, you can’t upgrade capacity or have lower cost versions with fewer channels populated,” he added. “Generally, CPUs don’t require that much bandwidth.”
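To put Park’s channel figure in perspective, the sketch below computes the interface width of one HBM3 stack from the sixteen 64-bit channels he mentions, then converts a per-pin data rate into bandwidth per stack. The 6.4Gb/s pin rate in the example is an illustrative assumption, not a confirmed figure for the Azure chip, and the function names are hypothetical.

```python
# From Park's quote: every HBM3 stack needs sixteen 64-bit channels.
CHANNELS_PER_STACK = 16
BITS_PER_CHANNEL = 64


def stack_width_bits() -> int:
    """Total data-bus width of one HBM3 stack, in bits."""
    return CHANNELS_PER_STACK * BITS_PER_CHANNEL


def stack_bandwidth_gbs(pin_rate_gbps: float) -> float:
    """Peak bandwidth of one stack in GB/s for a given per-pin rate in Gb/s."""
    return stack_width_bits() * pin_rate_gbps / 8  # 8 bits per byte


if __name__ == "__main__":
    print(f"Bus width per stack: {stack_width_bits()} bits")
    # Assumed pin rate, for illustration only.
    print(f"Per stack at 6.4 Gb/s per pin: {stack_bandwidth_gbs(6.4):.1f} GB/s")
```

A bus that is 1,024 bits wide per stack helps explain why HBM has to sit on the package next to the compute dies and cannot be swapped in for DDR5 controllers.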
This explanation lines up with the thus-far short history of HBM-equipped CPUs. Intel has already launched HBM-infused CPUs based on Sapphire Rapids, called Xeon Max, which are used in the Aurora supercomputer and are also generally available.
However, Intel confirmed last year there won’t be a version of Xeon Max based on Emerald Rapids, and it’s unclear if Granite Rapids will get a Xeon Max variant either, which may indicate that Xeon Max chips have not been a huge commercial success. The pragmatic decision for AMD may have been to secure a deal with Microsoft and dedicate MI300C production to Azure.