Micron is currently an underdog in the high-bandwidth memory (HBM) market, but that appears to be changing quickly: the company says its HBM3E supply is sold out for 2024 and mostly allocated for 2025. So far, Micron has confirmed that its HBM3E will ship in Nvidia's H200 GPU for artificial intelligence and high-performance computing, which positions the company to take a sizeable share of the HBM market.
"Our HBM is sold out for calendar 2024, and the overwhelming majority of our 2025 supply has already been allocated," said Sanjay Mehrotra, chief executive of Micron, in prepared remarks for the company's earnings call this week. "We continue to expect HBM bit share equivalent to our overall DRAM bit share sometime in calendar 2025."
Micron's initial HBM3E stacks are 24 GB 8-Hi modules featuring a data transfer rate of 9.2 GT/s and a peak memory bandwidth of over 1.2 TB/s per device. Six of these stacks will be used in Nvidia's H200 GPU for AI and HPC, which offers 141 GB of high-bandwidth memory in total. Since Micron is the first company to start commercial shipments of HBM3E, it is well placed to sell its HBM3E packages in large volumes.
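The per-stack figures above can be sanity-checked with some quick arithmetic. The sketch below is illustrative only: the 1024-bit interface width is an assumption based on the standard HBM3-class stack interface rather than a number given in this article, and the function and variable names are made up for the example.

```python
# Back-of-the-envelope check of the HBM3E figures quoted in the article.
PIN_RATE_GTPS = 9.2          # data transfer rate per pin, from the article
INTERFACE_WIDTH_BITS = 1024  # ASSUMPTION: standard HBM3-class stack interface width
STACK_CAPACITY_GB = 24       # 8-Hi stack capacity, from the article
STACKS_PER_GPU = 6           # stacks on Nvidia's H200, from the article


def stack_bandwidth_gbps(pin_rate_gtps: float, width_bits: int) -> float:
    """Peak bandwidth of one stack in GB/s (transfers/s * bits / 8 bits per byte)."""
    return pin_rate_gtps * width_bits / 8


per_stack = stack_bandwidth_gbps(PIN_RATE_GTPS, INTERFACE_WIDTH_BITS)
raw_capacity = STACK_CAPACITY_GB * STACKS_PER_GPU

print(f"Per-stack peak bandwidth: {per_stack:.1f} GB/s")
print(f"Raw capacity of six stacks: {raw_capacity} GB")
```

Under these assumptions, a pin rate of exactly 9.2 GT/s works out to roughly 1.18 TB/s per stack, so the quoted "over 1.2 TB/s" would imply pin speeds slightly above 9.2 GT/s. Six 24 GB stacks add up to 144 GB of raw capacity, a little more than the 141 GB quoted for the H200.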
"We are on track to generate several hundred million dollars of revenue from HBM in fiscal 2024 and expect HBM revenues to be accretive to our DRAM and overall gross margins starting in the fiscal third quarter," said Mehrotra.
Mehrotra said that Micron had started sampling its 12-Hi HBM3E cubes, which increase capacity per stack by 50% over the 8-Hi version (from 24 GB to 36 GB) and therefore allow larger language models to be trained. These 36 GB HBM3E cubes will be used in next-generation AI processors, and their production will ramp up in 2025.
Because HBM is built from specialty DRAM dies, ramping up HBM production will limit how many DRAM ICs Micron can make for mainstream applications.
"The ramp of HBM production will constrain supply growth in non-HBM products," Mehrotra said. "Industrywide, HBM3E consumes approximately three times the wafer supply as DDR5 to produce a given number of bits in the same technology node."