IBM has announced its Telum II processor, which has a built-in AI accelerator and is aimed at next-generation IBM Z mainframes that handle both mission-critical tasks and AI workloads. Compared to the original Telum, released in 2021, the new processor can potentially improve performance "by up to 70% across key system components," according to an email we received from IBM.
The Telum II processor packs eight high-performance cores operating at 5.5GHz, with improved branch prediction, store writeback, and address translation, as well as 36MB of L2 cache, a 40% increase over its predecessor. The CPU also supports virtual L3 and L4 caches expanding to 360MB and 2.88GB, respectively. A key feature of Telum II is its improved AI accelerator, which delivers four times the computational power of its predecessor, reaching 24 trillion operations per second (TOPS) with INT8 precision. The accelerator's architecture is optimized for handling AI workloads in real time with low latency. In addition, Telum II has a built-in DPU for faster transaction processing. Telum II is made on Samsung's 5HPP process technology and contains 43 billion transistors.
System-level improvements in Telum II allow each AI accelerator within a processor drawer to receive tasks from any of the eight cores, which balances workloads and lets a fully configured drawer make use of all 192 TOPS available across its accelerators.
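As a rough sanity check on the throughput figures quoted above, the following Python sketch works through the arithmetic. The numbers 24 TOPS, 192 TOPS, and "four times" come from the article; the conclusions drawn from them (eight accelerators per drawer, roughly 6 TOPS for the original Telum's accelerator) are inferences from those figures rather than numbers IBM is quoted on here, and the function names are purely illustrative.

```python
# Figures quoted in the article.
TOPS_PER_ACCELERATOR = 24   # INT8 throughput of one Telum II AI accelerator
TOPS_PER_DRAWER = 192       # quoted total for a fully configured drawer
GENERATIONAL_SPEEDUP = 4    # "four times the computational power"


def accelerators_per_drawer(drawer_tops: int, accelerator_tops: int) -> int:
    """Accelerator count implied by the per-drawer and per-accelerator figures."""
    count, remainder = divmod(drawer_tops, accelerator_tops)
    if remainder:
        raise ValueError("drawer total is not a whole multiple of the per-accelerator figure")
    return count


def implied_predecessor_tops(current_tops: float, speedup: float) -> float:
    """Throughput of the previous generation implied by the quoted speedup."""
    return current_tops / speedup


# Inference, not an IBM-quoted figure: 192 / 24 accelerators per drawer.
print(accelerators_per_drawer(TOPS_PER_DRAWER, TOPS_PER_ACCELERATOR))        # 8
# Inference, not an IBM-quoted figure: 24 / 4 TOPS for the original Telum.
print(implied_predecessor_tops(TOPS_PER_ACCELERATOR, GENERATIONAL_SPEEDUP))  # 6.0
```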
In addition to Telum II, IBM introduced its new Spyre AI accelerator add-in card, developed in collaboration with IBM Research. This processor contains 32 AI accelerator cores and shares architectural similarities with the AI accelerator in Telum II. The Spyre Accelerator can be integrated into the I/O subsystem of IBM Z through PCIe connections to boost the system's AI processing power. Spyre packs 26 billion transistors and is made on Samsung's 5LPE production node.
Both the Telum II processor and the Spyre Accelerator are designed to support ensemble AI methods, which use multiple AI models together to improve the accuracy and performance of a task. One example is fraud detection, where combining traditional neural networks with large language models (LLMs) can significantly enhance the detection of suspicious activities, according to IBM.
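To make the ensemble idea concrete, here is a minimal Python sketch that averages the fraud scores of two models. It is not IBM's implementation: the two scoring functions are hypothetical stand-ins for a traditional neural network and an LLM, and the weights, transaction fields, and thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Scorer = Callable[[dict], float]  # maps a transaction to a fraud score in [0, 1]


@dataclass(frozen=True)
class WeightedModel:
    name: str
    score: Scorer
    weight: float


def ensemble_score(transaction: dict, models: Sequence[WeightedModel]) -> float:
    """Weighted average of the individual model scores."""
    total_weight = sum(m.weight for m in models)
    if total_weight <= 0:
        raise ValueError("model weights must sum to a positive value")
    weighted_sum = sum(m.weight * m.score(transaction) for m in models)
    return weighted_sum / total_weight


# Hypothetical stand-in for a traditional neural network on structured fields.
def small_network_score(tx: dict) -> float:
    return 0.9 if tx["amount"] > 5000 else 0.1


# Hypothetical stand-in for an LLM reading the free-text memo.
def language_model_score(tx: dict) -> float:
    return 0.8 if "urgent" in tx["memo"].lower() else 0.2


models = [
    WeightedModel("small_network", small_network_score, weight=1.0),
    WeightedModel("language_model", language_model_score, weight=1.0),
]

suspicious = {"amount": 7200, "memo": "URGENT wire transfer"}
routine = {"amount": 40, "memo": "coffee"}

print(round(ensemble_score(suspicious, models), 2))
print(round(ensemble_score(routine, models), 2))
```

With equal weights, a transaction is scored as risky only to the extent that both models agree. A production system would tune the weights, or use one model's confidence to decide whether to consult the other.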
Both the Telum II processor and the Spyre Accelerator will be available in 2025, though IBM has not specified whether that will be early or late in the year.