A seismic shift is occurring in the artificial intelligence hardware market, driven by a new contender: Cerebras Systems. Recently, the California-based startup announced the launch of Cerebras Inference, a solution it claims is 20 times faster than Nvidia's (NVDA) GPUs.
Cerebras has developed what it calls the Wafer Scale Engine, the third generation of which powers the new Cerebras Inference. This massive chip integrates 44GB of SRAM and eliminates the need for external memory, which has been a significant bottleneck in traditional GPU setups. By resolving the memory bandwidth issue, Cerebras Inference can deliver a whopping 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B, setting new industry standards for speed.
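To put those throughput figures in perspective, a quick back-of-the-envelope calculation shows what they mean for response time. This is an illustrative sketch only: the per-model rates are Cerebras' own claims, and the GPU baseline is derived from the company's "20 times faster" claim rather than from independent measurements.

```python
# Illustrative comparison of the claimed throughput figures.
# The GPU rate is derived by dividing by 20 (the claimed speedup),
# not measured directly.

claimed_rates = {
    "Llama3.1 8B": 1800,   # tokens per second (Cerebras claim)
    "Llama3.1 70B": 450,   # tokens per second (Cerebras claim)
}

response_tokens = 500  # a typical long chat response

for model, rate in claimed_rates.items():
    cerebras_time = response_tokens / rate
    gpu_time = response_tokens / (rate / 20)  # assuming the 20x claim holds
    print(f"{model}: Cerebras {cerebras_time:.2f}s vs GPU ~{gpu_time:.1f}s")
```

At the claimed rates, a 500-token answer from the 70B model would take just over a second on Cerebras versus roughly 22 seconds on a GPU that is 20 times slower, which is the difference between a conversational and a noticeably laggy experience.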
For investors and tech enthusiasts alike, the comparison between Cerebras and leading chip manufacturers like Nvidia, AMD (AMD), and Intel (INTC) becomes increasingly relevant. While Nvidia has long dominated the AI and deep learning sectors with its robust GPU solutions, Cerebras' entry with a distinct and potentially superior technology could disrupt market dynamics. Moreover, AMD and Intel, both significant players in the chip industry, may also feel the pressure as Cerebras chips begin to carve out a niche in high-performance AI tasks.
Comparing Cerebras Chips to Nvidia
Comparing Cerebras chips to Nvidia's GPUs involves looking at several key dimensions: hardware performance, architectural design, application suitability, and market impact.
Architectural Design
Cerebras: Cerebras' claim to fame is its Wafer Scale Engine, which, as the name suggests, is built on a single, massive wafer. The latest wafer-scale engine features approximately 4 trillion transistors and integrates 44GB of SRAM directly on-chip. This design eliminates the need for external memory, thus removing the memory bandwidth bottleneck that hampers traditional chip architectures. Cerebras focuses on creating the largest and most powerful chip that can store and process enormous AI models directly on the wafer, which dramatically reduces the latency involved in AI computations.
Nvidia: Nvidia's architecture is based on a multi-die approach where several GPU dies are connected via high-speed interlinks like NVLink. This setup, seen in their latest offerings like the DGX B200 server, allows for a modular and scalable approach but involves complex orchestration between multiple chips and memory pools. Nvidia's chips, like the B200, pack a substantial punch with billions of transistors and are optimized for both AI training and inference tasks, leveraging their advanced GPU architecture that has been refined over the years.
Performance
Cerebras: The performance of Cerebras chips is groundbreaking in specific scenarios, particularly AI inference, where the chip can process inputs at speeds reportedly 20 times faster than Nvidia's solutions. This is due to the direct integration of memory and processing power, which allows for faster data retrieval and processing without the inter-chip data transfer delays.
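The memory-bandwidth argument above can be made concrete with a rough calculation. In autoregressive inference, generating each token requires reading essentially all of the model's weights once, so sustained throughput is bounded by memory bandwidth divided by model size. The sketch below uses the article's own figures and assumes 16-bit weights; KV-cache and activation traffic are ignored, so it understates the true requirement.

```python
# Rough bandwidth needed to sustain the claimed 70B throughput.
# Assumes 16-bit (2-byte) weights and one full weight read per token;
# KV-cache and activation traffic are ignored.

params = 70e9            # Llama3.1 70B parameter count
bytes_per_param = 2      # FP16/BF16 weights
tokens_per_sec = 450     # Cerebras' claimed rate for the 70B model

model_bytes = params * bytes_per_param       # 140 GB of weights
required_bw = model_bytes * tokens_per_sec   # bytes per second needed

print(f"Weights: {model_bytes / 1e9:.0f} GB")
print(f"Required bandwidth: {required_bw / 1e12:.0f} TB/s")
```

The result, roughly 63 TB/s, is an order of magnitude beyond the HBM bandwidth of a single current GPU, which is why keeping weights in on-chip SRAM rather than external memory makes such a difference for single-stream inference speed.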
Nvidia: While Nvidia may lag behind Cerebras in raw inference speed per chip, its GPUs are extremely versatile and are considered industry-standard in various applications ranging from gaming to complex AI training tasks. Nvidia's strength lies in its ecosystem and software stack, which is robust and widely adopted, making its GPUs highly effective for a broad range of AI tasks.
Application Suitability
Cerebras: Cerebras chips are particularly suited for enterprises that require extremely fast processing of large AI models, such as those used in natural language processing and deep learning inference tasks. Their system is ideal for organizations that need to cut latency to the bare minimum and process large volumes of data in real time.
Nvidia: Nvidia's GPUs are more versatile and can handle a range of tasks, from rendering graphics in video games to training complex AI models and running simulations. This flexibility makes Nvidia a go-to choice for many sectors, not just those focused on AI.
Conclusion
While Cerebras offers superior performance in specific high-end AI tasks, Nvidia provides versatility and a strong ecosystem. The choice between Cerebras and Nvidia would depend on specific use cases and requirements. For organizations dealing with extremely large AI models where inference speed is critical, Cerebras could be the better choice. Meanwhile, Nvidia remains a strong contender across a wide range of applications, providing flexibility and reliability with a comprehensive software support ecosystem.
On the date of publication, Caleb Naysmith did not have (either directly or indirectly) positions in any of the securities mentioned in this article. All information and data in this article is solely for informational purposes. For more information please view the Barchart Disclosure Policy here.