
OpenAI on Thursday released GPT-5.3-Codex-Spark, its first AI model served on chips from Cerebras Systems and the ChatGPT maker's first production deployment on silicon outside its longstanding Nvidia stack. The new model is a streamlined, lower-power variant of Codex designed for fast, interruptible coding tasks, and it is initially rolling out as a research preview to ChatGPT Pro subscribers.
According to OpenAI, GPT-5.3-Codex-Spark is tuned for interactive development workflows such as editing specific sections of code and running targeted tests, and it is optimized for high throughput when served on ultra-low-latency hardware. The company claims the model can exceed 1,000 tokens per second in the right configuration, while defaulting to minimal edits and declining to run tests automatically unless instructed.
The hardware behind all this is Cerebras' third-generation Wafer Scale Engine (WSE-3). Unlike conventional GPU clusters built from many smaller chips connected over high-speed interconnects, Cerebras uses a single wafer-scale processor with roughly 900,000 AI cores and large pools of on-chip memory. The architecture is designed to minimize data movement, which is often the bottleneck in interactive inference workloads, and thereby cut latency.
OpenAI said last month that it had signed a deal to deploy Cerebras hardware for low-latency inference, and that it plans to bring 750 megawatts of Cerebras-backed compute online in phases through 2028. While that capacity will not replace Nvidia’s role in OpenAI’s training infrastructure, it gives the company a dedicated tier optimized for responsiveness rather than training.
Earlier this month, Sam Altman posted on X that OpenAI loves working with Nvidia and that "they make the best chips in the world," adding, "We hope to be a gigantic customer for a very long time." His comments followed a controversial Reuters report claiming that OpenAI was dissatisfied with some Nvidia chips.
OpenAI has also described its partnership with Nvidia as "foundational" and said it remains anchored on Nvidia as the core of its training and inference stack, even as it expands the ecosystem around it through partnerships with Cerebras and others. OpenAI's most powerful models continue to be trained and served on Nvidia systems.
OpenAI has also agreed to deploy 6 gigawatts of AMD chips over multiple years, and it has struck a deal with Broadcom to develop custom AI accelerators and networking components.
Codex itself now has more than 1 million weekly active users, according to OpenAI, and access to GPT-5.3-Codex-Spark will expand beyond Pro subscribers in the coming weeks as the company evaluates performance and demand.
