iMore
Connor Jewiss

Apple Intelligence trained on Google's custom chips, rather than on hardware from Nvidia


Apple announced iOS 18 at its WWDC developers conference on June 10. One of the biggest software updates we've ever seen, iOS 18 brings some incredible new features and more customization options to the iPhone than ever before. But the biggest addition is Apple Intelligence – Apple's set of AI features. All these features are powered by Apple's own AI models.

What makes these models special is that much of the work happens on your device, rather than everything being sent off to remote servers. Now, we know a little more about how Apple trained these AI models: per an official Apple research paper, the tech giant trained them using Google's custom chips rather than hardware from Nvidia.

It turns out Apple ditched Nvidia and instead opted for Google's TPUv4 and TPUv5 chips to churn through the mountains of data needed for its Apple Intelligence Foundation Language Models (AFMs). These AFMs are the brains behind the flashy Apple Intelligence features that are starting to roll out to developers.

How Apple trained its AI models on Google chips

Apple's main LLM (large language model) had the muscle of 8,192 TPUv4 chips working in unison. Picture it as eight slices of 1,024 chips each. The training was intense: a three-stage process involving trillions of tokens, with 6.3 trillion to start, followed by a mere 1 trillion, and a final stretch of 100 billion tokens for context-lengthening.
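To put those numbers in one place, here's a minimal Python sketch that lays out the hardware and the three training stages described above as plain data. The structure and field names are our own illustration, not Apple's actual training configuration.

# Rough summary of the AFM-server training run described above.
# Field names and structure are illustrative, not Apple's configuration format.
stages = [
    {"name": "core pre-training", "tokens": 6.3e12},
    {"name": "continued pre-training", "tokens": 1.0e12},
    {"name": "context-lengthening", "tokens": 100e9},
]
hardware = {"chip": "TPUv4", "slices": 8, "chips_per_slice": 1024}

total_chips = hardware["slices"] * hardware["chips_per_slice"]   # 8,192 chips
total_tokens = sum(stage["tokens"] for stage in stages)          # ~7.4 trillion tokens

print(f'{total_chips:,} {hardware["chip"]} chips, '
      f"{total_tokens / 1e12:.1f}T tokens across {len(stages)} stages")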

The data buffet for these AFMs was pretty lavish too, with contributions from the Applebot web crawler (following robots.txt, mind you), various licensed datasets, and a sprinkle of public code, math, and other datasets for good measure.
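For readers curious what "following robots.txt" means in practice, here's a small, generic Python sketch of the kind of check a polite crawler performs before fetching a page. It uses the standard library's robotparser and is purely illustrative; it isn't Applebot's actual code, and the URLs are placeholders.

from urllib.robotparser import RobotFileParser

# Generic robots.txt check, the way any polite crawler would do it.
# The URLs below are placeholders for illustration only.
robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()  # fetch and parse the site's robots.txt

url = "https://example.com/some-article"
if robots.can_fetch("Applebot", url):
    print(f"Allowed to crawl {url}")
else:
    print(f"robots.txt disallows crawling {url}")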

The AFM-on-device model, the slimmer sibling designed for offline features, underwent some serious knowledge distillation. This model, a tidy 3 billion parameters, was distilled from the 6.4-billion-parameter server model and trained using a single slice of 2,048 TPUv5p chips.
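Knowledge distillation, in general terms, means training the smaller "student" model to imitate the output distribution of a larger "teacher" model rather than learning from the raw data alone. Here's a minimal, generic Python sketch of the standard distillation loss for a single token; it illustrates the technique in the abstract and is not Apple's actual recipe.

import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution, optionally softened by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions for one token."""
    p = softmax(teacher_logits, temperature)   # teacher (e.g. the larger server model)
    q = softmax(student_logits, temperature)   # student (e.g. the smaller on-device model)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits over a four-token vocabulary (illustrative numbers only).
teacher = [2.0, 1.0, 0.2, -1.0]
student = [1.5, 1.2, 0.1, -0.8]
print(f"Distillation loss: {distillation_loss(teacher, student):.4f}")

In practice, training setups typically combine a loss like this with the ordinary next-token prediction loss when training the student.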

In terms of performance, Apple claims its AFM-server and AFM-on-device models are top-notch, excelling in benchmarks like Instruction Following, Tool Use, Writing, and more.
