What you need to know
- Google Translate's latest expansion covers languages spoken by over 614 million people, about 8% of the global population.
- The new languages range from widely spoken ones to indigenous dialects and even those without current native speakers.
- Google's PaLM 2 language model has been crucial in adding languages closely related to Hindi, as well as several French creoles.
- About a quarter of the new additions are African languages like Fon, Kikongo, Luo, Ga, Swati, Venda, and Wolof.
Artificial intelligence is helping Google Translate add 110 new languages, its biggest expansion yet.
Google announced in a blog post that the expansion covers languages spoken by over 614 million people worldwide, or about 8% of the global population. The additions range from major world languages with more than 100 million speakers to languages spoken by small indigenous communities.
Even languages with no current native speakers are included, showing Google's dedication to preserving endangered languages.
Google has also focused on African languages, which make up about a quarter of the new additions. Languages like Fon, Kikongo, Luo, Ga, Swati, Venda, and Wolof are now part of Google Translate, marking its most significant push into African languages so far.
Google's PaLM 2 large language model has been crucial in expanding Translate. It helps Translate learn closely related languages more efficiently, which enabled the addition of Hindi-related languages like Awadhi and Marwadi, as well as French creoles like Seychellois and Mauritian.
Meanwhile, adding widely spoken languages like Cantonese comes with its own set of challenges, because Cantonese shares its written characters with Mandarin. Despite these hurdles, Google's dedication to linguistic diversity shines through. A prime example is the inclusion of Manx, a Celtic language from the Isle of Man that nearly went extinct in 1974. Thanks to revitalization efforts, the number of speakers has since risen into the thousands.
This expansion also includes Punjabi written in Shahmukhi, a Perso-Arabic script used in Pakistan, where Punjabi is the most widely spoken language.
Google recognizes that translating languages isn't easy because of regional variations, dialects, and spelling differences. Some languages, such as Romani with its many dialects, have no single standard form, which makes translation tricky.
Before this expansion, Google Translate's biggest update came in May 2022 with the introduction of Zero-Shot Machine Translation. This technology allows the model to learn to translate a new language without needing pre-existing translated examples, a major advancement in machine translation.
This expansion is a major step toward Google's 1,000 Languages Initiative, which aims to build AI models that support the world's 1,000 most spoken languages.