Google revealed a total of 24 new languages coming to its Google Translate platform at this year’s I/O event.
The newly supported languages are spoken by a total of 300 million people across the globe, Google said. The most widely spoken of the new lot, Bhojpuri, is used by around 50 million speakers in northern India, Nepal, and Fiji. Meanwhile, the rarest addition, Sanskrit, remains in use by just 20,000 individuals in India.
Other tongues such as Aymara and Guarani come from South America, while Krio, Lingala, and others can trace their origins back to nations across Africa.
Google also highlighted that Aymara and Guarani, along with Quechua, are among the first Indigenous languages of the Americas supported by the platform. An English dialect, Sierra Leonean Krio, is also making its first appearance.
Beyond bringing the total number of available languages to 133, this update marks a major technical milestone for Google. The company noted the new languages were added using its “Zero-Shot Machine Translation” technology, in which the machine learning model sees only monolingual text, meaning text in a single language with no paired translations. According to the company, the model “learns to translate into another language without ever seeing an example.”
Google admitted that, while impressive, the technology remains imperfect and continues to be a work in progress. It hopes continued effort in development will allow languages using Zero-Shot Machine Translation to “deliver the same experience you’re used to with a Spanish or German translation, for example.”
The search company also published a blog post and research paper for those interested in a deeper dive into this nascent technology.