The Languages AI Is Leaving Behind
The generative-AI boom looks very different for non-English speakers.
This is Atlantic Intelligence, a limited-run series in which our writers help you wrap your mind around artificial intelligence and a new machine age.
Generative AI is famously data-hungry. The technology requires huge troves of digital information—text, photos, video, audio—to “learn” how to produce convincingly humanlike material. The most powerful large language models have effectively “read” just about everything; when it comes to content mined from the open web, this means that AI is especially well versed in English and a handful of other languages, to the exclusion of thousands more that people speak around the world.
In a recent story for The Atlantic, my colleague Matteo Wong explored what this might mean for the future of communication. AI is positioned more and more as the portal through which billions of people might soon access the internet. Yet so far, the technology has developed in a way that will reinforce the dominance of English while possibly degrading the experience of the web for those who primarily speak languages with less minable data. “AI models might also be void of cultural nuance and context, no matter how grammatically adept they become,” Matteo writes. “Such programs long translated ‘good morning’ to a variation of ‘someone has died’ in Yoruba,” David Adelani, a DeepMind research fellow at University College London, told Matteo, “because the same Yoruba phrase can convey either meaning.”
But Matteo also explores how generative AI might be used as a tool to preserve languages. The grassroots efforts to create such applications move slowly. Meanwhile, tech giants charge ahead to deploy ever more powerful models on the web—crystallizing a status quo that doesn’t work for all.
— Damon Beres, senior editor
The AI Revolution Is Crushing Thousands of Languages
By Matteo Wong
Recently, Bonaventure Dossou learned of an alarming tendency in a popular AI model. The program described Fon—a language spoken by Dossou’s mother and millions of others in Benin and neighboring countries—as “a fictional language.”
This result, which I replicated, is not unusual. Dossou is accustomed to the feeling that his culture is unseen by technology that so easily serves other people. He grew up with no Wikipedia pages in Fon, and no translation programs to help him communicate with his mother in French, in which he is more fluent. “When we have a technology that treats something as simple and fundamental as our name as an error, it robs us of our personhood,” Dossou told me.
The rise of the internet, alongside decades of American hegemony, made English into a common tongue for business, politics, science, and entertainment. More than half of all websites are in English, yet more than 80 percent of people in the world don’t speak the language. Even basic aspects of digital life—searching with Google, talking to Siri, relying on autocorrect, simply typing on a smartphone—have long been closed off to much of the world. And now the generative-AI boom, despite promises to bridge languages and cultures, may only further entrench the dominance of English in life on and off the web.
What to Read Next
- So much for “learn to code”: “In the age of AI, computer science is no longer the safe major,” Kelli María Korducki writes.
- The new Luddites aren’t backing down: “Activists are organizing to combat generative AI and other technologies—and reclaiming a misunderstood label in the process,” Brian Merchant writes.
P.S.
America is getting sick of dating apps. As Lora Kelley reports, apps such as Grindr and Hinge are trying something new to spark interest: weekly discount codes for burrito bowls. No, just kidding: It’s artificial intelligence.
— Damon