India’s AI dream is getting lost, but where?
When Indian Prime Minister Narendra Modi hosted world leaders and tech chiefs earlier this year in New Delhi, he declared that AI must be “democratized” and a “medium of inclusion and empowerment, especially across the Global South.”
It’s a convenient vision for Silicon Valley, which is in the midst of an ongoing land grab for the lucrative market. Young, tech-savvy and mobile-first, India has become one of the most important growth regions for AI, ranking behind only the US in usage for both OpenAI’s ChatGPT and Anthropic’s Claude.
But the key to both the “inclusion” and business dreams of tech diffusion is overcoming the barriers to speech. India has nearly two dozen official languages and more than a hundred dialects. If AI can’t close this gap, it will just become another technology that divides the English-speaking elite and everyone else. True localization will depend on whether models can comprehend Bengali voice notes, Gujarati payment queries, and code-switched Hindi-English business calls: all the messy, real-world spoken words that drive daily commerce and public life.
More than a billion people speak Indic languages. Yet one study found that GPT-5 achieved only about 45% accuracy on a human-curated benchmark covering 11 of them, including Modi’s mother tongue, Gujarati.
The first generation of AI tools was trained on internet text, the majority of which is in English.
Technical improvements and better datasets have helped recent models improve in non-English and so-called “low-resource languages,” those with less data to train on. But the language gap persists, especially for speech, which is forecast to become the next mainstream way of interacting with models.
“Voice is the most intuitive interface for humans, especially in more developing regions,” Sandeep Chinchali, the co-founder of Poseidon, an Andreessen Horowitz-backed data infrastructure startup, told me. And South Asia, he added, “uses voice for everything,” with businesses running through phone calls, WhatsApp voice memos, speech-based payments and, increasingly, voice-enabled coding tools. AI systems that can’t comprehend these interactions will be useless in automating this work, not to mention potentially dangerous in public services.
One problem is a lack of proper benchmarks for non-English models. Leading ones, for example, can’t even agree on what proper Bengali, a language spoken by more than 280 million people, should look like. The heart of the issue is still data, Chinchali says, and not just quantity (Bengali makes up less than 0.1% of web text) but also quality.
Spoken Indic languages add another layer of difficulty: regional variants, background noise, and frequent code-switching in technical and financial conversations. Speech data for AI training requires accurate transcription, longer clips, varied acoustic environments, and demographic and regional variety, as well as careful human review, before it can truly improve AI models. Systems trained on narrower datasets often fail in the real world, where conversations mix local slang and borrowed English words in varied settings.
India Inc.