Sarvam-M: India’s 24B Parameter Open-Source LLM Sets New Benchmarks in Indic AI

In a significant leap for India’s AI landscape, Sarvam AI has unveiled Sarvam-M, a 24-billion parameter open-source large language model (LLM) tailored for Indian languages, mathematics, and programming tasks. Built on the Mistral Small architecture, Sarvam-M delivers strong benchmark results while underscoring India’s commitment to developing sovereign AI capabilities.


Background: Sarvam AI’s Vision

Founded in 2023 by Vivek Raghavan and Pratyush Kumar, Sarvam AI emerged with the mission to bridge the linguistic and cultural gaps in AI models for India. Recognizing the limitations of existing models in handling India’s diverse languages and contexts, Sarvam AI set out to create models that resonate with the country’s unique needs. Their earlier release, Sarvam-1, a 2-billion parameter model, laid the groundwork by supporting 10 major Indian languages alongside English.


Sarvam-M: Technical Overview

Sarvam-M is the culmination of extensive research and development, focused on strengthening capabilities in Indian languages, mathematics, and programming. Key technical aspects include:

  • Foundation: Built on the Mistral Small architecture, ensuring a robust and scalable base.
  • Supervised Fine-Tuning (SFT): Curates a diverse set of prompts, scores them for quality and hardness, and builds a training curriculum that balances ‘non-think’ (direct answer) and ‘think’ (explicit reasoning) modes.
  • Reinforcement Learning with Verifiable Rewards (RLVR): Employs custom reward engineering across tasks like instruction following, math, and programming to sharpen the model’s reasoning; a sketch of the idea follows this list.
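
To make the RLVR idea concrete, below is a minimal sketch of a “verifiable reward” for a math task: the reward is computed by programmatically checking the model’s final answer against ground truth, rather than by a learned reward model. The function names and the answer-extraction heuristic here are illustrative assumptions, not Sarvam AI’s actual code.

```python
import re

def extract_final_answer(completion: str) -> str | None:
    """Heuristic: take the last number in the completion as the final answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None

def math_reward(completion: str, ground_truth: str) -> float:
    """Binary verifiable reward: 1.0 if the final answer matches, else 0.0."""
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth.strip() else 0.0

# A correct completion earns full reward; a wrong one earns none.
print(math_reward("Adding the two gives a total of 42.", "42"))  # 1.0
print(math_reward("So there are roughly 41 apples.", "42"))      # 0.0
```

Because such rewards are checked mechanically rather than labeled by humans, they scale cheaply to large RL runs, which is why math and programming are natural RLVR targets.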

Performance Benchmarks

Sarvam-M’s performance has been rigorously evaluated across various benchmarks:

  • Indian Languages: Achieved a +20% average improvement on Indian language benchmarks compared to the base model.
  • Mathematics: Demonstrated a +21.6% improvement on math benchmarks, showcasing enhanced problem-solving abilities.
  • Programming: Recorded a +17.6% improvement on programming benchmarks, indicating better code understanding and generation.

Notably, in tasks intersecting Indian languages and math, such as the romanized Indian language GSM-8K benchmark, Sarvam-M achieved a +86% improvement. These results position Sarvam-M competitively against larger models like Llama-3.3 70B and Gemma 3 27B.
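
To read these figures, the gains are presumably relative to the base model’s score rather than absolute percentage points; that interpretation, and the numbers below, are illustrative assumptions, not reported scores:

```python
def relative_improvement(base_score: float, new_score: float) -> float:
    """Relative gain of new_score over base_score, as a percentage."""
    return (new_score - base_score) / base_score * 100

# Hypothetical illustration: a base accuracy of 40.0% rising to 74.4%
# would correspond to the reported +86% relative improvement.
print(f"{relative_improvement(0.40, 0.744):.1f}%")  # 86.0%
```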


Deployment and Accessibility

Sarvam-M is openly available for download on Hugging Face, allowing researchers and developers to integrate and build upon its capabilities. Additionally, Sarvam AI provides APIs and a playground for experimenting with the model.
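
For local experimentation, the model can be loaded with the standard Hugging Face transformers workflow. The sketch below assumes the repository id sarvamai/sarvam-m and default generation settings; verify both against the model card before use.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repository id; confirm on the model card.
model_id = "sarvamai/sarvam-m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# A Hindi prompt: "What is the capital of India?"
messages = [{"role": "user", "content": "भारत की राजधानी क्या है?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```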


Strategic Significance

Sarvam-M’s release aligns with India’s broader vision of establishing a sovereign AI ecosystem. The model’s focus on Indian languages and contexts ensures that AI solutions are more inclusive and representative of the country’s diversity. Furthermore, Sarvam AI’s selection by the Indian government to develop the nation’s foundational LLM under the IndiaAI Mission underscores the strategic importance of such indigenous initiatives.


Future Prospects

Building on the success of Sarvam-M, Sarvam AI plans to develop more advanced models, including a proposed 70-billion parameter multimodal AI model supporting Indian languages and English. These endeavors aim to further solidify India’s position in the global AI landscape.


Conclusion

Sarvam-M represents a significant milestone in India’s AI journey, offering a powerful, open-source LLM tailored for the nation’s linguistic and computational needs. As AI continues to shape various sectors, models like Sarvam-M ensure that India’s diverse population is not only represented but also empowered in this technological evolution.