Sanjana Surbhi
Assistant Manager
New Delhi, Updated on Jun 5, 2025 16:06 IST

The BharatGen team from IIIT Hyderabad has launched Patram, India’s first vision-language foundational model designed specifically for document understanding.

IIIT Hyderabad’s BharatGen Team Launches Patram

BharatGen, a government-supported initiative focused on developing India-centric multimodal large language models, has launched Patram-7B-Instruct, India’s first vision-language foundational model built from the ground up for complex document understanding. The model was developed by a BharatGen team from the International Institute of Information Technology, Hyderabad (IIIT-H) and the Indian Institute of Technology, Bombay (IIT-B).

Patram by BharatGen

Patram is part of the BharatGen suite of multimodal large language models being developed with funding support from the Department of Science and Technology (DST). Patram-7B-Instruct is a 7-billion-parameter vision-language AI model trained on a large and diverse corpus of Indian documents. Designed to analyze and understand scanned or photographed documents, the model can interpret and respond to natural-language instructions. It is freely available as an open-source release on Hugging Face and on MeitY IndiaAI’s AIKosh platform.

Developed in just five months, Patram was created by a team based at IIIT Hyderabad, comprising alumni engineers and student interns, with institutional support from IIIT-H and TiH-IoT at IIT Bombay. The project was led by Dr. Ravi Kiran Sarvadevabhatla, Associate Professor at IIIT-Hyderabad, and Dr. Ganesh Ramakrishnan, Professor at IIT-Bombay.

Despite its compact size, Patram outperforms several larger international models, including DeepSeek-VL-2, on key benchmarks such as DocVQA and VisualMRC. It also performs strongly on Patram-Bench, a custom benchmark designed to reflect real-world Indian document scenarios.

Prof. P. J. Narayanan, Director, IIIT Hyderabad, said, “Patram marks a significant step as India designs state-of-the-art foundational models. With this launch, we integrate language available in all forms: as text, as speech, and as images. This can power multimodal applications with integrated vision-language intelligence.”

Dr. Ravi Kiran Sarvadevabhatla, Associate Professor at IIIT-Hyderabad and lead researcher on the project, said, “With Patram, we’ve built a model that understands the unique structure and diversity of Indian documents. This is just the beginning of what India can achieve in vision-language AI.”

About the Author
Sanjana Surbhi
Assistant Manager
Sanjana Surbhi has over five years of experience in the online education sector. Drawing from her tenure with ed-tech companies, she infuses her work with a wealth of knowledge from the education realm...
