Conference — ACL Findings — 2026

Evaluating Large Vision Language Models on Bangla Medical Visual Question Answering

Rafid Ahmed, Intesar Tahmid, Mir Sazzat Hossain, Tasnimul Hossain Tomal, Md Mahir Jawad, Anam Borhan Uddin, Md Fahim, Md Farhad Alam Bhuiyan

Accepted to Findings of the Association for Computational Linguistics: ACL 2026.

Abstract

Recent advances in Large Language Models (LLMs) and Large Vision Language Models (LVLMs) have enabled general-purpose systems to show promising capabilities in complex reasoning tasks, including those in the medical domain. However, their evaluation has focused predominantly on high-resource languages, leaving low-resource languages such as Bangla underexplored. To address this gap, we introduce BanglaMedVQA, a multilingual Medical Visual Question Answering (VQA) dataset comprising clinically validated image-question-answer pairs, and evaluate nine state-of-the-art LVLMs on it using zero-shot, Chain-of-Thought (CoT), and LoRA fine-tuning strategies. Our results reveal a clear performance disparity: models perform well on general visual tasks but struggle with fine-grained diagnostic reasoning, achieving surprisingly low accuracy in specialized categories. Fine-tuning significantly improves overall accuracy, especially for Qwen2.5-VL and MedGemma 4B, but limitations in specialized medical reasoning persist. Our work provides a foundation for future research in Bangla medical VQA. The code and dataset are available at https://github.com/ahmedrafid023/BanglaMedVQA.

Cite

@inproceedings{ahmed-etal-2026-banglamedqa,
    title = {Evaluating Large Vision Language Models on Bangla Medical Visual Question Answering},
    author = {Rafid Ahmed and Intesar Tahmid and Mir Sazzat Hossain and Tasnimul Hossain Tomal and Md Mahir Jawad and Anam Borhan Uddin and Md Fahim and Md Farhad Alam Bhuiyan},
    booktitle = {Findings of the Association for Computational Linguistics: ACL 2026},
    year = {2026},
    note = {Accepted}
}