Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis
arXiv:2508.19831v2
Abstract: Evaluating instruction-tuned Large Language Models (LLMs) in Hindi is challenging due to a lack of high-quality benchmarks, as direct translation of English datasets fails to capture crucial linguistic and cultural nuances. To address this, we…
