
Comprehensive language-image pre-training for 3D medical image understanding

arXiv:2510.15042v2. Abstract: Vision-language pre-training, i.e., aligning images with paired text, is a powerful paradigm to create encoders that can be directly used for tasks such as classification, retrieval, and segmentation. In the 3D medical image domain, these…

DatBench: Discriminative, Faithful, and Efficient VLM Evaluations

arXiv:2601.02316v2. Abstract: Empirical evaluation serves as the primary compass guiding research progress in foundation models. Despite a large body of work focused on training frontier vision-language models (VLMs), approaches to their evaluation remain nascent. To guide their…