Archives: AI News

MathBuddy: A Multimodal System for Affective Math Tutoring

arXiv:2508.19993v1. Announce Type: cross. Abstract: The rapid adoption of LLM-based conversational systems is already transforming the landscape of educational technology. However, the current state-of-the-art learning models do not take into account the…

Tracking World States with Language Models: State-Based Evaluation Using Chess

arXiv:2508.19851v1. Announce Type: new. Abstract: Large Language Models (LLMs) exhibit emergent capabilities in structured domains, suggesting they may implicitly internalize high-fidelity representations of world models. While probing techniques have shown…

EnvInjection: Environmental Prompt Injection Attack to Multi-modal Web Agents

arXiv:2505.11717v2. Announce Type: replace-cross. Abstract: Multi-modal large language model (MLLM)-based web agents interact with webpage environments by generating actions based on screenshots of the webpages. Environmental prompt injection attacks manipulate the…

CASE: An Agentic AI Framework for Enhancing Scam Intelligence in Digital Payments

arXiv:2508.19932v1. Announce Type: new. Abstract: The proliferation of digital payment platforms has transformed commerce, offering unmatched convenience and accessibility globally. However, this growth has also attracted malicious actors,…

PediatricsMQA: a Multi-modal Pediatrics Question Answering Benchmark

arXiv:2508.16439v3. Announce Type: replace-cross. Abstract: Large language models (LLMs) and vision-augmented LLMs (VLMs) have significantly advanced medical informatics, diagnostics, and decision support. However, these models exhibit systematic biases, particularly age bias, compromising their…