Archives AI News

MathBuddy: A Multimodal System for Affective Math Tutoring

arXiv:2508.19993v1. Announce Type: cross. Abstract: The rapid adoption of LLM-based conversational systems is already transforming the landscape of educational technology. However, the current state-of-the-art learning models do not take into account the…

August 29, 2025

InquireMobile: Teaching VLM-based Mobile Agent to Request Human Assistance via Reinforcement Fine-Tuning

arXiv:2508.19679v1. Announce Type: new. Abstract: Recent advances in Vision-Language Models (VLMs) have enabled mobile agents to perceive and interact with real-world mobile environments based on human instructions. However,…

August 29, 2025

Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?

arXiv:2508.19827v1. Announce Type: new. Abstract: Recent work has demonstrated that Chain-of-Thought (CoT) often yields limited gains for soft-reasoning problems such as analytical and commonsense reasoning. CoT can also be…

August 29, 2025

Understanding Fairness-Accuracy Trade-offs in Machine Learning Models: Does Promoting Fairness Undermine Performance?

arXiv:2411.17374v2. Announce Type: replace-cross. Abstract: Fairness in both Machine Learning (ML) predictions and human decision-making is essential, yet both are susceptible to different forms of bias, such as…

August 29, 2025

Tracking World States with Language Models: State-Based Evaluation Using Chess

arXiv:2508.19851v1. Announce Type: new. Abstract: Large Language Models (LLMs) exhibit emergent capabilities in structured domains, suggesting they may implicitly internalize high-fidelity representations of world models. While probing techniques have shown…

August 29, 2025

EnvInjection: Environmental Prompt Injection Attack to Multi-modal Web Agents

arXiv:2505.11717v2. Announce Type: replace-cross. Abstract: Multi-modal large language model (MLLM)-based web agents interact with webpage environments by generating actions based on screenshots of the webpages. Environmental prompt injection attacks manipulate the…

August 29, 2025

CASE: An Agentic AI Framework for Enhancing Scam Intelligence in Digital Payments

arXiv:2508.19932v1. Announce Type: new. Abstract: The proliferation of digital payment platforms has transformed commerce, offering unmatched convenience and accessibility globally. However, this growth has also attracted malicious actors,…

August 29, 2025

Flocking Behavior: An Innovative Inspiration for the Optimization of Production Plants

arXiv:2508.19963v1. Announce Type: new. Abstract: Optimizing modern production plants using the job-shop principle is a known hard problem. For very large plants, like semiconductor fabs, the problem becomes unsolvable…

August 29, 2025

PediatricsMQA: a Multi-modal Pediatrics Question Answering Benchmark

arXiv:2508.16439v3. Announce Type: replace-cross. Abstract: Large language models (LLMs) and vision-augmented LLMs (VLMs) have significantly advanced medical informatics, diagnostics, and decision support. However, these models exhibit systematic biases, particularly age bias, compromising their…

August 29, 2025

SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control

arXiv:2508.20018v1. Announce Type: new. Abstract: The rapid advancement of large vision language models (LVLMs) and agent systems has heightened interest in mobile GUI agents that can reliably translate…

August 29, 2025