Archives AI News

SynthAgent: Adapting Web Agents with Synthetic Supervision

arXiv:2511.06101v3 Announce Type: replace Abstract: Web agents struggle to adapt to new websites due to the scarcity of environment-specific tasks and demonstrations. Recent works have explored synthetic data generation to address this challenge; however, they suffer from data quality…

April 14, 2026

STaR-DRO: Stateful Tsallis Reweighting for Group-Robust Structured Prediction

arXiv:2604.09737v1 Announce Type: new Abstract: Structured prediction requires models to generate ontology-constrained labels, grounded evidence, and valid structure under ambiguity, label skew, and heterogeneous group difficulty. We present a two-part framework for controllable inference and robust fine-tuning. First, we introduce…

April 14, 2026

Ambivalence/Hesitancy Recognition in Videos for Personalized Digital Health Interventions

arXiv:2604.11730v1 Announce Type: cross Abstract: Using behavioural science, health interventions focus on behaviour change by providing a framework to help patients acquire and maintain healthy habits that improve medical outcomes. In-person interventions are costly and difficult to scale, especially in…

April 14, 2026

Active Inference with a Self-Prior in the Mirror-Mark Task

arXiv:2604.09673v1 Announce Type: new Abstract: The mirror self-recognition test evaluates whether a subject touches a mark on its own body that is visible only in a mirror, and is widely used as an indicator of self-awareness. In this study, we…

April 14, 2026

A Comparative Theoretical Analysis of Entropy Control Methods in Reinforcement Learning

arXiv:2604.09676v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a key approach for enhancing reasoning in large language models (LLMs), yet scalable training is often hindered by the rapid collapse of policy entropy, which leads to premature convergence and…

April 14, 2026

Belief-State RWKV for Reinforcement Learning under Partial Observability

arXiv:2604.09671v1 Announce Type: new Abstract: We propose a stronger formulation of RL on top of RWKV-style recurrent sequence models, in which the fixed-size recurrent state is explicitly interpreted as a belief state rather than an opaque hidden vector. Instead of…

April 14, 2026

Human-like Working Memory Interference in Large Language Models

arXiv:2604.09670v1 Announce Type: new Abstract: Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite having on the order…

April 14, 2026

Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model

arXiv:2604.09665v1 Announce Type: new Abstract: While the wide adoption of refusal training in large language models (LLMs) has showcased improvements in model safety, recent works have highlighted shortcomings due to the shallow nature of these alignment methods. To this end,…

April 14, 2026

FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

arXiv:2604.07413v2 Announce Type: replace-cross Abstract: The manufacturing sector is increasingly adopting Multimodal Large Language Models (MLLMs) to transition from simple perception to autonomous execution, yet current evaluations fail to reflect the rigorous demands of real-world manufacturing environments. Progress is hindered…

April 14, 2026

ExecTune: Effective Steering of Black-Box LLMs with Guide Models

arXiv:2604.09741v1 Announce Type: new Abstract: For large language models deployed through black-box APIs, recurring inference costs often exceed one-time training costs. This motivates composed agentic systems that amortize expensive reasoning into reusable intermediate representations. We study a broad class of…

April 14, 2026