Archives AI News

Scaling Multi-Agent Environment Co-Design with Diffusion Models

arXiv:2511.03100v2 Announce Type: replace Abstract: The agent-environment co-design paradigm jointly optimises agent policies and environment configurations in search of improved system performance. With application domains ranging from warehouse logistics to windfarm management, co-design promises to fundamentally change how we deploy…

Bounded Behavioral Indistinguishability for Black-Box LLM Distillation

arXiv:2605.30448v1 Announce Type: new Abstract: Black-box LLM distillation is usually evaluated as an output-matching problem: a student is considered successful when its responses are semantically similar to, or task-consistent with, those of a teacher. However, output similarity does not imply…

Plain Transformers are Surprisingly Powerful Link Predictors

arXiv:2602.01553v2 Announce Type: replace Abstract: Link prediction is a core challenge in graph machine learning, demanding models that capture rich and complex topological dependencies. While Graph Neural Networks (GNNs) are the standard solution, state-of-the-art pipelines often rely on explicit structural…

VeriGate: Verifier-Gated Step-Level Supervision for GRPO

arXiv:2605.30451v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) is an effective recipe for training reasoning models with verifier-based outcome rewards, but its supervision is sparse: when all sampled trajectories for a prompt receive the same verifier reward, the…

Mollified Value Learning

arXiv:2602.23280v2 Announce Type: replace Abstract: Offline goal-conditioned reinforcement learning (GCRL) learns goal-reaching behaviors from static datasets, but accurate value estimation remains challenging under limited state-action coverage. Existing physics-informed approaches address this by imposing pointwise distance-like geometric constraints derived from Hamilton–Jacobi–Bellman…

A Unified Framework for Gradient Aggregation in Multi-Objective Optimization

arXiv:2605.30452v1 Announce Type: new Abstract: Many machine learning problems involve multiple inherent trade-offs that are best addressed by gradient-based multi-objective optimization (MOO) algorithms. Existing methods are often proposed with various motivations, analyzed case by case, and differ algorithmically in how…