Archives AI News

Scaling Multi-Agent Environment Co-Design with Diffusion Models

arXiv:2511.03100v2 Announce Type: replace Abstract: The agent-environment co-design paradigm jointly optimises agent policies and environment configurations in search of improved system performance. With application domains ranging from warehouse logistics to windfarm management, co-design promises to fundamentally change how we deploy…

June 1, 2026

Bounded Behavioral Indistinguishability for Black-Box LLM Distillation

arXiv:2605.30448v1 Announce Type: new Abstract: Black-box LLM distillation is usually evaluated as an output-matching problem: a student is considered successful when its responses are semantically similar to, or task-consistent with, those of a teacher. However, output similarity does not imply…

June 1, 2026

Plain Transformers are Surprisingly Powerful Link Predictors

arXiv:2602.01553v2 Announce Type: replace Abstract: Link prediction is a core challenge in graph machine learning, demanding models that capture rich and complex topological dependencies. While Graph Neural Networks (GNNs) are the standard solution, state-of-the-art pipelines often rely on explicit structural…

June 1, 2026

VeriGate: Verifier-Gated Step-Level Supervision for GRPO

arXiv:2605.30451v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) is an effective recipe for training reasoning models with verifier-based outcome rewards, but its supervision is sparse: when all sampled trajectories for a prompt receive the same verifier reward, the…

June 1, 2026

Mollified Value Learning

arXiv:2602.23280v2 Announce Type: replace Abstract: Offline goal-conditioned reinforcement learning (GCRL) learns goal-reaching behaviors from static datasets, but accurate value estimation remains challenging under limited state-action coverage. Existing physics-informed approaches address this by imposing pointwise distance-like geometric constraints derived from Hamilton–Jacobi–Bellman…

June 1, 2026

A Unified Framework for Gradient Aggregation in Multi-Objective Optimization

arXiv:2605.30452v1 Announce Type: new Abstract: Many machine learning problems involve multiple inherent trade-offs that are best addressed by gradient-based multi-objective optimization (MOO) algorithms. Existing methods are often proposed with various motivations, analyzed case by case, and differ algorithmically in how…

June 1, 2026

FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics

arXiv:2605.17373v2 Announce Type: replace Abstract: AI research agents accelerate ML research by automating hypothesis generation, experimentation, and empirical refinement. Existing agent strategies range from greedy hill-climbing to tree search and evolutionary optimization, yet which strategy choices drive performance remains unclear.…

June 1, 2026

DisjunctiveNet: Neural Symbolic Learning via Differentiable Convexified Optimization Layers

arXiv:2605.30456v1 Announce Type: new Abstract: Many learning tasks in science and engineering are characterized by sparse datasets, which limits the effectiveness of purely data-driven approaches. At the same time, these problems are often accompanied by rich domain knowledge derived from…

June 1, 2026

Position: Quantum Kernel Machines Should Move Beyond Scalar-Valued Kernels to Realize Their Potential

arXiv:2506.03779v2 Announce Type: replace-cross Abstract: Quantum kernel functions are built using quantum-mechanical principles and have emerged as a centerpiece of quantum machine learning. The initial enthusiasm for quantum kernel machines has been tempered by recent studies suggesting that quantum kernels could…

June 1, 2026

Scalable Constrained Multi-Agent Reinforcement Learning via State Augmentation and Consensus for Separable Dynamics

arXiv:2605.30461v1 Announce Type: new Abstract: We present a distributed approach for constrained Multi-Agent Reinforcement Learning (MARL) that combines state-augmented policy learning with distributed consensus over dual variables. Our method targets systems where agents have separable dynamics but must coordinate to…

June 1, 2026