Agentic Transformers Provably Learn to Search via Reinforcement Learning

arXiv:2606.00183v1 Announce Type: new Abstract: Tree search is a central abstraction behind many language-agent reasoning and decision-making tasks: agents must explore actions, remember failures, and backtrack toward promising alternatives. Yet, we lack a theoretical understanding of how transformer-based policies acquire…

June 2, 2026

A Direct Approach for Handling Contextual Bandits with Latent State Dynamics

arXiv:2604.08149v2 Announce Type: replace Abstract: We consider a linear contextual bandit model where contexts and rewards are governed by a finite hidden Markov chain. We first revisit the simplified model by Nelson et al. (2022), in which rewards are linear…

June 2, 2026

Score Function Gradient Estimation to Widen the Applicability of Decision-Focused Learning

arXiv:2307.05213v3 Announce Type: replace Abstract: Many real-world optimization problems contain parameters that are unknown before deployment time, either due to stochasticity or to lack of information (e.g., demand or travel times in delivery problems). A common strategy in such cases…

June 2, 2026

Honest Lying: Understanding Memory Confabulation in Reflexive Agents

arXiv:2605.29463v2 Announce Type: replace Abstract: Reflexion-style agents rely on self-generated reflections as memory, implicitly assuming that agents can accurately diagnose their own failures. We show that this assumption can fail systematically: across ALFWorld and HumanEval, agents store confident but incorrect…

June 2, 2026

AI-Guided Design and Optimization of Graphite-Based Anodes via Iterative Experimental Feedback

arXiv:2606.00187v1 Announce Type: new Abstract: This study presents an iterative AI-guided workflow that accelerates graphite-based anode development by improving both formulation feasibility and process robustness. Sequential learning via AI/ML-guided multiobjective inverse design for anode optimization was implemented using the Citrine…

June 2, 2026

Fundamental bounds on efficiency-confidence trade-off for transductive conformal prediction

arXiv:2509.04631v2 Announce Type: replace Abstract: Transductive conformal prediction addresses the simultaneous prediction for multiple data points. Given a desired confidence level, the objective is to construct a prediction set that includes the true outcomes with the prescribed confidence. We demonstrate…

June 2, 2026

Beyond Discreteness: Sample Complexity Analysis of Straight-Through Estimator for 1-bit Quantization

arXiv:2505.18113v2 Announce Type: replace Abstract: Training quantized neural networks requires addressing the non-differentiable and discrete nature of the underlying optimization problem. To tackle this challenge, the straight-through estimator (STE) has become the most widely adopted heuristic, allowing backpropagation through discrete…

June 2, 2026

From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures

arXiv:2602.04861v2 Announce Type: replace Abstract: Machine Learning Interatomic Potentials (MLIPs) sometimes fail to reproduce the physical smoothness of the quantum potential energy surface (PES), leading to erroneous behavior in downstream simulations that standard energy and force regression evaluations can miss.…

June 2, 2026

Learning to Construct Practical Agentic Systems

arXiv:2606.00189v1 Announce Type: new Abstract: Automated design and optimization of agentic LLM-based systems leads to sophisticated systems that substantially improve result quality over off-the-shelf agentic patterns. However, studies of fielded agentic systems show that production systems focus much more on…

June 2, 2026

BAGEN: Are LLM Agents Budget-Aware?

arXiv:2606.00198v1 Announce Type: new Abstract: While agents are increasingly spending more resources, today agent cost is mostly measured only after execution. A Budget-Aware Agent (BAGEN) should treat budget as an active control signal, rather than a passive cost metric. We…

June 2, 2026