Archives AI News

Honest Lying: Understanding Memory Confabulation in Reflexive Agents

arXiv:2605.29463v2 Announce Type: replace Abstract: Reflexion-style agents rely on self-generated reflections as memory, implicitly assuming that agents can accurately diagnose their own failures. We show that this assumption can fail systematically: across ALFWorld and HumanEval, agents store confident but incorrect…

June 2, 2026

AI-Guided Design and Optimization of Graphite-Based Anodes via Iterative Experimental Feedback

arXiv:2606.00187v1 Announce Type: new Abstract: This study presents an iterative AI-guided workflow that accelerates graphite-based anode development by improving both formulation feasibility and process robustness. Sequential learning via AI/ML-guided multiobjective inverse design for anode optimization was implemented using the Citrine…

June 2, 2026

Fundamental bounds on efficiency-confidence trade-off for transductive conformal prediction

arXiv:2509.04631v2 Announce Type: replace Abstract: Transductive conformal prediction addresses the simultaneous prediction for multiple data points. Given a desired confidence level, the objective is to construct a prediction set that includes the true outcomes with the prescribed confidence. We demonstrate…

June 2, 2026

Beyond Discreteness: Sample Complexity Analysis of Straight-Through Estimator for 1-bit Quantization

arXiv:2505.18113v2 Announce Type: replace Abstract: Training quantized neural networks requires addressing the non-differentiable and discrete nature of the underlying optimization problem. To tackle this challenge, the straight-through estimator (STE) has become the most widely adopted heuristic, allowing backpropagation through discrete…

June 2, 2026

Well-Posed KL-Regularized Control via Wasserstein and Kalman-Wasserstein KL Divergences

arXiv:2602.02250v2 Announce Type: replace-cross Abstract: Kullback-Leibler (KL) divergence regularization is widely used in reinforcement learning, but it becomes infinite under support mismatch and can degenerate in low-noise regimes. Using a unified information-geometric framework, we introduce KL analogs by replacing the…

June 2, 2026

From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures

arXiv:2602.04861v2 Announce Type: replace Abstract: Machine Learning Interatomic Potentials (MLIPs) sometimes fail to reproduce the physical smoothness of the quantum potential energy surface (PES), leading to erroneous behavior in downstream simulations that standard energy and force regression evaluations can miss.…

June 2, 2026

Learning to Construct Practical Agentic Systems

arXiv:2606.00189v1 Announce Type: new Abstract: Automated design and optimization of agentic LLM-based systems leads to sophisticated systems that substantially improve result quality over off-the-shelf agentic patterns. However, studies of fielded agentic systems show that production systems focus much more on…

June 2, 2026

ChurnNet: An Optimized Modern AI for Churn Prediction

arXiv:2606.00169v1 Announce Type: new Abstract: Increased competition and the growing similarity of products and services offered by retailers have lowered the barriers for customers to switch to competitors. Accurate churn prediction can be a valuable tool for driving effective personalized…

June 2, 2026

BAGEN: Are LLM Agents Budget-Aware?

arXiv:2606.00198v1 Announce Type: new Abstract: While agents are increasingly spending more resources, today agent cost is mostly measured only after execution. A Budget-Aware Agent (BAGEN) should treat budget as an active control signal, rather than a passive cost metric. We…

June 2, 2026

Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models

arXiv:2605.25889v4 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models reach high success rates on clean inputs but collapse under small adversarial perturbations: a $16/255$ PGD attack drops OpenVLA-7B’s LIBERO success from $95\%$ to under $5\%$. Whether this trade-off has a theoretical…

June 2, 2026