Archives AI News

Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking

arXiv:2510.02490v1 Announce Type: new Abstract: In this paper, we study the use of robust, model-independent bounded extremum seeking (ES) feedback control to improve the robustness of deep reinforcement learning (DRL) controllers for a class of nonlinear time-varying systems. DRL…

October 6, 2025

A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation

arXiv:2503.05696v3 Announce Type: replace Abstract: Many reinforcement learning (RL) algorithms are impractical for deployment in operational systems or for training with computationally expensive high-fidelity simulations, as they require large amounts of data. Meanwhile, low-fidelity simulators — such as reduced-order models,…

October 6, 2025

Beyond Imitation: Recovering Dense Rewards from Demonstrations

arXiv:2510.02493v1 Announce Type: new Abstract: Conventionally, supervised fine-tuning (SFT) is treated as a simple imitation learning process that only trains a policy to imitate expert behavior on demonstration datasets. In this work, we challenge this view by establishing a fundamental…

October 6, 2025

Risk-Sensitive Agent Compositions

arXiv:2506.04632v2 Announce Type: replace Abstract: From software development to robot control, modern agentic systems decompose complex objectives into a sequence of subtasks and choose a set of specialized AI agents to complete them. We formalize agentic workflows as directed acyclic…

October 6, 2025

In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning

arXiv:2510.02516v1 Announce Type: new Abstract: Analog in-memory computing (AIMC) accelerators enable efficient deep neural network computation directly within memory using resistive crossbar arrays, where model parameters are represented by the conductance states of memristive devices. However, effective in-memory training typically…

October 6, 2025

Feature Dynamics as Implicit Data Augmentation: A Depth-Decomposed View on Deep Neural Network Generalization

arXiv:2509.20334v2 Announce Type: replace Abstract: Why do deep networks generalize well? In contrast to classical generalization theory, we approach this fundamental question by examining not only inputs and outputs, but the evolution of internal features. Our study suggests a phenomenon…

October 6, 2025

Graph Generation with Spectral Geodesic Flow Matching

arXiv:2510.02520v1 Announce Type: new Abstract: Graph generation is a fundamental task with wide applications in modeling complex systems. Although existing methods align the spectrum or degree profile of the target graph, they often ignore the geometry induced by eigenvectors and…

October 6, 2025

KAIROS: Unified Training for Universal Non-Autoregressive Time Series Forecasting

arXiv:2510.02084v2 Announce Type: replace Abstract: In the World Wide Web, reliable time series forecasts provide the forward-looking signals that drive resource planning, cache placement, and anomaly response, enabling platforms to operate efficiently as user behavior and content distributions evolve. Compared…

October 6, 2025

Model-brain comparison using inter-animal transforms

arXiv:2510.02523v1 Announce Type: new Abstract: Artificial neural network models have emerged as promising mechanistic models of the brain. However, there is little consensus on the correct method for comparing model activations to brain responses. Drawing on recent work in philosophy…

October 6, 2025

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning

arXiv:2503.04697v2 Announce Type: replace-cross Abstract: Reasoning language models have shown an uncanny ability to improve performance at test-time by “thinking longer”, that is, by generating longer chain-of-thought sequences and hence using more compute. However, the length of their chain-of-thought reasoning is…

October 6, 2025