Archives AI News

The hidden risks of temporal resampling in clinical reinforcement learning

arXiv:2602.06603v3 Announce Type: replace Abstract: Reinforcement learning (RL) is a type of artificial intelligence for making optimal choices. In healthcare, researchers generally use offline RL (ORL), where models are trained and evaluated from retrospective observational data. To accommodate inherently irregular…

Adaptive Layerwise Perturbation: Unifying Off-Policy Corrections for LLM RL

arXiv:2603.19470v2 Announce Type: replace Abstract: Off-policy problems such as policy staleness and training–inference mismatch have become a major bottleneck for training stability and further exploration in LLM RL. The distribution gap between the inference and updated policies grows because of…

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

arXiv:2604.26326v1 Announce Type: new Abstract: Reinforcement learning (RL) has unlocked complex reasoning abilities in large language models (LLMs). However, most RL algorithms suffer from performance saturation, preventing further gains as RL training scales. This problem can be characterized by the…

The Role of Symmetry in Optimizing Overparameterized Networks

arXiv:2604.25150v2 Announce Type: replace Abstract: Overparameterization is central to the success of deep learning, yet the mechanisms by which it improves optimization remain incompletely understood. We analyze weight-space symmetries in neural networks and show that overparameterization introduces additional symmetries that…