Archives AI News

The Perception-Physics Paradox: Probing Scientific Alignment with TC-Bench

arXiv:2605.24782v2 Announce Type: replace Abstract: While Vision Foundation Models (VFMs) excel at predictive tasks on satellite imagery, their performance can arise from visual correlations rather than underlying structural invariants, making even perception-based out-of-distribution accuracy a poor proxy for scientific utility.…

Bayes-Sufficient Representations in Supervised Learning

arXiv:2606.04045v1 Announce Type: new Abstract: Representation learning is often described as preserving the information in an input that is relevant for prediction. This work asks what relevance means for a fixed supervised decision problem. A representation is defined to be…

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

arXiv:2602.12643v2 Announce Type: replace Abstract: We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a…

Self-Distilled Policy Gradient

arXiv:2606.04036v1 Announce Type: new Abstract: On-policy self-distillation, where a language model conditions on privileged context to supervise its own generations, is a promising source of dense supervision for sparse-reward reinforcement learning. Actually, it can be instantiated as an auxiliary full-vocabulary…

Do Transformers Need Three Projections? Systematic Study of QKV Variants

arXiv:2606.04032v1 Announce Type: new Abstract: Transformers have become the standard solution for various AI tasks, with the query, key, and value (QKV) attention formulation playing a central role. However, the individual contribution of these three projections and the impact of…

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

arXiv:2606.04031v1 Announce Type: new Abstract: Coupled gradient descent, where the update of one parameter block depends on another, underlies bilevel optimization, two-time-scale stochastic approximation, and adversarial training. When the coupled Jacobian is block-triangular, asymptotic stability is governed by the spectral radii of…

Position: Deployed Reinforcement Learning should be Continual

arXiv:2606.04029v1 Announce Type: new Abstract: Reinforcement Learning (RL) has received increasing attention and adoption in real-world use cases. Most of these systems follow a train-then-fix paradigm, where trained agents do not learn while interacting with the world until performance degrades…

Bypassing Prompt Guards in Production with Controlled-Release Prompting

arXiv:2510.01529v3 Announce Type: replace Abstract: Ball et al. recently established that prompt filtering for AI alignment faces a fundamental barrier: under standard cryptographic assumptions, no filter running significantly faster than the protected model can universally distinguish adversarial prompts from benign…

Unlocking Feature Learning in Gated Delta Networks at Scale

arXiv:2606.04048v1 Announce Type: new Abstract: Training and scaling Large Language Models demand enormous computational resources, motivating both efficient sub-quadratic architectures and principled hyperparameter tuning methods. While the Maximal Update Parametrization ($\mu$P) has enabled zero-shot hyperparameter transfer for standard Transformers, its…