Residuals-based Offline Reinforcement Learning

arXiv:2604.01378v1 Announce Type: new Abstract: Offline reinforcement learning (RL) has received increasing attention for learning policies from previously collected data without interaction with the real environment, which is particularly important in high-stakes applications. While a growing body of work has…

Deep Networks Favor Simple Data

arXiv:2604.00394v2 Announce Type: replace Abstract: Estimated density is often interpreted as indicating how typical a sample is under a model. Yet deep models trained on one dataset can assign higher density to simpler out-of-distribution (OOD) data than to in-distribution test…

Intervening to Learn and Compose Causally Disentangled Representations

arXiv:2507.04754v2 Announce Type: replace-cross Abstract: In designing generative models, it is commonly believed that in order to learn useful latent structure, we face a fundamental tension between expressivity and structure. In this paper we challenge this view by proposing a…

Test-Time Scaling Makes Overtraining Compute-Optimal

arXiv:2604.01411v1 Announce Type: new Abstract: Modern LLMs scale at test-time, e.g. via repeated sampling, where inference cost grows with model size and the number of samples. This creates a trade-off that pretraining scaling laws, such as Chinchilla, do not address.…

Improving Latent Generalization Using Test-time Compute

arXiv:2604.01430v1 Announce Type: new Abstract: Language Models (LMs) exhibit two distinct mechanisms for knowledge acquisition: in-weights learning (i.e., encoding information within the model weights) and in-context learning (ICL). Although these two modes offer complementary strengths, in-weights learning frequently struggles to…