Archives AI News

Residuals-based Offline Reinforcement Learning

arXiv:2604.01378v1. Announce Type: new. Abstract: Offline reinforcement learning (RL) has received increasing attention for learning policies from previously collected data without interaction with the real environment, which is particularly important in high-stakes applications. While a growing body of work has…

Deep Networks Favor Simple Data

arXiv:2604.00394v2. Announce Type: replace. Abstract: Estimated density is often interpreted as indicating how typical a sample is under a model. Yet deep models trained on one dataset can assign higher density to simpler out-of-distribution (OOD) data than to in-distribution test…

Intervening to Learn and Compose Causally Disentangled Representations

arXiv:2507.04754v2. Announce Type: replace-cross. Abstract: In designing generative models, it is commonly believed that in order to learn useful latent structure, we face a fundamental tension between expressivity and structure. In this paper we challenge this view by proposing a…

Test-Time Scaling Makes Overtraining Compute-Optimal

arXiv:2604.01411v1. Announce Type: new. Abstract: Modern LLMs scale at test time, e.g. via repeated sampling, where inference cost grows with model size and the number of samples. This creates a trade-off that pretraining scaling laws, such as Chinchilla, do not address.…

Improving Latent Generalization Using Test-time Compute

arXiv:2604.01430v1. Announce Type: new. Abstract: Language Models (LMs) exhibit two distinct mechanisms for knowledge acquisition: in-weights learning (i.e., encoding information within the model weights) and in-context learning (ICL). Although these two modes offer complementary strengths, in-weights learning frequently struggles to…

Causal K-Means Clustering

arXiv:2405.03083v5. Announce Type: replace-cross. Abstract: Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup structure is typically unknown, it is more challenging to identify…