Archives AI News

Energy-Based Diffusion Generator for Efficient Sampling of Boltzmann Distributions

arXiv:2401.02080v3 Announce Type: replace-cross Abstract: Sampling from Boltzmann distributions, particularly those tied to high-dimensional and complex energy functions, poses a significant challenge in many fields. In this work, we present the Energy-Based Diffusion Generator (EDG), a novel approach that…
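To illustrate the problem setting (not the paper's EDG method), here is a minimal sketch of sampling from a Boltzmann distribution p(x) ∝ exp(-E(x)) with unadjusted Langevin dynamics, the classical baseline such generators aim to improve on; the function names and step sizes are illustrative assumptions.

```python
import numpy as np

def langevin_sample(grad_E, x0, step=1e-2, n_steps=2000, seed=None):
    """Unadjusted Langevin dynamics targeting p(x) proportional to exp(-E(x)).

    Each update follows the gradient of -E plus Gaussian noise scaled so the
    chain's stationary distribution approximates the Boltzmann distribution.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_E(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Toy energy: E(x) = x^2 / 2, whose Boltzmann distribution is a standard
# Gaussian, so grad_E(x) = x.
samples = np.array(
    [langevin_sample(lambda x: x, [0.0], seed=i)[0] for i in range(100)]
)
```

For high-dimensional, multimodal energies this chain mixes slowly, which is the sampling difficulty the abstract refers to.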

Best-of-$\infty$ — Asymptotic Performance of Test-Time Compute

arXiv:2509.21091v1 Announce Type: new Abstract: We study best-of-$N$ for large language models (LLMs) where the selection is based on majority voting. In particular, we analyze the limit $N \to \infty$, which we denote as Best-of-$\infty$. While this approach achieves impressive…
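The selection rule the abstract describes can be sketched in a few lines: draw N candidate answers and return the most frequent one. The sampler below is a hypothetical stand-in for an LLM call; this is only a sketch of majority-vote best-of-N, not the paper's analysis.

```python
import random
from collections import Counter

def best_of_n(sample_answer, n):
    """Draw n candidate answers and return the majority-vote winner."""
    answers = [sample_answer() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Toy sampler standing in for an LLM: returns "42" with probability 0.6.
rng = random.Random(0)
sampler = lambda: rng.choice(["42", "42", "42", "41", "41"])
vote = best_of_n(sampler, 101)
```

As $N \to \infty$ the vote concentrates on the answer the sampler emits most often, which is the Best-of-$\infty$ limit the paper studies.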

Understanding Optimization in Deep Learning with Central Flows

arXiv:2410.24206v2 Announce Type: replace-cross Abstract: Traditional theories of optimization cannot describe the dynamics of optimization in deep learning, even in the simple setting of deterministic training. The challenge is that optimizers typically operate in a complex, oscillatory regime called the…

Provably Sample-Efficient Robust Reinforcement Learning with Average Reward

arXiv:2505.12462v2 Announce Type: replace-cross Abstract: Robust reinforcement learning (RL) under the average-reward criterion is essential for long-term decision-making, particularly when the environment may differ from its specification. However, a significant gap exists in understanding the finite-sample complexity of these methods,…

Improved Scaling Laws in Linear Regression via Data Reuse

arXiv:2506.08415v2 Announce Type: replace-cross Abstract: Neural scaling laws suggest that the test error of large language models trained online decreases polynomially as the model size and data size increase. However, such scaling can be unsustainable when running out of new…
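The polynomial decay the abstract mentions is typically modeled as a power law, error ≈ a · n^(-b), which becomes a straight line in log-log space. A minimal sketch of recovering the exponent from (hypothetical, made-up) error measurements:

```python
import numpy as np

# Hypothetical test errors at increasing data sizes (illustrative values only).
n = np.array([1e3, 1e4, 1e5, 1e6])
err = np.array([0.30, 0.12, 0.048, 0.019])

# Power law err = a * n^(-b)  =>  log(err) = log(a) - b * log(n),
# so a linear fit in log-log space recovers the scaling exponent b.
slope, log_a = np.polyfit(np.log(n), np.log(err), 1)
exponent = -slope
```

Papers on scaling laws ask how such exponents change under interventions like the data reuse studied here.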

Response to Promises and Pitfalls of Deep Kernel Learning

arXiv:2509.21228v1 Announce Type: new Abstract: This note responds to “Promises and Pitfalls of Deep Kernel Learning” (Ober et al., 2021). The marginal likelihood of a Gaussian process can be compartmentalized into a data fit term and a complexity penalty. Ober…
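The decomposition mentioned here is the standard Gaussian process log marginal likelihood, log p(y) = -½ yᵀK⁻¹y - ½ log|K| - (n/2) log 2π, whose first term measures data fit and whose second penalizes complexity. A minimal numpy sketch of that split (the function name is illustrative, not from the note):

```python
import numpy as np

def gp_log_marginal(K, y):
    """Split the GP log marginal likelihood into data-fit and complexity terms."""
    n = len(y)
    L = np.linalg.cholesky(K)                       # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    data_fit = -0.5 * y @ alpha                     # -1/2 y^T K^{-1} y
    complexity = -np.log(np.diag(L)).sum()          # -1/2 log|K| via Cholesky
    const = -0.5 * n * np.log(2 * np.pi)
    return data_fit, complexity, data_fit + complexity + const
```

The debate the note responds to concerns how this complexity penalty behaves when the kernel itself is parameterized by a deep network.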

Enhanced Generative Model Evaluation with Clipped Density and Coverage

arXiv:2507.01761v2 Announce Type: replace-cross Abstract: Although generative models have made remarkable progress in recent years, their use in critical applications has been hindered by an inability to reliably evaluate the quality of their generated samples. Quality refers to at least…

Data-Efficient Time-Dependent PDE Surrogates: Graph Neural Simulators vs. Neural Operators

arXiv:2509.06154v2 Announce Type: replace-cross Abstract: Developing accurate, data-efficient surrogate models is central to advancing AI for Science. Neural operators (NOs), which approximate mappings between infinite-dimensional function spaces using conventional neural architectures, have gained popularity as surrogates for systems driven by…