Archives AI News

Energy-Based Diffusion Generator for Efficient Sampling of Boltzmann Distributions

arXiv:2401.02080v3 Announce Type: replace-cross Abstract: Sampling from Boltzmann distributions, particularly those tied to high-dimensional and complex energy functions, poses a significant challenge in many fields. In this work, we present the Energy-Based Diffusion Generator (EDG), a novel approach that…
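To illustrate the problem setting (not the paper's EDG method), here is a minimal sketch of sampling from a Boltzmann distribution p(x) ∝ exp(-E(x)) with unadjusted Langevin dynamics, the classical baseline such generators aim to improve on; the function names and step sizes are illustrative assumptions.

```python
import numpy as np

def langevin_sample(grad_E, x0, step=1e-2, n_steps=2000, seed=None):
    """Unadjusted Langevin dynamics targeting p(x) proportional to exp(-E(x)).

    Each update follows the gradient of -E plus Gaussian noise scaled so the
    chain's stationary distribution approximates the Boltzmann distribution.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_E(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Toy energy: E(x) = x^2 / 2, whose Boltzmann distribution is a standard
# Gaussian, so grad_E(x) = x.
samples = np.array(
    [langevin_sample(lambda x: x, [0.0], seed=i)[0] for i in range(100)]
)
```

For high-dimensional, multimodal energies this chain mixes slowly, which is the sampling difficulty the abstract refers to.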

Best-of-$\infty$ — Asymptotic Performance of Test-Time Compute

arXiv:2509.21091v1 Announce Type: new Abstract: We study best-of-$N$ for large language models (LLMs) where the selection is based on majority voting. In particular, we analyze the limit $N \to \infty$, which we denote as Best-of-$\infty$. While this approach achieves impressive…
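The selection rule the abstract describes can be sketched in a few lines: draw N candidate answers and return the most frequent one. The sampler below is a hypothetical stand-in for an LLM call; this is only a sketch of majority-vote best-of-N, not the paper's analysis.

```python
import random
from collections import Counter

def best_of_n(sample_answer, n):
    """Draw n candidate answers and return the majority-vote winner."""
    answers = [sample_answer() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Toy sampler standing in for an LLM: returns "42" with probability 0.6.
rng = random.Random(0)
sampler = lambda: rng.choice(["42", "42", "42", "41", "41"])
vote = best_of_n(sampler, 101)
```

As $N \to \infty$ the vote concentrates on the answer the sampler emits most often, which is the Best-of-$\infty$ limit the paper studies.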

Understanding Optimization in Deep Learning with Central Flows

arXiv:2410.24206v2 Announce Type: replace-cross Abstract: Traditional theories of optimization cannot describe the dynamics of optimization in deep learning, even in the simple setting of deterministic training. The challenge is that optimizers typically operate in a complex, oscillatory regime called the…

Provably Sample-Efficient Robust Reinforcement Learning with Average Reward

arXiv:2505.12462v2 Announce Type: replace-cross Abstract: Robust reinforcement learning (RL) under the average-reward criterion is essential for long-term decision-making, particularly when the environment may differ from its specification. However, a significant gap exists in understanding the finite-sample complexity of these methods,…

Improved Scaling Laws in Linear Regression via Data Reuse

arXiv:2506.08415v2 Announce Type: replace-cross Abstract: Neural scaling laws suggest that the test error of large language models trained online decreases polynomially as the model size and data size increase. However, such scaling can be unsustainable when running out of new…
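The polynomial decay the abstract mentions is typically modeled as a power law, error ≈ a · n^(-b), which becomes a straight line in log-log space. A minimal sketch of recovering the exponent from (hypothetical, made-up) error measurements:

```python
import numpy as np

# Hypothetical test errors at increasing data sizes (illustrative values only).
n = np.array([1e3, 1e4, 1e5, 1e6])
err = np.array([0.30, 0.12, 0.048, 0.019])

# Power law err = a * n^(-b)  =>  log(err) = log(a) - b * log(n),
# so a linear fit in log-log space recovers the scaling exponent b.
slope, log_a = np.polyfit(np.log(n), np.log(err), 1)
exponent = -slope
```

Papers on scaling laws ask how such exponents change under interventions like the data reuse studied here.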

Response to Promises and Pitfalls of Deep Kernel Learning

arXiv:2509.21228v1 Announce Type: new Abstract: This note responds to “Promises and Pitfalls of Deep Kernel Learning” (Ober et al., 2021). The marginal likelihood of a Gaussian process can be compartmentalized into a data fit term and a complexity penalty. Ober…
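The decomposition mentioned here is the standard Gaussian process log marginal likelihood, log p(y) = -½ yᵀK⁻¹y - ½ log|K| - (n/2) log 2π, whose first term measures data fit and whose second penalizes complexity. A minimal numpy sketch of that split (the function name is illustrative, not from the note):

```python
import numpy as np

def gp_log_marginal(K, y):
    """Split the GP log marginal likelihood into data-fit and complexity terms."""
    n = len(y)
    L = np.linalg.cholesky(K)                       # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    data_fit = -0.5 * y @ alpha                     # -1/2 y^T K^{-1} y
    complexity = -np.log(np.diag(L)).sum()          # -1/2 log|K| via Cholesky
    const = -0.5 * n * np.log(2 * np.pi)
    return data_fit, complexity, data_fit + complexity + const
```

The debate the note responds to concerns how this complexity penalty behaves when the kernel itself is parameterized by a deep network.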

Enhanced Generative Model Evaluation with Clipped Density and Coverage

arXiv:2507.01761v2 Announce Type: replace-cross Abstract: Although generative models have made remarkable progress in recent years, their use in critical applications has been hindered by an inability to reliably evaluate the quality of their generated samples. Quality refers to at least…

Data-Efficient Time-Dependent PDE Surrogates: Graph Neural Simulators vs. Neural Operators

arXiv:2509.06154v2 Announce Type: replace-cross Abstract: Developing accurate, data-efficient surrogate models is central to advancing AI for Science. Neural operators (NOs), which approximate mappings between infinite-dimensional function spaces using conventional neural architectures, have gained popularity as surrogates for systems driven by…