Archives AI News

How Does the Pretraining Distribution Shape In-Context Learning? Task Selection, Generalization, and Robustness

arXiv:2510.01163v1 Announce Type: cross Abstract: The emergence of in-context learning (ICL) in large language models (LLMs) remains poorly understood despite its consistent effectiveness, enabling models to adapt to new tasks from only a handful of examples. To clarify and improve…

October 2, 2025

On the Benefits of Weight Normalization for Overparameterized Matrix Sensing

arXiv:2510.01175v1 Announce Type: cross Abstract: While normalization techniques are widely used in deep learning, their theoretical understanding remains relatively limited. In this work, we establish the benefits of (generalized) weight normalization (WN) applied to the overparameterized matrix sensing problem. We…

October 2, 2025

On Conformal Machine Unlearning

arXiv:2508.03245v2 Announce Type: replace-cross Abstract: The increasing demand for data privacy has made machine unlearning (MU) essential for removing the influence of specific training samples from machine learning models while preserving performance on retained data. However, most existing MU methods…

October 2, 2025

Private Learning of Littlestone Classes, Revisited

arXiv:2510.00076v1 Announce Type: new Abstract: We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner to online-learn a Littlestone class with a mistake bound of $\tilde{O}(d^{9.5}\cdot…

October 2, 2025

Learning to Dissipate Energy in Oscillatory State-Space Models

arXiv:2505.12171v2 Announce Type: replace-cross Abstract: State-space models (SSMs) are a class of networks for sequence learning that benefit from fixed state size and linear complexity with respect to sequence length, contrasting the quadratic scaling of typical attention mechanisms. Inspired from…

October 2, 2025

Accurate Estimation of Mutual Information in High Dimensional Data

arXiv:2506.00330v2 Announce Type: replace-cross Abstract: Mutual information (MI) is a fundamental measure of statistical dependence between two variables, yet accurate estimation from finite data remains notoriously difficult. No estimator is universally reliable, and common approaches fail in the high-dimensional, undersampled…

October 2, 2025

Approximation of differential entropy in Bayesian optimal experimental design

arXiv:2510.00734v1 Announce Type: new Abstract: Bayesian optimal experimental design provides a principled framework for selecting experimental settings that maximize obtained information. In this work, we focus on estimating the expected information gain in the setting where the differential entropy of…

October 2, 2025

Optimal placement of wind farms via quantile constraint learning

arXiv:2510.01093v1 Announce Type: new Abstract: Wind farm placement arranges the size and the location of multiple wind farms within a given region. The power output is highly related to the wind speed on spatial and temporal levels, which can be…

October 2, 2025

Bayesian Neural Networks for Functional ANOVA model

arXiv:2510.00545v1 Announce Type: new Abstract: With the increasing demand for interpretability in machine learning, functional ANOVA decomposition has gained renewed attention as a principled tool for breaking down a high-dimensional function into low-dimensional components that reveal the contributions of different variable…

October 2, 2025

Guaranteed Noisy CP Tensor Recovery via Riemannian Optimization on the Segre Manifold

arXiv:2510.00569v1 Announce Type: new Abstract: Recovering a low-CP-rank tensor from noisy linear measurements is a central challenge in high-dimensional data analysis, with applications spanning tensor PCA, tensor regression, and beyond. We exploit the intrinsic geometry of rank-one tensors by casting…

October 2, 2025