Archives: AI News

Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective

arXiv:2509.22921v1 Announce Type: new Abstract: We introduce a novel approach to large language model (LLM) distillation by formulating it as a constrained reinforcement learning problem. While recent work has begun exploring the integration of task-specific rewards into distillation processes, existing…

September 30, 2025

MonoCon: A general framework for learning ultra-compact high-fidelity representations using monotonicity constraints

arXiv:2509.22931v1 Announce Type: new Abstract: Learning high-quality, robust, efficient, and disentangled representations is a central challenge in artificial intelligence (AI). Deep metric learning frameworks tackle this challenge primarily using architectural and optimization constraints. Here, we introduce a third approach that…

September 30, 2025

Lightweight Learning for Grant-Free Activity Detection in Cell-Free Massive MIMO Networks

arXiv:2503.11305v3 Announce Type: replace-cross Abstract: Grant-free random access (GF-RA) is a promising access technique for massive machine-type communications (mMTC) in future wireless networks, particularly in the context of 5G and beyond (6G) systems. Within the context of GF-RA, this study…

September 30, 2025

Compute-Optimal Quantization-Aware Training

arXiv:2509.22935v1 Announce Type: new Abstract: Quantization-aware training (QAT) is a leading technique for improving the accuracy of quantized neural networks. Previous work has shown that decomposing training into a full-precision (FP) phase followed by a QAT phase yields superior accuracy…

September 30, 2025

Diffusion models for multivariate subsurface generation and efficient probabilistic inversion

arXiv:2507.15809v3 Announce Type: replace-cross Abstract: Diffusion models offer stable training and state-of-the-art performance for deep generative modeling tasks. Here, we consider their use in the context of multivariate subsurface modeling and probabilistic inversion. We first demonstrate that diffusion models enhance…

September 30, 2025

Understanding SOAP from the Perspective of Gradient Whitening

arXiv:2509.22938v1 Announce Type: new Abstract: Shampoo with Adam in the Preconditioner’s eigenbasis (SOAP) has recently emerged as a promising optimization algorithm for neural network training, achieving superior training efficiency over both Adam and Shampoo in language modeling tasks. In this…

September 30, 2025

Quantitative convergence of trained single layer neural networks to Gaussian processes

arXiv:2509.24544v1 Announce Type: cross Abstract: In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit. While previous work has established qualitative convergence under broad settings,…

September 30, 2025

SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights

arXiv:2509.22944v1 Announce Type: new Abstract: Post-training quantization has emerged as the most widely used strategy for deploying large language models at low precision. Still, current methods show perplexity degradation at bit-widths less than or equal to 4, partly because representing…

September 30, 2025

LVT: Large-Scale Scene Reconstruction via Local View Transformers

arXiv:2509.25001v1 Announce Type: cross Abstract: Large transformer models are proving to be a powerful tool for 3D vision and novel view synthesis. However, the standard Transformer’s well-known quadratic complexity makes it difficult to scale these methods to large scenes. To…

September 30, 2025

Meta-Learning Fourier Neural Operators for Hessian Inversion and Enhanced Variational Data Assimilation

arXiv:2509.22949v1 Announce Type: new Abstract: Data assimilation (DA) is crucial for enhancing solutions to partial differential equations (PDEs), such as those in numerical weather prediction, by optimizing initial conditions using observational data. Variational DA methods are widely used in oceanic…

September 30, 2025