Archives AI News

CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs

arXiv:2512.17970v1 Announce Type: new Abstract: Weight-only quantization is widely used to mitigate the memory-bound nature of LLM inference. Codebook-based methods extend this trend by achieving strong accuracy in the extremely low-bit regime (e.g., 2-bit). However, current kernels rely on dequantization,…

Convolutional-neural-operator-based transfer learning for solving PDEs

arXiv:2512.17969v1 Announce Type: new Abstract: The convolutional neural operator is a CNN-based architecture recently proposed to enforce structure-preserving continuous-discrete equivalence and enable the genuine, alias-free learning of solution operators of PDEs. This neural operator was demonstrated to outperform for certain cases…

MOORL: A Framework for Integrating Offline-Online Reinforcement Learning

arXiv:2506.09574v2 Announce Type: replace Abstract: Sample efficiency and exploration remain critical challenges in Deep Reinforcement Learning (DRL), particularly in complex domains. Offline RL, which enables agents to learn optimal policies from static, pre-collected datasets, has emerged as a promising alternative.…

GenUQ: Predictive Uncertainty Estimates via Generative Hyper-Networks

arXiv:2509.21605v2 Announce Type: replace Abstract: Operator learning is a recently developed generalization of regression to mappings between functions. It promises to drastically reduce expensive numerical integration of PDEs to fast evaluations of mappings between functional states of a system, i.e.,…

Renormalizable Spectral-Shell Dynamics as the Origin of Neural Scaling Laws

arXiv:2512.10427v3 Announce Type: replace Abstract: Neural scaling laws and double-descent phenomena suggest that deep-network training obeys a simple macroscopic structure despite highly nonlinear optimization dynamics. We derive such structure directly from gradient descent in function space. For mean-squared error loss,…

Density estimation via mixture discrepancy and moments

arXiv:2504.01570v2 Announce Type: replace-cross Abstract: With the aim of generalizing histogram statistics to higher-dimensional cases, density estimation via discrepancy-based sequential partition (DSP) has been proposed to learn an adaptive piecewise constant approximation defined on a binary sequential partition…

Learning Generalizable Neural Operators for Inverse Problems

arXiv:2512.18120v1 Announce Type: new Abstract: Inverse problems challenge existing neural operator architectures because ill-posed inverse maps violate continuity, uniqueness, and stability assumptions. We introduce B2B$^{-1}$, an inverse basis-to-basis neural operator framework that addresses this limitation. Our key innovation is to…