Archives AI News

Probing Length Generalization in Mamba via Image Reconstruction

arXiv:2603.12499v1 Announce Type: new Abstract: Mamba has attracted widespread interest as a general-purpose sequence model due to its low computational complexity and competitive performance relative to transformers. However, its performance can degrade when inference sequence lengths exceed those seen during…

March 16, 2026

Accelerating Stroke MRI with Diffusion Probabilistic Models through Large-Scale Pre-training and Target-Specific Fine-Tuning

arXiv:2603.13007v1 Announce Type: cross Abstract: Purpose: To develop a data-efficient strategy for accelerated MRI reconstruction with Diffusion Probabilistic Generative Models (DPMs) that enables faster scan times in clinical stroke MRI when only limited fully-sampled data samples are available. Methods: Our…

March 16, 2026

Adaptive Conditional Forest Sampling for Spectral Risk Optimisation under Decision-Dependent Uncertainty

arXiv:2603.12507v1 Announce Type: new Abstract: Minimising a spectral risk objective, defined as a convex combination of expected cost and Conditional Value-at-Risk (CVaR), is challenging when the uncertainty distribution is decision-dependent, making both surrogate modelling and simulation-based ranking sensitive to tail…

March 16, 2026

Convergence Rate of a Functional Learning Method for Contextual Stochastic Optimization

arXiv:2603.13048v1 Announce Type: cross Abstract: We consider a stochastic optimization problem involving two random variables: a context variable $X$ and a dependent variable $Y$. The objective is to minimize the expected value of a nonlinear loss functional applied to the…

March 16, 2026

Byzantine-Robust Optimization under $(L_0, L_1)$-Smoothness

arXiv:2603.12512v1 Announce Type: new Abstract: We consider distributed optimization under Byzantine attacks in the presence of $(L_0,L_1)$-smoothness, a generalization of standard $L$-smoothness that captures functions with state-dependent gradient Lipschitz constants. We propose Byz-NSGDM, a normalized stochastic gradient descent method with…

March 16, 2026

Rethinking Attention: Polynomial Alternatives to Softmax in Transformers

arXiv:2410.18613v3 Announce Type: replace Abstract: This paper questions whether the strong performance of softmax attention in transformers stems from producing a probability distribution over inputs. Instead, we argue that softmax’s effectiveness lies in its implicit regularization of the Frobenius norm…

March 16, 2026

Learning Pore-scale Multiphase Flow from 4D Velocimetry

arXiv:2603.12516v1 Announce Type: new Abstract: Multiphase flow in porous media underpins subsurface energy and environmental technologies, including geological CO$_2$ storage and underground hydrogen storage, yet pore-scale dynamics in realistic three-dimensional materials remain difficult to characterize and predict. Here we introduce…

March 16, 2026

Accelerating Residual Reinforcement Learning with Uncertainty Estimation

arXiv:2506.17564v2 Announce Type: replace Abstract: Residual Reinforcement Learning (RL) is a popular approach for adapting pretrained policies by learning a lightweight residual policy that provides corrective actions. While Residual RL is more sample-efficient than finetuning the entire base policy, existing…

March 16, 2026

New sensor sniffs out pneumonia on a patient’s breath

The technology could enable fast, point-of-care diagnoses for pneumonia and other lung conditions.

March 16, 2026

Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models

arXiv:2603.05773v2 Announce Type: replace-cross Abstract: Safety alignment is often conceptualized as a monolithic process wherein harmfulness detection automatically triggers refusal. However, the persistence of jailbreak attacks suggests a fundamental mechanistic decoupling. We propose the textbf{underline{D}}isentangled textbf{underline{S}}afety textbf{underline{H}}ypothesis textbf{(DSH)}, positing that…

March 16, 2026