Archives AI News

On the Existence and Behaviour of Secondary Attention Sinks

arXiv:2512.22213v1 Announce Type: new Abstract: Attention sinks are tokens, often the beginning-of-sequence (BOS) token, that receive disproportionately high attention despite limited semantic relevance. In this work, we identify a class of attention sinks, which we term secondary sinks, that differ…
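As a rough illustration of the phenomenon (not the paper's detection method), one common heuristic flags "sink" tokens as key positions whose mean received attention greatly exceeds the uniform baseline. The function name and the threshold ratio below are hypothetical choices:

```python
import numpy as np

def find_sink_tokens(attn, ratio=4.0):
    """Flag key positions receiving disproportionate attention.

    attn: (num_queries, num_keys) row-stochastic attention matrix.
    Returns indices of keys whose mean received attention exceeds
    `ratio` times the uniform baseline 1 / num_keys.
    """
    received = attn.mean(axis=0)        # mean attention each key receives
    baseline = 1.0 / attn.shape[1]
    return np.where(received > ratio * baseline)[0]

# Synthetic example: 8 queries, 8 keys, with key 0 acting like a BOS sink.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 8))
logits[:, 0] += 5.0                     # force heavy attention onto key 0
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(find_sink_tokens(attn))           # key 0 is flagged as a sink
```

On real models the same statistic is typically computed per head and per layer, which is where secondary sinks beyond the BOS token would show up.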

Multivariate Conformal Prediction via Conformalized Gaussian Scoring

arXiv:2507.20941v2 Announce Type: replace-cross Abstract: While achieving exact conditional coverage in conformal prediction is unattainable without making strong, untestable regularity assumptions, the promise of conformal prediction hinges on finding approximations to conditional guarantees that are realizable in practice. A promising…
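For background, a minimal split-conformal sketch with a Gaussian (squared Mahalanobis) score shows the general recipe; this toy uses a single unconditional Gaussian fit rather than the paper's conditional construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 2-D responses drawn from a fixed Gaussian.
def sample(n):
    return rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 0.5]], size=n)

y_fit, y_cal, y_test = sample(500), sample(500), sample(2000)

# Fit a Gaussian on the proper-training split.
mu = y_fit.mean(axis=0)
prec = np.linalg.inv(np.cov(y_fit, rowvar=False))

def score(y):
    d = y - mu
    return np.einsum("ij,jk,ik->i", d, prec, d)   # squared Mahalanobis distance

# Calibrate the (1 - alpha) conformal quantile on held-out scores.
alpha = 0.1
n = len(y_cal)
q = np.quantile(score(y_cal), np.ceil((n + 1) * (1 - alpha)) / n)

# Prediction set = {y : score(y) <= q}, an ellipsoid; check marginal coverage.
coverage = (score(y_test) <= q).mean()
print(round(coverage, 3))
```

The resulting sets are ellipsoids, and empirical coverage lands near the nominal 90% level; the conditional variant would make `mu` and `prec` functions of the covariates.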

Müntz-Szász Networks: Neural Architectures with Learnable Power-Law Bases

arXiv:2512.22222v1 Announce Type: new Abstract: Standard neural network architectures employ fixed activation functions (ReLU, tanh, sigmoid) that are poorly suited for approximating functions with singular or fractional power behavior, a structure that arises ubiquitously in physics, including boundary layers, fracture…
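The core idea, a basis of fractional powers x^λ rather than fixed activations, can be sketched as follows. This is a hypothetical least-squares illustration with hand-picked exponents; in the proposed networks the exponents themselves would be learned by gradient descent:

```python
import numpy as np

def muntz_features(x, exponents):
    """Power-law basis x**lambda_k for x > 0, one column per exponent."""
    return np.power(x[:, None], np.asarray(exponents)[None, :])

# Fit sqrt-like data (a fractional-power singularity at 0) with a small
# bank of fractional powers.
x = np.linspace(0.01, 1.0, 200)
target = np.sqrt(x)
Phi = muntz_features(x, [0.25, 0.5, 1.0])
w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
residual = np.max(np.abs(Phi @ w - target))
print(residual < 1e-6)   # True: the basis contains x**0.5 exactly
```

A fixed ReLU or tanh network needs many units to mimic the infinite derivative of sqrt(x) at 0, whereas one well-placed fractional power captures it exactly, which is the motivation for making the exponents learnable.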

Multimodal Diffeomorphic Registration with Neural ODEs and Structural Descriptors

arXiv:2512.22689v1 Announce Type: cross Abstract: This work proposes a multimodal diffeomorphic registration method using Neural Ordinary Differential Equations (Neural ODEs). Nonrigid registration algorithms exhibit tradeoffs between their accuracy, the computational complexity of their deformation model, and its proper regularization. In…
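The diffeomorphic ingredient can be sketched in one dimension: integrate a smooth velocity field forward in time to obtain the deformation, and backward to obtain its (approximate) inverse. The velocity field and integrator below are hypothetical stand-ins; in the paper's setting the velocity would be a trained Neural ODE:

```python
import numpy as np

def integrate(points, velocity, t0, t1, steps=100):
    """Euler integration of dx/dt = velocity(x).  A Neural-ODE method would
    use a learned velocity field and an adaptive solver instead."""
    dt = (t1 - t0) / steps
    x = points.copy()
    for _ in range(steps):
        x = x + dt * velocity(x)
    return x

# Hypothetical smooth 1-D velocity field.
v = lambda x: 0.3 * np.sin(np.pi * x)

x0 = np.linspace(-1.0, 1.0, 11)
warped = integrate(x0, v, 0.0, 1.0)         # forward deformation
recovered = integrate(warped, v, 1.0, 0.0)  # integrate backwards in time
print(np.max(np.abs(recovered - x0)) < 1e-2)  # approximately invertible
```

Invertibility by construction is what distinguishes flow-based (diffeomorphic) registration from directly predicting a displacement field, at the cost of the ODE solves the abstract alludes to.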

JADAI: Jointly Amortizing Adaptive Design and Bayesian Inference

arXiv:2512.22999v1 Announce Type: cross Abstract: We consider problems of parameter estimation where design variables can be actively optimized to maximize information gain. To this end, we introduce JADAI, a framework that jointly amortizes Bayesian adaptive design and inference by training…
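The quantity being optimized, the expected information gain (EIG) of a design, has a closed form in a linear-Gaussian toy model, which makes the design-selection step concrete. This hypothetical setup is only the classical objective, not the JADAI architecture, which amortizes it with trained networks:

```python
import numpy as np

# Linear-Gaussian toy model: theta ~ N(0, s2_prior), y = d * theta + noise
# with noise ~ N(0, s2_noise).  The EIG of design d is then
# 0.5 * log(1 + d**2 * s2_prior / s2_noise).
s2_prior, s2_noise = 1.0, 0.25

def eig(d):
    return 0.5 * np.log1p(d**2 * s2_prior / s2_noise)

designs = np.array([0.1, 0.5, 1.0, 2.0])
best = designs[np.argmax(eig(designs))]
print(best)   # 2.0: the largest |d| gives the highest signal-to-noise
```

In nontrivial models the EIG has no closed form and must be estimated per design, which is the expensive step that amortized approaches train a network to bypass.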

Atom of Thoughts for Markov LLM Test-Time Scaling

arXiv:2502.12018v4 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have achieved significant performance gains through test-time scaling methods. However, existing approaches often incur redundant computations due to the accumulation of historical dependency information during inference. To address this challenge, we…