Archives AI News

Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting

arXiv:2603.08907v1 Announce Type: new Abstract: We present a comprehensive ablation of nine finite-sample bound families for selective prediction with risk control, combining concentration inequalities (Hoeffding, Empirical Bernstein, Clopper-Pearson, Wasserstein DRO, CVaR) with multiple-testing corrections (union bound, Learn Then Test fixed-sequence)…

March 11, 2026

XConv: Low-memory stochastic backpropagation for convolutional layers

arXiv:2106.06998v3 Announce Type: replace Abstract: Training convolutional neural networks at scale demands substantial memory, largely due to storing intermediate activations for backpropagation. Existing approaches — such as checkpointing, invertible architectures, or gradient approximation methods like randomized automatic differentiation — either…

March 11, 2026

Quantifying Memorization and Privacy Risks in Genomic Language Models

arXiv:2603.08913v1 Announce Type: new Abstract: Genomic language models (GLMs) have emerged as powerful tools for learning representations of DNA sequences, enabling advances in variant prediction, regulatory element identification, and cross-task transfer learning. However, as these models are increasingly trained or…

March 11, 2026

The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM

arXiv:2505.11635v2 Announce Type: replace Abstract: Many real-world tasks, from associative memory to symbolic reasoning, benefit from discrete, structured representations that standard continuous latent models can struggle to express. We introduce the Gaussian-Multinoulli Restricted Boltzmann Machine (GM-RBM), a generative energy-based model…

March 11, 2026

Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates

arXiv:2603.08914v1 Announce Type: new Abstract: Over-parameterized neural networks incur prohibitive memory and computational costs for resource-constrained deployment. The Strong Lottery Ticket (SLT) hypothesis suggests that randomly initialized networks contain sparse subnetworks achieving competitive accuracy without weight training. Existing SLT methods,…

March 11, 2026

A Surrogate model for High Temperature Superconducting Magnets to Predict Current Distribution with Neural Network

arXiv:2509.06067v2 Announce Type: replace Abstract: Finite element methods (FEM) for high-temperature superconducting (HTS) magnets become time-consuming at larger scales, restricting the rapid optimization of meter-scale REBCO solenoids. In this work, a surrogate model based on a fully connected residual neural…

March 11, 2026

The $qs$ Inequality: Quantifying the Double Penalty of Mixture-of-Experts at Inference

arXiv:2603.08960v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models deliver high quality at low training FLOPs, but this efficiency often vanishes at inference. We identify a double penalty that structurally disadvantages MoE architectures during decoding: first, expert routing fragments microbatches and…

March 11, 2026

Automating Forecasting Question Generation and Resolution for AI Evaluation

arXiv:2601.22444v2 Announce Type: replace Abstract: Forecasting future events is highly valuable in decision-making and is a robust measure of general intelligence. As forecasting is probabilistic, developing and evaluating AI forecasters requires generating large numbers of diverse and difficult questions, and…

March 11, 2026

Semantic Level of Detail: Multi-Scale Knowledge Representation via Heat Kernel Diffusion on Hyperbolic Manifolds

arXiv:2603.08965v1 Announce Type: new Abstract: AI memory systems increasingly organize knowledge into graph structures — knowledge graphs, entity relations, community hierarchies — yet lack a principled mechanism for continuous resolution control: where do the qualitative boundaries between abstraction levels lie,…

March 11, 2026

Adversarial Latent-State Training for Robust Policies in Partially Observable Domains

arXiv:2603.07313v2 Announce Type: replace Abstract: Robustness under latent distribution shift remains challenging in partially observable reinforcement learning. We formalize a focused setting where an adversary selects a hidden initial latent distribution before the episode, termed an adversarial latent-initial-state POMDP. Theoretically,…

March 11, 2026