Archives AI News

Regularized Meta-Learning for Improved Generalization

arXiv:2602.12469v2 Announce Type: replace Abstract: Deep ensemble methods often improve predictive performance, yet they suffer from three practical limitations: redundancy among base models that inflates computational cost and degrades conditioning, unstable weighting under multicollinearity, and overfitting in meta-learning pipelines. We…

April 27, 2026

Universal Transformers Need Memory: Depth-State Trade-offs in Adaptive Recursive Reasoning

arXiv:2604.21999v1 Announce Type: new Abstract: We study learned memory tokens as a computational scratchpad for a single-block Universal Transformer (UT) with Adaptive Computation Time (ACT) on Sudoku-Extreme, a combinatorial reasoning benchmark. We find that memory tokens are empirically necessary: across all…

April 27, 2026

Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning

arXiv:2604.22031v1 Announce Type: new Abstract: We propose Mochi, a Graph Foundation Model that addresses task unification and training efficiency by adopting a meta-learning based training framework. Prior models pre-train with reconstruction-based objectives such as link prediction, and assume that the…

April 27, 2026

When Quotes Crumble: Detecting Transient Mechanical Liquidity Erosion in Limit Order Books

arXiv:2604.21993v1 Announce Type: new Abstract: We study the detection of transient liquidity erosion (“crumbling quotes”) in electronic limit order books, where observable quote deterioration may reflect either mechanical liquidity withdrawal or informational repricing. Using the ABIDES agent-based simulator, we construct…

April 27, 2026

Multi-Task Optimization over Networks of Tasks

arXiv:2604.21991v1 Announce Type: new Abstract: Multi-task optimization is a powerful approach for solving a large number of tasks in parallel. However, existing algorithms face distinct limitations: Population-based methods scale poorly and remain underexplored for large task sets. Approaches that do…

April 27, 2026

Conditional anomaly detection using soft harmonic functions: An application to clinical alerting

arXiv:2604.21956v1 Announce Type: new Abstract: Timely detection of concerning events is an important problem in clinical practice. In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response, such as…

April 27, 2026

Parameter-Efficient Conditioning for Material Generalization in Graph-Based Simulators

arXiv:2511.05456v2 Announce Type: replace Abstract: Graph network-based simulators (GNS) have demonstrated strong potential for learning particle-based physics (such as fluids, deformable solids, and granular flows) while generalizing to unseen geometries due to their inherent inductive biases. However, existing models are…

April 27, 2026

LTBs-KAN: Linear-Time B-splines Kolmogorov-Arnold Networks

arXiv:2604.22034v1 Announce Type: new Abstract: Kolmogorov-Arnold Networks (KANs) are a recent neural network architecture offering an alternative to Multilayer Perceptrons (MLPs) with improved explainability and expressibility. However, KANs are significantly slower than MLPs due to the recursive nature of B-spline…

April 27, 2026

Optimal Lower Bounds for Online Multicalibration

arXiv:2601.05245v2 Announce Type: replace Abstract: We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner’s predictions, we prove an $\Omega(T^{2/3})$…

April 27, 2026

LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs

arXiv:2604.22050v1 Announce Type: new Abstract: Transformers mostly rely on softmax attention, which introduces quadratic complexity with respect to sequence length and remains a major bottleneck for efficient inference. Prior work on linear or hybrid attention typically replaces softmax attention…

April 27, 2026