Archives AI News

Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models

arXiv:2603.08859v1 Announce Type: new Abstract: Hybrid sequence models, which combine Transformer and state-space model layers, seek to pair the expressive versatility of attention with the computational efficiency of state-space layers. Despite burgeoning interest in hybrid models, we lack a basic…

The Temporal Markov Transition Field

arXiv:2603.08803v1 Announce Type: new Abstract: The Markov Transition Field (MTF), introduced by Wang and Oates (2015), encodes a time series as a two-dimensional image by mapping each pair of time steps to the transition probability between their quantile states, estimated…
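The classic MTF construction referenced in this abstract (Wang and Oates, 2015) can be sketched in a few lines of NumPy: bin the series into quantile states, estimate a first-order Markov transition matrix from adjacent steps, then index that matrix by every pair of time steps. This is a minimal illustration of the original MTF, not of the temporal variant the paper introduces; the function name and bin count are illustrative choices.

```python
import numpy as np

def markov_transition_field(x, n_bins=8):
    """Sketch of the Markov Transition Field (Wang & Oates, 2015).

    Each time step is assigned to a quantile bin; a first-order Markov
    transition matrix W between bins is estimated from adjacent steps;
    the MTF image maps each pair (i, j) of time steps to W[q(x_i), q(x_j)].
    """
    x = np.asarray(x, dtype=float)
    # Quantile binning: interior bin edges at equal-probability quantiles.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)  # bin index in [0, n_bins - 1] per step

    # Estimate the transition matrix W from adjacent pairs of steps.
    W = np.zeros((n_bins, n_bins))
    for t in range(len(x) - 1):
        W[q[t], q[t + 1]] += 1
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

    # MTF: M[i, j] = W[q[i], q[j]] for every pair of time steps.
    return W[np.ix_(q, q)]

mtf = markov_transition_field(np.sin(np.linspace(0, 6 * np.pi, 100)))
print(mtf.shape)  # (100, 100): one transition probability per step pair
```

The resulting image has one row and column per time step, so a length-n series becomes an n×n array of transition probabilities in [0, 1].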

Multi-level meta-reinforcement learning with skill-based curriculum

arXiv:2603.08773v1 Announce Type: new Abstract: We consider problems in sequential decision making with natural multi-level structure, where sub-tasks are assembled to accomplish complex goals. Systematically inferring and leveraging hierarchical structure has remained a longstanding challenge; we describe an efficient…

SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

arXiv:2603.08763v1 Announce Type: new Abstract: A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures that underlie task representations…

Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields

arXiv:2603.08758v1 Announce Type: new Abstract: Many geometric learning problems require invariants on heterogeneous product spaces, i.e., products of distinct spaces carrying different group actions, where standard techniques do not directly apply. We show that, when a group $G$ acts transitively…

Scalable Training of Mixture-of-Experts Models with Megatron Core

arXiv:2603.07685v2 Announce Type: replace-cross Abstract: Scaling Mixture-of-Experts (MoE) training introduces systems challenges absent in dense models. Because each token activates only a subset of experts, this sparsity allows total parameters to grow much faster than per-token computation, creating coupled constraints…
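The sparsity the abstract describes, where each token activates only a top-k subset of experts so parameter count grows much faster than per-token compute, can be illustrated with a toy routing loop. This is a generic top-k MoE sketch in NumPy, not Megatron Core's implementation; all names here are hypothetical.

```python
import numpy as np

def topk_moe_forward(x, gate_w, experts, k=2):
    """Toy top-k Mixture-of-Experts forward pass (illustrative only).

    A linear gate scores all experts per token; only the k highest-scoring
    experts process each token, so per-token compute scales with k rather
    than with the total number of experts.
    """
    logits = x @ gate_w                        # [tokens, n_experts]
    topk = np.argsort(logits, axis=1)[:, -k:]  # indices of the k best experts
    # Softmax over the selected logits only, to weight expert outputs.
    sel = np.take_along_axis(logits, topk, axis=1)
    weights = np.exp(sel - sel.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)

    out = np.zeros_like(x)
    for e, expert in enumerate(experts):
        token_idx, slot = np.nonzero(topk == e)  # tokens routed to expert e
        if len(token_idx) == 0:
            continue  # this expert is idle for the current batch
        out[token_idx] += weights[token_idx, slot, None] * expert(x[token_idx])
    return out

rng = np.random.default_rng(0)
d, n_experts = 16, 4
experts = [lambda h, W=rng.standard_normal((d, d)) / np.sqrt(d): h @ W
           for _ in range(n_experts)]
x = rng.standard_normal((8, d))
y = topk_moe_forward(x, rng.standard_normal((d, n_experts)), experts, k=2)
print(y.shape)  # (8, 16)
```

With k=2 of 4 experts, each token touches half the expert parameters; growing `n_experts` adds capacity without changing per-token FLOPs, which is exactly the decoupling that creates the systems constraints the abstract mentions.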