Archives AI News

Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields

arXiv:2603.08758v1 Announce Type: new Abstract: Many geometric learning problems require invariants on heterogeneous product spaces, i.e., products of distinct spaces carrying different group actions, where standard techniques do not directly apply. We show that, when a group $G$ acts transitively…
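The classical reduction-to-isotropy fact underlying this line of work can be sketched as follows (a standard statement; the paper's generalized version for heterogeneous products may differ):

```latex
Let $G$ act transitively on $X$, fix a base point $x_0 \in X$, and let
$H = \mathrm{Stab}_G(x_0)$ be its isotropy group. A $G$-invariant map
$f : X \times Y \to Z$ is determined by its restriction
$\tilde{f}(y) := f(x_0, y)$, which is $H$-invariant, since for $x = g \cdot x_0$,
\[
  f(x, y) = f(g \cdot x_0, y) = f(x_0, g^{-1} \cdot y) = \tilde{f}(g^{-1} \cdot y).
\]
Invariance on the full product thus reduces to invariance under the
(typically much smaller) isotropy group $H$ acting on $Y$ alone.
```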

Scalable Training of Mixture-of-Experts Models with Megatron Core

arXiv:2603.07685v2 Announce Type: replace-cross Abstract: Scaling Mixture-of-Experts (MoE) training introduces systems challenges absent in dense models. Because each token activates only a subset of experts, this sparsity allows total parameters to grow much faster than per-token computation, creating coupled constraints…
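The coupling the abstract describes can be seen in a toy calculation (not the Megatron Core API): with top-k routing, total parameters scale with the number of experts while per-token compute scales only with k.

```python
# Toy MoE accounting: each token activates only top_k of num_experts
# experts, so total parameter count grows with num_experts while the
# parameters touched per token grow only with top_k.
def moe_stats(num_experts, expert_params, top_k):
    """Return (total expert parameters, parameters active per token)."""
    total = num_experts * expert_params
    active = top_k * expert_params
    return total, active

total, active = moe_stats(num_experts=64, expert_params=1_000_000, top_k=2)
# here total parameters are 32x the per-token active parameters
```

Doubling `num_experts` doubles `total` but leaves `active` unchanged, which is exactly the sparsity-driven gap between memory footprint and per-token FLOPs that creates the systems constraints the paper addresses.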

XConv: Low-memory stochastic backpropagation for convolutional layers

arXiv:2106.06998v3 Announce Type: replace Abstract: Training convolutional neural networks at scale demands substantial memory, largely due to storing intermediate activations for backpropagation. Existing approaches — such as checkpointing, invertible architectures, or gradient approximation methods like randomized automatic differentiation — either…
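The general idea behind stochastic gradient estimation for memory savings can be sketched as follows (a generic illustration, not the XConv algorithm itself): for a linear map, storing only a random subset of batch columns still yields an unbiased weight-gradient estimate after rescaling.

```python
import numpy as np

# For y = W @ x over a batch, the exact weight gradient is g @ x.T.
# Keeping only a random subset of batch columns and rescaling by
# batch_size / sample_size gives an unbiased, lower-memory estimate.
def subsampled_weight_grad(g, x, sample_size, rng):
    batch = x.shape[1]
    idx = rng.choice(batch, size=sample_size, replace=False)
    scale = batch / sample_size
    return scale * g[:, idx] @ x[:, idx].T

rng = np.random.default_rng(0)
g = rng.normal(size=(4, 256))   # upstream gradient, one column per sample
x = rng.normal(size=(8, 256))   # stored activations, one column per sample
exact = g @ x.T
est = np.mean([subsampled_weight_grad(g, x, 64, rng) for _ in range(2000)],
              axis=0)
# averaged over many draws, the estimate approaches the exact gradient
```

Only the sampled columns of `x` need to be stored for the backward pass, which is where the memory saving comes from; the trade-off is gradient variance, which the paper's method is designed to control.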

Quantifying Memorization and Privacy Risks in Genomic Language Models

arXiv:2603.08913v1 Announce Type: new Abstract: Genomic language models (GLMs) have emerged as powerful tools for learning representations of DNA sequences, enabling advances in variant prediction, regulatory element identification, and cross-task transfer learning. However, as these models are increasingly trained or…
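One common memorization signal for sequence models can be sketched as a likelihood-ratio score (a generic membership-inference-style statistic, not necessarily the paper's exact protocol): compare the target model's per-token log-likelihood on a sequence against a reference model's.

```python
# Large positive scores mean the target model assigns the sequence
# unusually high probability relative to the reference, which is
# evidence of possible memorization of a training sequence.
def likelihood_ratio_score(target_nlls, reference_nlls):
    """Mean per-token log-likelihood ratio: reference NLL minus target NLL."""
    n = len(target_nlls)
    return sum(r - t for t, r in zip(target_nlls, reference_nlls)) / n

# Hypothetical per-token negative log-likelihoods (nats) for two sequences:
memorized = likelihood_ratio_score([0.1, 0.2, 0.1], [1.4, 1.3, 1.5])
unseen = likelihood_ratio_score([1.3, 1.4, 1.2], [1.4, 1.3, 1.5])
# the suspected training sequence scores much higher than the unseen one
```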

The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM

arXiv:2505.11635v2 Announce Type: replace Abstract: Many real-world tasks, from associative memory to symbolic reasoning, benefit from discrete, structured representations that standard continuous latent models can struggle to express. We introduce the Gaussian-Multinoulli Restricted Boltzmann Machine (GM-RBM), a generative energy-based model…
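A plausible energy function for such a model can be sketched as follows (the layout here is an assumption for illustration, not taken from the paper): visible units are real-valued Gaussians, and each hidden unit is a K-state Potts variable encoded one-hot instead of a single binary unit.

```python
import numpy as np

def gm_rbm_energy(v, h, W, b, c, sigma):
    """
    Hypothetical GM-RBM energy.
    v:     (n_vis,)        real-valued visible vector
    h:     (n_hid, K)      one-hot rows, one Potts state per hidden unit
    W:     (n_vis, n_hid, K) coupling per (visible unit, hidden unit, state)
    b:     (n_vis,)        visible biases
    c:     (n_hid, K)      per-state hidden biases
    sigma: (n_vis,)        visible standard deviations
    """
    quad = np.sum((v - b) ** 2 / (2 * sigma ** 2))            # Gaussian term
    bias = -np.sum(c * h)                                     # Potts state biases
    inter = -np.einsum('i,ijk,jk->', v / sigma ** 2, W, h)    # coupling term
    return quad + bias + inter
```

Setting K = 2 recovers a binary-hidden GRBM-style energy up to reparameterization, which is the sense in which a Potts extension generalizes the GRBM.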

Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates

arXiv:2603.08914v1 Announce Type: new Abstract: Over-parameterized neural networks incur prohibitive memory and computational costs for resource-constrained deployment. The Strong Lottery Ticket (SLT) hypothesis suggests that randomly initialized networks contain sparse subnetworks achieving competitive accuracy without weight training. Existing SLT methods,…
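A continuously relaxed Bernoulli gate can be sketched with a Gumbel-sigmoid (binary-Concrete) relaxation (a standard construction; the paper's exact parameterization may differ): each weight gets a gate in (0, 1) that sharpens toward {0, 1} as the temperature decreases, so subnetwork masks become learnable by gradient descent.

```python
import numpy as np

def relaxed_bernoulli_gate(logits, temperature, rng):
    """Sample soft gates in (0, 1); lower temperature -> nearer {0, 1}."""
    u = rng.uniform(1e-6, 1 - 1e-6, size=logits.shape)
    noise = np.log(u) - np.log(1 - u)                # logistic noise
    return 1.0 / (1.0 + np.exp(-(logits + noise) / temperature))

rng = np.random.default_rng(0)
logits = np.array([-4.0, 0.0, 4.0])                  # one learnable logit per weight
soft = relaxed_bernoulli_gate(logits, temperature=0.5, rng=rng)
# at low temperature the gate values concentrate near 0 or 1
hard_mask = (soft > 0.5).astype(float)               # discretized subnetwork mask
```

During training the soft gates keep the mask differentiable; at deployment they are thresholded into a hard binary mask that selects the sparse subnetwork inside the randomly initialized model.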