Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression

arXiv:2603.02217v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models scale capacity efficiently, but their massive parameter footprint creates a deployment-time memory bottleneck. We organize retraining-free MoE compression into three paradigms – Expert Pruning, Expert Editing, and Expert Merging – and show…

March 4, 2026

Discrete Solution Operator Learning for Geometry-Dependent PDEs

arXiv:2601.09143v3 Announce Type: replace Abstract: Neural operator learning accelerates PDE solution by approximating operators as mappings between continuous function spaces. Yet in many engineering settings, varying geometry induces discrete structural changes, including topological changes, abrupt changes in boundary conditions or…

March 4, 2026

Characterizing and Predicting Wildfire Evacuation Behavior: A Dual-Stage ML Approach

arXiv:2603.02223v1 Announce Type: new Abstract: Wildfire evacuation behavior is highly variable and influenced by complex interactions among household resources, preparedness, and situational cues. Using a large-scale MTurk survey of residents in California, Colorado, and Oregon, this study integrates unsupervised and…

March 4, 2026

Tell Me What To Learn: Generalizing Neural Memory to be Controllable in Natural Language

arXiv:2602.23201v2 Announce Type: replace Abstract: Modern machine learning models are deployed in diverse, non-stationary environments where they must continually adapt to new tasks and evolving knowledge. Continual fine-tuning and in-context learning are costly and brittle, whereas neural memory methods promise…

March 4, 2026

Subspace Geometry Governs Catastrophic Forgetting in Low-Rank Adaptation

arXiv:2603.02224v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for adapting large pre-trained models, yet its behavior under continual learning remains poorly understood. We present a geometric theory characterizing catastrophic forgetting in LoRA through the…

March 4, 2026

EP-GAT: Energy-based Parallel Graph Attention Neural Network for Stock Trend Classification

arXiv:2507.08184v2 Announce Type: replace-cross Abstract: Graph neural networks have shown remarkable performance in forecasting stock movements, which arises from learning complex inter-dependencies between stocks and intra-dynamics of stocks. Existing approaches based on graph neural networks typically rely on static or…

March 4, 2026

Scaling Reward Modeling without Human Supervision

arXiv:2603.02225v1 Announce Type: new Abstract: Learning from feedback is an instrumental process for advancing the capabilities and safety of frontier models, yet its effectiveness is often constrained by cost and scalability. We present a pilot study that explores scaling reward…

March 4, 2026

Generative adversarial imitation learning for robot swarms: Learning from human demonstrations and trained policies

arXiv:2603.02783v1 Announce Type: cross Abstract: In imitation learning, robots are supposed to learn from demonstrations of the desired behavior. Most of the work in imitation learning for swarm robotics provides the demonstrations as rollouts of an existing policy. In this…

March 4, 2026

Efficient Sparse Selective-Update RNNs for Long-Range Sequence Modeling

arXiv:2603.02226v1 Announce Type: new Abstract: Real-world sequential signals, such as audio or video, contain critical information that is often embedded within long periods of silence or noise. While recurrent neural networks (RNNs) are designed to process such data efficiently, they…

March 4, 2026

Infinite dimensional generative sensing

arXiv:2603.03196v1 Announce Type: cross Abstract: Deep generative models have become a standard for modeling priors for inverse problems, going beyond classical sparsity-based methods. However, existing theoretical guarantees are mostly confined to finite-dimensional vector spaces, creating a gap when the physical…

March 4, 2026