Archives AI News

The Role of Symmetry in Optimizing Overparameterized Networks

arXiv:2604.25150v2 Announce Type: replace Abstract: Overparameterization is central to the success of deep learning, yet the mechanisms by which it improves optimization remain incompletely understood. We analyze weight-space symmetries in neural networks and show that overparameterization introduces additional symmetries that…

April 30, 2026

Adaptive and Fine-grained Module-wise Expert Pruning for Efficient LoRA-MoE Fine-Tuning

arXiv:2604.26340v1 Announce Type: new Abstract: LoRA-MoE has emerged as an effective paradigm for parameter-efficient fine-tuning, combining the low training cost of LoRA with the increased adaptation capacity of Mixture-of-Experts (MoE). However, existing LoRA-MoE frameworks typically adopt a fixed and uniform…

April 30, 2026

Study: Immigrants help address the US eldercare shortage

Economists find that in metro areas with more immigration, nurses are spending more time with elderly patients.

April 30, 2026

MoRFI: Monotonic Sparse Autoencoder Feature Identification

arXiv:2604.26866v1 Announce Type: cross Abstract: Large language models (LLMs) acquire most of their factual knowledge during the pre-training stage, through next token prediction. Subsequent stages of post-training often introduce new facts outwith the parametric knowledge, giving rise to hallucinations. While…

April 30, 2026

The Alignment Flywheel: A Governance-Centric Hybrid MAS for Architecture-Agnostic Safety

arXiv:2603.02259v2 Announce Type: replace-cross Abstract: Multi-agent systems provide mature methodologies for role decomposition, coordination, and normative governance, capabilities that remain essential as increasingly powerful autonomous decision components are embedded within agent-based systems. While learned and generative models substantially expand system…

April 30, 2026

A Randomized PDE Energy driven Iterative Framework for Efficient and Stable PDE Solutions

arXiv:2604.25943v1 Announce Type: new Abstract: Efficient and stable solution of partial differential equations (PDEs) is central to scientific and engineering applications, yet existing numerical solvers rely heavily on matrix-based discretizations, while learning-based methods require costly training and often…

April 30, 2026

FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection

arXiv:2604.24012v2 Announce Type: replace Abstract: Federated learning enables a population of clients to collaboratively train machine learning models without exchanging their raw data, but standard algorithms such as FedAvg suffer from slow convergence and high communication and memory costs in…

April 30, 2026

Out-of-Distribution Generalization of In-Context Learning: A Low-Dimensional Subspace Perspective

arXiv:2505.14808v2 Announce Type: replace-cross Abstract: The transformer’s remarkable ability to perform in-context learning (ICL) has sparked a wide range of studies designed to understand its strengths and limitations. However, a theoretical understanding of when ICL can and cannot generalize beyond…

April 30, 2026

RoseCDL: Robust and Scalable Convolutional Dictionary Learning for Rare event and Anomaly Detection

arXiv:2509.07523v4 Announce Type: replace Abstract: Detecting rare events and anomalies in large-scale signals is essential in fields such as astronomy, physical simulations, and biomedical science. In many cases, this problem naturally decomposes into identifying common local patterns and detecting deviations…

April 30, 2026

Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning

arXiv:2603.09145v3 Announce Type: replace Abstract: Current expansion-based methods for Class Incremental Learning (CIL) effectively mitigate catastrophic forgetting by freezing old features. However, such task-specific features learned from the new task may collide with the old features. From a causal perspective,…

April 30, 2026