Archives AI News

Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance

arXiv:2511.02025v1 Announce Type: new Abstract: Catastrophic forgetting is one of the fundamental issues of continual learning because neural networks forget the tasks learned previously when trained on new tasks. The proposed framework is a new path-coordinated framework of continual learning…

November 5, 2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

arXiv:2412.07067v5 Announce Type: replace Abstract: The sparse Mixture-of-Experts (MoE) architecture is increasingly favored for scaling Large Language Models (LLMs) efficiently, but it depends on heterogeneous compute and memory resources. These factors jointly affect system Cost, Accuracy, and Performance (CAP), making…

November 5, 2025

RobustFSM: Submodular Maximization in Federated Setting with Malicious Clients

arXiv:2511.02029v1 Announce Type: new Abstract: Submodular maximization is an optimization problem benefiting many machine learning applications, where we seek a small subset best representing an extremely large dataset. We focus on the federated setting where the data are locally owned…

November 5, 2025

Training Language Models to Reason Efficiently

arXiv:2502.04463v4 Announce Type: replace Abstract: Scaling model size and training data has led to great advances in the performance of Large Language Models (LLMs). However, the diminishing returns of this approach necessitate alternative methods to improve model capabilities, particularly in…

November 5, 2025

Predicting Microbial Interactions Using Graph Neural Networks

arXiv:2511.02038v1 Announce Type: new Abstract: Predicting interspecies interactions is a key challenge in microbial ecology, as these interactions are critical to determining the structure and activity of microbial communities. In this work, we used data on monoculture growth capabilities, interactions…

November 5, 2025

Noise-based reward-modulated learning

arXiv:2503.23972v3 Announce Type: replace Abstract: The pursuit of energy-efficient and adaptive artificial intelligence (AI) has positioned neuromorphic computing as a promising alternative to conventional computing. However, achieving learning on these platforms requires techniques that prioritize local information while enabling effective…

November 5, 2025

Quantum-Enhanced Generative Models for Rare Event Prediction

arXiv:2511.02042v1 Announce Type: new Abstract: Rare events such as financial crashes, climate extremes, and biological anomalies are notoriously difficult to model due to their scarcity and heavy-tailed distributions. Classical deep generative models often struggle to capture these rare occurrences, either…

November 5, 2025

Flashlight: PyTorch Compiler Extensions to Accelerate Attention Variants

arXiv:2511.02043v1 Announce Type: new Abstract: Bad charactors when submitting to arXiv: Attention is a fundamental building block of large language models (LLMs), so there have been many efforts to implement it efficiently. For example, FlashAttention leverages tiling and kernel fusion…

November 5, 2025

Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model

arXiv:2506.05801v2 Announce Type: replace Abstract: A phenomenon known as ”Neural Collapse (NC)” in deep classification tasks, in which the penultimate-layer features and the final classifiers exhibit an extremely simple geometric structure, has recently attracted considerable attention, with the expectation that…

November 5, 2025

Learning to Steer: Input-dependent Steering for Multimodal LLMs

arXiv:2508.12815v2 Announce Type: replace-cross Abstract: Steering has emerged as a practical approach to enable post-hoc guidance of LLMs towards enforcing a specific behavior. However, it remains largely underexplored for multimodal LLMs (MLLMs); furthermore, existing steering techniques, such as mean steering,…

November 5, 2025