Archives AI News

Bulk-boundary decomposition of neural networks

arXiv:2511.02003v1 Announce Type: new Abstract: We present the bulk-boundary decomposition as a new framework for understanding the training dynamics of deep neural networks. Starting from the stochastic gradient descent formulation, we show that the Lagrangian can be reorganized into a…

November 5, 2025

PyDPF: A Python Package for Differentiable Particle Filtering

arXiv:2510.25693v2 Announce Type: replace-cross Abstract: State-space models (SSMs) are a widely used tool in time series analysis. In the complex systems that arise from real-world data, it is common to employ particle filtering (PF), an efficient Monte Carlo method for…

November 5, 2025

TapOut: A Bandit-Based Approach to Dynamic Speculative Decoding

arXiv:2511.02017v1 Announce Type: new Abstract: Speculative decoding accelerates LLMs by using a lightweight draft model to generate tokens autoregressively before verifying them in parallel with a larger target model. However, determining the optimal number of tokens to draft remains a…

November 5, 2025

Accelerated Frank-Wolfe Algorithms: Complementarity Conditions and Sparsity

arXiv:2511.02821v1 Announce Type: cross Abstract: We develop new accelerated first-order algorithms in the Frank-Wolfe (FW) family for minimizing smooth convex functions over compact convex sets, with a focus on two prominent constraint classes: (1) polytopes and (2) matrix domains given…

November 5, 2025

Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behavior

arXiv:2511.02022v1 Announce Type: new Abstract: Recent work has discovered that large language models can develop broadly misaligned behaviors after being fine-tuned on narrowly harmful datasets, a phenomenon known as emergent misalignment (EM). However, the fundamental mechanisms enabling such harmful generalization…

November 5, 2025

RASPNet: A Benchmark Dataset for Radar Adaptive Signal Processing Applications

arXiv:2406.09638v3 Announce Type: replace Abstract: We present a large-scale dataset called RASPNet for radar adaptive signal processing (RASP) applications to support the development of data-driven models within the adaptive radar community. RASPNet exceeds 16 TB in size and comprises 100…

November 5, 2025

Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance

arXiv:2511.02025v1 Announce Type: new Abstract: Catastrophic forgetting is one of the fundamental issues of continual learning because neural networks forget the tasks learned previously when trained on new tasks. The proposed framework is a new path-coordinated framework of continual learning…

November 5, 2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

arXiv:2412.07067v5 Announce Type: replace Abstract: The sparse Mixture-of-Experts (MoE) architecture is increasingly favored for scaling Large Language Models (LLMs) efficiently, but it depends on heterogeneous compute and memory resources. These factors jointly affect system Cost, Accuracy, and Performance (CAP), making…

November 5, 2025

RobustFSM: Submodular Maximization in Federated Setting with Malicious Clients

arXiv:2511.02029v1 Announce Type: new Abstract: Submodular maximization is an optimization problem benefiting many machine learning applications, where we seek a small subset best representing an extremely large dataset. We focus on the federated setting where the data are locally owned…

November 5, 2025

Training Language Models to Reason Efficiently

arXiv:2502.04463v4 Announce Type: replace Abstract: Scaling model size and training data has led to great advances in the performance of Large Language Models (LLMs). However, the diminishing returns of this approach necessitate alternative methods to improve model capabilities, particularly in…

November 5, 2025