Archives AI News

Bulk-boundary decomposition of neural networks

arXiv:2511.02003v1. Announce type: new. Abstract: We present the bulk-boundary decomposition as a new framework for understanding the training dynamics of deep neural networks. Starting from the stochastic gradient descent formulation, we show that the Lagrangian can be reorganized into a…

PyDPF: A Python Package for Differentiable Particle Filtering

arXiv:2510.25693v2. Announce type: replace-cross. Abstract: State-space models (SSMs) are a widely used tool in time series analysis. In the complex systems that arise from real-world data, it is common to employ particle filtering (PF), an efficient Monte Carlo method for…

TapOut: A Bandit-Based Approach to Dynamic Speculative Decoding

arXiv:2511.02017v1. Announce type: new. Abstract: Speculative decoding accelerates LLMs by using a lightweight draft model to generate tokens autoregressively before verifying them in parallel with a larger target model. However, determining the optimal number of tokens to draft remains a…

Accelerated Frank-Wolfe Algorithms: Complementarity Conditions and Sparsity

arXiv:2511.02821v1. Announce type: cross. Abstract: We develop new accelerated first-order algorithms in the Frank-Wolfe (FW) family for minimizing smooth convex functions over compact convex sets, with a focus on two prominent constraint classes: (1) polytopes and (2) matrix domains given…

Training Language Models to Reason Efficiently

arXiv:2502.04463v4. Announce type: replace. Abstract: Scaling model size and training data has led to great advances in the performance of Large Language Models (LLMs). However, the diminishing returns of this approach necessitate alternative methods to improve model capabilities, particularly in…