AI News Archive

Tree Training: Accelerating Agentic LLMs Training via Shared Prefix Reuse

arXiv:2511.00413v5 Announce Type: replace Abstract: Agentic large language model (LLM) training often involves multi-turn interaction trajectories that branch into multiple execution paths due to concurrent tool use, thinking modes, sub-agents, context management, and other runtime designs. As a result, the tokens…
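
The core saving is easy to see in miniature: when trajectories branch from a common prefix, merging them into a trie means each shared token position is processed once instead of once per branch. Below is a minimal counting sketch of that reuse (illustrative only, not the paper's training system; the trajectories are hypothetical token lists):

```python
# Minimal sketch of shared-prefix reuse: branching trajectories are merged
# into a trie so shared tokens are counted (and, in a real trainer,
# forward-passed) only once.

def count_tokens_naive(trajectories):
    """Tokens processed if every branch is trained as an independent sequence."""
    return sum(len(t) for t in trajectories)

def count_tokens_shared(trajectories):
    """Tokens processed if shared prefixes are reused across branches."""
    root = {}
    unique = 0
    for traj in trajectories:
        node = root
        for tok in traj:
            if tok not in node:
                node[tok] = {}
                unique += 1  # this token position is computed only once
            node = node[tok]
    return unique

# Two branches that diverge after a 4-token shared prefix (hypothetical data).
branches = [
    [1, 2, 3, 4, 10, 11],      # e.g. a tool-call branch
    [1, 2, 3, 4, 20, 21, 22],  # e.g. a thinking-mode branch
]
print(count_tokens_naive(branches))   # 13
print(count_tokens_shared(branches))  # 9: prefix [1, 2, 3, 4] computed once
```

The gap widens with fan-out: with k branches off one long prefix, the naive cost repeats the prefix k times while the shared cost pays for it once.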

SGD at the Edge of Stability: The Stochastic Sharpness Gap

arXiv:2604.21016v1 Announce Type: new Abstract: When training neural networks with full-batch gradient descent (GD) and step size $\eta$, the largest eigenvalue of the Hessian — the sharpness $S(\boldsymbol{\theta})$ — rises to $2/\eta$ and hovers there, a phenomenon termed the Edge…
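
The quantity being tracked is concrete: the sharpness is the largest eigenvalue of the loss Hessian, which can be estimated by power iteration on Hessian-vector products. A minimal sketch, assuming PyTorch and a toy quadratic loss (everything here is illustrative, not the paper's setup):

```python
# Estimate the sharpness S(theta), i.e. lambda_max of the loss Hessian, via
# power iteration on Hessian-vector products, and compare it to the
# Edge-of-Stability threshold 2/eta.

import torch

def sharpness(loss, params, iters=50):
    """Power iteration for the largest Hessian eigenvalue of `loss`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    lam = 0.0
    for _ in range(iters):
        # Hessian-vector product: differentiate (grad . v) w.r.t. params.
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True)
        lam = torch.sqrt(sum((h ** 2).sum() for h in hv)).item()
        v = [h / (lam + 1e-12) for h in hv]  # renormalize the iterate
    return lam

# Toy quadratic (hypothetical): loss = 3*x0^2 + x1^2, so lambda_max = 6.
x = torch.tensor([1.0, 1.0], requires_grad=True)
loss = 3 * x[0] ** 2 + x[1] ** 2
eta = 0.1
S = sharpness(loss, [x])
print(f"S = {S:.2f}, 2/eta = {2 / eta:.2f}")  # EoS: S hovers near 2/eta
```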

BackPlay: Head-Only Look-Back Self-Correction for Diffusion Language Models

arXiv:2601.06428v3 Announce Type: replace Abstract: Diffusion Language Models (DLMs) decode multiple tokens in parallel, but aggressive multi-token decoding amplifies cross-token dependency errors and can sharply degrade generation quality. We propose BackPlay, a frozen-backbone self-correction framework that trains only a lightweight…
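
As a rough picture of the frozen-backbone idea, a hedged sketch in PyTorch: only a small head is trained to re-score a look-back window of already-decoded tokens and replace the least confident ones. The LookBackHead class, window size, and top-k correction rule are illustrative assumptions, not BackPlay's actual architecture:

```python
# Hedged sketch: a frozen backbone produces hidden states for tokens decoded
# in parallel; only a lightweight "look-back" head is trained to re-score the
# last window of tokens and overwrite the least confident ones.

import torch
import torch.nn as nn

class LookBackHead(nn.Module):
    """Small trainable head over frozen backbone states (hypothetical)."""
    def __init__(self, d_model, vocab_size):
        super().__init__()
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, hidden):       # hidden: [window, d_model]
        return self.proj(hidden)     # re-scoring logits per position

def self_correct(hidden, tokens, head, k=2):
    """Flip the k least-confident tokens in the look-back window."""
    logits = head(hidden)                                   # [window, vocab]
    conf = logits.softmax(-1).gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
    worst = conf.topk(k, largest=False).indices             # weakest positions
    corrected = tokens.clone()
    corrected[worst] = logits.argmax(-1)[worst]             # head's best guess
    return corrected

# Usage with dummy shapes: window=8, d_model=16, vocab=100.
head = LookBackHead(16, 100)
hidden = torch.randn(8, 16)          # stand-in for frozen-backbone states
tokens = torch.randint(0, 100, (8,))
print(self_correct(hidden, tokens, head))
```

Because the backbone is never updated, the trainable state is just the head's projection, which is what keeps the self-correction framework lightweight.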

MCAP: Deployment-Time Layer Profiling for Memory-Constrained LLM Inference

arXiv:2604.21026v1 Announce Type: new Abstract: Deploying large language models to heterogeneous hardware is often constrained by memory, not compute. We introduce MCAP (Monte Carlo Activation Profiling), a load-time per-layer importance estimator that enables dynamic precision and memory placement decisions on…
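
One way to picture load-time Monte Carlo profiling: push a few random calibration inputs through each layer, score how much its output moves when activations are quantized, and keep higher precision only where sensitivity is high. The sketch below illustrates that mechanism under stated assumptions (the scoring rule and the fake_quant helper are hypothetical, not MCAP's estimator):

```python
# Hedged sketch of Monte Carlo load-time layer profiling: sample calibration
# batches, score each layer's sensitivity to activation quantization, and
# keep only the most sensitive layers in higher precision.

import numpy as np

rng = np.random.default_rng(0)

def fake_quant(x, bits=8):
    """Uniform symmetric quantization of activations (illustrative)."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(x / scale) * scale

def profile_layers(layers, n_samples=16, dim=64):
    """Monte Carlo estimate of per-layer quantization sensitivity."""
    scores = []
    for W in layers:
        err = 0.0
        for _ in range(n_samples):
            x = rng.standard_normal(dim)     # random calibration input
            y = W @ x
            yq = W @ fake_quant(x)           # same layer, quantized input
            err += np.linalg.norm(y - yq) / (np.linalg.norm(y) + 1e-12)
        scores.append(err / n_samples)
    return np.array(scores)

# Hypothetical 6-layer model; keep the 2 most sensitive layers in fp16.
# (With i.i.d. random weights the spread is small; real layers differ more.)
layers = [rng.standard_normal((64, 64)) for _ in range(6)]
scores = profile_layers(layers)
keep_fp16 = np.argsort(scores)[-2:]
print("sensitivity:", scores.round(4))
print("fp16 layers:", sorted(keep_fp16.tolist()))
```

Because the profiling runs at load time on the target device, the precision and placement plan can differ per deployment rather than being fixed offline.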

Continuous-Utility Direct Preference Optimization

arXiv:2602.00931v2 Announce Type: replace Abstract: Large language model reasoning is often treated as a monolithic capability, relying on binary preference supervision that fails to capture partial progress or fine-grained reasoning quality. We introduce Continuous Utility Direct Preference Optimization (CU-DPO), a…
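
One plausible way to replace binary preference supervision with a continuous signal is to scale the usual DPO log-sigmoid margin by the utility gap between the two responses, so nearly-tied pairs push the policy less. The loss below is that guess written out, not CU-DPO's published objective; all names and batch values are illustrative:

```python
# Hedged sketch: a DPO-style loss whose margin is weighted by a continuous
# utility gap u_w - u_l, so pairs with partial-progress differences
# contribute proportionally softer gradients than clear wins.

import torch
import torch.nn.functional as F

def cu_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, u_w, u_l, beta=0.1):
    """DPO-style loss with a continuous-utility-scaled margin (assumed form)."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    gap = (u_w - u_l).clamp(min=0.0)   # continuous utility difference
    return -(gap * F.logsigmoid(margin)).mean()

# Dummy per-response log-probs and utilities for a batch of 3 pairs.
logp_w = torch.tensor([-10.0, -12.0, -9.0])
logp_l = torch.tensor([-11.0, -12.5, -9.1])
ref_w = torch.tensor([-10.5, -12.0, -9.2])
ref_l = torch.tensor([-10.8, -12.2, -9.0])
u_w = torch.tensor([0.9, 0.6, 0.55])   # graded reasoning-quality scores
u_l = torch.tensor([0.2, 0.5, 0.50])
print(cu_dpo_loss(logp_w, logp_l, ref_w, ref_l, u_w, u_l))
```

Setting gap to a constant 1 recovers standard binary DPO, which is the baseline the abstract argues cannot capture partial progress.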

BioTrain: Sub-MB, Sub-50mW On-Device Fine-Tuning for Edge-AI on Biosignals

arXiv:2604.13359v2 Announce Type: replace Abstract: Biosignals exhibit substantial cross-subject and cross-session variability, inducing severe domain shifts that degrade post-deployment performance for small, edge-oriented AI models. On-device adaptation is therefore essential to both preserve user privacy and ensure system reliability. However,…
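
To make the budget concrete, here is a hedged sketch of the general recipe (not BioTrain itself): freeze a tiny backbone, fine-tune only a small adapter head on-device, and check that the update state stays under a sub-MB budget. The model shapes and data are dummies:

```python
# Hedged sketch of sub-MB on-device adaptation: freeze the backbone, train
# only a small adapter head, and assert the trainable update fits the budget.

import torch
import torch.nn as nn

backbone = nn.Sequential(
    nn.Conv1d(4, 16, 5), nn.ReLU(), nn.AdaptiveAvgPool1d(1), nn.Flatten()
)
adapter = nn.Linear(16, 3)              # trainable head: 16*3 + 3 params

for p in backbone.parameters():
    p.requires_grad_(False)             # backbone stays frozen on-device

trainable = list(adapter.parameters())
budget_bytes = sum(p.numel() * p.element_size() for p in trainable)
assert budget_bytes < 1_000_000, "adapter must fit the sub-MB update budget"
print(f"trainable update size: {budget_bytes} bytes")

opt = torch.optim.SGD(trainable, lr=1e-2)
x = torch.randn(8, 4, 128)              # dummy 4-channel biosignal windows
y = torch.randint(0, 3, (8,))
for _ in range(5):                      # a few cheap adaptation steps
    loss = nn.functional.cross_entropy(adapter(backbone(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"adapted, final loss {loss.item():.3f}")
```

Keeping adaptation on-device means the subject's raw biosignals never leave the sensor node, which is the privacy argument the abstract makes.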

Adaptive Soft Error Protection for Neural Network Processing

arXiv:2407.19664v3 Announce Type: replace Abstract: Previous research on selective protection for neural network components typically exploits only static vulnerability differences. Although these methods improve upon classical modular redundancy, they still incur substantial overhead for neural network workloads that are both…
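
The contrast with static schemes can be illustrated with a toy adaptive policy: replicate a computation with triple modular redundancy (TMR) and majority-vote only when a cheap per-input signal, here the top-2 score margin, suggests a fault could flip the decision. This is a generic illustration of input-adaptive protection, not this paper's method:

```python
# Hedged sketch: selective redundancy driven by a runtime vulnerability
# signal, instead of a static per-layer protection assignment.

import numpy as np

rng = np.random.default_rng(1)

def run_layer(W, x, flip_prob=0.01):
    """Layer execution with an occasional simulated soft error."""
    y = W @ x
    if rng.random() < flip_prob:
        y[rng.integers(len(y))] *= -1.0   # simulated bit-flip-like fault
    return y

def majority_vote(outputs):
    """TMR: element-wise median of three replicated executions."""
    return np.median(np.stack(outputs), axis=0)

def adaptive_classify(W, x, margin_thresh=1.0):
    """Run TMR only for low-margin (fault-vulnerable) inputs."""
    scores = run_layer(W, x)
    top2 = np.sort(scores)[-2:]
    if top2[1] - top2[0] < margin_thresh:  # close call: a flip could matter
        scores = majority_vote([run_layer(W, x) for _ in range(3)])
    return int(np.argmax(scores))

# Usage with a hypothetical 10-class linear classifier over 32 features.
W = rng.standard_normal((10, 32))
x = rng.standard_normal(32)
print(adaptive_classify(W, x))
```

High-margin inputs pay no redundancy cost at all, which is where an adaptive policy undercuts the overhead of classical full modular redundancy.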