Archives AI News

ParetoBandit: Budget-Paced Adaptive Routing for Non-Stationary LLM Serving

arXiv:2604.00136v1 (Announce Type: new)
Abstract: Production LLM serving often relies on multi-model portfolios spanning a ~530x cost range, where routing decisions trade off quality against cost. This trade-off is non-stationary: providers revise pricing, model quality can regress silently, and new…

Diagnosing Neural Convergence with Topological Alignment Spectra

arXiv:2411.08687v2 (Announce Type: replace)
Abstract: Representational similarity in neural networks is inherently scale-dependent, yet widely used metrics such as Centered Kernel Alignment (CKA) and Procrustes analysis provide only global scalar estimates. These scalars often fail to distinguish micro-scale geometric jitter…

Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning

arXiv:2603.09697v2 (Announce Type: replace)
Abstract: Recent advances in spectral optimization, notably Muon, have demonstrated that constraining update steps to the Stiefel manifold can significantly accelerate training and improve generalization. However, Muon implicitly assumes an isotropic optimization landscape, enforcing a uniform…