BSFA: Leveraging the Subspace Dichotomy to Accelerate Neural Network Training
arXiv:2510.25244v2 Announce Type: replace Abstract: Recent studies \citep{gur2018gradient,song2024does,wen2024understanding} highlight a fundamental dichotomy in deep learning optimization: although parameter updates along the top eigendirections of the loss Hessian (Dom-space) capture most of the update magnitude, they often contribute minimally to…
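
To make the Dom-space notion concrete, the following is a minimal sketch of how an update vector might be split into its component along the top-k Hessian eigendirections and the remainder. It is an illustration only, not the paper's method: the function name `split_update`, the choice of `k`, and the random symmetric matrix standing in for a loss Hessian are all assumptions introduced here.

```python
import numpy as np

def split_update(hessian, update, k):
    """Split `update` into its part in the span of the top-k Hessian
    eigenvectors (Dom-space) and the part in the orthogonal complement.
    Hypothetical helper for illustration; not taken from the paper."""
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(hessian)
    top = eigvecs[:, -k:]            # columns: the k largest-eigenvalue directions
    dom = top @ (top.T @ update)     # orthogonal projection onto Dom-space
    rest = update - dom              # component outside Dom-space
    return dom, rest

# Toy stand-in for a loss Hessian (assumption: random symmetric PSD matrix)
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6))
hessian = a @ a.T
update = rng.standard_normal(6)

dom, rest = split_update(hessian, update, k=2)
# Share of the squared update norm that lies in Dom-space
dom_fraction = np.linalg.norm(dom) ** 2 / np.linalg.norm(update) ** 2
```

Because the two components are orthogonal, their squared norms add up to the squared norm of the full update, so `dom_fraction` is a direct measure of how much of the update magnitude Dom-space captures. For a real network the Hessian is too large to form explicitly, so a practical version would need an approximate eigensolver; that part is left out of this sketch.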
