An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models
arXiv:2505.08915v2 Announce Type: replace

Abstract: Recent experiments have shown that training trajectories of multiple deep neural networks with different architectures, optimization algorithms, hyper-parameter settings, and regularization methods evolve on a remarkably low-dimensional “hyper-ribbon-like” manifold in the space of probability distributions.…
