Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks

2025-09-21 19:00 GMT · aimagpro.com

arXiv:2410.22069v3 Announce Type: replace-cross
Abstract: We study the implicit bias of the general family of steepest descent algorithms with infinitesimal learning rate in deep homogeneous neural networks. We show that: (a) an algorithm-dependent geometric margin starts increasing once the networks reach perfect training accuracy, and (b) any limit point of the training trajectory corresponds to a KKT point of the corresponding margin-maximization problem. We experimentally zoom into the trajectories of neural networks optimized with various steepest descent algorithms, highlighting connections to the implicit bias of popular adaptive methods (Adam and Shampoo).
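
To make the terms in the abstract concrete, here is a minimal sketch of what "steepest descent" and an "algorithm-dependent geometric margin" usually mean. It is not taken from the paper. It assumes the standard textbook definitions: the steepest descent direction for a norm is the unit-norm vector that minimizes its inner product with the gradient, and the geometric margin is the smallest value of y_i f(x_i; theta) divided by the parameter norm raised to the homogeneity order. The function names (`steepest_direction`, `param_norm`, `geometric_margin`), the linear model, and the choice of the three norms are illustrative assumptions. The code returns directions for discrete steps, whereas the abstract concerns the infinitesimal learning rate limit.

```python
import math


def _sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def steepest_direction(grad, norm):
    """Unit-norm direction u minimizing <grad, u> for the given norm.

    Assumed (standard) special cases:
      "l2"   -> normalized gradient descent
      "linf" -> sign descent (every coordinate moves by +/-1)
      "l1"   -> coordinate descent (only the largest-gradient coordinate moves)
    """
    if norm == "l2":
        size = math.sqrt(sum(g * g for g in grad))
        return [-g / size if size > 0 else 0.0 for g in grad]
    if norm == "linf":
        return [float(-_sign(g)) for g in grad]
    if norm == "l1":
        k = max(range(len(grad)), key=lambda i: abs(grad[i]))
        return [float(-_sign(grad[i])) if i == k else 0.0
                for i in range(len(grad))]
    raise ValueError("unknown norm: " + norm)


def param_norm(theta, norm):
    """Norm of the parameter vector under the chosen geometry."""
    if norm == "l2":
        return math.sqrt(sum(t * t for t in theta))
    if norm == "linf":
        return max(abs(t) for t in theta)
    if norm == "l1":
        return sum(abs(t) for t in theta)
    raise ValueError("unknown norm: " + norm)


def geometric_margin(theta, xs, ys, norm, order=1):
    """min_i y_i * f(x_i; theta) / ||theta||^order.

    Uses a linear model f(x; theta) = <theta, x> (homogeneous of order 1)
    purely as an illustration; a deep homogeneous network would use its
    own forward pass and its own homogeneity order.
    """
    outputs = [sum(t * x for t, x in zip(theta, row)) for row in xs]
    smallest = min(y * out for y, out in zip(ys, outputs))
    return smallest / param_norm(theta, norm) ** order
```

For example, `steepest_direction([3.0, -4.0], "linf")` moves both coordinates by one unit against the gradient sign, while `"l1"` moves only the second coordinate. The same gradient therefore yields different updates under different norms, which is why the margin in the abstract is described as algorithm-dependent.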