The Hessian of tall-skinny networks is easy to invert

2026-01-12 20:00 GMT · aimagpro.com

arXiv:2601.06096v1 Announce Type: new
Abstract: We describe an exact algorithm for solving linear systems $Hx=b$, where $H$ is the Hessian of a deep net. The method computes Hessian-inverse-vector products without storing the Hessian or its inverse, in time and storage that scale linearly in the number of layers. The naive approach of first computing the Hessian and then solving the linear system takes storage that is quadratic in the number of parameters and a cubic number of operations. By contrast, our Hessian-inverse-vector product method scales roughly like Pearlmutter’s algorithm for computing Hessian-vector products.
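
To make the two reference points in the abstract concrete, here is a minimal sketch of (a) a Pearlmutter-style Hessian-vector product and (b) the naive "form $H$, then solve" baseline. It is not the paper's algorithm, which the abstract does not spell out. The sketch assumes a toy two-parameter scalar linear net with loss $f(w) = \tfrac{1}{2}(w_2 w_1 X - Y)^2$ and a hand-derived gradient; the names `Dual`, `grad`, `hvp`, `naive_solve`, `X`, and `Y` are all hypothetical. The Hessian-vector product uses the identity $Hv = \frac{d}{d\varepsilon}\nabla f(w + \varepsilon v)\big|_{\varepsilon=0}$, evaluated exactly with dual numbers.

```python
class Dual:
    """Number val + eps*e with e**2 == 0: a value plus a directional derivative."""

    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.eps + o.eps)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.eps - o.eps)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.val * o.eps + self.eps * o.val)
    __rmul__ = __mul__


X, Y = 2.0, 1.0  # one hypothetical training example (assumption, not from the paper)


def grad(w):
    """Hand-derived gradient of f(w) = 0.5 * (w[1] * w[0] * X - Y)**2."""
    r = w[1] * w[0] * X - Y  # residual of the toy two-layer scalar linear net
    return [r * w[1] * X, r * w[0] * X]


def hvp(w, v):
    """Exact H @ v as the eps-part of grad(w + e*v); H itself is never formed."""
    out = grad([Dual(wi, vi) for wi, vi in zip(w, v)])
    return [Dual.lift(g).eps for g in out]


def naive_solve(w, b):
    """Baseline: materialize all n*n Hessian entries, then solve H x = b."""
    n = len(w)
    # Column j of H is H @ e_j, so n Hessian-vector products build the matrix.
    cols = [hvp(w, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)]
    A = [[cols[j][i] for j in range(n)] + [b[i]] for i in range(n)]  # [H | b]
    # Gaussian elimination with partial pivoting: O(n**3) operations.
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i] = [a - m * c for a, c in zip(A[i], A[k])]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x


w = [0.5, -1.5]
b = [1.0, 2.0]
x = naive_solve(w, b)
# Check the solve using only a Hessian-vector product: H @ x should equal b.
residual = [hx - bi for hx, bi in zip(hvp(w, x), b)]
```

In this sketch `hvp` costs about as much as one extra gradient evaluation and never stores $H$, while `naive_solve` stores $n^2$ entries and performs on the order of $n^3$ operations. That gap is the one the abstract says the proposed method closes for Hessian-inverse-vector products.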