Moonwalk: Inverse-Forward Differentiation
arXiv:2402.14212v2 (announce type: replace)

Abstract: Backpropagation’s main limitation is its need to store intermediate activations (residuals) during the forward pass, which restricts the depth of trainable networks. This raises a fundamental question: can we avoid storing these activations? We address…
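
To make the storage cost concrete, here is a minimal sketch of ordinary backpropagation on a toy scalar chain. It is not the paper's method: the chain `x_{i+1} = tanh(w_i * x_i)` and the names `forward`, `backward`, and `residuals` are assumptions chosen only for illustration. The point it shows is that the backward pass needs one stored input per layer, so memory grows linearly with depth.

```python
import math


def forward(x, weights):
    """Run the toy chain x_{i+1} = tanh(w_i * x_i), keeping every layer input."""
    residuals = []
    for w in weights:
        residuals.append(x)  # stored because the backward pass will need it
        x = math.tanh(w * x)
    return x, residuals


def backward(weights, residuals, output):
    """Return d(output)/d(w_i) for every layer by reverse-mode accumulation."""
    grads = [0.0] * len(weights)
    upstream = 1.0  # d(output)/d(output)
    y = output      # output of the layer currently being processed
    for i in reversed(range(len(weights))):
        x_in = residuals[i]
        local = 1.0 - y * y                       # tanh'(w * x), written via y
        grads[i] = upstream * local * x_in        # gradient w.r.t. w_i
        upstream = upstream * local * weights[i]  # gradient w.r.t. x_i
        y = x_in  # the input of layer i is the output of layer i - 1
    return grads


# Hypothetical weights and input, chosen only for this illustration.
weights = [0.5, -1.2, 0.8, 1.1]
out, residuals = forward(0.3, weights)
grads = backward(weights, residuals, out)
print(len(residuals), "stored activations for", len(weights), "layers")
```

Doubling the number of layers doubles the length of `residuals`, which is the linear-in-depth memory cost the abstract refers to.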
