Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
arXiv:2505.11881v2 (Announce Type: replace-cross)
Abstract: Residual connections are pivotal for deep neural networks, enabling greater depth by mitigating vanishing gradients. However, in standard residual updates, the module’s output is directly added to the input stream. This can lead to updates…
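
To make the contrast concrete, here is a minimal sketch in plain Python. The standard update adds the module output `f_x` directly to the stream `x`, as the abstract describes. The orthogonal variant is only an assumption drawn from the title: it removes the component of `f_x` parallel to `x` with a single Gram-Schmidt step before adding. The function names, the `eps` stabilizer, and the toy vectors are all hypothetical and are not taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def standard_residual(x, f_x):
    """Standard update: the module output is added directly to the stream."""
    return [xi + fi for xi, fi in zip(x, f_x)]

def orthogonal_component(x, f_x, eps=1e-6):
    """Part of f_x orthogonal to x (one Gram-Schmidt step).

    eps is a hypothetical stabilizer that avoids division by zero when x
    is close to the zero vector.
    """
    scale = dot(x, f_x) / (dot(x, x) + eps)
    return [fi - scale * xi for xi, fi in zip(x, f_x)]

def orthogonal_residual(x, f_x, eps=1e-6):
    """Assumed orthogonal update: add only the part of f_x orthogonal to x."""
    f_perp = orthogonal_component(x, f_x, eps)
    return [xi + fi for xi, fi in zip(x, f_perp)]

# Toy stream and module output (illustrative values only).
x = [3.0, 4.0]
f_x = [1.0, 2.0]

y_std = standard_residual(x, f_x)     # [4.0, 6.0]
f_perp = orthogonal_component(x, f_x)
y_orth = orthogonal_residual(x, f_x)
```

In this sketch `f_perp` has an inner product with `x` of approximately zero, so the orthogonal update changes the direction of the stream while contributing almost nothing along `x` itself. Whether the paper uses exactly this projection cannot be confirmed from the truncated abstract.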
