Understanding the Failure Modes of Transformers through the Lens of Graph Neural Networks
arXiv:2512.09182v1 Announce Type: new Abstract: Transformers, and more specifically decoder-only transformers, dominate modern LLM architectures. While they have been shown to work exceptionally well, they are not without issues, resulting in surprising failure modes and predictably asymmetric performance degradation. This article…
