The Mean-Field Dynamics of Transformers
arXiv:2512.01868v3
Abstract: We develop a mathematical framework that interprets Transformer attention as an interacting particle system and studies its continuum (mean-field) limits. By idealizing attention on the sphere, we connect Transformer dynamics to Wasserstein gradient flows, synchronization…
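
To make the "interacting particle system on the sphere" picture concrete, here is a minimal sketch in Python. The abstract is truncated and states no equations, so the update rule is an assumption: each token is a unit vector that moves toward the softmax-weighted average of all tokens, with the velocity projected onto the tangent space of the sphere. The names `attention_step`, `beta`, and `dt` are hypothetical and chosen for illustration only; this is not presented as the paper's exact model.

```python
import math
import random


def normalize(v):
    """Rescale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def attention_step(points, beta=1.0, dt=0.1):
    """One explicit Euler step of the ASSUMED dynamics, then renormalization.

    Assumed model (not taken from the abstract):
        dx_i/dt = P_{x_i}( sum_j softmax_j(beta * <x_i, x_j>) * x_j )
    where P_x(v) = v - <x, v> x projects v onto the tangent space at x.
    """
    new_points = []
    for x in points:
        # Softmax attention weights of particle x over all particles.
        scores = [beta * dot(x, y) for y in points]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        # Attention-weighted average of all particles.
        avg = [sum(w * y[k] for w, y in zip(weights, points)) / z
               for k in range(len(x))]
        # Remove the radial component so the velocity is tangent to the sphere.
        radial = dot(x, avg)
        tangent = [a - radial * b for a, b in zip(avg, x)]
        # Euler step, then project back onto the unit sphere.
        moved = [b + dt * t for b, t in zip(x, tangent)]
        new_points.append(normalize(moved))
    return new_points


def min_pairwise_dot(points):
    """Smallest inner product over all pairs; equals 1 only at full consensus."""
    return min(dot(x, y)
               for i, x in enumerate(points)
               for y in points[i + 1:])


random.seed(0)
# Eight random tokens on the unit sphere in R^3 (sizes chosen arbitrarily).
points = [normalize([random.gauss(0.0, 1.0) for _ in range(3)])
          for _ in range(8)]

before = min_pairwise_dot(points)
for _ in range(500):
    points = attention_step(points)
after = min_pairwise_dot(points)

print("min pairwise inner product before:", before)
print("min pairwise inner product after: ", after)
```

Under this assumed rule, the minimum pairwise inner product tracks how tightly the tokens have clustered, which is one simple way to observe the synchronization behavior the abstract alludes to. Renormalizing after each Euler step is a design convenience that keeps the particles exactly on the sphere; a mean-field treatment would instead follow the evolution of the empirical measure of the particles.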
