The Curved Spacetime of Transformer Architectures
arXiv:2511.03060v1 Announce Type: new Abstract: We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. Queries and keys induce an effective metric on representation space, and attention acts as a discrete connection that…
