Deep sequence models tend to memorize geometrically; it is unclear why
arXiv:2510.26745v2 Announce Type: replace
Abstract: Deep sequence models are said to store atomic facts predominantly in the form of associative memory: a brute-force lookup of co-occurring entities. We identify a dramatically different form of storage of atomic facts that we…
