Teaching Metric Distance to Discrete Autoregressive Language Models
arXiv:2503.02379v4 (announce type: replace)

Abstract: As large language models expand beyond natural language to domains such as mathematics, multimodal understanding, and embodied agents, tokens increasingly reflect metric relationships rather than purely linguistic meaning. We introduce DIST2Loss, a distance-aware framework designed…
