ToMA: Token Merge with Attention for Diffusion Models
arXiv:2509.10918v2 Announce Type: replace
Abstract: Diffusion models excel in high-fidelity image generation but face scalability limits due to transformers' quadratic attention complexity. Plug-and-play token reduction methods like ToMeSD and ToFu reduce FLOPs by merging redundant tokens in generated images but…
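To make the idea of plug-and-play token merging concrete, here is a minimal sketch of bipartite soft matching in the style of ToMe/ToMeSD: tokens are split into alternating source and destination sets, each source token is paired with its most similar destination token, and the `r` most similar pairs are averaged together. This is an illustrative sketch under assumed conventions (NumPy arrays, simple pairwise averaging), not ToMA's actual algorithm; the function name `merge_tokens` is hypothetical.

```python
import numpy as np

def merge_tokens(tokens: np.ndarray, r: int) -> np.ndarray:
    """Reduce an (n, d) token matrix by merging r redundant tokens.

    Sketch of ToMe-style bipartite soft matching: partition tokens into
    alternating src/dst sets, match each src token to its most similar
    dst token by cosine similarity, and average the r closest pairs.
    """
    # Normalize rows so that dot products are cosine similarities.
    norm = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    src_n, dst_n = norm[::2], norm[1::2]        # alternating partition
    sim = src_n @ dst_n.T                        # (n_src, n_dst) similarities
    best_dst = sim.argmax(axis=1)                # best dst match per src token
    best_sim = sim.max(axis=1)

    # Merge the r src tokens with the highest match scores into their
    # dst partners; keep the rest unchanged.
    merge_idx = np.argsort(-best_sim)[:r]
    keep_idx = np.setdiff1d(np.arange(len(src_n)), merge_idx)
    src_tok, dst_tok = tokens[::2].copy(), tokens[1::2].copy()
    for i in merge_idx:
        j = best_dst[i]
        # Simple pairwise average; a full implementation would track
        # merge counts and use a weighted mean.
        dst_tok[j] = (dst_tok[j] + src_tok[i]) / 2
    return np.concatenate([src_tok[keep_idx], dst_tok], axis=0)
```

Because attention cost is quadratic in the number of tokens, dropping even a modest fraction of tokens per layer compounds into a substantial FLOP reduction across the network.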
