Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion
arXiv:2503.10488v3
Abstract: Generating co-speech gestures in real time requires both temporal coherence and efficient sampling. We introduce a novel framework for streaming gesture generation that extends Rolling Diffusion models with structured progressive noise scheduling, enabling seamless long-sequence…
