Thoughtbubbles: an Unsupervised Method for Parallel Thinking in Latent Space
arXiv:2510.00219v1 Announce Type: new
Abstract: Current approaches for scaling inference-time compute in transformers rely on training them to emit explicit chain-of-thought tokens before producing an answer. While these methods are powerful, they are limited because they cannot be applied during…
