Archives AI News

Remasking Discrete Diffusion Models with Inference-Time Scaling

arXiv:2503.00307v3. Announce Type: replace. Abstract: Part of the success of diffusion models stems from their ability to perform iterative refinement, i.e., repeatedly correcting outputs during generation. However, modern masked discrete diffusion lacks this capability: when a token is generated, it…

The Geometry of Grokking: Norm Minimization on the Zero-Loss Manifold

arXiv:2511.01938v1. Announce Type: new. Abstract: Grokking is a puzzling phenomenon in neural networks where full generalization occurs only after a substantial delay following the complete memorization of the training data. Previous research has linked this delayed generalization to representation learning…

Hybrid Quantum-Classical Recurrent Neural Networks

arXiv:2510.25557v2. Announce Type: replace. Abstract: We present a hybrid quantum-classical recurrent neural network (QRNN) architecture in which the recurrent core is realized as a parametrized quantum circuit (PQC) controlled by a classical feedforward network. The hidden state is the quantum…