Archives AI News

VeRPO: Verifiable Dense Reward Policy Optimization for Code Generation

arXiv:2601.03525v1. Announce Type: new. Abstract: Effective reward design is a central challenge in Reinforcement Learning (RL) for code generation. Mainstream pass/fail outcome rewards enforce functional correctness via executing unit tests, but the resulting sparsity limits potential performance gains. While recent…

Low Resource Reconstruction Attacks Through Benign Prompts

arXiv:2507.07947v3. Announce Type: replace. Abstract: Recent advances in generative models, such as diffusion models, have raised concerns related to privacy, copyright infringement, and data stewardship. To better understand and control these risks, prior work has introduced techniques and attacks that…

The Mean-Field Dynamics of Transformers

arXiv:2512.01868v3. Announce Type: replace. Abstract: We develop a mathematical framework that interprets Transformer attention as an interacting particle system and studies its continuum (mean-field) limits. By idealizing attention on the sphere, we connect Transformer dynamics to Wasserstein gradient flows, synchronization…