Archives AI News

Empowering Multi-Turn Tool-Integrated Reasoning with Group Turn Policy Optimization

arXiv:2511.14846v1 Announce Type: new Abstract: Training Large Language Models (LLMs) for multi-turn Tool-Integrated Reasoning (TIR) – where models iteratively reason, generate code, and verify through execution – remains challenging for existing reinforcement learning (RL) approaches. Current RL methods, exemplified by…

VisPlay: Self-Evolving Vision-Language Models from Images

arXiv:2511.15661v1 Announce Type: cross Abstract: Reinforcement learning (RL) provides a principled framework for improving Vision-Language Models (VLMs) on complex reasoning tasks. However, existing RL approaches often rely on human-annotated labels or task-specific heuristics to define verifiable rewards, both of which…

$\pi^{*}_{0.6}$: a VLA That Learns From Experience

arXiv:2511.14759v2 Announce Type: replace Abstract: We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL). We present a general-purpose method, RL with Experience and Corrections via Advantage-conditioned Policies (RECAP), that provides for RL training of…

Energy-based generator matching: A neural sampler for general state space

arXiv:2505.19646v3 Announce Type: replace Abstract: We propose Energy-based generator matching (EGM), a modality-agnostic approach to train generative models from energy functions in the absence of data. Extending the recently proposed generator matching, EGM enables training of arbitrary continuous-time Markov processes,…