Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
arXiv:2508.03501v2 Announce Type: replace Abstract: Research on applications of reinforcement learning (RL) to large language models has mostly focused on single-turn problems, such as mathematical reasoning or single-shot code generation. While these problems can be viewed as token-level multi-turn…
