Learning General Policies with Policy Gradient Methods
arXiv:2512.19366v1 (cross-listed)
Abstract: While reinforcement learning methods have delivered remarkable results in a number of settings, generalization, i.e., the ability to produce policies that generalize in a reliable and systematic way, has remained a challenge. The problem of…
