Flattening Hierarchies with Policy Bootstrapping
arXiv:2505.14975v3

Abstract: Offline goal-conditioned reinforcement learning (GCRL) is a promising approach for pretraining generalist policies on large datasets of reward-free trajectories, akin to the self-supervised objectives used to train foundation models for computer vision and natural language…
