Google AI Unveils Supervised Reinforcement Learning (SRL): A Step Wise Framework with Expert Trajectories to Teach Small Language Models to Reason through Hard Problems
How can a small model learn to solve tasks it currently fails at, without rote imitation or relying on a correct rollout? A team of researchers from Google Cloud AI Research and UCLA have released a training framework, ‘Supervised Reinforcement…
