On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
arXiv:2508.05629v3

Abstract: In this work, we present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for Large Language Models (LLMs) that addresses its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal…
