Find Your Optimal Teacher: Personalized Data Synthesis via Router-Guided Multi-Teacher Distillation
arXiv:2510.10925v2 Announce Type: replace Abstract: Training student models on synthetic data generated by strong teacher models is a promising way to distill the capabilities of teachers. However, recent studies show that stronger models are not always optimal teachers, revealing a…
