Preference Leakage: A Contamination Problem in LLM-as-a-judge
arXiv:2502.01534v3
Abstract: Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development. While their combination significantly enhances the efficiency of model training and evaluation, little…
