On Evaluating LLM Alignment by Evaluating LLMs as Judges
arXiv:2511.20604v1 Announce Type: cross Abstract: Alignment with human preferences is an important evaluation aspect of LLMs, requiring them to be helpful, honest, safe, and to precisely follow human instructions. Evaluating large language models’ (LLMs) alignment typically involves directly assessing their…
