CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D
arXiv:2511.09904v1
Announce Type: new
Abstract: AI systems are increasingly able to autonomously conduct realistic software engineering tasks, and may soon be deployed to automate machine learning (ML) R&D itself. Frontier AI systems may be deployed in safety-critical settings, including to…
