Archives AI News

When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning

arXiv:2603.03475v1 Announce Type: new Abstract: Mathematical reasoning models are widely deployed in education, automated tutoring, and decision support systems despite exhibiting fundamental computational instabilities. We demonstrate that state-of-the-art models (Qwen2.5-Math-7B) achieve 61% accuracy through a mixture of reliable and unreliable…

Preference Leakage: A Contamination Problem in LLM-as-a-judge

arXiv:2502.01534v3 Announce Type: replace Abstract: Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development. While their combination significantly enhances the efficiency of model training and evaluation, little…

Knowing When to Quit: Probabilistic Early Exits for Speech Separation

arXiv:2507.09768v3 Announce Type: replace Abstract: In recent years, deep learning-based single-channel speech separation has improved considerably, in large part driven by increasingly compute- and parameter-efficient neural network architectures. Most such architectures are, however, designed with a fixed compute and parameter…

Circuit Insights: Towards Interpretability Beyond Activations

arXiv:2510.14936v2 Announce Type: replace Abstract: The fields of explainable AI and mechanistic interpretability aim to uncover the internal structure of neural networks, with circuit discovery as a central tool for understanding model computations. Existing approaches, however, rely on manual inspection…

Solving adversarial examples requires solving exponential misalignment

arXiv:2603.03507v1 Announce Type: new Abstract: Adversarial attacks – input perturbations imperceptible to humans that fool neural networks – remain both a persistent failure mode in machine learning and a phenomenon with mysterious origins. To shed light, we define and analyze…