Archives AI News

Learning to Rank Chain-of-Thought: Using a Small Model

arXiv:2505.14999v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) struggle with reliable mathematical reasoning, and current verification methods are often computationally expensive. This paper introduces the Energy Outcome Reward Model (EORM), a highly efficient, lightweight post-hoc verifier designed to address…
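The abstract describes EORM as a lightweight post-hoc verifier that scores sampled chain-of-thought candidates by an energy value and keeps the best one. A minimal sketch of that ranking step, assuming a tiny MLP energy head over candidate embeddings (the head architecture, shapes, and names here are illustrative, not the paper's):

```python
import numpy as np

def energy(embs, W1, b1, W2, b2):
    """Hypothetical tiny MLP energy head: lower energy = more plausible chain."""
    h = np.maximum(embs @ W1 + b1, 0.0)  # ReLU hidden layer
    return (h @ W2 + b2).ravel()         # one scalar energy per candidate

def rank_by_energy(embs, params):
    """Rank candidate chain-of-thought embeddings, lowest energy first."""
    e = energy(embs, *params)
    return np.argsort(e)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
embs = rng.normal(size=(5, 16))          # embeddings of 5 sampled CoT solutions
order = rank_by_energy(embs, (W1, b1, W2, b2))
best = int(order[0])                     # index of the selected (lowest-energy) chain
```

The appeal of a post-hoc verifier of this kind is that ranking n candidates costs only n cheap forward passes through the small energy head, not n extra LLM calls.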

Error Feedback for Muon and Friends

arXiv:2510.00643v1 Announce Type: cross Abstract: Recent optimizers like Muon, Scion, and Gluon have pushed the frontier of large-scale deep learning by exploiting layer-wise linear minimization oracles (LMOs) over non-Euclidean norm balls, capturing neural network structure in ways traditional algorithms cannot.…
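The key primitive the abstract mentions is a layer-wise linear minimization oracle (LMO) over a non-Euclidean norm ball. For the spectral-norm ball used by Muon-style optimizers, the LMO has a closed form: with gradient G = U S Vᵀ, the minimizer of ⟨G, X⟩ over ‖X‖₂ ≤ r is −r·UVᵀ. A sketch via SVD (practical implementations often approximate the orthogonalization, e.g. with Newton–Schulz iterations, rather than computing a full SVD):

```python
import numpy as np

def lmo_spectral(grad, radius=1.0):
    """LMO over the spectral-norm ball: argmin over ||X||_2 <= radius of <grad, X>.
    For grad = U S V^T the minimizer is -radius * U V^T."""
    U, _, Vt = np.linalg.svd(grad, full_matrices=False)
    return -radius * (U @ Vt)
```

The result is a scaled orthogonalization of the gradient: all singular values are set to `radius`, which is what gives these optimizers their layer-wise, shape-aware updates.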

Learn to Guide Your Diffusion Model

arXiv:2510.00815v1 Announce Type: cross Abstract: Classifier-free guidance (CFG) is a widely used technique for improving the perceptual quality of samples from conditional diffusion models. It operates by linearly combining conditional and unconditional score estimates using a guidance weight $\omega$. While…
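The linear combination the abstract refers to is the standard CFG formula: the guided score extrapolates from the unconditional estimate toward the conditional one by weight $\omega$. In one line:

```python
def cfg_score(score_uncond, score_cond, omega):
    """Classifier-free guidance: s = s_u + omega * (s_c - s_u).
    omega = 0 recovers the unconditional score, omega = 1 the conditional one,
    and omega > 1 extrapolates past it (the usual quality-boosting regime)."""
    return score_uncond + omega * (score_cond - score_uncond)
```

The paper's contribution (learning how to guide) presumably concerns choosing $\omega$; the abstract is truncated before the details.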

Guided Speculative Inference for Efficient Test-Time Alignment of LLMs

arXiv:2506.04118v2 Announce Type: replace-cross Abstract: We propose Guided Speculative Inference (GSI), a novel algorithm for efficient reward-guided decoding in large language models. GSI combines soft best-of-$n$ test-time scaling with a reward model $r(x,y)$ and speculative samples from a small auxiliary…
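The abstract combines soft best-of-$n$ with a reward model $r(x,y)$ over candidates drawn speculatively from a small auxiliary model. A hedged sketch of the soft best-of-$n$ selection step alone (the speculative-drafting and acceptance machinery is omitted; `beta` is an assumed temperature parameter, and one common reading of "soft" is sampling a candidate with probability proportional to $\exp(r/\beta)$):

```python
import math
import random

def soft_best_of_n(candidates, reward_fn, beta=1.0, rng=random):
    """Sample one of n candidates with probability proportional to exp(r(y)/beta).
    beta -> 0 recovers hard best-of-n (argmax reward); large beta -> uniform."""
    rewards = [reward_fn(y) for y in candidates]
    m = max(rewards)                                   # subtract max for stability
    weights = [math.exp((r - m) / beta) for r in rewards]
    total = sum(weights)
    u = rng.random() * total                           # inverse-CDF sampling
    acc = 0.0
    for y, w in zip(candidates, weights):
        acc += w
        if u <= acc:
            return y
    return candidates[-1]
```

With a small `beta` the sampler concentrates on the highest-reward candidate, e.g. `soft_best_of_n(drafts, reward_model, beta=0.01)`.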

False Discovery Rate Control via Bayesian Mirror Statistic

arXiv:2510.00875v1 Announce Type: cross Abstract: Simultaneously performing variable selection and inference in high-dimensional models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of specific statistical procedures to accurately…
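For context on the title, the (frequentist) data-splitting mirror statistic combines two independent estimates of each coefficient into a statistic that is symmetric about zero for nulls, so its negative tail estimates the false discovery proportion; the paper's Bayesian variant presumably changes how the two estimates arise. A sketch of the standard construction (not the paper's method):

```python
import numpy as np

def mirror_statistics(beta1, beta2):
    """Mirror statistic M_j = sign(b1_j * b2_j) * (|b1_j| + |b2_j|),
    built from two independent coefficient estimates (e.g. via data splitting)."""
    return np.sign(beta1 * beta2) * (np.abs(beta1) + np.abs(beta2))

def fdr_threshold(M, q=0.1):
    """Smallest t > 0 with estimated FDP = #{M <= -t} / max(#{M >= t}, 1) <= q;
    variables with M_j >= t are then selected."""
    for t in sorted(np.abs(M)):
        if t == 0:
            continue
        fdp = np.sum(M <= -t) / max(np.sum(M >= t), 1)
        if fdp <= q:
            return t
    return np.inf
```

Null coefficients produce signs that are positive or negative with equal probability, which is what licenses using the left tail of M as a stand-in for false discoveries on the right.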

Robust Estimation Under Heterogeneous Corruption Rates

arXiv:2508.15051v2 Announce Type: replace-cross Abstract: We study the problem of robust estimation under heterogeneous corruption rates, where each sample may be independently corrupted with a known but non-identical probability. This setting arises naturally in distributed and federated learning, crowdsourcing, and…