Archives AI News

Learning to Reason as Action Abstractions with Scalable Mid-Training RL

arXiv:2509.25810v2 Announce Type: replace-cross Abstract: Large language models excel with reinforcement learning (RL), but fully unlocking this potential requires a mid-training stage. An effective mid-training phase should identify a compact set of useful actions and enable fast selection among them…

October 7, 2025

Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

arXiv:2305.18578v2 Announce Type: replace-cross Abstract: Hidden Markov models (HMMs) are characterized by an unobservable Markov chain and an observable process — a noisy version of the hidden chain. Decoding the original signal from the noisy observations is one of the…

October 7, 2025

Dependency-aware Maximum Likelihood Estimation for Active Learning

arXiv:2503.05969v2 Announce Type: replace-cross Abstract: Active learning aims to efficiently build a labeled training set by strategically selecting samples to query labels from annotators. In this sequential process, each sample acquisition influences subsequent selections, causing dependencies among samples in the…

October 7, 2025

Machine Learning for Inverse Problems and Data Assimilation

arXiv:2410.10523v2 Announce Type: replace Abstract: The aim of these notes is to demonstrate the potential for ideas in machine learning to impact on the fields of inverse problems and data assimilation. The perspective is one that is primarily aimed at…

October 7, 2025

Uniform convergence of the smooth calibration error and its relationship with functional gradient

arXiv:2505.19396v4 Announce Type: replace Abstract: Calibration is a critical requirement for reliable probabilistic prediction, especially in high-risk applications. However, the theoretical understanding of which learning algorithms can simultaneously achieve high accuracy and good calibration remains limited, and many existing studies…

October 7, 2025

SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator

arXiv:2510.04576v1 Announce Type: cross Abstract: Deep generative models have made significant advances in generating complex content, yet conditional generation remains a fundamental challenge. Existing conditional generative adversarial networks often struggle to balance the dual objectives of assessing authenticity and conditional…

October 7, 2025

Rethinking Langevin Thompson Sampling from A Stochastic Approximation Perspective

arXiv:2510.05023v1 Announce Type: cross Abstract: Most existing approximate Thompson Sampling (TS) algorithms for multi-armed bandits use Stochastic Gradient Langevin Dynamics (SGLD) or its variants in each round to sample from the posterior, relaxing the need for conjugacy assumptions between priors…

October 7, 2025

Scalable Causal Discovery from Recursive Nonlinear Data via Truncated Basis Function Scores and Tests

arXiv:2510.04276v1 Announce Type: new Abstract: Learning graphical conditional independence structures from nonlinear, continuous or mixed data is a central challenge in machine learning and the sciences, and many existing methods struggle to scale to thousands of samples or hundreds of…

October 7, 2025

Relative Information Gain and Gaussian Process Regression

arXiv:2510.04277v1 Announce Type: new Abstract: The sample complexity of estimating or maximising an unknown function in a reproducing kernel Hilbert space is known to be linked to both the effective dimension and the information gain associated with the kernel. While…

October 7, 2025

Self-Speculative Masked Diffusions

arXiv:2510.03929v1 Announce Type: new Abstract: We present self-speculative masked diffusions, a new class of masked diffusion generative models for discrete data that require significantly fewer function evaluations to generate samples. Standard masked diffusion models predict factorized logits over currently masked…

October 7, 2025