Archives AI News

Learning to Reason as Action Abstractions with Scalable Mid-Training RL

arXiv:2509.25810v2. Abstract: Large language models excel with reinforcement learning (RL), but fully unlocking this potential requires a mid-training stage. An effective mid-training phase should identify a compact set of useful actions and enable fast selection among them…

Dependency-aware Maximum Likelihood Estimation for Active Learning

arXiv:2503.05969v2. Abstract: Active learning aims to efficiently build a labeled training set by strategically selecting samples to query labels from annotators. In this sequential process, each sample acquisition influences subsequent selections, causing dependencies among samples in the…

Machine Learning for Inverse Problems and Data Assimilation

arXiv:2410.10523v2. Abstract: The aim of these notes is to demonstrate the potential for ideas in machine learning to impact on the fields of inverse problems and data assimilation. The perspective is one that is primarily aimed at…

SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator

arXiv:2510.04576v1. Abstract: Deep generative models have made significant advances in generating complex content, yet conditional generation remains a fundamental challenge. Existing conditional generative adversarial networks often struggle to balance the dual objectives of assessing authenticity and conditional…

Relative Information Gain and Gaussian Process Regression

arXiv:2510.04277v1. Abstract: The sample complexity of estimating or maximising an unknown function in a reproducing kernel Hilbert space is known to be linked to both the effective dimension and the information gain associated with the kernel. While…

Self-Speculative Masked Diffusions

arXiv:2510.03929v1. Abstract: We present self-speculative masked diffusions, a new class of masked diffusion generative models for discrete data that require significantly fewer function evaluations to generate samples. Standard masked diffusion models predict factorized logits over currently masked…