Archives AI News

Variance-Aware Prior-Based Tree Policies for Monte Carlo Tree Search

arXiv:2512.21648v1 Announce Type: new Abstract: Monte Carlo Tree Search (MCTS) has profoundly influenced reinforcement learning (RL) by integrating planning and learning in tasks requiring long-horizon reasoning, exemplified by the AlphaZero family of algorithms. Central to MCTS is the search strategy,…

Bias-variance decompositions: the exclusive privilege of Bregman divergences

arXiv:2501.18581v3 Announce Type: replace Abstract: Bias-variance decompositions are widely used to understand the generalization performance of machine learning models. While the squared error loss permits a straightforward decomposition, other loss functions – such as zero-one loss or $L_1$ loss –…

BSFA: Leveraging the Subspace Dichotomy to Accelerate Neural Network Training

arXiv:2510.25244v2 Announce Type: replace Abstract: Recent studies \citep{gur2018gradient,song2024does,wen2024understanding} highlight a fundamental dichotomy in deep learning optimization: Although parameter updates along the top eigendirections of the loss Hessian (Dom-space) capture most of the update magnitude, they often contribute minimally to…

Clustering with Communication: A Variational Framework for Single Cell Representation Learning

arXiv:2505.04891v2 Announce Type: replace Abstract: Single-cell RNA sequencing (scRNA-seq) has revealed complex cellular heterogeneity, but recent studies emphasize that understanding biological function also requires modeling cell-cell communication (CCC), the signaling interactions mediated by ligand-receptor pairs that coordinate cellular behavior. Tools…

A Reinforcement Learning Approach to Synthetic Data Generation

arXiv:2512.21395v1 Announce Type: new Abstract: Synthetic data generation (SDG) is a promising approach for enabling data sharing in biomedical studies while preserving patient privacy. Yet, state-of-the-art generative models often require large datasets and complex training procedures, limiting their applicability in…

RLLaVA: An RL-central Framework for Language and Vision Assistants

arXiv:2512.21450v1 Announce Type: new Abstract: We present an RL-central framework for Language and Vision Assistants (RLLaVA) with its formulation of Markov decision process (MDP). RLLaVA decouples RL algorithmic logic from model architecture and distributed execution, supporting researchers in implementing new…