Archives AI News

Single-stream Policy Optimization

arXiv:2509.13232v1. Announce Type: cross.
Abstract: We revisit policy-gradient optimization for Large Language Models (LLMs) from a single-stream perspective. Prevailing group-based methods like GRPO reduce variance with on-the-fly baselines but suffer from critical flaws: frequent degenerate groups erase learning signals, and…
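As a rough illustration of the degenerate-group problem the abstract mentions, the sketch below assumes the commonly described GRPO-style baseline, in which each response's advantage is its reward minus the group mean, divided by the group standard deviation. The function name, the eps term, and the reward values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Group-normalized advantages: (reward - group mean) / (group std + eps).

    Assumption: this mirrors the usual description of a GRPO-style baseline;
    eps is a hypothetical guard against division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group with mixed outcomes yields non-zero advantages.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A degenerate group (all rewards identical) yields all-zero advantages,
# so these samples contribute no gradient signal.
degenerate = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

print(mixed)
print(degenerate)
```

When every response in a group receives the same reward, the numerator is zero for all of them, which is one way to read the abstract's remark that degenerate groups erase learning signals.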

Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage

arXiv:2509.13267v1. Announce Type: cross.
Abstract: Discrete Bayesian networks (DBNs) provide a broadly useful framework for modeling dependence structures in multivariate categorical data. There is a vast literature on methods for inferring conditional probabilities and graphical structure in DBNs, but data…

InfoGain Wavelets: Furthering the Design of Graph Diffusion Wavelets

arXiv:2504.08802v2. Announce Type: replace-cross.
Abstract: Diffusion wavelets extract information from graph signals at different scales of resolution by utilizing graph diffusion operators raised to various powers, known as diffusion scales. Traditionally, these scales are chosen to be dyadic integers, $2^j$.…
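The traditional dyadic construction described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method: it assumes a lazy random walk diffusion operator P = (I + A D^{-1}) / 2 and defines each wavelet band as the difference between the signal diffused at consecutive scales 0, 1, 2, 4, ..., 2^J. All function names, the path graph, and the input signal are hypothetical.

```python
def mat_vec(matrix, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in matrix]


def lazy_random_walk(adj):
    """Assumed diffusion operator P = (I + A D^{-1}) / 2 (column-stochastic)."""
    n = len(adj)
    degree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [0.5 * ((1.0 if i == j else 0.0) + adj[i][j] / degree[j]) for j in range(n)]
        for i in range(n)
    ]


def diffuse(p, x, steps):
    """Apply the diffusion operator `steps` times, i.e. compute P^steps x."""
    for _ in range(steps):
        x = mat_vec(p, x)
    return x


def dyadic_wavelet_bands(adj, signal, max_j):
    """Return the wavelet bands and the final low-pass (most diffused) signal."""
    p = lazy_random_walk(adj)
    scales = [0] + [2 ** j for j in range(max_j + 1)]  # 0, 1, 2, 4, ...
    diffused = [diffuse(p, signal, t) for t in scales]
    bands = [
        [a - b for a, b in zip(diffused[k], diffused[k + 1])]
        for k in range(len(scales) - 1)
    ]
    return bands, diffused[-1]


# Hypothetical example: a 4-node path graph with a unit impulse on node 0.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
signal = [1.0, 0.0, 0.0, 0.0]
bands, low_pass = dyadic_wavelet_bands(path_graph, signal, max_j=2)

# The bands telescope: adding them to the low-pass part recovers the signal.
reconstructed = [
    sum(band[i] for band in bands) + low_pass[i] for i in range(len(signal))
]
print(reconstructed)
```

Because consecutive differences telescope, the bands plus the low-pass component sum back to the input, whichever scales are chosen. Only the list of scales would need to change to try a non-dyadic selection.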