Archives AI News

Diversity-Aware Reverse Kullback-Leibler Divergence for Large Language Model Distillation

arXiv:2604.00223v1 Announce Type: new Abstract: Reverse Kullback-Leibler (RKL) divergence has recently emerged as the preferred objective for large language model (LLM) distillation, consistently outperforming forward KL (FKL), particularly in regimes with large vocabularies and significant teacher-student capacity mismatch, where RKL…

April 2, 2026

Rapid mixing in positively weighted restricted Boltzmann machines

arXiv:2604.00963v1 Announce Type: cross Abstract: We show polylogarithmic mixing time bounds for the alternating-scan sampler for positively weighted restricted Boltzmann machines. This is done via analysing the same chain and the Glauber dynamics for ferromagnetic two-spin systems, where we obtain…

April 2, 2026

Neural Collapse Dynamics: Depth, Activation, Regularisation, and Feature Norm Threshold

arXiv:2604.00230v1 Announce Type: new Abstract: Neural collapse (NC) — the convergence of penultimate-layer features to a simplex equiangular tight frame — is well understood at equilibrium, but the dynamics governing its onset remain poorly characterised. We identify a simple and…

April 2, 2026

Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers

arXiv:2604.01128v1 Announce Type: cross Abstract: This paper introduces the first systematic evaluation framework for quantifying the quality and risks of papers written by modern coding agents. While AI-driven paper writing has become a growing concern, rigorous evaluation of the quality…

April 2, 2026

MAC-Attention: a Match-Amend-Complete Scheme for Fast and Accurate Attention Computation

arXiv:2604.00235v1 Announce Type: new Abstract: Long-context decoding in LLMs is IO-bound: each token re-reads an ever-growing KV cache. Prior accelerations cut bytes via compression, which lowers fidelity, or selection/eviction, which restricts what remains accessible, and both can degrade delayed recall…

April 2, 2026

A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware

arXiv:2306.14052v2 Announce Type: replace Abstract: Graph neural networks (GNNs) are emerging for machine learning research on graph-structured data. GNNs achieve state-of-the-art performance on many tasks, but they face scalability challenges when it comes to real-world applications that have numerous data…

April 2, 2026

Hierarchical Discrete Flow Matching for Graph Generation

arXiv:2604.00236v1 Announce Type: new Abstract: Denoising-based models, including diffusion and flow matching, have led to substantial advances in graph generation. Despite this progress, such models remain constrained by two fundamental limitations: a computational cost that scales quadratically with the number…

April 2, 2026

VT-Former: Efficient Transformer-based Decoder for Varshamov-Tenengolts Codes

arXiv:2502.21060v2 Announce Type: replace Abstract: In recent years, widespread attention has been drawn to the challenge of correcting insertion, deletion, and substitution (IDS) errors in DNA-based data storage. Among various IDS-correcting codes, Varshamov-Tenengolts (VT) codes, originally designed for single-error correction,…

April 2, 2026

Softmax gradient policy for variance minimization and risk-averse multi armed bandits

arXiv:2604.00241v1 Announce Type: new Abstract: Algorithms for the Multi-Armed Bandit (MAB) problem play a central role in sequential decision-making and have been extensively explored both theoretically and numerically. While most classical approaches aim to identify the arm with the highest…

April 2, 2026

Binned semiparametric Bayesian networks for efficient kernel density estimation

arXiv:2506.21997v3 Announce Type: replace Abstract: This paper introduces a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation in nonparametric distributions. Two new conditional probability distributions are developed…

April 2, 2026