AI News Archives

Dynamic Controlled Variables Based Dynamic Self-Optimizing Control

arXiv:2605.06469v1 Announce Type: cross Abstract: Self-optimizing control is a strategy for selecting controlled variables in which the economic objective guides their selection and design, with the expectation that holding them at constant setpoints can achieve near-optimal operation…
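
To make the idea concrete, here is a minimal numerical sketch of one classical self-optimizing-control recipe, the null-space method: pick a measurement combination whose optimal value is insensitive to disturbances, then hold it constant. The objective, measurements, and constants below are illustrative assumptions, not taken from the paper.

```python
# Hypothetical toy illustrating the null-space method for self-optimizing
# control; the paper's own method is truncated in the excerpt, so every
# quantity here is an illustrative assumption.
import numpy as np

# Toy economic objective J(u, d) = (u - d)^2 + 0.5 * (u - 2d)^2.
# Analytic optimum: dJ/du = 3u - 4d = 0  =>  u_opt(d) = (4/3) d.
def u_opt(d):
    return 4.0 / 3.0 * d

# Measurements y = [u, u - d]; their optimal sensitivity to the
# disturbance d is F = dy_opt/dd = [4/3, 1/3].
F = np.array([4.0 / 3.0, 1.0 / 3.0])

# Pick H in the left null space of F so that c = H @ y stays constant
# at the optimum for every disturbance: H @ F = 0.
H = np.array([1.0, -4.0])          # 1*(4/3) - 4*(1/3) = 0
assert abs(H @ F) < 1e-12

for d in (0.5, 1.0, 2.0):
    # "Control" c to its nominal setpoint 0: c = u - 4(u - d) = -3u + 4d,
    # so c = 0 gives u = (4/3) d, which matches u_opt(d) exactly.
    u = 4.0 * d / 3.0
    c = H @ np.array([u, u - d])
    print(f"d={d}: u from c=const is {u:.4f}, u_opt={u_opt(d):.4f}, c={c:.1e}")
```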

Pretrained Event Classification Model for High Energy Physics Analysis

arXiv:2412.10665v2 Announce Type: replace-cross Abstract: We introduce a foundation model for event classification in high-energy physics, built on a Graph Neural Network architecture and trained on 120 million simulated proton-proton collision events spanning 12 distinct physics processes. The model is…
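
The excerpt does not include the architecture details, but a minimal graph-network event classifier along these lines can be sketched in PyTorch; the feature set (four per-particle inputs), hidden sizes, and dense adjacency below are illustrative assumptions, not the paper's design.

```python
# Minimal sketch of a GNN event classifier over particle graphs; the
# foundation model's actual architecture is not shown in the excerpt.
import torch
import torch.nn as nn

class GNNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.upd = nn.Linear(2 * dim, dim)

    def forward(self, x, adj):
        # x: (batch, nodes, dim); adj: (batch, nodes, nodes), row-normalized
        m = adj @ self.msg(x)                       # aggregate neighbor messages
        return torch.relu(self.upd(torch.cat([x, m], dim=-1)))

class EventClassifier(nn.Module):
    def __init__(self, feat_dim=4, hidden=64, n_classes=12, n_layers=3):
        super().__init__()
        self.embed = nn.Linear(feat_dim, hidden)
        self.layers = nn.ModuleList([GNNLayer(hidden) for _ in range(n_layers)])
        self.head = nn.Linear(hidden, n_classes)    # one logit per physics process

    def forward(self, x, adj):
        h = self.embed(x)
        for layer in self.layers:
            h = layer(h, adj)
        return self.head(h.mean(dim=1))             # mean-pool nodes -> event logits

# Toy fully connected event: 16 particles with (pt, eta, phi, E)-like features.
x = torch.randn(2, 16, 4)
adj = torch.full((2, 16, 16), 1.0 / 16)
print(EventClassifier()(x, adj).shape)              # torch.Size([2, 12])
```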

Dense Neural Networks are not Universal Approximators

arXiv:2602.07618v5 Announce Type: replace Abstract: We investigate the approximation capabilities of dense neural networks. While universal approximation theorems establish that sufficiently large architectures can approximate arbitrary continuous functions if there are no restrictions on the weight values, we show that…
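
One standard way such a weight restriction bites, which may or may not be the paper's argument, is through the Lipschitz constant: bounding the weights bounds how steep the network can be, so sufficiently steep targets are unapproximable on a fixed domain. A toy computation:

```python
# Hedged sketch of a classical obstruction (not necessarily the paper's
# proof): a ReLU net with per-layer weight norms at most B and depth L is
# Lipschitz with constant at most B**L, so it cannot track steep targets.
import numpy as np

B, depth = 1.0, 4                 # assumed per-layer norm bound and depth
lip_bound = B ** depth            # ReLU is 1-Lipschitz, so L(g) <= prod ||W_l||

# Target f(x) = sin(k x) has Lipschitz constant k.
k = 50.0
x0, x1 = 0.0, np.pi / (2 * k)     # f(x0) = 0, f(x1) = 1
# Any g from the restricted class satisfies |g(x1) - g(x0)| <= lip_bound*(x1 - x0),
# yet |f(x1) - f(x0)| = 1, so g must miss f by a computable sup-norm margin.
max_net_change = lip_bound * (x1 - x0)
min_sup_error = (1.0 - max_net_change) / 2.0
print(f"any bounded-weight net errs by at least {min_sup_error:.3f} in sup norm")
```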

Amortized Vine Copulas for High-Dimensional Density and Information Estimation

arXiv:2604.20568v2 Announce Type: replace Abstract: Modeling high-dimensional dependencies while keeping likelihoods tractable remains challenging. Classical vine-copula pipelines are interpretable but can be expensive, while many neural estimators are flexible but less structured. In this work, we propose Vine Denoising Copula…
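
The neural vine construction itself is not in the excerpt, but the two-step copula factorization it builds on, separating marginals from the dependence structure, is easy to sketch; the bivariate Gaussian copula below is an illustrative stand-in for the pair copulas in a vine.

```python
# Illustrative two-step copula density estimate (empirical marginals plus a
# Gaussian copula); this only sketches f(x) = c(F_1(x_1), F_2(x_2)) * prod f_i,
# not the paper's vine/neural construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=2000)

# Step 1: pseudo-observations u_i = rank(x_i)/(n+1) approximate the marginal CDFs.
n = len(x)
u = stats.rankdata(x, axis=0) / (n + 1)

# Step 2: fit the dependence on the normal scores z = Phi^{-1}(u).
z = stats.norm.ppf(u)
rho = np.corrcoef(z.T)[0, 1]
print(f"estimated copula correlation: {rho:.3f}")   # ~0.8

def log_copula_density(u_pt, rho):
    # Log-density of the bivariate Gaussian copula at (u1, u2).
    z1, z2 = stats.norm.ppf(u_pt)
    quad = (rho**2 * (z1**2 + z2**2) - 2 * rho * z1 * z2) / (2 * (1 - rho**2))
    return -0.5 * np.log(1 - rho**2) - quad

print(log_copula_density((0.9, 0.9), rho))
```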

High entropy leads to symmetry equivariant policies in Dec-POMDPs

arXiv:2511.22581v4 Announce Type: replace Abstract: We prove that in any Dec-POMDP, sufficiently high entropy regularization ensures that the policy gradient flow with tabular softmax parametrization always converges, for any initialization, to the same joint policy, and that this joint policy…
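
A degenerate special case is easy to simulate: a one-shot two-agent coordination game with tabular softmax policies and an entropy bonus. With the regularization weight set high (tau = 2 here, an illustrative choice), gradient ascent lands on the same symmetric joint policy from every initialization:

```python
# Toy numerical illustration in the spirit of the claim: a one-shot,
# two-agent coordination game (a degenerate Dec-POMDP) with softmax policies
# and entropy bonus tau. All constants are illustrative assumptions.
import numpy as np

R = np.array([[1.0, 0.0], [0.0, 1.0]])     # symmetric coordination payoff
tau, lr, steps = 2.0, 0.5, 5000            # high entropy regularization

def softmax(t):
    e = np.exp(t - t.max())
    return e / e.sum()

def run(seed):
    rng = np.random.default_rng(seed)
    th1, th2 = rng.normal(size=2), rng.normal(size=2)  # tabular logits
    for _ in range(steps):
        p1, p2 = softmax(th1), softmax(th2)
        # Objective: E[R] + tau*(H(p1) + H(p2)); exact gradients wrt the
        # action probabilities, then the softmax chain rule for the logits.
        g1 = R @ p2 - tau * (np.log(p1) + 1)
        g2 = R.T @ p1 - tau * (np.log(p2) + 1)
        th1 += lr * p1 * (g1 - p1 @ g1)
        th2 += lr * p2 * (g2 - p2 @ g2)
    return softmax(th1), softmax(th2)

for seed in (0, 1, 2):
    p1, p2 = run(seed)
    print(seed, np.round(p1, 3), np.round(p2, 3))  # same joint policy every seed
```

Here the limiting policy is the uniform one, which is invariant under swapping the two action labels, matching the symmetry-equivariance claim in this degenerate setting.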

Adaptive Computation Depth via Learned Token Routing in Transformers

arXiv:2605.05222v1 Announce Type: new Abstract: Standard transformer architectures apply the same number of layers to every token regardless of contextual difficulty. We present Token-Selective Attention (TSA), a learned per-token gate on residual updates between consecutive transformer blocks. Each gate is…
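
The excerpt cuts off before the gate parametrization, but a per-token scalar gate on a block's residual updates might look like the following PyTorch sketch; the sigmoid-over-hidden-state gate is an assumption, not necessarily the paper's design.

```python
# Minimal sketch of a learned per-token gate on residual updates between
# transformer blocks, in the spirit of the TSA description above.
import torch
import torch.nn as nn

class GatedBlock(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.gate = nn.Linear(dim, 1)      # one scalar gate per token (assumed form)

    def forward(self, x):
        g = torch.sigmoid(self.gate(x))            # (batch, seq, 1), per-token
        h = self.norm1(x)
        a, _ = self.attn(h, h, h, need_weights=False)
        x = x + g * a                              # gated residual update
        x = x + g * self.mlp(self.norm2(x))        # g ~ 0 lets easy tokens pass through
        return x

x = torch.randn(2, 10, 64)
print(GatedBlock(64)(x).shape)                     # torch.Size([2, 10, 64])
```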

Sparse Prefix Caching for Hybrid and Recurrent LLM Serving

arXiv:2605.05219v1 Announce Type: new Abstract: Prefix caching is a key latency optimization for autoregressive LLM serving, yet existing systems assume dense per-token key/value reuse. State-space models change the structure of the problem: a recurrent layer can resume from a single…
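
The contrast with dense KV reuse can be sketched directly: for a recurrent layer, caching one fixed-size state per prefix is enough to resume generation. The dictionary-based longest-prefix lookup below is an illustrative stand-in for a real serving system's index, and the toy recurrence is not any particular state-space model.

```python
# Sketch of prefix caching for a recurrent (state-space) layer: one
# fixed-size state per cached prefix suffices to resume, unlike dense
# per-token KV reuse. Keying scheme and recurrence are assumptions.
from dataclasses import dataclass, field

@dataclass
class RecurrentPrefixCache:
    states: dict = field(default_factory=dict)    # token prefix -> layer state

    def put(self, tokens, state):
        self.states[tuple(tokens)] = state

    def longest_match(self, tokens):
        # Resume from the longest cached prefix of the request.
        for cut in range(len(tokens), 0, -1):
            key = tuple(tokens[:cut])
            if key in self.states:
                return cut, self.states[key]
        return 0, None                            # cold start

def step(state, token):
    # Stand-in for a state-space recurrence s' = f(s, x); a toy decay update.
    return (state or 0.0) * 0.9 + token

cache = RecurrentPrefixCache()
prompt = [3, 1, 4, 1, 5]
s = None
for t in prompt:
    s = step(s, t)
cache.put(prompt, s)

# A new request sharing the prompt resumes from a single cached state
# rather than replaying or re-reading per-token KV entries.
req = prompt + [9, 2]
pos, s = cache.longest_match(req)
for t in req[pos:]:
    s = step(s, t)
print(pos, round(s, 3))                           # resumed at position 5
```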