Archives AI News

Efficient Offline Reinforcement Learning: First Imitate, then Improve

arXiv:2406.13376v2 Announce Type: replace Abstract: Supervised imitation-based approaches are often favored over off-policy reinforcement learning approaches for learning policies offline, since their straightforward optimization objective makes them computationally efficient and stable to train. However, their performance is fundamentally limited by…

Transformer Reconstructed with Dynamic Value Attention

arXiv:2512.22212v1 Announce Type: new Abstract: Since the transformer was first published in 2017, several works have been proposed to optimize it. However, the major structure of the transformer remains unchanged, ignoring one of its main intrinsic limitations, which is the same static…

On the Existence and Behaviour of Secondary Attention Sinks

arXiv:2512.22213v1 Announce Type: new Abstract: Attention sinks are tokens, often the beginning-of-sequence (BOS) token, that receive disproportionately high attention despite limited semantic relevance. In this work, we identify a class of attention sinks, which we term secondary sinks, that differ…

Multivariate Conformal Prediction via Conformalized Gaussian Scoring

arXiv:2507.20941v2 Announce Type: replace-cross Abstract: While achieving exact conditional coverage in conformal prediction is unattainable without making strong, untestable regularity assumptions, the promise of conformal prediction hinges on finding approximations to conditional guarantees that are realizable in practice. A promising…

Müntz-Szász Networks: Neural Architectures with Learnable Power-Law Bases

arXiv:2512.22222v1 Announce Type: new Abstract: Standard neural network architectures employ fixed activation functions (ReLU, tanh, sigmoid) that are poorly suited for approximating functions with singular or fractional power behavior, a structure that arises ubiquitously in physics, including boundary layers, fracture…

ReGAIN: Retrieval-Grounded AI Framework for Network Traffic Analysis

arXiv:2512.22223v1 Announce Type: new Abstract: Modern networks generate vast, heterogeneous traffic that must be continuously analyzed for security and performance. Traditional network traffic analysis systems, whether rule-based or machine learning-driven, often suffer from high false positives and lack interpretability, limiting…