Archives AI News

VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training — A Chess Case Study

arXiv:2602.16833v1 Announce Type: new Abstract: Exploration remains a key bottleneck for reinforcement learning (RL) post-training of large language models (LLMs), where sparse feedback and large action spaces can lead to premature collapse into repetitive behaviors. We propose Verbalized Action Masking…

February 20, 2026

SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery

arXiv:2602.17395v1 Announce Type: cross Abstract: Generalized Category Discovery (GCD) aims to identify novel categories in unlabeled data while leveraging a small labeled subset of known classes. Training a parametric classifier solely on image features often leads to overfitting to old…

February 20, 2026

A Residual-Aware Theory of Position Bias in Transformers

arXiv:2602.16837v1 Announce Type: new Abstract: Transformer models systematically favor certain token positions, yet the architectural origins of this position bias remain poorly understood. Under causal masking at infinite depth, prior theoretical analyses of attention rollout predict an inevitable collapse of…

February 20, 2026

Simultaneous Blackwell Approachability and Applications to Multiclass Omniprediction

arXiv:2602.17577v1 Announce Type: cross Abstract: Omniprediction is a learning problem that requires suboptimality bounds for each of a family of losses $\mathcal{L}$ against a family of comparator predictors $\mathcal{C}$. We initiate the study of omniprediction in a multiclass setting, where…

February 20, 2026

Training Large Reasoning Models Efficiently via Progressive Thought Encoding

arXiv:2602.16839v1 Announce Type: new Abstract: Large reasoning models (LRMs) excel on complex problems but face a critical barrier to efficiency: reinforcement learning (RL) training requires long rollouts for outcome-based rewards, where autoregressive decoding dominates time and memory usage. While sliding-window…

February 20, 2026

Graph Machine Learning based Doubly Robust Estimator for Network Causal Effects

arXiv:2403.11332v3 Announce Type: replace Abstract: We address the challenge of inferring causal effects in social network data. This results in challenges due to interference — where a unit’s outcome is affected by neighbors’ treatments — and network-induced confounding factors. While…

February 20, 2026

What is the Value of Censored Data? An Exact Analysis for the Data-driven Newsvendor

arXiv:2602.16842v1 Announce Type: new Abstract: We study the offline data-driven newsvendor problem with censored demand data. In contrast to prior works where demand is fully observed, we consider the setting where demand is censored at the inventory level and only…

February 20, 2026

On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking

arXiv:2602.16849v1 Announce Type: new Abstract: We present a comprehensive analysis of how two-layer neural networks learn features to solve the modular addition task. Our work provides a full mechanistic interpretation of the learned model and a theoretical explanation of its…

February 20, 2026

Generating Directed Graphs with Dual Attention and Asymmetric Encoding

arXiv:2506.16404v3 Announce Type: replace Abstract: Directed graphs naturally model systems with asymmetric, ordered relationships, essential to applications in biology, transportation, social networks, and visual understanding. Generating such graphs enables tasks such as simulation, data augmentation and novel instance discovery; however,…

February 20, 2026

Position: Why a Dynamical Systems Perspective is Needed to Advance Time Series Modeling

arXiv:2602.16864v1 Announce Type: new Abstract: Time series (TS) modeling has come a long way from early statistical, mainly linear, approaches to the current trend in TS foundation models. With a lot of hype and industrial demand in this field, it…

February 20, 2026