Archives AI News

Why Safety Probes Catch Liars But Miss Fanatics

arXiv:2603.25861v1 Announce Type: new Abstract: Activation-based probes have emerged as a promising approach for detecting deceptively aligned AI systems by identifying internal conflict between true and stated goals. We identify a fundamental blind spot: probes fail on coherent misalignment –…

Incorporating contextual information into KGWAS for interpretable GWAS discovery

arXiv:2603.25855v1 Announce Type: new Abstract: Genome-Wide Association Studies (GWAS) identify associations between genetic variants and disease; however, moving beyond associations to causal mechanisms is critical for therapeutic target prioritization. The recently proposed Knowledge Graph GWAS (KGWAS) framework addresses this challenge…

A Compression Perspective on Simplicity Bias

arXiv:2603.25839v1 Announce Type: new Abstract: Deep neural networks exhibit a simplicity bias, a well-documented tendency to favor simple functions over complex ones. In this work, we cast new light on this phenomenon through the lens of the Minimum Description Length…

Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models

arXiv:2503.22886v2 Announce Type: replace Abstract: Recent advancements in imitation learning have led to transformer-based behavior foundation models (BFMs) that enable multi-modal, human-like control for humanoid agents. While excelling at zero-shot generation of robust behaviors, BFMs often require meticulous prompt engineering…

Parameter-Free Dynamic Regret for Unconstrained Linear Bandits

arXiv:2603.25916v1 Announce Type: new Abstract: We study dynamic regret minimization in unconstrained adversarial linear bandit problems. In this setting, a learner must minimize the cumulative loss relative to an arbitrary sequence of comparators $\boldsymbol{u}_1,\ldots,\boldsymbol{u}_T$ in $\mathbb{R}^d$, but receives only point-evaluation…

Revisiting Diffusion Model Predictions Through Dimensionality

arXiv:2601.21419v2 Announce Type: replace Abstract: Recent advances in diffusion and flow matching models have highlighted a shift in the preferred prediction target, moving from noise ($\varepsilon$) and velocity (v) to direct data (x) prediction, particularly in high-dimensional settings.…