Archives AI News

On the Existence and Behavior of Secondary Attention Sinks

arXiv:2512.22213v2 Announce Type: replace Abstract: Attention sinks are tokens, often the beginning-of-sequence (BOS) token, that receive disproportionately high attention despite limited semantic relevance. In this work, we identify a class of attention sinks, which we term secondary sinks, that differ…

February 20, 2026

Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning

arXiv:2602.16796v1 Announce Type: new Abstract: Fine-tuning pre-trained diffusion and flow models to optimize downstream utilities is central to real-world deployment. Existing entropy-regularized methods primarily maximize expected reward, providing no mechanism to shape tail behavior. However, tail control is often essential:…

February 20, 2026

Goal Inference from Open-Ended Dialog

arXiv:2410.13957v2 Announce Type: replace-cross Abstract: Embodied AI agents are quickly becoming important and common tools in society. These embodied agents should be able to learn about and accomplish a wide range of user goals and preferences efficiently and robustly. Large…

February 20, 2026

TopoFlow: Physics-guided Neural Networks for high-resolution air quality prediction

arXiv:2602.16821v1 Announce Type: new Abstract: We propose TopoFlow (Topography-aware pollutant Flow learning), a physics-guided neural network for efficient, high-resolution air quality prediction. To explicitly embed physical processes into the learning framework, we identify two critical factors governing pollutant dynamics: topography…

February 20, 2026

Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems

arXiv:2508.12026v2 Announce Type: replace-cross Abstract: Bongard Problems (BPs) provide a challenging testbed for abstract visual reasoning (AVR), requiring models to identify visual concepts from just a few examples and describe them in natural language. Early BP benchmarks featured synthetic black-and-white drawings,…

February 20, 2026

Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees

arXiv:2602.16823v1 Announce Type: new Abstract: *Automated circuit discovery* is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for specific behaviors. While prior methods have made significant progress, they typically depend on heuristics or…

February 20, 2026

Building Safe and Deployable Clinical Natural Language Processing under Temporal Leakage Constraints

arXiv:2602.15852v2 Announce Type: replace-cross Abstract: Clinical natural language processing (NLP) models have shown promise for supporting hospital discharge planning by leveraging narrative clinical documentation. However, note-based models are particularly vulnerable to temporal and lexical leakage, where documentation artifacts encode future…

February 20, 2026

HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind

arXiv:2602.16826v1 Announce Type: new Abstract: Theory of mind (ToM) enables AI systems to infer agents’ hidden goals and mental states, but existing approaches focus mainly on small, human-understandable gridworld spaces. We introduce HiVAE, a hierarchical variational architecture that scales…

February 20, 2026

Chip-processing method could assist cryptography schemes to keep data secure

By enabling two chips to authenticate each other using a shared fingerprint, this technique can improve privacy and energy efficiency.

February 20, 2026

A Unifying Framework for Robust and Efficient Inference with Unstructured Data

arXiv:2505.00282v3 Announce Type: replace-cross Abstract: To analyze unstructured data (text, images, audio, video), economists typically first extract low-dimensional structured features with a neural network. Neural networks do not make generically unbiased predictions, and biases will propagate to estimators that use…

February 20, 2026