Archives AI News

Defining and Evaluating Physical Safety for Large Language Models

arXiv:2411.02317v2 Announce Type: replace Abstract: Large Language Models (LLMs) are increasingly used to control robotic systems such as drones, but their risks of causing physical threats and harm in real-world applications remain unexplored. Our study addresses the critical gap in…

February 20, 2026

Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency

arXiv:2602.16787v1 Announce Type: new Abstract: Despite their strong performance on reasoning benchmarks, large language models (LLMs) have proven brittle when presented with counterfactual questions, suggesting weaknesses in their causal reasoning ability. While recent work has demonstrated that labeled counterfactual tasks…

February 20, 2026

Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning

arXiv:2509.09135v3 Announce Type: replace Abstract: Existing reinforcement learning (RL) methods struggle with complex dynamical systems that demand interactions at high frequencies or irregular time intervals. Continuous-time RL (CTRL) has emerged as a promising alternative by replacing discrete-time Bellman recursion with…

February 20, 2026

Escaping the Cognitive Well: Efficient Competition Math with Off-the-Shelf Models

arXiv:2602.16793v1 Announce Type: new Abstract: In the past year, custom and unreleased math reasoning models reached gold medal performance on the International Mathematical Olympiad (IMO). Similar performance was then reported using large-scale inference on publicly available models but at prohibitive…

February 20, 2026

On the Existence and Behavior of Secondary Attention Sinks

arXiv:2512.22213v2 Announce Type: replace Abstract: Attention sinks are tokens, often the beginning-of-sequence (BOS) token, that receive disproportionately high attention despite limited semantic relevance. In this work, we identify a class of attention sinks, which we term secondary sinks, that differ…

February 20, 2026

Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning

arXiv:2602.16796v1 Announce Type: new Abstract: Fine-tuning pre-trained diffusion and flow models to optimize downstream utilities is central to real-world deployment. Existing entropy-regularized methods primarily maximize expected reward, providing no mechanism to shape tail behavior. However, tail control is often essential:…

February 20, 2026

Goal Inference from Open-Ended Dialog

arXiv:2410.13957v2 Announce Type: replace-cross Abstract: Embodied AI Agents are quickly becoming important and common tools in society. These embodied agents should be able to learn about and accomplish a wide range of user goals and preferences efficiently and robustly. Large…

February 20, 2026

TopoFlow: Physics-guided Neural Networks for high-resolution air quality prediction

arXiv:2602.16821v1 Announce Type: new Abstract: We propose TopoFlow (Topography-aware pollutant Flow learning), a physics-guided neural network for efficient, high-resolution air quality prediction. To explicitly embed physical processes into the learning framework, we identify two critical factors governing pollutant dynamics: topography…

February 20, 2026

Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems

arXiv:2508.12026v2 Announce Type: replace-cross Abstract: Bongard Problems (BPs) provide a challenging testbed for abstract visual reasoning (AVR), requiring models to identify visual concepts fromjust a few examples and describe them in natural language. Early BP benchmarks featured synthetic black-and-white drawings,…

February 20, 2026

Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees

arXiv:2602.16823v1 Announce Type: new Abstract: *Automated circuit discovery* is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for specific behaviors. While prior methods have made significant progress, they typically depend on heuristics or…

February 20, 2026