Archives AI News

DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning

arXiv:2602.16742v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has been shown effective in enhancing the visual reflection and reasoning capabilities of Large Multimodal Models (LMMs). However, existing datasets are predominantly derived from either small-scale manual construction or…

February 20, 2026

Quantifying LLM Attention-Head Stability: Implications for Circuit Universality

arXiv:2602.16740v1 Announce Type: new Abstract: In mechanistic interpretability, recent work scrutinizes transformer “circuits” – sparse, mono or multi layer sub computations, that may reflect human understandable functions. Yet, these network circuits are rarely acid-tested for their stability across different instances…

February 20, 2026

Real-time Secondary Crash Likelihood Prediction Excluding Post Primary Crash Features

arXiv:2602.16739v1 Announce Type: new Abstract: Secondary crash likelihood prediction is a critical component of an active traffic management system to mitigate congestion and adverse impacts caused by secondary crashes. However, existing approaches mainly rely on post-crash features (e.g., crash type…

February 20, 2026

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

arXiv:2602.16953v1 Announce Type: cross Abstract: Execution-aware LLM agents offer a promising paradigm for learning from tool feedback, but such feedback is often expensive and slow to obtain, making online reinforcement learning (RL) impractical. High-coverage hardware verification exemplifies this challenge due…

February 20, 2026

Attending to Routers Aids Indoor Wireless Localization

arXiv:2602.16762v1 Announce Type: new Abstract: Modern machine learning-based wireless localization using Wi-Fi signals continues to face significant challenges in achieving groundbreaking performance across diverse environments. A major limitation is that most existing algorithms do not appropriately weight the information from…

February 20, 2026

Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization

arXiv:2602.17098v1 Announce Type: cross Abstract: Portfolio Management is the process of overseeing a group of investments, referred to as a portfolio, with the objective of achieving predetermined investment goals. Portfolio optimization is a key component that involves allocating the portfolio…

February 20, 2026

Machine Learning Argument of Latitude Error Model for LEO Satellite Orbit and Covariance Correction

arXiv:2602.16764v1 Announce Type: new Abstract: Low Earth orbit (LEO) satellites are leveraged to support new position, navigation, and timing (PNT) service alternatives to GNSS. These alternatives require accurate propagation of satellite position and velocity with a realistic quantification of uncertainty.…

February 20, 2026

genriesz: A Python Package for Automatic Debiased Machine Learning with Generalized Riesz Regression

arXiv:2602.17543v1 Announce Type: cross Abstract: Efficient estimation of causal and structural parameters can be automated using the Riesz representation theorem and debiased machine learning (DML). We present genriesz, an open-source Python package that implements automatic DML and generalized Riesz regression,…

February 20, 2026

Omitted Variable Bias in Language Models Under Distribution Shift

arXiv:2602.16784v1 Announce Type: new Abstract: Despite their impressive performance on a wide variety of tasks, modern language models remain susceptible to distribution shifts, exhibiting brittle behavior when evaluated on data that differs in distribution from their training data. In this…

February 20, 2026

Defining and Evaluating Physical Safety for Large Language Models

arXiv:2411.02317v2 Announce Type: replace Abstract: Large Language Models (LLMs) are increasingly used to control robotic systems such as drones, but their risks of causing physical threats and harm in real-world applications remain unexplored. Our study addresses the critical gap in…

February 20, 2026