Archives AI News

SABER: Small Actions, Big Errors – Safeguarding Mutating Steps in LLM Agents

arXiv:2512.07850v1 Announce Type: new Abstract: Despite rapid progress in LLM agents, performance on long-horizon, tool-using tasks remains fragile. To better understand this fragility, we ask a simple question: do all actions contribute equally to failure? Analyzing execution traces on $\tau$-Bench…

December 10, 2025

RaX-Crash: A Resource Efficient and Explainable Small Model Pipeline with an Application to City Scale Injury Severity Prediction

arXiv:2512.07848v1 Announce Type: new Abstract: New York City reports over one hundred thousand motor vehicle collisions each year, creating substantial injury and public health burden. We present RaX-Crash, a resource efficient and explainable small model pipeline for structured injury severity…

December 10, 2025

CarBench: A Comprehensive Benchmark for Neural Surrogates on High-Fidelity 3D Car Aerodynamics

arXiv:2512.07847v1 Announce Type: new Abstract: Benchmarking has been the cornerstone of progress in computer vision, natural language processing, and the broader deep learning domain, driving algorithmic innovation through standardized datasets and reproducible evaluation protocols. The growing availability of large-scale Computational…

December 10, 2025

Heuristics for Combinatorial Optimization via Value-based Reinforcement Learning: A Unified Framework and Analysis

arXiv:2512.08601v1 Announce Type: cross Abstract: Since the 1990s, considerable empirical work has been carried out to train statistical models, such as neural networks (NNs), as learned heuristics for combinatorial optimization (CO) problems. When successful, such an approach eliminates the need…

December 10, 2025

SA^2GFM: Enhancing Robust Graph Foundation Models with Structure-Aware Semantic Augmentation

arXiv:2512.07857v1 Announce Type: new Abstract: Graph Foundation Models (GFMs) have made significant progress in various tasks, but their robustness against domain noise, structural perturbations, and adversarial attacks remains underexplored. A key limitation is the insufficient modeling of…

December 10, 2025

Multicalibration for LLM-based Code Generation

arXiv:2512.08810v1 Announce Type: cross Abstract: As AI-based code generation becomes widespread, researchers are investigating the calibration of code LLMs – ensuring their confidence scores faithfully represent the true likelihood of code correctness. To do so, we investigate multicalibration, which can…

December 10, 2025

FAIM: Frequency-Aware Interactive Mamba for Time Series Classification

arXiv:2512.07858v1 Announce Type: new Abstract: Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition. TSC tasks require models to effectively capture discriminative information for accurate class identification. Although deep learning…

December 10, 2025

Score-based Conditional Out-of-Distribution Augmentation for Graph Covariate Shift

arXiv:2410.17506v2 Announce Type: replace Abstract: Distribution shifts between training and testing datasets significantly impair the model performance on graph learning. A commonly-taken causal view in graph invariant learning suggests that stable predictive features of graphs are causally associated with labels,…

December 10, 2025

SetAD: Semi-Supervised Anomaly Learning in Contextual Sets

arXiv:2512.07863v1 Announce Type: new Abstract: Semi-supervised anomaly detection (AD) has shown great promise by effectively leveraging limited labeled data. However, existing methods are typically structured around scoring individual points or simple pairs. Such a point- or pair-centric view not only overlooks…

December 10, 2025

Schauder Bases for $C[0, 1]$ Using ReLU, Softplus and Two Sigmoidal Functions

arXiv:2506.07884v2 Announce Type: replace Abstract: We construct four Schauder bases for the space $C[0,1]$, one using ReLU functions, another using Softplus functions, and two more using sigmoidal versions of the ReLU and Softplus functions. This establishes the existence of a…

December 10, 2025