Archives AI News

Beyond Introspection: Reinforcing Thinking via Externalist Behavioral Feedback

arXiv:2501.01457v3 Announce Type: replace Abstract: While inference-time thinking allows Large Language Models (LLMs) to address complex problems, the extended thinking process can be unreliable or inconsistent because of the model’s probabilistic nature, especially near its knowledge boundaries. Existing approaches attempt…

December 1, 2025

Spatio-Temporal Hierarchical Causal Models

arXiv:2511.20558v2 Announce Type: replace-cross Abstract: The abundance of fine-grained spatio-temporal data, such as traffic sensor networks, offers vast opportunities for scientific discovery. However, inferring causal relationships from such observational data remains challenging, particularly due to unobserved confounders that are specific…

December 1, 2025

Language-conditioned world model improves policy generalization by reading environmental descriptions

arXiv:2511.22904v1 Announce Type: cross Abstract: To interact effectively with humans in the real world, it is important for agents to understand language that describes the dynamics of the environment (that is, how the environment behaves) rather than just task instructions specifying “what…

December 1, 2025

Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model

arXiv:2511.20798v2 Announce Type: replace Abstract: Recent advances in mechanistic interpretability have revealed that large language models (LLMs) develop internal representations corresponding not only to concrete entities but also distinct, human-understandable abstract concepts and behaviour. Moreover, these hidden features can be…

December 1, 2025

Gradient-Based Program Repair: Fixing Bugs in Continuous Program Spaces

arXiv:2505.17703v2 Announce Type: replace-cross Abstract: Automatic program repair seeks to generate correct code from buggy programs, with most approaches searching for the correct program in a discrete, symbolic space of source code tokens. This symbolic search is fundamentally limited by its…

December 1, 2025

Lightweight ML-Based Air Quality Prediction for IoT and Embedded Applications

arXiv:2511.21857v1 Announce Type: new Abstract: This study investigates the effectiveness and efficiency of two variants of the XGBoost regression model, the full-capacity and lightweight (tiny) versions, for predicting the concentrations of carbon monoxide (CO) and nitrogen dioxide (NO2). Using the…

December 1, 2025

Unsupervised Anomaly Detection for Smart IoT Devices: Performance and Resource Comparison

arXiv:2511.21842v1 Announce Type: new Abstract: The rapid expansion of Internet of Things (IoT) deployments across diverse sectors has significantly enhanced operational efficiency, yet concurrently elevated cybersecurity vulnerabilities due to increased exposure to cyber threats. Given the limitations of traditional signature-based…

December 1, 2025

Massively Parallel Imitation Learning of Mouse Forelimb Musculoskeletal Reaching Dynamics

arXiv:2511.21848v1 Announce Type: new Abstract: The brain has evolved to effectively control the body, and in order to understand the relationship we need to model the sensorimotor transformations underlying embodied control. As part of a coordinated effort, we are developing…

December 1, 2025

The Double-Edged Nature of the Rashomon Set for Trustworthy Machine Learning

arXiv:2511.21799v1 Announce Type: new Abstract: Real-world machine learning (ML) pipelines rarely produce a single model; instead, they produce a Rashomon set of many near-optimal ones. We show that this multiplicity reshapes key aspects of trustworthiness. At the individual-model level, sparse…

December 1, 2025

Multiclass threshold-based classification and model evaluation

arXiv:2511.21794v1 Announce Type: new Abstract: In this paper, we introduce a threshold-based framework for multiclass classification that generalizes the standard argmax rule. This is done by replacing the probabilistic interpretation of softmax outputs with a geometric one on the multidimensional…

December 1, 2025