Archives AI News

Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models

arXiv:2603.05773v2 Announce Type: replace-cross Abstract: Safety alignment is often conceptualized as a monolithic process wherein harmfulness detection automatically triggers refusal. However, the persistence of jailbreak attacks suggests a fundamental mechanistic decoupling. We propose the Disentangled Safety Hypothesis (DSH), positing that…

March 16, 2026

Re2: A Consistency-ensured Dataset for Full-stage Peer Review and Multi-turn Rebuttal Discussions

arXiv:2505.07920v2 Announce Type: replace-cross Abstract: Peer review is a critical component of scientific progress in fields like AI, but the rapid increase in submission volume has strained the reviewing system, which inevitably leads to reviewer shortages and declines review…

March 16, 2026

Scaling Generalist Data-Analytic Agents

arXiv:2509.25084v3 Announce Type: replace-cross Abstract: Data-analytic agents are emerging as a key catalyst for automated scientific discovery and for the vision of Innovating AI. Current approaches, however, rely heavily on prompt engineering over proprietary models, while open-source models struggle to…

March 16, 2026

Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis

arXiv:2510.03366v2 Announce Type: replace Abstract: Transformer-based language models excel at both recall (retrieving memorized facts) and reasoning (performing multi-step inference), but whether these abilities rely on distinct internal mechanisms remains unclear. Distinguishing recall from reasoning is crucial for predicting model…

March 16, 2026

On the Geometric Coherence of Global Aggregation in Federated Graph Neural Networks

arXiv:2602.15510v2 Announce Type: replace Abstract: Federated Learning (FL) enables distributed training across multiple clients without centralized data sharing, while Graph Neural Networks (GNNs) model relational data through message passing. In federated GNN settings, client graphs often exhibit heterogeneous structural and…

March 16, 2026

SortScrews: A Dataset and Baseline for Real-time Screw Classification

arXiv:2603.13027v1 Announce Type: cross Abstract: Automatic identification of screw types is important for industrial automation, robotics, and inventory management. However, publicly available datasets for screw classification are scarce, particularly for controlled single-object scenarios commonly encountered in automated sorting systems. In…

March 16, 2026

Dual Filter: A Transformer-like Inference Architecture for Hidden Markov Models

arXiv:2505.00818v2 Announce Type: replace Abstract: This paper presents a mathematical framework for causal nonlinear prediction in settings where observations are generated from an underlying hidden Markov model (HMM). Both the problem formulation and the proposed solution are motivated by the…

March 16, 2026

Thermodynamics of Reinforcement Learning Curricula

arXiv:2603.12324v1 Announce Type: new Abstract: Connections between statistical mechanics and machine learning have repeatedly proven fruitful, providing insight into optimization, generalization, and representation learning. In this work, we follow this tradition by leveraging results from non-equilibrium thermodynamics to formalize curriculum…

March 16, 2026

Maximum Entropy Exploration Without the Rollouts

arXiv:2603.12325v1 Announce Type: new Abstract: Efficient exploration remains a central challenge in reinforcement learning, serving as a useful pretraining objective for data collection, particularly when an external reward function is unavailable. A principled formulation of the exploration problem is to…

March 16, 2026

A Geometrically-Grounded Drive for MDL-Based Optimization in Deep Learning

arXiv:2603.12304v1 Announce Type: new Abstract: This paper introduces a novel optimization framework that fundamentally integrates the Minimum Description Length (MDL) principle into the training dynamics of deep neural networks. Moving beyond its conventional role as a model selection criterion, we…

March 16, 2026