Archives AI News

A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity

arXiv:2606.00129v1 Announce Type: new Abstract: Large language models (LLMs) have emerged as powerful representation learners whose internal features increasingly align with human cognition. We study whether modern LLMs can serve as a lens for understanding neural representations in the human…

June 2, 2026

scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns

arXiv:2603.17893v2 Announce Type: replace-cross Abstract: Methodology bugs in scientific Python code produce plausible but incorrect results that traditional linters and static analysis tools cannot detect. Several research groups have built ML-specific linters, demonstrating that detection is feasible. Yet these tools…

June 2, 2026

From Demonstrations to Rewards: Test-Time Prompt Optimization for VLM Reward Models

arXiv:2606.00083v1 Announce Type: new Abstract: Reinforcement learning relies on accurate reward functions, which are often hand-crafted or even unavailable in real-world applications, such as robotics. Recent work has explored the zero-shot reasoning capabilities of pre-trained Vision-Language Models (VLMs) as reward…

June 2, 2026

A Direct Approach for Handling Contextual Bandits with Latent State Dynamics

arXiv:2604.08149v2 Announce Type: replace Abstract: We consider a linear contextual bandit model where contexts and rewards are governed by a finite hidden Markov chain. We first revisit the simplified model by Nelson et al. (2022), in which rewards are linear…

June 2, 2026

Hoeffding Concept Bottleneck Models with Applications to Overhead Images

arXiv:2606.00082v1 Announce Type: new Abstract: Explainability of deep learning algorithms is critical for computer-vision applications with high-stakes decisions. Concept bottleneck models (CBM) have recently shown promising performance to provide explainable and accurate predictions for classification problems, based on a bottleneck…

June 2, 2026

Honest Lying: Understanding Memory Confabulation in Reflexive Agents

arXiv:2605.29463v2 Announce Type: replace Abstract: Reflexion-style agents rely on self-generated reflections as memory, implicitly assuming that agents can accurately diagnose their own failures. We show that this assumption can fail systematically: across ALFWorld and HumanEval, agents store confident but incorrect…

June 2, 2026

Fundamental bounds on efficiency-confidence trade-off for transductive conformal prediction

arXiv:2509.04631v2 Announce Type: replace Abstract: Transductive conformal prediction addresses the simultaneous prediction for multiple data points. Given a desired confidence level, the objective is to construct a prediction set that includes the true outcomes with the prescribed confidence. We demonstrate…

June 2, 2026

From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures

arXiv:2602.04861v2 Announce Type: replace Abstract: Machine Learning Interatomic Potentials (MLIPs) sometimes fail to reproduce the physical smoothness of the quantum potential energy surface (PES), leading to erroneous behavior in downstream simulations that standard energy and force regression evaluations can miss.…

June 2, 2026

Introduction to Graph Neural Networks for Machine Learning Engineers

arXiv:2412.19419v2 Announce Type: replace Abstract: Graph neural networks are deep neural networks designed for graphs with attributes attached to nodes or edges. The number of research papers in the literature concerning these models is growing rapidly due to their impressive…

June 2, 2026

Geometric Erasure by Contrastive Velocity Matching in Rectified Flows

arXiv:2606.00140v1 Announce Type: new Abstract: While the rapid adoption of multimodal generative models offers immense potential, it has also increased the risks of harmful content synthesis, deepfakes, and copyright infringements. To address these challenges, concept erasure has emerged as a…

June 2, 2026