AI Governance under Political Turnover: The Alignment Surface of Compliance Design

arXiv:2604.21103v1 Announce Type: new Abstract: Governments are increasingly interested in using AI to make administrative decisions cheaper, more scalable, and more consistent. But for probabilistic AI to be incorporated into public administration it must be embedded in a compliance layer…

April 25, 2026

ReactBench: A Benchmark for Topological Reasoning in MLLMs on Chemical Reaction Diagrams

arXiv:2604.15994v2 Announce Type: replace Abstract: Multimodal Large Language Models (MLLMs) excel at recognizing individual visual elements and reasoning over simple linear diagrams. However, when faced with complex topological structures involving branching paths, converging flows, and cyclic dependencies, their reasoning capabilities…

April 25, 2026

Counterfactual Segmentation Reasoning: Diagnosing and Mitigating Pixel-Grounding Hallucination

arXiv:2506.21546v4 Announce Type: replace-cross Abstract: Segmentation Vision-Language Models (VLMs) have significantly advanced grounded visual understanding, yet they remain prone to pixel-grounding hallucinations, producing masks for incorrect objects or for objects that are entirely absent. Existing evaluations rely almost entirely on…

April 25, 2026

Replay-buffer engineering for noise-robust quantum circuit optimization

arXiv:2604.21863v1 Announce Type: cross Abstract: Deep reinforcement learning (RL) for quantum circuit optimization faces three fundamental bottlenecks: replay buffers that ignore the reliability of temporal-difference (TD) targets, curriculum-based architecture search that triggers a full quantum-classical evaluation at every environment step,…

April 25, 2026

Speculative Actions: A Lossless Framework for Faster Agentic Systems

arXiv:2510.04371v2 Announce Type: replace Abstract: AI agents are increasingly deployed in complex, interactive environments, yet their runtime remains a major bottleneck for training, evaluation, and real-world use. Typical agent behavior unfolds sequentially, with each action requiring an API call that…

April 25, 2026

MISTY: High-Throughput Motion Planning via Mixer-based Single-step Drifting

arXiv:2604.21489v1 Announce Type: cross Abstract: Multi-modal trajectory generation is essential for safe autonomous driving, yet existing diffusion-based planners suffer from high inference latency due to iterative neural function evaluations. This paper presents MISTY (Mixer-based Inference for Single-step Trajectory-drifting Yield), a…

April 25, 2026

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

arXiv:2604.21027v1 Announce Type: new Abstract: Electronic health record (EHR) question answering is often handled by LLM-based pipelines that are costly to deploy and do not explicitly leverage the hierarchical structure of clinical data. Motivated by evidence that medical ontologies and…

April 25, 2026

Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework

arXiv:2603.18677v2 Announce Type: replace-cross Abstract: Artificial intelligence is increasingly embedded in human decision making. In some cases, it enhances human reasoning. In others, it fosters excessive cognitive dependence. This paper introduces a conceptual and mathematical framework to distinguish cognitive amplification,…

April 25, 2026

Deep FinResearch Bench: Evaluating AI’s Ability to Conduct Professional Financial Investment Research

arXiv:2604.21006v1 Announce Type: new Abstract: We introduce Deep FinResearch Bench, a practical and comprehensive evaluation framework for deep research (DR) agents in financial investment research. The benchmark assesses three dimensions of report quality: qualitative rigor, quantitative forecasting and valuation accuracy,…

April 25, 2026

Adaptive Test-Time Compute Allocation with Evolving In-Context Demonstrations

arXiv:2604.21018v1 Announce Type: new Abstract: While scaling test-time compute can substantially improve model performance, existing approaches either rely on static compute allocation or sample from fixed generation distributions. In this work, we introduce a test-time compute allocation framework that jointly…

April 25, 2026