Archives AI News

Large Language Models Hack Rewards, and Society

arXiv:2606.04075v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are structurally similar to reward functions. They define measurable outcomes, thresholds, and…

Formal Semantics for Agentic Tool Protocols: A Process Calculus Approach

arXiv:2603.24747v2 Announce Type: replace Abstract: The emergence of large language model agents capable of invoking external tools has created an urgent need for formal verification of agent protocols. Two paradigms dominate this space: Schema-Guided Dialogue (SGD), a research framework for zero-shot…

Bayesian learning for the stochastic shortest path problem

arXiv:2606.04845v1 Announce Type: cross Abstract: Sequential decision-making problems are often modelled as a Markov decision process (MDP). We focus on the stochastic shortest path (SSP) problem, which is an infinite-horizon undiscounted MDP with absorbing terminal states. We develop a Bayesian…

The Perception-Physics Paradox: Probing Scientific Alignment with TC-Bench

arXiv:2605.24782v2 Announce Type: replace Abstract: While Vision Foundation Models (VFMs) excel at predictive tasks on satellite imagery, their performance can arise from visual correlations rather than underlying structural invariants, making even perception-based out-of-distribution accuracy a poor proxy for scientific utility.…