Archives AI News

Large Language Models Hack Rewards, and Society

arXiv:2606.04075v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are structurally similar to reward functions. They define measurable outcomes, thresholds, and…

June 4, 2026

PerchRL: Vision-Based Agile Perching on Inclined Platforms under Rapid and Irregular Motion

arXiv:2606.03441v2 Announce Type: replace-cross Abstract: Autonomous vision-based perching of quadrotors on moving inclined platforms is critical for air-ground collaboration but remains challenging due to the limited field of view (FOV). In this paper, we propose PerchRL, a reinforcement learning (RL)…

June 4, 2026

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

arXiv:2606.04164v1 Announce Type: new Abstract: Data samples used for training often differ from those encountered during fine-tuning and deployment, and while ML models show promise, their performance remains limited when only small annotated datasets are available. Performance often degrades under…

June 4, 2026

Neetyabhas: A Framework for Uncertainty-Aware Public Policy Optimization in Rational Agent-Based Models

arXiv:2606.04562v1 Announce Type: cross Abstract: Purpose: The WHO’s COVID-19 non-pharmaceutical interventions (e.g., lockdowns, vaccinations) effectively curb transmission but impose heavy economic strains. Existing research often neglects individual behaviors and falsely assumes perfect infection tracking and flawless policy execution, failing to…

June 4, 2026

Formal Semantics for Agentic Tool Protocols: A Process Calculus Approach

arXiv:2603.24747v2 Announce Type: replace Abstract: The emergence of large language model agents capable of invoking external tools has created an urgent need for formal verification of agent protocols. Two paradigms dominate this space: Schema-Guided Dialogue (SGD), a research framework for zero-shot…

June 4, 2026

Bayesian learning for the stochastic shortest path problem

arXiv:2606.04845v1 Announce Type: cross Abstract: Sequential decision-making problems are often modelled as a Markov decision process (MDP). We focus on the stochastic shortest path (SSP) problem, which is an infinite-horizon undiscounted MDP with absorbing terminal states. We develop a Bayesian…

June 4, 2026

Explaining a probabilistic prediction on the simplex with Shapley compositions

arXiv:2408.01382v3 Announce Type: replace Abstract: Originating in game theory, Shapley values are widely used for explaining a machine learning model’s prediction by quantifying the contribution of each feature’s value to the prediction. This requires a scalar prediction as in binary…

June 4, 2026

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

arXiv:2605.29076v2 Announce Type: replace-cross Abstract: LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label-only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization offers human-readable…

June 4, 2026

Beyond Symmetric Alignment: Spectral Diagnostics of Modality Imbalance in Vision-Language Models in the Medical Domain

arXiv:2606.04613v1 Announce Type: cross Abstract: Vision-Language Models (VLMs) struggle when applied to medical image-text data, yet the tools available to diagnose this failure remain limited. Existing representation alignment metrics are symmetric, collapsing both modalities into a single score and hiding…

June 4, 2026

The Perception-Physics Paradox: Probing Scientific Alignment with TC-Bench

arXiv:2605.24782v2 Announce Type: replace Abstract: While Vision Foundation Models (VFMs) excel at predictive tasks on satellite imagery, their performance can arise from visual correlations rather than underlying structural invariants, making even perception-based out-of-distribution accuracy a poor proxy for scientific utility.…

June 4, 2026