Archives AI News

reward-lens: A Mechanistic Interpretability Library for Reward Models

arXiv:2604.26130v1 Announce Type: new Abstract: Every RLHF-trained language model is shaped by a reward model, yet the mechanistic interpretability toolkit — logit lens, direct logit attribution, activation patching, sparse autoencoders — was built for generative LLMs whose primitives all project…

April 30, 2026

Deep neural networks with ReLU, leaky ReLU, and softplus activation provably overcome the curse of dimensionality for Kolmogorov partial differential equations with Lipschitz nonlinearities in the $L^p$-sense

arXiv:2309.13722v3 Announce Type: replace-cross Abstract: Recently, several deep learning (DL) methods for approximating high-dimensional partial differential equations (PDEs) have been proposed. The interest that these methods have generated in the literature is in large part due to simulations which appear…

April 30, 2026

Spatially-constrained clustering of geospatial features for heat vulnerability assessment of favelas in Rio de Janeiro

arXiv:2604.26133v1 Announce Type: new Abstract: Informal settlements face disproportionate exposure to climate-related health hazards. However, existing methodologies lack systematic approaches to link diverse settlement characteristics with environmental health outcomes. We develop a data-driven framework to assess heat vulnerability in Rio…

April 30, 2026

Efficient Zero-Shot Inpainting with Decoupled Diffusion Guidance

arXiv:2512.18365v2 Announce Type: replace-cross Abstract: Diffusion models have emerged as powerful priors for image editing tasks such as inpainting and local modification, where the objective is to generate realistic content that remains consistent with observed regions. In particular, zero-shot approaches…

April 30, 2026

Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making

arXiv:2604.26169v1 Announce Type: new Abstract: Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard approach follows a two-stage offline pipeline…

April 30, 2026

A Nonlinear Separation Principle via Contraction Theory: Applications to Neural Networks, Control, and Learning

arXiv:2604.15238v2 Announce Type: replace-cross Abstract: This paper establishes a nonlinear separation principle based on contraction theory and derives sharp stability conditions for recurrent neural networks (RNNs). First, we introduce a nonlinear separation principle that guarantees global exponential stability for the…

April 30, 2026

Entropy Centroids as Intrinsic Rewards for Test-Time Scaling

arXiv:2604.26173v1 Announce Type: new Abstract: An effective way to scale up test-time compute of large language models is to sample multiple responses and then select the best one, as in Grok Heavy and Gemini Deep Think. Existing selection methods often…

April 30, 2026

Domain-Adapted Small Language Models for Reliable Clinical Triage

arXiv:2604.26766v1 Announce Type: cross Abstract: Accurate and consistent Emergency Severity Index (ESI) assignment remains a persistent challenge in emergency departments, where highly variable free-text triage documentation contributes to mistriage and workflow inefficiencies. This study evaluates whether open-source small language models…

April 30, 2026

SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations

arXiv:2604.26181v1 Announce Type: new Abstract: Multimodal deep neural networks deployed in realistic environments must contend with runtime variations: changes in modality quality, overall input complexity, and available platform resources. Current networks struggle with such fluctuations — adaptive networks cannot adhere…

April 30, 2026

ClawGym: A Scalable Framework for Building Effective Claw Agents

arXiv:2604.26904v1 Announce Type: cross Abstract: Claw-style environments support multi-step workflows over local files, tools, and persistent workspace states. However, scalable development around these environments remains constrained by the absence of a systematic framework, especially one for synthesizing verifiable training data…

April 30, 2026