Archives AI News

reward-lens: A Mechanistic Interpretability Library for Reward Models

arXiv:2604.26130v1 Announce Type: new Abstract: Every RLHF-trained language model is shaped by a reward model, yet the mechanistic interpretability toolkit — logit lens, direct logit attribution, activation patching, sparse autoencoders — was built for generative LLMs whose primitives all project…

Deep neural networks with ReLU, leaky ReLU, and softplus activation provably overcome the curse of dimensionality for Kolmogorov partial differential equations with Lipschitz nonlinearities in the $L^p$-sense

arXiv:2309.13722v3 Announce Type: replace-cross Abstract: Recently, several deep learning (DL) methods for approximating high-dimensional partial differential equations (PDEs) have been proposed. The interest that these methods have generated in the literature is in large part due to simulations which appear…

Efficient Zero-Shot Inpainting with Decoupled Diffusion Guidance

arXiv:2512.18365v2 Announce Type: replace-cross Abstract: Diffusion models have emerged as powerful priors for image editing tasks such as inpainting and local modification, where the objective is to generate realistic content that remains consistent with observed regions. In particular, zero-shot approaches…

Entropy Centroids as Intrinsic Rewards for Test-Time Scaling

arXiv:2604.26173v1 Announce Type: new Abstract: An effective way to scale up test-time compute of large language models is to sample multiple responses and then select the best one, as in Grok Heavy and Gemini Deep Think. Existing selection methods often…

Domain-Adapted Small Language Models for Reliable Clinical Triage

arXiv:2604.26766v1 Announce Type: cross Abstract: Accurate and consistent Emergency Severity Index (ESI) assignment remains a persistent challenge in emergency departments, where highly variable free-text triage documentation contributes to mistriage and workflow inefficiencies. This study evaluates whether open-source small language models…

SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations

arXiv:2604.26181v1 Announce Type: new Abstract: Multimodal deep neural networks deployed in realistic environments must contend with runtime variations: changes in modality quality, overall input complexity, and available platform resources. Current networks struggle with such fluctuations — adaptive networks cannot adhere…

ClawGym: A Scalable Framework for Building Effective Claw Agents

arXiv:2604.26904v1 Announce Type: cross Abstract: Claw-style environments support multi-step workflows over local files, tools, and persistent workspace states. However, scalable development around these environments remains constrained by the absence of a systematic framework, especially one for synthesizing verifiable training data…