Archives AI News

Exploring Pass-Rate Reward in Reinforcement Learning for Code Generation

arXiv:2605.02944v1 Announce Type: new Abstract: Reinforcement learning (RL) from unit-test feedback has become a standard post-training recipe for improving large language models (LLMs) on code generation. However, the pass-all-tests binary reward can be sparse, yielding no learning signal on challenging…

May 6, 2026

Personalized Worked Example Generation from Student Code Submissions Using Pattern-based Knowledge Components

arXiv:2604.24758v2 Announce Type: replace-cross Abstract: Adaptive programming practice often relies on fixed libraries of worked examples and practice problems, which require substantial authoring effort and may not correspond well to the logical errors and partial solutions students produce while writing…

May 6, 2026

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs

arXiv:2605.02946v1 Announce Type: new Abstract: Safety alignment is critical for the responsible deployment of large language models (LLMs). As Mixture-of-Experts (MoE) architectures are increasingly adopted to scale model capacity, understanding their safety robustness becomes essential. Existing adversarial attacks, however, have…

May 6, 2026

GRPO-TTA: Test-Time Visual Tuning for Vision-Language Models via GRPO-Driven Reinforcement Learning

arXiv:2605.03403v1 Announce Type: cross Abstract: Group Relative Policy Optimization (GRPO) has recently shown strong performance in post-training large language models and vision-language models. This raises the question of whether GRPO also significantly improves the test-time adaptation (TTA) of vision…

May 6, 2026

Method for stress-testing cloud computing algorithms helps avoid network failures

The “MetaEase” technique gives a heads-up about potential scenarios that could cause long wait times or outages.

May 6, 2026

Fisher Decorator: Refining Flow Policy via a Local Transport Map

arXiv:2604.17919v2 Announce Type: replace Abstract: Recent advances in flow-based offline reinforcement learning (RL) have achieved strong performance by parameterizing policies via flow matching. However, they still face critical trade-offs among expressiveness, optimality, and efficiency. In particular, existing flow policies interpret…

May 6, 2026

InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy

arXiv:2507.02974v3 Announce Type: replace Abstract: As major progress in LLM-based long-form text generation enables paradigms such as retrieval-augmented generation (RAG) and inference-time scaling, safely incorporating private information into the generation remains a critical open question. We present InvisibleInk, a highly…

May 6, 2026

Learning a Stochastic Differential Equation Model of Tropical Cyclone Intensification from Reanalysis and Observational Data

arXiv:2601.08116v3 Announce Type: replace Abstract: Tropical cyclones are among the most consequential weather hazards, yet estimates of their risk are limited by the relatively short historical record. To extend these records, researchers often generate large ensembles of synthetic storms using…

May 6, 2026

Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification

arXiv:2605.03710v1 Announce Type: cross Abstract: Bayesian predictive inference propagates parameter uncertainty to quantities of interest through the posterior-predictive distribution. In practice, this is typically performed using a two-stage procedure: first approximating the posterior distribution of model parameters, and then propagating…

May 6, 2026

Label-Efficient School Detection from Aerial Imagery via Weakly Supervised Pretraining and Fine-Tuning

arXiv:2605.03968v1 Announce Type: cross Abstract: Accurate school detection is essential for supporting education initiatives, including infrastructure planning and expanding internet connectivity to underserved areas. However, many regions around the world face challenges due to outdated, incomplete, or unavailable official records.…

May 6, 2026