AI News Archives

SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning

arXiv:2505.16368v4 Announce Type: replace Abstract: How to design reinforcement learning (RL) tasks that effectively unleash the reasoning capability of large language models (LLMs) remains an open question. Existing RL tasks (e.g., math, programming, and constructing reasoning tasks) suffer from three…
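The appeal of SAT as an RL task is that instances can be generated at controllable scale and rewards are mechanically verifiable. A minimal sketch of that idea in Python (the generator and reward function names are illustrative, not from the paper):

```python
import random

def random_3sat(num_vars, num_clauses, seed=0):
    """Generate a random 3-SAT instance as a list of clauses.
    Each clause is a tuple of non-zero ints: +i means variable i,
    -i means its negation (DIMACS-style encoding)."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(num_clauses):
        chosen = rng.sample(range(1, num_vars + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return clauses

def sat_reward(clauses, assignment):
    """Binary, automatically verifiable reward: 1.0 if the proposed
    assignment satisfies every clause, else 0.0.
    `assignment` maps variable index -> bool."""
    for clause in clauses:
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return 0.0
    return 1.0

clauses = random_3sat(num_vars=5, num_clauses=10)
reward = sat_reward(clauses, {i: True for i in range(1, 6)})
```

Scaling `num_vars` and the clause-to-variable ratio gives a natural difficulty knob, which is presumably what makes SAT attractive as an RL curriculum.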

Dynamic Multi-period Experts for Online Time Series Forecasting

arXiv:2603.09062v1 Announce Type: new Abstract: Online Time Series Forecasting (OTSF) requires models to continuously adapt to concept drift. However, existing methods often treat concept drift as a monolithic phenomenon. To address this limitation, we first redefine concept drift by categorizing…
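One common way to realize period-specific experts under drift is a Hedge-style online ensemble that re-weights experts by recent error. This is a generic sketch of that pattern, not necessarily the paper's method; the class and expert definitions are assumptions:

```python
import numpy as np

class MultiPeriodEnsemble:
    """Hedge-style online combination of period-specific forecasters.
    Illustrates per-period experts adapting to drift; the paper's
    actual expert design and update rule may differ."""
    def __init__(self, experts, eta=0.5):
        self.experts = experts          # callables: history -> forecast
        self.weights = np.ones(len(experts)) / len(experts)
        self.eta = eta                  # learning rate for weight updates

    def forecast(self, history):
        preds = np.array([e(history) for e in self.experts])
        return float(self.weights @ preds), preds

    def update(self, preds, actual):
        # Exponentially down-weight experts with large squared error.
        losses = (preds - actual) ** 2
        self.weights *= np.exp(-self.eta * losses)
        self.weights /= self.weights.sum()

# Hypothetical experts keyed to different seasonal periods.
experts = [lambda h, p=p: float(np.mean(h[-p:])) for p in (4, 12, 24)]
model = MultiPeriodEnsemble(experts)

history = list(np.sin(0.2 * np.arange(100)))
pred, per_expert = model.forecast(history)
model.update(per_expert, actual=history[-1])
```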

Multimodal LLM-assisted Evolutionary Search for Programmatic Control Policies

arXiv:2508.05433v3 Announce Type: replace Abstract: Deep reinforcement learning has achieved impressive success in control tasks. However, its policies, represented as opaque neural networks, are often difficult for humans to understand, verify, and debug, which undermines trust and hinders real-world deployment.…
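The core loop such work implies is evolutionary search over policy source code with an LLM acting as the mutation operator. A minimal sketch, assuming stub `evaluate` and `llm_mutate` callables (both hypothetical stand-ins, not the paper's API):

```python
def evolve_policy(seed_program, evaluate, llm_mutate, generations=20, pop_size=8):
    """Simple (1+lambda)-style evolutionary loop where an LLM proposes
    program mutations. `evaluate` runs the candidate program in the
    control environment and returns a fitness score; `llm_mutate` is a
    stub for a model call that rewrites the policy source code."""
    best, best_fit = seed_program, evaluate(seed_program)
    for _ in range(generations):
        candidates = [llm_mutate(best) for _ in range(pop_size)]
        for cand in candidates:
            fit = evaluate(cand)
            if fit > best_fit:  # keep interpretable programs that improve return
                best, best_fit = cand, fit
    return best, best_fit

# Stub wiring for illustration only:
best, fit = evolve_policy("if obs[0] > 0: act = 1",
                          evaluate=lambda prog: -len(prog),   # placeholder fitness
                          llm_mutate=lambda prog: prog + " ")
```

Because the evolved artifact is plain source code rather than network weights, each candidate can be read, verified, and debugged by a human, which is the motivation the abstract gives.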

Learning Adaptive LLM Decoding

arXiv:2603.09065v1 Announce Type: new Abstract: Decoding from large language models (LLMs) typically relies on fixed sampling hyperparameters (e.g., temperature, top-p), despite substantial variation in task difficulty and uncertainty across prompts and individual decoding steps. We propose to learn adaptive decoding…
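The point of leverage is that sampling hyperparameters become per-step outputs of a controller rather than global constants. A sketch of where such a learned policy would plug into nucleus sampling (the `controller` here is an arbitrary callable standing in for the learned mapping, which the paper trains; the plumbing below is mine):

```python
import numpy as np

def adaptive_sample(logits, controller):
    """One decoding step with per-step hyperparameters. `controller`
    maps the current logits to (temperature, top_p); the paper learns
    this mapping, while this sketch only shows the insertion point."""
    temperature, top_p = controller(logits)
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))   # stable softmax
    probs /= probs.sum()
    # Nucleus (top-p) truncation over the sorted distribution.
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cum, top_p)) + 1]
    masked = np.zeros_like(probs)
    masked[keep] = probs[keep]
    masked /= masked.sum()
    return int(np.random.choice(len(probs), p=masked))

# Fixed-hyperparameter baseline for comparison:
token = adaptive_sample(np.random.randn(50), controller=lambda lg: (0.7, 0.9))
```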

REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression

arXiv:2510.13999v2 Announce Type: replace Abstract: Sparsely-activated Mixture-of-Experts (SMoE) models offer efficient pre-training and low latency but their large parameter counts create significant memory overhead, motivating research into expert compression. Contrary to recent findings favouring expert merging on discriminative benchmarks, we…
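One-shot expert pruning generally reduces to scoring experts on a calibration set and dropping the lowest-scoring ones without retraining. A sketch using total routed probability mass as the saliency score (a generic proxy; REAP's actual criterion may differ):

```python
import numpy as np

def prune_experts(router_probs, keep_ratio=0.5):
    """One-shot expert pruning from calibration statistics.
    `router_probs` has shape (num_tokens, num_experts): gate
    probabilities observed on a calibration set. Experts are ranked
    by total routed mass and the lowest-scoring are dropped."""
    scores = router_probs.sum(axis=0)
    num_keep = max(1, int(keep_ratio * router_probs.shape[1]))
    kept = np.sort(np.argsort(scores)[::-1][:num_keep])
    return kept  # indices of experts to retain; the router is re-normalized over these

# Example: 1000 calibration tokens routed over 8 experts.
probs = np.random.dirichlet(np.ones(8), size=1000)
kept_experts = prune_experts(probs, keep_ratio=0.5)
```

Unlike merging, pruning leaves the surviving experts' weights untouched, which is the property the abstract's title credits for its one-shot robustness.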