AI News Archives

SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning

arXiv:2505.16368v4 Announce Type: replace Abstract: How to design reinforcement learning (RL) tasks that effectively unleash the reasoning capability of large language models (LLMs) remains an open question. Existing RL tasks (e.g., math, programming, and constructing reasoning tasks) suffer from three…
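The appeal of SAT as an RL task is that instances can be generated at controllable scale and rewards are mechanically verifiable. A minimal sketch of that idea in Python (the generator and reward function names are illustrative, not from the paper):

```python
import random

def random_3sat(num_vars, num_clauses, seed=0):
    """Generate a random 3-SAT instance as a list of clauses.
    Each clause is a tuple of non-zero ints: +i means variable i,
    -i means its negation (DIMACS-style encoding)."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(num_clauses):
        chosen = rng.sample(range(1, num_vars + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return clauses

def sat_reward(clauses, assignment):
    """Binary, automatically verifiable reward: 1.0 if the proposed
    assignment satisfies every clause, else 0.0.
    `assignment` maps variable index -> bool."""
    for clause in clauses:
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return 0.0
    return 1.0

clauses = random_3sat(num_vars=5, num_clauses=10)
reward = sat_reward(clauses, {i: True for i in range(1, 6)})
```

Scaling `num_vars` and the clause-to-variable ratio gives a natural difficulty knob, which is presumably what makes SAT attractive as an RL curriculum.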

Dynamic Multi-period Experts for Online Time Series Forecasting

arXiv:2603.09062v1 Announce Type: new Abstract: Online Time Series Forecasting (OTSF) requires models to continuously adapt to concept drift. However, existing methods often treat concept drift as a monolithic phenomenon. To address this limitation, we first redefine concept drift by categorizing…
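One common way to realize period-specific experts under drift is a Hedge-style online ensemble that re-weights experts by recent error. This is a generic sketch of that pattern, not necessarily the paper's method; the class and expert definitions are assumptions:

```python
import numpy as np

class MultiPeriodEnsemble:
    """Hedge-style online combination of period-specific forecasters.
    Illustrates per-period experts adapting to drift; the paper's
    actual expert design and update rule may differ."""
    def __init__(self, experts, eta=0.5):
        self.experts = experts          # callables: history -> forecast
        self.weights = np.ones(len(experts)) / len(experts)
        self.eta = eta                  # learning rate for weight updates

    def forecast(self, history):
        preds = np.array([e(history) for e in self.experts])
        return float(self.weights @ preds), preds

    def update(self, preds, actual):
        # Exponentially down-weight experts with large squared error.
        losses = (preds - actual) ** 2
        self.weights *= np.exp(-self.eta * losses)
        self.weights /= self.weights.sum()

# Hypothetical experts keyed to different seasonal periods.
experts = [lambda h, p=p: float(np.mean(h[-p:])) for p in (4, 12, 24)]
model = MultiPeriodEnsemble(experts)

history = list(np.sin(0.2 * np.arange(100)))
pred, per_expert = model.forecast(history)
model.update(per_expert, actual=history[-1])
```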

Multimodal LLM-assisted Evolutionary Search for Programmatic Control Policies

arXiv:2508.05433v3 Announce Type: replace Abstract: Deep reinforcement learning has achieved impressive success in control tasks. However, its policies, represented as opaque neural networks, are often difficult for humans to understand, verify, and debug, which undermines trust and hinders real-world deployment.…
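The core loop such work implies is evolutionary search over policy source code with an LLM acting as the mutation operator. A minimal sketch, assuming stub `evaluate` and `llm_mutate` callables (both hypothetical stand-ins, not the paper's API):

```python
def evolve_policy(seed_program, evaluate, llm_mutate, generations=20, pop_size=8):
    """Simple (1+lambda)-style evolutionary loop where an LLM proposes
    program mutations. `evaluate` runs the candidate program in the
    control environment and returns a fitness score; `llm_mutate` is a
    stub for a model call that rewrites the policy source code."""
    best, best_fit = seed_program, evaluate(seed_program)
    for _ in range(generations):
        candidates = [llm_mutate(best) for _ in range(pop_size)]
        for cand in candidates:
            fit = evaluate(cand)
            if fit > best_fit:  # keep interpretable programs that improve return
                best, best_fit = cand, fit
    return best, best_fit

# Stub wiring for illustration only:
best, fit = evolve_policy("if obs[0] > 0: act = 1",
                          evaluate=lambda prog: -len(prog),   # placeholder fitness
                          llm_mutate=lambda prog: prog + " ")
```

Because the evolved artifact is plain source code rather than network weights, each candidate can be read, verified, and debugged by a human, which is the motivation the abstract gives.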

Learning Adaptive LLM Decoding

arXiv:2603.09065v1 Announce Type: new Abstract: Decoding from large language models (LLMs) typically relies on fixed sampling hyperparameters (e.g., temperature, top-p), despite substantial variation in task difficulty and uncertainty across prompts and individual decoding steps. We propose to learn adaptive decoding…
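The point of leverage is that sampling hyperparameters become per-step outputs of a controller rather than global constants. A sketch of where such a learned policy would plug into nucleus sampling (the `controller` here is an arbitrary callable standing in for the learned mapping, which the paper trains; the plumbing below is mine):

```python
import numpy as np

def adaptive_sample(logits, controller):
    """One decoding step with per-step hyperparameters. `controller`
    maps the current logits to (temperature, top_p); the paper learns
    this mapping, while this sketch only shows the insertion point."""
    temperature, top_p = controller(logits)
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))   # stable softmax
    probs /= probs.sum()
    # Nucleus (top-p) truncation over the sorted distribution.
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cum, top_p)) + 1]
    masked = np.zeros_like(probs)
    masked[keep] = probs[keep]
    masked /= masked.sum()
    return int(np.random.choice(len(probs), p=masked))

# Fixed-hyperparameter baseline for comparison:
token = adaptive_sample(np.random.randn(50), controller=lambda lg: (0.7, 0.9))
```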

REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression

arXiv:2510.13999v2 Announce Type: replace Abstract: Sparsely-activated Mixture-of-Experts (SMoE) models offer efficient pre-training and low latency but their large parameter counts create significant memory overhead, motivating research into expert compression. Contrary to recent findings favouring expert merging on discriminative benchmarks, we…
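One-shot expert pruning generally reduces to scoring experts on a calibration set and dropping the lowest-scoring ones without retraining. A sketch using total routed probability mass as the saliency score (a generic proxy; REAP's actual criterion may differ):

```python
import numpy as np

def prune_experts(router_probs, keep_ratio=0.5):
    """One-shot expert pruning from calibration statistics.
    `router_probs` has shape (num_tokens, num_experts): gate
    probabilities observed on a calibration set. Experts are ranked
    by total routed mass and the lowest-scoring are dropped."""
    scores = router_probs.sum(axis=0)
    num_keep = max(1, int(keep_ratio * router_probs.shape[1]))
    kept = np.sort(np.argsort(scores)[::-1][:num_keep])
    return kept  # indices of experts to retain; the router is re-normalized over these

# Example: 1000 calibration tokens routed over 8 experts.
probs = np.random.dirichlet(np.ones(8), size=1000)
kept_experts = prune_experts(probs, keep_ratio=0.5)
```

Unlike merging, pruning leaves the surviving experts' weights untouched, which is the property the abstract's title credits for its one-shot robustness.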