Archives AI News

Exploitation Over Exploration: Unmasking the Bias in Linear Bandit Recommender Offline Evaluation

arXiv:2507.18756v2 Announce Type: replace Abstract: Multi-Armed Bandit (MAB) algorithms are widely used in recommender systems that require continuous, incremental learning. A core aspect of MABs is the exploration-exploitation trade-off: choosing between exploiting items likely to be enjoyed and exploring new…

April 20, 2026

ChemAmp: Amplified Chemistry Tools via Composable Agents

arXiv:2505.21569v3 Announce Type: replace Abstract: Although LLM-based agents are proven to master tool orchestration in scientific fields, particularly chemistry, their single-task performance remains limited by underlying tool constraints. To this end, we propose tool amplification, a novel paradigm that enhances…

April 20, 2026

PRL-Bench: A Comprehensive Benchmark Evaluating LLMs’ Capabilities in Frontier Physics Research

arXiv:2604.15411v1 Announce Type: new Abstract: The paradigm of agentic science requires AI systems to conduct robust reasoning and engage in long-horizon, autonomous exploration. However, current scientific benchmarks remain confined to domain knowledge comprehension and complex reasoning, failing to evaluate the…

April 20, 2026

Optimizing Stochastic Gradient Push under Broadcast Communications

arXiv:2604.15549v1 Announce Type: new Abstract: We consider the problem of minimizing the convergence time for decentralized federated learning (DFL) in wireless networks under broadcast communications, with focus on mixing matrix design. The mixing matrix is a critical hyperparameter for DFL…

April 20, 2026

Teaching Language Models Mechanistic Explainability Through MechSMILES

arXiv:2512.05722v2 Announce Type: replace Abstract: Chemical reaction mechanisms are the foundation of how chemists evaluate reactivity and feasibility, yet current Computer-Assisted Synthesis Planning (CASP) systems operate without this mechanistic reasoning. We introduce a computational framework that teaches language models to…

April 20, 2026

Bridging the phenotype-target gap for molecular generation via multi-objective reinforcement learning

arXiv:2509.21010v2 Announce Type: replace Abstract: The de novo generation of drug-like molecules capable of inducing desirable phenotypic changes is receiving increasing attention. However, previous methods predominantly rely on expression profiles to guide molecule generation, but overlook the perturbative effect of…

April 20, 2026

Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning

arXiv:2604.15414v1 Announce Type: new Abstract: Continual reinforcement learning must balance retention with adaptation, yet many methods still rely on single-model preservation, committing to one evolving policy as the main reusable solution across tasks. Even when a previously successful policy is…

April 20, 2026

Natural gradient descent with momentum

arXiv:2604.15554v1 Announce Type: new Abstract: We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks. Natural gradient descent…

April 20, 2026

Power to the Clients: Federated Learning in a Dictatorship Setting

arXiv:2510.22149v3 Announce Type: replace Abstract: Federated learning (FL) has emerged as a promising paradigm for decentralized model training, enabling multiple clients to collaboratively learn a shared model without exchanging their local data. However, the decentralized nature of FL also introduces…

April 20, 2026

Learning Affine-Equivariant Proximal Operators

arXiv:2604.15556v1 Announce Type: new Abstract: Proximal operators are fundamental across many applications in signal processing and machine learning, including solving ill-posed inverse problems. Recent work has introduced Learned Proximal Networks (LPNs), providing parametric functions that compute exact proximals for data-driven…

April 20, 2026