Archives AI News

AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization

arXiv:2405.18187v2 Announce Type: replace Abstract: Implicit Q-learning (IQL) serves as a strong baseline for offline RL, which learns the value function using only dataset actions through quantile regression. However, it is unclear how to recover the implicit policy from the…

November 6, 2025

Online Learning to Rank under Corruption: A Robust Cascading Bandits Approach

arXiv:2511.03074v1 Announce Type: new Abstract: Online learning to rank (OLTR) studies how to recommend a short ranked list of items from a large pool and improves future rankings based on user clicks. This setting is commonly modeled as cascading bandits,…

November 6, 2025

REINFORCE-ING Chemical Language Models for Drug Discovery

arXiv:2501.15971v2 Announce Type: replace Abstract: Chemical language models, combined with reinforcement learning (RL), have shown significant promise to efficiently traverse large chemical spaces for drug discovery. However, the performance of various RL algorithms and their best practices for practical drug…

November 6, 2025

Sparse, self-organizing ensembles of local kernels detect rare statistical anomalies

arXiv:2511.03095v1 Announce Type: new Abstract: Modern artificial intelligence has revolutionized our ability to extract rich and versatile data representations across scientific disciplines. Yet, the statistical properties of these representations remain poorly controlled, causing misspecified anomaly detection (AD) methods to falter.…

November 6, 2025

NeuralSurv: Deep Survival Analysis with Bayesian Uncertainty Quantification

arXiv:2505.11054v2 Announce Type: replace Abstract: We introduce NeuralSurv, the first deep survival model to incorporate Bayesian uncertainty quantification. Our non-parametric, architecture-agnostic framework captures time-varying covariate-risk relationships in continuous time via a novel two-stage data-augmentation scheme, for which we establish theoretical…

November 6, 2025

Scaling Multi-Agent Environment Co-Design with Diffusion Models

arXiv:2511.03100v1 Announce Type: new Abstract: The agent-environment co-design paradigm jointly optimises agent policies and environment configurations in search of improved system performance. With application domains ranging from warehouse logistics to windfarm management, co-design promises to fundamentally change how we deploy…

November 6, 2025

Model-Informed Flows for Bayesian Inference

arXiv:2505.24243v2 Announce Type: replace Abstract: Variational inference often struggles with the posterior geometry exhibited by complex hierarchical Bayesian models. Recent advances in flow-based variational families and Variationally Inferred Parameters (VIP) each address aspects of this challenge, but their formal relationship…

November 6, 2025

An Efficient Classification Model for Cyber Text

arXiv:2511.03107v1 Announce Type: new Abstract: The rise of deep learning methodology and practice in recent years has brought a severe consequence: an increasing carbon footprint, driven by the insatiable demand for computational resources and power. The field of text…

November 6, 2025

Composing Linear Layers from Irreducibles

arXiv:2507.11688v3 Announce Type: replace Abstract: Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks remain poorly understood. We investigate this compositional structure in linear layers…

November 6, 2025

Towards Scalable Backpropagation-Free Gradient Estimation

arXiv:2511.03110v1 Announce Type: new Abstract: While backpropagation (reverse-mode automatic differentiation) has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network and the storage of intermediate activations. Existing gradient estimation methods that instead use forward-mode…

November 6, 2025