Archives AI News

Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning

arXiv:2511.10843v1 Announce Type: new Abstract: Many reinforcement learning algorithms, particularly those that rely on return estimates for policy improvement, can suffer from poor sample efficiency and training instability due to high-variance return estimates. In this paper we leverage new results…

November 17, 2025

Augmented data and neural networks for robust epidemic forecasting: application to COVID-19 in Italy

arXiv:2510.09192v2 Announce Type: replace-cross Abstract: In this work, we propose a data augmentation strategy aimed at improving the training phase of neural networks and, consequently, the accuracy of their predictions. Our approach relies on generating synthetic data through a suitable…

November 17, 2025

STAMP: Spatial-Temporal Adapter with Multi-Head Pooling

arXiv:2511.10848v1 Announce Type: new Abstract: Time series foundation models (TSFMs) pretrained on data from multiple domains have shown strong performance on diverse modeling tasks. Various efforts have been made to develop foundation models specific to electroencephalography (EEG) data, which records…

November 17, 2025

On the Relationship Between Adversarial Robustness and Decision Region in Deep Neural Networks

arXiv:2207.03400v3 Announce Type: replace Abstract: In general, Deep Neural Networks (DNNs) are evaluated by the generalization performance measured on unseen data excluded from the training phase. Along with the development of DNNs, the generalization performance converges to the state-of-the-art and…

November 17, 2025

ExPairT-LLM: Exact Learning for LLM Code Selection by Pairwise Queries

arXiv:2511.10855v1 Announce Type: new Abstract: Despite recent advances in LLMs, the task of code generation is still challenging. To cope, code selection algorithms select the best program from multiple programs generated by an LLM. However, existing algorithms can fail to…

November 17, 2025

Provable Domain Adaptation for Offline Reinforcement Learning with Limited Samples

arXiv:2408.12136v4 Announce Type: replace Abstract: Offline reinforcement learning (RL) learns effective policies from a static target dataset. The performance of state-of-the-art offline RL algorithms notwithstanding, it relies on the size of the target dataset, and it degrades if limited samples…

November 17, 2025

Private Zeroth-Order Optimization with Public Data

arXiv:2511.10859v1 Announce Type: new Abstract: One of the major bottlenecks for deploying popular first-order differentially private (DP) machine learning algorithms (e.g., DP-SGD) lies in their high computation and memory cost, despite the existence of optimized implementations. Zeroth-order methods have promise…

November 17, 2025

Training speedups via batching for geometric learning: an analysis of static and dynamic algorithms

arXiv:2502.00944v3 Announce Type: replace Abstract: Graph neural networks (GNN) have shown promising results for several domains such as materials science, chemistry, and the social sciences. GNN models often contain millions of parameters, and like other neural network (NN) models, are…

November 17, 2025

Go-UT-Bench: A Fine-Tuning Dataset for LLM-Based Unit Test Generation in Go

arXiv:2511.10868v1 Announce Type: new Abstract: Training data imbalance poses a major challenge for code LLMs. Most available data heavily overrepresents raw open-source code while underrepresenting broader software engineering tasks, especially in low-resource languages like Golang. As a result,…

November 17, 2025

Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints

arXiv:2505.06709v2 Announce Type: replace Abstract: We study Online Convex Optimization with adversarial constraints (COCO). At each round a learner selects an action from a convex decision set and then an adversary reveals a convex cost and a convex constraint function.…

November 17, 2025