Archives AI News

Bayesian Double Descent

arXiv:2507.07338v3 Announce Type: replace-cross Abstract: Double descent is a phenomenon of over-parameterized statistical models, such as deep neural networks, whose risk functions have a re-descending property. As the complexity of the model increases, risk exhibits a U-shaped region…

Randomness and Interpolation Improve Gradient Descent

arXiv:2510.13040v1 Announce Type: new Abstract: Building on Stochastic Gradient Descent (SGD), the paper introduces two optimizers: Interpolational Accelerating Gradient Descent (IAGD) and Noise-Regularized Stochastic Gradient Descent (NRSGD). IAGD leverages second-order Newton Interpolation to expedite the convergence process…
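The noise-regularization idea can be sketched generically: perturb each gradient with Gaussian noise before the usual descent update. This is a plain noise-injected SGD step, not the paper's exact algorithm; the noise schedule and scale here are assumptions.

```python
import numpy as np

def nrsgd_step(w, grad, lr=0.1, noise_std=0.01, rng=None):
    """One noise-regularized SGD step: add Gaussian noise to the gradient,
    then take the usual descent step (generic sketch; the paper's schedule
    for noise_std may differ)."""
    rng = rng or np.random.default_rng()
    noisy_grad = grad + noise_std * rng.normal(size=np.shape(grad))
    return w - lr * noisy_grad

# Minimize f(w) = ||w||^2 / 2, whose gradient is w itself.
rng = np.random.default_rng(1)
w = np.ones(3)
for _ in range(200):
    w = nrsgd_step(w, grad=w, lr=0.1, noise_std=0.01, rng=rng)
```

The injected noise acts as an implicit regularizer, keeping iterates from settling too sharply while still driving the objective down.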

On Pretraining for Project-Level Code Completion

arXiv:2510.13697v1 Announce Type: cross Abstract: Repository-level pretraining is commonly used to enable large language models for code to leverage codebase-wide context. This enhances their ability to generate accurate and context-aware code completions. In this work, we investigate how different repository-processing…
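One basic repository-processing strategy the paper's setting implies is flattening a codebase into a single pretraining context, with each file prefixed by its path. The separator format and alphabetical file ordering below are illustrative assumptions; choices like these are exactly what such studies compare.

```python
import tempfile
from pathlib import Path

def repo_to_context(root, exts=(".py",), header="# FILE: {path}\n"):
    """Concatenate a repository's source files into one training context,
    prefixing each file with its relative path (header format and
    alphabetical ordering are illustrative assumptions)."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in exts:
            parts.append(header.format(path=path.relative_to(root)) + path.read_text())
    return "\n".join(parts)

# Tiny demonstration repository with two files.
with tempfile.TemporaryDirectory() as d:
    Path(d, "a.py").write_text("x = 1\n")
    Path(d, "b.py").write_text("y = 2\n")
    ctx = repo_to_context(d)
```

Alternatives such as dependency-aware file ordering or deduplication slot naturally into the same loop.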

Time-Varying Optimization for Streaming Data Via Temporal Weighting

arXiv:2510.13052v1 Announce Type: new Abstract: Classical optimization theory deals with fixed, time-invariant objective functions. However, time-varying optimization has emerged as an important subject for decision-making in dynamic environments. In this work, we study the problem of learning from streaming data…
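A simple instance of temporal weighting for streaming data is exponential forgetting: discount old samples by a factor gamma so the estimate can track drift. This is a common baseline scheme, not necessarily the paper's specific weighting.

```python
def temporally_weighted_mean(stream, gamma=0.9):
    """Estimate a drifting mean from streaming data by geometrically
    down-weighting old samples (a standard temporal-weighting scheme;
    the paper's weights may differ)."""
    num, den = 0.0, 0.0
    estimates = []
    for x in stream:
        num = gamma * num + x      # discounted sum of samples
        den = gamma * den + 1.0    # discounted sample count
        estimates.append(num / den)
    return estimates

# The estimate tracks an abrupt shift in the data-generating mean.
est = temporally_weighted_mean([0.0] * 50 + [10.0] * 50, gamma=0.9)
```

Smaller gamma adapts faster to changes but is noisier; gamma close to 1 recovers the classical fixed-objective average.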

PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference

arXiv:2510.13763v1 Announce Type: cross Abstract: Amortized simulator-based inference offers a powerful framework for tackling Bayesian inference in computational fields such as engineering or neuroscience, increasingly leveraging modern generative methods like diffusion models to map observed data to model parameters or…
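The problem being solved, swapping the prior at test time without retraining the amortized model, has a classical baseline worth contrasting: self-normalized importance reweighting of posterior samples by the ratio of new to old prior densities. PriorGuide itself operates inside a diffusion model; the sketch below is only this simpler baseline.

```python
import numpy as np

def reweight_for_new_prior(samples, log_old_prior, log_new_prior):
    """Adapt posterior samples drawn under a training-time prior to a new
    test-time prior via self-normalized importance weights (a classical
    baseline, not PriorGuide's diffusion-guidance mechanism)."""
    logw = log_new_prior(samples) - log_old_prior(samples)
    logw -= logw.max()                      # stabilize before exponentiating
    w = np.exp(logw)
    return w / w.sum()

# Samples under a N(0, 1) prior, reweighted toward a new N(1, 1) prior.
rng = np.random.default_rng(2)
theta = rng.normal(0.0, 1.0, size=20000)
w = reweight_for_new_prior(
    theta,
    log_old_prior=lambda t: -0.5 * t**2,
    log_new_prior=lambda t: -0.5 * (t - 1.0) ** 2,
)
shifted_mean = float(np.sum(w * theta))
```

Importance reweighting degrades when the priors barely overlap, which is part of the motivation for building prior adaptation into the generative model itself.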

Achieving Logarithmic Regret in KL-Regularized Zero-Sum Markov Games

arXiv:2510.13060v1 Announce Type: new Abstract: Reverse Kullback-Leibler (KL) divergence-based regularization with respect to a fixed reference policy is widely used in modern reinforcement learning to preserve the desired traits of the reference policy and sometimes to promote exploration (using uniform…
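The reverse-KL-regularized objective for a single player can be sketched directly: expected reward minus a penalty KL(pi || pi_ref) anchoring the learned policy to the reference. The zero-sum game in the paper couples two such players; the coefficient beta and the toy numbers below are illustrative.

```python
import numpy as np

def reverse_kl(pi, pi_ref):
    """KL(pi || pi_ref): the reverse-KL penalty keeping a learned policy
    close to a fixed reference policy."""
    return float(np.sum(pi * (np.log(pi) - np.log(pi_ref))))

def regularized_value(rewards, pi, pi_ref, beta=0.1):
    """Expected reward minus a reverse-KL penalty toward the reference
    (generic single-player form of the KL-regularized objective)."""
    return float(pi @ rewards) - beta * reverse_kl(pi, pi_ref)

pi_ref = np.array([0.25, 0.25, 0.25, 0.25])   # e.g. a uniform reference promotes exploration
pi = np.array([0.7, 0.1, 0.1, 0.1])
rewards = np.array([1.0, 0.0, 0.0, 0.0])
v = regularized_value(rewards, pi, pi_ref, beta=0.1)
```

Concentrating probability on the high-reward action raises expected reward but pays a KL cost relative to the uniform reference, which is the trade-off the regularization controls.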

Do LLM Agents Have Regret? A Case Study in Online Learning and Games

arXiv:2403.16843v5 Announce Type: replace Abstract: Large language models (LLMs) have been increasingly employed for (interactive) decision-making, via the development of LLM-based autonomous agents. Despite their emerging successes, the performance of LLM agents in decision-making has not been fully investigated through…
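The regret notion used to evaluate such agents in online learning is standard: cumulative loss of the played actions minus the loss of the best fixed action in hindsight. The loss matrix below is a made-up two-round example.

```python
import numpy as np

def external_regret(loss_matrix, actions):
    """External regret of a played action sequence against the best fixed
    action in hindsight. loss_matrix[t, a] is the loss of action a at round t."""
    incurred = loss_matrix[np.arange(len(actions)), actions].sum()
    best_fixed = loss_matrix.sum(axis=0).min()
    return float(incurred - best_fixed)

# Two rounds, two actions: the player switches away from the better action 0.
losses = np.array([[0.0, 1.0],
                   [0.2, 1.0]])
r = external_regret(losses, actions=[0, 1])
```

An agent is no-regret if this quantity grows sublinearly in the number of rounds, which is the yardstick the paper applies to LLM agents.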