Archives AI News

A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine Learning

arXiv:2505.22554v2 Announce Type: replace Abstract: Effective feature selection is critical for building robust and interpretable predictive models, particularly in medical applications where identifying risk factors in the most extreme patient strata is essential. Traditional methods often focus on average associations,…

October 1, 2025

An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes

arXiv:2509.26429v1 Announce Type: new Abstract: Predicting individualized potential outcomes in sequential decision-making is central for optimizing therapeutic decisions in personalized medicine (e.g., which dosing sequence to give to a cancer patient). However, predicting potential outcomes over long horizons is notoriously…

October 1, 2025

Understanding and Improving Shampoo and SOAP via Kullback-Leibler Minimization

arXiv:2509.03378v2 Announce Type: replace Abstract: Shampoo and its efficient variant, SOAP, use structured second-moment estimation and have attracted growing interest for their effectiveness in training neural networks (NNs). In practice, Shampoo requires step-size grafting with Adam to achieve competitive performance.…

October 1, 2025

Pretrain-Test Task Alignment Governs Generalization in In-Context Learning

arXiv:2509.26551v1 Announce Type: new Abstract: In-context learning (ICL) is a central capability of Transformer models, but the structures in data that enable its emergence and govern its robustness remain poorly understood. In this work, we study how the structure of…

October 1, 2025

Performance of the empirical median for location estimation in heteroscedastic settings

arXiv:2501.16956v2 Announce Type: replace-cross Abstract: We investigate the performance of the empirical median for location estimation in heteroscedastic settings. Specifically, we consider independent symmetric real-valued random variables that share a common but unknown location parameter while having different and unknown…

October 1, 2025

On Fitting Flow Models with Large Sinkhorn Couplings

arXiv:2506.05526v3 Announce Type: replace-cross Abstract: Flow models transform data gradually from one modality (e.g. noise) onto another (e.g. images). Such models are parameterized by a time-dependent velocity field, trained to fit segments connecting pairs of source and target points. When…

October 1, 2025

BOOST: Bayesian Optimization with Optimal Kernel and Acquisition Function Selection Technique

arXiv:2508.02332v2 Announce Type: replace-cross Abstract: The performance of Bayesian optimization (BO), a highly sample-efficient method for expensive black-box problems, is critically governed by the selection of its hyperparameters, including the kernel and acquisition functions. This presents a significant practical challenge:…

October 1, 2025

Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity

arXiv:2509.22860v2 Announce Type: replace-cross Abstract: Asynchronous stochastic gradient methods are central to scalable distributed optimization, particularly when devices differ in computational capabilities. Such settings arise naturally in federated learning, where training takes place on smartphones and other heterogeneous edge devices.…

October 1, 2025

How Effective Are Time-Series Models for Rainfall Nowcasting? A Comprehensive Benchmark for Rainfall Nowcasting Incorporating PWV Data

arXiv:2509.25263v1 Announce Type: cross Abstract: Rainfall nowcasting, which aims to predict precipitation within the next 0 to 3 hours, is critical for disaster mitigation and real-time response planning. However, most time series forecasting benchmarks in meteorology are evaluated on variables…

October 1, 2025

AuON: A Linear-time Alternative to Semi-Orthogonal Momentum Updates

arXiv:2509.24320v2 Announce Type: replace-cross Abstract: Orthogonal gradient updates have emerged as a promising direction in optimization for machine learning. However, traditional approaches such as SVD/QR decomposition incur prohibitive computational costs of O(n^3) and underperform compared to well-tuned SGD with momentum,…

October 1, 2025