Archives AI News

Pre-training Epidemic Time Series Forecasters with Compartmental Prototypes

arXiv:2502.03393v5 Announce Type: replace Abstract: Accurate epidemic forecasting is crucial for outbreak preparedness, but existing data-driven models are often brittle. Typically trained on a single pathogen, they struggle with data scarcity during new outbreaks and fail under distribution shifts caused…
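The "compartmental prototypes" in the title refer to classical compartment models of epidemic dynamics. As background (a standard textbook SIR model, not the paper's method), the dynamics can be simulated with a simple Euler step:

```python
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """One Euler step of the classic SIR compartmental model.
    beta: transmission rate, gamma: recovery rate."""
    n = s + i + r
    new_inf = beta * s * i / n * dt   # S -> I flow
    new_rec = gamma * i * dt          # I -> R flow
    return s - new_inf, i + new_inf - new_rec, r + new_rec

# Simulate a small outbreak in a population of 1000.
s, i, r = 990.0, 10.0, 0.0
for _ in range(100):
    s, i, r = sir_step(s, i, r, beta=0.3, gamma=0.1)
```

The population total is conserved by construction, which is the defining property of a compartmental model: individuals only move between the S, I, and R compartments.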

On the Robustness of Kernel Goodness-of-Fit Tests

arXiv:2408.05854v5 Announce Type: replace-cross Abstract: Goodness-of-fit testing is often criticized for its lack of practical relevance: since “all models are wrong”, the null hypothesis that the data conform to our model is ultimately always rejected as the sample size grows.…
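For context, a common kernel goodness-of-fit statistic is the kernel Stein discrepancy (KSD), which needs only the score function of the null model. Below is a standard one-dimensional V-statistic KSD against a standard normal null with an RBF kernel (a textbook construction, not the robustified variant studied in the paper):

```python
import numpy as np

def ksd_gaussian_null(x, h=1.0):
    """V-statistic kernel Stein discrepancy of sample x against N(0,1),
    using an RBF kernel with bandwidth h."""
    x = np.asarray(x, dtype=float)
    d = x[:, None] - x[None, :]           # pairwise differences x_i - x_j
    k = np.exp(-d**2 / (2 * h**2))        # RBF kernel matrix
    sx = -x[:, None]                      # score of N(0,1): d/dx log p(x) = -x
    sy = -x[None, :]
    u = (sx * sy * k
         + sx * (d / h**2) * k            # s_p(x) * d/dy k(x, y)
         + sy * (-d / h**2) * k           # s_p(y) * d/dx k(x, y)
         + (1 / h**2 - d**2 / h**4) * k)  # d/dx d/dy k(x, y)
    return u.mean()

rng = np.random.default_rng(0)
on_model = ksd_gaussian_null(rng.standard_normal(300))        # data matches null
off_model = ksd_gaussian_null(rng.standard_normal(300) + 2.0) # shifted data
```

Data drawn from the null gives a statistic near zero, while mis-specified data inflates it; the abstract's point is that with enough samples even tiny model misspecifications eventually trigger rejection.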

Beyond the Ideal: Analyzing the Inexact Muon Update

arXiv:2510.19933v1 Announce Type: new Abstract: The Muon optimizer has rapidly emerged as a powerful, geometry-aware alternative to AdamW, demonstrating strong performance in large-scale training of neural networks. However, a critical theory-practice disconnect exists: Muon’s efficiency relies on fast, approximate orthogonalization,…
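The "fast, approximate orthogonalization" the abstract refers to is typically done with a Newton-Schulz iteration, which drives a matrix's singular values toward 1 using only matrix multiplies. A minimal illustrative sketch follows (plain cubic Newton-Schulz; Muon's tuned polynomial coefficients are not reproduced here):

```python
import numpy as np

def newton_schulz_orth(g, steps=5):
    """Approximately orthogonalize g via Newton-Schulz iteration.
    Frobenius normalization bounds the spectral norm by 1 so the
    iteration x <- 1.5*x - 0.5*x x^T x pushes singular values toward 1."""
    x = g / (np.linalg.norm(g) + 1e-7)
    for _ in range(steps):
        x = 1.5 * x - 0.5 * x @ x.T @ x
    return x

rng = np.random.default_rng(0)
g = rng.standard_normal((6, 6))
x = newton_schulz_orth(g)
```

A few steps suffice to make the update noticeably closer to orthogonal, which is exactly the inexactness the paper analyzes: the iteration is truncated rather than run to convergence.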

FINDER: Feature Inference on Noisy Datasets using Eigenspace Residuals

arXiv:2510.19917v1 Announce Type: new Abstract: "Noisy" datasets (regimes with low signal-to-noise ratios, small sample sizes, faulty data collection, etc.) remain a key research frontier for classification methods, with both theoretical and practical implications. We introduce FINDER, a rigorous…

FairGRPO: Fair Reinforcement Learning for Equitable Clinical Reasoning

arXiv:2510.19893v1 Announce Type: new Abstract: Medical artificial intelligence systems have achieved remarkable diagnostic capabilities, yet they consistently exhibit performance disparities across demographic groups, causing real-world harm to underrepresented populations. While recent multimodal reasoning foundation models have advanced clinical diagnosis through…
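The GRPO in the name is group relative policy optimization, whose core idea is replacing a learned value baseline with within-group reward standardization. A minimal sketch of that baseline step (core GRPO only; the paper's fairness-aware weighting is not reproduced here):

```python
import statistics

def group_relative_advantages(rewards):
    """Standardize each sampled response's reward against its group's
    mean and standard deviation, as in GRPO-style RL."""
    mu = statistics.fmean(rewards)
    sd = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mu) / sd for r in rewards]

adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

The advantages sum to zero within each group, so responses are rewarded only for outperforming their peers on the same prompt.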

From Large to Small: Transferring CUDA Optimization Expertise via Reasoning Graph

arXiv:2510.19873v1 Announce Type: new Abstract: Despite significant evolution of CUDA programming and domain-specific libraries, effectively utilizing GPUs with massively parallel engines remains difficult. Large language models (LLMs) show strong potential in generating optimized CUDA code from sequential code. However, using…