Archives AI News

Robust Adversarial Reinforcement Learning in Stochastic Games via Sequence Modeling

arXiv:2510.11877v1 Announce Type: new Abstract: The Transformer, a highly expressive architecture for sequence modeling, has recently been adapted to solve sequential decision-making, most notably through the Decision Transformer (DT), which learns policies by conditioning on desired returns. Yet, the adversarial…

October 15, 2025

ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training

arXiv:2406.02613v3 Announce Type: replace Abstract: Training LLMs relies on distributed implementations using multiple GPUs to compute gradients in parallel with sharded optimizers. However, synchronizing gradients in data-parallel setups introduces communication overhead that grows with the number of workers, limiting…

October 15, 2025

Why some quantum materials stall while others scale

In a new study, MIT researchers evaluated quantum materials’ potential for scalable commercial success — and identified promising candidates.

October 15, 2025

Offline Fictitious Self-Play for Competitive Games

arXiv:2403.00841v2 Announce Type: replace-cross Abstract: Offline Reinforcement Learning (RL) enables policy improvement from fixed datasets without online interactions, making it highly suitable for real-world applications lacking efficient simulators. Despite its success in the single-agent setting, offline multi-agent RL remains a…

October 15, 2025

Reconstruction of SINR Maps from Sparse Measurements using Group Equivariant Non-Expansive Operators

arXiv:2507.19349v2 Announce Type: replace Abstract: As sixth-generation (6G) wireless networks evolve, accurate signal-to-interference-noise ratio (SINR) maps are becoming increasingly critical for effective resource management and optimization. However, acquiring such maps at high resolution is often cost-prohibitive, creating a severe…

October 15, 2025

A Generalized Information Bottleneck Theory of Deep Learning

arXiv:2509.26327v2 Announce Type: replace Abstract: The Information Bottleneck (IB) principle offers a compelling theoretical framework to understand how neural networks (NNs) learn. However, its practical utility has been constrained by unresolved theoretical ambiguities and significant challenges in accurate estimation. In…

October 15, 2025

Evaluating multiple models using labeled and unlabeled data

arXiv:2501.11866v3 Announce Type: replace Abstract: It remains difficult to evaluate machine learning classifiers in the absence of a large, labeled dataset. While labeled data can be prohibitively expensive or impossible to obtain, unlabeled data is plentiful. Here, we introduce Semi-Supervised…

October 15, 2025

Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator

arXiv:2505.16690v3 Announce Type: replace Abstract: Post-training of large language models is essential for adapting pre-trained language models (PLMs) to align with human preferences and downstream tasks. While PLMs typically exhibit well-calibrated confidence, post-trained language models (PoLMs) often suffer from over-confidence,…

October 15, 2025

ParsVoice: A Large-Scale Multi-Speaker Persian Speech Corpus for Text-to-Speech Synthesis

arXiv:2510.10774v2 Announce Type: replace-cross Abstract: Existing Persian speech datasets are typically smaller than their English counterparts, which creates a key limitation for developing Persian speech technologies. We address this gap by introducing ParsVoice, the largest Persian speech corpus designed specifically…

October 15, 2025

Wavefront Coding for Accommodation-Invariant Near-Eye Displays

arXiv:2510.12778v1 Announce Type: cross Abstract: We present a new computational near-eye display method that addresses the vergence-accommodation conflict problem in stereoscopic displays through accommodation-invariance. Our system integrates a refractive lens eyepiece with a novel wavefront coding diffractive optical element, operating…

October 15, 2025