Archives AI News

Differentially Private Clipped-SGD: High-Probability Convergence with Arbitrary Clipping Level

arXiv:2507.23512v2 Announce Type: replace Abstract: Gradient clipping is a fundamental tool in Deep Learning, improving the high-probability convergence of stochastic first-order methods like SGD, AdaGrad, and Adam under heavy-tailed noise, which is common in training large language models. It is…

September 30, 2025

OptiMind: Teaching LLMs to Think Like Optimization Experts

arXiv:2509.22979v1 Announce Type: new Abstract: Mathematical programming — the task of expressing operations and decision-making problems in precise mathematical language — is fundamental across domains, yet remains a skill-intensive process requiring operations research expertise. Recent advances in large language models…

September 30, 2025

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search

arXiv:2509.15927v2 Announce Type: replace Abstract: Auto-bidding serves as a critical tool for advertisers to improve their advertising performance. Recent progress has demonstrated that AI-Generated Bidding (AIGB), which learns a conditional generative planner from offline data, achieves superior performance compared to…

September 30, 2025

MDP modeling for multi-stage stochastic programs

arXiv:2509.22981v1 Announce Type: new Abstract: We study a class of multi-stage stochastic programs, which incorporate modeling features from Markov decision processes (MDPs). This class includes structured MDPs with continuous state and action spaces. We extend policy graphs to include decision-dependent…

September 30, 2025

Is Thompson Sampling Susceptible to Algorithmic Collusion?

arXiv:2405.17463v2 Announce Type: replace-cross Abstract: When two players are engaged in a repeated game with unknown payoff matrices, they may use single-agent multi-armed bandit algorithms to choose the actions independent of each other. We show that when the players use…

September 30, 2025

T-TAMER: Provably Taming Trade-offs in ML Serving

arXiv:2509.22992v1 Announce Type: new Abstract: As machine learning models continue to grow in size and complexity, efficient serving faces increasingly broad trade-offs spanning accuracy, latency, resource usage, and other objectives. Multi-model serving further complicates these trade-offs; for example, in cascaded…

September 30, 2025

Nirvana AI Governance: How AI Policymaking Is Committing Three Old Fallacies

arXiv:2501.10384v2 Announce Type: replace-cross Abstract: This research applies Harold Demsetz’s concept of the nirvana approach to the realm of AI governance and debunks three common fallacies in various AI policy proposals–“the grass is always greener on the other side,” “free…

September 30, 2025

Analysis of Variational Sparse Autoencoders

arXiv:2509.22994v1 Announce Type: new Abstract: Sparse Autoencoders (SAEs) have emerged as a promising approach for interpreting neural network representations by learning sparse, human-interpretable features from dense activations. We investigate whether incorporating variational methods into SAE architectures can improve feature organization…

September 30, 2025

$\textit{New News}$: System-2 Fine-tuning for Robust Integration of New Knowledge

arXiv:2505.01812v2 Announce Type: replace-cross Abstract: Humans and intelligent animals can internalize new information and accurately internalize its implications to perform downstream tasks. While large language models (LLMs) can achieve this through in-context learning (ICL) when the information (news) is explicitly…

September 30, 2025

Off-Policy Maximum Entropy RL with Future State and Action Visitation Measures

arXiv:2412.06655v3 Announce Type: replace-cross Abstract: Maximum entropy reinforcement learning integrates exploration into policy learning by providing additional intrinsic rewards proportional to the entropy of some distribution. In this paper, we propose a novel approach in which the intrinsic reward function…

September 30, 2025