Archives AI News

GTPO: Trajectory-Based Policy Optimization in Large Language Models

arXiv:2508.03772v3. Announce Type: replace. Abstract: Policy-based optimizations are widely adopted today for the training and alignment of language models, where one of the most recent and effective approaches is Group-relative Policy Optimization…

The Sample Complexity of Membership Inference and Privacy Auditing

arXiv:2508.19458v1. Announce Type: new. Abstract: A membership-inference attack gets the output of a learning algorithm, and a target individual, and tries to determine whether this individual is a member of the…

From Optimization to Control: Quasi Policy Iteration

arXiv:2311.11166v3. Announce Type: replace-cross. Abstract: Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we adopt the quasi-Newton method (QNM)…

Incentivized Lipschitz Bandits

arXiv:2508.19466v1. Announce Type: new. Abstract: We study incentivized exploration in multi-armed bandit (MAB) settings with infinitely many arms modeled as elements in continuous metric spaces. Unlike classical bandit models, we consider scenarios where the decision-maker (principal) incentivizes…

DeepAtlas: a tool for effective manifold learning

arXiv:2508.19479v1. Announce Type: new. Abstract: Manifold learning builds on the “manifold hypothesis,” which posits that data in high-dimensional datasets are drawn from lower-dimensional manifolds. Current tools generate global embeddings of data, rather than…