Gaussian Approximation for Asynchronous Q-learning
arXiv:2604.07323v1 Announce Type: cross Abstract: In this paper, we derive rates of convergence in the high-dimensional central limit theorem for Polyak-Ruppert averaged iterates generated by the asynchronous Q-learning algorithm with a polynomial stepsize $k^{-\omega},\, \omega \in (1/2, 1]$. Assuming that…
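
To make the object of study concrete, the following is a minimal sketch of asynchronous Q-learning with Polyak-Ruppert averaging on a tabular MDP. It is an illustration, not the paper's exact setup: the function name `async_q_learning`, the uniform behaviour policy, the zero initialisation, the default values, and the choice to index the stepsize $k^{-\omega}$ by the global iteration counter are all assumptions made here.

```python
import numpy as np


def async_q_learning(P, R, gamma=0.9, omega=0.75, n_steps=20000, seed=0):
    """Sketch of asynchronous Q-learning with Polyak-Ruppert averaging.

    P: array of shape (S, A, S) with transition probabilities.
    R: array of shape (S, A) with deterministic rewards.
    Returns the last iterate Q and the running average Q_bar.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))   # zero initialisation (assumption)
    Q_bar = np.zeros_like(Q)              # Polyak-Ruppert average of the iterates
    s = 0                                 # start state (assumption)
    for k in range(1, n_steps + 1):
        # Uniform behaviour policy (assumption): one action per step.
        a = rng.integers(n_actions)
        s_next = rng.choice(n_states, p=P[s, a])
        # Polynomial stepsize k^{-omega}, omega in (1/2, 1],
        # indexed by the global counter k (assumption).
        alpha = k ** (-omega)
        # Asynchronous update: only the visited (s, a) entry changes.
        td_target = R[s, a] + gamma * Q[s_next].max()
        Q[s, a] += alpha * (td_target - Q[s, a])
        # Running mean of Q_1, ..., Q_k.
        Q_bar += (Q - Q_bar) / k
        s = s_next
    return Q, Q_bar
```

Calling `async_q_learning(P, R)` on a small transition tensor `P` and reward matrix `R` returns both the last iterate and the averaged iterate; the averaged iterate is the quantity whose Gaussian approximation the abstract refers to.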
