High entropy leads to symmetry equivariant policies in Dec-POMDPs
arXiv:2511.22581v4

Abstract: We prove that in any Dec-POMDP, sufficiently high entropy regularization ensures that policy gradient flow with tabular softmax parametrization converges, from any initialization, to the same joint policy, and that this joint policy…
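To make the convergence claim concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration, not taken from the paper: the two-agent, one-step coordination game (a degenerate Dec-POMDP with no state or observations), the names `REWARD`, `agent_gradient`, and `run`, the temperature `tau`, and the use of small explicit Euler steps as a stand-in for the continuous-time gradient flow. The regularizer is taken to be the sum of the two agents' policy entropies, which may differ from the paper's exact definition.

```python
import math
import random

# Hypothetical toy game (not from the paper): both agents receive reward 1
# if they choose the same action and 0 otherwise.
REWARD = [[1.0, 0.0], [0.0, 1.0]]


def log_softmax(logits):
    # Numerically stable log-probabilities of a tabular softmax policy.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def softmax(logits):
    return [math.exp(lp) for lp in log_softmax(logits)]


def agent_gradient(own_logits, expected_reward, tau):
    """Gradient of (expected reward + tau * entropy) w.r.t. one agent's logits."""
    log_pi = log_softmax(own_logits)
    pi = [math.exp(lp) for lp in log_pi]
    # Per-action signal: expected reward minus tau * log-probability.
    g = [q - tau * lp for q, lp in zip(expected_reward, log_pi)]
    baseline = sum(p * gi for p, gi in zip(pi, g))
    # Chain rule through the softmax Jacobian.
    return [p * (gi - baseline) for p, gi in zip(pi, g)]


def run(tau, seed, steps=5000, lr=0.5):
    """Euler-discretized gradient ascent from a random logit initialization."""
    rng = random.Random(seed)
    theta1 = [rng.gauss(0.0, 2.0) for _ in range(2)]
    theta2 = [rng.gauss(0.0, 2.0) for _ in range(2)]
    for _ in range(steps):
        pi1, pi2 = softmax(theta1), softmax(theta2)
        # Expected reward of each action given the other agent's policy.
        q1 = [sum(REWARD[a][b] * pi2[b] for b in range(2)) for a in range(2)]
        q2 = [sum(REWARD[a][b] * pi1[a] for a in range(2)) for b in range(2)]
        g1 = agent_gradient(theta1, q1, tau)
        g2 = agent_gradient(theta2, q2, tau)
        theta1 = [t + lr * g for t, g in zip(theta1, g1)]
        theta2 = [t + lr * g for t, g in zip(theta2, g2)]
    return softmax(theta1), softmax(theta2)


if __name__ == "__main__":
    for seed in range(3):
        print(seed, run(tau=1.0, seed=seed))
```

In this toy game with `tau=1.0`, runs from different seeds should all end near the uniform policy for both agents, and that policy is unchanged when the two action labels are swapped for both agents. This only illustrates the kind of behavior the abstract describes; it does not reproduce the paper's setting or proof.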
