MAESTRO: Multi-Agent Environment Shaping through Task and Reward Optimization
arXiv:2511.19253v2 Abstract: Cooperative Multi-Agent Reinforcement Learning (MARL) faces two major design bottlenecks: crafting dense reward functions and constructing curricula that avoid local optima in high-dimensional, non-stationary environments. Existing approaches rely on fixed heuristics or use Large Language…
