Trust-region-free policy optimization for stochastic policies

Publication
RLDM 2022
Benjamin Ellis
Doctoral Candidate

Doctoral Candidate at the University of Oxford, supervised by Jakob Foerster and Shimon Whiteson.