Reducing the variance of policy gradient algorithms, from REINFORCE with baseline to A2C, to GAE that gracefully unifies them all.
Share this post
The Beauty of Reinforcement Learning (2) …
Share this post
Reducing the variance of policy gradient algorithms, from REINFORCE with baseline to A2C, to GAE that gracefully unifies them all.