Advanced Actor Critic (A2C)

 
* [http://towardsdatascience.com/advanced-reinforcement-learning-6d769f529eb3 Beyond DQN/A3C: A Survey in Advanced Reinforcement Learning | Joyce Xu - Towards Data Science]
 
* [[Policy Gradient (PG)]]
 
* [[Proximal Policy Optimization (PPO)]]
  
 
A2C produces performance comparable to [[Asynchronous Advantage Actor Critic (A3C)]] while being more efficient. A2C is A3C without the asynchronous part: a synchronous, single-worker variant of A3C. [http://towardsdatascience.com/understanding-actor-critic-methods-931b97b6df3f Understanding Actor Critic Methods and A2C | Chris Yoon - Towards Data Science]
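The core actor-critic update can be illustrated with a minimal tabular sketch: the critic estimates the state value V(s), the advantage (here, reward minus V) scales the actor's policy-gradient step. This is a toy single-state example with assumed rewards and learning rates, not the batched deep-network implementation used in practice:

```python
import math
import random

random.seed(0)

# Toy one-state, two-action task: action 1 yields reward 1.0, action 0 yields 0.0.
logits = [0.0, 0.0]   # actor parameters (policy logits)
value = 0.0           # critic parameter, an estimate of V(s)
alpha_pi, alpha_v = 0.1, 0.1  # assumed learning rates for this example

def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

for step in range(2000):
    probs = softmax(logits)
    a = 0 if random.random() < probs[0] else 1
    r = 1.0 if a == 1 else 0.0

    # Advantage: observed return minus the critic's baseline.
    # (The episode ends immediately, so there is no bootstrapped gamma*V(s') term.)
    advantage = r - value

    # Actor step: grad of log pi(a|s) w.r.t. the logits is 1[i == a] - pi(i|s).
    for i in range(2):
        grad = (1.0 if i == a else 0.0) - probs[i]
        logits[i] += alpha_pi * advantage * grad

    # Critic step: move V(s) toward the observed return.
    value += alpha_v * advantage

print(softmax(logits), value)
```

After training, the policy concentrates on the rewarding action and the critic's value estimate approaches the expected reward under that policy.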

Revision as of 15:20, 3 July 2020