Difference between revisions of "Advanced Actor Critic (A2C)"

Line 25:
 
* [[Policy Gradient (PG)]]
 
* [[Proximal Policy Optimization (PPO)]]
 
 
A2C produces comparable performance to [[Asynchronous Advantage Actor Critic (A3C)]] while being more efficient. A2C is like A3C but without the asynchronous part: updates are applied synchronously, and in its simplest form it is a single-worker variant of A3C. [http://towardsdatascience.com/understanding-actor-critic-methods-931b97b6df3f Understanding Actor Critic Methods and A2C | Chris Yoon - Towards Data Science]
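
The page does not spell out the update itself, so the following is only a minimal sketch of the advantage-based losses usually associated with A2C, computed synchronously for one rollout from a single worker. The function names (<code>discounted_returns</code>, <code>a2c_losses</code>), the discount <code>gamma</code>, and the use of Monte Carlo returns as the critic target are assumptions made for illustration; they are not taken from the linked article.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Return the discounted return G_t for every step of one rollout.

    bootstrap_value is the critic's estimate for the state after the last
    step (0.0 if the episode terminated). Hypothetical helper for this sketch.
    """
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_losses(log_probs, values, rewards, gamma=0.99, bootstrap_value=0.0):
    """Compute (actor_loss, critic_loss) for one synchronous rollout.

    log_probs: log pi(a_t | s_t) for the actions actually taken
    values:    critic estimates V(s_t)
    rewards:   rewards r_t observed after each action
    """
    returns = discounted_returns(rewards, gamma, bootstrap_value)
    # Advantage: how much better the observed return was than the critic expected.
    advantages = [g - v for g, v in zip(returns, values)]
    n = len(rewards)
    # Actor: raise the log-probability of actions with positive advantage.
    # In a real training loop the advantage is treated as a constant here
    # (no gradient flows through it into the critic).
    actor_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Critic: mean squared error between V(s_t) and the observed return.
    critic_loss = sum(adv ** 2 for adv in advantages) / n
    return actor_loss, critic_loss
```

With a single worker, both losses are computed from the same rollout and applied in one step. With several synchronized workers, the same computation would be run per worker and the results averaged before the update, which is the part A3C instead does asynchronously.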
 
 
<youtube>GlwgeUmhWIM</youtube>
 

Revision as of 15:20, 3 July 2020