Difference between revisions of "Policy Gradient (PG)"
m (BPeat moved page Deep Deterministic Policy Gradient (DDPG) to Policy Gradient (PG) without leaving a redirect) |
| Line 8: | Line 8: | ||
[http://www.google.com/search?q=Deep+Deterministic+Policy+Gradient+DDPG+machine+learning+ML+artificial+intelligence ...Google search] | [http://www.google.com/search?q=Deep+Deterministic+Policy+Gradient+DDPG+machine+learning+ML+artificial+intelligence ...Google search] | ||
| + | * [[Trust Region Policy Optimization (TRPO)]] | ||
| + | * [[Proximal Policy Optimization (PPO)]] | ||
* [[Reinforcement Learning (RL)]] | * [[Reinforcement Learning (RL)]] | ||
| − | * [[ | + | * [[Gradient Descent Optimization & Challenges]] |
<youtube>PDbXPBwOavc</youtube> | <youtube>PDbXPBwOavc</youtube> | ||