Difference between revisions of "Proximal Policy Optimization (PPO)"
| Line 9: | Line 9: |
| * [[Deep Reinforcement Learning (DRL)]] | * [[Deep Reinforcement Learning (DRL)]] |
| | + * [[Policy Gradient (PG)]] |
| <youtube>5P7I-xPq8u8</youtube> | <youtube>5P7I-xPq8u8</youtube> |