Difference between revisions of "Proximal Policy Optimization (PPO)"

Line 48: Line 48:
 
<youtube>WxQfQW48A4A</youtube>
 
<youtube>QHAu8EWRJJ0</youtube>
 
 +
 +
= Proximal Policy Optimization with Imitation Learning (PPO-IL) =
 +
* [[Imitation Learning]]
 +
PPO-IL is a [[Reinforcement Learning (RL)]] algorithm that can be used for [[Imitation Learning]]. It learns a policy that stays close to the expert's policy while still allowing the policy to learn from its own experience.

Revision as of 07:33, 13 August 2023



Proximal Policy Optimization with Imitation Learning (PPO-IL)

  • Imitation Learning

PPO-IL is a Reinforcement Learning (RL) algorithm that can be used for Imitation Learning. It learns a policy that stays close to the expert's policy while still allowing the policy to learn from its own experience.
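
The page does not say how PPO-IL combines the two goals, so the following is only a minimal sketch under one common assumption: the standard PPO clipped surrogate (learning from the agent's own experience) is added to a behavior-cloning term (staying close to the expert). The function names, the `il_weight` trade-off parameter, and the choice of behavior cloning as the imitation term are all hypothetical and are not taken from this page.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and the data-collecting policy.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def imitation_loss(expert_logp):
    """Assumed imitation term: behavior cloning, i.e. the mean negative
    log-likelihood the current policy assigns to the expert's actions."""
    return -sum(expert_logp) / len(expert_logp)


def ppo_il_loss(new_logp, old_logp, advantages, expert_logp,
                il_weight=0.5, clip_eps=0.2):
    """Hypothetical combined loss (to be minimized): negative PPO objective
    plus a weighted imitation penalty. il_weight is an assumed parameter."""
    rl_term = -ppo_clip_objective(new_logp, old_logp, advantages, clip_eps)
    return rl_term + il_weight * imitation_loss(expert_logp)


if __name__ == "__main__":
    # Toy batch: two on-policy samples and two expert demonstrations.
    loss = ppo_il_loss(
        new_logp=[math.log(0.6), math.log(0.3)],
        old_logp=[math.log(0.5), math.log(0.4)],
        advantages=[1.0, -0.5],
        expert_logp=[math.log(0.7), math.log(0.8)],
    )
    print(f"combined PPO-IL loss (sketch): {loss:.4f}")
```

In this sketch, raising `il_weight` pulls the policy toward the expert's actions, while lowering it lets the clipped PPO term dominate, so the policy relies more on its own experience.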