Revision as of 07:33, 13 August 2023


Policy ... Policy vs Plan ... Constitutional AI ... Trust Region Policy Optimization (TRPO) ... Policy Gradient (PG) ... Proximal Policy Optimization (PPO)
Proximal policy optimization algorithms | J. Schulman, F. Wolski, P. Dhariwal, A. Radford & O. Klimov 2017
Deep Reinforcement Learning (DRL)
Reinforcement Learning (RL):
- Monte Carlo (MC) Method - Model Free Reinforcement Learning
- Markov Decision Process (MDP)
- Q Learning
- State-Action-Reward-State-Action (SARSA)
- Deep Reinforcement Learning (DRL) DeepRL
- Distributed Deep Reinforcement Learning (DDRL)
- Deep Q Network (DQN)
- Symbiotic Intelligence ... Bio-inspired Computing ... Neuroscience ... Connecting Brains ... Nanobots ... Molecular ... Neuromorphic ... Evolutionary/Genetic
- Actor Critic
- Hierarchical Reinforcement Learning (HRL)
Generative AI ... Conversational AI ... ChatGPT | OpenAI ... Bing | Microsoft ... Bard | Google ... Claude | Anthropic ... Perplexity ... You ... Ernie | Baidu
Assistants ... Personal Companions ... Agents ... Negotiation ... LangChain
Large Language Model (LLM) ... Natural Language Processing (NLP) ... Generation ... Classification ... Understanding ... Translation ... Tools & Services

Proximal Policy Optimization with Imitation Learning (PPO-IL)

- Imitation Learning

PPO-IL is a Reinforcement Learning (RL) algorithm that can be used for Imitation Learning. It learns a policy that stays close to the expert's policy, while still allowing the policy to learn from its own experience.
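The page does not spell out how PPO-IL combines the two goals. One plausible reading, offered here only as a sketch, is to add an imitation term (the negative log-likelihood of the expert's actions under the current policy) to the standard PPO clipped surrogate loss. The function names, the `il_weight` mixing coefficient, and the choice of a behavior-cloning style imitation term are all assumptions made for illustration; they are not taken from the page or from a specific PPO-IL implementation.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and the policy that collected the data.
    """
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the pessimistic (minimum) surrogate, so negate for a loss.
    return -np.mean(np.minimum(unclipped, clipped))


def imitation_loss(logp_expert_actions):
    """Assumed imitation term: negative log-likelihood of expert actions."""
    return -np.mean(logp_expert_actions)


def ppo_il_loss(new_logp, old_logp, advantages, logp_expert_actions,
                clip_eps=0.2, il_weight=0.5):
    """Hypothetical combined objective: PPO loss plus weighted imitation loss.

    il_weight is an illustrative hyperparameter: larger values pull the
    policy toward the expert, smaller values favor the agent's own experience.
    """
    rl_term = ppo_clip_loss(new_logp, old_logp, advantages, clip_eps)
    il_term = imitation_loss(logp_expert_actions)
    return rl_term + il_weight * il_term


if __name__ == "__main__":
    # Toy batch: three on-policy samples and two expert demonstrations.
    old_logp = np.log(np.array([0.5, 0.4, 0.2]))
    new_logp = np.log(np.array([0.6, 0.3, 0.25]))
    advantages = np.array([1.0, -1.0, 2.0])
    logp_expert = np.log(np.array([0.5, 0.25]))
    print(ppo_il_loss(new_logp, old_logp, advantages, logp_expert))
```

In this sketch, setting `il_weight` to 0 recovers plain PPO, while a large value reduces the update to behavior cloning; in practice the weight might be decayed over training so the agent relies more on its own experience over time, though that schedule is also an assumption.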


Difference between revisions of "Proximal Policy Optimization (PPO)"
