Difference between revisions of "Proximal Policy Optimization (PPO)"

From
Jump to: navigation, search
m
m
Line 13: Line 13:
  
 
* [https://arxiv.org/abs/1707.06347 Proximal policy optimization algorithms | J. Schulman, F. Wolski, P. Dhariwal, A. Radford & O. Klimov  2017]  
 
* [https://arxiv.org/abs/1707.06347 Proximal policy optimization algorithms | J. Schulman, F. Wolski, P. Dhariwal, A. Radford & O. Klimov  2017]  
 +
* [[Generative AI]]  ... [[OpenAI]]'s [[ChatGPT]] ... [[Perplexity]]  ... [[Microsoft]]'s [[BingAI]] ... [[You]] ...[[Google]]'s [[Bard]]
 
* [[Deep Reinforcement Learning (DRL)]]
 
* [[Deep Reinforcement Learning (DRL)]]
 
* [[Policy Gradient (PG)]]
 
* [[Policy Gradient (PG)]]
Line 31: Line 32:
 
* [[Assistants]] ... [[Hybrid Assistants]]  ... [[Agents]]  ... [[Negotiation]]
 
* [[Assistants]] ... [[Hybrid Assistants]]  ... [[Agents]]  ... [[Negotiation]]
 
* [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]]  ...[[Large Language Model (LLM)|LLM]]  ...[[Natural Language Tools & Services|Tools & Services]]
 
* [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]]  ...[[Large Language Model (LLM)|LLM]]  ...[[Natural Language Tools & Services|Tools & Services]]
* [[Generative AI]]  ... [[OpenAI]]'s [[ChatGPT]] ... [[Perplexity]]  ... [[Microsoft]]'s [[BingAI]] ... [[You]] ...[[Google]]'s [[Bard]]
 
  
  

Revision as of 14:30, 19 March 2023