Difference between revisions of "Proximal Policy Optimization (PPO)"

Line 24:
  *** [[Lifelong Latent Actor-Critic (LILAC)]]
  ** [[Hierarchical Reinforcement Learning (HRL)]]
+ * [[Assistants]] ... [[Hybrid Assistants]]  ... [[Agents]]  ... [[Negotiation]]
  * [https://www.technologyreview.com/2023/02/08/1068068/chatgpt-is-everywhere-heres-where-it-came-from/ ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review]
  ** [[Sequence to Sequence (Seq2Seq)]]

Revision as of 01:57, 12 February 2023