Difference between revisions of "Apprenticeship Learning - Inverse Reinforcement Learning (IRL)"
m |
m |
||
| Line 9: | Line 9: | ||
* [[Learning Techniques]] | * [[Learning Techniques]] | ||
| − | ** [[Reinforcement Learning]] | + | ** [[Reinforcement Learning (RL)]] |
** [[Imitation Learning]] | ** [[Imitation Learning]] | ||
| − | * [[Inside Out - Curious Optimistic Reasoning]] | + | * [[Singularity]] ... [[Artificial Consciousness / Sentience|Sentience]] ... [[Artificial General Intelligence (AGI)| AGI]] ... [[Inside Out - Curious Optimistic Reasoning| Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ... [[Algorithm Administration#Automated Learning|Automated Learning]] |
| − | * [[Generative Adversarial Network (GAN)]] | + | * [[Attention]] Mechanism ... [[Transformer]] ... [[Generative Pre-trained Transformer (GPT)]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]] |
| − | * [[Connecting Brains]] | + | * [[Symbiotic Intelligence]] ... [[Bio-inspired Computing]] ... [[Neuroscience]] ... [[Connecting Brains]] ... [[Nanobots#Brain Interface using AI and Nanobots|Nanobots]] ... [[Molecular Artificial Intelligence (AI)|Molecular]] ... [[Neuromorphic Computing|Neuromorphic]] ... [[Evolutionary Computation / Genetic Algorithms| Evolutionary/Genetic]] |
* [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]] | * [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]] | ||
* [[Generative AI]] ... [[Conversational AI]] ... [[OpenAI]]'s [[ChatGPT]] ... [[Perplexity]] ... [[Microsoft]]'s [[Bing]] ... [[You]] ...[[Google]]'s [[Bard]] ... [[Baidu]]'s [[Ernie]] | * [[Generative AI]] ... [[Conversational AI]] ... [[OpenAI]]'s [[ChatGPT]] ... [[Perplexity]] ... [[Microsoft]]'s [[Bing]] ... [[You]] ...[[Google]]'s [[Bard]] ... [[Baidu]]'s [[Ernie]] | ||
Revision as of 07:00, 13 July 2023
YouTube search... ...Google search
- Learning Techniques
- Singularity ... Sentience ... AGI ... Curious Reasoning ... Emergence ... Moonshots ... Explainable AI ... Automated Learning
- Attention Mechanism ... Transformer ... Generative Pre-trained Transformer (GPT) ... GAN ... BERT
- Symbiotic Intelligence ... Bio-inspired Computing ... Neuroscience ... Connecting Brains ... Nanobots ... Molecular ... Neuromorphic ... Evolutionary/Genetic
- Policy ... Policy vs Plan ... Constitutional AI ... Trust Region Policy Optimization (TRPO) ... Policy Gradient (PG) ... Proximal Policy Optimization (PPO)
- Generative AI ... Conversational AI ... OpenAI's ChatGPT ... Perplexity ... Microsoft's Bing ... You ...Google's Bard ... Baidu's Ernie
- A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress | Saurabh Arora, Prashant Doshi 18 Jun 2018
- Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Daniel S. Brown, Scott Niekum 23 Jun 2018
- Guide to MBIRL – Model Based Inverse Reinforcement Learning | Aishwarya Verma
Inverse reinforcement learning (IRL) infers/derives a reward function from observed behavior/demonstrations, allowing for policy improvement and generalization. While ordinary "reinforcement learning" involves using rewards and punishments to learn behavior, in IRL the direction is reversed, and a robot observes a person's behavior to figure out what goal that behavior seems to be trying to achieve.