Difference between revisions of "Apprenticeship Learning - Inverse Reinforcement Learning (IRL)"
m (Text replacement - "* Conversational AI ... ChatGPT | OpenAI ... Bing | Microsoft ... Bard | Google ... Claude | Anthropic ... Perplexity ... You ... Ernie | Baidu" to "* Conversational AI ... [[C...) |
m |
||
| Line 15: | Line 15: | ||
* [[Symbiotic Intelligence]] ... [[Bio-inspired Computing]] ... [[Neuroscience]] ... [[Connecting Brains]] ... [[Nanobots#Brain Interface using AI and Nanobots|Nanobots]] ... [[Molecular Artificial Intelligence (AI)|Molecular]] ... [[Neuromorphic Computing|Neuromorphic]] ... [[Evolutionary Computation / Genetic Algorithms| Evolutionary/Genetic]] | * [[Symbiotic Intelligence]] ... [[Bio-inspired Computing]] ... [[Neuroscience]] ... [[Connecting Brains]] ... [[Nanobots#Brain Interface using AI and Nanobots|Nanobots]] ... [[Molecular Artificial Intelligence (AI)|Molecular]] ... [[Neuromorphic Computing|Neuromorphic]] ... [[Evolutionary Computation / Genetic Algorithms| Evolutionary/Genetic]] | ||
* [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]] | * [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]] | ||
| − | * [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Ernie]] | [[Baidu]] | + | * [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Grok]] | [https://x.ai/ xAI] ... [[Groq]] ... [[Ernie]] | [[Baidu]] |
* [https://arxiv.org/pdf/1806.06877.pdf A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress | Saurabh Arora, Prashant Doshi] 18 Jun 2018 | * [https://arxiv.org/pdf/1806.06877.pdf A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress | Saurabh Arora, Prashant Doshi] 18 Jun 2018 | ||
* [https://arxiv.org/pdf/1805.07687.pdf Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Daniel S. Brown, Scott Niekum] 23 Jun 2018 | * [https://arxiv.org/pdf/1805.07687.pdf Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Daniel S. Brown, Scott Niekum] 23 Jun 2018 | ||
Latest revision as of 21:07, 9 April 2024
YouTube search... ...Google search
- Learning Techniques
- Artificial General Intelligence (AGI) to Singularity ... Curious Reasoning ... Emergence ... Moonshots ... Explainable AI ... Automated Learning
- Attention Mechanism ... Transformer ... Generative Pre-trained Transformer (GPT) ... GAN ... BERT
- Symbiotic Intelligence ... Bio-inspired Computing ... Neuroscience ... Connecting Brains ... Nanobots ... Molecular ... Neuromorphic ... Evolutionary/Genetic
- Policy ... Policy vs Plan ... Constitutional AI ... Trust Region Policy Optimization (TRPO) ... Policy Gradient (PG) ... Proximal Policy Optimization (PPO)
- Conversational AI ... ChatGPT | OpenAI ... Bing/Copilot | Microsoft ... Gemini | Google ... Claude | Anthropic ... Perplexity ... You ... phind ... Grok | xAI ... Groq ... Ernie | Baidu
- A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress | Saurabh Arora, Prashant Doshi 18 Jun 2018
- Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Daniel S. Brown, Scott Niekum 23 Jun 2018
- Guide to MBIRL – Model Based Inverse Reinforcement Learning | Aishwarya Verma
Inverse reinforcement learning (IRL) infers/derives a reward function from observed behavior/demonstrations, allowing for policy improvement and generalization. While ordinary "reinforcement learning" involves using rewards and punishments to learn behavior, in IRL the direction is reversed, and a robot observes a person's behavior to figure out what goal that behavior seems to be trying to achieve.