Difference between revisions of "Apprenticeship Learning - Inverse Reinforcement Learning (IRL)"

* [[Learning Techniques]]
** [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]]
** [[Imitation Learning (IL)]]
* [[Artificial General Intelligence (AGI) to Singularity]] ... [[Inside Out - Curious Optimistic Reasoning| Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ... [[Algorithm Administration#Automated Learning|Automated Learning]]
* [[Attention]] Mechanism  ... [[Transformer]] ... [[Generative Pre-trained Transformer (GPT)]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
* [[Symbiotic Intelligence]] ... [[Bio-inspired Computing]] ... [[Neuroscience]] ... [[Connecting Brains]] ... [[Nanobots#Brain Interface using AI and Nanobots|Nanobots]] ... [[Molecular Artificial Intelligence (AI)|Molecular]] ... [[Neuromorphic Computing|Neuromorphic]] ... [[Evolutionary Computation / Genetic Algorithms| Evolutionary/Genetic]]
* [[Policy]]  ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]]
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Grok]] | [https://x.ai/ xAI] ... [[Groq]] ... [[Ernie]] | [[Baidu]]
* [https://arxiv.org/pdf/1806.06877.pdf A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress | Saurabh Arora, Prashant Doshi] 18 Jun 2018
* [https://arxiv.org/pdf/1805.07687.pdf Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Daniel S. Brown, Scott Niekum] 23 Jun 2018

Latest revision as of 21:07, 9 April 2024

Inverse reinforcement learning (IRL) infers a reward function from observed behavior (demonstrations). A policy optimized against the recovered reward can then improve on the demonstrations and generalize beyond them. Whereas ordinary reinforcement learning uses rewards and punishments to learn behavior, IRL reverses the direction: an agent such as a robot observes a person's behavior and works out what goal that behavior appears to be pursuing.
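
The reversal described above can be illustrated with a minimal sketch. It is not a standard IRL algorithm such as maximum-entropy IRL; it is a brute-force search over a small set of candidate rewards, chosen only because it is easy to follow. Everything in it is an assumption made for illustration: the five-state line world, the three actions, the discount factor, the one-goal-per-candidate reward set, and the function names `q_values`, `consistency`, and `infer_goal`. Forward RL (value iteration) is run once per candidate reward, and the candidate under which the demonstrated actions are most often optimal is reported as the inferred goal.

```python
import numpy as np

N_STATES = 5            # states 0..4 on a line (hypothetical toy world)
ACTIONS = (-1, 0, +1)   # move left, stay, move right
GAMMA = 0.9             # discount factor (assumed)

def step(state, action):
    """Deterministic transition; the ends of the line act as walls."""
    return min(max(state + action, 0), N_STATES - 1)

def q_values(reward, n_iters=200):
    """Forward RL: value iteration for a reward paid on entering each state."""
    values = np.zeros(N_STATES)
    for _ in range(n_iters):
        q = np.array([[reward[step(s, a)] + GAMMA * values[step(s, a)]
                       for a in ACTIONS] for s in range(N_STATES)])
        values = q.max(axis=1)
    return q

def consistency(reward, demos, tol=1e-9):
    """Fraction of demonstrated (state, action) pairs that are optimal under `reward`."""
    q = q_values(reward)
    hits = [q[s, ACTIONS.index(a)] >= q[s].max() - tol for s, a in demos]
    return float(np.mean(hits))

def infer_goal(demos):
    """Inverse direction: pick the candidate goal whose reward best explains the demos."""
    candidates = np.eye(N_STATES)   # candidate g pays reward 1 in state g only
    scores = [consistency(r, demos) for r in candidates]
    return int(np.argmax(scores)), scores

# Hypothetical demonstration: the expert walks right, then stays at state 4.
demos = [(0, +1), (1, +1), (2, +1), (3, +1), (4, 0)]
goal, scores = infer_goal(demos)
print("inferred goal state:", goal)
print("consistency per candidate goal:", scores)
```

In this toy setting only the candidate that rewards state 4 makes every demonstrated action optimal, so state 4 is returned as the inferred goal. Practical IRL methods replace the enumeration of candidates with a parameterized reward and an optimization procedure, and they must also deal with the fact that many different rewards can explain the same behavior.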