Difference between revisions of "Reinforcement Learning (RL)"

Revision as of 16:19, 1 September 2019

Reinforcement Learning (RL):

___________________________________________________________

Apprenticeship Learning - Inverse Reinforcement Learning (IRL)
Lifelong Learning
Dopamine Google DeepMind
Math for Intelligence
Inside Out - Curious Optimistic Reasoning
World Models
Google DeepMind AlphaGo Zero
Google’s AI picks which machine learning models will produce the best results | Kyle Wiggers - VentureBeat: “off-policy classification,” or OPC, evaluates the performance of AI-driven agents by treating evaluation as a classification problem
Deep Reinforcement Learning Hands-On: Apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more | Maxim Lapan
Reinforcement-Learning-Notebooks - A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented in Python

Reinforcement learning is somewhat similar to traditional data analysis: the algorithm discovers, through trial and error, which actions yield the greatest rewards. Three major components can be identified in reinforcement learning: the agent, the environment, and the actions. The agent is the learner or decision-maker, the environment includes everything that the agent interacts with, and the actions are what the agent can do. The agent's goal is to choose actions that maximize the expected reward over a given time, which is best achieved when the agent has a good policy to follow. Machine Learning: What it is and Why it Matters | Priyadharshini @ simplilearn
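The agent, environment, and action roles described above can be sketched as a minimal interaction loop. This is an illustrative sketch only: `CoinFlipEnv`, `AveragingAgent`, and `run` are hypothetical names that do not come from the cited article, and the biased-coin task, the exploration rate, and the seeds are assumptions chosen to keep the example small.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a biased coin."""

    def __init__(self, p_heads=0.8, seed=0):
        self.p_heads = p_heads
        self.rng = random.Random(seed)

    def step(self, action):
        # action: 0 = guess tails, 1 = guess heads.
        outcome = 1 if self.rng.random() < self.p_heads else 0
        return 1.0 if action == outcome else 0.0  # reward for a correct guess


class AveragingAgent:
    """Hypothetical agent: tracks the average reward seen for each action."""

    def __init__(self, n_actions=2, epsilon=0.1, seed=1):
        self.values = [0.0] * n_actions  # estimated reward per action
        self.counts = [0] * n_actions    # how often each action was tried
        self.epsilon = epsilon           # assumed exploration rate
        self.rng = random.Random(seed)

    def act(self):
        # The policy: usually pick the best-looking action, sometimes explore.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental update of the running average for the chosen action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=2000):
    env, agent = CoinFlipEnv(), AveragingAgent()
    total = 0.0
    for _ in range(steps):
        action = agent.act()         # agent chooses an action
        reward = env.step(action)    # environment responds with a reward
        agent.learn(action, reward)  # agent improves its estimates
        total += reward
    return agent, total / steps
```

Calling `run()` returns the trained agent and its average reward per step. Because the coin is biased toward heads, the agent's estimate for action 1 should end up higher than its estimate for action 0.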

Q Learning Algorithm and Agent - Reinforcement Learning w/ Python Tutorial | Sentdex - Harrison

Reinforcement Learning | Phil Tabor

Code | GitHub

Reinforcement learning is an area of machine learning that involves taking the right action to maximize reward in a particular situation. In this full tutorial course, you will get a solid foundation in reinforcement learning core topics.

The course covers Q learning, State-Action-Reward-State-Action (SARSA), double Q learning, Deep Q Learning (DQN), and Policy Gradient (PG) methods. These algorithms are employed in a number of environments from the OpenAI Gym, including Space Invaders, Breakout, and others. The deep learning portion uses TensorFlow and PyTorch.
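As a rough illustration of how two of the listed tabular methods differ, the sketch below shows the textbook update rules for Q learning and SARSA. It is not taken from the course code: the function names, the dictionary-based Q-table, and the default `alpha` (learning rate) and `gamma` (discount factor) values are assumptions made for this example.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually taken next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Hypothetical two-state example with made-up values.
Q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
Q[("s1", "left")] = 1.0
Q[("s1", "right")] = 2.0

# Q learning uses the max over next actions (2.0), so Q[("s0", "right")] becomes 1.0.
q_learning_update(Q, "s0", "right", 0.0, "s1", ["left", "right"], alpha=0.5, gamma=1.0)

# SARSA uses the value of the chosen next action "left" (1.0), so Q[("s0", "left")] becomes 0.5.
sarsa_update(Q, "s0", "left", 0.0, "s1", "left", alpha=0.5, gamma=1.0)
```

The only difference between the two functions is the bootstrap term: Q learning looks at the greedy next action regardless of what the agent does, while SARSA uses the next action the agent actually selects.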

The course begins with more modern algorithms, such as Deep Q Learning (DQN) and Policy Gradient (PG) methods, and demonstrates the power of reinforcement learning.

Then the course teaches some of the fundamental concepts that power all reinforcement learning algorithms. These are illustrated by coding up some algorithms that predate deep learning, but are still foundational to the cutting edge. These are studied in some of the more traditional environments from the OpenAI Gym, like the cart pole problem.

⌨️ (00:00:00) Introduction

⌨️ (00:01:30) Intro to Deep Q Learning

⌨️ (00:08:56) How to Code Deep Q Learning in Tensorflow

⌨️ (00:52:03) Deep Q Learning with Pytorch Part 1: The Q Network

⌨️ (01:06:21) Deep Q Learning with Pytorch part 2: Coding the Agent

⌨️ (01:28:54) Deep Q Learning with Pytorch part 3: Coding the main loop

⌨️ (01:46:39) Intro to Policy Gradients

⌨️ (01:55:01) How to Beat Lunar Lander with Policy Gradients

⌨️ (02:21:32) How to Beat Space Invaders with Policy Gradients

⌨️ (02:34:41) How to Create Your Own Reinforcement Learning Environment Part 1

⌨️ (02:55:39) How to Create Your Own Reinforcement Learning Environment Part 2

⌨️ (03:08:20) Fundamentals of Reinforcement Learning

⌨️ (03:17:09) Markov Decision Processes

⌨️ (03:23:02) The Explore Exploit Dilemma

⌨️ (03:29:19) Reinforcement Learning in the Open AI Gym: SARSA

⌨️ (03:39:56) Reinforcement Learning in the Open AI Gym: Double Q Learning

⌨️ (03:54:07) Conclusion

Jump Start

Lunar Lander: Deep Q learning is Easy in PyTorch

Lunar Lander: How to Beat Lunar Lander with Policy Gradients | Tensorflow Tutorial

Breakout: How to Code Deep Q Learning in Tensorflow (Tutorial)

Gridworld: How To Create Your Own Reinforcement Learning Environments

Designing Your Own Open Ai Gym Compatible Reinforcement Learning Environment | NEURALNET.AI
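For a sense of what a custom environment involves, here is a minimal sketch of a grid-style environment that follows the classic Gym convention of `reset()` returning an observation and `step(action)` returning `(observation, reward, done, info)`. The `GridWorld` class, its one-dimensional corridor layout, and its reward values are hypothetical and are not taken from the linked tutorials. The sketch deliberately avoids importing `gym`, and newer Gymnasium releases return five values from `step`, so it would need adapting there.

```python
class GridWorld:
    """Hypothetical 1-D corridor: start at cell 0, goal at the last cell."""

    LEFT, RIGHT = 0, 1

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        # Put the agent back at the start and return the first observation.
        self.pos = 0
        return self.pos

    def step(self, action):
        # Move one cell, clamped so the agent cannot leave the corridor.
        move = 1 if action == self.RIGHT else -1
        self.pos = min(max(self.pos + move, 0), self.size - 1)
        done = self.pos == self.size - 1
        reward = 0.0 if done else -1.0  # assumed penalty of -1 per non-terminal step
        return self.pos, reward, done, {}


# Usage: walk right until the episode ends.
env = GridWorld(size=5)
obs = env.reset()
done, steps, total = False, 0, 0.0
while not done:
    obs, reward, done, info = env.step(GridWorld.RIGHT)
    steps += 1
    total += reward
# steps is now 4 and total is -3.0
```

The per-step penalty is one possible design choice: it encourages an agent to reach the goal in as few moves as possible.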

@@ Line 20: / Line 20: @@
 ** [[Asynchronous Advantage Actor Critic (A3C)]]
 ** [[Hierarchical Reinforcement Learning (HRL)]]
-*** [[HIerarchical Reinforcement learning with Off-policy correction(HIRO)]]
+*** [[HIerarchical Reinforcement learning with Off-policy correction (HIRO)]]
 ** [[MERLIN]]
