Difference between revisions of "Reinforcement Learning (RL)"

* [[World Models]]
 
* [[Google DeepMind AlphaGo Zero]]
 
* [https://venturebeat.com/2019/06/19/googles-ai-picks-which-machine-learning-models-will-produce-the-best-results/ Google’s AI picks which machine learning models will produce the best results | Kyle Wiggers - VentureBeat] - “off-policy classification,” or OPC, evaluates the performance of AI-driven agents by treating evaluation as a classification problem
 
* [http://www.amazon.com/Deep-Reinforcement-Learning-Hands-Q-networks/dp/1788834240 Deep Reinforcement Learning Hands-On: Apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more | Maxim Lapan]
 
* [http://github.com/Pulkit-Khandelwal/Reinforcement-Learning-Notebooks Reinforcement-Learning-Notebooks] - A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented in Python  
 

Revision as of 20:16, 19 June 2019


___________________________________________________________

Reinforcement learning loosely resembles traditional data analysis: the algorithm discovers through trial and error which actions yield the greatest rewards. Three major components can be identified: the agent, the environment, and the actions. The agent is the learner or decision-maker, the environment includes everything the agent interacts with, and the actions are what the agent can do. The agent's goal is to choose actions that maximize the expected reward over a given period of time. It does this best when it follows a good policy, that is, a rule for choosing an action in each situation. Source: Machine Learning: What it is and Why it Matters | Priyadharshini @ simplilearn
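The agent, environment, and action roles described above can be illustrated with a small sketch. This is only one possible toy setup and is not taken from the cited article: the three-action "bandit" environment, its reward probabilities, the class names `BanditEnvironment` and `EpsilonGreedyAgent`, and the epsilon-greedy policy are all assumptions chosen for illustration.

```python
import random


class BanditEnvironment:
    """Hypothetical environment: each action pays reward 1 with a fixed probability."""

    def __init__(self, reward_probs, rng):
        self.reward_probs = reward_probs
        self.rng = rng

    def step(self, action):
        # Return 1.0 with the chosen action's probability, otherwise 0.0.
        return 1.0 if self.rng.random() < self.reward_probs[action] else 0.0


class EpsilonGreedyAgent:
    """Hypothetical learner that keeps a running average reward per action."""

    def __init__(self, n_actions, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions

    def choose_action(self):
        # Policy: explore at random with probability epsilon, otherwise
        # exploit the action with the highest estimated reward so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental update of the average reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=5000, seed=0):
    # Assumed reward probabilities; action 2 is the best choice by design.
    rng = random.Random(seed)
    env = BanditEnvironment([0.1, 0.5, 0.9], rng)
    agent = EpsilonGreedyAgent(n_actions=3, epsilon=0.1, rng=rng)
    total_reward = 0.0
    for _ in range(steps):
        action = agent.choose_action()   # the agent acts
        reward = env.step(action)        # the environment responds with a reward
        agent.learn(action, reward)      # the agent learns by trial and error
        total_reward += reward
    return agent, total_reward / steps


agent, average_reward = run()
best_action = max(range(3), key=lambda a: agent.values[a])
print("estimated values:", [round(v, 2) for v in agent.values])
print("best action:", best_action, "average reward:", round(average_reward, 2))
```

In this sketch the policy is the rule inside `choose_action`. A small amount of random exploration lets the agent find the most rewarding action instead of settling on the first one that happens to pay off.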

Machine_Learning_5.jpg