Difference between revisions of "Deep Q Network (DQN)"

Line 11:
 
* [[Reinforcement Learning (RL)]]
 
* [http://medium.com/deep-math-machine-learning-ai/ch-12-1-model-free-reinforcement-learning-algorithms-monte-carlo-sarsa-q-learning-65267cb8d1b4 Model Free Reinforcement learning algorithms (Monte Carlo, SARSA, Q-learning) | Madhu Sanjeevi (Mady) - Medium]
 
* [[Monte Carlo]]
 
* [[Gaming]]
 
* [http://en.wikipedia.org/wiki/Q-learning Q Learning | Wikipedia]
 

Revision as of 15:50, 11 August 2019


When feedback is provided, it might be a long time after the fateful decision was made. In reality, the feedback is likely to be the result of a large number of prior decisions, taken amid a shifting, uncertain environment. Unlike supervised learning, there are no correct input/output pairs, so suboptimal actions are not explicitly corrected. Wrong actions just decrease the corresponding value in the Q-table, so there is less chance of choosing the same action should the same state be encountered again. Quora | Jaron Collis
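
The Q-table behaviour described above can be illustrated with a minimal tabular Q-learning sketch. It is not taken from the quoted source: the state and action names, the learning rate ALPHA, the discount factor GAMMA, and the helper functions q_update and epsilon_greedy are all assumptions chosen for illustration. It shows a negative reward lowering the stored value for an action, which in turn makes a greedy policy less likely to pick that action in the same state.

```python
import random
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value for illustration)
GAMMA = 0.9  # discount factor (assumed value for illustration)


def q_update(q, state, action, reward, next_state, actions):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Best value currently estimated for the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move the stored value part of the way toward the bootstrapped target.
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the best known one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical two-action problem; unseen entries default to 0.0.
actions = ["left", "right"]
q = defaultdict(float)

# A penalised action has its Q-value pushed down ...
q_update(q, "s0", "left", reward=-1.0, next_state="s1", actions=actions)
# ... while a rewarded action has its Q-value pushed up.
q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)

# With epsilon=0 the policy is purely greedy and avoids the penalised action.
choice = epsilon_greedy(q, "s0", actions, epsilon=0.0)
```

Nothing marks "left" as wrong explicitly; its value is simply lower, so it is chosen less often. With a nonzero epsilon it would still be tried occasionally, which lets the agent revise the estimate if the environment shifts.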

DeepMind trained deep neural networks to produce a novel end-to-end reinforcement learning agent, termed a deep Q-network (DQN). Human-level control through Deep Reinforcement Learning | Deepmind