Deep Q Network (DQN)

From
Revision as of 00:18, 3 February 2019 by BPeat (talk | contribs)
Jump to: navigation, search

Youtube search... ...Google search

When feedback is provided, it might be long time after the fateful decision has been made. In reality, the feedback is likely to be the result of a large number of prior decisions, taken amid a shifting, uncertain environment. Unlike supervised learning, there are no correct input/output pairs, so suboptimal actions are not explicitly corrected, wrong actions just decrease the corresponding value in the Q-table, meaning there’s less chance choosing the same action should the same state be encountered again. Quora | Jaron Collis

Training deep neural networks to show that a novel end-to-end reinforcement learning agent, termed a deep Q-network (DQN) Human-level control through Deep Reinforcement Learning | Deepmind