Deep Reinforcement Learning (DRL)

 
Youtube search... [http://www.google.com/search?q=reinforcement+machine+learning+ML+artificial+intelligence ...Google search]

* [[Reinforcement Learning (RL)]]
* [[OpenAI Gym]]
** [[Monte Carlo]] (MC) Method - Model Free Reinforcement Learning
** [[Markov Decision Process (MDP)]]
** [[Q Learning]]
** [[State-Action-Reward-State-Action (SARSA)]]
** [[Deep Reinforcement Learning (DRL)]] DeepRL
*** [[IMPALA (Importance Weighted Actor-Learner Architecture)]]
** [[Distributed Deep Reinforcement Learning (DDRL)]]
** [[Deep Q Network (DQN)]]
** [[Evolutionary Computation / Genetic Algorithms]]
** [[Asynchronous Advantage Actor Critic (A3C)]]
** [[Hierarchical Reinforcement Learning (HRL)]]
*** [[HIerarchical Reinforcement learning with Off-policy correction(HIRO)|Hierarchical Reinforcement Learning with Off-policy correction (HIRO)]]
** [[MERLIN]]
  
 
==== OTHER: Policy Gradient Methods ====


_______________________________________________________________________________________


[Figure: Reinforcement learning diagram]

Goal-oriented algorithms learn how to attain a complex objective (goal), or how to maximize along a particular dimension over many steps; for example, maximizing the points won in a game over many moves. Reinforcement learning solves the difficult problem of correlating immediate actions with the delayed returns they produce. Like humans, reinforcement learning algorithms sometimes have to wait a while to see the fruit of their decisions. They operate in a delayed-return environment, where it can be difficult to understand which action leads to which outcome over many time steps.
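
To make the delayed-return problem concrete, here is a minimal tabular Q-learning sketch. Everything in it is a hypothetical illustration rather than anything from this page: the corridor environment, the <code>step()</code> helper, and the constants are all invented for the example. The only reward sits at the far end of a corridor, and the bootstrapped update is what gradually propagates credit for that reward back to the early actions that led to it.

<syntaxhighlight lang="python">
import random

# All names and constants here are invented for this illustration.
N_STATES = 10               # corridor cells 0..9; cell 9 is the goal
ACTIONS = [-1, +1]          # step left, step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q-table: one row per state, one value per action, initialized to zero.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Toy environment: move, clip to the corridor, reward only at the goal."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy with random tie-breaking, so untrained states explore.
        if random.random() < EPSILON or Q[state][0] == Q[state][1]:
            a = random.randrange(len(ACTIONS))
        else:
            a = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, ACTIONS[a])
        # Q-learning update: the delayed reward enters the target only at the
        # goal, then flows backward via max(Q[nxt]) over later episodes.
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][a] += ALPHA * (target - Q[state][a])
        state = nxt

# Early states now prefer "step right" even though their immediate reward was
# always zero: credit has been assigned across many time steps.
print([round(max(q), 2) for q in Q])
</syntaxhighlight>

After training, the printed state values decay roughly geometrically (about GAMMA raised to the distance from the goal) as you move away from the goal, which is exactly the "which action leads to which outcome" signal described above.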