Distributed Deep Reinforcement Learning (DDRL)

|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools  
 
}}
[https://www.youtube.com/results?search_query=Distributed+Deep+Reinforcement+Learning+DeepRL Youtube search...]
[https://www.google.com/search?q=Distributed+Deep+Reinforcement+Learning+DeepRL+machine+learning+ML+artificial+intelligence ...Google search]
  
* [https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/ Importance Weighted Actor-Learner Architectures: Scalable Distributed DeepRL in DMLab-30]
* [[Decentralized: Federated & Distributed]] Learning
* [[Reinforcement Learning (RL)]]
** [[Monte Carlo]] (MC) Method - Model Free Reinforcement Learning
** [[Markov Decision Process (MDP)]]
** [[State-Action-Reward-State-Action (SARSA)]]
** [[Q Learning]]
*** [[Deep Q Network (DQN)]]
** [[Deep Reinforcement Learning (DRL)]] DeepRL
** Distributed Deep Reinforcement Learning (DDRL)
** [[Evolutionary Computation / Genetic Algorithms]]
** [[Actor Critic]]
*** [[Asynchronous Advantage Actor Critic (A3C)]]
*** [[Advanced Actor Critic (A2C)]]
*** [[Lifelong Latent Actor-Critic (LILAC)]]
** [[Hierarchical Reinforcement Learning (HRL)]]
* [[Agents]] ... [[Agents#Communication | communications]]
* [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]]
  
Deep Reinforcement Learning (DeepRL) has achieved remarkable success in a range of tasks, from continuous control problems in robotics to playing games like Go and Atari. The improvements seen in these domains have so far been limited to individual tasks where a separate agent has been tuned and trained for each task.
 
  
<youtube>PYQAI6Td2wo</youtube>

Importance Weighted Actor-Learner Architecture (IMPALA) is a new, highly scalable [[Agents|agent]] architecture for distributed training. It uses a new off-policy correction algorithm called V-trace.

<youtube>-YMfJLFynmA</youtube>
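
The V-trace off-policy correction mentioned above can be illustrated with a small sketch. The code below is not DeepMind's implementation. It is a minimal pure-Python version of the V-trace value targets for a single trajectory, assuming the actors report log-probabilities of the actions they took under their (possibly stale) behaviour policy and the learner supplies log-probabilities under its current policy. The function name <code>vtrace_targets</code>, its parameter names, and the default clipping thresholds of 1.0 are assumptions made for illustration.

```python
import math


def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Return V-trace value targets for one trajectory of length T.

    behaviour_logp[t]: log mu(a_t | x_t) under the actor's behaviour policy
    target_logp[t]:    log pi(a_t | x_t) under the learner's current policy
    values[t]:         learner's estimate V(x_t)
    bootstrap_value:   learner's estimate V(x_T) for the state after the last step
    All names and defaults here are illustrative assumptions.
    """
    T = len(rewards)
    # Importance ratios pi / mu, recovered from log-probabilities.
    ratios = [math.exp(t - b) for t, b in zip(target_logp, behaviour_logp)]
    rhos = [min(rho_bar, r) for r in ratios]  # clipped weights on the TD errors
    cs = [min(c_bar, r) for r in ratios]      # clipped "trace" coefficients

    next_values = list(values[1:]) + [bootstrap_value]
    # Importance-weighted temporal-difference errors.
    deltas = [rhos[t] * (rewards[t] + gamma * next_values[t] - values[t])
              for t in range(T)]

    # Backward recursion: v_t - V(x_t) = delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1})).
    targets = [0.0] * T
    acc = 0.0
    for t in reversed(range(T)):
        acc = deltas[t] + gamma * cs[t] * acc
        targets[t] = values[t] + acc
    return targets


# On-policy data (identical log-probabilities) gives ordinary n-step returns.
on_policy = vtrace_targets([0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                           [1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                           bootstrap_value=0.0, gamma=1.0)
```

When the behaviour and target policies agree, every importance ratio is 1 and the targets reduce to n-step bootstrapped returns. When an actor's policy has drifted far from the learner's, the clipped ratios shrink the correction so the targets stay close to the learner's current value estimates.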

Latest revision as of 15:36, 16 April 2023