Actor Critic
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
}}
[http://www.youtube.com/results?search_query=Asynchronous+Advantage+Actor+Critic+Reinforcement+Machine+Learning YouTube search...]
[http://www.google.com/search?q=Asynchronous+Advantage+Actor+Critic+Reinforcement+machine+learning+ML+artificial+intelligence ...Google search]
* [[Reinforcement Learning (RL)]]:
** [[Monte Carlo (MC) Method - Model Free Reinforcement Learning]]
** [[Markov Decision Process (MDP)]]
** [[Q Learning]]
** [[State-Action-Reward-State-Action (SARSA)]]
** [[Deep Reinforcement Learning (DRL)]] DeepRL
** [[Distributed Deep Reinforcement Learning (DDRL)]]
** [[Deep Q Network (DQN)]]
** [[Evolutionary Computation / Genetic Algorithms]]
** Actor Critic
*** [[Advanced Actor Critic (A2C)]]
*** [[Asynchronous Advantage Actor Critic (A3C)]]
*** [[Lifelong Latent Actor-Critic (LILAC)]]
** [[Hierarchical Reinforcement Learning (HRL)]]
* [http://towardsdatascience.com/advanced-reinforcement-learning-6d769f529eb3 Beyond DQN/A3C: A Survey in Advanced Reinforcement Learning | Joyce Xu - Towards Data Science]
* [[Policy Gradient (PG)]]
Policy gradients and [[Deep Q Network (DQN)]] can only get us so far, but what if we used two networks to help train an AI instead of one? That's the idea behind actor critic algorithms: an actor network chooses actions, while a critic network estimates how good those actions are and steers the actor's updates.
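The two-network idea above can be sketched in a few lines. This is a minimal, hedged illustration, not any specific library's API: a tabular actor (softmax over action preferences) and a tabular critic (state values) trained on a toy two-state environment invented here for demonstration. The critic's TD error serves double duty, updating the critic's value estimates and scaling the actor's policy-gradient step.

```python
import numpy as np

# Toy two-state, two-action MDP (hypothetical, for illustration only):
# taking action 1 in state 0 moves to state 1 and pays reward +1;
# every other (state, action) pair pays 0 and returns to state 0.
def step(state, action):
    if state == 0 and action == 1:
        return 1, 1.0
    return 0, 0.0

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

np.random.seed(0)
N_STATES, N_ACTIONS = 2, 2
prefs = np.zeros((N_STATES, N_ACTIONS))   # actor: action preferences per state
values = np.zeros(N_STATES)               # critic: estimated V(s)
alpha_actor, alpha_critic, gamma = 0.1, 0.2, 0.9

state = 0
for _ in range(5000):
    pi = softmax(prefs[state])
    action = np.random.choice(N_ACTIONS, p=pi)
    next_state, reward = step(state, action)

    # Critic update: one-step TD error, then move V(s) toward the target.
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha_critic * td_error

    # Actor update: gradient of log pi(a|s) for a softmax policy is
    # (one_hot(a) - pi); the critic's TD error scales the step.
    grad = -pi
    grad[action] += 1.0
    prefs[state] += alpha_actor * td_error * grad

    state = next_state

print(np.argmax(prefs[0]))  # greedy action the actor learned in state 0
```

With enough steps the actor comes to prefer the rewarding action in state 0, while the critic's value estimates reflect the discounted return. Deep actor critic methods such as [[Asynchronous Advantage Actor Critic (A3C)]] replace both tables with neural networks but keep this same update structure.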
<youtube>2vJtbAha3To</youtube>
<youtube>CLZkpo8rEG</youtube>
<youtube>aODdNpihRwM</youtube>
Revision as of 11:33, 3 July 2020