Deep Distributed Q Network Partial Observability
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
}}
[https://www.youtube.com/results?search_query=deep+distributed+Q+network+partial+observability Youtube search...]
[https://www.google.com/search?q=deep+distributed+Q+network+partial+observability+deep+machine+learning+ML+artificial+intelligence ...Google search]
* [[Architectures]]
* [[SMART - Multi-Task Deep Neural Networks (MT-DNN)]]
* [[Agents]]
* [https://arxiv.org/pdf/1703.06182.pdf Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability | ArXiv]
* [https://www.cs.utexas.edu/~larg/hausknecht_thesis/slides/peter_ijcai.pdf Deep Multiagent Reinforcement Learning for Partially Observable Parameterized Environments | Peter Stone]
* [https://www.ifaamas.org/Proceedings/aamas2016/pdfs/p530.pdf Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds]
* [[Monte Carlo]]
* [[Markov Decision Process (MDP)]]
* [[Reinforcement Learning (RL)]]
It is possible to design a Partially Observable [[Markov Decision Process (MDP)]] that models the decision-making processes of two [[agents]]; a multi-agent reinforcement learning procedure then computes the decision process. For example, to calculate the probability of a person's decision, you introduce a random choice of one of the two roles or [[agents]]. This construction is called the Harsanyi transformation. Two Partially Observable Markov Decision Processes are compatible if any [[policy]] for one of the [[agents]] is also a [[policy]] for the other. [https://www.amazon.com/Zahra-M.M.A.-Sadiq/e/B071HGHXBD Zahra M.M.A. Sadiq]
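The Harsanyi transformation described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not an implementation from any of the linked papers: the role prior and the role-conditioned policies below are made-up numbers, chosen only to show how uncertainty about which role an agent plays is replaced by an initial random draw of that role, after which decision probabilities are ordinary averages over roles.

```python
import random

# Assumed prior over the two roles (the Harsanyi "chance move") -- hypothetical values.
ROLE_PROBS = {"leader": 0.5, "follower": 0.5}

# Assumed role-conditioned policies: probability of choosing "cooperate" -- hypothetical values.
POLICY = {"leader": 0.8, "follower": 0.3}

def decision_probability(action="cooperate"):
    """Marginal probability of the action, averaging over the random role draw."""
    return sum(
        prob * (POLICY[role] if action == "cooperate" else 1 - POLICY[role])
        for role, prob in ROLE_PROBS.items()
    )

def sample_decision(rng=random):
    """Draw a role first (the Harsanyi move), then act with that role's policy."""
    role = rng.choices(list(ROLE_PROBS), weights=list(ROLE_PROBS.values()))[0]
    action = "cooperate" if rng.random() < POLICY[role] else "defect"
    return role, action
```

With the assumed numbers, the marginal probability of "cooperate" is 0.5 × 0.8 + 0.5 × 0.3 = 0.55; an observer who cannot see the role only ever sees this averaged behavior, which is exactly the partial-observability point made above.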
<youtube>JcJXfrT1mPI</youtube>
<youtube>8JeweuKOA1M</youtube>
<youtube>dMOUp7YzUpQ</youtube>
Latest revision as of 16:32, 16 April 2023