Difference between revisions of "Deep Distributed Q Network Partial Observability"

Revision as of 07:52, 6 July 2020

It is possible to design a Partially Observable Markov Decision Process (POMDP) that models the decision-making processes of two agents, with a multi-agent reinforcement learning procedure computing the decision process. For example, to calculate the probability of a person's decision, you introduce a random choice of one of the two roles, or agents. This introduction is called the Harsanyi transformation. Two POMDPs are compatible if any policy for one of the agents is also a policy for the other. Zahra M.M.A. Sadiq
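The random choice of role described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the role names, the prior over roles, and the per-role action probabilities are all hypothetical values chosen for the example and are not taken from the article or the linked papers. The sketch draws a role as a chance move and then computes the probability of a decision by averaging over the hidden role.

```python
import random

# Hypothetical prior over the two roles (an assumption for illustration).
ROLE_PRIOR = {"role_a": 0.5, "role_b": 0.5}

# Hypothetical per-role policies: probability of each action given the role.
POLICIES = {
    "role_a": {"cooperate": 0.8, "defect": 0.2},
    "role_b": {"cooperate": 0.3, "defect": 0.7},
}


def sample_role(rng):
    """Chance move: draw one of the two roles according to the prior."""
    roles = list(ROLE_PRIOR)
    weights = [ROLE_PRIOR[r] for r in roles]
    return rng.choices(roles, weights=weights, k=1)[0]


def action_probability(action):
    """Marginal probability of an action, averaged over the hidden role."""
    return sum(ROLE_PRIOR[r] * POLICIES[r][action] for r in ROLE_PRIOR)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same role
    print("sampled role:", sample_role(rng))
    print("P(cooperate):", action_probability("cooperate"))
```

With these assumed numbers the probability of "cooperate" is the prior-weighted average of the two roles' policies, about 0.55. An observer who cannot see which role was drawn can only reason about this average, which is why the setting is partially observable.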

@@ Line 9: / Line 9: @@
 * [[Architectures]]
+* [[SMART - Multi-Task Deep Neural Networks (MT-DNN)]]
 * [http://arxiv.org/pdf/1703.06182.pdf Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability | ArXiv]
 * [http://www.cs.utexas.edu/~larg/hausknecht_thesis/slides/peter_ijcai.pdf Deep Multiagent Reinforcement Learning for Partially Observable Parameterized Environments | Peter Stone]
