Difference between revisions of "Deep Distributed Q Network Partial Observability"

From
Jump to: navigation, search
Line 13: Line 13:
 
* [http://www.ifaamas.org/Proceedings/aamas2016/pdfs/p530.pdf Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds]
 
* [http://www.ifaamas.org/Proceedings/aamas2016/pdfs/p530.pdf Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds]
 
* [[Monte Carlo]]  
 
* [[Monte Carlo]]  
 +
* [[Markov Decision Process (MDP)]]
  
It is possible to design a Partially Observable Markov Decision Process which calculates the decision making processes of two agents. A multi-agent reinforcement learning procedure calculates the decision process. For example if you want to calculate the probability for the decision of a person you introduce a random choice of one of the two roles or agents. This introduction is called the Harsanyi transformation. Two Partially Observable Markov Decision Processes are compatible if any policy for one of the agents is a policy for the other.  [http://www.amazon.com/Zahra-M.M.A.-Sadiq/e/B071HGHXBD Zahra M.M.A. Sadiq]
+
It is possible to design a Partially Observable [[Markov Decision Process (MDP)]] which calculates the decision making processes of two agents. A multi-agent reinforcement learning procedure calculates the decision process. For example if you want to calculate the probability for the decision of a person you introduce a random choice of one of the two roles or agents. This introduction is called the Harsanyi transformation. Two Partially Observable Markov Decision Processes are compatible if any policy for one of the agents is a policy for the other.  [http://www.amazon.com/Zahra-M.M.A.-Sadiq/e/B071HGHXBD Zahra M.M.A. Sadiq]
  
 
<youtube>JcJXfrT1mPI</youtube>
 
<youtube>JcJXfrT1mPI</youtube>
 
<youtube>8JeweuKOA1M</youtube>
 
<youtube>8JeweuKOA1M</youtube>
 
<youtube>dMOUp7YzUpQ</youtube>
 
<youtube>dMOUp7YzUpQ</youtube>

Revision as of 19:26, 5 July 2020

Youtube search... ...Google search

It is possible to design a Partially Observable Markov Decision Process (MDP) which calculates the decision making processes of two agents. A multi-agent reinforcement learning procedure calculates the decision process. For example if you want to calculate the probability for the decision of a person you introduce a random choice of one of the two roles or agents. This introduction is called the Harsanyi transformation. Two Partially Observable Markov Decision Processes are compatible if any policy for one of the agents is a policy for the other. Zahra M.M.A. Sadiq