Difference between revisions of "Policy vs Plan"
Revision as of 07:48, 6 July 2020
- Policy Gradient (PG)
- Trust Region Policy Optimization (TRPO)
- Proximal Policy Optimization (PPO)
- Reinforcement Learning (RL)
Compare:
- policy is defined by a set of "state -> action" pairs, which should give an action from any reachable state.
- plan is a strictly defined sequence of actions leading from the initial state to the goal (it can be more complex than that if you have concurrency, but this is still the basic idea).
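The contrast above can be sketched in a few lines of Python. This is only an illustration under assumed details: the five-cell corridor, the goal cell, and the helper names `step`, `run_plan`, and `run_policy` are all hypothetical and do not come from the sources linked on this page.

```python
# Hypothetical toy world: a corridor of 5 cells (states 0..4) with the goal at cell 2.
GOAL = 2
N_STATES = 5

def step(state, action):
    """Deterministic transition; moves are clamped at the corridor walls."""
    delta = {"left": -1, "right": 1}[action]
    return min(max(state + delta, 0), N_STATES - 1)

def run_plan(start, plan):
    """Execute a fixed action sequence, ignoring which state we are actually in."""
    state = start
    for action in plan:
        state = step(state, action)
    return state

def run_policy(start, policy, max_steps=10):
    """Look up the action for the current state until the goal is reached."""
    state = start
    for _ in range(max_steps):
        if state == GOAL:
            break
        state = step(state, policy[state])
    return state

# A plan: one action sequence, valid only from the initial state 0.
plan = ["right", "right"]

# A policy: a "state -> action" pair for every non-goal reachable state.
policy = {0: "right", 1: "right", 3: "left", 4: "left"}

print(run_plan(0, plan))      # 2: the plan works from its intended start
print(run_plan(4, plan))      # 4: the same plan fails from a different start
print(run_policy(0, policy))  # 2
print(run_policy(4, policy))  # 2: the policy still reaches the goal
```

The plan only makes sense from the state it was computed for, while the policy keeps working when the agent starts (or is pushed) somewhere else.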
Policy vs Strategy
In a reinforcement learning context,
- policy is a description of how an agent behaves in an environment, and is represented as the probability of performing each action for a given state.
- strategy - I don’t think strategy has a definition specific to the field, but I think it refers to some underlying mechanism/algorithm that drives an agent’s behavior. This can range from things like whether or not it does any planning, or motivation to explore new states. | Kris De Asis - Quora
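As a rough sketch of the "probability of performing each action for a given state" view of a policy, a tabular stochastic policy can be stored as a nested mapping and sampled from. The state names, action names, probabilities, and the `sample_action` helper below are made up for illustration.

```python
import random

# Hypothetical stochastic policy: for each state, a probability for each action.
policy = {
    "s0": {"left": 0.1, "right": 0.9},
    "s1": {"left": 0.5, "right": 0.5},
}

def sample_action(policy, state, rng=random):
    """Draw one action for `state` according to the policy's probabilities."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

# Each state's action probabilities should sum to 1.
for state, dist in policy.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9

rng = random.Random(0)  # seeded generator so repeated runs give the same draws
print(sample_action(policy, "s0", rng))
```

A deterministic policy is the special case where one action in each state has probability 1.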