Policy vs Plan
Latest revision as of 08:31, 23 March 2024
- Policy ... Policy vs Plan ... Constitutional AI ... Trust Region Policy Optimization (TRPO) ... Policy Gradient (PG) ... Proximal Policy Optimization (PPO)
- Agents ... Robotic Process Automation ... Assistants ... Personal Companions ... Productivity ... Email ... Negotiation ... LangChain
- Artificial Intelligence (AI) ... Generative AI ... Machine Learning (ML) ... Deep Learning ... Neural Network ... Reinforcement ... Learning Techniques
- Conversational AI ... ChatGPT | OpenAI ... Bing/Copilot | Microsoft ... Gemini | Google ... Claude | Anthropic ... Perplexity ... You ... phind ... Ernie | Baidu
Compare:
- A policy is defined by a set of "state -> action" pairs, which should supply an action for any reachable state.
- A plan is a strictly defined sequence of actions leading from the initial state to the goal (it can be more complex than that if you have concurrency, but this is still the basic idea). Planning involves unrolling a policy through time and refining the policy based on the resulting trajectory (the series of resulting states).
What is the difference between reinforcement learning and planning? | Ryan Brigden - Quora
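The contrast above can be made concrete with a small sketch. Everything in it is an assumption made for illustration and does not come from the quoted answer: a hypothetical five-cell corridor with the goal in the middle, a deterministic `step` function, and the helper names `run_plan` and `unroll_policy`. The plan is tied to one initial state, while the policy supplies an action for every state, and unrolling the policy from the plan's initial state reproduces the plan.

```python
# Hypothetical toy world (an assumption for illustration): cells 0..4, goal at cell 2.
STATES = range(5)
GOAL = 2

def step(state, action):
    """Deterministic transition: move one cell left or right, clipped to the corridor."""
    delta = {"left": -1, "right": 1}[action]
    return min(max(state + delta, 0), 4)

# A policy: one action for every non-goal state.
policy = {0: "right", 1: "right", 3: "left", 4: "left"}

# A plan: a fixed sequence of actions, valid for one initial state (cell 0).
plan = ["right", "right"]

def run_plan(start, actions):
    """Execute the action sequence blindly and return the final state."""
    state = start
    for action in actions:
        state = step(state, action)
    return state

def unroll_policy(start, policy, max_steps=10):
    """Follow the policy through time; return the state trajectory and the actions taken."""
    state, trajectory, actions = start, [start], []
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = policy[state]
        actions.append(action)
        state = step(state, action)
        trajectory.append(state)
    return trajectory, actions

print(run_plan(0, plan))         # the plan reaches the goal from its own initial state
print(run_plan(4, plan))         # the same plan started elsewhere misses the goal
print(unroll_policy(4, policy))  # the policy still reaches the goal from cell 4
print(unroll_policy(0, policy))  # unrolling from cell 0 yields the same actions as the plan
```

In this sketch the plan is just a list and the policy is a lookup table, which mirrors the distinction in the text: the list says nothing about what to do once execution leaves the expected trajectory, whereas the table does.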
Policy vs Strategy
In a reinforcement learning context,
- policy is a description of how an agent behaves in an environment, and is represented as the probability of performing each action for a given state.
- strategy - I don’t think strategy has a definition specific to the field, but I think it refers to some underlying mechanism or algorithm that drives an agent’s behavior. This can range from whether or not the agent does any planning to its motivation to explore new states.
| Kris De Asis - Quora
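The description of a policy as "the probability of performing each action for a given state" can be sketched as a lookup table of distributions. The state names, action names, and probabilities below are made up for illustration, and `sample_action` is a hypothetical helper, not part of any library.

```python
import random

# Hypothetical policy table: pi[state][action] is the probability of
# taking that action in that state. All values are illustrative assumptions.
pi = {
    "s0": {"left": 0.1, "right": 0.9},
    "s1": {"left": 0.5, "right": 0.5},
}

def sample_action(pi, state, rng):
    """Draw one action for `state` according to the policy's probabilities."""
    actions = list(pi[state])
    weights = [pi[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

# Sample many actions in state "s0" and count how often each one is chosen.
rng = random.Random(0)
counts = {"left": 0, "right": 0}
for _ in range(1000):
    counts[sample_action(pi, "s0", rng)] += 1

print(counts)  # "right" should dominate, since its probability is 0.9
```

A deterministic policy is the special case where one action in each row has probability 1.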