Difference between revisions of "Gato"

(9 intermediate revisions by the same user not shown)
 
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools  
 
}}
 
}}
[https://www.youtube.com/results?search_query=Google+Gato YouTube search...]
[https://www.google.com/search?q=Google+Gato ...Google search]
  
* [https://ai.google/tools/ Google's Tools and Resources]
 
* [[Google]]
* [[Attention]] Mechanism ... [[Transformer]] ... [[Generative Pre-trained Transformer (GPT)]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
* [https://storage.googleapis.com/deepmind-media/A%20Generalist%20Agent/Generalist%20Agent.pdf A Generalist] [[Agents|Agent]] | S. Reed, K. Żołna, E. Parisotto, S. Gómez Colmenarejo, A. Novikov, G. Barth-Maron, M. Giménez, Y. Sulsky, J. Kay, J. Springenberg, T. Eccles, J. Bruce, A. Razavi, A. Edwards, N. Heess, Y. Chen, R. Hadsell, O. Vinyals, M. Bordbar and N. de Freitas - DeepMind
* [https://www.louisbouchard.ai/deepmind-gato/ Deepmind's new model Gato is amazing! | Louis Bouchard]
* [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]]
  
Gato is DeepMind's “generalist” AI model. From the paper's abstract: “Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist [[Agents|agent]] beyond the realm of text outputs. The [[Agents|agent]], which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist [[policy]]. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its [[context]] whether to output text, joint torques, button presses, or other tokens.”

Gato has 16 [[Attention]] Heads...

<img src="https://i.gzn.jp/img/2022/05/18/deepmind-gato/gato_m.png" width="800">
  
 
{|<!-- T -->
 
{|<!-- T -->
 
<youtube>zO49vZ31xb0</youtube>
<b>This AI Can Solve 604 Tasks [Paper Analysis of Gato by DeepMind]
</b><br>DeepMind published a revolutionary paper 🔥 They introduced Gato, a generalist AI [[Agents|agent]] that can carry out more than 600 tasks with a single transformer neural architecture. The tasks are varied, from playing Atari games to providing captions to images.
  
 
This paper demonstrates that:

📌 Generalist [[Agents|agents]] can perform reasonably well on many tasks / embodiments / modalities
📌 Generalist [[Agents|agents]] have the potential to learn new tasks with few data points
📌 By scaling up the parameter size, we can build a general-purpose [[Agents|agent]]
 
|}
 
|}
 
|<!-- M -->
 
|<!-- M -->
 
<youtube>wSQJZHfAg18</youtube>
<b>Is Gato Really the Future of AI?
</b><br>DeepMind has released "A Generalist [[Agents|Agent]]", a paper that introduces their new multi-modal model Gato. But is Gato truly a generalist [[Agents|agent]]? It is a transformer-based model with the goal of generalizing over new tasks. It is trained fully autoregressively with supervised learning (no reinforcement learning) on a total of 603 different tasks. The tasks include robotics, Atari, DM Lab, Procgen, and a lot more. It also includes text and image tasks. This video is a paper review / explanation where I also give my thoughts on the paper.
 
|}
 
|}
 
|}<!-- B -->
 
|}<!-- B -->
 
<youtube>6fWEHrXN9zo</youtube>
<b>Integrated AI - Gato by DeepMind (May/2022) 1.2B + Asimo, GPT-3, Tesla Optimus, Boston Dynamics
</b><br>Dr Alan D. Thompson is a world expert in artificial intelligence (AI), specializing in the augmentation of human intelligence, and advancing the evolution of ‘integrated AI’. Alan’s applied AI research and visualizations are featured across major international media, including citations in the University of Oxford’s debate on AI Ethics in December 2021. https://lifearchitect.ai/
 
|}
 
|}
 
|<!-- M -->
 
|<!-- M -->
 
<youtube>xZKSWNv6Esc</youtube>
<b>Gato: A single Transformer to RuLe them all! ([[Google]]'s Deepmind's new model)
</b><br>Deepmind's new model Gato is amazing! The first generalist RL [[Agents|agent]] using transformers!
 
|}
 
|}
 
|}<!-- B -->
 
|}<!-- B -->

Latest revision as of 20:49, 17 May 2023

YouTube search... ...Google search

Gato is DeepMind's “generalist” AI model. From the paper's abstract: “Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens.”
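
One way to picture how a single network can emit "text, joint torques, button presses, or other tokens" is a shared integer vocabulary in which different ranges stand for different kinds of output. The sketch below is illustrative only and is not taken from this page: the vocabulary size, the number of bins, the mu-law parameters, and the function names (`mu_law`, `encode_continuous`, `token_kind`) are all assumptions chosen for the example.

```python
import math

# All constants below are assumptions made for this illustration.
TEXT_VOCAB_SIZE = 32000   # assumed size of the text sub-vocabulary
NUM_BINS = 1024           # assumed number of bins for continuous values
MU, M = 100.0, 256.0      # assumed mu-law companding parameters


def mu_law(x: float) -> float:
    """Compress a real value with mu-law companding, keeping its sign."""
    magnitude = math.log(abs(x) * MU + 1.0) / math.log(M * MU + 1.0)
    return math.copysign(magnitude, x)


def encode_continuous(x: float) -> int:
    """Map a continuous value (e.g. a joint torque) to a token id above the text range."""
    y = max(-1.0, min(1.0, mu_law(x)))                 # clip to [-1, 1]
    bin_index = min(NUM_BINS - 1, int((y + 1.0) / 2.0 * NUM_BINS))
    return TEXT_VOCAB_SIZE + bin_index


def token_kind(token: int) -> str:
    """Tell which kind of output a token id represents, based on its range."""
    if 0 <= token < TEXT_VOCAB_SIZE:
        return "text"
    if TEXT_VOCAB_SIZE <= token < TEXT_VOCAB_SIZE + NUM_BINS:
        return "continuous"
    raise ValueError("unknown token id")
```

With a layout like this, a decoder only needs to look at the range of each generated token to decide whether to treat it as a piece of text or as a control value.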

Gato has 16 Attention Heads...

This AI Can Solve 604 Tasks [Paper Analysis of Gato by DeepMind]
DeepMind published a revolutionary paper 🔥 They introduced Gato, a generalist AI agent that can carry out more than 600 tasks with a single transformer neural architecture. The tasks are varied, from playing Atari games to providing captions to images.

This paper demonstrates that:

📌 Generalist agents can perform reasonably well on many tasks / embodiments / modalities
📌 Generalist agents have the potential to learn new tasks with few data points
📌 By scaling up the parameter size, we can build a general-purpose agent

Is Gato Really the Future of AI?
DeepMind has released "A Generalist Agent", a paper that introduces their new multi-modal model Gato. But is Gato truly a generalist agent? It is a transformer-based model with the goal of generalizing over new tasks. It is trained fully autoregressively with supervised learning (no reinforcement learning) on a total of 603 different tasks. The tasks include robotics, Atari, DM Lab, Procgen, and a lot more. It also includes text and image tasks. This video is a paper review / explanation where I also give my thoughts on the paper.
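
"Trained fully autoregressively with supervised learning" can be read as: at each position the model predicts the next token of a recorded sequence, and the loss is the negative log-likelihood of the token that actually came next. The following is a minimal sketch of that idea, not the paper's code. The function name `masked_next_token_loss` is hypothetical, and the `mask` argument, which selects the positions that count toward the loss (for example text and action tokens but not observations), is an assumption of this example.

```python
import math
from typing import Sequence


def masked_next_token_loss(
    probs: Sequence[Sequence[float]],
    tokens: Sequence[int],
    mask: Sequence[int],
) -> float:
    """Average negative log-likelihood over positions where mask == 1.

    probs[t] is a (hypothetical) model's predicted distribution over the
    vocabulary for tokens[t], given only the tokens before position t.
    """
    total, count = 0.0, 0
    for dist, target, keep in zip(probs, tokens, mask):
        if keep:
            total -= math.log(dist[target])  # penalise low probability on the true token
            count += 1
    return total / count if count else 0.0


# Toy example: a 2-token vocabulary, 3 positions, first position masked out.
example_probs = [[0.5, 0.5], [0.25, 0.75], [0.9, 0.1]]
example_loss = masked_next_token_loss(example_probs, tokens=[0, 1, 0], mask=[0, 1, 1])
```

No reward signal appears anywhere in this objective, which is what distinguishes it from reinforcement learning: the model simply imitates the recorded sequences.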

Integrated AI - Gato by DeepMind (May/2022) 1.2B + Asimo, GPT-3, Tesla Optimus, Boston Dynamics
Dr Alan D. Thompson is a world expert in artificial intelligence (AI), specializing in the augmentation of human intelligence, and advancing the evolution of ‘integrated AI’. Alan’s applied AI research and visualizations are featured across major international media, including citations in the University of Oxford’s debate on AI Ethics in December 2021. https://lifearchitect.ai/

Gato: A single Transformer to RuLe them all! (Google's Deepmind's new model)
Deepmind's new model Gato is amazing! The first generalist RL agent using transformers!