Difference between revisions of "Large Language Model (LLM)"

Line 22: Line 22:
 
** [https://www.deepmind.com/publications/an-empirical-analysis-of-compute-optimal-large-language-model-training Chinchilla |] [[Google | DeepMind]]
 
** [https://www.deepmind.com/publications/an-empirical-analysis-of-compute-optimal-large-language-model-training Chinchilla |] [[Google | DeepMind]]
 
** [https://arxiv.org/abs/2203.15556 ctrl] ... a Conditional Transformer Language Model for Controllable Generation | Salesforce
 
** [https://arxiv.org/abs/2203.15556 ctrl] ... a Conditional Transformer Language Model for Controllable Generation | Salesforce
 +
** [https://sambanova.ai/solutions/gpt/ Dataflow-as-a-Service | SambaNova]
 
** [https://www.deepmind.com/blog/language-modelling-at-scale-gopher-ethical-considerations-and-retrieval Gopher |] [[Google | DeepMind]]
 
** [https://www.deepmind.com/blog/language-modelling-at-scale-gopher-ethical-considerations-and-retrieval Gopher |] [[Google | DeepMind]]
** [https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens RETRO |] [[Google | DeepMind]]
 
 
** [https://www.infoq.com/news/2019/11/microsoft-ai-conversation/ DialogGPT]  ...Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ - trained on over 147M dialogs  
 
** [https://www.infoq.com/news/2019/11/microsoft-ai-conversation/ DialogGPT]  ...Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ - trained on over 147M dialogs  
 
** [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub]
 
** [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub]
Line 47: Line 47:
 
** [https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html Pathways Language Model (PaLM)]  ...scaling to 540 Billion Parameters
 
** [https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html Pathways Language Model (PaLM)]  ...scaling to 540 Billion Parameters
 
** [http://research.baidu.com/Blog/index-view?id=163 PLATO-XL | Baidu]  ... 11B Parameter Chatbot
 
** [http://research.baidu.com/Blog/index-view?id=163 PLATO-XL | Baidu]  ... 11B Parameter Chatbot
** [https://sambanova.ai/solutions/gpt/ Dataflow-as-a-Service | SambaNova]
+
** [https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens RETRO |] [[Google | DeepMind]]  
 
** [https://arxiv.org/abs/2101.03961 Switch Transformers |] [[Google]] Brain  ... trillion parameters
 
** [https://arxiv.org/abs/2101.03961 Switch Transformers |] [[Google]] Brain  ... trillion parameters
 
** [https://huggingface.co/bigscience/T0pp  T0pp |] [[Hugging Face]]
 
** [https://huggingface.co/bigscience/T0pp  T0pp |] [[Hugging Face]]

Revision as of 00:01, 25 February 2023
