Difference between revisions of "Large Language Model (LLM)"

Line 21:
 
**** [[Supervised]] Learning
 
**** [[Proximal Policy Optimization (PPO)]]
** [https://www.deepmind.com/publications/an-empirical-analysis-of-compute-optimal-large-language-model-training Chinchilla |] [[Google | DeepMind]]
+
** [https://www.deepmind.com/publications/an-empirical-analysis-of-compute-optimal-large-language-model-training Chinchilla |] [[Google | DeepMind]]   70B parameters
 
** [https://arxiv.org/abs/1909.05858 ctrl] ... a Conditional Transformer Language Model for Controllable Generation | Salesforce
 
** [https://openai.com/ Codex |] [[OpenAI]] ... translates natural language into code
Line 35:
 
** [https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf Jurassic-1 Language Model] ... huge 178B language model to rival [[OpenAI]]'s GPT-3
 
** [https://www.blog.google/technology/ai/lamda/ LaMDA |] [[Google]]  ... experimental language model
** [https://www.reuters.com/technology/meta-launch-ai-language-model-llama-2023-02-24/ LLaMA] ... Large Language Model [[Meta]] AI
+
** [https://www.reuters.com/technology/meta-launch-ai-language-model-llama-2023-02-24/ LLaMA] ... Large Language Model [[Meta]] AI, 13B and 65B parameter versions 
 
** [https://github.com/allenai/macaw Macaw | AI2]
 
** [https://arxiv.org/pdf/2212.13138.pdf Med-PaLM]  ... aligned to the medical domain
Line 43:
 
** [https://github.com/karpathy/nanoGPT nanoGPT] ... for training/finetuning medium-sized GPTs
 
** [https://idw-online.de/en/news786967 OpenGPT-X]  ... model for Europe
** [https://www.reuters.com/technology/facebook-owner-meta-opens-access-ai-large-language-model-2022-05-03/ OPT-175B]...[[Meta|Facebook]]-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... [[Meta|Facebook]] 175-billion-parameter language model - Open Pretrained Transformer  
+
** [https://www.reuters.com/technology/facebook-owner-meta-opens-access-ai-large-language-model-2022-05-03/ OPT-175B]...[[Meta|Facebook]]-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... [[Meta|Facebook]] 175-billion-parameter language model - Open Pretrained Transformer ... BlenderBot
 
** [https://huggingface.co/Writer/palmyra-base  Palmyra |] [[Hugging Face]] ... a privacy-first LLM for enterprises
** [https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html Pathways Language Model (PaLM)] ...scaling to 540 Billion Parameters
+
** [https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html Pathways Language Model (PaLM)]   540B parameters
** [http://research.baidu.com/Blog/index-view?id=163 PLATO-XL | Baidu]  ... 11B Parameter Chatbot
+
** [http://research.baidu.com/Blog/index-view?id=163 PLATO-XL | Baidu]  ... 11B parameter chatbot
 
** [https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens RETRO |] [[Google | DeepMind]]
 
** [https://arxiv.org/abs/2101.03961 Switch Transformers |] [[Google]] Brain  ... trillion parameters

Revision as of 09:06, 25 February 2023
