Difference between revisions of "Large Language Model (LLM)"

Line 16: Line 16:
 
** [https://github.com/microsoft/BioGPT BioGPT]  ... [[Microsoft]] language model trained for biomedical tasks
 
** [https://bigscience.notion.site/BLOOM-BigScience-176B-Model-ad073ca07cdf479398d5f95d88e218c4 BLOOM]  ... Big Science Language Open-science Open-access Multilingual  ... 176B
 
** [https://cedille.ai/ Cedille]  ... open-source French language model
+
** [https://cedille.ai/ Cedille]  ... open-source French language model 6B
 
** [[ChatGPT]] | [[OpenAI]]
 
*** [https://www.technologyreview.com/2023/02/08/1068068/chatgpt-is-everywhere-heres-where-it-came-from/ ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review]
 
Line 36: Line 36:
 
** [https://openai.com/blog/better-language-models/ GPT-2 |] [[OpenAI]]  1.5B
 
** [https://openai.com/blog/better-language-models/ GPT-3 |] [[OpenAI]] 175B
 
** [https://github.com/EleutherAI/gpt-neo/ GPT-Neo] ... Open-source GPT-3 by EleutherAI
+
** [https://github.com/EleutherAI/gpt-neo/ GPT-Neo] ... Open-source GPT-3 by EleutherAI   20B
 
** [https://openai.com/blog/instruction-following/ InstructGPT] ... [[OpenAI]] labelers prefer outputs from the 1.3B InstructGPT model over outputs from a 175B GPT-3 model
 
** [https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf  Jurassic-1] ... huge 178B language model to rival [[OpenAI]]'s GPT-3
 
Line 42: Line 42:
 
** [https://www.reuters.com/technology/meta-launch-ai-language-model-llama-2023-02-24/ LLaMA] ... Large Language Model [[Meta]] AI, 13B and 65B parameter versions   
 
** [https://www.aleph-alpha.com/luminous-explore-a-model-for-world-class-semantic-representation Luminous] ... Europe  200B
 
** [https://github.com/allenai/macaw Macaw | AI2]
+
** [https://github.com/allenai/macaw Macaw | AI2] 11B
 
** [https://arxiv.org/pdf/2212.13138.pdf Med-PaLM]  ... aligned to the medical domain
 
** [https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ Megatron] ... Monolithic Transformer Language NLP Model 11B
 
Line 64: Line 64:
 
** [https://openai.com/blog/webgpt/ WebGPT] ... GPT-3 version that can search the web
 
** [https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/  Wu Dao 1.0 (Enlightenment 1.0)]  ... China’s first homegrown super-scale intelligent model
 
 +
** [https://huggingface.co/docs/transformers/model_doc/xglm  XGLM|] [[Hugging Face]]  7.5B
 
** [https://github.com/yandex/YaLM-100B YaLM] ... Yandex YaLM 100B  
 
** [https://arxiv.org/abs/2110.04725 Yuan 1.0 | Inspur]  ... 245B
 

Revision as of 12:10, 25 February 2023



Inside language models (from GPT-3 to PaLM) | Alan-D-Thompson