Difference between revisions of "Large Language Model (LLM)"

Line 32:
 
** [https://arxiv.org/abs/2006.16668 GShard |] [[Google]]  ... Scaling Giant Models with Conditional Computation and Automatic Sharding
 
** [https://openai.com/blog/better-language-models/ GPT-2 |] [[OpenAI]] ... Generative Pre-trained Transformer 2 by [[OpenAI]]
 
 +
** [https://github.com/EleutherAI/gpt-neo/ GPT-Neo] ... Open-source GPT-3-style model by EleutherAI
 +
** [https://openai.com/blog/instruction-following/ InstructGPT] ... [[OpenAI]]: human labelers preferred outputs from the 1.3B InstructGPT model over outputs from a 175B GPT-3 model
 
* [https://openai.com/blog/gpt-2-6-month-follow-up/ OpenAI Blog] | [[OpenAI]]
 
* [[Attention]] Mechanism/[[Transformer]] Model
 
* [[Generative Pre-trained Transformer (GPT)]]
 
* [https://sambanova.ai/solutions/gpt/ SambaNova Systems] ... Dataflow-as-a-Service GPT
 

Revision as of 23:03, 24 February 2023
