Difference between revisions of "Large Language Model (LLM)"
| Line 8: | Line 8: | ||
[https://www.google.com/search?q=Large+Language+Model+LLM ...Google search] | [https://www.google.com/search?q=Large+Language+Model+LLM ...Google search] | ||
| − | * [[ChatGPT]] | [[OpenAI]] | + | * Models: |
| − | ** [https://www.technologyreview.com/2023/02/08/1068068/chatgpt-is-everywhere-heres-where-it-came-from/ ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review] | + | ** [[ChatGPT]] | [[OpenAI]] |
| − | *** [[Transformer]] / [[Attention]] Mechanism | + | *** [https://www.technologyreview.com/2023/02/08/1068068/chatgpt-is-everywhere-heres-where-it-came-from/ ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review] |
| − | *** [[Generative Pre-trained Transformer (GPT)]] | + | **** [[Transformer]] / [[Attention]] Mechanism |
| − | *** [[Reinforcement Learning (RL) from Human Feedback (RLHF)]] | + | **** [[Generative Pre-trained Transformer (GPT)]] |
| − | *** [[Supervised]] Learning | + | **** [[Reinforcement Learning (RL) from Human Feedback (RLHF)]] |
| − | *** [[Proximal Policy Optimization (PPO)]] | + | **** [[Supervised]] Learning |
| − | * [https://opt.alpa.ai/ Alpa] ... serving large models like GPT-3 simple, affordable, accessible | + | **** [[Proximal Policy Optimization (PPO)]] |
| − | * [https://gpt3demo.com/apps/biogpt BioGPT] ... [[Microsoft]] language model trained for biomedical tasks | + | ** [https://opt.alpa.ai/ Alpa] ... serving large models like GPT-3 simple, affordable, accessible |
| − | * [https://gpt3demo.com/apps/bloom BLOOM] ... Big Science Language Open-science Open-access Multilingual | + | ** [https://gpt3demo.com/apps/biogpt BioGPT] ... [[Microsoft]] language model trained for biomedical tasks |
| − | * [https://gpt3demo.com/apps/cedille-ai Cedille] ... open-source French language model | + | ** [https://gpt3demo.com/apps/bloom BLOOM] ... Big Science Language Open-science Open-access Multilingual |
| − | * [https://gpt3demo.com/apps/chinchilla-deepmind Chinchilla |] [[Google | DeepMind]] | + | ** [https://gpt3demo.com/apps/cedille-ai Cedille] ... open-source French language model |
| − | * [https://gpt3demo.com/apps/ctrl-salesforce ctrl] ... a Conditional Transformer Language Model for Controllable Generation | Salesforce | + | ** [https://gpt3demo.com/apps/chinchilla-deepmind Chinchilla |] [[Google | DeepMind]] |
| − | * [https://gpt3demo.com/apps/deepmind-gopher Gopher |] [[Google | DeepMind]] | + | ** [https://gpt3demo.com/apps/ctrl-salesforce ctrl] ... a Conditional Transformer Language Model for Controllable Generation | Salesforce |
| − | * [https://gpt3demo.com/apps/deepmind-retro RETRO |] [[Google | DeepMind]] | + | ** [https://gpt3demo.com/apps/deepmind-gopher Gopher |] [[Google | DeepMind]] |
| − | * [https://www.infoq.com/news/2019/11/microsoft-ai-conversation/ DialogGPT] ...Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ - trained on over 147M dialogs | + | ** [https://gpt3demo.com/apps/deepmind-retro RETRO |] [[Google | DeepMind]] |
| − | * [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub] | + | ** [https://www.infoq.com/news/2019/11/microsoft-ai-conversation/ DialogGPT] ...Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ - trained on over 147M dialogs |
| − | * [https://gpt3demo.com/apps/glm-130b GLM-130B] ... Open Bilingual Pre-Trained Model | + | ** [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub] |
| − | * [https://www.reuters.com/technology/facebook-owner-meta-opens-access-ai-large-language-model-2022-05-03/ OPT-175B]...[[Meta|Facebook]]-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... [[Meta|Facebook]] 175-billion-parameter language model - Open Pretrained Transformer | + | ** [https://gpt3demo.com/apps/glm-130b GLM-130B] ... Open Bilingual Pre-Trained Model |
| + | ** [https://www.reuters.com/technology/facebook-owner-meta-opens-access-ai-large-language-model-2022-05-03/ OPT-175B]...[[Meta|Facebook]]-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... [[Meta|Facebook]] 175-billion-parameter language model - Open Pretrained Transformer | ||
| + | ** [[Bidirectional Encoder Representations from Transformers (BERT)]] | ||
* [https://openai.com/blog/gpt-2-6-month-follow-up/ OpenAI Blog] | [[OpenAI]] | * [https://openai.com/blog/gpt-2-6-month-follow-up/ OpenAI Blog] | [[OpenAI]] | ||
* [[Attention]] Mechanism/[[Transformer]] Model | * [[Attention]] Mechanism/[[Transformer]] Model | ||
* [[Generative Pre-trained Transformer (GPT)]] | * [[Generative Pre-trained Transformer (GPT)]] | ||
* [https://sambanova.ai/solutions/gpt/ SambaNova Systems] ... Dataflow-as-a-Service GPT | * [https://sambanova.ai/solutions/gpt/ SambaNova Systems] ... Dataflow-as-a-Service GPT | ||
Revision as of 22:34, 24 February 2023
YouTube search ... Google search
- Models:
  - ChatGPT | OpenAI
  - Alpa ... serving large models like GPT-3 simple, affordable, accessible
  - BioGPT ... Microsoft language model trained for biomedical tasks
  - BLOOM ... Big Science Language Open-science Open-access Multilingual
  - Cedille ... open-source French language model
  - Chinchilla | DeepMind
  - CTRL ... a Conditional Transformer Language Model for Controllable Generation | Salesforce
  - Gopher | DeepMind
  - RETRO | DeepMind
  - DialoGPT ... Microsoft Releases DialoGPT AI Conversation Model | Anthony Alford - InfoQ ... trained on over 147M dialogs
  - minGPT | Andrej Karpathy - GitHub
  - GLM-130B ... Open Bilingual Pre-Trained Model
  - OPT-175B ... Facebook-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... Facebook 175-billion-parameter language model - Open Pretrained Transformer
  - Bidirectional Encoder Representations from Transformers (BERT)
- OpenAI Blog | OpenAI
- Attention Mechanism/Transformer Model
- Generative Pre-trained Transformer (GPT)
- SambaNova Systems ... Dataflow-as-a-Service GPT