Difference between revisions of "Large Language Model (LLM)"

Revision as of 22:55, 24 February 2023

Models:
- ChatGPT | OpenAI
  - ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review
    - Transformer / Attention Mechanism
    - Generative Pre-trained Transformer (GPT)
    - Reinforcement Learning (RL) from Human Feedback (RLHF)
    - Supervised Learning
    - Proximal Policy Optimization (PPO)
- Alpa ... makes serving large models like GPT-3 simple, affordable, and accessible
- BioGPT ... Microsoft language model trained for biomedical tasks
- BLOOM ... BigScience Large Open-science Open-access Multilingual Language Model ... 176B parameters
- Cedille ... open-source French language model
- Chinchilla | DeepMind
- CTRL ... a Conditional Transformer Language Model for Controllable Generation | Salesforce
- Gopher | DeepMind
- RETRO | DeepMind
- DialoGPT ... Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ ... trained on over 147M dialogs
- minGPT | Andrej Karpathy - GitHub
- GLM-130B ... Open Bilingual Pre-Trained Model
- OPT-175B ... Facebook-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... Facebook 175-billion-parameter language model - Open Pretrained Transformer
- Bidirectional Encoder Representations from Transformers (BERT)
- GLaM | Google
- GShard | Google ... Scaling Giant Models with Conditional Computation and Automatic Sharding
- GPT-2 | OpenAI ... Generative Pre-trained Transformer 2 by OpenAI
- OpenAI Blog | OpenAI
- Attention Mechanism/Transformer Model
- Generative Pre-trained Transformer (GPT)
- SambaNova Systems ... Dataflow-as-a-Service GPT

@@ Line 30: / Line 30: @@
 ** [[Bidirectional Encoder Representations from Transformers (BERT)]]
 ** [https://ai.googleblog.com/2021/12/more-efficient-in-context-learning-with.html GLaM |] [[Google]]
+** [https://arxiv.org/abs/2006.16668 GShard |] [[Google]]   ... Scaling Giant Models with Conditional Computation and Automatic Sharding
+** [https://openai.com/blog/better-language-models/ GPT-2 |] [[OpenAI]] ... Generative Pre-trained Transformer 2 by [[OpenAI]]
 * [https://openai.com/blog/gpt-2-6-month-follow-up/ OpenAI Blog] | [[OpenAI]]
 * [[Attention]] Mechanism/[[Transformer]] Model
 * [[Generative Pre-trained Transformer (GPT)]]
 * [https://sambanova.ai/solutions/gpt/ SambaNova Systems] ... Dataflow-as-a-Service GPT
