Difference between revisions of "Large Language Model (LLM)"

Revision as of 12:10, 25 February 2023

Natural Language Processing (NLP) ... Generation ... LLM ... Tools & Services
Assistants ... Hybrid Assistants ... Agents ... Negotiation
Models:
- AlexaTM | Amazon 20B
- Alpa ... makes serving large models like GPT-3 simple, affordable, and accessible
- Bidirectional Encoder Representations from Transformers (BERT) 340M
- BioGPT ... Microsoft language model trained for biomedical tasks
- BLOOM ... BigScience Large Open-science Open-access Multilingual Language Model ... 176B
- Cedille ... open-source French language model 6B
- ChatGPT | OpenAI
  - ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review
    - Transformer / Attention Mechanism
    - Generative Pre-trained Transformer (GPT)
    - Reinforcement Learning (RL) from Human Feedback (RLHF)
    - Supervised Learning
    - Proximal Policy Optimization (PPO)
- Chinchilla | DeepMind 70B
- CTRL ... a Conditional Transformer Language Model for Controllable Generation | Salesforce
- Codex | OpenAI ... translates natural language into code
- Dataflow-as-a-Service | SambaNova
- DialoGPT ... Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ ... trained on over 147M dialogs
- Flamingo | DeepMind ... Flamingo PyTorch 80B
- GLM-130B ... Open Bilingual Pre-Trained Model 130B
- GLaM | Google
- Gopher | DeepMind 280B
- GShard | Google ... Scaling Giant Models with Conditional Computation and Automatic Sharding
- GPT-2 | OpenAI 1.5B
- GPT-3 | OpenAI 175B
- GPT-Neo ... open-source GPT-3-style models by EleutherAI; 20B (GPT-NeoX-20B)
- InstructGPT | OpenAI ... labelers prefer outputs from the 1.3B InstructGPT model over outputs from a 175B GPT-3 model
- Jurassic-1 ... huge 178B language model to rival OpenAI's GPT-3
- LaMDA | Google ... experimental language model 137B
- LLaMA ... Large Language Model Meta AI, 13B and 65B parameter versions
- Luminous | Aleph Alpha ... Europe 200B
- Macaw | AI2 11B
- Med-PaLM ... aligned to the medical domain
- Megatron ... Monolithic Transformer Language NLP Model 11B
- minGPT | Andrej Karpathy - GitHub
- Muse ... VLM-4, a set of natively trained large language models in French, Italian, Spanish, German, and English
- MT-NLG 530B
- nanoGPT ... for training/finetuning medium-sized GPTs
- NLLB | Meta 54.5B & 200B parameters; NLLB-200
- OpenGPT-X ... model for Europe
- OPT-175B ... Facebook-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... Facebook 175B ... BlenderBot 175B
- Palmyra | Hugging Face ... a privacy-first LLM for enterprises
- Pathways Language Model (PaLM) 540B
- PLATO-XL | Baidu ... 11B
- RETRO | DeepMind
- Switch Transformers | Google Brain ... trillion parameters
- Textless NLP ... Generating expressive speech from raw audio
- T0pp | Hugging Face
- Toolformer | Meta ... models can teach themselves to use tools and APIs
- Turing-NLG | Microsoft
- UnifiedQA ... single QA system
- WebGPT ... GPT-3 version that can search the web
- Wu Dao 1.0 (Enlightenment 1.0) ... China’s first homegrown super-scale intelligent model
- XGLM | Hugging Face 7.5B
- YaLM ... Yandex YaLM 100B
- Yuan 1.0 | Inspur ... 245B
OpenAI Blog | OpenAI
Attention Mechanism/Transformer Model
Generative Pre-trained Transformer (GPT)
SambaNova Systems ... Dataflow-as-a-Service GPT

Inside language models (from GPT-3 to PaLM) | Alan-D-Thompson

@@ Line 16: / Line 16: @@
 ** [https://github.com/microsoft/BioGPT BioGPT]  ... [[Microsoft]] language model trained for biomedical tasks
 ** [https://bigscience.notion.site/BLOOM-BigScience-176B-Model-ad073ca07cdf479398d5f95d88e218c4 BLOOM]  ... BigScience Large Open-science Open-access Multilingual Language Model  ... 176B
-** [https://cedille.ai/ Cedille]  ... open-source French language model
+** [https://cedille.ai/ Cedille]  ... open-source French language model  6B
 ** [[ChatGPT]] | [[OpenAI]]
 *** [https://www.technologyreview.com/2023/02/08/1068068/chatgpt-is-everywhere-heres-where-it-came-from/ ChatGPT is everywhere. Here’s where it came from | Will Douglas Heaven - MIT Technology Review]
@@ Line 36: / Line 36: @@
 ** [https://openai.com/blog/better-language-models/ GPT-2 |] [[OpenAI]]  1.5B
 ** [https://openai.com/blog/better-language-models/ GPT-3 |] [[OpenAI]] 175B
-** [https://github.com/EleutherAI/gpt-neo/ GPT-Neo] ... Open-source GPT-3 by EleutherAI
+** [https://github.com/EleutherAI/gpt-neo/ GPT-Neo] ... Open-source GPT-3 by EleutherAI   20B
 ** [https://openai.com/blog/instruction-following/ InstructGPT] ... [[OpenAI]] 1.3B InstructGPT model over outputs from a 175B GPT-3 model
 ** [https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf  Jurassic-1] ... huge 178B language model to rival [[OpenAI]]'s GPT-3
@@ Line 42: / Line 42: @@
 ** [https://www.reuters.com/technology/meta-launch-ai-language-model-llama-2023-02-24/ LLaMA] ... Large Language Model [[Meta]] AI, 13B and 65B parameter versions
 ** [https://www.aleph-alpha.com/luminous-explore-a-model-for-world-class-semantic-representation Luminous] ... Europe  200B
-** [https://github.com/allenai/macaw Macaw | AI2]
+** [https://github.com/allenai/macaw Macaw | AI2]  11B
 ** [https://arxiv.org/pdf/2212.13138.pdf Med-PaLM]  ... aligned to the medical domain
 ** [https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ Megatron] ... Monolithic Transformer Language NLP Model 11B
@@ Line 64: / Line 64: @@
 ** [https://openai.com/blog/webgpt/ WebGPT] ... GPT-3 version that can search the web
 ** [https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/   Wu Dao 1.0 (Enlightenment 1.0)]   ... China’s first homegrown super-scale intelligent model
+** [https://huggingface.co/docs/transformers/model_doc/xglm  XGLM|] [[Hugging Face]]  7.5B
 ** [https://github.com/yandex/YaLM-100B YaLM] ... Yandex YaLM 100B
 ** [https://arxiv.org/abs/2110.04725 Yuan 1.0 | Inspur]   ... 245B
