Difference between revisions of "Large Language Model (LLM)"

Line 25: Line 25:
 
** [https://www.deepmind.com/blog/language-modelling-at-scale-gopher-ethical-considerations-and-retrieval Gopher |] [[Google | DeepMind]]
 
 
** [https://www.infoq.com/news/2019/11/microsoft-ai-conversation/ DialogGPT]  ...Microsoft Releases DialogGPT AI Conversation Model | Anthony Alford - InfoQ - trained on over 147M dialogs  
 
** [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub]
 
 
** [https://github.com/THUDM/GLM-130B GLM-130B]  ... Open Bilingual Pre-Trained Model
 
 
** [https://www.reuters.com/technology/facebook-owner-meta-opens-access-ai-large-language-model-2022-05-03/ OPT-175B]...[[Meta|Facebook]]-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... [[Meta|Facebook]] 175-billion-parameter language model - Open Pretrained Transformer   
 
Line 38: Line 37:
 
** [https://github.com/allenai/macaw Macaw | AI2]
 
 
** [https://arxiv.org/pdf/2212.13138.pdf Med-PaLM]  ... aligned to the medical domain
 
** [https://turing.microsoft.com/ Turing-NLG |] [[Microsoft]]  
+
** [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub]
 
** [https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ Megatron NLG] ... Monolithic Transformer Language NLP Model Triple the Size of [[OpenAI]]’s GPT-3
 
 
** [https://muse.lighton.ai/home Muse] ... VLM-4, a set of natively trained large Language Models in French, Italian, Spanish, German, and English
 
Line 52: Line 51:
 
** [https://ai.facebook.com/blog/textless-nlp-generating-expressive-speech-from-raw-audio/  Textless NLP  ... Generating expressive speech from raw audio]
 
 
** [[Toolformer]] | [[Meta]] ... models can teach themselves to use tools and APIs
 
 +
** [https://turing.microsoft.com/ Turing-NLG |] [[Microsoft]]
 
** [https://github.com/allenai/unifiedqa  UnifiedQA]  ... single QA system
 
 
** [https://openai.com/blog/webgpt/ WebGPT] ... GPT-3 version that can search the web
 

Revision as of 00:03, 25 February 2023
