Large Language Model (LLM)
 
|title=PRIMO.ai

|titlemode=append
|keywords=ChatGPT, artificial intelligence, machine learning, GPT-4, GPT-5, NLP, NLG, NLC, NLU, models, data, singularity, moonshot, Sentience, AGI, Emergence, Explainable, TensorFlow, Google, Nvidia, Microsoft, Azure, Amazon, AWS, Hugging Face, OpenAI, Meta, LLM, metaverse, assistants, agents, digital twin, IoT, Transhumanism, Immersive Reality, Generative AI, Conversational AI, Perplexity, Bing, You, Gemini, Ernie, Prompt Engineering, LangChain, Video/Image, Vision, End-to-End Speech, Synthesize Speech, Speech Recognition, Stanford, MIT |description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
 
<!-- Google tag (gtag.js) -->
 
[https://www.bing.com/news/search?q=ai+Large+Language+Model+LLM&qft=interval%3d%228%22 ...Bing News]
  
* [[Large Language Model (LLM)]] ... [[Large Language Model (LLM)#Multimodal|Multimodal]] ... [[Foundation Models (FM)]] ... [[Generative Pre-trained Transformer (GPT)|Generative Pre-trained]] ... [[Transformer]] ... [[GPT-4]] ... [[GPT-5]] ... [[Attention]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
 
* [[Natural Language Processing (NLP)]] ... [[Natural Language Generation (NLG)|Generation (NLG)]] ... [[Natural Language Classification (NLC)|Classification (NLC)]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding (NLU)]] ... [[Language Translation|Translation]] ... [[Summarization]] ... [[Sentiment Analysis|Sentiment]] ... [[Natural Language Tools & Services|Tools]]
* [[Embedding]] ... [[Fine-tuning]] ... [[Retrieval-Augmented Generation (RAG)|RAG]] ... [[Agents#AI-Powered Search|Search]] ... [[Clustering]] ... [[Recommendation]] ... [[Anomaly Detection]] ... [[Classification]] ... [[Dimensional Reduction]] ... [[...find outliers]]
 
 
* [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]]
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Ernie]] | [[Baidu]]
 
* [[Cohere]]
* [[Agents]] ... [[Robotic Process Automation (RPA)|Robotic Process Automation]] ... [[Assistants]] ... [[Personal Companions]] ... [[Personal Productivity|Productivity]] ... [[Email]] ... [[Negotiation]] ... [[LangChain]]
 
* [[Excel]] ... [[LangChain#Documents|Documents]] ... [[Database|Database; Vector & Relational]] ... [[Graph]] ... [[LlamaIndex]]
 
* [[Video/Image]] ... [[Vision]] ... [[Colorize]] ... [[Image/Video Transfer Learning]]
 
* [[End-to-End Speech]] ... [[Synthesize Speech]] ... [[Speech Recognition]] ... [[Music]]
 
* [[Analytics]] ... [[Visualization]] ... [[Graphical Tools for Modeling AI Components|Graphical Tools]] ... [[Diagrams for Business Analysis|Diagrams]] & [[Generative AI for Business Analysis|Business Analysis]] ... [[Requirements Management|Requirements]] ... [[Loop]] ... [[Bayes]] ... [[Network Pattern]]
* [[Development]] ... [[Notebooks]] ... [[Development#AI Pair Programming Tools|AI Pair Programming]] ... [[Codeless Options, Code Generators, Drag n' Drop|Codeless]] ... [[Hugging Face]] ... [[Algorithm Administration#AIOps/MLOps|AIOps/MLOps]] ... [[Platforms: AI/Machine Learning as a Service (AIaaS/MLaaS)|AIaaS/MLaaS]]
 
* [[Prompt Engineering (PE)]] ... [[Prompt Engineering (PE)#PromptBase|PromptBase]] ... [[Prompt Injection Attack]]
* [[Artificial General Intelligence (AGI) to Singularity]] ... [[Inside Out - Curious Optimistic Reasoning| Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ... [[Algorithm Administration#Automated Learning|Automated Learning]]
 
* [[Chain of Thought (CoT)]] ... [[Chain of Thought (CoT)#Tree of Thoughts (ToT)|Tree of Thoughts (ToT)]]
 
* [[Aviary|Aviary ... a fully free, cloud-based infrastructure designed to help developers choose and deploy the right technologies and approach for their LLM-based applications.]]
 
* [[Loss#Loss Curve|Loss Curve]]

* [[Risk, Compliance and Regulation]] ... [[Ethics]] ... [[Privacy]] ... [[Law]] ... [[AI Governance]] ... [[AI Verification and Validation]]
 
* [https://www.marktechpost.com/2023/04/06/8-potentially-surprising-things-to-know-about-large-language-models-llms/ 8 Potentially Surprising Things To Know About Large Language Models LLMs | Dhanshree Shripad Shenwai - MarkTechPost]
 
* [https://www.marktechpost.com/2023/04/07/this-ai-paper-introduce-self-refine-a-framework-for-improving-initial-outputs-from-llms-through-iterative-feedback-and-refinement/ This AI Paper Introduces SELF-REFINE: A Framework For Improving Initial Outputs From LLMs Through Iterative Feedback And Refinement | Aneesh Tickoo - MarkTechPost]
 
* [https://bdtechtalks.com/2023/07/14/chatgpt-llm-prompt-architecting/ Does your company need its own LLM? | Jason Ly - TechTalks] ... inspired many to ask how to get their hands on their 'own LLM', or sometimes more ambitiously, their 'own ChatGPT'.
 
* [https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/#section--4 Emerging Architectures for LLM Applications | M. Bornstein and R. Radovanovic - Andreessen Horowitz] ... a reference architecture for the emerging LLM app stack
* [https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/ A jargon-free explanation of how AI large language models work | Timothy B. Lee & Sean Trott - Ars Technica]

* [https://thebabar.medium.com/essential-guide-to-foundation-models-and-large-language-models-27dab58f7404 Essential Guide to Foundation Models and Large Language Models | Babar M Bhatti - Medium]

* [https://howaibuildthis.substack.com/ How AI Built This Substack] ... covering the latest LLM developments
  
 
A <b>Large Language Model (LLM)</b> is a [[Neural Network]] that learns skills, such as generating language and conducting conversations, by analyzing vast amounts of text from across the internet. It has many parameters (typically billions of weights or more) and is trained on large quantities of unlabeled text using [[Self-Supervised]] Learning or [[Semi-Supervised]] Learning. LLMs use deep neural network architectures, such as the [[Transformer]], to learn from billions or trillions of words and to produce text on any topic or domain. LLMs are general-purpose models that excel at a wide range of tasks, as opposed to being trained for one specific task (such as [[Sentiment Analysis]], [[Natural Language Processing (NLP)#Named Entity Recognition (NER)|Named Entity Recognition (NER)]], or [[Math for Intelligence#Mathematical Reasoning|Mathematical Reasoning]]). They are capable of generating human-like text, from poetry to programming code.
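The self-supervised training idea described above can be sketched with a toy next-token predictor: the raw text itself supplies the prediction targets, so no labels are needed. This bigram counter and its corpus are invented for illustration; real LLMs learn [[Transformer]] weights over vast corpora.

```python
from collections import Counter, defaultdict

# Toy self-supervised "language model": learn next-token statistics
# from raw, unlabeled text. The corpus here is made up for illustration.
corpus = "the quick brown fox jumps over the lazy dog . the quick red fox sleeps .".split()

# Every (token, next token) pair in the text is a free training example.
transitions = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current][nxt] += 1

def predict_next(token: str) -> str:
    """Return the continuation seen most often after this token."""
    return transitions[token].most_common(1)[0][0]

print(predict_next("the"))  # "quick" (seen twice, vs. "lazy" once)
```

Generating text is then repeated next-token prediction; scaling this idea to billions of weights is what gives LLMs their general-purpose behavior.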
  
 
= <span id="Multimodal"></span>Multimodal =
* [[Large Language Model (LLM)]] ... [[Large Language Model (LLM)#Multimodal|Multimodal]] ... [[Foundation Models (FM)]] ... [[Generative Pre-trained Transformer (GPT)|Generative Pre-trained]] ... [[Transformer]] ... [[GPT-4]] ... [[GPT-5]] ... [[Attention]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
* [[Natural Language Processing (NLP)]] ... [[Natural Language Generation (NLG)|Generation (NLG)]] ... [[Natural Language Classification (NLC)|Classification (NLC)]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding (NLU)]] ... [[Language Translation|Translation]] ... [[Summarization]] ... [[Sentiment Analysis|Sentiment]] ... [[Natural Language Tools & Services|Tools]]
 
  
Large Multimodal Models (LMM), also called Multimodal Language Models (MLM) or Multimodal Large Language Models (MLLM), are a type of Large Language Model (LLM) that combines text with other kinds of information, such as images, video, audio, and other sensory data. This allows LMMs to solve some of the problems of the current generation of LLMs and unlock new applications that were impossible with text-only models. [https://bdtechtalks.com/2023/03/13/multimodal-large-language-models/ What you need to know about multimodal language models | Ben Dickson - TechTalks]
  
 
* [[GPT-4]] | [[OpenAI]] ... can accept prompts of both text and images. This means that it can take images as well as text as input, giving it the ability to describe the humor in unusual images, summarize text from screenshots, and answer exam questions that contain diagrams. 1 trillion parameters.
 
* [[Kosmos-1]] | [[Microsoft]] ... can perceive general modalities, learn in [[context]] (i.e., few-shot), and follow instructions (i.e., zero-shot). It can analyze images for content, solve visual puzzles, perform visual text recognition, and pass visual IQ tests. 1.6B
* [[PaLM|PaLM-E]] | [[Google]] ... an Embodied Multimodal Language Model that directly incorporates real-world continuous sensor modalities into language models and thereby establishes the link between words and percepts. It was developed by Google to be a model for robotics and can solve a variety of tasks on multiple types of robots and for multiple modalities (images, robot states, and neural scene representations). [[PaLM|PaLM-E]] is also a generally-capable vision-and-language model. It can perform visual tasks, such as describing images, detecting objects, or classifying scenes, and is also proficient at language tasks, like quoting poetry, solving math equations or generating code. 562B
 
* [[Gemini]] | [[Google]] ... (Generalized Multimodal Intelligence Network) ... a synergistic network of multiple separate AI models that work in unison to handle an astonishingly wide variety of tasks. >100PB, 1 trillion tokens
 
* [https://arxiv.org/abs/2302.00923 Multimodal-CoT (Multimodal Chain-of-Thought Reasoning)] [https://github.com/amazon-science/mm-cot GitHub] ... incorporates language (text) and vision (images) modalities into a two-stage framework that separates rationale generation and answer inference. Under 1B
 
** [https://medium.com/syncedreview/deepminds-flamingo-visual-language-model-demonstrates-sota-few-shot-multimodal-learning-f795c3034b94 Flamingo |] [[Google|DeepMind]] ... [https://github.com/lucidrains/flamingo-pytorch Flamingo Pytorch] 80B
 
** [[FLAN-T5 LLM | FLAN-T5-XXL |]] [[Google]] ... 11B

** [https://sh-tsang.medium.com/brief-review-flan-palm-scaling-instruction-finetuned-language-models-79f47cbcb882 Flan-U-PaLM |] [[Google]] ... capable of generating executable [[Python]] code
 
** [https://github.com/THUDM/GLM-130B GLM-130B] ... Open Bilingual Pre-Trained Model 130B
 
** [https://ai.googleblog.com/2021/12/more-efficient-in-context-learning-with.html GLaM |] [[Google]]
** [[Gemini#Gopher | Gopher]] | [[Google | DeepMind]] 280B
 
** [https://arxiv.org/abs/2006.16668 GShard |] [[Google]] ... Scaling Giant Models with Conditional Computation and Automatic Sharding
 
** [https://openai.com/blog/better-language-models/ GPT-2 |] [[OpenAI]] 1.5B
 
** [https://openai.com/blog/better-language-models/ GPT-3 |] [[OpenAI]] 175B
 
** [https://github.com/EleutherAI/gpt-neo/ GPT-Neo] ... Open-source GPT-3 by EleutherAI 20B

** HTML-T5 | [[Google]] ... A domain-specific LLM trained on a massive corpus of HTML documents, specializing in understanding and summarizing website content.
 
** [https://openai.com/blog/instruction-following/ InstructGPT] | [[OpenAI]] ... outputs from the 1.3B InstructGPT model are preferred over outputs from a 175B GPT-3 model
 
** [https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf Jurassic-1] ... huge 178B language model to rival [[OpenAI]]'s GPT-3
** [[Gemini#LaMDA | LaMDA]] | [[Google]] ... Language Model for Dialogue Applications; experimental language model 137B
 
** [[LLaMA]] ... Large Language Model [[Meta]] AI, 13B and 65B parameter versions
 
** [https://www.aleph-alpha.com/luminous-explore-a-model-for-world-class-semantic-representation Luminous] ... Europe 200B
 
** [https://github.com/allenai/macaw Macaw | AI2] 11B
** [https://arxiv.org/pdf/2212.13138.pdf Med-PaLM] ... aligned to the medical domain ... [[PaLM]]
 
** [https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ Megatron] | [[nVidia]] ... Monolithic Transformer Language NLP Model 11B
 
*** [https://arxiv.org/abs/2201.11990 Megatron Turing (MT-NLG)] 530B
 
** [https://github.com/karpathy/minGPT minGPT | Andrej Karpathy - GitHub]

** [[Mistral]] ... Mixtral 8x7b ... [[Mixture-of-Experts (MoE)]]
 
** [https://muse.lighton.ai/home Muse] ... VLM-4, a set of natively trained large Language Models in French, Italian, Spanish, German, and English
 
** [https://github.com/karpathy/nanoGPT nanoGPT] ... for training/finetuning medium-sized GPTs
 
** [https://ai.facebook.com/blog/nllb-200-high-quality-machine-translation/ NLLB |] [[Meta]] 54.5B & 200B parameters; NLLB-200
 
** [https://www.together.xyz/blog/openchatkit OpenChatKit | TogetherCompute] ... The first open-source [[ChatGPT]] alternative released; a 20B chat-GPT model under the Apache-2.0 license, which is available for free on Hugging Face.

** [https://machinelearning.apple.com/research/openelm OpenELM | Apple]
 
** [https://idw-online.de/en/news786967 OpenGPT-X] ... model for Europe
 
** [https://www.reuters.com/technology/facebook-owner-meta-opens-access-ai-large-language-model-2022-05-03/ OPT-175B] ... [[Meta|Facebook]]-owner Meta opens access to AI large language model | Elizabeth Culliford - Reuters ... 175B ... BlenderBot 175B
 
** [https://huggingface.co/Writer/palmyra-base Palmyra |] [[Hugging Face]] ... a privacy-first LLM for enterprises
** [[PaLM]] | [[Google]] ... Pathways Language Model ... 540B

** [https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/ Phi-3] | [[Microsoft]] ... outperform much larger models in math and computer science ... Phi-3-mini 3.8B
 
** [https://research.baidu.com/Blog/index-view?id=163 PLATO-XL | Baidu] ... 11B
 
** [[RETRO]] | [[Google | DeepMind]]
 
<img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F647e7326-fb7e-4a95-9f62-06deeba4d72e_3840x2160.png" width="900">
  
[https://lifearchitect.ai/models/ Inside language models (from GPT-3 to PaLM) | Alan-D-Thompson] ... [[PaLM]]
 
 
 
 
  
 
= LLM Token / Parameter / Weight =
 
Here is a more detailed explanation of each term:
  
== <span id="Token"></span>Token ==

A token is a basic unit of meaning in a language. In natural language processing, tokens are typically words, but they can also be punctuation marks, numbers, or other symbols. For example, the sentence "The quick brown fox jumps over the lazy dog" contains nine tokens.

* [https://arxiv.org/abs/2304.11062?utm_source=the-ai-exchange&utm_medium=newsletter&utm_campaign=chatgpt-that-can-handle-1m-tokens Scaling Transformer to 1M tokens and beyond with Recurrent Memory Transformer (RMT) | A. Bulatov, Y. Kuratov, & M. Burtsev - arXiv - Cornell University] ... Researchers are designing ways for [[ChatGPT]] to handle 1M+ <b>tokens</b> by letting the model learn the meaning of groups of tokens instead of individual tokens. [[ChatGPT]] only remembers a few thousand tokens (or word chunks) at a time; i.e., it has a small short-term [[memory]].

* [https://github.com/huggingface/tokenizers tokenizers | HuggingFace]
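Word-level tokenization can be sketched in a few lines (a simplified illustration; production LLMs typically use subword schemes such as BPE, as in the Hugging Face tokenizers library linked above):

```python
import re

def tokenize(text: str) -> list[str]:
    """Split text into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("The quick brown fox jumps over the lazy dog"))  # 9 word tokens
print(tokenize("Hello, world!"))  # punctuation marks count as tokens too
```

Subword tokenizers differ mainly in that frequent fragments like "token" and "izer" become units of their own, keeping the vocabulary small while covering rare words.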
= <span id="Large Language Model (LLM) Stack"></span>Large Language Model (LLM) Stack =
+
== <span id="Parameter"></span>Parameter ==  
[https://www.youtube.com/results?search_query=Large+Language+Model+LLM+Stack YouTube]
+
A parameter is a numerical value that defines the behavior of a model. In the [[context]] of LLMs, parameters are adjusted during training to optimize the model's ability to generate relevant and coherent text. For example, a parameter might define the strength of a connection between two neurons in the model's architecture.
[https://www.quora.com/search?q=Large%20Language%20Model%20LLM%20Stack ... Quora]
 
[https://www.google.com/search?q=Large+Language+Model+LLM+Stack ...Google search]
 
[https://news.google.com/search?q=Large+Language+Model+LLM+Stack ...Google News]
 
[https://www.bing.com/news/search?q=Large+Language+Model+LLM+Stack&qft=interval%3d%228%22 ...Bing News]
 
  
<i>Excerpt from...</i> [https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/ Emerging Architectures for LLM Applications | Matt Bornstein & Rajko Radovanovic - andreessen horowitz] ... Large language models are a powerful new primitive for building software. But since they are so new—and behave so differently from normal computing resources—it’s not always obvious how to use them. At a very high level, the workflow can be divided into three stages:
== <span id="Weight"></span>Weight ==

A weight is a type of parameter that defines the strength of connections between neurons across different layers in the model. Weights are adjusted during training to optimize the model's ability to learn relationships between different tokens. For example, a weight might define the strength of the connection between the neuron that represents the token "the" and the neuron that represents the token "quick".

* <b>[[Embedding]]</b> weights: These weights are associated with each token in the vocabulary and are used to represent the semantic meaning of the tokens.

* <b>Self-attention</b> weights: These weights determine how much influence each token in a sequence has on the others.

* <b>Feedforward</b> weights: These weights are used in the feedforward layers that form part of each block of the model to compute the layer's output.

* <b>Bias</b> weights: These weights are added to the outputs of various layers, including the [[embedding]], self-attention, and feedforward layers, to help the model make more accurate predictions.
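The four weight groups above can be visualized by listing the parameter shapes of a toy, one-block transformer. This is an illustrative sketch with made-up sizes (a vocabulary of 1,000 tokens and a model width of 64), not the layout of any real LLM.

```python
# Parameter inventory of a toy one-block transformer, listed as array shapes.
# Hypothetical sizes for illustration only.
vocab_size, d_model, d_ff = 1000, 64, 256

weights = {
    # Embedding weights: one d_model-sized vector per vocabulary token
    "embedding": (vocab_size, d_model),
    # Self-attention weights: query/key/value/output projections
    "attn.q": (d_model, d_model),
    "attn.k": (d_model, d_model),
    "attn.v": (d_model, d_model),
    "attn.out": (d_model, d_model),
    # Feedforward weights: the two linear layers of the block's MLP
    "ffn.w1": (d_model, d_ff),
    "ffn.w2": (d_ff, d_model),
    # Bias weights: one value per output unit of each layer
    "attn.bias": (d_model,),
    "ffn.b1": (d_ff,),
    "ffn.b2": (d_model,),
}

def count_parameters(shapes):
    """Total parameter count = product of each shape's dims, summed."""
    total = 0
    for shape in shapes.values():
        n = 1
        for dim in shape:
            n *= dim
        total += n
    return total

print(count_parameters(weights))  # 113536
```

Even this toy block has over 100,000 parameters; scaling the width, depth, and vocabulary to production sizes is what pushes real LLMs into the billions.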
  
=== <span id="Sharing"></span>Sharing ===
* [[LLaMA]] | [[Meta]]
  
<hr>
Sharing "weights" refers to the distribution of the parameters that determine the strength of connections between neurons in different layers of a neural network. These weights are crucial for the model's ability to process and generate information. During the training phase, these weights are adjusted to optimize the model's performance, allowing it to learn and understand the relationships between different tokens. This process involves using a learning rate, which is a hyperparameter that controls the size of the steps taken to update the weights. Additionally, techniques like weight pruning can be used to simplify the model by removing weights that have minimal impact on the output. Regularization methods such as L1 and L2 are also employed to prevent overfitting by adding a penalty term to the loss function based on the magnitude of the weights. When [[Meta]] shares the "weights" of the [[LLaMA]] model, they are providing the parameters that have been learned during the training process, which include embedding, self-attention, feedforward, and bias weights.
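The training mechanics mentioned above (learning rate, L2 regularization, weight pruning) can be sketched in a few lines. This is a toy illustration on a list of scalar weights, not how [[LLaMA]] or any production model is actually trained.

```python
def sgd_step_l2(weights, grads, lr=0.1, l2=0.01):
    """One SGD update with an L2 penalty: w <- w - lr * (g + l2 * w).
    lr is the learning rate; l2 scales the regularization penalty."""
    return [w - lr * (g + l2 * w) for w, g in zip(weights, grads)]

def prune(weights, threshold=0.05):
    """Weight pruning: zero out weights whose magnitude is too small
    to meaningfully affect the output."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

w = [0.5, -0.2, 0.03]
w = sgd_step_l2(w, grads=[0.1, -0.1, 0.0])
w = prune(w)  # the third weight is pruned to 0.0
print(w)
```

Sharing a trained model's weights means publishing the final values produced by many such update steps, so others can load them without repeating the training.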
<b>Data preprocessing / [[embedding]]</b>: This stage involves storing private data (legal documents, in our example) to be retrieved later. Typically, the documents are broken into chunks, passed through an [[embedding]] model, then stored in a specialized database called a vector database.
 
<hr>
 
* <b>Contextual data</b>: Contextual data for LLM apps can come in a variety of formats, including text documents, PDFs, CSVs, and SQL tables.
 
** <b>Data pipelines</b>: Data pipelines are a way to process and analyze large amounts of data with LLMs, which can extract information from text, translate languages, and answer questions.
 
*** [[Databricks]] ... direct file access and direct native support for Python, data science and AI frameworks
 
*** [https://airflow.apache.org/ Airflow] ... an open-source project that lets you programmatically author, schedule, and monitor your data pipelines using [[Python]]
 
*** [https://unstructured.io/ Unstructured] ... tools for ingesting unstructured data, i.e., data that does not have a predefined schema, such as text documents, images, and audio files.
 
** <b>[[Embedding]] model</b>
 
*** [[OpenAI]] ... called text-[[embedding]]-ada-002, 2nd generation [[embedding]] model
 
*** [[Cohere]] ... a large language model (LLM) that is trained on a massive dataset of text and code. It can be used to represent the meaning of text as a list of numbers, which is useful for comparing text for similarity, clustering text, and classifying text.
 
*** [[Hugging Face]]
 
** <b>Vector database</b>
 
*** [[Database#Pinecone|Pinecone]]  ... provide long-term memory for storing and query vector [[embedding]]s, a type of data that represents semantic information
 
*** [https://weaviate.io/ Weaviate] ... open source vector database that allows storing and retrieving data objects based on their semantic properties by indexing them with vectors
 
*** [https://pypi.org/project/chromadb/ ChromaDB] ... the open-source [[embedding]] database, build Python or JavaScript LLM apps with memory
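The preprocessing stage described above (chunk documents, embed them, store the vectors, retrieve by similarity) can be sketched end-to-end. Everything here is a stand-in: <code>fake_embed</code> hashes text instead of calling a real embedding model such as text-embedding-ada-002, and the "vector database" is just a Python list.

```python
import hashlib
import math

def fake_embed(text, dim=8):
    """Stand-in for a real embedding model: deterministically hash
    the text into a dim-sized vector of values in [0, 1]."""
    digest = hashlib.sha256(text.encode()).digest()
    return [digest[i] / 255.0 for i in range(dim)]

def chunk(text, size=40):
    """Break a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# "Vector database": a list of (chunk, vector) pairs
store = [(c, fake_embed(c)) for c in chunk("Some long legal document text ..." * 3)]

def query(q, k=2):
    """Retrieve the k stored chunks most similar to the query embedding."""
    qv = fake_embed(q)
    ranked = sorted(store, key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

print(query("legal question"))
```

A real pipeline would swap <code>fake_embed</code> for an embedding API and the list for Pinecone, Weaviate, or ChromaDB, but the chunk-embed-store-query shape is the same.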
 
 
 
 
 
<hr>
 
<b>Prompt construction / retrieval</b>: When a user submits a query (a legal question, in this case), the application constructs a series of prompts to submit to the language model. A compiled prompt typically combines a prompt template hard-coded by the developer; examples of valid outputs called few-shot examples; any necessary information retrieved from external APIs; and a set of relevant documents retrieved from the vector database.
 
<hr>
 
* <b>Prompt / few-shot examples</b>: Most developers start new projects by experimenting with simple prompts, consisting of direct instructions (zero-shot prompting) or possibly some example outputs (few-shot prompting).
 
** <b>Playground</b>
 
*** [[OpenAI]] ... a web-based tool that makes it easy to test prompts and get familiar with how the API works.
 
*** [https://nat.dev/ nat.dev] ... An LLM playground you can run on your laptop; Use any model from [[OpenAI]], [[Anthropic]], [[Cohere]], Forefront, [[HuggingFace]], Aleph Alpha, Replicate, Banana and llama.cpp. [https://github.com/nat/openplayground  Open Playground - GitHub]
 
*** [[Humanloop]] ... use the playground to experiment with new prompts, collect model generated data and user feedback, and finetune models
 
** <b>Orchestration</b>
 
*** [[LangChain]] ... chain together different components to create more advanced use cases around LLMs
 
*** [[LlamaIndex]] ... data framework that allows users to connect custom data sources to [[Large Language Model (LLM)]]s. It provides tools to structure data, offers data connectors to ingest existing data sources and data formats (APIs, PDFs, docs, SQL, etc.), and provides an advanced retrieval/query interface over the data.
 
*** [[ChatGPT]] ... creates chat-based applications.
 
** <b>APIs/plugins</b>
 
*** [https://github.com/serp-ai/ChatGPT-Plugins Serp] ... giving [[ChatGPT]] the ability to use web browsing, [[python]] code execution, and custom plugins
 
*** [[Wolfram]]... Simple API, Short Answers API, Spoken Results API, Full Results API, & Conversational API
 
*** [[LangChain#Zapier|Zapier]] ... automates 5,000+ app integrations
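Compiling a prompt from a hard-coded template, few-shot examples, and retrieved documents, as described in the prompt construction stage above, might look like the sketch below. All names and the template text are hypothetical.

```python
def compile_prompt(template, few_shot, documents, question):
    """Combine a hard-coded template, few-shot examples, and retrieved
    context documents into a single prompt string."""
    examples = "\n".join(f"Q: {q}\nA: {a}" for q, a in few_shot)
    context = "\n".join(f"- {d}" for d in documents)
    return template.format(examples=examples, context=context, question=question)

# Hypothetical template hard-coded by the developer
TEMPLATE = (
    "You are a legal assistant. Answer using only the context.\n"
    "Examples:\n{examples}\n"
    "Context:\n{context}\n"
    "Question: {question}\nAnswer:"
)

prompt = compile_prompt(
    TEMPLATE,
    few_shot=[("What is a tort?", "A civil wrong causing harm or loss.")],
    documents=["Clause 4.2 limits liability to direct damages."],
    question="Is consequential loss covered?",
)
print(prompt)
```

Frameworks like LangChain and LlamaIndex provide richer versions of this same assembly step, with prompt templates and retrievers as first-class objects.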
 
 
 
 
 
<hr>
 
<b>Prompt execution / inference</b>: Once the prompts have been compiled, they are submitted to a pre-trained LLM for inference—including both proprietary model APIs and open-source or self-trained models. Some developers also add operational systems like logging, caching, and validation at this stage.
 
<hr>
 
* <b>LLM cache</b>
 
** [https://github.com/redis/redis Redis] ... (remote dictionary server) an in-memory key-value store commonly used to cache LLM prompts and responses for low-latency lookups
 
** [https://www.sqlite.org/index.html SQLite] ... disk-based storage to cache LLM prompts and responses for LangChain
 
** [https://github.com/zilliztech/gptcache GPTCache] ... uses different storage backends, such as Redis, SQLite, or MinIO, to cache LLM prompts
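A minimal version of the SQLite-backed LLM cache idea above: look up the prompt first, and only call the model on a miss. <code>fake_model</code> is a stand-in for a real LLM API call; production caches such as GPTCache add eviction, semantic matching, and multiple storage backends.

```python
import sqlite3

def make_cache(path=":memory:"):
    """Create a prompt -> response cache table in SQLite."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS llm_cache (prompt TEXT PRIMARY KEY, response TEXT)"
    )
    return db

def cached_call(db, prompt, model_fn):
    """Return the cached response for a prompt, calling the model only on a miss."""
    row = db.execute(
        "SELECT response FROM llm_cache WHERE prompt = ?", (prompt,)
    ).fetchone()
    if row:
        return row[0]
    response = model_fn(prompt)
    db.execute("INSERT INTO llm_cache VALUES (?, ?)", (prompt, response))
    db.commit()
    return response

calls = []
def fake_model(prompt):  # stand-in for a real LLM API call
    calls.append(prompt)
    return f"answer to: {prompt}"

db = make_cache()
cached_call(db, "hi", fake_model)
cached_call(db, "hi", fake_model)  # served from cache; the model is not called again
print(len(calls))  # 1
```

Exact-match caching like this only helps with repeated identical prompts; semantic caches embed the prompt and match on similarity instead.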
 
* <b>Logging/LLMops</b>
 
** [[LangChain#Weights & Biases (W&B)|Weights & Biases (W&B)]] ... improve prompt engineering with visually interactive evaluation loops.
 
** MLflow
 
** PromptLayer
 
** Helicone
 
* <b>Validation</b>
 
** Guardrails
 
** Rebuff
 
** Microsoft Guidance
 
** LMQL
 
* <b>App hosting</b>
 
** Vercel
 
** Steamship
 
** Streamlit
 
** Modal
 
* <b>LM APIs (proprietary)</b>
 
** [[OpenAI]]
 
** [https://www.anthropic.com/ Anthropic]
 
* <b>LLM APIs (open)</b>
 
** [[Hugging Face]]
 
** [https://replicate.com/ Replicate]
 
* <b>Cloud providers</b>
 
** AWS
 
** GCP
 
** Azure
 
** CoreWeave
 
* <b>Opinionated clouds</b>
 
** [[Databricks]]
 
** Anyscale
 
** Mosaic
 
** Modal
 
** RunPod
 
 
 
 
 
<img src="https://i0.wp.com/a16z.com/wp-content/uploads/2023/06/2657-Emerging-LLM-App-Stack-R2-1-of-4-2.png?w=2000&ssl=1" width="1100">
 
  
 
= Risks =

* [https://www.marktechpost.com/2023/07/07/how-risky-is-your-open-source-llm-project-a-new-research-explains-the-risk-factors-associated-with-open-source-llms/ How Risky Is Your Open-Source LLM Project? A New Research Explains The Risk Factors Associated With Open-Source LLMs | Anant Shahi - MarkTechPost]

* [https://venturebeat.com/datadecisionmakers/instead-of-ai-sentience-focus-on-the-current-risks-of-large-language-models Instead of AI sentience, focus on the current risks of large language models]

* [https://www.deepmind.com/publications/ethical-and-social-risks-of-harm-from-language-models Ethical and social risks of harm from Language Models | DeepMind]


Open-source LLMs (Large Language Models) are trained on massive amounts of data from the Internet, which makes them accessible and versatile, but also poses some risks. Some of the risk factors associated with open-source LLMs are:
= LM Studio: Discover, download, and run local LLMs =

* [https://lmstudio.ai/ LM Studio]


LM Studio is a desktop application that allows users to experiment with local and open-source Large Language Models (LLMs). It provides an easy-to-use platform for running LLMs locally on Mac, Windows, and potentially Linux. To download and install LM Studio, follow these steps:


1. Download LM Studio:

* Visit the LM Studio website at lmstudio.ai.

* Download the LM Studio desktop app for your specific platform (Mac or Windows).

* The download size is approximately 400MB, so it may take some time depending on your internet speed.


2. Choose a Model:

* After launching LM Studio, select a model to download from the available options provided within the application.

* Models like Zephyr-7B, Mixtral 8x7B, Google's Gemma model, and others are available for use.


3. Run LLMs Locally:

* Once you have chosen and downloaded a model, you can start using it locally through LM Studio.

* You can converse with the LLM by selecting your model and enabling GPU acceleration if desired.


System Requirements:

Ensure your system meets the requirements to run LM Studio effectively:

* Apple Silicon Mac with macOS 13.6 or newer.

* Windows/Linux PC with a processor supporting AVX2.

* Recommended RAM of 16GB+ and VRAM of 6GB+ for PCs.
* NVIDIA/AMD GPUs are supported
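Beyond the chat window, LM Studio can expose a local OpenAI-compatible server (port 1234 is the app's default, assuming the local server feature is enabled in the app). Below is a hedged sketch of talking to it from Python with only the standard library; <code>ask_local_llm</code> is a hypothetical helper and is not invoked here, because it requires the server to actually be running.

```python
import json
import urllib.request

# Default endpoint of LM Studio's OpenAI-compatible local server
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt, model="local-model"):
    """Build an OpenAI-style chat completion request for the local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    req = urllib.request.Request(
        LMSTUDIO_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    return req, payload

def ask_local_llm(prompt, model="local-model", timeout=120):
    """POST the request to a running LM Studio server and return the reply.
    Not called here; it needs the local server to be enabled and running."""
    req, _ = build_chat_request(prompt, model)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]

req, payload = build_chat_request("Why is the sky blue?")
print(payload["messages"][0]["content"])
```

Because the server speaks the OpenAI wire format, most OpenAI client libraries can also be pointed at the local URL instead of hand-building requests like this.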
<youtube>yBI1nPep72Q</youtube>

Latest revision as of 07:23, 19 May 2024



Large Language Model (LLM) is a Neural Network that learns skills, such as generating language and conducting conversations, by analyzing vast amounts of text from across the internet. It is a neural network with many parameters (typically billions of weights or more), trained on large quantities of unlabeled text using Self-Supervised Learning or Semi-Supervised Learning. LLMs use deep neural networks, such as Transformers, to learn from billions or trillions of words and to produce text on any topic or domain. LLMs are general-purpose models that excel at a wide range of tasks, as opposed to being trained for one specific task (such as Sentiment Analysis, Named Entity Recognition (NER), or Mathematical Reasoning). They are capable of generating human-like text, from poetry to programming code.



One of the more interesting, but seemingly academic, concerns of the new era of AI sucking up everything on the web was that AIs will eventually start to absorb other AI-generated content and regurgitate it in a self-reinforcing loop. Not so academic after all, it appears, because Bing just did it! When asked, it produced verbatim a COVID-19 conspiracy coaxed out of ChatGPT by disinformation researchers just last month. AI is eating itself: Bing’s AI quotes COVID disinfo sourced from ChatGPT | Devin Coldewey, Frederic Lardinois - TechCrunch



Multimodal

Large Multimodal Models (LMM)/Multimodal Language Model (MLM)/Multimodal Large Language Model (MLLM) are a type of Large Language Model (LLM) that combines text with other kinds of information, such as images, videos, audio, and other sensory data. This allows LMMs to solve some of the problems of the current generation of LLMs and unlock new applications that were impossible with text-only models. What you need to know about multimodal language models | Ben Dickson - TechTalks

  • GPT-4 | OpenAI ... can accept prompts of both text and images. This means that it can take images as well as text as input, giving it the ability to describe the humor in unusual images, summarize text from screenshots, and answer exam questions that contain diagrams. 1 trillion parameters.
  • Kosmos-1 | Microsoft ... can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot). It can analyze images for content, solve visual puzzles, perform visual text recognition, and pass visual IQ tests. 1.6B
  • PaLM-E | Google ... an Embodied Multimodal Language Model that directly incorporates real-world continuous sensor modalities into language models and thereby establishes the link between words and percepts. It was developed by Google to be a model for robotics and can solve a variety of tasks on multiple types of robots and for multiple modalities (images, robot states, and neural scene representations). PaLM-E is also a generally-capable vision-and-language model. It can perform visual tasks, such as describing images, detecting objects, or classifying scenes, and is also proficient at language tasks, like quoting poetry, solving math equations or generating code. 562B
  • Gemini | Google ... (Generalized Multimodal Intelligence Network) a synergistic network of multiple separate AI models that work in unison to handle an astonishingly wide variety of tasks. >100PB, 1 trillion tokens
  • Multimodal-CoT (Multimodal Chain-of-Thought Reasoning) GitHub ... incorporates language (text) and vision (images) modalities into a two-stage framework that separates rationale generation and answer inference. Under 1B
  • BLIP-2 | Salesforce Research ... a generic and efficient pre-training strategy that bootstraps vision-language pre-training from off-the-shelf frozen pre-trained image encoders and frozen large language models. It achieves state-of-the-art performance on various vision-language tasks, despite having significantly fewer trainable parameters than existing methods.

Large Language Models (LLM)

A Large Language Model (LLM) is a type of machine learning model that utilizes deep learning algorithms to process and understand language. They are trained on large amounts of data to learn language patterns so they can perform tasks such as translating texts or responding in chatbot conversations. LLMs are general-purpose models that excel at a wide range of tasks, as opposed to being trained for one specific task. They can be accessed and used through an API or a platform.


Inside language models (from GPT-3 to PaLM) | Alan-D-Thompson ... PaLM


Risks

Open-source LLMs (Large Language Models) are trained on massive amounts of data from the Internet, which makes them accessible and versatile, but also poses some risks. Some of the risk factors associated with open-source LLMs are:

  • Bias and toxicity: LLMs can reflect and amplify the social biases and harmful language that exist in their training data, such as discrimination, exclusion, stereotypes, hate speech, etc. This can cause unfairness, offense, and harm to certain groups or individuals.
  • Privacy and security: LLMs can leak or infer private or sensitive information from their training data or from user inputs, such as personal details, passwords, intellectual property, etc. This can compromise the confidentiality and integrity of the data and expose it to malicious actors.
  • Misinformation and manipulation: LLMs can produce false or misleading information that can confuse or deceive users, such as inaccurate facts, bad advice, fake news, etc. This can affect the quality and trustworthiness of the information and influence the users' decisions and actions.
  • Malicious uses: LLMs can be used by adversaries to cause harm or disruption, such as spreading disinformation, creating scams or frauds, generating malicious code or weapons, etc. This can threaten the security and stability of individuals, organizations, and society.
  • Human-computer interaction harms: LLMs can affect the psychological and social well-being of users who interact with them, such as creating unrealistic expectations, reducing critical thinking, diminishing human agency, etc. This can impact the users' identity, autonomy, and relationships.

Life or death isn’t an issue at Morgan Stanley, but producing highly accurate responses to financial and investing questions is important to the firm, its clients, and its regulators. The answers provided by the system were carefully evaluated by human reviewers before it was released to any users. Then it was piloted for several months by 300 financial advisors. As its primary approach to ongoing evaluation, Morgan Stanley has a set of 400 “golden questions” to which the correct answers are known. Every time any change is made to the system, employees test it with the golden questions to see if there has been any “regression,” or less accurate answers. - How to Train Generative AI Using Your Company’s Data | Tom Davenport & Maryam Alavi - Harvard Business Review
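The "golden questions" regression check described above can be sketched as a small evaluation loop: score the system against questions with known answers and flag any drop below the previously recorded baseline. All names and data here are hypothetical.

```python
def regression_check(system, golden_questions, baseline_score=None):
    """Score the system on questions with known-correct answers; flag a
    regression when accuracy drops below a previously recorded baseline."""
    correct = sum(1 for q, a in golden_questions if system(q) == a)
    score = correct / len(golden_questions)
    regressed = baseline_score is not None and score < baseline_score
    return score, regressed

# Toy golden-question set and a toy "system" that answers one of them wrongly
golden = [("capital of France?", "Paris"), ("2 + 2?", "4")]
system = {"capital of France?": "Paris", "2 + 2?": "5"}.get

score, regressed = regression_check(system, golden, baseline_score=1.0)
print(score, regressed)  # 0.5 True
```

Running this check after every system change, as the excerpt describes, turns prompt and retrieval tweaks into something closer to ordinary regression-tested software.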

Multi-step Multi-model Approach

The Multi-step Multi-model Approach with Large Language Models (LLMs) refers to the utilization of multiple LLMs in a sequential manner to tackle complex language processing tasks. As with any multi-model approach, there are considerations related to computational resources, deployment complexity, and potential challenges in combining the outputs effectively. However, when properly implemented, the Multi-step Multi-model Approach with LLMs can lead to significant improvements in various language-related applications.

In the context of the Multi-step Multi-model Approach with LLMs, the following steps are generally involved:

  • Problem Formulation: Clearly define the language processing task you want to address. It could be natural language understanding (NLU), natural language generation (NLG), question-answering, sentiment analysis, language translation, or any other language-related challenge.
  • Model Selection: Choose a set of diverse LLMs that are well-suited for the specific language processing tasks you want to address. Different LLMs might excel in different aspects of language understanding, so having a variety of models can be beneficial.
  • Model Pre-training: Before fine-tuning the LLMs for your specific task, you need to pre-train them on a large corpus of text data. Pre-training helps the models learn the underlying patterns and structures in the language.
  • Fine-tuning: After pre-training, the LLMs are fine-tuned on a task-specific dataset. Fine-tuning involves training the models on labeled data relevant to your target task. This process allows the models to adapt to the specific problem and make better predictions.
  • Ensemble Construction: Once you have multiple LLMs that are fine-tuned for the target tasks, you can create an ensemble of these models. Ensemble methods combine the predictions of individual models to make the final decision. Techniques like averaging, voting, or more sophisticated approaches can be used to combine the outputs of the LLMs.
  • Sequential Application: The ensemble of LLMs can be applied sequentially, where the output of one model becomes the input to the next model in the pipeline. Each LLM can focus on different aspects of the language processing task, contributing to the overall understanding and generation process.
  • Model Evaluation and Refinement: Evaluate the performance of the multi-step multi-model approach using appropriate metrics and validation techniques. If necessary, fine-tuning or retraining of individual LLMs can be performed to optimize the overall system.
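The sequential-application and ensemble-construction steps above can be sketched with stand-in models: plain Python functions take the place of fine-tuned LLMs, but the wiring is the same.

```python
from collections import Counter

def run_pipeline(models, text):
    """Sequential application: the output of one model is the input to the next."""
    for model in models:
        text = model(text)
    return text

def ensemble_vote(models, text):
    """Ensemble construction: combine independent model outputs by majority vote."""
    outputs = [model(text) for model in models]
    return Counter(outputs).most_common(1)[0][0]

# Stand-ins for fine-tuned LLMs, each focused on one aspect of the task
def normalize(text):
    return text.lower()

def summarize(text):
    return text.split(".")[0]  # keep only the first sentence

def classify(text):
    return "question" if "?" in text else "statement"

print(run_pipeline([normalize, summarize], "Hello World. More text."))  # hello world
print(ensemble_vote([classify, classify, classify], "Is it raining?"))  # question
```

With real LLMs, each stage would be an API call or a locally hosted model, and the voting step might be replaced by a more sophisticated aggregation model.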

Benefits of the Multi-step Multi-model Approach with LLMs:

  • Enhanced Language Understanding: By leveraging multiple LLMs, you can benefit from their diverse capabilities and obtain a more comprehensive understanding of the language context.
  • Improved Generation Quality: When generating responses or text, the ensemble of LLMs can produce more coherent and contextually appropriate outputs.
  • Flexibility and Adaptability: The approach can be adapted to a wide range of language processing tasks, making it a versatile solution.
