Difference between revisions of "GPT-4"
Latest revision as of 08:20, 2 June 2024
YouTube ... Quora ...Google search ...Google News ...Bing News
- Large Language Model (LLM) ... Multimodal ... Foundation Models (FM) ... Generative Pre-trained ... Transformer ... GPT-4 ... GPT-5 ... Attention ... GAN ... BERT
- Conversational AI ... ChatGPT | OpenAI ... Bing/Copilot | Microsoft ... Gemini | Google ... Claude | Anthropic ... Perplexity ... You ... phind ... Ernie | Baidu
- Natural Language Processing (NLP) ... Generation (NLG) ... Classification (NLC) ... Understanding (NLU) ... Translation ... Summarization ... Sentiment ... Tools
- GPT-4 | OpenAI
- Research Paper | OpenAI
- Artificial Intelligence (AI) ... Generative AI ... Machine Learning (ML) ... Deep Learning ... Neural Network ... Reinforcement ... Learning Techniques
- Agents ... Robotic Process Automation ... Assistants ... Personal Companions ... Productivity ... Email ... Negotiation ... LangChain
- Video/Image ... Vision ... Enhancement ... Fake ... Reconstruction ... Colorize ... Occlusions ... Predict image ... Image/Video Transfer Learning
- End-to-End Speech ... Synthesize Speech ... Speech Recognition ... Music
- Analytics ... Visualization ... Graphical Tools ... Diagrams & Business Analysis ... Requirements ... Loop ... Bayes ... Network Pattern
- Development ... Notebooks ... AI Pair Programming ... Codeless ... Hugging Face ... AIOps/MLOps ... AIaaS/MLaaS
- Prompt Engineering (PE) ... PromptBase ... Prompt Injection Attack
- Artificial General Intelligence (AGI) to Singularity ... Curious Reasoning ... Emergence ... Moonshots ... Explainable AI ... Automated Learning
- Sparks of Artificial General Intelligence: Early experiments with GPT-4 | S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. Tat Lee, Y. Li, S. Lundberg, H. Nori, H. Palangi, M. Ribeiro, Y. Zhang - Microsoft Research
- What is GPT-4? Here's everything you need to know | Sabrina Ortiz - ZDnet
- How does GPT-4 work and how can you start using it in ChatGPT? | Mohammed Haddad - Al Jazeera ... Launched on March 14, GPT-4 is the successor to GPT-3 and is the technology behind the viral chatbot ChatGPT.
- OpenAI unveils GPT-4 with new capabilities, Microsoft's Bing is already using it
- Stripe | OpenAI Customer Stories ... 15 of the prototypes were considered strong candidates to be integrated into the platform, including support customization, answering questions about support, and fraud detection
- Morgan Stanley | OpenAI Customer Stories ... access, process and synthesize content almost instantaneously
- The 411 on GPT-4 | The AI Exchange
- OpenAI released GPT-4, the highly anticipated successor to ChatGPT | Eray Eliaçık - Dataconomy
GPT-4o
GPT-4o, announced in May 2024, is OpenAI's multimodal model integrating text, vision, and audio capabilities. Compared with its predecessors, it processes requests faster and is better at understanding and generating text, images, and audio content (OpenAI) (Azure).
One of the standout features of GPT-4o is its advanced voice-to-voice capabilities, which allow for real-time, seamless voice interactions without relying on other models. It has also set new benchmarks in multilingual support and vision tasks, scoring higher than GPT-4 in the Massive Multitask Language Understanding (MMLU) benchmark.
GPT-4o supports over 50 languages, covering about 97% of the world's speakers, and features a more efficient tokenizer that reduces the number of tokens required, particularly for non-Latin alphabet languages. This makes it more cost-effective and accessible for users across different languages.
This model is available to ChatGPT Plus and Team users, with plans to expand to Enterprise users soon. It is also accessible in a limited capacity to free users, with certain usage limits in place. GPT-4o is now powering ChatGPT, enhancing its abilities to provide more accurate and insightful responses across various inputs.
GPT-4
GPT-4 accepts both text and images as input, which lets it describe the humor in unusual images, summarize text from screenshots, and answer exam questions that contain diagrams. It reportedly has about 1 trillion parameters (OpenAI has not published the figure). Its short-term memory (context window) extends to around 64,000 words, compared with around 8,000 words for GPT-3.5.
GPT-4, which powers Microsoft's Prometheus model for Bing, can be used on:
- Microsoft Edge: Microsoft Bing Chat
- Chrome Extension: UseChatGPT.AI: Copilot on Chrome (GPT-4 ✓)
- Android: AI Assistant Widget Chat GPT-4 on Google Play Store
One of ChatGPT-4’s most dazzling new features is the ability to handle not only words, but pictures too, in what is being called “multimodal” technology. A user will have the ability to submit a picture alongside text — both of which ChatGPT-4 will be able to process and discuss. The ability to input video is also on the horizon. - Everything You Need to Know About ChatGPT-4 | Alex Millson - Bloomberg, Time
GPT4All
- Github | GPT4All
- Dataset viewer | NOMIC.ai
- Tech report: GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo | Y. Anand, Z. Nussbaum, B. Duderstadt, B. Schmidt, & A. Mulyar - NOMIC.ai
A chatbot trained on a massive collection of clean assistant data including code, stories and dialogue. Demo, data and code to train an assistant-style large language model with ~800k GPT-3.5-Turbo generations based on LLaMA
Edge Impulse
Using GPT-4o to train a 2,000,000x smaller model (that runs directly on device)
The latest generation LLMs are absolutely astonishing: thanks to their multi-modal capabilities you can ask questions in natural language about stuff you can see or hear in the real world ("is there a person without a hard hat standing close to a machine?") and get relatively fast and reliable answers. But these large LLMs have downsides; they're absolutely huge, so you need to run them in the cloud, adding high latency (often seconds per inference), high cost (think about the tokens you'll burn when running inference 24/7), and high power (need a constant network connection).
In this video we're distilling knowledge from a large multimodal LLM (GPT-4o) and putting it in a tiny model, which we can run directly on device for ultra-low latency and without the need for a network connection, scaling to even microcontrollers with kilobytes of RAM if needed. Training was done fully unsupervised: all labels were set by GPT-4o, including deciding when to throw out data, and the labeled data was then used to train a transfer learning model with default settings.
One of the models we train has 800K parameters (an NVIDIA TAO model with MobileNet backend), a cool 2,200,000x fewer parameters than GPT-4o :-) with similar accuracy on this very narrow and specific task.
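The labeling workflow described above (a large model assigns labels and discards unclear samples, then a small model is trained on what remains) can be sketched in a few lines. This is a toy illustration, not Edge Impulse's pipeline: `teacher_label` is a hypothetical stand-in for a GPT-4o call, the 2-D points stand in for images, and the nearest-centroid `train_student` stands in for the transfer learning model.

```python
import random
from collections import defaultdict

def teacher_label(sample):
    """Hypothetical stand-in for the large model.

    Returns a label, or None when the sample is too ambiguous to keep.
    """
    x, y = sample
    if abs(x - y) < 0.2:          # ambiguous: the teacher throws it out
        return None
    return "hard_hat" if x > y else "no_hard_hat"

def train_student(samples, labels):
    """Tiny student model: one centroid per class."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), label in zip(samples, labels):
        sums[label][0] += x
        sums[label][1] += y
        counts[label] += 1
    return {label: (s[0] / counts[label], s[1] / counts[label])
            for label, s in sums.items()}

def predict(centroids, sample):
    """Classify a sample by its nearest class centroid."""
    x, y = sample
    return min(
        centroids,
        key=lambda c: (centroids[c][0] - x) ** 2 + (centroids[c][1] - y) ** 2,
    )

# Unlabeled data: random 2-D points standing in for camera frames.
rng = random.Random(0)
unlabeled = [(rng.random(), rng.random()) for _ in range(500)]

# Step 1: the teacher labels everything and drops what it is unsure about.
kept, labels = [], []
for sample in unlabeled:
    label = teacher_label(sample)
    if label is not None:
        kept.append(sample)
        labels.append(label)

# Step 2: the student is trained only on the teacher-labeled samples.
student = train_student(kept, labels)

# Step 3: measure how often the student agrees with the teacher.
agreement = sum(
    predict(student, s) == l for s, l in zip(kept, labels)
) / len(kept)
print(f"kept {len(kept)} of {len(unlabeled)} samples, agreement {agreement:.2f}")
```

The student here stores only four numbers, which is the point of the exercise: on a narrow task, a very small model can reproduce the teacher's decisions. As a check on the quoted ratio, 800,000 × 2,200,000 ≈ 1.76 × 10^12, so the comparison assumes roughly 1.76 trillion parameters for GPT-4o, a figure OpenAI has not published.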
The GPT-4o labeling block and TAO transfer learning models are available to all enterprise customers in Edge Impulse. There's a 2-week free trial available; sign up at Edge Impulse (https://edgeimpulse.com).
PrivateGPT
PrivateGPT lets you chat with your documents (PDF, TXT, and CSV) entirely on your own machine: it is open source, and your data stays local and private. The project uses GPT4All for one specific task, querying documents through the LangChain framework. By default it uses the GPT4All model and sentence_transformer embeddings, but any other model can be used instead, and the embeddings can be replaced by any that LangChain supports. PrivateGPT is built with LangChain, GPT4All, LlamaCpp, Chroma and SentenceTransformers.
PrivateGPT works by ingesting your documents into a vector store and then using a Large Language Model (LLM) to answer questions about the information contained in those documents. Because it runs the LLM locally on your computer, it needs no internet connection, no online servers, and no API keys from OpenAI or Pinecone, so your data remains private and secure.
To set up PrivateGPT, install the required dependencies, download the LLM, and configure the environment variables in the `.env` file. Once set up, you can ingest your documents into the vector store and then ask questions about the information they contain.
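The ingest-then-query flow can be illustrated with a minimal in-memory sketch. It is not PrivateGPT's code and uses none of its libraries: `embed` is a toy word-count embedding standing in for a real embedding model, `VectorStore` is a hypothetical class standing in for a store such as Chroma, and the final prompt is what would be handed to a local LLM.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lower-cased word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class VectorStore:
    """Hypothetical in-memory store of (chunk, vector) pairs."""

    def __init__(self):
        self.items = []

    def ingest(self, document, chunk_size=20):
        """Split a document into word chunks and store each with its vector."""
        words = document.split()
        for i in range(0, len(words), chunk_size):
            chunk = " ".join(words[i:i + chunk_size])
            self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]),
                        reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    """Assemble the text a local LLM would be asked to complete."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

store = VectorStore()
store.ingest("The warranty covers parts and labor for two years from the purchase date.")
store.ingest("Returns are accepted within thirty days if the item is unused.")

question = "How long is the warranty?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Only the retrieved chunks reach the model, which is why the whole loop can run offline: the documents, the vectors, and the LLM all stay on the local machine.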
SentenceTransformers is a Python framework for state-of-the-art sentence, text, and image embeddings. It is based on PyTorch and Transformers and offers a large collection of pre-trained models tuned for various tasks. You can use this framework to compute sentence/text embeddings for more than 100 languages. These embeddings can then be compared, for example, with cosine similarity to find sentences with a similar meaning. This can be useful for semantic textual similarity, semantic search, or paraphrase mining. You can install the Sentence Transformers library using pip:
    pip install -U sentence-transformers
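The cosine-similarity comparison can be shown without the library itself. In the sketch below, the three-dimensional vectors are hand-picked, hypothetical values; with SentenceTransformers installed, they would instead come from a pre-trained model's `encode` method and have hundreds of dimensions.

```python
import math
from itertools import combinations

def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, chosen by hand for illustration only.
embeddings = {
    "A man is eating food.": [0.9, 0.1, 0.0],
    "Someone is having a meal.": [0.8, 0.2, 0.1],
    "The stock market fell today.": [0.0, 0.1, 0.9],
}

def most_similar_pair(embeddings):
    """Paraphrase mining in miniature: the pair with the highest similarity."""
    return max(
        combinations(embeddings, 2),
        key=lambda pair: cosine_similarity(embeddings[pair[0]],
                                           embeddings[pair[1]]),
    )

best = most_similar_pair(embeddings)
print(best)
```

The two sentences about eating point in nearly the same direction and score close to 1, while the unrelated sentence scores close to 0 against both. Semantic search uses the same comparison, ranking stored sentences against a single query vector.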