Difference between revisions of "GPT-4"

m
m (PrivateGPT)
 
(7 intermediate revisions by the same user not shown)
Line 21: Line 21:
  
 
* [[Large Language Model (LLM)]] ... [[Large Language Model (LLM)#Multimodal|Multimodal]] ... [[Foundation Models (FM)]] ... [[Generative Pre-trained Transformer (GPT)|Generative Pre-trained]] ... [[Transformer]] ... [[GPT-4]] ... [[GPT-5]] ... [[Attention]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
 
* [[Large Language Model (LLM)]] ... [[Large Language Model (LLM)#Multimodal|Multimodal]] ... [[Foundation Models (FM)]] ... [[Generative Pre-trained Transformer (GPT)|Generative Pre-trained]] ... [[Transformer]] ... [[GPT-4]] ... [[GPT-5]] ... [[Attention]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
 +
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Ernie]] | [[Baidu]]
 
* [[Natural Language Processing (NLP)]] ... [[Natural Language Generation (NLG)|Generation (NLG)]] ... [[Natural Language Classification (NLC)|Classification (NLC)]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding (NLU)]] ... [[Language Translation|Translation]] ... [[Summarization]] ... [[Sentiment Analysis|Sentiment]] ... [[Natural Language Tools & Services|Tools]]
 
* [[Natural Language Processing (NLP)]] ... [[Natural Language Generation (NLG)|Generation (NLG)]] ... [[Natural Language Classification (NLC)|Classification (NLC)]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding (NLU)]] ... [[Language Translation|Translation]] ... [[Summarization]] ... [[Sentiment Analysis|Sentiment]] ... [[Natural Language Tools & Services|Tools]]
 
* [https://openai.com/product/gpt-4 GPT-4 |] [[OpenAI]]
 
* [https://openai.com/product/gpt-4 GPT-4 |] [[OpenAI]]
 
* [https://openai.com/research/gpt-4 Research Paper |] [[OpenAI]]
 
* [https://openai.com/research/gpt-4 Research Paper |] [[OpenAI]]
 
* [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]]
 
* [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]]
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing]] | [[Microsoft]] ... [[Bard]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[Ernie]] | [[Baidu]]
+
* [[Agents]] ... [[Robotic Process Automation (RPA)|Robotic Process Automation]] ... [[Assistants]] ... [[Personal Companions]] ... [[Personal Productivity|Productivity]] ... [[Email]] ... [[Negotiation]] ... [[LangChain]]
* [[Assistants]] ... [[Personal Companions]] ... [[Agents]] ... [[Negotiation]] ... [[LangChain]]
 
 
* [[Video/Image]] ... [[Vision]] ... [[Enhancement]] ... [[Fake]] ... [[Reconstruction]] ... [[Colorize]] ... [[Occlusions]] ... [[Predict image]] ... [[Image/Video Transfer Learning]]
 
* [[Video/Image]] ... [[Vision]] ... [[Enhancement]] ... [[Fake]] ... [[Reconstruction]] ... [[Colorize]] ... [[Occlusions]] ... [[Predict image]] ... [[Image/Video Transfer Learning]]
 
* [[End-to-End Speech]] ... [[Synthesize Speech]] ... [[Speech Recognition]] ... [[Music]]
 
* [[End-to-End Speech]] ... [[Synthesize Speech]] ... [[Speech Recognition]] ... [[Music]]
 
* [[Analytics]] ... [[Visualization]] ... [[Graphical Tools for Modeling AI Components|Graphical Tools]] ... [[Diagrams for Business Analysis|Diagrams]] & [[Generative AI for Business Analysis|Business Analysis]] ... [[Requirements Management|Requirements]] ... [[Loop]] ... [[Bayes]] ... [[Network Pattern]]
 
* [[Analytics]] ... [[Visualization]] ... [[Graphical Tools for Modeling AI Components|Graphical Tools]] ... [[Diagrams for Business Analysis|Diagrams]] & [[Generative AI for Business Analysis|Business Analysis]] ... [[Requirements Management|Requirements]] ... [[Loop]] ... [[Bayes]] ... [[Network Pattern]]
* [[Development]] ... [[Notebooks]] ... [[Development#AI Pair Programming Tools|AI Pair Programming]] ... [[Codeless Options, Code Generators, Drag n' Drop|Codeless, Generators, Drag n' Drop]] ... [[Algorithm Administration#AIOps/MLOps|AIOps/MLOps]] ... [[Platforms: AI/Machine Learning as a Service (AIaaS/MLaaS)|AIaaS/MLaaS]]
+
* [[Development]] ... [[Notebooks]] ... [[Development#AI Pair Programming Tools|AI Pair Programming]] ... [[Codeless Options, Code Generators, Drag n' Drop|Codeless]] ... [[Hugging Face]] ... [[Algorithm Administration#AIOps/MLOps|AIOps/MLOps]] ... [[Platforms: AI/Machine Learning as a Service (AIaaS/MLaaS)|AIaaS/MLaaS]]
 
* [[Prompt Engineering (PE)]] ... [[Prompt Engineering (PE)#PromptBase|PromptBase]] ... [[Prompt Injection Attack]]  
 
* [[Prompt Engineering (PE)]] ... [[Prompt Engineering (PE)#PromptBase|PromptBase]] ... [[Prompt Injection Attack]]  
 
* [[Artificial General Intelligence (AGI) to Singularity]] ... [[Inside Out - Curious Optimistic Reasoning| Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ...  [[Algorithm Administration#Automated Learning|Automated Learning]]
 
* [[Artificial General Intelligence (AGI) to Singularity]] ... [[Inside Out - Curious Optimistic Reasoning| Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ...  [[Algorithm Administration#Automated Learning|Automated Learning]]
Line 43: Line 43:
  
  
 +
== GPT-4o ==
 +
[https://www.youtube.com/results?search_query=Generative+Pre+trained+Transformer+GPT4o+AI YouTube]
 +
[https://www.quora.com/search?q=Generative%20Pre%20trained%20Transformer%20GPT4o%20AI ... Quora]
 +
[https://www.google.com/search?q=Generative+Pre+trained+Transformer+GPT4o+AI ...Google search]
 +
[https://news.google.com/search?q=Generative+Pre+trained+Transformer+GPT4o+AI ...Google News]
 +
[https://www.bing.com/news/search?q=Generative+Pre+trained+Transformer+GPT4o+AI&qft=interval%3d%228%22 ...Bing News]
 +
 +
GPT-4o is [[OpenAI]]'s latest advanced AI model, described as a multimodal model that integrates text, vision, and audio capabilities. It offers significant improvements over its predecessors, including faster processing and enhanced capabilities in understanding and generating text, images, and audio content ([[OpenAI]]) ([[Azure]]).
 +
 +
One of the standout features of GPT-4o is its advanced voice-to-voice capability, which allows real-time, seamless voice interactions without relying on other models. It has also set new benchmarks in multilingual support and vision tasks, and it scores higher than GPT-4 on the Massive Multitask Language Understanding (MMLU) benchmark.
 +
 +
GPT-4o supports over 50 languages, covering about 97% of the world's speakers, and features a more efficient tokenizer that reduces the number of tokens required, particularly for non-Latin alphabet languages. This makes it more cost-effective and accessible for users across different languages.
 +
 +
This model is available to [[ChatGPT]] Plus and Team users, with plans to expand to Enterprise users soon. It is also accessible to free users, subject to usage limits. GPT-4o now powers [[ChatGPT]], enhancing its ability to provide more accurate and insightful responses across various inputs.
 +
 +
<youtube>WkB2bvYi73k</youtube>
 +
<youtube>GPNq0WiXa50</youtube>
 +
 +
== GPT-4 ==
 
GPT-4 accepts prompts containing both text and images, which lets it describe the humor in unusual images, summarize text from screenshots, and answer exam questions that contain diagrams. It reportedly has 1 trillion parameters (OpenAI has not published the figure), and its short-term [[memory]] extends to around 64,000 words, compared with around 8,000 words for GPT-3.5.
 
GPT-4 accepts prompts containing both text and images, which lets it describe the humor in unusual images, summarize text from screenshots, and answer exam questions that contain diagrams. It reportedly has 1 trillion parameters (OpenAI has not published the figure), and its short-term [[memory]] extends to around 64,000 words, compared with around 8,000 words for GPT-3.5.
  
Line 96: Line 115:
 
|}
 
|}
 
|}<!-- B -->
 
|}<!-- B -->
 +
 +
== Edge Impulse ==
 +
Using GPT-4o to train a 2,000,000x smaller model (that runs directly on device)
 +
The latest generation LLMs are absolutely astonishing — thanks to their multi-modal capabilities you can ask questions in natural language about stuff you can see or hear in the real world ("is there a person without a hard hat standing close to a machine?") and get relatively fast and reliable answers. But these large LLMs have downsides; they're absolutely huge, so you need to run them in the cloud, adding high latency (often seconds per inference), high cost (think about the tokens you'll burn when running inference 24/7), and high power (need a constant network connection).
 +
 +
In this video we're distilling knowledge from a large multimodal LLM (GPT-4o) and putting it in a tiny model, which we can run directly on device for ultra-low latency and without the need for a network connection, scaling even to microcontrollers with kilobytes of RAM if needed. Training was fully unsupervised: all labels were set by GPT-4o, including the decision of when to throw out data, and a transfer learning model was then trained on them with default settings.
 +
 +
One of the models we train has 800K parameters (an NVIDIA TAO model with MobileNet backend), a cool 2,200,000x fewer parameters than GPT-4o :-) with similar accuracy on this very narrow and specific task.
 +
 +
The GPT-4o labeling block and TAO transfer learning models are available to all enterprise customers in Edge Impulse. A 2-week free trial is available; sign up at [https://edgeimpulse.com Edge Impulse].
 +
 +
<youtube>Jou0aRgGiis</youtube>
  
 
== PrivateGPT ==  
 
== PrivateGPT ==  

Latest revision as of 08:20, 2 June 2024

YouTube ... Quora ...Google search ...Google News ...Bing News


GPT-4o

GPT-4o is OpenAI's latest advanced AI model, described as a multimodal model that integrates text, vision, and audio capabilities. It offers significant improvements over its predecessors, including faster processing and enhanced capabilities in understanding and generating text, images, and audio content (OpenAI) (Azure).

One of the standout features of GPT-4o is its advanced voice-to-voice capability, which allows real-time, seamless voice interactions without relying on other models. It has also set new benchmarks in multilingual support and vision tasks, and it scores higher than GPT-4 on the Massive Multitask Language Understanding (MMLU) benchmark.

GPT-4o supports over 50 languages, covering about 97% of the world's speakers, and features a more efficient tokenizer that reduces the number of tokens required, particularly for non-Latin alphabet languages. This makes it more cost-effective and accessible for users across different languages.

This model is available to ChatGPT Plus and Team users, with plans to expand to Enterprise users soon. It is also accessible to free users, subject to usage limits. GPT-4o now powers ChatGPT, enhancing its ability to provide more accurate and insightful responses across various inputs.

GPT-4

GPT-4 accepts prompts containing both text and images, which lets it describe the humor in unusual images, summarize text from screenshots, and answer exam questions that contain diagrams. It reportedly has 1 trillion parameters (OpenAI has not published the figure), and its short-term memory extends to around 64,000 words, compared with around 8,000 words for GPT-3.5.



GPT-4, known as Prometheus, can be used on:



One of ChatGPT-4’s most dazzling new features is the ability to handle not only words, but pictures too, in what is being called “multimodal” technology. A user will have the ability to submit a picture alongside text — both of which ChatGPT-4 will be able to process and discuss. The ability to input video is also on the horizon. - Everything You Need to Know About ChatGPT-4 | Alex Millson - Bloomberg, Time


GPT4All

GPT4All is a chatbot trained on a massive collection of clean assistant data including code, stories and dialogue. The project provides a demo, data and code to train an assistant-style large language model, based on LLaMA, with ~800k GPT-3.5-Turbo generations.

GPT4ALL: Install 'ChatGPT' Locally (weights & fine-tuning!) - Tutorial
Matthew Berman - In this video, I walk you through installing the newly released GPT4ALL large language model on your local computer. This model is brought to you by the fine people at Nomic AI, furthering the open-source LLM mission. GPT4ALL is trained using the same technique as Alpaca, which is an assistant-style large language model with ~800k GPT-3.5-Turbo Generations based on LLaMa. IMO, it works even better than Alpaca and is super fast. This is basically like having ChatGPT on your local computer. Easy install. Nomic AI was also kind enough to include the weights in addition to the quantized model.

Is GPT4All your new personal ChatGPT?
In this video we are looking at the GPT4ALL model, which is an interesting (even though not for commercial use) project of taking a LLaMa model and finetuning it with a lot more instruction tasks than Alpaca.

Edge Impulse

Using GPT-4o to train a 2,000,000x smaller model (that runs directly on device)

The latest generation LLMs are absolutely astonishing: thanks to their multi-modal capabilities you can ask questions in natural language about stuff you can see or hear in the real world ("is there a person without a hard hat standing close to a machine?") and get relatively fast and reliable answers. But these large LLMs have downsides; they're absolutely huge, so you need to run them in the cloud, adding high latency (often seconds per inference), high cost (think about the tokens you'll burn when running inference 24/7), and high power (need a constant network connection).

In this video we're distilling knowledge from a large multimodal LLM (GPT-4o) and putting it in a tiny model, which we can run directly on device for ultra-low latency and without the need for a network connection, scaling even to microcontrollers with kilobytes of RAM if needed. Training was fully unsupervised: all labels were set by GPT-4o, including the decision of when to throw out data, and a transfer learning model was then trained on them with default settings.

One of the models we train has 800K parameters (an NVIDIA TAO model with MobileNet backend), a cool 2,200,000x fewer parameters than GPT-4o :-) with similar accuracy on this very narrow and specific task.
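The labeling-then-training flow described above can be sketched roughly as follows. This is a minimal illustration and not Edge Impulse's actual pipeline: `teacher_label` is a hypothetical stand-in for a GPT-4o labeling call, the two-number "samples" stand in for images, and a nearest-centroid classifier stands in for the small on-device model.

```python
import random


def teacher_label(sample):
    """Hypothetical stand-in for asking GPT-4o to label one sample.

    Returns a class name, or None when the sample should be thrown out.
    """
    brightness, helmet_score = sample
    if brightness < 0.1:  # too dark to judge, so discard the sample
        return None
    return "hard_hat" if helmet_score > 0.5 else "no_hard_hat"


def train_student(samples, labels):
    """Fit a nearest-centroid classifier (stand-in for the tiny model)."""
    sums, counts = {}, {}
    for sample, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(sample))
        for i, value in enumerate(sample):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, sample):
    """Return the label whose centroid is closest to the sample."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, sample))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Unlabeled raw data: the teacher labels it and decides what to drop.
rng = random.Random(0)
raw = [(rng.random(), rng.random()) for _ in range(500)]
kept = [(s, y) for s, y in ((s, teacher_label(s)) for s in raw) if y is not None]
samples, labels = zip(*kept)

# The student only ever sees the teacher's labels, never human ones.
student = train_student(samples, labels)
```

After training, `predict(student, sample)` runs without any call to the teacher, which is the property that makes on-device inference possible in the setup the video describes.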

The GPT-4o labeling block and TAO transfer learning models are available to all enterprise customers in Edge Impulse. A 2-week free trial is available; sign up at Edge Impulse (https://edgeimpulse.com).

PrivateGPT

PrivateGPT lets you chat directly with your documents (PDF, TXT, and CSV) completely locally, securely, and privately, and it is open source. It is a project that uses GPT4All for a specific task: querying documents through the LangChain framework. By default it uses the GPT4All model and sentence_transformer embeddings, but any model can be substituted, and the embeddings can be replaced by any embeddings that LangChain supports. PrivateGPT is built with LangChain, GPT4All, LlamaCpp, Chroma and SentenceTransformers.

PrivateGPT works by ingesting your documents into a vector store and then using a Large Language Model (LLM) to answer questions about the information they contain. Because it runs the LLM locally on your computer, it can be used offline, without connecting to any online servers or adding API keys from OpenAI or Pinecone, so your data remains private and secure.

To set up PrivateGPT, install the required dependencies, download the LLM, and configure the environment variables in the `.env` file. Once set up, you can ingest your documents into the vector store and ask questions about them.
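The ingest-then-query flow can be illustrated with a small, dependency-free sketch. It is not PrivateGPT's code: the `VectorStore` class, the bag-of-words `embed` function, and `build_prompt` are all hypothetical stand-ins (a real setup would use a sentence-embedding model, a store such as Chroma, and a local LLM to answer the final prompt).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; stands in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory store of (chunk, embedding) pairs."""

    def __init__(self):
        self.entries = []

    def ingest(self, document, chunk_size=20):
        """Split a document into word chunks and store their embeddings."""
        words = document.split()
        for start in range(0, len(words), chunk_size):
            chunk = " ".join(words[start:start + chunk_size])
            self.entries.append((chunk, embed(chunk)))

    def search(self, question, k=1):
        """Return the k chunks most similar to the question."""
        query = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be handed to the local LLM."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.ingest("The warranty covers parts and labour for two years from the purchase date.")
store.ingest("Returns are accepted within thirty days if the item is unused.")

question = "How long is the warranty?"
top_chunks = store.search(question)
prompt = build_prompt(question, top_chunks)
```

Everything here runs in one process with no network access, which mirrors the offline property described above; only the retrieved chunks, not the whole document set, are placed in the prompt.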

SentenceTransformers is a Python framework for state-of-the-art sentence, text, and image embeddings. It is based on PyTorch and Transformers and offers a large collection of pre-trained models tuned for various tasks. You can use this framework to compute sentence/text embeddings for more than 100 languages. These embeddings can then be compared, for example, with cosine similarity to find sentences with a similar meaning. This can be useful for semantic textual similarity, semantic search, or paraphrase mining. You can install the SentenceTransformers library using pip: `pip install -U sentence-transformers`
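To make the cosine-similarity comparison concrete, here is a small sketch in plain Python. The 4-dimensional vectors are made up for illustration; in practice they would come from a SentenceTransformers model (for example via its `encode` method) and would have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up embeddings, NOT real model output: assumed values for illustration.
embeddings = {
    "A cat sits on the mat.": [0.9, 0.1, 0.0, 0.2],
    "A kitten rests on the rug.": [0.8, 0.2, 0.1, 0.3],
    "Stock prices fell sharply today.": [0.0, 0.9, 0.8, 0.1],
}


def most_similar(query, table):
    """Return (sentence, score) for the sentence closest in meaning to query."""
    query_vec = table[query]
    scored = [
        (sentence, cosine_similarity(query_vec, vec))
        for sentence, vec in table.items()
        if sentence != query
    ]
    return max(scored, key=lambda pair: pair[1])


best_sentence, best_score = most_similar("A cat sits on the mat.", embeddings)
```

With these assumed vectors, the kitten sentence ranks far above the stock-market one, which is the behaviour semantic search and paraphrase mining rely on.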