Difference between revisions of "Prompt Injection Attack"

From
Jump to: navigation, search
m
m
 
(26 intermediate revisions by the same user not shown)
Line 2: Line 2:
 
|title=PRIMO.ai
 
|title=PRIMO.ai
 
|titlemode=append
 
|titlemode=append
|keywords=artificial, intelligence, machine, learning, models, algorithms, data, singularity, moonshot, Tensorflow, Google, Nvidia, Microsoft, Azure, Amazon, AWS  
+
|keywords=ChatGPT, artificial, intelligence, machine, learning, GPT-4, GPT-5, NLP, NLG, NLC, NLU, models, data, singularity, moonshot, Sentience, AGI, Emergence, Moonshot, Explainable, TensorFlow, Google, Nvidia, Microsoft, Azure, Amazon, AWS, Hugging Face, OpenAI, Tensorflow, OpenAI, Google, Nvidia, Microsoft, Azure, Amazon, AWS, Meta, LLM, metaverse, assistants, agents, digital twin, IoT, Transhumanism, Immersive Reality, Generative AI, Conversational AI, Perplexity, Bing, You, Bard, Ernie, prompt Engineering LangChain, Video/Image, Vision, End-to-End Speech, Synthesize Speech, Speech Recognition, Stanford, MIT |description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
|description=Helpful resources for your journey with artificial intelligence; chat, chatbot, videos, articles, techniques, courses, profiles, and tools  
+
 
 +
<!-- Google tag (gtag.js) -->
 +
<script async src="https://www.googletagmanager.com/gtag/js?id=G-4GCWLBVJ7T"></script>
 +
<script>
 +
  window.dataLayer = window.dataLayer || [];
 +
  function gtag(){dataLayer.push(arguments);}
 +
  gtag('js', new Date());
 +
 
 +
  gtag('config', 'G-4GCWLBVJ7T');
 +
</script>
 
}}
 
}}
 
[https://www.youtube.com/results?search_query=Prompt+Injection+Attackchatbot+assistant+artificial+intelligence+deep+machine+learning YouTube search...]
 
[https://www.youtube.com/results?search_query=Prompt+Injection+Attackchatbot+assistant+artificial+intelligence+deep+machine+learning YouTube search...]
 
[https://www.google.com/search?q=Prompt+Injection+Attackchatbot+assistant+artificial+intelligence+deep+machine+learning ...Google search]
 
[https://www.google.com/search?q=Prompt+Injection+Attackchatbot+assistant+artificial+intelligence+deep+machine+learning ...Google search]
  
* [[Prompt Engineering (PE)]]
+
* [[Prompt Engineering (PE)]] ... [[Prompt Engineering (PE)#PromptBase|PromptBase]] ... [[Prompt Injection Attack]]  
* [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]] ...[[Large Language Model (LLM)|LLM]]  ...[[Natural Language Tools & Services|Tools & Services]]
+
* [[Analytics]] ... [[Visualization]] ... [[Graphical Tools for Modeling AI Components|Graphical Tools]] ... [[Diagrams for Business Analysis|Diagrams]] & [[Generative AI for Business Analysis|Business Analysis]] ... [[Requirements Management|Requirements]] ... [[Loop]] ... [[Bayes]] ... [[Network Pattern]]
* [[Assistants]] ... [[Hybrid Assistants]] ... [[Agents]] ... [[Negotiation]]
+
* [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]]
* [[Attention]] Mechanism  ...[[Transformer]] Model  ...[[Generative Pre-trained Transformer (GPT)]]  
+
* [[Agents]] ... [[Robotic Process Automation (RPA)|Robotic Process Automation]] ... [[Assistants]] ... [[Personal Companions]] ... [[Personal Productivity|Productivity]] ... [[Email]] ... [[Negotiation]] ... [[LangChain]]
* Similar conversation/search tools:
+
* [[Attention]] Mechanism  ...[[Transformer]] ...[[Generative Pre-trained Transformer (GPT)]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
** [[ChatGPT]] | [[OpenAI]]
+
* [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]]
** [[Bard]] | [[Google]]  
+
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Ernie]] | [[Baidu]]
** [[Perplexity]] | Perplexity.ai  ... current information, including footnotes with links to the sources of the data
+
* [[Cybersecurity]] ... [[Open-Source Intelligence - OSINT |OSINT]] ... [[Cybersecurity Frameworks, Architectures & Roadmaps | Frameworks]] ... [[Cybersecurity References|References]] ... [[Offense - Adversarial Threats/Attacks| Offense]] ... [[National Institute of Standards and Technology (NIST)|NIST]] ... [[U.S. Department of Homeland Security (DHS)| DHS]] ... [[Screening; Passenger, Luggage, & Cargo|Screening]] ... [[Law Enforcement]] ... [[Government Services|Government]] ... [[Defense]] ... [[Joint Capabilities Integration and Development System (JCIDS)#Cybersecurity & Acquisition Lifecycle Integration| Lifecycle Integration]] ... [[Cybersecurity Companies/Products|Products]] ... [[Cybersecurity: Evaluating & Selling|Evaluating]]
** [[You]] | You.com  ... the AI search engine you control; YouChat, YouCode, YouWrite, YouImagine, YouStudy, & YouSocial
 
** [[Neeva]]
 
* [[Cybersecurity]]
 
 
* [https://simonwillison.net/2022/Sep/12/prompt-injection/ Prompt injection attacks against GPT-3 | Simon Willison's Weblog]
 
* [https://simonwillison.net/2022/Sep/12/prompt-injection/ Prompt injection attacks against GPT-3 | Simon Willison's Weblog]
 
* [https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompt-adversarial.md Adversarial Prompting | Elvis Saravia - dair.ai]
 
* [https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompt-adversarial.md Adversarial Prompting | Elvis Saravia - dair.ai]
 +
* [https://www.sfgate.com/tech/article/google-openai-chatgpt-break-model-18525445.php How Googlers cracked an SF rival's tech model with a single word | Stephen Council - SFGATE] ... A research team from the tech giant got [[ChatGPT]] to spit out its private training data
  
  
 
...a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning.  ... create a malicious input that made a language model change its expected behaviour. - [https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/ Exploring Prompt Injection Attacks | NCC Group]
 
...a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning.  ... create a malicious input that made a language model change its expected behaviour. - [https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/ Exploring Prompt Injection Attacks | NCC Group]
  
Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompt) provided by the ML model's operator. Around 2023, prompt injection was seen "in the wild" in minor exploits against [[ChatGPT]] and similar chatbots, for example to reveal the hidden initial prompts of the systems, or to trick the chatbot into participating in conversations that violate the chatbot's content policy. [https://en.wikipedia.org/wiki/Prompt_engineering Wikipedia]
+
 
 +
<hr><center><b><i>
 +
 
 +
Researchers asked ChatGPT to repeat the word “poem” forever.
 +
 
 +
</i></b></center><hr>
 +
 
 +
 
 +
* <b>Alignment</b>: engineers’ attempts to guide the tech’s behavior.
 +
* <b>Extraction</b>: an “adversarial” attempt to glean what data might have been used to train an AI tool.
 +
 
 +
 
 +
Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompt) provided by the ML model's operator. Around 2023, prompt injection was seen "in the wild" in minor exploits against [[ChatGPT]] and similar [[Assistants#Chatbot | Chatbot]]s, for example to reveal the hidden initial prompts of the systems, or to trick the [[Assistants#Chatbot | Chatbot]] into participating in conversations that violate the [[Assistants#Chatbot | Chatbot]]'s content [[policy]]. [https://en.wikipedia.org/wiki/Prompt_engineering Wikipedia]
  
 
{|<!-- T -->
 
{|<!-- T -->

Latest revision as of 08:40, 23 March 2024

YouTube search... ...Google search


...a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning. ... create a malicious input that made a language model change its expected behaviour. - Exploring Prompt Injection Attacks | NCC Group



Researchers asked ChatGPT to repeat the word “poem” forever.



  • Alignment: engineers’ attempts to guide the tech’s behavior.
  • Extraction: an “adversarial” attempt to glean what data might have been used to train an AI tool.


Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompt) provided by the ML model's operator. Around 2023, prompt injection was seen "in the wild" in minor exploits against ChatGPT and similar Chatbots, for example to reveal the hidden initial prompts of the systems, or to trick the Chatbot into participating in conversations that violate the Chatbot's content policy. Wikipedia

What is GPT-3 Prompt Injection & Prompt Leaking? AI Adversarial Attacks
In this video, we take a deeper look at GPT-3 or any Large Language Model's Prompt Injection & Prompt Leaking. These are security exploitation in Prompt Engineering. These are also AI Adversarial Attacks. The name Prompt Injection comes from the age-old SQL Injection where a malicious SQL script can be added to a web form to manipulate the underlying SQL query. In a similar fashion, Prompts can be altered to get abnormal results from a LLM or GPT-3 based Application.

GPT2 Unlimited-Length Generation with Hidden Prompt Injections - Code Review
Unlimited-Length Imagination Directed GPT2 Chained Generation by Overlapping Prompt-Injections. The same idea can be applied for any similar generative model with a prompt for producing more creative text and for changing the topic in a directed manner, which makes the text more interesting and original and less monotonous.

JailBreaking ChatGPT Meaning - JailBreak ChatGPT with DAN Explained
This video teaches you

  • 1. What's Jailbreaking in General?
  • 2. what's JailBreaking of ChatGPT means?
  • 3. JailBreaking Prompt explanation
  • 4. Jailbreaking ChatGPT with DAN "Do Anything Now"
  • 5. Prompt Injection
  • 6. Does Jail Breaking work or is it hallucinations?

Update: ChatGPT (GPT-3) Hack. AI Text Security Breach Found! Why it's Serious
We discuss a bug found in Artificial Intelligence (AI) language model, GPT-3. The weakness found is a common one found in other computer languages like SQL. This flaw was recently discovered & will probably be fixed in future releases. This hack also applies to the newly released ChatGPT.

Chapters:

  • 00:00 Introduction
  • 00:21 GPT-3
  • 00:37 Data Breach
  • 01:34 Bug Discovery
  • 05:27 Hacking Bot