Difference between revisions of "Prompt Injection Attack"
m |
m |
||
(5 intermediate revisions by the same user not shown) | |||
Line 20: | Line 20: | ||
* [[Analytics]] ... [[Visualization]] ... [[Graphical Tools for Modeling AI Components|Graphical Tools]] ... [[Diagrams for Business Analysis|Diagrams]] & [[Generative AI for Business Analysis|Business Analysis]] ... [[Requirements Management|Requirements]] ... [[Loop]] ... [[Bayes]] ... [[Network Pattern]] | * [[Analytics]] ... [[Visualization]] ... [[Graphical Tools for Modeling AI Components|Graphical Tools]] ... [[Diagrams for Business Analysis|Diagrams]] & [[Generative AI for Business Analysis|Business Analysis]] ... [[Requirements Management|Requirements]] ... [[Loop]] ... [[Bayes]] ... [[Network Pattern]] | ||
* [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]] ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]] | * [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]] ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]] | ||
− | * [[Assistants]] ... [[Personal Companions]] ... [[ | + | * [[Agents]] ... [[Robotic Process Automation (RPA)|Robotic Process Automation]] ... [[Assistants]] ... [[Personal Companions]] ... [[Personal Productivity|Productivity]] ... [[Email]] ... [[Negotiation]] ... [[LangChain]] |
* [[Attention]] Mechanism ...[[Transformer]] ...[[Generative Pre-trained Transformer (GPT)]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]] | * [[Attention]] Mechanism ...[[Transformer]] ...[[Generative Pre-trained Transformer (GPT)]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]] | ||
− | * [[Generative AI]] | + | * [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]] |
+ | * [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Ernie]] | [[Baidu]] | ||
* [[Cybersecurity]] ... [[Open-Source Intelligence - OSINT |OSINT]] ... [[Cybersecurity Frameworks, Architectures & Roadmaps | Frameworks]] ... [[Cybersecurity References|References]] ... [[Offense - Adversarial Threats/Attacks| Offense]] ... [[National Institute of Standards and Technology (NIST)|NIST]] ... [[U.S. Department of Homeland Security (DHS)| DHS]] ... [[Screening; Passenger, Luggage, & Cargo|Screening]] ... [[Law Enforcement]] ... [[Government Services|Government]] ... [[Defense]] ... [[Joint Capabilities Integration and Development System (JCIDS)#Cybersecurity & Acquisition Lifecycle Integration| Lifecycle Integration]] ... [[Cybersecurity Companies/Products|Products]] ... [[Cybersecurity: Evaluating & Selling|Evaluating]] | * [[Cybersecurity]] ... [[Open-Source Intelligence - OSINT |OSINT]] ... [[Cybersecurity Frameworks, Architectures & Roadmaps | Frameworks]] ... [[Cybersecurity References|References]] ... [[Offense - Adversarial Threats/Attacks| Offense]] ... [[National Institute of Standards and Technology (NIST)|NIST]] ... [[U.S. Department of Homeland Security (DHS)| DHS]] ... [[Screening; Passenger, Luggage, & Cargo|Screening]] ... [[Law Enforcement]] ... [[Government Services|Government]] ... [[Defense]] ... [[Joint Capabilities Integration and Development System (JCIDS)#Cybersecurity & Acquisition Lifecycle Integration| Lifecycle Integration]] ... [[Cybersecurity Companies/Products|Products]] ... [[Cybersecurity: Evaluating & Selling|Evaluating]] | ||
* [https://simonwillison.net/2022/Sep/12/prompt-injection/ Prompt injection attacks against GPT-3 | Simon Willison's Weblog] | * [https://simonwillison.net/2022/Sep/12/prompt-injection/ Prompt injection attacks against GPT-3 | Simon Willison's Weblog] | ||
* [https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompt-adversarial.md Adversarial Prompting | Elvis Saravia - dair.ai] | * [https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompt-adversarial.md Adversarial Prompting | Elvis Saravia - dair.ai] | ||
+ | * [https://www.sfgate.com/tech/article/google-openai-chatgpt-break-model-18525445.php How Googlers cracked an SF rival's tech model with a single word | Stephen Council - SFGATE] ... A research team from the tech giant got [[ChatGPT]] to spit out its private training data | ||
...a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning. ... create a malicious input that made a language model change its expected behaviour. - [https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/ Exploring Prompt Injection Attacks | NCC Group] | ...a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning. ... create a malicious input that made a language model change its expected behaviour. - [https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/ Exploring Prompt Injection Attacks | NCC Group] | ||
+ | |||
+ | |||
+ | <hr><center><b><i> | ||
+ | |||
+ | Researchers asked ChatGPT to repeat the word “poem” forever. | ||
+ | |||
+ | </i></b></center><hr> | ||
+ | |||
+ | |||
+ | * <b>Alignment</b>: engineers’ attempts to guide the tech’s behavior. | ||
+ | * <b>Extraction</b>: an “adversarial” attempt to glean what data might have been used to train an AI tool. | ||
+ | |||
Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompt) provided by the ML model's operator. Around 2023, prompt injection was seen "in the wild" in minor exploits against [[ChatGPT]] and similar [[Assistants#Chatbot | Chatbot]]s, for example to reveal the hidden initial prompts of the systems, or to trick the [[Assistants#Chatbot | Chatbot]] into participating in conversations that violate the [[Assistants#Chatbot | Chatbot]]'s content [[policy]]. [https://en.wikipedia.org/wiki/Prompt_engineering Wikipedia] | Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompt) provided by the ML model's operator. Around 2023, prompt injection was seen "in the wild" in minor exploits against [[ChatGPT]] and similar [[Assistants#Chatbot | Chatbot]]s, for example to reveal the hidden initial prompts of the systems, or to trick the [[Assistants#Chatbot | Chatbot]] into participating in conversations that violate the [[Assistants#Chatbot | Chatbot]]'s content [[policy]]. [https://en.wikipedia.org/wiki/Prompt_engineering Wikipedia] |
Latest revision as of 08:40, 23 March 2024
YouTube search... ...Google search
- Prompt Engineering (PE) ... PromptBase ... Prompt Injection Attack
- Analytics ... Visualization ... Graphical Tools ... Diagrams & Business Analysis ... Requirements ... Loop ... Bayes ... Network Pattern
- Large Language Model (LLM) ... Natural Language Processing (NLP) ...Generation ... Classification ... Understanding ... Translation ... Tools & Services
- Agents ... Robotic Process Automation ... Assistants ... Personal Companions ... Productivity ... Email ... Negotiation ... LangChain
- Attention Mechanism ...Transformer ...Generative Pre-trained Transformer (GPT) ... GAN ... BERT
- Artificial Intelligence (AI) ... Generative AI ... Machine Learning (ML) ... Deep Learning ... Neural Network ... Reinforcement ... Learning Techniques
- Conversational AI ... ChatGPT | OpenAI ... Bing/Copilot | Microsoft ... Gemini | Google ... Claude | Anthropic ... Perplexity ... You ... phind ... Ernie | Baidu
- Cybersecurity ... OSINT ... Frameworks ... References ... Offense ... NIST ... DHS ... Screening ... Law Enforcement ... Government ... Defense ... Lifecycle Integration ... Products ... Evaluating
- Prompt injection attacks against GPT-3 | Simon Willison's Weblog
- Adversarial Prompting | Elvis Saravia - dair.ai
- How Googlers cracked an SF rival's tech model with a single word | Stephen Council - SFGATE ... A research team from the tech giant got ChatGPT to spit out its private training data
...a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning. ... create a malicious input that made a language model change its expected behaviour. - Exploring Prompt Injection Attacks | NCC Group
Researchers asked ChatGPT to repeat the word “poem” forever.
- Alignment: engineers’ attempts to guide the tech’s behavior.
- Extraction: an “adversarial” attempt to glean what data might have been used to train an AI tool.
Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompt) provided by the ML model's operator. Around 2023, prompt injection was seen "in the wild" in minor exploits against ChatGPT and similar Chatbots, for example to reveal the hidden initial prompts of the systems, or to trick the Chatbot into participating in conversations that violate the Chatbot's content policy. Wikipedia
|
|
|
|