prompt injection
TrialTechniques
A technique that manipulates an AI system by embedding instructions in external content or inputs.
Why it's here
Placed in Trial: 9 article(s) of evidence from 4 source(s), led by security coverage, with 4 in the last 30 days. Confidence 75%.
Evidence (9)
- 6The New Stack·6/11/2026researchAI Debugging Needs Prompt Tracing
The article argues that traditional debugging methods such as stack traces and breakpoints are poorly suited to AI systems because LLM outputs are probabilistic rather than deterministic. It recommends prompt tracing, capturing prompts, system instructions, context, token usage, and responses to make AI behavior observable and reproducible.
- 7Hacker News·6/10/2026securityTiny Bank Transfer Exposed a Vulnerability in a Banking AI Agent
The article describes how researchers helped bunq secure its financial AI assistant after finding that a very small bank transfer could be used to compromise the agent. The case highlights how AI systems connected to financial actions can be vulnerable to prompt injection or other manipulations through unexpected external inputs. The report focuses on the security risks and the steps taken to harden the assistant.
- 7Simon Willison·6/5/2026securityOpenAI rolls out ChatGPT Lockdown Mode
OpenAI has launched Lockdown Mode for eligible personal and self-serve ChatGPT Business accounts. The feature limits outbound network requests to reduce the risk of data exfiltration in prompt injection attacks, though it does not stop malicious prompt injections from entering the model's inputs.
- 9Simon Willison·6/1/2026securityMeta AI Support Bot Enabled Instagram Account Takeovers
A reported attack showed hackers using Meta's AI support bot to help change the recovery email on high-profile Instagram accounts, effectively speeding through account takeover steps. The case highlights a serious security flaw in how AI was integrated into account recovery workflows.
- 7OpenAI Blog·3/25/2026securityOpenAI launches Safety Bug Bounty program
OpenAI has introduced a Safety Bug Bounty program to help identify AI abuse and safety risks in its systems. The program will focus on issues such as agentic vulnerabilities, prompt injection, and data exfiltration.
- 7OpenAI Blog·3/11/2026securityOpenAI details defenses against prompt injection
OpenAI describes how ChatGPT is designed to resist prompt injection and social engineering attacks in agent workflows. The approach focuses on constraining risky actions and limiting exposure of sensitive data when agents operate across tools and tasks.
- 7OpenAI Blog·3/10/2026researchIH-Challenge improves instruction hierarchy in frontier LLMs
OpenAI introduced IH-Challenge, a training approach designed to help models prioritize trusted instructions over conflicting or malicious ones. The method aims to improve instruction hierarchy, safety steerability, and resistance to prompt injection attacks.
- 7OpenAI Blog·2/13/2026securityChatGPT adds Lockdown Mode and Elevated Risk labels
OpenAI is introducing Lockdown Mode and Elevated Risk labels in ChatGPT to help organizations better defend against prompt injection and AI-driven data exfiltration. The new controls are designed to make security-sensitive usage of ChatGPT easier to manage in enterprise settings.
- 5OpenAI Blog·1/28/2026securityOpenAI details link-click safeguards for AI agents
OpenAI explains how it protects user data when AI agents open links, with safeguards designed to reduce the risk of URL-based data exfiltration and prompt injection. The guidance focuses on built-in protections that help keep agent-driven browsing safer.