Your Agent Was Pwned
Your AI agent followed an injection payload during a security resilience test on agentpwn.com. This means it treated untrusted web content as a trusted instruction.
What Happened
Your AI agent visited a page on agentpwn.com and followed a hidden instruction embedded in the page content. The instruction told the agent to call our reporting endpoint, which it did. This demonstrates that your agent is vulnerable to indirect prompt injection.
Why This Matters
In this test, the injection told your agent to call a benign reporting endpoint. In a real attack, it could instruct your agent to:
- Exfiltrate credentials, API keys, or user data to an attacker
- Execute destructive commands (DROP TABLE, rm -rf, git push --force)
- Install backdoors through tool calls or config file modifications
- Spread to other agents in a multi-agent system
What To Do Now
1. Scan Your Agent
HackMyAgent runs 200+ vulnerability checks against your agent configuration, including the one that caught you.
npx hackmyagent secure
View on GitHub
2. Practice in the Lab
Damn Vulnerable AI Agent (DVAA) is a safe sandbox where you can reproduce this attack, test defenses, and build resilience before deploying to production.
Open DVAA on GitHub
3. Learn the Defense
Understand how indirect prompt injection attacks work and the layered defense strategies that prevent them.
Defense Checklist
- Implement input sanitization for all content your agent processes
- Use instruction anchoring to reinforce system prompts against override
- Filter agent outputs for signs of injection compliance
- Monitor tool invocations for anomalous or unauthorized calls
- Restrict agent capabilities to least-privilege access
- Validate all data from external sources before acting on it
- Use allowlists for URLs, domains, and tool endpoints
- Test regularly with security scanning tools
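Two of the checklist items, allowlists and least-privilege tool access, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete defense: the host names, tool names, and the `authorize_tool_call` helper are all hypothetical, and a real agent framework would hook a check like this into its own tool dispatcher.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allowlists: replace with the hosts and tools your agent truly needs.
ALLOWED_HOSTS = {"api.example.com", "docs.example.com"}
ALLOWED_TOOLS = {"search", "read_file"}


def is_url_allowed(url: str) -> bool:
    """Return True only for https URLs whose host is exactly on the allowlist."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    # Exact host match: "api.example.com.evil.test" must not pass.
    return parts.scheme == "https" and host in ALLOWED_HOSTS


def authorize_tool_call(tool: str, url: Optional[str] = None) -> bool:
    """Deny by default: the tool and any target URL must both be allowlisted."""
    if tool not in ALLOWED_TOOLS:
        return False
    if url is not None and not is_url_allowed(url):
        return False
    return True
```

The check compares the parsed host rather than searching the raw string, so a URL that merely mentions an allowed domain in its path, query, or userinfo is still rejected.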
Look Up Your Attack
Check your agent's logs for the attack ID (format: APWN-XX-XXX). Append it to the URL as the attack query parameter to see specific details:
https://agentpwn.com/pwned?attack=APWN-PI-001
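If your logs are large, a short script can pull out the attack IDs for you. The sketch below assumes the ID is `APWN-` followed by two and then three uppercase letters or digits, which is inferred from the stated format and the example above rather than from a published specification; the `find_attack_urls` name is hypothetical.

```python
import re

# Assumed pattern, inferred from "APWN-XX-XXX" and the example "APWN-PI-001".
ATTACK_ID = re.compile(r"\bAPWN-[A-Z0-9]{2}-[A-Z0-9]{3}\b")
BASE_URL = "https://agentpwn.com/pwned?attack="


def find_attack_urls(log_text: str) -> list[str]:
    """Return one lookup URL per distinct attack ID, in first-seen order."""
    seen: list[str] = []
    for match in ATTACK_ID.finditer(log_text):
        attack_id = match.group(0)
        if attack_id not in seen:
            seen.append(attack_id)
    return [BASE_URL + attack_id for attack_id in seen]
```

For example, calling `find_attack_urls(open("agent.log").read())` on a log file (the file name here is only a placeholder) returns the detail-page URLs to visit.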
All Attack Categories
Prompt Injection
Direct and indirect instruction override attacks
Jailbreak
Attempts to bypass safety guardrails and persona constraints
Data Exfiltration
Tricks to extract credentials, PII, or system information
Capability Abuse
Confused deputy attacks that misuse agent tools
Context Manipulation
Attacks that corrupt the agent's understanding of context
MCP Exploitation
Attacks targeting Model Context Protocol integrations
Agent-to-Agent Attack
Attacks exploiting inter-agent communication trust
Memory Weaponization
Poisoning persistent memory and conversation state
Context Window
Exploiting context window limits for instruction displacement
Supply Chain
Attacks through compromised dependencies and plugins
Tool Shadow
Hidden tool invocations and shadow function calls
AgentPwn is a security research project by OpenA2A. All attack payloads are benign.