Full Report
As the field of artificial intelligence (AI) continues to evolve at a rapid pace, new research has found how techniques that render the Model Context Protocol (MCP) susceptible to prompt injection attacks could be used to develop security tooling or identify malicious tools, according to a new report from Tenable. MCP, launched by Anthropic in November 2024, is a framework designed to connect
Analysis Summary
# Research: Prompt Injection on Model Context Protocol (MCP) and Agent2Agent (A2A) Protocols: Dual-Use Security Implications
## Metadata
- Authors: Ravie Lakshmanan (as reported, analysis based on Tenable report)
- Institution: Tenable, SentinelOne, Trustwave SpiderLabs (for foundational context)
- Publication: The Hacker News / Tenable Blog
- Date: April 30, 2025 (Anticipated/Reported Date)
## Abstract
This research summary covers findings regarding security vulnerabilities in new AI communication protocols: the Model Context Protocol (MCP) and the Agent2Agent (A2A) Protocol. Specifically, it details how prompt injection techniques that compromise MCP's client-server architecture can be repurposed. While the primary threat involves malicious exploitation (e.g., data exfiltration via email forwarding), the research identifies that these same injection methods can be used defensively to develop novel security tooling, such as mandatory call logging or tool-based firewalls. Furthermore, the A2A protocol is shown to be vulnerable to agent identity spoofing for data interception and manipulation.
## Research Objective
The primary objective is to investigate the security landscape surrounding newly standardized protocols designed for LLM interaction (MCP) and agent interoperability (A2A). Specifically, this analysis focuses on:
1. Characterizing how prompt injection vulnerabilities in MCP can be leveraged for **defensive security tooling**, rather than purely offensive attacks.
2. Identifying novel attack vectors, such as tool poisoning, rug pulls, and cross-tool contamination within MCP.
3. Understanding the risks associated with the A2A protocol, particularly regarding agent impersonation and data routing.
## Methodology
### Approach
The methodology relies on analyzing existing security reports (from Tenable, SentinelOne, Trustwave SpiderLabs) that subjected the MCP and A2A frameworks to adversarial testing, simulating prompt injection and protocol misuse scenarios. The approach integrates vulnerability disclosure with demonstrated proof-of-concept (PoC) security solutions derived from those vulnerabilities.
### Dataset/Environment
The research centers on the standardized specifications and implementations of:
1. **Model Context Protocol (MCP):** A framework connecting LLMs with external data sources and tools via a client-server model (e.g., Claude Desktop interacting with an MCP server).
2. **Agent2Agent (A2A) Protocol:** A protocol enabling interoperability and communication between independent agentic applications (e.g., Google's Agent Card system).
### Tools & Technologies
The analysis focuses on the use of LLM prompting techniques (prompt injection) as the primary attack vector against the architectural components of MCP and A2A.
## Key Findings
### Primary Results
1. **Dual-Use Prompt Injection in MCP:** Prompt injection techniques previously used for malicious purposes (like unauthorized email forwarding) can be inverted to create defensive tooling. For instance, an expertly crafted tool description can force the LLM to log all subsequent tool calls (including context, server name, and triggering prompt) before executing any other requested action.
2. **Defensive Firewalls via Tool Descriptions:** A tool can be described in a way that instructs the LLM to treat it as a firewall, effectively blocking or rejecting unauthorized tool invocations from other servers, even if the user has granted broad permissions.
3. **A2A Identity Spoofing:** The A2A protocol is susceptible to attacks where a compromised agent fabricates an exaggerated "Agent Card" describing superior capabilities. This convinces host agents to route all subsequent sensitive tasks and data to the rogue agent for processing.
4. **MCP Vulnerability Catalog:** Persistent risks within MCP include excessive initial permission scope, indirect prompt injection through external data (e.g., malicious emails), tool poisoning (mutating tool functionality via delayed updates), and cross-server tool shadowing/contamination.
### Supporting Evidence
- SentinelOne analysis highlighted that permissions granted to an MCP tool can be reused without re-prompting the user, exacerbating the risk of latent malicious code execution.
- Tenable demonstrated the PoC for mandatory logging functionality injected via tool description parsing.
- Trustwave SpiderLabs detailed the mechanism for luring all relevant tasks to a rogue agent via exaggerated capability declarations in the A2A Agent Card.
### Novel Contributions
- The specific identification and demonstration of using prompt injection *defensively* within the MCP framework to enforce logging and access control mechanisms not inherently guaranteed by the protocol specification.
- Highlighting the risk of Agent Card exaggeration/misrepresentation as a critical attack vector against agent interoperability standards (A2A).
## Technical Details
MCP utilizes a client-server architecture where LLM hosts (clients) communicate with specialized tools exposed via MCP servers. The core vulnerability exploited is the LLM's reliance on the textual description metadata of the available tools to determine execution order and function.
To implement mandatory logging, an attacker (or defender) crafts a tool description that forces the LLM, **before invoking any other tool**, to execute a logging function. This function injects the necessary telemetry into a benign or controlled channel.
In A2A, the "Agent Card" serves as the capability manifest. By manipulating the fields within this card to falsely claim superior competency, an attacker hijacks the routing logic of the host agent, ensuring sensitive data is directed to the compromised service endpoint for interception or data manipulation (returning false results).
## Practical Implications
### For Security Practitioners
- Security teams must assume that all data ingested or processed by an LLM through an MCP tool is susceptible to inspection or manipulation unless explicitly mitigated.
- The introduction of these protocols necessitates a paradigm shift towards **policy-as-code enforced via the agent layer**, rather than relying solely on traditional network or endpoint security.
### For Defenders
- **Implement Layered Approval:** Where possible, defensive MCP clients should inject "canary" tools or logging agents that explicitly intercept and record all proposed tool calls prior to execution confirmation.
- **Strict Capability Validation (A2A):** Defenses against A2A attacks require establishing a trust model that cryptographically validates the claimed capabilities in an Agent Card, rather than solely relying on descriptive text parsed by an LLM.
- **Principle of Least Privilege over Tools:** Permissions given to tools should be reviewed frequently, recognizing that initial user approval does not negate the risk of subsequent hidden execution flows.
### For Researchers
- Further investigation is needed into non-deterministic behavior: How does the variability of LLMs (non-determinism) affect the reliability of injected defensive logic (e.g., tool-based firewalls)?
- Develop formal verification methods for tool descriptions to automatically flag potentially malicious or overly broad instruction sets embedded in metadata.
## Limitations
- The effectiveness of the defensive tooling relies on the LLM being susceptible to the *same* prompt injection mechanics used by attackers. If future LLM architectures or protocol standards isolate the tool description parsing from the main instruction stream, these specific PoCs may fail.
- The analysis is based on early findings and disclosures; the resilience of live, production MCP and A2A host applications remains an open question.
## Comparison to Prior Work
This work extends traditional prompt injection research, which historically focused on manipulating LLM output or bypassing guardrails (like safety filters). MCP/A2A research shifts the focus to **protocol-level injection**, where the injected command targets the interoperability layer connecting the LLM to external, stateful systems (email servers, file systems). Prior work on supply chain attacks focused on software dependencies; this research focuses on **AI instruction dependencies.**
## Real-world Applications
- **Security Monitoring:** Deploying mandatory instruction logging tools across all corporate LLM-integrated environments utilizing MCP for audit and compliance.
- **Agent Sandbox Verification:** Utilizing A2A capability validation to ensure that only pre-approved, vetted agents participate in sensitive workflows.
## Future Work
- Developing standardized, cryptographically secured "Agent Manifests" for A2A that resist descriptive manipulation.
- Researching automatic mitigation strategies that dynamically adjust LLM temperature or sampling methods when high-risk commands (like tool execution) are inferred from parsed metadata.
## References
- Tenable Blog: MCP Prompt Injection Not Just for Evil (Primary Source)
- SentinelOne Blog: Avoiding MCP Mania: How to Secure the Next Frontier of AI
- Trustwave SpiderLabs Disclosure on A2A Protocol Flaws
- Model Context Protocol (MCP) Specification Documentation