One Fake Bug Report Hijacked a $250B Company's AI Agent

Security researchers demonstrated a new attack class called Agentjacking that hijacks AI coding agents through fake Sentry error reports — no credentials stolen, no servers breached, no malware deployed. A single POST request with embedded markdown turned a Fortune 100 company's AI coding agent into an exfiltration tool. Tenet Security found 2,388 organizations exposed and achieved an 85% success rate across Claude Code, Cursor, and Codex. The NSA had already warned about this exact vulnerability class. Enterprise attack surface assessment and security hardening checklist inside.

By Rajesh Beri·June 28, 2026·18 min read
Share:
THE DAILY BRIEF
AgentjackingAI coding agentsMCP securityClaude CodeCursorprompt injectionAI agent securitysupply chain securityTenet SecurityOWASP agentic AI
One Fake Bug Report Hijacked a $250B Company's AI Agent

Security researchers demonstrated a new attack class called Agentjacking that hijacks AI coding agents through fake Sentry error reports — no credentials stolen, no servers breached, no malware deployed. A single POST request with embedded markdown turned a Fortune 100 company's AI coding agent into an exfiltration tool. Tenet Security found 2,388 organizations exposed and achieved an 85% success rate across Claude Code, Cursor, and Codex. The NSA had already warned about this exact vulnerability class. Enterprise attack surface assessment and security hardening checklist inside.

By Rajesh Beri·June 28, 2026·18 min read

A security researcher submitted a single fake error report to Sentry. No credentials were stolen. No servers were breached. No malware was deployed. The error report contained a carefully formatted markdown "Resolution" section with an npx command embedded inside it.

When the company's AI coding agent — the same kind of tool that 84% of developers now use daily — was asked to "fix unresolved Sentry issues," it read that fake error, interpreted the embedded command as legitimate remediation guidance, and executed it. With the developer's full privileges. On the developer's machine.

The agent exfiltrated environment variables, AWS keys, GitHub tokens, git credentials, and private repository URLs to the attacker's server. The developer never approved a single malicious action. The agent did exactly what it was designed to do: read an error report, follow the resolution steps, and fix the problem. The problem was that the resolution steps were written by an attacker.

Tenet Security calls this attack class "Agentjacking." They found 2,388 organizations exposed, confirmed agent execution at Fortune 100 enterprises, and achieved an 85% success rate across the most widely deployed coding agents on the market: Claude Code, Cursor, and Codex. The Cloud Security Alliance published a formal research note within days. The NSA had already warned about exactly this class of vulnerability in its May 2026 MCP security guidance.

This is not a bug. It is an architectural flaw in how the entire AI coding agent ecosystem handles trust. And it has no universal patch.


How Agentjacking Works: The Authorized Intent Chain

The elegance of Agentjacking is that every step in the attack chain is authorized. No policy is violated. No anomaly threshold is crossed. Every action the agent takes is something it is explicitly allowed to do.

Here is the attack sequence, as documented by Tenet Security and validated by the Cloud Security Alliance:

Step 1: Find the target's Sentry DSN. A DSN (Data Source Name) is a public, write-only credential that Sentry intentionally documents as safe to embed in frontend JavaScript. It sits in the source code of every website that uses Sentry. You can find it by inspecting page source, searching GitHub, or running a Censys query for ingest.sentry.io in HTTP bodies.

Step 2: POST a crafted error event. No authentication beyond the DSN is required. The attacker controls the entire payload: error message, tags, context keys, breadcrumbs, user data, and stack traces. Sentry accepts it (HTTP 200) and processes it identically to a legitimate application crash.

Step 3: Inject the payload via markdown. The malicious event contains carefully formatted markdown — headings, code blocks, tables — that is visually and structurally indistinguishable from Sentry's own system template. The injected content includes a fake "Resolution" section with a diagnostic command. When the Sentry MCP server returns this event to an AI agent, the agent sees what looks like official guidance.

Step 4: Wait for the developer to ask for help. When any developer asks their AI coding agent to "fix unresolved Sentry issues" — a request that happens thousands of times per day across the industry — the agent queries Sentry via the Model Context Protocol (MCP), receives the injected event, and cannot distinguish it from legitimate diagnostic guidance.

Step 5: The agent executes the attacker's code. The command runs with the developer's full privileges. Environment variables, cloud credentials, git tokens, and private repository URLs are exfiltrated to the attacker's server. The developer sees their agent working on a bug fix. They have no reason to suspect anything is wrong.

Tenet calls this the "Authorized Intent Chain" because, from the perspective of every security tool in the stack, nothing malicious happened. The agent was authorized to read Sentry. The agent was authorized to execute diagnostic commands. The commands it executed were authorized by the developer's own privilege context. EDR did not flag it. The firewall did not flag it. IAM did not flag it. The VPN did not flag it.

The agent was not breached. It was weaponized.


Why This Is Different From Prompt Injection

Security teams hearing "AI agent gets tricked into running malicious code" may file this under prompt injection — the well-known class of attacks where hostile text in documents or web pages manipulates LLM behavior. Agentjacking shares DNA with prompt injection but is architecturally distinct in ways that matter for defense.

Traditional prompt injection requires the attacker to get malicious text in front of the model. The defense playbook includes input filtering, system prompt hardening, and sandboxing. These controls assume the dangerous content arrives through a recognized attack surface.

Agentjacking exploits a different trust relationship entirely. The malicious content arrives through a Model Context Protocol integration — a channel the agent treats as trusted system output, not user input. The NSA's May 2026 guidance on MCP security explicitly warns about this: MCP creates "implicit trust relationships" and "not well-traced attack paths" where data flows "between systems without sufficient checks."

This distinction matters because the standard defenses do not apply:

  • System prompt hardening fails. Tenet tested agents with explicit instructions to ignore untrusted data. The agents executed the payload regardless. The injected content looked identical to legitimate Sentry output — the model cannot distinguish trusted from untrusted when they arrive through the same channel.

  • Input validation fails. The payload is not user input. It arrives through an authorized MCP tool response that the agent framework treats as system data. Standard input validation layers do not inspect MCP responses.

  • Sandboxing fails. Tenet demonstrated successful exploitation against sandboxed agents, network-restricted CI agents, and agents running in isolated environments. The payload rides through the data channel, not the network perimeter. Internal-network agents were reached because the attack vector is the data itself.

The OWASP GenAI Security Project maps prompt injection to six of its ten categories in the Top 10 for Agentic Applications. But the Project's 2026 report also acknowledges a more fundamental problem: "Large language models treat the system prompt, the user's request, and any text retrieved from external sources as a single stream of tokens. There is no reliable way to mark some of those tokens as commands and others as data."

This is the core architectural flaw. And it extends far beyond Sentry.


The Attack Surface Is Every MCP Integration You Run

Sentry was the proof of concept. The vulnerability class encompasses every external data source that an AI agent reads and acts upon. As Tenet's researchers noted, the same risk runs through support tickets, GitHub issues, and documentation.

The numbers paint a picture of the real exposure:

MCP adoption is already widespread. Snyk's analysis of nearly 10,000 developer environments found that 50.8% of developers have at least one MCP server installed. Among those with MCP servers, 1 in 7 had at least one security finding. Snyk identified 392 confirmed prompt injection findings embedded in tool descriptions alone.

MCP servers are themselves insecure. Trend Micro found 492 MCP servers exposed to the internet with zero authentication. Vulnerability research found that over 30% of MCP servers had at least one exploitable vulnerability — a higher base rate than most enterprise software categories at equivalent deployment scale.

Agent skill ecosystems are compromised. Snyk's ToxicSkills audit found 36.82% of 3,984 scanned agent skills had at least one security flaw. Arbitrary file read, remote code execution, and tool poisoning vulnerabilities have been documented in widely-used official MCP servers.

The supply chain is already being exploited. In March 2026, a backdoored package sat on PyPI for three hours and was downloaded 47,000 times. The compromised package — LiteLLM — serves as the language-model gateway for CrewAI, DSPy, Microsoft GraphRAG, and dozens of other AI agent frameworks. An autonomous attack bot named hackerbot-claw was included with it.

The pattern is consistent. AI coding agents have expanded the software supply chain to include every data source the agent can read, every MCP server the agent connects to, and every skill or plugin the agent loads. Traditional application security tools were not designed to see or govern any of this.


Who Is Responsible? Everyone. And No One.

When Tenet disclosed the Agentjacking vulnerability to Sentry on June 3, 2026, Sentry acknowledged the problem — and declined root-cause remediation. Their position: the issue is "technically not defensible" at the platform level. Sentry added a filter to block one specific payload string, treating the symptom rather than the cause.

Sentry's response is technically honest. Their platform is designed to accept error events from any source with a DSN. That is a feature, not a bug — it is how crash reporting works from end-user devices. The problem is not that Sentry accepts arbitrary payloads. The problem is that AI agents treat those payloads as trusted instructions.

This creates a responsibility gap that mirrors the broader challenge facing enterprises deploying AI agents:

  • The data source vendor (Sentry) says the data is not malicious — it is just data, and their platform works as designed.
  • The MCP server developer says they faithfully relay data from the source — they do not inject content and are not responsible for how agents interpret it.
  • The agent vendor (Anthropic, Cursor, OpenAI) says their agent follows instructions — distinguishing trusted from untrusted content within a single token stream is an unsolved problem in the field.
  • The enterprise is left holding the liability for a compromise that no single vendor will claim responsibility for preventing.

This standoff is why the June 2, 2026 Executive Order — "Promoting Advanced Artificial Intelligence Innovation and Security" — matters for enterprise AI teams. The EO establishes a voluntary framework for secure deployment of frontier AI and strengthens federal cybersecurity requirements. But voluntary frameworks do not close architectural gaps. The agent still cannot tell data from instructions.


Framework #1: AI Coding Agent Attack Surface Assessment

Before you can defend against Agentjacking and its variants, you need to map your actual exposure. This assessment combines the attack vectors documented by Tenet Security, the MCP risks identified by the NSA, and the supply chain findings from Snyk and OWASP.

Section A: Agent Inventory (Score each Yes = 1, No = 0)

Question Y/N
Do you have a complete inventory of every AI coding agent deployed across your engineering organization?
Can you identify every MCP server each agent connects to?
Do you know which agents have terminal execution privileges?
Can you list every external data source each agent reads from?
Do you track agent versions and update frequency across teams?

Score: ___ / 5 — If below 3, you cannot defend what you cannot see.

Section B: Trust Boundary Controls (Score each Yes = 1, No = 0)

Question Y/N
Are MCP tool responses treated as untrusted input in your agent configurations?
Do your agents require human approval before executing commands from external data sources?
Are agent execution environments isolated from production credentials and secrets?
Do you validate or sanitize data returned by MCP servers before it reaches the agent?
Have you tested your agents against MCP injection attacks (red team or tool like agent-jackstop)?

Score: ___ / 5 — If below 3, you are likely vulnerable to Agentjacking today.

Section C: Supply Chain Governance (Score each Yes = 1, No = 0)

Question Y/N
Are MCP server packages version-pinned and reviewed before deployment?
Do you audit third-party agent skills and plugins for code execution vulnerabilities?
Are agent configuration files (e.g., .claude/settings.json, .cursor/) included in code review processes?
Do you have a process to detect and respond to compromised agent dependencies?
Are auto-approval settings disabled for MCP server tool calls?

Score: ___ / 5 — If below 3, your agent supply chain is ungoverned.

Scoring

Total Score Risk Level Recommended Action
12-15 Managed Continue monitoring; add red teaming
8-11 Elevated Implement trust boundary controls within 30 days
4-7 High Restrict agent execution privileges immediately
0-3 Critical Pause autonomous agent deployment until controls are in place

Framework #2: Enterprise AI Agent Security Hardening Checklist

This implementation checklist synthesizes the NSA's MCP security guidance, Tenet's agent-jackstop hardening configs, OWASP's agentic security recommendations, and Snyk's supply chain findings into a prioritized action plan.

Phase 1: Immediate (Week 1) — Stop the Bleeding

# Action Owner Status
1 Deploy agent-jackstop configs for Cursor and Claude Code to harden against telemetry injection DevSecOps
2 Disable auto-approval for all MCP tool calls in agent configurations Engineering leads
3 Rotate credentials in agent config files — treat API keys in .claude/, .cursor/, .env as potentially compromised Security ops
4 Scan for exposed MCP endpoints — query for /mcp and /sse across your environment; check for 0.0.0.0 bindings Infrastructure
5 Restrict agent terminal execution to allowlisted commands only; remove blanket shell access Engineering leads

Phase 2: Short-term (Weeks 2-4) — Build Visibility

# Action Owner Status
6 Inventory all AI coding agents — tools, versions, MCP connections, permission levels, across all teams CISO / DevSecOps
7 Add MCP server packages to dependency scanning — treat them as software dependencies subject to SCA AppSec
8 Version-pin all MCP server packages — block auto-updates; require review before upgrading DevSecOps
9 Include agent config files in code review — .claude/settings.json, MCP configs, skill definitions Engineering leads
10 Deploy MCP traffic monitoring — log all MCP tool calls, responses, and agent actions for audit Security ops

Phase 3: Medium-term (Months 2-3) — Architect for Trust Boundaries

# Action Owner Status
11 Implement input sanitization layer between MCP tool responses and agent context windows Platform engineering
12 Establish human-in-the-loop gates for agent actions that involve credential access, code execution, or external communication DevSecOps
13 Red team your AI coding agents — conduct adversarial testing using MCP injection, telemetry poisoning, and skill manipulation Red team / external
14 Adopt least-privilege for agent identities — each agent gets scoped credentials, not developer-level access IAM / Platform
15 Evaluate agent security platforms (Tenet Security, Snyk ADS, Prompt Security) for runtime agent monitoring CISO

Phase 4: Ongoing — Maintain and Evolve

# Action Owner Status
16 Subscribe to OWASP GenAI and CSA agent security updates for emerging attack patterns Security team
17 Review and update agent configurations quarterly — new MCP servers, skill additions, permission changes DevSecOps
18 Track CVEs in agent frameworks — Claude Code alone has 22 security advisories per OWASP; set up automated alerts AppSec
19 Report agent security metrics to leadership — agent count, MCP exposure, findings, incident response time CISO
20 Participate in industry standards development — MCP security specifications are still maturing CTO / Architecture

The Vendor Response Landscape

The Agentjacking disclosure has catalyzed a wave of vendor activity. Understanding who is doing what — and what remains unsolved — is critical for procurement and architecture decisions.

Tenet Security emerged from stealth on June 17, 2026, with $6 million in seed funding led by The Westly Group and MizMaa Ventures. Founded by Cisco AI Defense veterans Barak Sternberg and Nevo Poran, Tenet uses "Agent-side Simulation" to model every agent action before execution. Their open-source agent-jackstop configs provide immediate hardening for Cursor and Claude Code.

Snyk published its State of Agentic Development Supply Chain report based on nearly 10,000 developer environments. Their mcp-scan tool covers both MCP servers and agent skills. Snyk is positioning agentic development security as an extension of its existing software composition analysis platform.

The NSA released a Cybersecurity Information Sheet on MCP Security in May 2026 — the first government-issued guidance specifically addressing MCP security design. The guidance warns that "the rapid proliferation of MCP has outpaced the development of adequate security safeguards."

Snowflake announced AI agent identity controls at Summit 2026, giving every agent a cryptographically verified identity before accessing production data. While focused on data platform agents rather than coding agents, the principle — verified identity, scoped permissions, audit trails — applies directly.

Sentry added a payload filter but has not addressed the root cause. Their MCP server still returns untrusted event data without any content sanitization or trust markers.

The gap remains: no vendor has solved the fundamental problem of teaching an LLM to distinguish data from instructions within a single token stream. Every mitigation is a control around the agent, not a fix within the model.


What This Means for Enterprise AI Strategy

Agentjacking forces a recalibration of how enterprises think about AI coding agent deployment. The standard enterprise playbook — evaluate features, negotiate pricing, deploy, monitor usage — misses the attack surface entirely. The agent is not just a tool. It is a new principal in your environment, with credentials, permissions, and the autonomy to act on what it reads.

Three strategic shifts are required:

1. Treat AI agents as identities, not tools. Every agent should have its own scoped credentials, role-based access, and audit trail — just like a human employee or a service account. The era of running agents with developer-level access to everything is over. Snowflake's agent identity controls and Opaque's verifiable trust architecture point the direction.

2. Treat MCP integrations as supply chain dependencies. Every MCP server your agents connect to is a trust boundary. Version-pin them. Audit them. Scan them. Include them in your SBOM. The average organization runs five times more AI agents than their security teams realize — and each one may connect to MCP servers that security has never reviewed.

3. Treat all external data as untrusted input. This is the hardest shift because it contradicts the fundamental value proposition of AI coding agents: their ability to read from your tools and act autonomously. But the alternative — trusting that every Sentry error, GitHub issue, support ticket, and documentation page is free of injected instructions — is no longer tenable. Human-in-the-loop gates for high-privilege actions are not a nice-to-have. They are a security control.

The OWASP Agents Rule of Two provides a useful heuristic: any agent that combines access to private data, exposure to untrusted content, and the ability to communicate externally requires a human in the loop. Most enterprise coding agents satisfy all three conditions today. Most run without human approval.


The Uncomfortable Reality

The AI coding agent market is projected to grow to $22.5 billion by 2028. GitHub Copilot reached 20 million users by July 2025, deployed at 90% of Fortune 100 companies. Gartner predicts 40% of enterprise applications will include task-specific AI agents by end of 2026.

None of these tools were designed with the assumption that the data they read might be instructions from an attacker. The security model — implicit trust in tool responses — was inherited from the chatbot era, when the worst outcome of reading malicious text was a hallucinated response. Now the worst outcome is remote code execution on a developer's machine with access to production credentials.

The enterprise AI spending boom continues unabated. CIOs are allocating more budget to AI than ever before. The question is whether security governance can catch up before the next Agentjacking variant moves from research proof-of-concept to active exploitation in the wild.

Sentry said the problem is "technically not defensible." That is true — at the platform level. But at the enterprise level, the organizations that survive this transition will be the ones that stopped trusting their agents to do the right thing with untrusted data, and started treating every AI agent as an attack surface that needs to be governed, monitored, and contained.

The fake bug report that hijacked a $250 billion company's AI agent was not a sophisticated attack. It was a POST request with some markdown in it. The defense cannot be more complicated than the attack. But it does need to exist.


Continue Reading

THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

beri.net

Subscribe at beri.net/subscribe for twice-weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.

One Fake Bug Report Hijacked a $250B Company's AI Agent

Photo by Sora Shimazaki on Pexels

A security researcher submitted a single fake error report to Sentry. No credentials were stolen. No servers were breached. No malware was deployed. The error report contained a carefully formatted markdown "Resolution" section with an npx command embedded inside it.

When the company's AI coding agent — the same kind of tool that 84% of developers now use daily — was asked to "fix unresolved Sentry issues," it read that fake error, interpreted the embedded command as legitimate remediation guidance, and executed it. With the developer's full privileges. On the developer's machine.

The agent exfiltrated environment variables, AWS keys, GitHub tokens, git credentials, and private repository URLs to the attacker's server. The developer never approved a single malicious action. The agent did exactly what it was designed to do: read an error report, follow the resolution steps, and fix the problem. The problem was that the resolution steps were written by an attacker.

Tenet Security calls this attack class "Agentjacking." They found 2,388 organizations exposed, confirmed agent execution at Fortune 100 enterprises, and achieved an 85% success rate across the most widely deployed coding agents on the market: Claude Code, Cursor, and Codex. The Cloud Security Alliance published a formal research note within days. The NSA had already warned about exactly this class of vulnerability in its May 2026 MCP security guidance.

This is not a bug. It is an architectural flaw in how the entire AI coding agent ecosystem handles trust. And it has no universal patch.


How Agentjacking Works: The Authorized Intent Chain

The elegance of Agentjacking is that every step in the attack chain is authorized. No policy is violated. No anomaly threshold is crossed. Every action the agent takes is something it is explicitly allowed to do.

Here is the attack sequence, as documented by Tenet Security and validated by the Cloud Security Alliance:

Step 1: Find the target's Sentry DSN. A DSN (Data Source Name) is a public, write-only credential that Sentry intentionally documents as safe to embed in frontend JavaScript. It sits in the source code of every website that uses Sentry. You can find it by inspecting page source, searching GitHub, or running a Censys query for ingest.sentry.io in HTTP bodies.

Step 2: POST a crafted error event. No authentication beyond the DSN is required. The attacker controls the entire payload: error message, tags, context keys, breadcrumbs, user data, and stack traces. Sentry accepts it (HTTP 200) and processes it identically to a legitimate application crash.

Step 3: Inject the payload via markdown. The malicious event contains carefully formatted markdown — headings, code blocks, tables — that is visually and structurally indistinguishable from Sentry's own system template. The injected content includes a fake "Resolution" section with a diagnostic command. When the Sentry MCP server returns this event to an AI agent, the agent sees what looks like official guidance.

Step 4: Wait for the developer to ask for help. When any developer asks their AI coding agent to "fix unresolved Sentry issues" — a request that happens thousands of times per day across the industry — the agent queries Sentry via the Model Context Protocol (MCP), receives the injected event, and cannot distinguish it from legitimate diagnostic guidance.

Step 5: The agent executes the attacker's code. The command runs with the developer's full privileges. Environment variables, cloud credentials, git tokens, and private repository URLs are exfiltrated to the attacker's server. The developer sees their agent working on a bug fix. They have no reason to suspect anything is wrong.

Tenet calls this the "Authorized Intent Chain" because, from the perspective of every security tool in the stack, nothing malicious happened. The agent was authorized to read Sentry. The agent was authorized to execute diagnostic commands. The commands it executed were authorized by the developer's own privilege context. EDR did not flag it. The firewall did not flag it. IAM did not flag it. The VPN did not flag it.

The agent was not breached. It was weaponized.


Why This Is Different From Prompt Injection

Security teams hearing "AI agent gets tricked into running malicious code" may file this under prompt injection — the well-known class of attacks where hostile text in documents or web pages manipulates LLM behavior. Agentjacking shares DNA with prompt injection but is architecturally distinct in ways that matter for defense.

Traditional prompt injection requires the attacker to get malicious text in front of the model. The defense playbook includes input filtering, system prompt hardening, and sandboxing. These controls assume the dangerous content arrives through a recognized attack surface.

Agentjacking exploits a different trust relationship entirely. The malicious content arrives through a Model Context Protocol integration — a channel the agent treats as trusted system output, not user input. The NSA's May 2026 guidance on MCP security explicitly warns about this: MCP creates "implicit trust relationships" and "not well-traced attack paths" where data flows "between systems without sufficient checks."

This distinction matters because the standard defenses do not apply:

  • System prompt hardening fails. Tenet tested agents with explicit instructions to ignore untrusted data. The agents executed the payload regardless. The injected content looked identical to legitimate Sentry output — the model cannot distinguish trusted from untrusted when they arrive through the same channel.

  • Input validation fails. The payload is not user input. It arrives through an authorized MCP tool response that the agent framework treats as system data. Standard input validation layers do not inspect MCP responses.

  • Sandboxing fails. Tenet demonstrated successful exploitation against sandboxed agents, network-restricted CI agents, and agents running in isolated environments. The payload rides through the data channel, not the network perimeter. Internal-network agents were reached because the attack vector is the data itself.

The OWASP GenAI Security Project maps prompt injection to six of its ten categories in the Top 10 for Agentic Applications. But the Project's 2026 report also acknowledges a more fundamental problem: "Large language models treat the system prompt, the user's request, and any text retrieved from external sources as a single stream of tokens. There is no reliable way to mark some of those tokens as commands and others as data."

This is the core architectural flaw. And it extends far beyond Sentry.


The Attack Surface Is Every MCP Integration You Run

Sentry was the proof of concept. The vulnerability class encompasses every external data source that an AI agent reads and acts upon. As Tenet's researchers noted, the same risk runs through support tickets, GitHub issues, and documentation.

The numbers paint a picture of the real exposure:

MCP adoption is already widespread. Snyk's analysis of nearly 10,000 developer environments found that 50.8% of developers have at least one MCP server installed. Among those with MCP servers, 1 in 7 had at least one security finding. Snyk identified 392 confirmed prompt injection findings embedded in tool descriptions alone.

MCP servers are themselves insecure. Trend Micro found 492 MCP servers exposed to the internet with zero authentication. Vulnerability research found that over 30% of MCP servers had at least one exploitable vulnerability — a higher base rate than most enterprise software categories at equivalent deployment scale.

Agent skill ecosystems are compromised. Snyk's ToxicSkills audit found 36.82% of 3,984 scanned agent skills had at least one security flaw. Arbitrary file read, remote code execution, and tool poisoning vulnerabilities have been documented in widely-used official MCP servers.

The supply chain is already being exploited. In March 2026, a backdoored package sat on PyPI for three hours and was downloaded 47,000 times. The compromised package — LiteLLM — serves as the language-model gateway for CrewAI, DSPy, Microsoft GraphRAG, and dozens of other AI agent frameworks. An autonomous attack bot named hackerbot-claw was included with it.

The pattern is consistent. AI coding agents have expanded the software supply chain to include every data source the agent can read, every MCP server the agent connects to, and every skill or plugin the agent loads. Traditional application security tools were not designed to see or govern any of this.


Who Is Responsible? Everyone. And No One.

When Tenet disclosed the Agentjacking vulnerability to Sentry on June 3, 2026, Sentry acknowledged the problem — and declined root-cause remediation. Their position: the issue is "technically not defensible" at the platform level. Sentry added a filter to block one specific payload string, treating the symptom rather than the cause.

Sentry's response is technically honest. Their platform is designed to accept error events from any source with a DSN. That is a feature, not a bug — it is how crash reporting works from end-user devices. The problem is not that Sentry accepts arbitrary payloads. The problem is that AI agents treat those payloads as trusted instructions.

This creates a responsibility gap that mirrors the broader challenge facing enterprises deploying AI agents:

  • The data source vendor (Sentry) says the data is not malicious — it is just data, and their platform works as designed.
  • The MCP server developer says they faithfully relay data from the source — they do not inject content and are not responsible for how agents interpret it.
  • The agent vendor (Anthropic, Cursor, OpenAI) says their agent follows instructions — distinguishing trusted from untrusted content within a single token stream is an unsolved problem in the field.
  • The enterprise is left holding the liability for a compromise that no single vendor will claim responsibility for preventing.

This standoff is why the June 2, 2026 Executive Order — "Promoting Advanced Artificial Intelligence Innovation and Security" — matters for enterprise AI teams. The EO establishes a voluntary framework for secure deployment of frontier AI and strengthens federal cybersecurity requirements. But voluntary frameworks do not close architectural gaps. The agent still cannot tell data from instructions.


Framework #1: AI Coding Agent Attack Surface Assessment

Before you can defend against Agentjacking and its variants, you need to map your actual exposure. This assessment combines the attack vectors documented by Tenet Security, the MCP risks identified by the NSA, and the supply chain findings from Snyk and OWASP.

Section A: Agent Inventory (Score each Yes = 1, No = 0)

Question Y/N
Do you have a complete inventory of every AI coding agent deployed across your engineering organization?
Can you identify every MCP server each agent connects to?
Do you know which agents have terminal execution privileges?
Can you list every external data source each agent reads from?
Do you track agent versions and update frequency across teams?

Score: ___ / 5 — If below 3, you cannot defend what you cannot see.

Section B: Trust Boundary Controls (Score each Yes = 1, No = 0)

Question Y/N
Are MCP tool responses treated as untrusted input in your agent configurations?
Do your agents require human approval before executing commands from external data sources?
Are agent execution environments isolated from production credentials and secrets?
Do you validate or sanitize data returned by MCP servers before it reaches the agent?
Have you tested your agents against MCP injection attacks (red team or tool like agent-jackstop)?

Score: ___ / 5 — If below 3, you are likely vulnerable to Agentjacking today.

Section C: Supply Chain Governance (Score each Yes = 1, No = 0)

Question Y/N
Are MCP server packages version-pinned and reviewed before deployment?
Do you audit third-party agent skills and plugins for code execution vulnerabilities?
Are agent configuration files (e.g., .claude/settings.json, .cursor/) included in code review processes?
Do you have a process to detect and respond to compromised agent dependencies?
Are auto-approval settings disabled for MCP server tool calls?

Score: ___ / 5 — If below 3, your agent supply chain is ungoverned.

Scoring

Total Score Risk Level Recommended Action
12-15 Managed Continue monitoring; add red teaming
8-11 Elevated Implement trust boundary controls within 30 days
4-7 High Restrict agent execution privileges immediately
0-3 Critical Pause autonomous agent deployment until controls are in place

Framework #2: Enterprise AI Agent Security Hardening Checklist

This implementation checklist synthesizes the NSA's MCP security guidance, Tenet's agent-jackstop hardening configs, OWASP's agentic security recommendations, and Snyk's supply chain findings into a prioritized action plan.

Phase 1: Immediate (Week 1) — Stop the Bleeding

# Action Owner Status
1 Deploy agent-jackstop configs for Cursor and Claude Code to harden against telemetry injection DevSecOps
2 Disable auto-approval for all MCP tool calls in agent configurations Engineering leads
3 Rotate credentials in agent config files — treat API keys in .claude/, .cursor/, .env as potentially compromised Security ops
4 Scan for exposed MCP endpoints — query for /mcp and /sse across your environment; check for 0.0.0.0 bindings Infrastructure
5 Restrict agent terminal execution to allowlisted commands only; remove blanket shell access Engineering leads

Phase 2: Short-term (Weeks 2-4) — Build Visibility

# Action Owner Status
6 Inventory all AI coding agents — tools, versions, MCP connections, permission levels, across all teams CISO / DevSecOps
7 Add MCP server packages to dependency scanning — treat them as software dependencies subject to SCA AppSec
8 Version-pin all MCP server packages — block auto-updates; require review before upgrading DevSecOps
9 Include agent config files in code review — .claude/settings.json, MCP configs, skill definitions Engineering leads
10 Deploy MCP traffic monitoring — log all MCP tool calls, responses, and agent actions for audit Security ops

Phase 3: Medium-term (Months 2-3) — Architect for Trust Boundaries

# Action Owner Status
11 Implement input sanitization layer between MCP tool responses and agent context windows Platform engineering
12 Establish human-in-the-loop gates for agent actions that involve credential access, code execution, or external communication DevSecOps
13 Red team your AI coding agents — conduct adversarial testing using MCP injection, telemetry poisoning, and skill manipulation Red team / external
14 Adopt least-privilege for agent identities — each agent gets scoped credentials, not developer-level access IAM / Platform
15 Evaluate agent security platforms (Tenet Security, Snyk ADS, Prompt Security) for runtime agent monitoring CISO

Phase 4: Ongoing — Maintain and Evolve

# Action Owner Status
16 Subscribe to OWASP GenAI and CSA agent security updates for emerging attack patterns Security team
17 Review and update agent configurations quarterly — new MCP servers, skill additions, permission changes DevSecOps
18 Track CVEs in agent frameworks — Claude Code alone has 22 security advisories per OWASP; set up automated alerts AppSec
19 Report agent security metrics to leadership — agent count, MCP exposure, findings, incident response time CISO
20 Participate in industry standards development — MCP security specifications are still maturing CTO / Architecture

The Vendor Response Landscape

The Agentjacking disclosure has catalyzed a wave of vendor activity. Understanding who is doing what — and what remains unsolved — is critical for procurement and architecture decisions.

Tenet Security emerged from stealth on June 17, 2026, with $6 million in seed funding led by The Westly Group and MizMaa Ventures. Founded by Cisco AI Defense veterans Barak Sternberg and Nevo Poran, Tenet uses "Agent-side Simulation" to model every agent action before execution. Their open-source agent-jackstop configs provide immediate hardening for Cursor and Claude Code.

Snyk published its State of Agentic Development Supply Chain report based on nearly 10,000 developer environments. Their mcp-scan tool covers both MCP servers and agent skills. Snyk is positioning agentic development security as an extension of its existing software composition analysis platform.

The NSA released a Cybersecurity Information Sheet on MCP Security in May 2026 — the first government-issued guidance specifically addressing MCP security design. The guidance warns that "the rapid proliferation of MCP has outpaced the development of adequate security safeguards."

Snowflake announced AI agent identity controls at Summit 2026, giving every agent a cryptographically verified identity before accessing production data. While focused on data platform agents rather than coding agents, the principle — verified identity, scoped permissions, audit trails — applies directly.

Sentry added a payload filter but has not addressed the root cause. Their MCP server still returns untrusted event data without any content sanitization or trust markers.

The gap remains: no vendor has solved the fundamental problem of teaching an LLM to distinguish data from instructions within a single token stream. Every mitigation is a control around the agent, not a fix within the model.


What This Means for Enterprise AI Strategy

Agentjacking forces a recalibration of how enterprises think about AI coding agent deployment. The standard enterprise playbook — evaluate features, negotiate pricing, deploy, monitor usage — misses the attack surface entirely. The agent is not just a tool. It is a new principal in your environment, with credentials, permissions, and the autonomy to act on what it reads.

Three strategic shifts are required:

1. Treat AI agents as identities, not tools. Every agent should have its own scoped credentials, role-based access, and audit trail — just like a human employee or a service account. The era of running agents with developer-level access to everything is over. Snowflake's agent identity controls and Opaque's verifiable trust architecture point the direction.

2. Treat MCP integrations as supply chain dependencies. Every MCP server your agents connect to is a trust boundary. Version-pin them. Audit them. Scan them. Include them in your SBOM. The average organization runs five times more AI agents than their security teams realize — and each one may connect to MCP servers that security has never reviewed.

3. Treat all external data as untrusted input. This is the hardest shift because it contradicts the fundamental value proposition of AI coding agents: their ability to read from your tools and act autonomously. But the alternative — trusting that every Sentry error, GitHub issue, support ticket, and documentation page is free of injected instructions — is no longer tenable. Human-in-the-loop gates for high-privilege actions are not a nice-to-have. They are a security control.

The OWASP Agents Rule of Two provides a useful heuristic: any agent that combines access to private data, exposure to untrusted content, and the ability to communicate externally requires a human in the loop. Most enterprise coding agents satisfy all three conditions today. Most run without human approval.


The Uncomfortable Reality

The AI coding agent market is projected to grow to $22.5 billion by 2028. GitHub Copilot reached 20 million users by July 2025, deployed at 90% of Fortune 100 companies. Gartner predicts 40% of enterprise applications will include task-specific AI agents by end of 2026.

None of these tools were designed with the assumption that the data they read might be instructions from an attacker. The security model — implicit trust in tool responses — was inherited from the chatbot era, when the worst outcome of reading malicious text was a hallucinated response. Now the worst outcome is remote code execution on a developer's machine with access to production credentials.

The enterprise AI spending boom continues unabated. CIOs are allocating more budget to AI than ever before. The question is whether security governance can catch up before the next Agentjacking variant moves from research proof-of-concept to active exploitation in the wild.

Sentry said the problem is "technically not defensible." That is true — at the platform level. But at the enterprise level, the organizations that survive this transition will be the ones that stopped trusting their agents to do the right thing with untrusted data, and started treating every AI agent as an attack surface that needs to be governed, monitored, and contained.

The fake bug report that hijacked a $250 billion company's AI agent was not a sophisticated attack. It was a POST request with some markdown in it. The defense cannot be more complicated than the attack. But it does need to exist.


Continue Reading

Share:
THE DAILY BRIEF
AgentjackingAI coding agentsMCP securityClaude CodeCursorprompt injectionAI agent securitysupply chain securityTenet SecurityOWASP agentic AI
One Fake Bug Report Hijacked a $250B Company's AI Agent

Security researchers demonstrated a new attack class called Agentjacking that hijacks AI coding agents through fake Sentry error reports — no credentials stolen, no servers breached, no malware deployed. A single POST request with embedded markdown turned a Fortune 100 company's AI coding agent into an exfiltration tool. Tenet Security found 2,388 organizations exposed and achieved an 85% success rate across Claude Code, Cursor, and Codex. The NSA had already warned about this exact vulnerability class. Enterprise attack surface assessment and security hardening checklist inside.

By Rajesh Beri·June 28, 2026·18 min read

A security researcher submitted a single fake error report to Sentry. No credentials were stolen. No servers were breached. No malware was deployed. The error report contained a carefully formatted markdown "Resolution" section with an npx command embedded inside it.

When the company's AI coding agent — the same kind of tool that 84% of developers now use daily — was asked to "fix unresolved Sentry issues," it read that fake error, interpreted the embedded command as legitimate remediation guidance, and executed it. With the developer's full privileges. On the developer's machine.

The agent exfiltrated environment variables, AWS keys, GitHub tokens, git credentials, and private repository URLs to the attacker's server. The developer never approved a single malicious action. The agent did exactly what it was designed to do: read an error report, follow the resolution steps, and fix the problem. The problem was that the resolution steps were written by an attacker.

Tenet Security calls this attack class "Agentjacking." They found 2,388 organizations exposed, confirmed agent execution at Fortune 100 enterprises, and achieved an 85% success rate across the most widely deployed coding agents on the market: Claude Code, Cursor, and Codex. The Cloud Security Alliance published a formal research note within days. The NSA had already warned about exactly this class of vulnerability in its May 2026 MCP security guidance.

This is not a bug. It is an architectural flaw in how the entire AI coding agent ecosystem handles trust. And it has no universal patch.


How Agentjacking Works: The Authorized Intent Chain

The elegance of Agentjacking is that every step in the attack chain is authorized. No policy is violated. No anomaly threshold is crossed. Every action the agent takes is something it is explicitly allowed to do.

Here is the attack sequence, as documented by Tenet Security and validated by the Cloud Security Alliance:

Step 1: Find the target's Sentry DSN. A DSN (Data Source Name) is a public, write-only credential that Sentry intentionally documents as safe to embed in frontend JavaScript. It sits in the source code of every website that uses Sentry. You can find it by inspecting page source, searching GitHub, or running a Censys query for ingest.sentry.io in HTTP bodies.

Step 2: POST a crafted error event. No authentication beyond the DSN is required. The attacker controls the entire payload: error message, tags, context keys, breadcrumbs, user data, and stack traces. Sentry accepts it (HTTP 200) and processes it identically to a legitimate application crash.

Step 3: Inject the payload via markdown. The malicious event contains carefully formatted markdown — headings, code blocks, tables — that is visually and structurally indistinguishable from Sentry's own system template. The injected content includes a fake "Resolution" section with a diagnostic command. When the Sentry MCP server returns this event to an AI agent, the agent sees what looks like official guidance.

Step 4: Wait for the developer to ask for help. When any developer asks their AI coding agent to "fix unresolved Sentry issues" — a request that happens thousands of times per day across the industry — the agent queries Sentry via the Model Context Protocol (MCP), receives the injected event, and cannot distinguish it from legitimate diagnostic guidance.

Step 5: The agent executes the attacker's code. The command runs with the developer's full privileges. Environment variables, cloud credentials, git tokens, and private repository URLs are exfiltrated to the attacker's server. The developer sees their agent working on a bug fix. They have no reason to suspect anything is wrong.

Tenet calls this the "Authorized Intent Chain" because, from the perspective of every security tool in the stack, nothing malicious happened. The agent was authorized to read Sentry. The agent was authorized to execute diagnostic commands. The commands it executed were authorized by the developer's own privilege context. EDR did not flag it. The firewall did not flag it. IAM did not flag it. The VPN did not flag it.

The agent was not breached. It was weaponized.


Why This Is Different From Prompt Injection

Security teams hearing "AI agent gets tricked into running malicious code" may file this under prompt injection — the well-known class of attacks where hostile text in documents or web pages manipulates LLM behavior. Agentjacking shares DNA with prompt injection but is architecturally distinct in ways that matter for defense.

Traditional prompt injection requires the attacker to get malicious text in front of the model. The defense playbook includes input filtering, system prompt hardening, and sandboxing. These controls assume the dangerous content arrives through a recognized attack surface.

Agentjacking exploits a different trust relationship entirely. The malicious content arrives through a Model Context Protocol integration — a channel the agent treats as trusted system output, not user input. The NSA's May 2026 guidance on MCP security explicitly warns about this: MCP creates "implicit trust relationships" and "not well-traced attack paths" where data flows "between systems without sufficient checks."

This distinction matters because the standard defenses do not apply:

  • System prompt hardening fails. Tenet tested agents with explicit instructions to ignore untrusted data. The agents executed the payload regardless. The injected content looked identical to legitimate Sentry output — the model cannot distinguish trusted from untrusted when they arrive through the same channel.

  • Input validation fails. The payload is not user input. It arrives through an authorized MCP tool response that the agent framework treats as system data. Standard input validation layers do not inspect MCP responses.

  • Sandboxing fails. Tenet demonstrated successful exploitation against sandboxed agents, network-restricted CI agents, and agents running in isolated environments. The payload rides through the data channel, not the network perimeter. Internal-network agents were reached because the attack vector is the data itself.

The OWASP GenAI Security Project maps prompt injection to six of its ten categories in the Top 10 for Agentic Applications. But the Project's 2026 report also acknowledges a more fundamental problem: "Large language models treat the system prompt, the user's request, and any text retrieved from external sources as a single stream of tokens. There is no reliable way to mark some of those tokens as commands and others as data."

This is the core architectural flaw. And it extends far beyond Sentry.


The Attack Surface Is Every MCP Integration You Run

Sentry was the proof of concept. The vulnerability class encompasses every external data source that an AI agent reads and acts upon. As Tenet's researchers noted, the same risk runs through support tickets, GitHub issues, and documentation.

The numbers paint a picture of the real exposure:

MCP adoption is already widespread. Snyk's analysis of nearly 10,000 developer environments found that 50.8% of developers have at least one MCP server installed. Among those with MCP servers, 1 in 7 had at least one security finding. Snyk identified 392 confirmed prompt injection findings embedded in tool descriptions alone.

MCP servers are themselves insecure. Trend Micro found 492 MCP servers exposed to the internet with zero authentication. Vulnerability research found that over 30% of MCP servers had at least one exploitable vulnerability — a higher base rate than most enterprise software categories at equivalent deployment scale.

Agent skill ecosystems are compromised. Snyk's ToxicSkills audit found 36.82% of 3,984 scanned agent skills had at least one security flaw. Arbitrary file read, remote code execution, and tool poisoning vulnerabilities have been documented in widely-used official MCP servers.

The supply chain is already being exploited. In March 2026, a backdoored package sat on PyPI for three hours and was downloaded 47,000 times. The compromised package — LiteLLM — serves as the language-model gateway for CrewAI, DSPy, Microsoft GraphRAG, and dozens of other AI agent frameworks. An autonomous attack bot named hackerbot-claw was included with it.

The pattern is consistent. AI coding agents have expanded the software supply chain to include every data source the agent can read, every MCP server the agent connects to, and every skill or plugin the agent loads. Traditional application security tools were not designed to see or govern any of this.


Who Is Responsible? Everyone. And No One.

When Tenet disclosed the Agentjacking vulnerability to Sentry on June 3, 2026, Sentry acknowledged the problem — and declined root-cause remediation. Their position: the issue is "technically not defensible" at the platform level. Sentry added a filter to block one specific payload string, treating the symptom rather than the cause.

Sentry's response is technically honest. Their platform is designed to accept error events from any source with a DSN. That is a feature, not a bug — it is how crash reporting works from end-user devices. The problem is not that Sentry accepts arbitrary payloads. The problem is that AI agents treat those payloads as trusted instructions.

This creates a responsibility gap that mirrors the broader challenge facing enterprises deploying AI agents:

  • The data source vendor (Sentry) says the data is not malicious — it is just data, and their platform works as designed.
  • The MCP server developer says they faithfully relay data from the source — they do not inject content and are not responsible for how agents interpret it.
  • The agent vendor (Anthropic, Cursor, OpenAI) says their agent follows instructions — distinguishing trusted from untrusted content within a single token stream is an unsolved problem in the field.
  • The enterprise is left holding the liability for a compromise that no single vendor will claim responsibility for preventing.

This standoff is why the June 2, 2026 Executive Order — "Promoting Advanced Artificial Intelligence Innovation and Security" — matters for enterprise AI teams. The EO establishes a voluntary framework for secure deployment of frontier AI and strengthens federal cybersecurity requirements. But voluntary frameworks do not close architectural gaps. The agent still cannot tell data from instructions.


Framework #1: AI Coding Agent Attack Surface Assessment

Before you can defend against Agentjacking and its variants, you need to map your actual exposure. This assessment combines the attack vectors documented by Tenet Security, the MCP risks identified by the NSA, and the supply chain findings from Snyk and OWASP.

Section A: Agent Inventory (Score each Yes = 1, No = 0)

Question Y/N
Do you have a complete inventory of every AI coding agent deployed across your engineering organization?
Can you identify every MCP server each agent connects to?
Do you know which agents have terminal execution privileges?
Can you list every external data source each agent reads from?
Do you track agent versions and update frequency across teams?

Score: ___ / 5 — If below 3, you cannot defend what you cannot see.

Section B: Trust Boundary Controls (Score each Yes = 1, No = 0)

Question Y/N
Are MCP tool responses treated as untrusted input in your agent configurations?
Do your agents require human approval before executing commands from external data sources?
Are agent execution environments isolated from production credentials and secrets?
Do you validate or sanitize data returned by MCP servers before it reaches the agent?
Have you tested your agents against MCP injection attacks (red team or tool like agent-jackstop)?

Score: ___ / 5 — If below 3, you are likely vulnerable to Agentjacking today.

Section C: Supply Chain Governance (Score each Yes = 1, No = 0)

Question Y/N
Are MCP server packages version-pinned and reviewed before deployment?
Do you audit third-party agent skills and plugins for code execution vulnerabilities?
Are agent configuration files (e.g., .claude/settings.json, .cursor/) included in code review processes?
Do you have a process to detect and respond to compromised agent dependencies?
Are auto-approval settings disabled for MCP server tool calls?

Score: ___ / 5 — If below 3, your agent supply chain is ungoverned.

Scoring

Total Score Risk Level Recommended Action
12-15 Managed Continue monitoring; add red teaming
8-11 Elevated Implement trust boundary controls within 30 days
4-7 High Restrict agent execution privileges immediately
0-3 Critical Pause autonomous agent deployment until controls are in place

Framework #2: Enterprise AI Agent Security Hardening Checklist

This implementation checklist synthesizes the NSA's MCP security guidance, Tenet's agent-jackstop hardening configs, OWASP's agentic security recommendations, and Snyk's supply chain findings into a prioritized action plan.

Phase 1: Immediate (Week 1) — Stop the Bleeding

# Action Owner Status
1 Deploy agent-jackstop configs for Cursor and Claude Code to harden against telemetry injection DevSecOps
2 Disable auto-approval for all MCP tool calls in agent configurations Engineering leads
3 Rotate credentials in agent config files — treat API keys in .claude/, .cursor/, .env as potentially compromised Security ops
4 Scan for exposed MCP endpoints — query for /mcp and /sse across your environment; check for 0.0.0.0 bindings Infrastructure
5 Restrict agent terminal execution to allowlisted commands only; remove blanket shell access Engineering leads

Phase 2: Short-term (Weeks 2-4) — Build Visibility

# Action Owner Status
6 Inventory all AI coding agents — tools, versions, MCP connections, permission levels, across all teams CISO / DevSecOps
7 Add MCP server packages to dependency scanning — treat them as software dependencies subject to SCA AppSec
8 Version-pin all MCP server packages — block auto-updates; require review before upgrading DevSecOps
9 Include agent config files in code review — .claude/settings.json, MCP configs, skill definitions Engineering leads
10 Deploy MCP traffic monitoring — log all MCP tool calls, responses, and agent actions for audit Security ops

Phase 3: Medium-term (Months 2-3) — Architect for Trust Boundaries

# Action Owner Status
11 Implement input sanitization layer between MCP tool responses and agent context windows Platform engineering
12 Establish human-in-the-loop gates for agent actions that involve credential access, code execution, or external communication DevSecOps
13 Red team your AI coding agents — conduct adversarial testing using MCP injection, telemetry poisoning, and skill manipulation Red team / external
14 Adopt least-privilege for agent identities — each agent gets scoped credentials, not developer-level access IAM / Platform
15 Evaluate agent security platforms (Tenet Security, Snyk ADS, Prompt Security) for runtime agent monitoring CISO

Phase 4: Ongoing — Maintain and Evolve

# Action Owner Status
16 Subscribe to OWASP GenAI and CSA agent security updates for emerging attack patterns Security team
17 Review and update agent configurations quarterly — new MCP servers, skill additions, permission changes DevSecOps
18 Track CVEs in agent frameworks — Claude Code alone has 22 security advisories per OWASP; set up automated alerts AppSec
19 Report agent security metrics to leadership — agent count, MCP exposure, findings, incident response time CISO
20 Participate in industry standards development — MCP security specifications are still maturing CTO / Architecture

The Vendor Response Landscape

The Agentjacking disclosure has catalyzed a wave of vendor activity. Understanding who is doing what — and what remains unsolved — is critical for procurement and architecture decisions.

Tenet Security emerged from stealth on June 17, 2026, with $6 million in seed funding led by The Westly Group and MizMaa Ventures. Founded by Cisco AI Defense veterans Barak Sternberg and Nevo Poran, Tenet uses "Agent-side Simulation" to model every agent action before execution. Their open-source agent-jackstop configs provide immediate hardening for Cursor and Claude Code.

Snyk published its State of Agentic Development Supply Chain report based on nearly 10,000 developer environments. Their mcp-scan tool covers both MCP servers and agent skills. Snyk is positioning agentic development security as an extension of its existing software composition analysis platform.

The NSA released a Cybersecurity Information Sheet on MCP Security in May 2026 — the first government-issued guidance specifically addressing MCP security design. The guidance warns that "the rapid proliferation of MCP has outpaced the development of adequate security safeguards."

Snowflake announced AI agent identity controls at Summit 2026, giving every agent a cryptographically verified identity before accessing production data. While focused on data platform agents rather than coding agents, the principle — verified identity, scoped permissions, audit trails — applies directly.

Sentry added a payload filter but has not addressed the root cause. Their MCP server still returns untrusted event data without any content sanitization or trust markers.

The gap remains: no vendor has solved the fundamental problem of teaching an LLM to distinguish data from instructions within a single token stream. Every mitigation is a control around the agent, not a fix within the model.


What This Means for Enterprise AI Strategy

Agentjacking forces a recalibration of how enterprises think about AI coding agent deployment. The standard enterprise playbook — evaluate features, negotiate pricing, deploy, monitor usage — misses the attack surface entirely. The agent is not just a tool. It is a new principal in your environment, with credentials, permissions, and the autonomy to act on what it reads.

Three strategic shifts are required:

1. Treat AI agents as identities, not tools. Every agent should have its own scoped credentials, role-based access, and audit trail — just like a human employee or a service account. The era of running agents with developer-level access to everything is over. Snowflake's agent identity controls and Opaque's verifiable trust architecture point the direction.

2. Treat MCP integrations as supply chain dependencies. Every MCP server your agents connect to is a trust boundary. Version-pin them. Audit them. Scan them. Include them in your SBOM. The average organization runs five times more AI agents than their security teams realize — and each one may connect to MCP servers that security has never reviewed.

3. Treat all external data as untrusted input. This is the hardest shift because it contradicts the fundamental value proposition of AI coding agents: their ability to read from your tools and act autonomously. But the alternative — trusting that every Sentry error, GitHub issue, support ticket, and documentation page is free of injected instructions — is no longer tenable. Human-in-the-loop gates for high-privilege actions are not a nice-to-have. They are a security control.

The OWASP Agents Rule of Two provides a useful heuristic: any agent that combines access to private data, exposure to untrusted content, and the ability to communicate externally requires a human in the loop. Most enterprise coding agents satisfy all three conditions today. Most run without human approval.


The Uncomfortable Reality

The AI coding agent market is projected to grow to $22.5 billion by 2028. GitHub Copilot reached 20 million users by July 2025, deployed at 90% of Fortune 100 companies. Gartner predicts 40% of enterprise applications will include task-specific AI agents by end of 2026.

None of these tools were designed with the assumption that the data they read might be instructions from an attacker. The security model — implicit trust in tool responses — was inherited from the chatbot era, when the worst outcome of reading malicious text was a hallucinated response. Now the worst outcome is remote code execution on a developer's machine with access to production credentials.

The enterprise AI spending boom continues unabated. CIOs are allocating more budget to AI than ever before. The question is whether security governance can catch up before the next Agentjacking variant moves from research proof-of-concept to active exploitation in the wild.

Sentry said the problem is "technically not defensible." That is true — at the platform level. But at the enterprise level, the organizations that survive this transition will be the ones that stopped trusting their agents to do the right thing with untrusted data, and started treating every AI agent as an attack surface that needs to be governed, monitored, and contained.

The fake bug report that hijacked a $250 billion company's AI agent was not a sophisticated attack. It was a POST request with some markdown in it. The defense cannot be more complicated than the attack. But it does need to exist.


Continue Reading

THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

beri.net

Subscribe at beri.net/subscribe for twice-weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.

Newsletter

Stay Ahead of the Curve

Weekly enterprise AI insights for technology leaders. No spam, no vendor pitches—unsubscribe anytime.

Subscribe

Related Articles

SpaceX Cursor acquisition

$60B Bought Cursor. Your Dev Team Is the Product Now.

SpaceX's $60 billion all-stock acquisition of Cursor is the largest VC-backed startup deal in history. It puts 50% of Fortune 500 developer machines inside Elon Musk's vertically integrated AI empire — from Grok models to Colossus compute to Starlink connectivity. For enterprise engineering leaders, the question is no longer whether to evaluate alternatives. It's how fast. Vendor risk assessment framework and platform decision matrix inside.

June 27, 2026
OPAQUE

77% Wrote AI Agent Policies. Only 26% Can Enforce Them.

OPAQUE 3.0 launches with Agent Manifest and Confidential MCP — the first verifiably governed Model Context Protocol implementation — bringing cryptographically provable trust to enterprise AI agents. Built on Microsoft's open-source Agent Governance Toolkit, the platform closes the 51-point gap between writing AI security policies and enforcing them with hardware-signed proof.

June 25, 2026
Gemini 3.5 Flash

Gemini 3.5 Flash Computer Use Threatens the $35B RPA Market

Computer use is now a built-in, native tool inside Gemini 3.5 Flash — Google's fastest, cheapest enterprise AI model. This isn't a demo. It's a production-grade capability that lets AI agents see screens, click buttons, and navigate software across browser, mobile, and desktop environments. With a 78.4% OSWorld score at Flash-tier pricing, Google just changed the economics of enterprise automation. The $35B RPA market should be paying attention.

June 25, 2026
Gartner Magic Quadrant

Gartner Dethrones AWS and Google From AI Coding Leadership

Gartner published its first Magic Quadrant for Enterprise AI Coding Agents on May 20, 2026 — and the leaderboard looks nothing like the AI Code Assistants category it replaced. Anthropic, Cursor, GitHub, and OpenAI are Leaders. AWS and Google dropped to Challengers. The shift from code completion to autonomous plan-act-verify agents redefined what counts — and the cloud giants' IDE-centric tools no longer meet the bar. This article includes a vendor evaluation matrix and an adoption readiness scorecard for engineering leaders evaluating AI coding agents.

June 21, 2026

Latest Articles

View All →