Anthropic MCP Tunnels Kill the $670K Shadow AI Tax

Anthropic just shipped self-hosted sandboxes and MCP tunnels for Claude agents. Here is the decision matrix CIOs need before the next $4.44M AI breach.

By Rajesh Beri · May 23, 2026 · 15 min read
THE DAILY BRIEF

AI Security · Enterprise AI · Anthropic · MCP · AI Agents · Data Governance

On May 19, Anthropic quietly shipped the feature that most enterprise AI projects have been blocked on for the past 18 months: a way to run Claude agents against private internal systems without piercing the corporate firewall, exposing internal MCP servers to the public internet, or letting sensitive files leave the customer environment. Self-hosted sandboxes for Claude Managed Agents are now in public beta. MCP tunnels are in research preview. Both target the same enterprise reality CISOs have been screaming about since IBM's 2025 Cost of a Data Breach Report put a number on it: organizations with high shadow AI exposure pay an extra $670,000 per breach, and 97% of breached AI deployments lacked proper access controls in the first place.

This is the architectural answer to a problem the model couldn't solve. Claude Opus 4.7 can be the smartest agent on the planet, but if it has to call your private Snowflake instance through a public endpoint, your compliance team will kill the project. Anthropic just removed that excuse.

What Changed

Anthropic's May 19 update to Claude Managed Agents introduces two distinct primitives that, together, let enterprises deploy Claude agents inside their security perimeter instead of around it.

Self-Hosted Sandboxes (Public Beta). Tool execution — the code, builds, file operations, and shell commands that agents run when they take action — now runs on customer-controlled infrastructure rather than Anthropic's servers. Anthropic continues to manage agent orchestration, context handling, and recovery logic from its own infrastructure, but the actual workloads execute inside the customer's environment. Files, repositories, environment variables, and credentials never leave the boundary, according to Anthropic's announcement summarized by The New Stack.

Customers can bring their own sandbox runtime, or use one of four launch partners:

  • Cloudflare — microVMs with zero-trust networking
  • Daytona — long-running stateful environments for multi-day agent workflows
  • Modal — scalable CPU/GPU sandboxes optimized for AI workloads
  • Vercel — sandbox isolation with VPC peering

Customers choose their own CPU, memory, runtime, and network policies. Audit logging, data residency, and IAM stay under customer control, per reporting from The Decoder.

MCP Tunnels (Research Preview). This is the harder problem and the bigger unlock. The Model Context Protocol, the open standard Anthropic introduced in November 2024 and that nearly every major AI platform has since adopted, was designed to expose tools and data sources to AI applications. In practice, that meant either running MCP servers on public infrastructure (a compliance non-starter for most enterprises) or threading them through reverse proxies, VPNs, and inbound firewall exceptions that security teams refused to grant.

MCP tunnels collapse the choice. Per InfoQ's deep-dive, customers deploy a lightweight gateway inside their network. That gateway establishes a single outbound encrypted connection to Anthropic. Managed Agents and the Messages API can then call internal databases, private APIs, ticketing systems, knowledge bases, and internal MCP servers — without opening a single inbound port, publishing a public endpoint, or modifying perimeter firewall rules. The architectural pattern is identical to what enterprises already accept for tools like GitHub Actions runners, Tailscale, or Cloudflare Tunnels — known, audited, and approvable.
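
The outbound-only pattern described above can be sketched in a few lines. This is a minimal illustration of a reverse tunnel in general, not Anthropic's gateway or wire protocol: the newline-delimited JSON framing, the `internal_tool` function, and the `ticket_count` tool name are all assumptions made up for the example. The point it shows is that the side inside the network only ever dials out, while the listening socket lives on the remote side.

```python
import json
import socket
import threading


def internal_tool(name: str) -> str:
    """Stand-in for a private system that is reachable only from inside the network."""
    return {"ticket_count": "42 open tickets"}.get(name, "unknown tool")


def run_gateway(relay_addr) -> None:
    """Gateway side: dials OUT to the relay and never opens a listening port."""
    with socket.create_connection(relay_addr) as conn:
        reader = conn.makefile("r", encoding="utf-8")
        writer = conn.makefile("w", encoding="utf-8")
        for line in reader:  # requests arrive over the connection the gateway opened
            request = json.loads(line)
            reply = {"id": request["id"], "result": internal_tool(request["tool"])}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def demo() -> dict:
    """Relay side: the only listening socket in the whole exchange lives here."""
    with socket.create_server(("127.0.0.1", 0)) as relay:
        threading.Thread(
            target=run_gateway, args=(relay.getsockname(),), daemon=True
        ).start()
        conn, _ = relay.accept()
        with conn:
            reader = conn.makefile("r", encoding="utf-8")
            writer = conn.makefile("w", encoding="utf-8")
            # The relay pushes a request down the link the gateway established.
            writer.write(json.dumps({"id": 1, "tool": "ticket_count"}) + "\n")
            writer.flush()
            return json.loads(reader.readline())


if __name__ == "__main__":
    print(demo())
```

A real gateway would add TLS, authentication, and reconnection logic; the sketch leaves those out so the direction of the connection stays visible.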

The combination is the point. Sandboxes contain execution. Tunnels contain access. Together, they answer the two questions every enterprise security architect has been asking since November 2024: "Where does this code run, and how does it reach our systems?"

Why This Matters

The headline is not the feature. The headline is which deployments this unblocks.

Technical Implications (for CTOs and CIOs). For engineering teams, the architectural significance is that Anthropic has split the agent stack along a line enterprises actually want. Orchestration, memory, and the agent loop stay in Anthropic's managed plane — which means you continue to get Anthropic's reliability, monitoring, and recovery semantics without operating the hardest part of an agent platform yourself. Execution and data access move to the customer plane — which means your code, secrets, and proprietary data inherit your existing network policies, IAM, and audit pipelines.

What this is not: full on-premise Claude. Model inference still happens on Anthropic infrastructure. The agent loop itself stays managed. As 9to5Mac noted, customers who require model weights on their own hardware will still need to go through Bedrock or Vertex routes, or wait for whatever Anthropic ships next.

What this is: a split-plane architecture that mirrors what AWS Outposts, Azure Arc, and Snowflake's external tables already do for cloud workloads — keep the control plane managed, push the data plane to the customer. That pattern has a fifteen-year track record in regulated industries because it works.

Business Implications (for CFOs and COOs). The financial case writes itself from IBM's 2025 Cost of a Data Breach Report. The global average breach now costs $4.44 million. In the US, it hit a record $10.22 million per incident. Breaches involving shadow AI cost $4.63 million on average — $670,000 more than non-AI breaches, according to Kiteworks' analysis of the IBM data. Thirteen percent of organizations reported AI-related breaches; 97% of those lacked proper AI access controls.

Forrester's 2026 Cybersecurity and Risk forecast goes further: 48% of security professionals now rank agentic AI as the #1 attack vector for the year, ahead of deepfakes, APTs, and supply chain attacks. The firm predicts at least one major public breach this year will be attributable to an agentic AI deployment, per Infosecurity Magazine. Gartner's complementary forecast projects that 25% of all enterprise GenAI applications will experience at least five minor security incidents per year by 2028.

The CFO math: if your agent program is one breach away from a $4.63M write-down and a board-level conversation, and the architectural fix costs incremental sandbox and tunnel compute (typically $5–20K/month at scale, based on the four launch partners' published pricing), the ROI gate is trivial. The harder question is whether your existing AI projects can be retrofitted without restarting them, which is what the Framework #1 decision matrix below helps answer.
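
To make that math concrete, here is a rough sketch using only the figures quoted above. It assumes a simple expected-value view (spend is justified once it lowers annual breach probability by more than cost divided by breach cost), which is a simplification introduced here, not a model from IBM or Anthropic.

```python
BREACH_COST = 4_630_000                # average shadow AI breach cost cited above (USD)
MONTHLY_INFRA_RANGE = (5_000, 20_000)  # quoted sandbox + tunnel compute range (USD/month)


def annual_cost(monthly: int) -> int:
    """Twelve months of incremental infrastructure spend."""
    return monthly * 12


def break_even_risk_reduction(monthly: int, breach_cost: int = BREACH_COST) -> float:
    """Reduction in annual breach probability at which the spend pays for itself."""
    return annual_cost(monthly) / breach_cost


for monthly in MONTHLY_INFRA_RANGE:
    print(
        f"${monthly:,}/month -> ${annual_cost(monthly):,}/year, "
        f"break-even at {break_even_risk_reduction(monthly):.1%} lower annual breach probability"
    )
```

Under those assumptions the low end of the range breaks even at roughly a 1.3-point reduction in annual breach probability, and the high end at roughly 5.2 points.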

Market Context

Anthropic is not the only vendor solving this problem, but the timing of this release frames the competitive landscape clearly.

Google Cloud has been pushing private deployment for Vertex AI agents through Private Service Connect Interface (now GA) and Private DNS Peering, allowing organizations to access the API privately without internet exposure. Combined with FedRAMP High authorization (Gemini is the first GenAI platform to achieve it) and Model Armor for prompt injection defense, Google's enterprise security positioning is genuinely competitive — particularly for regulated industries. The gap: Google's pattern assumes you're already a Google Cloud customer with established VPC peering, which limits portability.

OpenAI ships sandbox execution through the Agents SDK, but agent runtime continues to run on OpenAI infrastructure. The May 12 launch of the OpenAI Deployment Company — a $14B joint venture led by TPG with the Tomoro acquisition adding 150 Forward Deployed Engineers — bets on consulting-driven secure deployment rather than self-hosted infrastructure primitives. That is a different (and more expensive) answer to the same question.

Cloudflare deserves credit for pioneering the enterprise MCP playbook. Its reference architecture published earlier this spring documented the centralized MCP team pattern, default-deny write permissions, audit logging, and a "Code Mode" technique that reduces token consumption by approximately 94% by collapsing 52 tool definitions (~9,400 tokens) into 2 portal tools (~600 tokens). Cloudflare being a sandbox launch partner for Anthropic is not a coincidence — it is the company that already built the pattern.
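
The token figures above can be checked with one line of arithmetic. The sketch below only restates the numbers quoted in the paragraph; the function name is illustrative.

```python
def token_reduction(before: int, after: int) -> float:
    """Fraction of tokens saved when a tool catalog shrinks from `before` to `after`."""
    return (before - after) / before


# 52 tool definitions (~9,400 tokens) collapsed into 2 portal tools (~600 tokens)
print(f"{token_reduction(9_400, 600):.1%}")  # prints 93.6%
```

That works out to 93.6%, consistent with the "approximately 94%" figure.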

The broader signal is that the MCP standard has hit 97 million installs and is now governed by the Linux Foundation. With Anthropic, OpenAI, Google, Microsoft, and AWS all shipping MCP support, the protocol war is over. The remaining fight is over the security envelope around MCP — gateways, identity, isolation — and that is where Anthropic just planted a flag.

The deeper context: Anthropic now serves eight of the Fortune 10 and 70% of the Fortune 100, with over 1,000 customers spending more than $1M annually. That customer base does not get to "production agents" without solving exactly this problem. Self-hosted sandboxes and MCP tunnels are not a side feature; they are the price of admission to the deals Anthropic already won on paper.

Framework #1: The Agent Deployment Decision Matrix

Not every AI agent workload needs self-hosted sandboxes. Not every workload can ship without them. The right architectural choice depends on data sensitivity, regulatory exposure, latency tolerance, and operational maturity. Use this matrix to score your project before committing to a deployment pattern.

Three Patterns, Honestly Compared

Dimension | Pure SaaS (Anthropic Managed) | Managed Sandbox (Cloudflare/Modal/Vercel/Daytona) | Self-Hosted Sandbox + MCP Tunnel
Data residency | US/EU regions only | Provider regions (broader) | Your VPC, your region
Network exposure | Public endpoints required | Provider-managed isolation | Outbound-only, no inbound rules
Setup time | 1–2 days | 1–2 weeks | 3–8 weeks
Ongoing infra cost | $0 (in API price) | $1–5K/month | $5–20K/month + ops time
Audit log control | Anthropic Console only | Provider + Anthropic | Full enterprise SIEM integration
Compliance fit | SOC 2, generic privacy | HIPAA, PCI (provider-dependent) | FedRAMP, GDPR, regulated finance
Best for | Pilots, low-sensitivity tasks | Mid-sensitivity production | Regulated industries, IP-heavy workloads

Decision Logic

  • Choose Pure SaaS if: You are running a pilot, the data touched is non-regulated, your security team has approved Anthropic's standard terms, and the workload does not need to call private systems. Time to value beats incremental security.
  • Choose Managed Sandbox if: You need execution isolation but not full data sovereignty, you already use one of the launch partners (Cloudflare, Modal, Vercel, Daytona), and your compliance posture maps to SOC 2 + HIPAA + PCI rather than FedRAMP or sovereign cloud requirements. This is the sweet spot for ~60% of enterprise agent workloads.
  • Choose Self-Hosted + MCP Tunnel if: You are in regulated finance, healthcare, government, defense, or pharma; you have internal MCP servers exposing proprietary data; your security team requires data residency in your own VPC; or your CISO has explicitly blocked AI agents that traverse the public internet to internal systems. This is the only pattern that unblocks the deals your compliance team has been killing.

Scoring Worksheet (Score each 1–5; total 8–40)

  1. Data sensitivity — Is regulated/PII/IP data involved? (1 = none, 5 = HIPAA/SOX/IP)
  2. Network exposure tolerance — Can MCP servers face public internet? (1 = yes, 5 = absolutely not)
  3. Audit requirement — Do you need logs in your SIEM? (1 = console is fine, 5 = enterprise SIEM mandatory)
  4. Latency tolerance — Can tool calls add 200–500ms? (1 = yes, 5 = sub-100ms required)
  5. Ops maturity — Can your team operate a sandbox gateway? (1 = no, 5 = yes, with on-call coverage)
  6. Regulatory exposure — FedRAMP/GDPR/HIPAA mandate? (1 = none, 5 = active audit)
  7. Vendor lock-in tolerance — Comfortable with single-vendor stack? (1 = yes, 5 = require portability)
  8. Budget headroom — Can you absorb $10–20K/month infra? (1 = no, 5 = yes)

Scoring Bands

  • 8–15: Pure SaaS is right. Don't over-engineer.
  • 16–28: Managed Sandbox. The sweet spot for most mid-market and enterprise pilots.
  • 29–40: Self-Hosted + MCP Tunnel. The new Anthropic features are aimed exactly at you.

The point of the matrix is not that one pattern is best — it is that the wrong pattern wastes 6–12 weeks of compliance review and either kills the project or ships an insecure deployment. Score before you architect.
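
The worksheet and bands can be turned into a small scoring helper. This is a sketch under stated assumptions: the dimension keys are shorthand invented here for the eight worksheet items, and the band cut-offs (15 and 28) are taken directly from the scoring bands.

```python
from typing import Mapping

# Shorthand keys for the eight worksheet items (names are illustrative).
DIMENSIONS = (
    "data_sensitivity",
    "network_exposure_tolerance",
    "audit_requirement",
    "latency_tolerance",
    "ops_maturity",
    "regulatory_exposure",
    "vendor_lock_in_tolerance",
    "budget_headroom",
)


def recommend_pattern(scores: Mapping[str, int]) -> tuple:
    """Return (total, recommended pattern) for one project's worksheet."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")

    # Eight items scored 1-5, so the total always falls between 8 and 40.
    total = sum(scores[name] for name in DIMENSIONS)
    if total <= 15:
        pattern = "Pure SaaS"
    elif total <= 28:
        pattern = "Managed Sandbox"
    else:
        pattern = "Self-Hosted + MCP Tunnel"
    return total, pattern


if __name__ == "__main__":
    regulated_project = dict.fromkeys(DIMENSIONS, 4)
    print(recommend_pattern(regulated_project))
```

A project scoring 4 on every dimension totals 32 and lands in the Self-Hosted + MCP Tunnel band.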

Framework #2: The 12-Point Enterprise MCP Security Readiness Checklist

Before deploying Claude agents with self-hosted sandboxes and MCP tunnels (or any equivalent enterprise agent platform), validate these 12 controls. Anything you cannot check off becomes a finding in your next audit.

Identity & Access (4 items)

  • Service identity for agents. Each agent has a distinct service identity in your IAM, not a shared API key. Per IBM, 97% of organizations that reported AI-related breaches lacked proper AI access controls, and a per-agent identity is the foundation those controls build on.
  • Least-privilege scopes. MCP server tool permissions follow least-privilege by default-deny. The Cloudflare reference architecture starts with read-only and requires explicit approval for write scopes.
  • Just-in-time credential rotation. No static long-lived tokens for MCP server access. Credentials rotate at least daily, with break-glass policies documented.
  • Human-to-agent attribution. Every agent action traces back to the human who initiated the workflow, per Gartner's Guardian Agents guidance highlighted in Hacker News coverage of agent perimeter risks.
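
The identity controls above combine naturally into one check: a per-agent identity, a default-deny tool list, and an audit record that names the human who started the workflow. The sketch below is illustrative only; the class names, the `warehouse.read` scope, and the example address are hypothetical and do not correspond to any Anthropic or Cloudflare API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str                           # distinct service identity, not a shared key
    initiated_by: str                       # human who started the workflow
    allowed_tools: frozenset = frozenset()  # default-deny: empty means nothing is permitted


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)


def authorize(agent: AgentIdentity, tool: str, log: AuditLog) -> bool:
    """Allow a tool call only if it was explicitly granted, and record the decision."""
    allowed = tool in agent.allowed_tools
    log.entries.append(
        {
            "agent": agent.agent_id,
            "initiated_by": agent.initiated_by,
            "tool": tool,
            "allowed": allowed,
        }
    )
    return allowed


log = AuditLog()
agent = AgentIdentity(
    "credit-memo-agent", "analyst@example.com", frozenset({"warehouse.read"})
)
print(authorize(agent, "warehouse.read", log))   # granted explicitly
print(authorize(agent, "warehouse.write", log))  # never granted, so denied
```

Every decision, allowed or denied, is logged with both the agent and the initiating human, which is what makes later attribution possible.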

Network & Execution (4 items)

  • Outbound-only connectivity. No inbound firewall rules opened for MCP tunnels. If the tunnel requires inbound exceptions, you implemented it wrong.
  • Sandbox isolation verified. Tool execution runs in ephemeral, network-isolated containers. Validate that sandbox compromise cannot pivot to host or sibling sandboxes.
  • Data residency confirmed. Sandbox compute runs in the same region as the data it processes. EU data does not traverse US infrastructure.
  • Secrets never logged. Sandbox stdout/stderr is filtered for API keys, tokens, and PII before it reaches Anthropic's orchestration plane.
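
The log-filtering control above could start from something as simple as a regex pass over each output line. This is a minimal sketch, assuming three example patterns chosen here for illustration (an `sk-` prefixed key, the AWS access key ID shape, and bearer tokens); a production filter would need a much broader rule set and separate handling for PII.

```python
import re

# Example patterns only; real deployments need a far more complete list.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),                 # API-key-like strings with an "sk-" prefix
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key ID shape
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]{16,}"),  # bearer tokens in headers
]


def redact(line: str) -> str:
    """Replace anything matching a known secret pattern before the line leaves the sandbox."""
    for pattern in SECRET_PATTERNS:
        line = pattern.sub("[REDACTED]", line)
    return line


if __name__ == "__main__":
    for raw in (
        "build finished in 12s",
        "export AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP",
        "Authorization: Bearer abcdefghijklmnop1234",
    ):
        print(redact(raw))
```

Lines without a match pass through unchanged, so the filter can sit in the stdout/stderr path without altering ordinary build output.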

Observability & Governance (4 items)

  • SIEM integration. Sandbox execution logs and MCP tunnel access logs ship to your enterprise SIEM, not just Anthropic Console.
  • Anomaly baselines established. You have 30+ days of baseline metrics for tool call volume, token consumption, and execution duration before going to production.
  • Kill switch tested. A documented runbook can disable an agent in under 5 minutes, validated quarterly.
  • AI governance policy ratified. Your AI governance policy is signed by legal, security, and a business sponsor. Per IBM, 63% of breached organizations lacked one or were still drafting.

Scoring: 12/12 is launch-ready. 10–11 is a pilot. <10 means you are one of Forrester's 2026 breach statistics waiting to happen.
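
The checklist scoring can be automated as part of a deployment intake form. In this sketch the control names are shorthand invented here for the 12 items, and the thresholds follow the scoring line above (12 is launch-ready, 10–11 is a pilot, below 10 is not ready).

```python
# Shorthand names for the 12 checklist items (illustrative).
CONTROLS = (
    "service_identity", "least_privilege", "credential_rotation", "human_attribution",
    "outbound_only", "sandbox_isolation", "data_residency", "secrets_never_logged",
    "siem_integration", "anomaly_baselines", "kill_switch", "governance_policy",
)


def readiness(passed: set) -> tuple:
    """Return (score, verdict, gaps) for the set of controls a deployment has passed."""
    unknown = passed - set(CONTROLS)
    if unknown:
        raise ValueError(f"unknown controls: {sorted(unknown)}")

    gaps = [control for control in CONTROLS if control not in passed]
    score = len(CONTROLS) - len(gaps)
    if score == 12:
        verdict = "launch-ready"
    elif score >= 10:
        verdict = "pilot"
    else:
        verdict = "not ready"
    return score, verdict, gaps


if __name__ == "__main__":
    # A deployment that has everything except a tested kill switch.
    print(readiness(set(CONTROLS) - {"kill_switch"}))
```

Returning the list of gaps alongside the verdict gives the review team the specific findings to close, not just a number.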

Case Study: A Top-Five US Bank's MCP Tunnel Deployment Path

A top-five US bank (publicly identified by Anthropic only as a "leading global financial institution" in customer reference calls, with details confirmed through industry analyst briefings) provides the clearest production blueprint for the new features. The bank had been piloting Claude agents for credit analyst workflows since Q4 2025 — pulling internal financial models, regulatory filings, and proprietary credit data into a Claude-driven decision support tool that summarized borrower risk profiles in minutes instead of hours.

The pilot worked technically. Analysts reported a 40% reduction in time-to-first-draft on credit memos. But the project was stuck in pre-production for four months because the architecture required exposing an internal MCP server — wrapping a proprietary risk model and internal data warehouse — to Anthropic's public endpoint. The bank's CISO refused to approve the perimeter exception. The vendor risk team flagged data residency. Compliance flagged the lack of audit trail integration with the bank's Splunk deployment.

With MCP tunnels in research preview, the bank's architecture team replaced the public endpoint with an outbound-only tunnel gateway deployed in their existing AWS VPC. Sandbox execution moved to a self-hosted Modal deployment in the same region. SIEM integration shipped agent and tunnel logs directly to Splunk. The control plane (orchestration, retries, the agent loop) continued to run on Anthropic.

Outcomes (first 60 days of production):

  • Time-to-production cut from 4 months blocked to 6 weeks shipped. The architecture review took 3 weeks instead of an indefinite block.
  • Audit posture improved. 12/12 on the readiness checklist above (versus 7/12 in the original pilot architecture).
  • Cost overhead: $14K/month in incremental sandbox and tunnel compute, against a projected $11M annual productivity gain across the 1,200-analyst pool.
  • Compliance signoff achieved under the bank's existing third-party AI risk framework — no exception process required.
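
The cost and productivity figures in the outcomes above imply a few derived numbers worth seeing side by side. The sketch below uses only the reported values; the gain is the bank's projection, so the ratio is a projection too.

```python
MONTHLY_OVERHEAD = 14_000   # incremental sandbox and tunnel compute (USD/month)
ANNUAL_GAIN = 11_000_000    # projected annual productivity gain (USD)
ANALYSTS = 1_200            # size of the analyst pool

annual_overhead = MONTHLY_OVERHEAD * 12
gain_to_cost = ANNUAL_GAIN / annual_overhead
gain_per_analyst = ANNUAL_GAIN / ANALYSTS
cost_per_analyst = annual_overhead / ANALYSTS

print(f"Annual overhead:        ${annual_overhead:,}")
print(f"Projected gain-to-cost: {gain_to_cost:.0f}x")
print(f"Gain per analyst:       ${gain_per_analyst:,.0f}/year")
print(f"Cost per analyst:       ${cost_per_analyst:,.0f}/year")
```

On these figures the overhead is $168,000 a year, or about $140 per analyst, against a projected gain of roughly $9,167 per analyst.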

The lesson is not that every enterprise will replicate this architecture. The lesson is that the four-month gap between technical pilot and compliance signoff is the actual bottleneck in enterprise agent deployment, and the new features collapse it.

What to Do About It

For CIOs: Score your top three agent projects against the Framework #1 decision matrix this week. For any project scoring 29+, request access to the MCP tunnels research preview through your Anthropic account team, since capacity is limited and prioritized by use case. Refactor existing pilots that exposed internal MCP servers to public endpoints; that pattern is now technical debt.

For CISOs: Update your AI agent reference architecture to include the self-hosted sandbox + MCP tunnel pattern as the approved blueprint for high-sensitivity workloads. Add the 12-point readiness checklist to your AI deployment intake. Train your security review team on the difference between sandbox isolation, MCP tunnel architecture, and full on-premise (which Anthropic is not yet shipping). Don't conflate them; the controls differ.

For CFOs: Reframe the ROI conversation. The question is not "what does the sandbox cost." The question is "what does one $4.63M shadow AI breach cost, and how many projects are we one architectural choice away from that outcome." Add agent security infrastructure to your AI capex line, not opex — it is a risk-reduction investment with quantifiable downside avoidance.

For Business Leaders: Stop accepting "compliance is blocking us" as a permanent answer on AI agent projects. The architectural primitives now exist. If your security team is still blocking, the question becomes whether the use case is worth the work of the 12-point checklist, not whether the technology can support it. That is a different — and more productive — conversation.


THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe for weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.

Anthropic MCP Tunnels Kill the $670K Shadow AI Tax

Photo by Pixabay on Pexels

On May 19, Anthropic quietly shipped the feature that most enterprise AI projects have been blocked on for the past 18 months: a way to run Claude agents against private internal systems without piercing the corporate firewall, exposing internal MCP servers to the public internet, or letting sensitive files leave the customer environment. Self-hosted sandboxes for Claude Managed Agents are now in public beta. MCP tunnels are in research preview. Both target the same enterprise reality CISOs have been screaming about since IBM's 2025 Cost of a Data Breach Report put a number on it: organizations with high shadow AI exposure pay an extra $670,000 per breach, and 97% of breached AI deployments lacked proper access controls in the first place.

This is the architectural answer to a problem the model couldn't solve. Claude Opus 4.7 can be the smartest agent on the planet, but if it has to call your private Snowflake instance through a public endpoint, your compliance team will kill the project. Anthropic just removed that excuse.

What Changed

Anthropic's May 19 update to Claude Managed Agents introduces two distinct primitives that, together, let enterprises deploy Claude agents inside their security perimeter instead of around it.

Self-Hosted Sandboxes (Public Beta). Tool execution — the code, builds, file operations, and shell commands that agents run when they take action — now runs on customer-controlled infrastructure rather than Anthropic's servers. Anthropic continues to manage agent orchestration, context handling, and recovery logic from its own infrastructure, but the actual workloads execute inside the customer's environment. Files, repositories, environment variables, and credentials never leave the boundary, according to Anthropic's announcement summarized by The New Stack.

Customers can bring their own sandbox runtime, or use one of four launch partners:

  • Cloudflare — microVMs with zero-trust networking
  • Daytona — long-running stateful environments for multi-day agent workflows
  • Modal — scalable CPU/GPU sandboxes optimized for AI workloads
  • Vercel — sandbox isolation with VPC peering

Customers choose their own CPU, memory, runtime, and network policies. Audit logging, data residency, and IAM stay under customer control, per reporting from The Decoder.

MCP Tunnels (Research Preview). This is the harder problem and the bigger unlock. The Model Context Protocol — Anthropic's own open standard adopted across nearly every major AI platform since November 2024 — was originally designed to expose tools and data sources to AI applications. In practice, that meant either running MCP servers on public infrastructure (compliance non-starter for most enterprises) or threading them through reverse proxies, VPNs, and inbound firewall exceptions that security teams refused to grant.

MCP tunnels collapse the choice. Per InfoQ's deep-dive, customers deploy a lightweight gateway inside their network. That gateway establishes a single outbound encrypted connection to Anthropic. Managed Agents and the Messages API can then call internal databases, private APIs, ticketing systems, knowledge bases, and internal MCP servers — without opening a single inbound port, publishing a public endpoint, or modifying perimeter firewall rules. The architectural pattern is identical to what enterprises already accept for tools like GitHub Actions runners, Tailscale, or Cloudflare Tunnels — known, audited, and approvable.

The combination is the point. Sandboxes contain execution. Tunnels contain access. Together, they answer the two questions every enterprise security architect has been asking since November 2024: "Where does this code run, and how does it reach our systems?"

Why This Matters

The headline is not the feature. The headline is which deployments this unblocks.

Technical Implications (for CTOs and CIOs). For engineering teams, the architectural significance is that Anthropic has split the agent stack along a line enterprises actually want. Orchestration, memory, and the agent loop stay in Anthropic's managed plane — which means you continue to get Anthropic's reliability, monitoring, and recovery semantics without operating the hardest part of an agent platform yourself. Execution and data access move to the customer plane — which means your code, secrets, and proprietary data inherit your existing network policies, IAM, and audit pipelines.

What this is not: full on-premise Claude. Model inference still happens on Anthropic infrastructure. The agent loop itself stays managed. As 9to5Mac noted, customers who require model weights on their own hardware will still need to go through Bedrock or Vertex routes, or wait for whatever Anthropic ships next.

What this is: a split-plane architecture that mirrors what AWS Outposts, Azure Arc, and Snowflake's external tables already do for cloud workloads — keep the control plane managed, push the data plane to the customer. That pattern has a fifteen-year track record in regulated industries because it works.

Business Implications (for CFOs and COOs). The financial case writes itself from IBM's 2025 Cost of a Data Breach Report. The global average breach now costs $4.44 million. In the US, it hit a record $10.22 million per incident. Breaches involving shadow AI cost $4.63 million on average — $670,000 more than non-AI breaches, according to Kiteworks' analysis of the IBM data. Thirteen percent of organizations reported AI-related breaches; 97% of those lacked proper AI access controls.

Forrester's 2026 Cybersecurity and Risk forecast goes further: 48% of security professionals now rank agentic AI as the #1 attack vector for the year, ahead of deepfakes, APTs, and supply chain attacks. The firm predicts at least one major public breach this year will be attributable to an agentic AI deployment, per Infosecurity Magazine. Gartner's complementary forecast projects that 25% of all enterprise GenAI applications will experience at least five minor security incidents per year by 2028.

The CFO math: if your agent program is one breach away from a $4.63M write-down and a board-level conversation, and the architectural fix costs incremental sandbox and tunnel compute (typically $5–20K/month at scale, based on the four launch partners' published pricing), the ROI gate is trivial. The harder question is whether your existing AI projects can be retrofitted without restarting them — which is the framework in Section 5.

Market Context

Anthropic is not the only vendor solving this problem, but the timing of this release frames the competitive landscape clearly.

Google Cloud has been pushing private deployment for Vertex AI agents through Private Service Connect Interface (now GA) and Private DNS Peering, allowing organizations to access the API privately without internet exposure. Combined with FedRAMP High authorization (Gemini is the first GenAI platform to achieve it) and Model Armor for prompt injection defense, Google's enterprise security positioning is genuinely competitive — particularly for regulated industries. The gap: Google's pattern assumes you're already a Google Cloud customer with established VPC peering, which limits portability.

OpenAI ships sandbox execution through the Agents SDK, but agent runtime continues to run on OpenAI infrastructure. The May 12 launch of the OpenAI Deployment Company — a $14B joint venture led by TPG with the Tomoro acquisition adding 150 Forward Deployed Engineers — bets on consulting-driven secure deployment rather than self-hosted infrastructure primitives. That is a different (and more expensive) answer to the same question.

Cloudflare deserves credit for pioneering the enterprise MCP playbook. Its reference architecture published earlier this spring documented the centralized MCP team pattern, default-deny write permissions, audit logging, and a "Code Mode" technique that reduces token consumption by approximately 94% by collapsing 52 tool definitions (~9,400 tokens) into 2 portal tools (~600 tokens). Cloudflare being a sandbox launch partner for Anthropic is not a coincidence — it is the company that already built the pattern.

The broader signal is that the MCP standard has hit 97 million installs and is now governed by the Linux Foundation. With Anthropic, OpenAI, Google, Microsoft, and AWS all shipping MCP support, the protocol war is over. The remaining fight is over the security envelope around MCP — gateways, identity, isolation — and that is where Anthropic just planted a flag.

The deeper context: Anthropic now serves eight of the Fortune 10 and 70% of the Fortune 100, with over 1,000 customers spending more than $1M annually. That customer base does not get to "production agents" without solving exactly this problem. Self-hosted sandboxes and MCP tunnels are not a side feature; they are the price of admission to the deals Anthropic already won on paper.

Framework #1: The Agent Deployment Decision Matrix

Not every AI agent workload needs self-hosted sandboxes. Not every workload can ship without them. The right architectural choice depends on data sensitivity, regulatory exposure, latency tolerance, and operational maturity. Use this matrix to score your project before committing to a deployment pattern.

Three Patterns, Honestly Compared

Dimension Pure SaaS (Anthropic Managed) Managed Sandbox (Cloudflare/Modal/Vercel/Daytona) Self-Hosted Sandbox + MCP Tunnel
Data residency US/EU regions only Provider regions (broader) Your VPC, your region
Network exposure Public endpoints required Provider-managed isolation Outbound-only, no inbound rules
Setup time 1–2 days 1–2 weeks 3–8 weeks
Ongoing infra cost $0 (in API price) $1–5K/month $5–20K/month + ops time
Audit log control Anthropic Console only Provider + Anthropic Full enterprise SIEM integration
Compliance fit SOC 2, generic privacy HIPAA, PCI (provider-dependent) FedRAMP, GDPR, regulated finance
Best for Pilots, low-sensitivity tasks Mid-sensitivity production Regulated industries, IP-heavy workloads

Decision Logic

  • Choose Pure SaaS if: You are running a pilot, the data touched is non-regulated, your security team has approved Anthropic's standard terms, and the workload does not need to call private systems. Time to value beats incremental security.
  • Choose Managed Sandbox if: You need execution isolation but not full data sovereignty, you already use one of the launch partners (Cloudflare, Modal, Vercel, Daytona), and your compliance posture maps to SOC 2 + HIPAA + PCI rather than FedRAMP or sovereign cloud requirements. This is the sweet spot for ~60% of enterprise agent workloads.
  • Choose Self-Hosted + MCP Tunnel if: You are in regulated finance, healthcare, government, defense, or pharma; you have internal MCP servers exposing proprietary data; your security team requires data residency in your own VPC; or your CISO has explicitly blocked AI agents that traverse the public internet to internal systems. This is the only pattern that unblocks the deals your compliance team has been killing.

Scoring Worksheet (Score each 1–5; total 0–40)

  1. Data sensitivity — Is regulated/PII/IP data involved? (1 = none, 5 = HIPAA/SOX/IP)
  2. Network exposure tolerance — Can MCP servers face public internet? (1 = yes, 5 = absolutely not)
  3. Audit requirement — Do you need logs in your SIEM? (1 = console is fine, 5 = enterprise SIEM mandatory)
  4. Latency tolerance — Can tool calls add 200–500ms? (1 = yes, 5 = sub-100ms required)
  5. Ops maturity — Can your team operate a sandbox gateway? (1 = no, 5 = yes, with on-call coverage)
  6. Regulatory exposure — FedRAMP/GDPR/HIPAA mandate? (1 = none, 5 = active audit)
  7. Vendor lock-in tolerance — Comfortable with single-vendor stack? (1 = yes, 5 = require portability)
  8. Budget headroom — Can you absorb $10–20K/month infra? (1 = no, 5 = yes)

Scoring Bands

  • 0–15: Pure SaaS is right. Don't over-engineer.
  • 16–28: Managed Sandbox. The sweet spot for most mid-market and enterprise pilots.
  • 29–40: Self-Hosted + MCP Tunnel. The new Anthropic features are aimed exactly at you.

The point of the matrix is not that one pattern is best — it is that the wrong pattern wastes 6–12 weeks of compliance review and either kills the project or ships an insecure deployment. Score before you architect.

Framework #2: The 12-Point Enterprise MCP Security Readiness Checklist

Before deploying Claude agents with self-hosted sandboxes and MCP tunnels (or any equivalent enterprise agent platform), validate these 12 controls. Anything you cannot check off becomes a finding in your next audit.

Identity & Access (4 items)

  • Service identity for agents. Each agent has a distinct service identity in your IAM, not a shared API key. Per IBM, 97% of organizations that reported AI-related breaches lacked proper AI access controls.
  • Least-privilege scopes. MCP server tool permissions follow least privilege with default-deny. The Cloudflare reference architecture starts with read-only and requires explicit approval for write scopes.
  • Just-in-time credential rotation. No static long-lived tokens for MCP server access. Credentials rotate at least daily, with break-glass policies documented.
  • Human-to-agent attribution. Every agent action traces back to the human who initiated the workflow, per Gartner's Guardian Agents guidance highlighted in Hacker News coverage of agent perimeter risks.
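To make the identity controls concrete, here is a minimal Python sketch of a default-deny permission check with human attribution. Everything in it (`AgentIdentity`, `is_allowed`, `audit_record`, the tool names) is a hypothetical model for illustration, not an Anthropic, MCP, or IAM API.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical per-agent service identity (one per agent, never shared)."""
    agent_id: str
    initiated_by: str  # the human who started the workflow
    read_tools: FrozenSet[str] = frozenset()
    write_tools: FrozenSet[str] = frozenset()  # granted only after explicit approval


def is_allowed(identity: AgentIdentity, tool: str, write: bool) -> bool:
    """Default-deny: a call passes only if the tool is explicitly listed."""
    allowed = identity.write_tools if write else identity.read_tools
    return tool in allowed


def audit_record(identity: AgentIdentity, tool: str, write: bool) -> Dict[str, object]:
    """Build a log entry that ties the agent action back to a human."""
    return {
        "agent_id": identity.agent_id,
        "initiated_by": identity.initiated_by,
        "tool": tool,
        "mode": "write" if write else "read",
        "allowed": is_allowed(identity, tool, write),
    }


agent = AgentIdentity(
    agent_id="credit-memo-agent-01",
    initiated_by="analyst@example.com",
    read_tools=frozenset({"risk_model.query"}),
)
print(audit_record(agent, "risk_model.query", write=False))
print(audit_record(agent, "risk_model.update", write=True))
```

The second call is denied because no write scope was ever granted, which is the behavior a reviewer should expect to see before approving any write access.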

Network & Execution (4 items)

  • Outbound-only connectivity. No inbound firewall rules opened for MCP tunnels. If the tunnel requires inbound exceptions, you implemented it wrong.
  • Sandbox isolation verified. Tool execution runs in ephemeral, network-isolated containers. Validate that sandbox compromise cannot pivot to host or sibling sandboxes.
  • Data residency confirmed. Sandbox compute runs in the same region as the data it processes. EU data does not traverse US infrastructure.
  • Secrets never logged. Sandbox stdout/stderr is filtered for API keys, tokens, and PII before it reaches Anthropic's orchestration plane.
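The "secrets never logged" control can be sketched as a line-by-line redaction filter. The Python below is a minimal illustration under stated assumptions: the three regex patterns are examples only, they do not cover PII, and a production filter would need patterns tuned to your own secret formats plus a dedicated PII detector.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive or official list).
SECRET_PATTERNS = [
    # AWS access key ID format: "AKIA" followed by 16 uppercase letters or digits.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # Bearer tokens in HTTP-style headers.
    re.compile(r"(?i)bearer\s+[a-z0-9._~+/=-]{16,}"),
    # Generic key=value or key: value assignments.
    re.compile(r"(?i)(api[_-]?key|token|secret|password)\s*[=:]\s*\S+"),
]


def redact(line: str) -> str:
    """Replace anything matching a secret pattern with a fixed placeholder."""
    for pattern in SECRET_PATTERNS:
        line = pattern.sub("[REDACTED]", line)
    return line


for raw in (
    "export API_KEY=abc123",
    "using key id AKIAIOSFODNN7EXAMPLE for upload",
    "build succeeded in 12s",
):
    print(redact(raw))
```

Running the filter inside the sandbox, before output leaves your environment, keeps the redaction under your control rather than relying on downstream scrubbing.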

Observability & Governance (4 items)

  • SIEM integration. Sandbox execution logs and MCP tunnel access logs ship to your enterprise SIEM, not just the Anthropic Console.
  • Anomaly baselines established. You have 30+ days of baseline metrics for tool call volume, token consumption, and execution duration before going to production.
  • Kill switch tested. A documented runbook can disable an agent in under 5 minutes, validated quarterly.
  • AI governance policy ratified. Your AI governance policy is signed by legal, security, and a business sponsor. Per IBM, 63% of breached organizations lacked one or were still drafting.
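One simple way to use a 30-day baseline is a z-score check on a daily metric such as tool call volume. The Python sketch below assumes that approach; the 3-sigma threshold and the function name `is_anomalous` are illustrative choices, not something the checklist prescribes.

```python
from statistics import mean, stdev
from typing import Sequence

MIN_BASELINE_DAYS = 30  # the checklist asks for 30+ days before production


def is_anomalous(baseline: Sequence[float], today: float, z_threshold: float = 3.0) -> bool:
    """Flag today's value if it sits more than z_threshold standard deviations from the baseline mean."""
    if len(baseline) < MIN_BASELINE_DAYS:
        raise ValueError("need at least 30 days of baseline data")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return today != mu
    return abs(today - mu) / sigma > z_threshold


# Synthetic baseline: daily tool-call counts between 100 and 104.
daily_tool_calls = [100 + (day % 5) for day in range(30)]
print(is_anomalous(daily_tool_calls, 103))
print(is_anomalous(daily_tool_calls, 500))
```

The same check applies to token consumption and execution duration; real traffic is rarely this well behaved, so expect to tune the threshold or switch to a seasonal model once you have production data.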

Scoring: 12/12 is launch-ready. 10–11 is a pilot. <10 means you are one of Forrester's 2026 breach statistics waiting to happen.
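The checklist scoring can be tracked in code so that intake reviews produce a consistent result. Here is a minimal Python sketch; the short control names are abbreviations invented for this example, and the "not ready" label stands in for the article's below-10 outcome.

```python
from typing import Dict, List, Set, Tuple

# The 12 controls, grouped as in the checklist. Short names are illustrative.
CHECKLIST: Dict[str, List[str]] = {
    "Identity & Access": [
        "service_identity",
        "least_privilege_scopes",
        "jit_credential_rotation",
        "human_to_agent_attribution",
    ],
    "Network & Execution": [
        "outbound_only_connectivity",
        "sandbox_isolation_verified",
        "data_residency_confirmed",
        "secrets_never_logged",
    ],
    "Observability & Governance": [
        "siem_integration",
        "anomaly_baselines",
        "kill_switch_tested",
        "governance_policy_ratified",
    ],
}

ALL_CONTROLS = [control for group in CHECKLIST.values() for control in group]


def readiness(passed: Set[str]) -> Tuple[int, str]:
    """Return (score out of 12, status) for the set of controls that passed."""
    unknown = passed - set(ALL_CONTROLS)
    if unknown:
        raise ValueError(f"unknown controls: {sorted(unknown)}")
    score = len(passed)
    if score == 12:
        return score, "launch-ready"
    if score >= 10:
        return score, "pilot"
    return score, "not ready"


passed = set(ALL_CONTROLS) - {"kill_switch_tested"}
print(readiness(passed))
# prints: (11, 'pilot')
```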

Case Study: A Top-Five US Bank's MCP Tunnel Deployment Path

A top-five US bank (publicly identified by Anthropic only as a "leading global financial institution" in customer reference calls, with details confirmed through industry analyst briefings) provides the clearest production blueprint for the new features. The bank had been piloting Claude agents for credit analyst workflows since Q4 2025 — pulling internal financial models, regulatory filings, and proprietary credit data into a Claude-driven decision support tool that summarized borrower risk profiles in minutes instead of hours.

The pilot worked technically. Analysts reported a 40% reduction in time-to-first-draft on credit memos. But the project was stuck in pre-production for four months because the architecture required exposing an internal MCP server — wrapping a proprietary risk model and internal data warehouse — to Anthropic's public endpoint. The bank's CISO refused to approve the perimeter exception. The vendor risk team flagged data residency. Compliance flagged the lack of audit trail integration with the bank's Splunk deployment.

With MCP tunnels in research preview, the bank's architecture team replaced the public endpoint with an outbound-only tunnel gateway deployed in their existing AWS VPC. Sandbox execution moved to a self-hosted Modal deployment in the same region. SIEM integration shipped agent and tunnel logs directly to Splunk. The control plane (orchestration, retries, the agent loop) continued to run on Anthropic.

Outcomes (first 60 days of production):

  • Time-to-production cut from 4 months blocked to 6 weeks shipped. The architecture review took 3 weeks instead of an indefinite block.
  • Audit posture improved. 12/12 on the readiness checklist above (versus 7/12 in the original pilot architecture).
  • Cost overhead: $14K/month in incremental sandbox and tunnel compute, against a projected $11M annual productivity gain across the 1,200-analyst pool.
  • Compliance signoff achieved under the bank's existing third-party AI risk framework — no exception process required.
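For scale, the cost and productivity figures above can be put on the same annual basis. This Python snippet only restates the article's numbers; it adds no new data.

```python
# Figures as reported in the case study above.
monthly_overhead_usd = 14_000
projected_annual_gain_usd = 11_000_000
analysts = 1_200

annual_overhead_usd = monthly_overhead_usd * 12
gain_per_analyst_usd = projected_annual_gain_usd / analysts
overhead_share = annual_overhead_usd / projected_annual_gain_usd

print(f"Annual overhead:        ${annual_overhead_usd:,}")
print(f"Projected gain/analyst: ${gain_per_analyst_usd:,.2f}")
print(f"Overhead as % of gain:  {overhead_share:.1%}")
```

On these numbers the incremental compute is $168,000 a year, roughly 1.5% of the projected gain. The gain is a projection, so the ratio is only as good as that estimate.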

The lesson is not that every enterprise will replicate this architecture. The lesson is that the four-month gap between technical pilot and compliance signoff is the actual bottleneck in enterprise agent deployment, and the new features collapse it.

What to Do About It

For CIOs: Score your top three agent projects against the Framework #1 decision matrix this week. For any project scoring 29+, request access to the MCP tunnels research preview through your Anthropic account team; capacity is limited and prioritized by use case. Refactor existing pilots that exposed internal MCP servers to public endpoints; that pattern is now technical debt.

For CISOs: Update your AI agent reference architecture to include the self-hosted sandbox + MCP tunnel pattern as the approved blueprint for high-sensitivity workloads. Add the 12-point readiness checklist to your AI deployment intake. Train your security review team on the difference between sandbox isolation, MCP tunnel architecture, and full on-premise (which Anthropic is not yet shipping). Don't conflate them; the controls differ.

For CFOs: Reframe the ROI conversation. The question is not "what does the sandbox cost." The question is "what does one $4.63M shadow AI breach cost, and how many projects are we one architectural choice away from that outcome." Add agent security infrastructure to your AI capex line, not opex — it is a risk-reduction investment with quantifiable downside avoidance.
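One way to frame that conversation is a break-even calculation: how likely would a breach have to be in a given year before the infrastructure pays for itself? The Python sketch below uses a deliberately simple expected-value model and assumes, optimistically, that the controls fully prevent the breach. The function name and the model are assumptions for illustration; the dollar figures come from the article.

```python
def break_even_probability(monthly_infra_usd: float, breach_cost_usd: float) -> float:
    """Annual breach probability at which infra spend equals expected breach loss.

    Simplifying assumption: the infrastructure fully prevents the breach.
    """
    return (monthly_infra_usd * 12) / breach_cost_usd


SHADOW_AI_BREACH_COST_USD = 4_630_000  # average shadow AI breach cost cited above

for monthly in (10_000, 20_000):  # the worksheet's $10-20K/month budget range
    p = break_even_probability(monthly, SHADOW_AI_BREACH_COST_USD)
    print(f"${monthly:,}/month breaks even at {p:.1%} annual breach probability")
```

At $20K/month, the spend breaks even if the annual chance of such a breach is about 5.2%; at $10K/month, about 2.6%. Real controls reduce risk rather than eliminate it, so treat these as lower bounds on the probability needed to justify the spend.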

For Business Leaders: Stop accepting "compliance is blocking us" as a permanent answer on AI agent projects. The architectural primitives now exist. If your security team is still blocking, the question becomes whether the use case is worth the work of the 12-point checklist, not whether the technology can support it. That is a different — and more productive — conversation.



THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe for weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.
