On February 28, 2026, an autonomous agent built by a startup called CodeWall broke into McKinsey's internal AI platform, Lilli, in under two hours. It walked away with 46.5 million chat messages, 728,000 files, 57,000 user accounts, and 3.68 million RAG document chunks — decades of proprietary research and the system prompts that governed how 43,000 McKinsey employees made strategic decisions. The attack used a vector that traditional scanners like OWASP ZAP did not flag: fifteen iterative blind SQL injections, each one informed by the error message returned by the previous attempt.
McKinsey was not unusual. According to research from the Cloud Security Alliance and Token Security published April 21, 2026, 65% of enterprises experienced at least one cybersecurity incident caused by an AI agent in the prior twelve months. The proximate cause is almost always the same: agents that operate with privileges no human employee would ever be granted, on top of governance frameworks designed for a world where every action started with a human login. On June 2, 2026, security vendor Noma announced Agent Access Control, the first dedicated platform built to discover, govern, and enforce policies on AI agents and Model Context Protocol (MCP) servers across the enterprise. The launch is a flag in the ground for a category that, until this year, did not exist.
What Changed: From "A Few Agents" to Hundreds, Without Governance
The trigger for Noma's launch — and the urgency behind the category — is a step-change in the number of autonomous actors inside the corporate network. In less than twelve months, organizations have gone from "experimenting with a handful of agents to running dozens, or even hundreds of them," each connecting to sensitive data sources and executing actions on behalf of users. That is the company's framing in its June 2 announcement, and the numbers in the broader market support it.
Non-human identities (NHIs) — the service accounts, API keys, OAuth tokens, and now AI agents that act without a human at the keyboard — already outnumber human identities by 45-to-1 in the average enterprise, according to Rubrik Zero Labs. In cloud-native and DevOps environments, Entro Labs puts the ratio at 144-to-1, growing 44% year-over-year. The average enterprise now manages more than 250,000 NHIs across cloud environments. Sixty-eight percent of IT security incidents now involve a machine identity, and 50% of enterprises have already taken a breach traceable to an unmanaged NHI.
Layered on top of that explosion is the rapid adoption of MCP, Anthropic's protocol for connecting agents to tools. MCP packages have crossed 150 million downloads, but the protocol's STDIO transport contains a design flaw that Ox Security disclosed in April 2026 — one that exposes roughly 200,000 servers to arbitrary command execution, hardening bypass, zero-click prompt injection, and marketplace poisoning. Anthropic declined to change the architecture, calling the behavior "expected." Major coding assistants — Windsurf, Cursor, Claude Code, and GitHub Copilot — were among the platforms named as vulnerable in the Hacker News writeup.
Noma's product attacks this exact gap. The platform's Enterprise Agentic Registry builds a real-time inventory of every agent and MCP server in the environment, recording ownership, permissions, and risk context. Each agent receives an Agent Identity — a distinct, attributable credential, rather than the shared keys it would otherwise inherit from the human who built it. Tool-Level Control lets a security team approve, block, or flag for review at the granularity of an individual tool, agent type, user, team, or environment ("approve this MCP server for engineering, block it for finance"). At runtime, Noma's AI Detection and Response (AI-DR) monitors the full behavioral chain — prompts, tool calls, data access, actions taken — and intervenes when it detects prompt injection, data exfiltration, or scope violations.
The product is built into 80+ data, AI, and MLOps platforms, supports both SaaS and on-prem deployments, and was named a Gartner AI TRiSM (Trust, Risk and Security Management) representative vendor before this launch. CEO Niv Braun framed the launch around what he calls the "Maker's Identity problem" — the tendency for agents to inherit the credentials of whoever built them, rather than receive least-privilege access of their own. "Knowing what each agent is authorized to do is the foundation," Braun said in the launch. "But agents are also influenced by everything they encounter at runtime. Complete governance means defining the rules and continuously verifying they hold."
Why This Matters: The Dual-Audience Problem
The reason this category is forming now is that AI agents broke a foundational assumption in enterprise security, and the fix has both a technical and a financial story.
For the CISO, CIO, and CTO, the issue is that traditional Identity and Access Management (the Oktas, Entras, and Pings of the world) was built around a session boundary. A human logs in, gets a token, and that token expresses what they can do for the next eight hours. Authorization is a one-time check at the gate. AI agents do not work that way. An agent might launch with a benign prompt, then read a poisoned document that rewrites its goals mid-run, and then call a tool the user never intended. The "authorize once" model is structurally wrong for an actor whose intent can change between one tool call and the next. Gartner named "Identity and Access Management Adapts to AI Agents" as a top-six cybersecurity trend for 2026, specifically calling out identity registration, credential automation, and policy-driven authorization for machine actors as failures-in-progress.
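The difference between a one-time gate check and per-action authorization can be sketched in a few lines. This is an illustration only, not any vendor's API: `AgentSession`, `ALLOWED_TOOLS`, and the tool names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical per-agent allowlist; agent and tool names are illustrative.
ALLOWED_TOOLS = {"report-agent": {"read_document", "summarize"}}


@dataclass
class AgentSession:
    agent_id: str
    authorized_at_login: bool = False
    denied: list = field(default_factory=list)

    def login(self) -> None:
        # "Authorize once": a single check at the gate.
        self.authorized_at_login = self.agent_id in ALLOWED_TOOLS

    def call_tool_session_model(self, tool: str) -> bool:
        # Session model: any tool call succeeds once the login check passed.
        return self.authorized_at_login

    def call_tool_per_action(self, tool: str) -> bool:
        # Per-action model: every tool call is re-checked against policy.
        allowed = tool in ALLOWED_TOOLS.get(self.agent_id, set())
        if not allowed:
            self.denied.append(tool)
        return allowed


session = AgentSession("report-agent")
session.login()
# A poisoned document steers the agent toward a tool it was never granted.
print(session.call_tool_session_model("send_email"))  # True
print(session.call_tool_per_action("send_email"))     # False
```

Under the session model the redirected call goes through because the only check happened at login; under the per-action model it is refused and recorded.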
That is why the controls Noma ships — registry, identity, tool-scoped policy, runtime detection — line up almost exactly with the three capability areas Gartner named in its first-ever Market Guide for Guardian Agents: AI visibility and traceability, continuous assurance and evaluation, and runtime inspection and enforcement. Bessemer Venture Partners' analysis of the category describes the same three-stage architecture: visibility, configuration, runtime. The pattern is converging.
For the CFO, COO, and CEO, the issue is that the cost of doing nothing has become measurable. IBM's 2025 Cost of a Data Breach Report puts the average shadow-AI breach at $4.63 million ($670,000 above the baseline), and the U.S. average breach now exceeds $10 million when regulatory penalties are included. The CSA/Token study cited above broke the 65% number down by impact: 61% of incidents involved sensitive data exposure, 43% caused operational disruption, 41% produced unintended actions across business processes, 35% drove direct financial losses, and 31% caused customer-facing service delays. The penalty regime is also escalating. The EU AI Act allows fines of up to €35 million or 7% of global annual revenue, and GDPR fines remain at €20 million or 4% of revenue, both of which the EU has begun to apply to AI-specific incidents. AI privacy incidents are up 56.4% year-over-year.
A second number matters for the CFO. Gartner's 2026 information security forecast puts total spend at $244.2 billion, growing 13.3% year-over-year, but notes that enterprises spend roughly 17 times more on AI tooling than on securing the AI they have already deployed. The same forecast estimates the AI Agent Management Platform market at $15 billion by 2029, up from less than $5 million today, an increase of at least 3,000x in four years. The CFOs who are paying attention are already asking why the security line in next year's budget does not match the agent line.
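The two ratios in that paragraph are easy to check. The sketch below uses only the figures quoted above. Treating "less than $5 million" as exactly $5 million is an assumption that makes the multiple a lower bound, and the security share is simply what a 17-to-1 split implies arithmetically.

```python
# Figures as quoted in the text above.
market_2029_usd = 15_000_000_000  # $15 billion by 2029
market_today_usd = 5_000_000      # "less than $5 million", taken at its upper bound

multiple = market_2029_usd / market_today_usd
print(multiple)  # 3000.0 (a lower bound, since today's figure is below $5 million)

# A 17:1 tooling-to-security ratio means security gets 1 part in 18 of the combined spend.
security_share = 1 / (17 + 1)
print(round(security_share * 100, 1))  # 5.6
```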
Market Context: A Crowded Race Starting From Zero
Three groups are competing for the agent-governance budget that is about to open up.
Specialist NHI/agent-identity startups sit at the center. Oasis Security raised a $120M Series B from Craft Ventures on a platform built around zero standing privilege, intent-based access, and continuous verification. Astrix Security, reported in Bank Info Security to be a Cisco acquisition target, offers discovery, governance, and threat detection across AWS, GCP, Azure, and SaaS. Entro Security focuses on the credential layer. Noma's June 2 launch puts it in the same neighborhood but positions deeper on the MCP-specific governance gap.
Traditional IAM vendors are extending. Okta released Auth0 for AI Agents late last year and is now framing itself as the agent attestation authority. Microsoft has extended Entra into agent identity. Ping is building agent attestation into its provider registry. The trade-off, as a recent SACR analysis puts it, is that the legacy IAM stack was built for a human login boundary — extending it to agents is possible but architecturally awkward.
Platform vendors are bundling. Snowflake and Anthropic announced a partnership for Claude in Snowflake Cortex AI to deliver "trusted, production-ready AI agents at scale," and Snowflake separately added the Natoma MCP gateway covering more than 9,400 servers. Salesforce, ServiceNow, and Workday are each shipping native agent governance inside their own surfaces. The question for any enterprise running a polyglot agent stack is whether to standardize on one of the platform offerings or run a horizontal control plane like Noma or Astrix on top.
The maturity of the category should not be overstated. Gartner explicitly notes that "today, guardian agent deployments are mainly prototypes or pilots." Only 12% of organizations have a dedicated AI governance structure; 55% have no framework at all. Eighty-six percent lack visibility into AI data flows. Just 19% of enterprises classify agents as equivalent to human insiders in their governance frameworks — a gap that, given the McKinsey breach, is now a board-level question rather than a security-team one. The vendors will compete; the buyer needs a framework to evaluate them.
Framework #1: The 25-Point AI Agent Access Control Readiness Assessment
This is the question every CIO and CISO should be able to answer in a single meeting: where does our agent governance actually stand? Score five dimensions, five points each, total out of 25.
Dimension 1 — Inventory & Discovery (0-5 points)
| Score | Capability |
|---|---|
| 0 | No idea how many agents or MCP servers are running |
| 1 | Manual spreadsheet, updated quarterly |
| 2 | Automated discovery for one platform (e.g., AWS) |
| 3 | Automated discovery across cloud + SaaS + code repos |
| 4 | Real-time agentic registry covering 80%+ of estate |
| 5 | Complete continuous registry with ownership, risk context |
Dimension 2 — Agent Identity (0-5 points)
| Score | Capability |
|---|---|
| 0 | Agents inherit creator credentials (Maker's Identity problem) |
| 1 | Shared service accounts for agents |
| 2 | Per-agent credentials, manually issued |
| 3 | Automated agent credential issuance, no scoping |
| 4 | Per-agent identity with least-privilege scoping |
| 5 | Per-agent identity + automated credential rotation + attestation |
Dimension 3 — Policy Granularity (0-5 points)
| Score | Capability |
|---|---|
| 0 | Allow-all or block-all decisions only |
| 1 | Policies at the application level |
| 2 | Policies at the MCP-server level |
| 3 | Policies at the tool-within-server level |
| 4 | Tool + agent type + user + team + environment policies |
| 5 | Three-state model (approved / requires review / blocked) at every level |
Dimension 4 — Runtime Enforcement (0-5 points)
| Score | Capability |
|---|---|
| 0 | No runtime monitoring of agent behavior |
| 1 | Logging only, no detection |
| 2 | Detection for known patterns (data exfil signatures) |
| 3 | Prompt injection + scope violation detection |
| 4 | Full behavioral chain monitoring with anomaly detection |
| 5 | Mid-workflow intervention (kill specific actions, not whole sessions) |
Dimension 5 — Containment & Audit (0-5 points)
| Score | Capability |
|---|---|
| 0 | Cannot terminate a misbehaving agent |
| 1 | Manual termination via platform console (hours) |
| 2 | Automated termination on hard rules (minutes) |
| 3 | Termination + comprehensive logs across data channels |
| 4 | Termination + audit-grade logs + replay capability |
| 5 | Termination + audit + automatic root-cause attribution |
Scoring:
- 0-9 points: Critical. You are in the 60% that cannot terminate a misbehaving agent. A McKinsey-style incident is a matter of when, not if. Stop new agent deployments until you reach 15.
- 10-14 points: Low maturity. You have visibility but cannot enforce. Focus the next quarter on identity (Dimension 2) and runtime (Dimension 4).
- 15-19 points: Medium maturity. You can prevent most incidents but will lose forensics on the ones that slip through. Invest in audit (Dimension 5).
- 20-25 points: High maturity. You are in the top 5% of the market today. Maintain pace as agents proliferate.
The 65% incident rate maps almost exactly to the percentage of organizations scoring below 15. Score honestly.
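The scoring rule can be expressed as a small function. This is a sketch of the rubric above and nothing more: the dimension keys and the `readiness` function name are made up for illustration, and the band boundaries are copied from the scoring list.

```python
DIMENSIONS = (
    "inventory_discovery",
    "agent_identity",
    "policy_granularity",
    "runtime_enforcement",
    "containment_audit",
)


def readiness(scores: dict) -> tuple:
    """Return (total, band) for five dimension scores, each an integer 0-5."""
    for name in DIMENSIONS:
        value = scores[name]  # raises KeyError if a dimension is missing
        if not isinstance(value, int) or not 0 <= value <= 5:
            raise ValueError(f"{name} must be an integer from 0 to 5")
    total = sum(scores[name] for name in DIMENSIONS)
    if total <= 9:
        band = "Critical"
    elif total <= 14:
        band = "Low maturity"
    elif total <= 19:
        band = "Medium maturity"
    else:
        band = "High maturity"
    return total, band


example = {
    "inventory_discovery": 3,
    "agent_identity": 2,
    "policy_granularity": 2,
    "runtime_enforcement": 1,
    "containment_audit": 1,
}
print(readiness(example))  # (9, 'Critical')
```

An estate with cross-platform discovery but weak runtime and containment controls, as in the example, still lands in the Critical band.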
Framework #2: The 8-Week MCP Security Implementation Roadmap
Once you know where you are, the question is what to do next. This is a roadmap built around the controls Noma, Oasis, Astrix, and the major IAM platforms are converging on. It assumes a mid-sized enterprise (1,000-10,000 employees) with an existing IAM stack.
Weeks 1-2: Discovery & Baseline
- Deploy a horizontal agent discovery tool across cloud, SaaS, and code repos.
- Produce a single-page inventory: how many agents, which MCP servers, which credentials they use, who owns them.
- Map every agent to a business owner (no orphan agents).
- Success criterion: registry covers 80%+ of agents in production.
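The Weeks 1-2 deliverables reduce to a simple data shape. The sketch below is one possible layout, not a vendor schema: `AgentRecord`, its fields, and the sample agents are hypothetical, and `agents_in_production` is assumed to come from an independent estimate such as cloud billing or network telemetry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRecord:
    name: str
    mcp_servers: list
    credential: str
    owner: Optional[str]  # business owner; None means the agent is orphaned


def orphans(registry: list) -> list:
    """Names of agents with no business owner."""
    return [agent.name for agent in registry if not agent.owner]


def coverage(registry: list, agents_in_production: int) -> float:
    """Fraction of production agents that appear in the registry."""
    if agents_in_production <= 0:
        raise ValueError("agents_in_production must be positive")
    return len(registry) / agents_in_production


registry = [
    AgentRecord("invoice-bot", ["erp-mcp"], "agent:invoice-bot", "finance-ops"),
    AgentRecord("build-bot", ["github-mcp"], "agent:build-bot", "platform"),
    AgentRecord("notes-bot", ["drive-mcp"], "user:jdoe", None),
    AgentRecord("triage-bot", ["jira-mcp"], "agent:triage-bot", "support"),
]
print(orphans(registry))                           # ['notes-bot']
print(coverage(registry, agents_in_production=5))  # 0.8
```

In this example the registry just meets the 80% criterion but still contains one orphan, which also happens to be running on a human's credential.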
Weeks 3-4: Identity & Credential Hygiene
- Issue a distinct identity for every agent (no shared service accounts).
- Rotate every credential currently shared between agents and humans.
- Apply least-privilege scoping: each agent gets the minimum tools and data sources needed.
- Success criterion: zero agents using Maker's Identity credentials in production.
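A per-agent identity with least-privilege scoping might look like the sketch below. It is an illustration under stated assumptions: `AgentIdentity` and `issue_identity` are hypothetical names, the 24-hour lifetime is an arbitrary stand-in for whatever rotation interval applies, and a real deployment would issue credentials through its IAM or secrets platform instead of in-process.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    owner: str
    allowed_tools: frozenset
    token: str
    expires_at: datetime

    def can_use(self, tool: str, now: datetime) -> bool:
        # Valid only while unexpired, and only for explicitly granted tools.
        return now < self.expires_at and tool in self.allowed_tools


def issue_identity(agent_id, owner, tools, ttl_hours=24, now=None) -> AgentIdentity:
    """Issue a distinct, short-lived credential scoped to the listed tools."""
    now = now or datetime.now(timezone.utc)
    return AgentIdentity(
        agent_id=agent_id,
        owner=owner,
        allowed_tools=frozenset(tools),
        token=secrets.token_urlsafe(32),  # unique per agent, never shared
        expires_at=now + timedelta(hours=ttl_hours),
    )
```

Because the credential expires, rotation becomes a matter of reissuing, and an agent that needs a new tool has to be granted it explicitly.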
Weeks 5-6: Policy & Tool-Level Control
- Move from app-level to tool-level policies. For each MCP server, declare which tools are approved for which teams.
- Implement the three-state model (approved / requires review / blocked) for all new agent connections.
- Wire the policy engine into the agent platform (Claude Code, Cursor, Copilot, custom).
- Success criterion: every agent connection passes through a policy check at runtime.
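Tool-level policy with a three-state outcome can be sketched as a lookup keyed by server, tool, and team. The server and tool names below are hypothetical, and two choices are assumptions made for illustration and not drawn from any product: a tool-specific rule overrides a server-wide rule, and an unclassified connection defaults to "requires review".

```python
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    REVIEW = "requires review"
    BLOCKED = "blocked"


# Hypothetical policy table keyed by (MCP server, tool, team); "*" matches any tool.
POLICIES = {
    ("github-mcp", "*", "engineering"): Decision.APPROVED,
    ("github-mcp", "delete_repo", "engineering"): Decision.REVIEW,
    ("github-mcp", "*", "finance"): Decision.BLOCKED,
}


def evaluate(server: str, tool: str, team: str) -> Decision:
    # Try the tool-specific rule first, then fall back to the server-wide rule.
    for key in ((server, tool, team), (server, "*", team)):
        if key in POLICIES:
            return POLICIES[key]
    # Assumed default: connections nobody has classified are held for review.
    return Decision.REVIEW


print(evaluate("github-mcp", "create_issue", "engineering").value)  # approved
print(evaluate("github-mcp", "delete_repo", "engineering").value)   # requires review
print(evaluate("github-mcp", "create_issue", "finance").value)      # blocked
```

The same table shape extends to agent type, user, or environment by widening the key.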
Weeks 7-8: Runtime Enforcement & Drill
- Turn on runtime behavioral monitoring (prompt injection, scope violations, data exfiltration patterns).
- Run a tabletop exercise: simulate the McKinsey scenario in a non-prod environment. Can you detect it? Can you stop it? How long does the kill take?
- Stand up a comprehensive audit log across all agent sessions, retained for the regulatory window that applies to you (GDPR, HIPAA, SOX, EU AI Act).
- Success criterion: a simulated breach is detected and contained within 10 minutes, with audit-grade evidence.
Common challenges and fixes (in order of how often they appear in production):
- Agents running on their makers' credentials. Fix: issue every agent a managed identity of its own from the start. Treat the Maker's Identity problem as a single-day project, not a year-long migration.
- Policy proliferation. Fix: standardize on tool-level templates ("read-only finance," "engineering build agent") rather than writing bespoke policies per agent.
- Logging gaps. Fix: route every MCP call through a gateway that produces audit-grade logs, even if it adds latency. Sixty-seven percent of organizations cannot produce such logs today.
- Vendor sprawl. Fix: pick a horizontal control plane (Noma, Astrix, Oasis) and treat platform-native governance as a fallback, not a primary.
- Executive sponsorship gap. Fix: tie the project to a board-level metric (e.g., breach-cost exposure or EU AI Act readiness). Without that, budget will lose to feature work.
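The logging fix above can be illustrated with a gateway wrapper that writes to a hash-chained log. Reading "audit-grade" as tamper-evident is an interpretation, not a definition from the source, and `AuditLog` and `gateway_call` are hypothetical names. A production log would also carry timestamps and live in write-once storage.

```python
import hashlib
import json


class AuditLog:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": digest})

    def verify(self) -> bool:
        # Recompute the chain; any edited or reordered entry breaks it.
        prev = "0" * 64
        for entry in self.entries:
            body = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def gateway_call(log, agent_id, tool, args, handler):
    """Run a tool call through the gateway and log it whether it succeeds or fails."""
    record = {"agent": agent_id, "tool": tool, "args": args}
    try:
        result = handler(**args)
    except Exception as exc:
        log.append({**record, "ok": False, "error": str(exc)})
        raise
    log.append({**record, "ok": True})
    return result


log = AuditLog()
gateway_call(log, "build-bot", "add", {"a": 2, "b": 3}, lambda a, b: a + b)
print(log.verify())  # True
```

Editing any stored record after the fact changes its hash, so `verify()` returns `False` from that entry onward.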
Case Study: What the McKinsey Lilli Breach Should Teach Every CISO
The McKinsey breach, reconstructed by multiple security researchers including NeuralTrust and Salt Security, is the cleanest worked example of where this is going.
The attack vector. An autonomous agent built by CodeWall, not a human attacker, selected McKinsey as a target, mapped the attack surface, identified a SQL-injection-class vulnerability, and iterated through fifteen blind SQL injections, each one informed by the error from the prior attempt. OWASP ZAP, which McKinsey ran, did not detect any of them. The agent escalated to full read-write access to the production database in under two hours.
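The public accounts cited here do not describe the vulnerable query itself, so the sketch below shows only the general class of flaw and its standard remedy, using Python's built-in `sqlite3` module. The table, data, and function names are invented for illustration and have no connection to Lilli's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, owner TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO messages (owner, body) VALUES (?, ?)",
    [("alice", "q3 plan"), ("bob", "client memo")],
)


def search_unsafe(owner: str) -> list:
    # Vulnerable: caller input is spliced directly into the SQL text.
    query = f"SELECT body FROM messages WHERE owner = '{owner}'"
    return [row[0] for row in conn.execute(query)]


def search_safe(owner: str) -> list:
    # Parameterized: the value is bound separately and never parsed as SQL.
    rows = conn.execute("SELECT body FROM messages WHERE owner = ?", (owner,))
    return [row[0] for row in rows]


payload = "alice' OR '1'='1"
print(len(search_unsafe(payload)))  # 2: the injected OR clause matches every row
print(search_safe(payload))         # []: no owner has that literal name
```

An iterating agent probes for the first pattern by varying the payload and reading how the application responds. Parameter binding removes the injection point altogether, so nothing depends on a scanner recognizing each variant.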
What was lost. 46.5 million chat messages. 728,000 files. 57,000 user accounts. 3.68 million RAG document chunks. Most dangerously, write access to the system prompts that governed how Lilli responded to 43,000 employees making strategic decisions for clients. If the attacker had silently rewritten the prompts rather than exfiltrating data, every recommendation Lilli produced for months afterward could have been compromised — and McKinsey would not have known.
The lessons. First, traditional application security tooling was not designed for agentic attackers. Static analysis and signature-based scanners can be outlasted by a patient agent that keeps iterating until one variant slips past them. Second, system prompts are now critical assets and must be stored, audited, and accessed under the same controls as production code or customer data, not kept in a chatbot's working database. Third, the cost of "we did not know an agent could do this" is now public, and the headlines mention the firm's name.
The cost. McKinsey has not disclosed the financial impact, but the BVP analysis pegs comparable incidents at the high end of IBM's range: $4.63 million in direct cost, plus reputational damage. For a firm whose business is the trustworthy handling of client strategy, the reputational cost is the one that compounds.
The actionable takeaway is in the post-breach analysis from PointGuard AI: every control on the 25-point readiness assessment above would have either prevented or contained the breach. Lilli scored, generously, around 7. Three of the five dimensions were near zero.
What to Do About It
For CIOs and CISOs. Run the 25-point assessment against your current estate this quarter. If you score below 15, pause new agent deployments to non-engineering teams until you reach 15. Pilot a horizontal access-control platform (Noma, Astrix, or Oasis) against your three highest-risk agents — typically the ones touching customer data, source code, or financial systems. Build a guardian agents capability into the cybersecurity roadmap; this is no longer optional. If you are running MCP, audit your STDIO transport against the Ox Security advisory and pin every MCP package to a known-good version.
For CFOs. Ask three questions at the next budget review. How many agents do we run? What is the average shadow-AI breach cost for our size and industry? What is our exposure under the EU AI Act if an agent exfiltrates regulated data? If any of those answers is "we don't know," the security line for next year is going to be wrong. The 17x mismatch between AI tooling spend and AI security spend is the right metric to put on the slide.
For COOs and Business Leaders. Treat AI agents as production infrastructure, with named owners, defined constraints, and monitored behavior — in that order. Push back on any team proposing to roll out agents on shared credentials or with allow-all policies. The Bessemer playbook is clear: launch with minimal permissions and expand deliberately. Every agent that ships without an owner becomes a McKinsey-class incident waiting for a CodeWall-class attacker.
The category that Noma's June 2 launch helped define did not exist eighteen months ago. The 65% of enterprises that have already taken an incident are the proof that it should have.
