Two-thirds of the people accountable for enterprise AI cannot actually see what it is doing.
That is the uncomfortable headline buried inside IBM's latest CIO/CTO study, released June 8, 2026. The IBM Institute for Business Value, working with Oxford Economics, surveyed 2,000 senior technology executives across 33 geographies and 19 industries between January and April. The verdict is consistent across every region and sector they touched: 67% of CIOs and CTOs report they are personally accountable for AI systems they do not fully control. Only 11% feel ready for the scale of agent deployment their own CEOs are demanding. And 70% admit business units are now deploying AI faster than IT can track.
This is not a story about emerging risk. It is a story about an accountability gap that has already opened, while spending on AI heads from 15% of IT budgets in 2025 to nearly 25% by 2027 — a 71% increase that lands squarely on the same executives who say they cannot see what they are buying. For CIOs, CFOs, and boards trying to decide what to do this quarter, the IBM data is the clearest snapshot yet of where the operating model is breaking — and what the top performers are doing differently.
What the IBM Study Actually Found
The headline number — 67% accountable for systems they cannot fully control — is the lead. But the rest of the data set is where the operational picture sharpens.
Adoption is outpacing oversight by every measure. 77% of surveyed executives report AI adoption is moving faster than their governance capability. 80% say they are operating under a CEO-driven AI transformation mandate. Only 11% describe themselves as "fully prepared" for the 38% increase in deployed AI agents that they themselves expect by 2027. That spread, 80% mandated against 11% ready, is the cleanest definition of what IBM calls the control gap.
The financial blind spot is just as wide. 84% of tech CxOs have not operationalized financial management for AI. 85% lack real-time visibility into AI spend. This is happening as global AI spending climbs to a forecasted $2.59 trillion in 2026, per Gartner — a 47% jump year-over-year, with AI agent software alone hitting $206.5 billion in 2026 and $376.3 billion in 2027.
Incidents are already routine. Surveyed organizations reported an average of 54 AI agent incidents per year — events where an unintended or harmful outcome required human correction. Of those, 17% were high-severity (more than four hours to contain). The breakdown of incident types is the part that should land in every board deck:
- 37% resulted in data exposure or security breaches
- 33% caused cascading system failures
- 17% triggered compliance issues
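For a board deck, the percentages are easier to read as counts. The sketch below converts the survey averages into expected incidents per year for an average organization. It assumes the incident-type shares apply to all 54 incidents rather than only to the high-severity subset, a reading the figures above do not state explicitly, and the variable names are illustrative.

```python
# Survey averages quoted above; variable names are illustrative.
INCIDENTS_PER_YEAR = 54
HIGH_SEVERITY_SHARE = 0.17

# Assumption: each share is a fraction of all incidents, not only high-severity ones.
INCIDENT_TYPE_SHARES = {
    "data exposure or security breach": 0.37,
    "cascading system failure": 0.33,
    "compliance issue": 0.17,
}


def expected_count(share: float, total: int = INCIDENTS_PER_YEAR) -> float:
    """Convert a percentage share into an expected yearly incident count."""
    return round(share * total, 1)


high_severity = expected_count(HIGH_SEVERITY_SHARE)
by_type = {name: expected_count(share) for name, share in INCIDENT_TYPE_SHARES.items()}
# Share of incidents that the three listed categories do not account for.
unlisted_share = round(1 - sum(INCIDENT_TYPE_SHARES.values()), 2)

print(f"High-severity incidents per year: {high_severity}")
for name, count in by_type.items():
    print(f"{name}: {count}")
print(f"Share not covered by the three listed types: {unlisted_share:.0%}")
```

On those assumptions, an average organization would see roughly nine high-severity incidents and about twenty data-exposure incidents a year.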
Security and compliance is now the top scaling barrier, named by 59% of respondents. That tracks with separate data showing shadow AI breaches cost an average of $670,000 more than standard breaches, and that 98% of organizations report some level of unsanctioned AI use today.
The high performers look completely different. The most important number IBM published is the gap between organizations that bolt governance on after the fact and those that embed control directly into their AI systems. The embedded-control group runs:
- 25% fewer incidents than peers using manual governance
- 16x more AI agents deployed
- 18% higher operating margins
- 4x lower AI budget spend relative to scale
- 2.4x more AI agents within the same budget envelope
- 10% higher ROI when systems are designed for workload portability
That is not a marginal advantage. It is two different operating models — and the gap is widening every quarter that the laggards keep treating AI governance as a documentation exercise.
Why This Matters — For CTOs and CFOs Alike
Technical Implications (CTO / CIO)
The IBM data is essentially a verdict on AI architecture choices made over the last 18 months. Most enterprise AI portfolios were assembled in pieces — a chatbot here, a copilot there, an agent platform somewhere else — without a unifying control plane. The result is what IBM CIO Matt Lyteson described in the report: governance bolted on after the fact, running at human speed, against systems that run at machine speed.
That mismatch is now showing up in the incident logs. When 37% of AI incidents result in data exposure and 33% cause cascading failures, the underlying technical problem is almost always the same: agents have access to systems and data their controllers cannot fully audit in real time. Logging is incomplete. Identity boundaries are weak. Tool-use telemetry is fragmented across vendors. Manual governance — committees, quarterly reviews, attestation forms — cannot keep up with workloads that spin up new model calls thousands of times an hour.
The architectural answer the IBM report points to is unmistakable: embed control as a primitive of the AI platform, not a wrapper around it. That means identity-aware agent runtimes, native policy engines that travel with the workload, evaluation pipelines that run continuously rather than at release, and a single observability layer that spans every model and every agent — including the ones business units shipped without telling IT. Without that, the 70% "business is deploying faster than IT can track" finding will keep growing.
Business Implications (CFO / CMO / COO)
The financial story is sharper. AI spend is on its way to a quarter of the entire IT budget, but 85% of the people writing the checks cannot see, in real time, where the money is going. That is a degree of opacity that no other meaningful enterprise spend category tolerates. No CFO accepts 85% blindness on cloud, on contractor spend, or on travel.
The risk is twofold. First, dollars are being spent on agents and pilots that nobody is measuring against a business KPI. McKinsey's most recent state of AI work found that top performers are nearly three times more likely to fundamentally redesign workflows around AI — 55% versus 20% for the rest — and workflow redesign is the single largest driver of EBIT impact from generative AI. Spend without redesign is largely sunk cost.
Second, boards and audit committees are starting to ask questions. McKinsey's 2026 board-governance work argues that AI governance is moving into the same review tier as financial audit and executive succession. Gartner, in a separate May 2026 analysis, predicts that by 2027, 40% of enterprises will demote or decommission autonomous AI agents because of governance gaps identified only after production incidents. The CFO impact is direct: those decommissioned agents represent stranded investment, sunk transformation budget, and, in the worst cases, regulatory penalty.
Market Context: The IBM Findings Are Not an Outlier
The IBM study lands in a market that has been quietly converging on the same finding for nine months.
Gartner. Gartner forecasts that 40% of enterprise applications will integrate task-specific AI agents by end of 2026, up from less than 5% at the start of the year. But Gartner's governance pessimism is hard to overstate: more than 40% of agentic AI projects will be canceled by the end of 2027, the firm warns, with the three drivers being escalating costs, unclear business value, and inadequate risk controls. Gartner's prescription — and this matches the IBM control-gap framing — is to abandon uniform governance and instead apply proportional governance that classifies agents by autonomy level, with each level mapped to a different trust boundary and a different set of controls.
McKinsey. McKinsey's 2026 AI Trust Maturity Survey reports that average Responsible AI maturity rose from 2.0 to 2.3 year-over-year, but only about a third of organizations reach maturity 3 or higher in strategy, governance, and agentic AI governance specifically. The McKinsey takeaway echoes IBM's: "the tools for building AI systems have outpaced the tools for governing them."
Stanford. Stanford's Enterprise AI Playbook, which analyzed 51 successful enterprise deployments earlier this year, identifies operating-model change — not model selection — as the dominant predictor of successful scaling.
Shadow AI data. Separate research compiled by Authentech, Vectra, and others puts unsanctioned AI use at 98% of enterprises, with 86% of organizations reporting no visibility into the resulting data flows and only 24.4% reporting full visibility into AI agent communications. That is the operational fabric the IBM 70% finding actually describes.
The signal is consistent across every credible analyst tracking the space: enterprises are not failing at AI model quality. They are failing at the operating model around AI.
Practical Framework #1: The 25-Point AI Control Gap Assessment
The IBM data implies a five-dimensional definition of "control." Below is a 25-point readiness assessment your team can run in 45 minutes: five dimensions, five capabilities each, one point per capability, so each dimension scores 0 to 5 and the total runs from 0 to 25.
Dimension 1: Visibility (5 points)
Score 1 point for each capability your enterprise can actually demonstrate today (the same rule applies to all five dimensions):
- Real-time inventory of every AI agent and model in production, regardless of which business unit deployed it
- Per-agent cost telemetry feeding into the finance system within 24 hours
- Continuous logging of every model call, with prompt and output captured
- Single dashboard for AI incident reporting across all platforms
- Quarterly third-party discovery scan of shadow AI usage
Dimension 2: Identity & Access (5 points)
- Every AI agent has a unique service identity, not a shared key
- Agent-to-data access is mediated by short-lived tokens, not standing credentials
- Least-privilege scoping enforced at the tool/MCP/connector layer
- Identity events (agent creation, scope change) flow into the SIEM
- Joiner-mover-leaver process covers agent identities, not just humans
Dimension 3: Policy & Guardrails (5 points)
- Written acceptable-use policy specific to AI agents (not generic AUP)
- Runtime policy engine that blocks high-risk actions before they execute
- Data classification enforced at the prompt layer (no PII into ungoverned models)
- Per-agent risk tier with proportional controls (Gartner-style autonomy levels)
- Documented exception process with named approver and time limit
Dimension 4: Financial Control (5 points)
- AI spend categorized in the GL at the same granularity as cloud
- Budget owner identified for every model, agent, and pilot
- Anomaly detection on AI spend (spike alerts to FinOps)
- ROI hypothesis documented before pilot kickoff, measured after
- Sunset criteria written for every AI project at the time of approval
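The spike-alert item can start simpler than it sounds. A minimal sketch, assuming daily AI spend for an agent or business unit is already available as a list of numbers: flag any day that exceeds the trailing average by a set margin. The function name, the seven-day window, the three-sigma threshold, and the 5% floor are illustrative choices, not anything IBM or a particular FinOps tool prescribes.

```python
from statistics import mean, stdev
from typing import List, Sequence


def spend_spikes(
    daily_spend: Sequence[float],
    window: int = 7,
    threshold: float = 3.0,
) -> List[int]:
    """Return the indices of days whose spend exceeds the trailing baseline.

    A day is flagged when it is above mean + threshold * spread of the
    previous `window` days. The spread is floored at 5% of the mean so a
    very flat history does not flag trivial changes.
    """
    spikes = []
    for day in range(window, len(daily_spend)):
        history = daily_spend[day - window:day]
        baseline = mean(history)
        spread = max(stdev(history), 0.05 * baseline)
        if daily_spend[day] > baseline + threshold * spread:
            spikes.append(day)
    return spikes


# Hypothetical daily spend in dollars for one agent; day 8 is a runaway loop.
spend = [100, 104, 98, 101, 103, 99, 102, 100, 340, 105]
print(spend_spikes(spend))
```

One design caveat: a flagged day stays in the trailing window and inflates the baseline for the following week, so a production version would likely exclude confirmed spikes from the history.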
Dimension 5: Incident Readiness (5 points)
- AI incident response runbook tested in the last six months
- Defined severity tiers (for example, IBM's high-severity bar of more than four hours to contain)
- Human-in-the-loop circuit breakers on autonomous actions
- Post-incident review process that updates policy, not just patches
- Board-level reporting on AI incident trend, not just count
Scoring band:
- 0–9: Critical Gap. You are part of the 89% that IBM says is not ready. Pause new agent rollouts until visibility and identity are addressed.
- 10–14: Early Stage. Typical of organizations IBM describes as operating with manual governance. Use IBM's survey-wide averages as a planning baseline: roughly 54 incidents per year, 17% of them high-severity.
- 15–19: Operationalizing. You are starting to look like the embedded-control cohort. ROI lift is visible.
- 20–25: Embedded Control. This is the 11% IBM identifies as fully ready. Continue investment in observability and policy automation; you are positioned to scale 16x more agents than peers.
The first time most enterprises run this, the score is in the low teens. The shock is useful: it concentrates the next quarter's roadmap on the dimensions that are actually empty.
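The scoring rules are simple enough to put in a script, which makes the Week 1 and Week 12 runs directly comparable. A minimal sketch, assuming each dimension is recorded as five yes/no answers: the dimension keys and function names are hypothetical, while the band boundaries are the ones listed above.

```python
from typing import Dict, List, Tuple

# Hypothetical keys for the five dimensions described above.
DIMENSIONS = [
    "visibility",
    "identity_access",
    "policy_guardrails",
    "financial_control",
    "incident_readiness",
]

# Scoring bands: (lowest score, highest score, label).
BANDS = [
    (0, 9, "Critical Gap"),
    (10, 14, "Early Stage"),
    (15, 19, "Operationalizing"),
    (20, 25, "Embedded Control"),
]


def score_assessment(answers: Dict[str, List[bool]]) -> Tuple[int, Dict[str, int], str]:
    """Return (total score, per-dimension scores, band label).

    `answers` maps each dimension to five booleans, one per capability,
    True when the capability can be demonstrated today.
    """
    per_dimension = {}
    for name in DIMENSIONS:
        checks = answers[name]
        if len(checks) != 5:
            raise ValueError(f"{name} needs exactly 5 answers, got {len(checks)}")
        per_dimension[name] = sum(1 for ok in checks if ok)
    total = sum(per_dimension.values())
    band = next(label for low, high, label in BANDS if low <= total <= high)
    return total, per_dimension, band


def emptiest_dimensions(per_dimension: Dict[str, int], count: int = 2) -> List[str]:
    """List the lowest-scoring dimensions, the candidates for next quarter's roadmap."""
    return sorted(per_dimension, key=lambda name: per_dimension[name])[:count]


# Hypothetical first-run answers for illustration only.
example = {
    "visibility": [True, False, True, False, False],
    "identity_access": [True, True, False, False, False],
    "policy_guardrails": [True, False, False, True, False],
    "financial_control": [False, False, False, True, False],
    "incident_readiness": [True, True, True, False, False],
}
total, per_dimension, band = score_assessment(example)
print(total, band)  # prints: 10 Early Stage
print(emptiest_dimensions(per_dimension))
```

Keeping the per-dimension scores alongside the total matters more than the band label: the total tells the board where you stand, while the lowest dimensions tell the team where to start.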
Practical Framework #2: 12-Week Control-Gap Closure Plan
Once the assessment surfaces the gaps, the next question is sequencing. Below is a 12-week plan structured around the IBM finding that organizations with embedded control deploy 16x more agents than those with bolted-on governance, at 4x lower AI budget spend relative to scale.
Weeks 1–3: Discover
- Week 1: Run the 25-point assessment with CIO, CISO, CFO, and at least two business-unit AI leaders in the room.
- Week 2: Commission a one-time shadow AI discovery — endpoint, network, and SaaS-side. Expect the resulting list to be 3–5x what IT currently tracks.
- Week 3: Build the master inventory: every model, every agent, every connector, owner, business case, and current spend.
Success criteria: Single inventory, fewer than 30 unknown agents remaining, named owner for every line item.
Weeks 4–6: Stabilize
- Week 4: Issue per-agent service identities. Rotate or revoke every shared key found in discovery.
- Week 5: Stand up centralized logging — every model call, every tool call, captured to a single store.
- Week 6: Define and publish three autonomy tiers (read-only, sandboxed action, production action) with the controls each tier requires.
Success criteria: Zero shared keys in production; >90% of agents mapped to a tier; logging coverage above 80%.
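The Week 6 tier definition is easier to enforce when it lives in code rather than in a slide. The sketch below models the three tiers named above and checks an agent against the controls its tier requires. The control names assigned to each tier are assumptions made for illustration; the plan names the tiers but leaves the specific controls to each organization. The agent names are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import FrozenSet, Iterable, Optional


class AutonomyTier(IntEnum):
    READ_ONLY = 1
    SANDBOXED_ACTION = 2
    PRODUCTION_ACTION = 3


# Assumed control sets: each tier inherits the controls of the tier below it.
_BASE = frozenset({"service_identity", "call_logging"})
_SANDBOX = _BASE | {"short_lived_tokens"}
_PRODUCTION = _SANDBOX | {"runtime_policy", "human_circuit_breaker"}

REQUIRED_CONTROLS = {
    AutonomyTier.READ_ONLY: _BASE,
    AutonomyTier.SANDBOXED_ACTION: _SANDBOX,
    AutonomyTier.PRODUCTION_ACTION: _PRODUCTION,
}


@dataclass(frozen=True)
class Agent:
    name: str
    tier: Optional[AutonomyTier]  # None means not yet mapped to a tier
    controls: FrozenSet[str] = frozenset()


def missing_controls(agent: Agent) -> FrozenSet[str]:
    """Controls the agent's tier requires but the agent does not yet have."""
    if agent.tier is None:
        raise ValueError(f"{agent.name} has not been mapped to a tier")
    return REQUIRED_CONTROLS[agent.tier] - agent.controls


def mapped_share(agents: Iterable[Agent]) -> float:
    """Fraction of agents that have been assigned a tier."""
    agents = list(agents)
    if not agents:
        return 0.0
    return sum(1 for agent in agents if agent.tier is not None) / len(agents)


fleet = [
    Agent("invoice-router", AutonomyTier.PRODUCTION_ACTION, _BASE),
    Agent("faq-bot", AutonomyTier.READ_ONLY, _BASE),
    Agent("marketing-copilot", None),
]
print(f"Mapped to a tier: {mapped_share(fleet):.0%}")
print(sorted(missing_controls(fleet[0])))
```

The `mapped_share` figure is the number to hold against the 90% success criterion, and a non-empty `missing_controls` result is a reasonable trigger for keeping an agent out of a higher tier.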
Weeks 7–9: Embed
- Week 7: Deploy a runtime policy engine. Start with two non-negotiable blocks: PII out of ungoverned models, and production writes from non-production agents.
- Week 8: Wire AI spend into the FinOps platform with the same granularity as cloud. Turn on anomaly alerts.
- Week 9: Pilot a continuous-evaluation harness on the three highest-value agents.
Success criteria: Policy engine blocking real events in production; AI spend visible in the FinOps dashboard within 24 hours of incurring.
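The two Week 7 blocks can be expressed as a pre-execution check that every agent action passes through. This is a minimal sketch under stated assumptions, not a production policy engine: the `Action` fields and the `evaluate` function are hypothetical, and the two regular expressions are deliberately crude stand-ins for a real data-classification service.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Crude illustrative PII patterns; a real deployment would call a classifier.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # US SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email address
]


@dataclass(frozen=True)
class Action:
    agent_env: str               # environment the agent runs in, e.g. "prod"
    kind: str                    # "model_call" or "write"
    target_env: str = ""         # environment being written to, for writes
    model_governed: bool = True  # whether the called model is under governance
    prompt: str = ""             # prompt text, for model calls


def evaluate(action: Action) -> Optional[str]:
    """Return a block reason, or None when the action is allowed to run."""
    # Block 1: no PII into ungoverned models.
    if action.kind == "model_call" and not action.model_governed:
        if any(pattern.search(action.prompt) for pattern in PII_PATTERNS):
            return "PII sent to ungoverned model"
    # Block 2: no production writes from non-production agents.
    if action.kind == "write" and action.target_env == "prod" and action.agent_env != "prod":
        return "production write from non-production agent"
    return None


checks = [
    Action("prod", "model_call", model_governed=False, prompt="Reply to jane.doe@example.com"),
    Action("prod", "model_call", model_governed=True, prompt="Reply to jane.doe@example.com"),
    Action("staging", "write", target_env="prod"),
    Action("prod", "write", target_env="prod"),
]
for action in checks:
    print(action.kind, action.agent_env, "->", evaluate(action) or "allowed")
```

The check returns a reason rather than a bare boolean so that every block can be logged and counted, which is what the "blocking real events in production" success criterion needs as evidence.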
Weeks 10–12: Govern
- Week 10: Run a tabletop exercise on the AI incident runbook with the actual on-call team.
- Week 11: Build the board-level dashboard: incident count by severity, spend trend by business unit, ROI realization vs. business case.
- Week 12: Re-run the 25-point assessment. Compare to Week 1.
Success criteria: Re-assessment shows a minimum 8-point improvement; first board readout delivered.
The common failure modes are predictable: skipping Weeks 1–3 because "we already know what we have" (you don't — see the 70% IBM finding); trying to deploy a policy engine before the inventory is clean; treating Week 10's tabletop as optional. Each of these failure modes maps directly to one of the incident categories IBM measured.
Case Study: Allianz Spain and the Light-and-Dark Side
The IBM report contains an unusually candid quote from Victoria Medina, Chief Technology Officer of Allianz Spain: "AI has both a light side and a dark side. Many organizations are more exposed than they realize." Allianz's response, profiled in the IBM data set, is one of the cleanest illustrations of the embedded-control model.
The starting point was familiar. Allianz had multiple business units deploying AI capabilities — claims triage, underwriting copilots, customer-service automation — each with its own model vendor, its own access pattern, and its own logging. The risk picture was opaque even to the CTO. The "light side" — measurable productivity lift in claims handling — was visible in quarterly metrics. The "dark side" — exposure to data leakage, model drift, and uncontrolled agent action — was visible only after near-misses.
The Allianz Spain approach mirrored the IBM high-performer profile. They embedded controls at the platform layer — a unified policy and observability fabric across vendors — rather than relying on each business unit to govern its own agents. They invested in workload portability so that models and agents were not locked to a single vendor's control plane, an architectural choice IBM found correlated with 10% higher ROI. And they treated governance as a product, with a roadmap, a backlog, and continuous deployment, rather than as an annual review cycle.
The IBM data set does not publish line-item financials for Allianz, but the cohort effect is unambiguous: enterprises that followed this pattern deployed 16x more agents than peers, ran 25% fewer incidents, and operated at 18% higher margins. Allianz, Airbus, Banco BPI, Baylor Scott & White Health, Roush, and Volkswagen Group UK — all named in the IBM report — describe versions of the same operating model: control as a primitive, not an overlay.
For comparison, Booking Holdings publicly targeted $450 million in savings from AI-driven workflow redesign across customer service and backend operations. The mechanism in both cases is the same: the ROI gap comes down to whether AI is treated as a feature add or as a re-architected operating model.
What to Do About It
For CIOs (Technical Next Steps)
Run the 25-point assessment this week. Do not delegate it — the data IBM published makes clear that the score will be lower than executives believe, and the surprise is part of the value. Then build the inventory; you cannot govern what you cannot list. Prioritize identity and centralized logging before policy enforcement, because policy without visibility is performative. Set a Q3 target of moving from the early-stage band (10–14) to operationalizing (15–19); that is a realistic single-quarter jump if the executive cover is there.
For CFOs (Financial Next Steps)
Demand FinOps-grade visibility on AI spend at the same cadence you have on cloud. If 85% of your peers cannot see their AI spend in real time, the differentiator is not exotic; it is just doing what you already do for IaaS. Tie every AI pilot budget to a written ROI hypothesis and a written sunset date; the Gartner forecast that more than 40% of agentic AI projects will be canceled by the end of 2027 is a budget-planning input, not a tail risk. Re-rate AI projects from "innovation" to "operational" in the GL once they exit pilot, so the spend lands where it should.
For Business and Board Leaders (Strategic Next Steps)
Treat the 11% readiness figure as a positioning opportunity, not a problem. The embedded-control cohort is small enough that disciplined execution over the next four quarters meaningfully changes competitive position — 16x more agents, 18% better margins, 4x more efficient spend. Ask the board AI committee for the 25-point score this quarter and the trend line next quarter. And reset the CEO-mandate framing: the goal is not "more AI faster"; it is "more controlled AI faster," because the IBM data is now unambiguous that the second one wins.
The AI control gap is not a new category of risk. It is the inevitable result of treating governance as paperwork while treating AI as infrastructure. The 11% who closed it are already pulling away. The remaining 89% have one more quarter, maybe two, to decide which side of that gap they want to be on.
