The numbers came out on May 5, 2026, and they should reset every CIO's agentic AI plan. Fivetran and Redpoint Ventures surveyed 400 data leaders across the United States, United Kingdom, EMEA, and Asia-Pacific and found that 60% of enterprises are investing millions to tens of millions in agentic AI, while only 15% have the data foundation to run it securely and effectively at scale (Fivetran 2026 Agentic AI Readiness Index). That is the 85/15 gap. It is not a model gap. It is not a talent gap. It is a plumbing gap, and it now explains nearly every "why didn't our pilot make it to production?" post-mortem in the enterprise.
The reason this matters now: 41% of those same enterprises already have agents in production despite the readiness gap. They are scaling on infrastructure they would not have approved if they had read the survey first. Gartner's complementary forecast — that over 40% of agentic AI projects will be canceled by the end of 2027 (Gartner press release, June 2025) — stops feeling like analyst pessimism once you map it onto Fivetran's data. The cancellations are not coming from bad models. They are coming from data foundations that cannot keep up with what an agent actually does.
This piece unpacks the survey, builds a 25-point readiness assessment so a CIO or CDO can score their organization in the next hour, and lays out a four-quarter remediation roadmap calibrated to where most enterprises actually start. If you are about to greenlight another agent pilot, run this first.
What the Fivetran Survey Actually Found
The methodology matters because it explains who the numbers describe. Redpoint Ventures fielded the survey for Fivetran in early 2026, sampling 400 data professionals — data architects, data engineers, analytics leaders, and decision-makers responsible for data infrastructure and AI systems. Coverage spans the United States, United Kingdom, EMEA, and Asia-Pacific, with company-size thresholds of 2,000+ employees in the US and EMEA and 500+ in Japan, Australia, and Singapore. Industries represented include technology, financial services, healthcare, retail, and manufacturing (UK Tech News breakdown).
The headline numbers are bracing:
- 15% of organizations are fully prepared to support agentic AI in production.
- 60% are already investing millions to tens of millions in agentic AI initiatives.
- 41% are running agents in production despite acknowledged gaps in data reliability, governance, and interoperability.
- 42% name data quality and lineage as their primary obstacle.
- 39% name regulatory compliance and sovereignty as a primary obstacle.
- 39% name security and privacy risk.
- 65% would heavily restrict or block AI vendors that lack governance and sovereignty controls.
- 86% consider platform extensibility and interoperability important or critical, but only 17% treat interoperability as critical when actually picking tools.
- The average composite Agentic AI Readiness Index score is 61–62% — meaning the median enterprise is roughly three-fifths of the way to ready, with close to 40% of the foundation still missing.
The most uncomfortable contrast in the data is the ROI line. 98% of fully prepared organizations report strong confidence in their AI ROI. Only 16% of the least prepared report the same. The data foundation is not just a precondition for safety. It is a precondition for measurable returns (Fivetran blog).
George Fraser, CEO of Fivetran, framed the issue bluntly: "Most companies aren't failing at AI because of the models — they're failing because their data isn't ready." Co-founder and COO Taylor Brown was sharper still on the production risk: "The challenge with AI agents isn't that they make mistakes. It's that they can make the same mistake across a business, instantly and at scale."
Why This Matters: Two Audiences, One Bill
Technical Implications (CIO, CTO, CDO)
For technical leaders, the survey is a quiet warning that the entire integration pattern most enterprises used to scale chatbots will not scale to agents. Chatbots read. Agents write. They update CRMs, send messages, push quotes, file expense reports, modify ledgers. That pattern shift is what changes the data-foundation requirements.
A 2025-era data warehouse with batch ETL, partial lineage, and human-mediated governance was sufficient for a Q&A copilot. It is not sufficient for an agent that touches customer records before a human reviews the output. Datadog's April 2026 LLM Observability research surfaced the same pattern under a different label, calling it the "silent failure problem": agents return outputs that look correct while sourcing from stale, fragmented, or mis-permissioned data, and the failures are invisible until they compound (Datadog report coverage). Cloudera's separate 2026 research found that nearly 80% of enterprises say AI is held back by data access challenges specifically — fragmented sources, brittle pipelines, and unclear ownership (Cloudera press release, April 2026).
The four technical capabilities Fivetran identified as the foundation for ready-state organizations are not exotic. They are the base layer of any modern data platform: automated data movement that keeps information current, end-to-end lineage visibility, cross-system interoperability, and governance that defines exactly which agent can read which dataset and which actions require human review. The discipline is not new. The standard is.
Business Implications (CFO, CEO, COO)
For business leaders, the 85/15 gap is a budgeting problem disguised as a technology problem. Enterprises in the bottom quartile of the readiness index are spending similar amounts on agent tooling as those in the top quartile — and earning nothing like the same return. The 98% versus 16% ROI confidence split is the most expensive line item nobody is putting on a slide.
Gartner's broader 2025–2027 forecast lays out the destination: over 40% of agentic AI projects will be canceled by the end of 2027, driven by "escalating costs, unclear business value, or inadequate risk controls." Gartner also estimates that of the thousands of agentic AI vendors in market, only about 130 are real — the rest are repackaged RPA, chatbots, and AI assistants under what Gartner now calls "agent washing" (CDO Magazine summary). A CFO funding ten agent pilots in 2026 should expect, on a base-rate basis, four to be canceled, three to deliver disappointing returns, and three to deliver. The premium for being in the top group is not paid in algorithms. It is paid in pipelines.
The bigger strategic risk is reputational. Brown's quote about agents making the same mistake "instantly and at scale" is the failure mode regulators will eventually price in. If an agent miscategorizes a customer because lineage is broken and it fans that out across 50,000 records before anyone notices, the cost is not the rework — it is the regulatory disclosure, the customer-trust hit, and the one-line headline.
Market Context: Where the 85/15 Sits in 2026
The Fivetran finding does not sit alone. It is consistent with a stack of 2026 surveys all pointing to the same gap from different angles:
- Cloudera (April 2026): nearly 80% of enterprises say AI is held back by data access challenges (Cloudera).
- MIT NANDA (mid-2025), still widely cited: 95% of generative AI pilots produce zero measurable impact, with data quality and integration consistently cited as the top blocker.
- Writer (April 2026): 79% of enterprise AI adopters report significant challenges scaling, with data and governance leading the list (Writer survey).
- Snowflake (2026): organizations realizing measurable ROI from agents share a near-universal trait of automated data movement and unified governance.
- Datadog (April 2026): ~5% of agent requests already fail silently in production, with most root-causes traced to upstream data — not the LLM itself (BigDATAwire summary).
The analyst consensus around the 85/15 gap is converging. Gartner's view is that data limitations are a roadblock for eight in ten companies trying to scale agents, with fewer than 10% of enterprises achieving measurable scale even though nearly two-thirds are experimenting. Forrester and IDC are both publishing 2026 guidance that essentially says the same thing: the AI infrastructure gap is no longer about GPUs or model access — it is about whether the data-ready base layer exists. Where vendors disagree is on packaging. Fivetran sells the pipeline. Snowflake and MongoDB sell the unified store. Databricks sells the governed lakehouse. Atlan, Acceldata, and OvalEdge sell the metadata layer (Atlan AI governance framework). They are all pointing at the same hole. The CIO's job is to look at the hole, not at whose shovel.
Framework #1: The 25-Point Agentic AI Data Foundation Readiness Assessment
The Fivetran survey gives the headline. It does not give a way to score your own organization in an hour. This framework does. Score five dimensions, five points each. Total of 25. The thresholds at the bottom are calibrated to the survey's findings on what separates ready-state organizations from the rest.
Dimension 1: Data Freshness & Movement (5 points)
Score one point for each that applies:
- We have automated, scheduled pipelines for every data source an agent reads from (no manual exports).
- The maximum lag between source-system change and AI-accessible data is known and documented for every critical dataset.
- We can demonstrate end-to-end pipeline monitoring with alerting for stale or stalled sources.
- Schema changes upstream propagate automatically without breaking downstream agent consumers.
- We have a tested rollback path if a pipeline fails mid-run.
Dimension 2: Lineage Visibility (5 points)
- We can trace any output a customer-facing agent produces back to the source records that fed it.
- Lineage is documented at the column level for every dataset that touches a regulated workflow.
- A non-engineer (auditor, legal, compliance) can pull lineage for any agent decision without a ticket.
- Lineage diagrams are refreshed automatically — not maintained manually in Confluence.
- We have run a "blast radius" exercise in the last 90 days that simulates one bad upstream record.
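A blast-radius exercise like the one above can start as a plain traversal of the lineage graph: given one bad upstream record, enumerate everything downstream that could consume it. A minimal sketch, assuming lineage is available as an adjacency mapping (the dataset and agent names here are hypothetical):

```python
from collections import deque

# Hypothetical lineage graph: dataset -> direct downstream consumers.
LINEAGE = {
    "crm.accounts": ["warehouse.dim_customer", "agents.support_bot"],
    "warehouse.dim_customer": ["agents.quote_agent", "reports.churn_model"],
    "agents.support_bot": [],
    "agents.quote_agent": [],
    "reports.churn_model": ["agents.retention_agent"],
    "agents.retention_agent": [],
}

def blast_radius(source: str) -> set[str]:
    """Return every downstream asset a bad record in `source` could reach."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for child in LINEAGE.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

affected = blast_radius("crm.accounts")
```

One bad record in `crm.accounts` reaches three agents, not one — which is exactly the "same mistake, instantly and at scale" point the exercise is meant to make concrete.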
Dimension 3: Governance & Access Controls (5 points)
- Every agent has a distinct identity (not a shared service account).
- Permissions are scoped per agent, not inherited from a generic "AI user."
- Read access, write access, and external-action access are governed by separate policies.
- High-risk actions (financial, customer-facing communications, irreversible writes) require human-in-the-loop or two-agent verification.
- Every agent action is logged in a tamper-evident audit trail.
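The separation of read, write, and external-action policies in Dimension 3 can be made concrete as a per-agent policy table. A minimal sketch, under the assumption of a four-tier action model (the agent IDs and policy shapes are illustrative, not a real product's API):

```python
from enum import Enum

class ActionTier(Enum):
    READ = 1
    WRITE = 2
    COMMUNICATE = 3
    FINANCIAL = 4

# Hypothetical per-agent policy table: each agent identity gets its own
# scoped tiers, and high-risk tiers carry a human-in-the-loop flag.
POLICIES = {
    "support-bot-01": {"tiers": {ActionTier.READ, ActionTier.COMMUNICATE},
                       "hitl_required": {ActionTier.COMMUNICATE}},
    "ledger-agent-02": {"tiers": {ActionTier.READ, ActionTier.FINANCIAL},
                        "hitl_required": {ActionTier.FINANCIAL}},
}

def authorize(agent_id: str, tier: ActionTier) -> tuple[bool, bool]:
    """Return (allowed, needs_human_review) for an agent's requested action."""
    policy = POLICIES.get(agent_id)
    if policy is None or tier not in policy["tiers"]:
        return (False, False)  # unknown agent or out-of-scope tier
    return (True, tier in policy["hitl_required"])
```

The point of the shape is the default: an agent with no entry, or a tier outside its scope, is denied rather than inheriting anything from a generic "AI user."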
Dimension 4: Cross-System Interoperability (5 points)
- Agents access data through a unified semantic or knowledge layer, not through bespoke integrations per source.
- Vendor lock-in to a single AI cloud is mapped and quantified (we know the cost and time of leaving).
- Open standards (MCP, A2A, OpenAPI) are preferred over proprietary connectors for new builds.
- We can replace any one model provider in our stack without rewriting more than 20% of the agent code.
- Data sovereignty controls are honored when an agent crosses regional boundaries.
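The "replace any one model provider without rewriting more than 20% of the agent code" criterion usually comes down to one design choice: agent code depends on an interface, never a vendor class. A minimal sketch, assuming a Protocol-based adapter (the provider classes here are stand-ins, not real SDK calls):

```python
from typing import Protocol

class ModelProvider(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenAIProvider:
    # Hypothetical adapter; a real one would call the vendor SDK here.
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"

class AnthropicProvider:
    # Same interface, different vendor behind it.
    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"

def run_agent_step(provider: ModelProvider, task: str) -> str:
    """Agent logic depends only on the Protocol, never on a vendor class."""
    return provider.complete(f"Plan the next action for: {task}")
```

Swapping providers then means swapping one constructor at the composition root, which is how the 20% ceiling stays honest.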
Dimension 5: Quality & Reliability (5 points)
- We have automated data-quality tests on every dataset that feeds an agent (freshness, nulls, schema drift, duplicates).
- We track agent-output quality with the same rigor as model accuracy (precision, recall, hallucination rate).
- Silent-failure detection (output looks right, source was wrong) is a defined responsibility, not a hope.
- We have a documented kill-switch and rollback path for every production agent.
- We have had fewer than one production rollback per agent in the last 12 months (the 2026 enterprise base rate is 41% of organizations reporting at least one rollback; beating that average is the bar).
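The first item in this dimension, automated quality tests on every agent-feeding dataset, can start very small. A minimal sketch of the four named checks (freshness, schema drift, nulls, duplicates), with illustrative parameter names rather than any particular tool's API:

```python
from datetime import datetime, timedelta, timezone

def check_dataset(rows: list[dict], expected_schema: set[str],
                  key: str, updated_at: datetime,
                  max_lag: timedelta) -> list[str]:
    """Return the names of failed checks for one agent-feeding dataset."""
    failures = []
    # Freshness: last successful load must be within the documented SLA.
    if datetime.now(timezone.utc) - updated_at > max_lag:
        failures.append("freshness")
    # Schema drift: column set must match what downstream agents expect.
    if rows and set(rows[0]) != expected_schema:
        failures.append("schema_drift")
    # Nulls: the join key may never be null.
    if any(r.get(key) is None for r in rows):
        failures.append("null_key")
    # Duplicates: the join key must be unique.
    keys = [r.get(key) for r in rows]
    if len(keys) != len(set(keys)):
        failures.append("duplicate_key")
    return failures
```

In practice these four checks live in whatever test framework the pipeline tool provides; the scoring question is whether they run automatically, not how they are written.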
Scoring
- 0–9 points: Not ready. Stay in sandbox. Production agents at this score will produce the silent failures the Datadog research describes — and will be in the 40% Gartner cancellation cohort.
- 10–14 points: Pilot-ready, not production-ready. Internal-only agents on read-only datasets are acceptable. External-facing or write-capable agents are not.
- 15–19 points: Limited production. Agents can ship to production with strict scoping (single domain, capped blast radius, human-in-the-loop on writes).
- 20–25 points: Scale-ready. You are in the 15% Fivetran identified. Expand carefully — the next failure mode is governance debt, not data debt.
If your honest score is below 15, the right move is not "ship the next agent pilot." It is to fix the foundation in parallel with one tightly scoped pilot.
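For teams that want to run the assessment repeatedly (for example, once per quarter per business unit), the tiering above encodes directly. A minimal sketch; the tier strings paraphrase the thresholds rather than quote them:

```python
def readiness_tier(score: int) -> str:
    """Map a 0-25 assessment score to the readiness tiers defined above."""
    if not 0 <= score <= 25:
        raise ValueError("score must be between 0 and 25")
    if score <= 9:
        return "not ready: stay in sandbox"
    if score <= 14:
        return "pilot-ready: internal, read-only agents only"
    if score <= 19:
        return "limited production: scoped, human-in-the-loop on writes"
    return "scale-ready: expand carefully"
```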
Framework #2: 12-Month Data Foundation Remediation Roadmap
If the assessment landed you below 20, this is the operating sequence. It is calibrated to the four ready-state capabilities Fivetran identified and to the realistic budget cycles of a 2,000-plus-employee enterprise.
Quarter 1 (Months 1–3): Inventory and Lineage Audit
Success criteria: Every dataset that feeds an agent in production or a near-term pilot is inventoried, owned, and lineage-mapped at the column level for any regulated field.
Key activities:
- Run a full agent-and-dataset inventory. Most enterprises discover 30–50% more agents in flight than the CIO knew about.
- Classify every agent by blast radius (read-only, internal-write, external-write, financial).
- Pick a metadata platform (Atlan, OvalEdge, Acceldata, native cloud catalog) and standardize lineage capture.
- Identify the top five "sovereignty-sensitive" datasets and document the cross-border flow.
Quarter 2 (Months 4–6): Governance and Access Controls
Success criteria: Every production agent has a distinct identity, scoped permissions, an action-tier policy, and an audit log.
Key activities:
- Issue per-agent identities. Retire shared service accounts.
- Implement an action-tier model: read, write, communicate, financial. Different approval thresholds per tier.
- Wire agent action logs into the SIEM or equivalent. Make the logs immutable.
- Stand up an Agent Review Board with security, data, legal, and a business sponsor. Cadence: weekly.
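"Immutable" agent action logs are usually implemented as tamper-evident logs: each entry carries a hash of the previous entry, so any after-the-fact edit breaks the chain on verification. A minimal hash-chain sketch, independent of any particular SIEM (the field names are illustrative):

```python
import hashlib
import json

def append_entry(log: list[dict], agent_id: str, action: str) -> None:
    """Append an audit entry whose hash chains to the previous entry."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"agent": agent_id, "action": action, "prev": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "hash": digest})

def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited entry breaks the chain."""
    prev = "0" * 64
    for entry in log:
        body = {"agent": entry["agent"], "action": entry["action"],
                "prev": prev}
        if entry["prev"] != prev:
            return False
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

A real deployment would anchor the chain externally (write-once storage, a separate trust domain) so that the log and the chain cannot be rewritten together; the sketch only shows the chaining idea.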
Quarter 3 (Months 7–9): Pipeline Modernization
Success criteria: Every agent-feeding pipeline is automated, monitored, and documented to the lineage standard from Q1.
Key activities:
- Replace manual exports with managed pipelines (Fivetran, Airbyte, native cloud, or a roll-your-own stack with proper SLAs).
- Add data-quality tests at ingestion: freshness, schema drift, null thresholds, duplicate detection.
- Implement rollback paths for each pipeline. Test them at least once.
- Define and publish data freshness SLAs for every dataset that feeds a production agent.
Quarter 4 (Months 10–12): Production Hardening and Scale
Success criteria: You can stand up the 26th agent in the same week your team finishes the 25th, without re-litigating identity, lineage, or governance.
Key activities:
- Implement silent-failure detection: shadow runs, golden datasets, output quality monitors.
- Standardize an "agent onboarding" template — identity, dataset access, governance approvals, observability — that takes hours, not weeks.
- Move at least three production agents through a full audit using the lineage system from Q1. Treat the gaps you find as Q4 backlog.
- Re-score the 25-point assessment. The target is 20+. If you are still below, do not scale — repeat the weakest quarter.
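Golden-dataset monitoring, the cheapest of the silent-failure detectors above, amounts to replaying a frozen set of inputs with known-good outputs against the live agent and alerting when the match rate slips. A minimal sketch under that assumption; `agent_fn` stands in for whatever callable fronts the production agent:

```python
def golden_run(agent_fn, golden_cases: list[dict],
               tolerance: float = 0.95) -> bool:
    """Replay frozen inputs with known-good outputs against the live agent.

    Returns False (alert) when the match rate drops below tolerance,
    which is the signature of an upstream data shift the output hides.
    """
    if not golden_cases:
        return True  # nothing to compare against
    matches = sum(1 for case in golden_cases
                  if agent_fn(case["input"]) == case["expected"])
    return matches / len(golden_cases) >= tolerance
```

The value is in the cadence: run it on every pipeline change and every model update, because the failures it catches look correct one output at a time.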
This sequence is the operating analog of what the 15% of ready-state organizations have done. The order matters: lineage before governance, governance before pipeline expansion, pipeline expansion before scale. Most enterprises invert that and pay for it.
Case Study: How Klarna's Agent Worked — and Why the Data Foundation Was the Reason
Klarna's customer service AI is the most-cited enterprise agent success of the cycle: $60 million in annual savings, the work of 853 full-time agents, resolution time cut from 11 minutes to under two minutes, and a 25% drop in repeat inquiries (Klarna disclosures, Q3 2025; widely reported across enterprise AI ROI roundups in 2026). That outcome is not a model story. The model is mostly an OpenAI commodity. The story is the data foundation underneath it.
What Klarna did before the agent went live, in roughly the order this article recommends:
- Unified the customer record. Every dataset the agent reads — order history, payment history, shipping events, dispute history — was available through a single semantic layer. No batch lag, no broken joins.
- Wired lineage end-to-end. When the agent makes a refund decision, support leaders can trace it back to the exact records and policies that drove it.
- Implemented a strict action-tier model. The agent can communicate, but financial actions above a threshold escalate to a human. Read access is broad; write access is narrow.
- Built silent-failure monitoring. Output quality is sampled and reviewed daily. When upstream data shifts, the team finds out in hours, not weeks.
The Klarna takeaway is not "buy GPT." Almost anyone can. The takeaway is that the gap between Klarna and the bottom-quartile enterprise is mostly data plumbing, governance, and observability — the same four capabilities Fivetran's survey says 85% of enterprises do not yet have. JPMorgan's COiN, Morgan Stanley's DevGen.AI (280,000 developer hours reclaimed across 9 million lines of legacy code), and General Mills' supply chain agent ($20M+ in savings since FY2024) all rest on the same base. The model is the visible layer. The data foundation is the load-bearing one.
What to Do About It in the Next 30 Days
For different audiences, the next 30 days look different.
For CIOs and CTOs
Run the 25-point assessment yourself, then have your top three data leaders run it independently. The gap between scores is more diagnostic than the scores themselves — it surfaces where the organization has shared illusions about its own readiness. Then commit publicly to the four-quarter roadmap and tie one specific agent's go-live date to a Q1 lineage milestone. The roadmap dies in the first month it has no business-tied dependency.
For CFOs
Stop funding agent pilots one at a time. Re-bundle 2026 agent spend into two line items: foundation (lineage, governance, pipeline, observability) and applications (the agents themselves). Refuse to fund a new agent pilot unless its sponsor can name which dimension of the readiness assessment improves as a result. The 98% versus 16% ROI gap will be visible in the 2027 budget review, but only if 2026 budgeting traces foundation spend separately.
For Business Leaders and Sponsors
Adopt one rule: no agent goes external until it has scored ≥20 on the assessment. Agents that talk to customers or move money are the ones that turn into the regulatory headlines, and they are the ones that suffer most from the silent-failure pattern. Internal-only agents at lower scores are tolerable. Customer-facing agents at lower scores are not.
The one quote to take into the boardroom is Brown's: an AI agent will not just make mistakes, it will make the same mistake across the business, instantly and at scale. That is the executive framing for why data foundation spend in 2026 is risk reduction first and ROI multiplier second. Enterprises that internalize that order will join the 15%. The rest will fund Gartner's 40% cancellation forecast.
