Resolve AI Hits $1.5B Valuation as AI SRE Goes Mainstream

Resolve AI raised $40M at a $1.5B valuation, $190M total in 18 months. Coinbase, Salesforce, Zscaler use it to autonomously resolve production incidents.

By Rajesh Beri·April 17, 2026·11 min read

THE DAILY BRIEF

AI Agents · Site Reliability Engineering · Enterprise AI · AIOps · Funding


Resolve AI announced a $40 million Series A extension at a $1.5 billion valuation on April 16, 2026, bringing total funding to more than $190 million in less than 18 months out of stealth. DST Global and Salesforce Ventures led the round.

The headline number matters less than the customer list: Coinbase, DoorDash, MongoDB, MSCI, Salesforce, and Zscaler are running Resolve AI in production today. Coinbase reports a 72% reduction in time to investigate critical incidents. Zscaler reports a 30% reduction in engineers required per incident.

Alongside the round, Resolve announced Resolve AI Labs—a research arm building domain-specific models for production environments—and the hire of Dhruv Mahajan as Chief AI Scientist, formerly leading post-training for Meta's Llama models.

This is the clearest signal yet that AI SRE—autonomous agents that detect, investigate, and resolve production incidents—has crossed from experiment to enterprise budget line item. For CIOs and CTOs, the question is no longer whether to evaluate this category. It's how fast you can move before competitors compress your incident costs ahead of you.

What Resolve AI Actually Does

Resolve AI is positioned as "AI for your production systems"—a multi-agent platform that operates the way an experienced site reliability engineer (SRE) operates: across code, infrastructure, services, and telemetry simultaneously.

When a production alert fires, the platform:

  1. Triages the alert against historical incident patterns and current system state
  2. Investigates autonomously by querying logs, metrics, traces, dashboards, code repositories, and dependency graphs
  3. Diagnoses the root cause by reasoning across signals the way a senior engineer would
  4. Takes action—with human involvement scaled to the risk and operational context

For low-risk runbook execution (restarting a stuck pod, rotating a credential, scaling a service), the agent can act autonomously. For high-risk changes (database failover, traffic rerouting, customer-facing config changes), it surfaces a recommendation and waits for an engineer to approve.

The architectural distinction worth noting: this is not a chatbot bolted onto Datadog. The system is designed as a multi-agent orchestrator that operates tools, executes investigation plans, and reasons through ambiguity—closer in spirit to how Cursor changed the IDE category than to how a Slack bot wraps an existing dashboard.

The Founders and Why This Bet Matters

Resolve AI was founded by Spiros Xanthos (CEO) and Mayank Agarwal. Xanthos previously founded Omnition (acquired by Cisco) and was a senior leader at Splunk's observability business, where he scaled the SignalFx acquisition and helped define the modern observability category. Agarwal brings deep distributed systems and production engineering experience.

This is not a team learning observability while building an AI startup. They built the data category their AI now reasons over. That matters because the hardest problem in AI SRE is not the model—it's the messy, fragmented, organization-specific operational data the model has to navigate.

Adding Dhruv Mahajan as Chief AI Scientist locks in the other half of the equation: post-training expertise from one of the most aggressive open-weights model programs at Meta. Resolve is signaling that it intends to train and post-train its own production-specific models—not just wrap GPT-5 or Claude with prompt engineering.

Why This Funding Round Tells You Something Real

In a market where every AI startup is raising at frothy valuations, Resolve's progression is unusually grounded:

| Round | Date | Amount | Valuation | Lead |
|---|---|---|---|---|
| Seed + early | 2024–2025 | ~$25M | n/a | Greylock, others |
| Series A | Feb 2026 | $125M | $1.0B | Lightspeed |
| Series A extension | Apr 16, 2026 | $40M | $1.5B | DST Global, Salesforce Ventures |

A 50% valuation step-up in 10 weeks—from a Series A that already minted unicorn status—signals that paying enterprise customers are expanding contracts, not just signing logos. DST Global rarely leads at this stage; their participation suggests revenue at a multiple that justifies the markup.

Salesforce Ventures' lead position is the more interesting signal. Salesforce is both an investor and a customer, which means Resolve is operational inside one of the largest enterprise SaaS infrastructures on the planet. That's a reference architecture that will sell itself to every CIO who's ever managed a Salesforce-scale incident.

The Customer Math: Why CFOs Should Care

For business leaders trying to understand whether AI SRE belongs on the capital plan, the publicly disclosed metrics tell a clean ROI story.

Coinbase: 72% reduction in time-to-investigate critical incidents.

Coinbase runs a 24/7 financial system where every minute of incident time has direct revenue impact, regulatory exposure, and customer trust cost. A 72% reduction in mean time to investigate (MTTI) doesn't just save engineering hours—it compresses the window during which a degraded system is bleeding revenue or accumulating compliance risk.

Zscaler: 30% reduction in engineers required per incident.

For a security infrastructure company running global traffic for thousands of enterprises, an incident typically pulls together SREs, security engineers, network engineers, and product engineers into a war room. Removing one in three of those bodies per incident is meaningful headcount leverage on the most expensive engineers in the building.

Translating to dollars: A typical enterprise SRE in the U.S. is fully loaded at $300K–$450K. A team of 30 SREs costs roughly $10M–$13M annually. If AI SRE removes even 25% of toil—the unglamorous incident investigation, paging, and runbook work—that's $2.5M–$3.3M of recovered capacity per year. For Fortune 500 organizations running 100+ SREs, the math scales linearly. Resolve AI's annual contract value is almost certainly less than the toil it removes, which is why the customer math closes.

This is the calculation CFOs should walk into the conversation with: what is your current MTTI, what is your current incidents-per-week volume, and what is your fully loaded SRE cost per hour? Three numbers are enough to size the opportunity without needing a vendor pitch deck.
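The sizing above reduces to two pieces of arithmetic. This sketch uses illustrative midpoint figures consistent with the ranges in this section; the inputs are assumptions to plug your own numbers into, not disclosed customer data.

```python
def recovered_capacity(team_size: int, loaded_cost: float,
                       recovered_fraction: float) -> float:
    """Dollar value of engineering capacity recovered per year."""
    return team_size * loaded_cost * recovered_fraction

def annual_incident_cost(mtti_hours: float, incidents_per_week: float,
                         engineers_per_incident: float,
                         cost_per_eng_hour: float) -> float:
    """Yearly engineer-hour cost of incident investigation alone."""
    return (mtti_hours * incidents_per_week * 52
            * engineers_per_incident * cost_per_eng_hour)

# 30 SREs at the $375K midpoint of $300K-$450K, 25% of capacity recovered
print(f"${recovered_capacity(30, 375_000, 0.25):,.0f}")  # prints $2,812,500

# Illustrative: 2h MTTI, 20 incidents/week, 4 engineers pulled in, $180/eng-hour
print(f"${annual_incident_cost(2, 20, 4, 180):,.0f}")
```

The first function reproduces the ~$2.5M–$3.3M recovered-capacity range; the second maps directly to the three numbers a CFO should bring to the conversation.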

Why This Category Is Suddenly Real

AI for production operations has been a "next year" category for five years. It became a real category in 2026 for three converging reasons:

1. Frontier models finally reason well enough across noisy operational data. GPT-5, Claude Opus 4.6, and Gemini 2.5 can hold long context, follow multi-step investigation plans, and reason through partial information without hallucinating fixes. The 2024-era LLMs could summarize an alert; the 2026 frontier models can investigate one.

2. Telemetry standardization caught up. OpenTelemetry has consolidated as the default emit format for metrics, logs, and traces across cloud-native stacks. That gives AI agents a stable substrate to reason over instead of negotiating with 30 vendor-specific schemas. Notably, Xanthos was an early OpenTelemetry contributor—Resolve's product surface is built on the standard he helped author.

3. The toil problem has gotten worse, not better. Enterprise infrastructure complexity has compounded as organizations adopt Kubernetes, multi-cloud, microservices, and now agentic AI workloads. Engineers report that 40–60% of their time goes to incident response, on-call, and reactive work. AI SRE is the only credible path to reduce that without cutting the team.

How Resolve Compares: The Competitive Map

For CIOs and VPs of Engineering scoping a vendor evaluation, the AI SRE landscape splits into four buckets:

Pure-play AI SRE startups: Resolve AI, Cleric, Aisera (AIOps roots), Parity, and a handful of stealth-mode entrants. Resolve is currently the most-funded and has the most visible enterprise reference customers in the cohort.

Observability incumbents adding AI: Datadog (Bits AI), New Relic (NRQL AI), Splunk (now Cisco), and Honeycomb. These have the data—they're racing to add agentic capability before pure-plays steal the relationship.

ITSM incumbents adding AI: ServiceNow (incident routing AI), PagerDuty (Incident Workflows), Atlassian (Jira Service Management AI). Strong on workflow, weaker on autonomous investigation.

Hyperscaler-native: AWS DevOps Guru, Azure AI Operations, Google Cloud Operations Suite. Locked to a single cloud, which is fine for single-cloud shops and disqualifying for everyone else.

Where Resolve wins: Multi-cloud, multi-vendor, deep autonomous investigation, founders who built the observability category. Where incumbents win: existing data gravity, enterprise procurement relationships, and bundling. The next 18 months will determine whether the data gravity advantage of Datadog/Splunk overwhelms the architectural advantage of pure-play agents.

Resolve AI Labs: The Strategic Tell

The launch of Resolve AI Labs is the most strategically interesting part of this announcement. Most AI startups at this stage are spending capital on go-to-market. Resolve is spending capital on building production-specific foundation models.

Labs will focus on:

  • Domain-specific model post-training on operational telemetry corpora
  • Multi-signal reasoning across logs, metrics, traces, infrastructure events, and code
  • Reliability evaluation frameworks—measurable accuracy benchmarks for production AI
  • Synthetic data generation and simulated production environments for safe agent training
  • Governance, guardrails, and safe-action policies for autonomous remediation

This is the same playbook Cursor ran in code editing, Harvey ran in legal, and Hippocratic AI is running in clinical care: don't compete on the foundation model layer—compete on domain-specific post-training and the proprietary data flywheel.

For technical leaders, this signals Resolve intends to be defensible against an Anthropic or OpenAI deciding to ship "Claude SRE" next year. For business leaders, it signals the company is building moats that justify the valuation rather than coasting on hype.

The Risks Worth Naming

This is not a slam-dunk category, and any CIO conducting due diligence should pressure-test the following:

1. Autonomous remediation is a trust-and-blast-radius problem. Letting an AI agent restart a service is fine. Letting it failover a database in production is a career-ending decision the first time it gets it wrong. Resolve's "human involvement scaled to risk" model is the right design—but operationalizing the risk classifier is where this either succeeds at scale or generates a high-profile incident that sets the category back two years.

2. Data residency and compliance. AI agents reasoning over production logs are reading some of the most sensitive data in the company: customer PII, payment flows, internal credentials, and proprietary business logic. Enterprise contracts will require SOC 2 Type II, HIPAA where relevant, FedRAMP for public sector, and clear data residency guarantees. Validate these before signing.

3. Lock-in risk. Once an AI agent is trained on your incident history, runbooks, and tribal knowledge, switching costs go up materially. This is a feature for Resolve and a negotiation point for buyers. Ask about data portability and model export rights upfront.

4. The incumbent counterpunch. Datadog, Splunk (Cisco), and ServiceNow each have the data, the relationships, and the capital to ship competitive offerings. The question is whether they can ship credible autonomous agents fast enough to defend their accounts. Bet on the incumbent if your relationship is deep; bet on Resolve if you want the architecturally superior product.

What to Do This Quarter

For technical leaders:

  • Run a 90-day pilot on a non-customer-facing service tier. Pick something with high alert volume and low blast radius—internal tooling, batch pipelines, or staging environments. Measure MTTI, MTTR, and false-positive rate before and after.
  • Define your autonomous-action policy explicitly. Which runbooks can an agent execute without approval? Which require human-in-the-loop? Which are off-limits entirely? Write this down before the vendor does it for you.
  • Audit your telemetry foundations. Resolve—and every AI SRE—is only as good as the data it reasons over. If your logs are unstructured, your traces are sampled too aggressively, or your alert noise is 40% false positives, fix that first or in parallel.
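The autonomous-action policy in the second bullet is easiest to enforce if it exists as a machine-readable document rather than tribal knowledge. A minimal sketch, with illustrative action names and tier labels (not any vendor's schema):

```python
# Illustrative policy document; action names and tiers are assumptions.
AUTONOMOUS_ACTION_POLICY = {
    "auto_approve": [        # agent may execute without a human
        "restart_pod",
        "rotate_credential",
        "scale_service",
    ],
    "human_in_the_loop": [   # agent recommends, engineer approves
        "database_failover",
        "reroute_traffic",
    ],
    "off_limits": [          # agent may never touch these
        "delete_customer_data",
        "modify_iam_roles",
    ],
}

def tier_for(action: str) -> str:
    """Look up an action's tier; unknown actions get the most restrictive one."""
    for tier, actions in AUTONOMOUS_ACTION_POLICY.items():
        if action in actions:
            return tier
    return "off_limits"
```

Writing this down before the pilot starts is the point: it becomes the contract you evaluate the vendor against, instead of the vendor's defaults becoming your policy.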

For business leaders:

  • Calculate your incident cost. Fully loaded engineer hours per incident × incident volume × revenue impact during degradation. This is the number that makes the procurement conversation easy.
  • Add AI SRE to your 2026 capital plan. Whether you choose Resolve, an incumbent, or a hyperscaler, this category is moving from "evaluate" to "deploy" inside the next 12 months. Budget accordingly.
  • Don't let the security team block the conversation. They will have legitimate concerns. Address them with vendor security reviews, not by punting the decision. The cost of waiting is measurable in toil, attrition, and outage minutes.

The Bigger Pattern

Resolve AI's $1.5B valuation is one data point in a larger pattern: enterprise AI is consolidating around agents that do specific, high-value work in specific, high-stakes domains. Cursor in code. Harvey in legal. Hippocratic in clinical. Resolve in production operations.

The losers in this pattern are the horizontal "AI assistant for everything" plays that look impressive in demos and produce ambiguous ROI in production. The winners are vertical agents with deep domain models, defensible data flywheels, and a clear answer to the question every CFO eventually asks: what does this replace, and what does it cost compared to what it replaces?

For Resolve, the answer is increasingly clean. It replaces the toil tax that every modern engineering organization pays. The cost is a fraction of the engineering capacity it returns.

That's a story enterprise buyers respond to in any market. In a market where engineering headcount is being scrutinized harder than at any point in the last decade, it's a story that closes deals.

Bottom Line

Resolve AI's $190M in total funding, $1.5B valuation, and customer roster confirm that AI SRE is a real, fundable, deployable enterprise category in 2026. CIOs who haven't started a pilot this quarter will be explaining the gap to their boards by Q4. CFOs who haven't modeled the toil-reduction ROI are leaving real capital on the table.

The technology is ready. The customer references are credible. The valuation is grounded in revenue, not hope. The only remaining variable is execution speed inside your own organization.


Want to calculate your own AI ROI? Try our AI ROI Calculator — takes 60 seconds and shows projected savings, payback period, and 3-year ROI.


THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe to get these insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.

Resolve AI Hits $1.5B Valuation as AI SRE Goes Mainstream

Photo by Christina Morillo on Pexels

Resolve AI announced a $40 million Series A extension at a $1.5 billion valuation on April 16, 2026, bringing total funding to more than $190 million in less than 18 months out of stealth. DST Global and Salesforce Ventures led the round.

The headline number matters less than the customer list: Coinbase, DoorDash, MongoDB, MSCI, Salesforce, and Zscaler are running Resolve AI in production today. Coinbase reports a 72% reduction in time to investigate critical incidents. Zscaler reports a 30% reduction in engineers required per incident.

Alongside the round, Resolve announced Resolve AI Labs—a research arm building domain-specific models for production environments—and the hire of Dhruv Mahajan as Chief AI Scientist, formerly leading post-training for Meta's Llama models.

This is the clearest signal yet that AI SRE—autonomous agents that detect, investigate, and resolve production incidents—has crossed from experiment to enterprise budget line item. For CIOs and CTOs, the question is no longer whether to evaluate this category. It's how fast you can move before competitors compress your incident costs ahead of you.

What Resolve AI Actually Does

Resolve AI is positioned as "AI for your production systems"—a multi-agent platform that operates the way an experienced site reliability engineer (SRE) operates: across code, infrastructure, services, and telemetry simultaneously.

When a production alert fires, the platform:

  1. Triages the alert against historical incident patterns and current system state
  2. Investigates autonomously by querying logs, metrics, traces, dashboards, code repositories, and dependency graphs
  3. Diagnoses the root cause by reasoning across signals the way a senior engineer would
  4. Takes action—with human involvement scaled to the risk and operational context

For low-risk runbook execution (restarting a stuck pod, rotating a credential, scaling a service), the agent can act autonomously. For high-risk changes (database failover, traffic rerouting, customer-facing config changes), it surfaces a recommendation and waits for an engineer to approve.

The architectural distinction worth noting: this is not a chatbot bolted onto Datadog. The system is designed as a multi-agent orchestrator that operates tools, executes investigation plans, and reasons through ambiguity—closer in spirit to how Cursor changed the IDE category than how a Slack-bot wraps an existing dashboard.

The Founders and Why This Bet Matters

Resolve AI was founded by Spiros Xanthos (CEO) and Mayank Agarwal. Xanthos previously founded Omnition (acquired by Cisco) and was a senior leader at Splunk's observability business, where he scaled the SignalFx acquisition and helped define the modern observability category. Agarwal brings deep distributed systems and production engineering experience.

This is not a team learning observability while building an AI startup. They built the data category their AI now reasons over. That matters because the hardest problem in AI SRE is not the model—it's the messy, fragmented, organization-specific operational data the model has to navigate.

Adding Dhruv Mahajan as Chief AI Scientist locks in the other half of the equation: post-training expertise from one of the most aggressive open-weights model programs at Meta. Resolve is signaling that it intends to train and post-train its own production-specific models—not just wrap GPT-5 or Claude with prompt engineering.

Why This Funding Round Tells You Something Real

In a market where every AI startup is raising at frothy valuations, Resolve's progression is unusually grounded:

Round Date Amount Valuation Lead
Seed + early 2024–2025 ~$25M n/a Greylock, others
Series A Feb 2026 $125M $1.0B Lightspeed
Series A Extension Apr 16, 2026 $40M $1.5B DST Global, Salesforce Ventures

A 50% valuation step-up in 10 weeks—from a Series A that already minted unicorn status—signals that paying enterprise customers are expanding contracts, not just signing logos. DST Global rarely leads at this stage; their participation suggests revenue at a multiple that justifies the markup.

Salesforce Ventures' lead position is the more interesting signal. Salesforce is both an investor and a customer, which means Resolve is operational inside one of the largest enterprise SaaS infrastructures on the planet. That's a reference architecture that will sell itself to every CIO who's ever managed a Salesforce-scale incident.

The Customer Math: Why CFOs Should Care

For business leaders trying to understand whether AI SRE belongs on the capital plan, the publicly disclosed metrics tell a clean ROI story.

Coinbase: 72% reduction in time-to-investigate critical incidents.

Coinbase runs a 24/7 financial system where every minute of incident time has direct revenue impact, regulatory exposure, and customer trust cost. A 72% reduction in mean time to investigate (MTTI) doesn't just save engineering hours—it compresses the window during which a degraded system is bleeding revenue or accumulating compliance risk.

Zscaler: 30% reduction in engineers required per incident.

For a security infrastructure company running global traffic for thousands of enterprises, an incident typically pulls together SREs, security engineers, network engineers, and product engineers into a war room. Removing one in three of those bodies per incident is meaningful headcount leverage on the most expensive engineers in the building.

Translating to dollars: A typical enterprise SRE in the U.S. is fully loaded at $300K–$450K. A team of 30 SREs costs roughly $10M–$13M annually. If AI SRE removes even 25% of toil—the unglamorous incident investigation, paging, and runbook work—that's $2.5M–$3.3M of recovered capacity per year. For Fortune 500 organizations running 100+ SREs, the math scales linearly. Resolve AI's annual contract value is almost certainly less than the toil it removes, which is why the customer math closes.

This is the calculation CFOs should walk into the conversation with: what is your current MTTI, what is your current incidents-per-week volume, and what is your fully loaded SRE cost per hour? Three numbers are enough to size the opportunity without needing a vendor pitch deck.

Why This Category Is Suddenly Real

AI for production operations has been a "next year" category for five years. It became a real category in 2026 for three converging reasons:

1. Frontier models finally reason well enough across noisy operational data. GPT-5, Claude Opus 4.6, and Gemini 2.5 can hold long context, follow multi-step investigation plans, and reason through partial information without hallucinating fixes. The 2024-era LLMs could summarize an alert; the 2026 frontier models can investigate one.

2. Telemetry standardization caught up. OpenTelemetry has consolidated as the default emit format for metrics, logs, and traces across cloud-native stacks. That gives AI agents a stable substrate to reason over instead of negotiating with 30 vendor-specific schemas. Notably, Xanthos was an early OpenTelemetry contributor—Resolve's product surface is built on the standard he helped author.

3. The toil problem has gotten worse, not better. Enterprise infrastructure complexity has compounded as organizations adopt Kubernetes, multi-cloud, microservices, and now agentic AI workloads. Engineers report that 40–60% of their time goes to incident response, on-call, and reactive work. AI SRE is the only credible path to reduce that without cutting the team.

How Resolve Compares: The Competitive Map

For CIOs and VPs of Engineering scoping a vendor evaluation, the AI SRE landscape splits into four buckets:

Pure-play AI SRE startups: Resolve AI, Cleric, Aisera (AIOps roots), Parity, and a handful of stealth-mode entrants. Resolve is currently the most-funded and has the most visible enterprise reference customers in the cohort.

Observability incumbents adding AI: Datadog (Bits AI), New Relic (NRQL AI), Splunk (now Cisco), and Honeycomb. These have the data—they're racing to add agentic capability before pure-plays steal the relationship.

ITSM incumbents adding AI: ServiceNow (incident routing AI), PagerDuty (Incident Workflows), Atlassian (Jira Service Management AI). Strong on workflow, weaker on autonomous investigation.

Hyperscaler-native: AWS DevOps Guru, Azure AI Operations, Google Cloud Operations Suite. Locked to a single cloud, which is fine for single-cloud shops and disqualifying for everyone else.

Where Resolve wins: Multi-cloud, multi-vendor, deep autonomous investigation, founders who built the observability category. Where incumbents win: existing data gravity, enterprise procurement relationships, and bundling. The next 18 months will determine whether the data gravity advantage of Datadog/Splunk overwhelms the architectural advantage of pure-play agents.

Resolve AI Labs: The Strategic Tell

The launch of Resolve AI Labs is the most strategically interesting part of this announcement. Most AI startups at this stage are spending capital on go-to-market. Resolve is spending capital on building production-specific foundation models.

Labs will focus on:

  • Domain-specific model post-training on operational telemetry corpora
  • Multi-signal reasoning across logs, metrics, traces, infrastructure events, and code
  • Reliability evaluation frameworks—measurable accuracy benchmarks for production AI
  • Synthetic data generation and simulated production environments for safe agent training
  • Governance, guardrails, and safe-action policies for autonomous remediation

This is the same playbook Cursor ran in code editing, Harvey ran in legal, and Hippocratic AI is running in clinical care: don't compete on the foundation model layer—compete on domain-specific post-training and the proprietary data flywheel.

For technical leaders, this signals Resolve intends to be defensible against an Anthropic or OpenAI deciding to ship "Claude SRE" next year. For business leaders, it signals the company is building moats that justify the valuation rather than coasting on hype.

The Risks Worth Naming

This is not a slam-dunk category, and any CIO conducting due diligence should pressure-test the following:

1. Autonomous remediation is a trust-and-blast-radius problem. Letting an AI agent restart a service is fine. Letting it failover a database in production is a career-ending decision the first time it gets it wrong. Resolve's "human involvement scaled to risk" model is the right design—but operationalizing the risk classifier is where this either succeeds at scale or generates a high-profile incident that sets the category back two years.

2. Data residency and compliance. AI agents reasoning over production logs are reading some of the most sensitive data in the company: customer PII, payment flows, internal credentials, and proprietary business logic. Enterprise contracts will require SOC 2 Type II, HIPAA where relevant, FedRAMP for public sector, and clear data residency guarantees. Validate these before signing.

3. Lock-in risk. Once an AI agent is trained on your incident history, runbooks, and tribal knowledge, switching costs go up materially. This is a feature for Resolve and a negotiation point for buyers. Ask about data portability and model export rights upfront.

4. The incumbent counterpunch. Datadog, Splunk (Cisco), and ServiceNow each have the data, the relationships, and the capital to ship competitive offerings. The question is whether they can ship credible autonomous agents fast enough to defend their accounts. Bet on the incumbent if your relationship is deep; bet on Resolve if you want the architecturally superior product.

What to Do This Quarter

For technical leaders:

  • Run a 90-day pilot on a non-customer-facing service tier. Pick something with high alert volume and low blast radius—internal tooling, batch pipelines, or staging environments. Measure MTTI, MTTR, and false-positive rate before and after.
  • Define your autonomous-action policy explicitly. Which runbooks can an agent execute without approval? Which require human-in-the-loop? Which are off-limits entirely? Write this down before the vendor does it for you.
  • Audit your telemetry foundations. Resolve—and every AI SRE—is only as good as the data it reasons over. If your logs are unstructured, your traces are sampled too aggressively, or your alert noise is 40% false positives, fix that first or in parallel.

For business leaders:

  • Calculate your incident cost. Fully loaded engineer hours per incident × incident volume × revenue impact during degradation. This is the number that makes the procurement conversation easy.
  • Add AI SRE to your 2026 capital plan. Whether you choose Resolve, an incumbent, or a hyperscaler, this category is moving from "evaluate" to "deploy" inside the next 12 months. Budget accordingly.
  • Don't let the security team block the conversation. They will have legitimate concerns. Address them with vendor security reviews, not by punting the decision. The cost of waiting is measurable in toil, attrition, and outage minutes.

The Bigger Pattern

Resolve AI's $1.5B valuation is one data point in a larger pattern: enterprise AI is consolidating around agents that do specific, high-value work in specific, high-stakes domains. Cursor in code. Harvey in legal. Hippocratic in clinical. Resolve in production operations.

The losers in this pattern are the horizontal "AI assistant for everything" plays that look impressive in demos and produce ambiguous ROI in production. The winners are vertical agents with deep domain models, defensible data flywheels, and a clear answer to the question every CFO eventually asks: what does this replace, and what does it cost compared to what it replaces?

For Resolve, the answer is increasingly clean. It replaces the toil tax that every modern engineering organization pays. The cost is a fraction of the engineering capacity it returns.

That's a story enterprise buyers respond to in any market. In a market where engineering headcount is being scrutinized harder than at any point in the last decade, it's a story that closes deals.

Bottom Line


When a production alert fires, the platform:

  1. Triages the alert against historical incident patterns and current system state
  2. Investigates autonomously by querying logs, metrics, traces, dashboards, code repositories, and dependency graphs
  3. Diagnoses the root cause by reasoning across signals the way a senior engineer would
  4. Takes action—with human involvement scaled to the risk and operational context

For low-risk runbook execution (restarting a stuck pod, rotating a credential, scaling a service), the agent can act autonomously. For high-risk changes (database failover, traffic rerouting, customer-facing config changes), it surfaces a recommendation and waits for an engineer to approve.
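The risk-scaled model described above can be sketched as a simple policy table. A minimal sketch: the action names, tiers, and function signatures below are hypothetical illustrations, not Resolve AI's actual catalog or API.

```python
from enum import Enum

class Risk(Enum):
    LOW = "low"    # e.g. restarting a stuck pod, rotating a credential
    HIGH = "high"  # e.g. database failover, traffic rerouting

# Hypothetical policy table mapping runbook actions to risk tiers.
ACTION_RISK = {
    "restart_pod": Risk.LOW,
    "rotate_credential": Risk.LOW,
    "scale_service": Risk.LOW,
    "db_failover": Risk.HIGH,
    "reroute_traffic": Risk.HIGH,
}

def dispatch(action: str, approved_by_human: bool = False) -> str:
    """Execute low-risk actions autonomously; gate high-risk ones on approval."""
    # Unknown actions default to HIGH -- fail closed, not open.
    tier = ACTION_RISK.get(action, Risk.HIGH)
    if tier is Risk.LOW or approved_by_human:
        return f"executing {action}"
    return f"awaiting approval for {action}"
```

The important design choice is the default: an action the policy has never seen should wait for a human, not execute.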

The architectural distinction worth noting: this is not a chatbot bolted onto Datadog. The system is designed as a multi-agent orchestrator that operates tools, executes investigation plans, and reasons through ambiguity—closer in spirit to how Cursor changed the IDE category than to a Slack bot wrapping an existing dashboard.

The Founders and Why This Bet Matters

Resolve AI was founded by Spiros Xanthos (CEO) and Mayank Agarwal. Xanthos previously founded Omnition (acquired by Cisco) and was a senior leader at Splunk's observability business, where he scaled the SignalFx acquisition and helped define the modern observability category. Agarwal brings deep distributed systems and production engineering experience.

This is not a team learning observability while building an AI startup. They built the data category their AI now reasons over. That matters because the hardest problem in AI SRE is not the model—it's the messy, fragmented, organization-specific operational data the model has to navigate.

Adding Dhruv Mahajan as Chief AI Scientist locks in the other half of the equation: post-training expertise from one of the most aggressive open-weights model programs at Meta. Resolve is signaling that it intends to train and post-train its own production-specific models—not just wrap GPT-5 or Claude with prompt engineering.

Why This Funding Round Tells You Something Real

In a market where every AI startup is raising at frothy valuations, Resolve's progression is unusually grounded:

Round               Date          Amount  Valuation  Lead
Seed + early        2024–2025     ~$25M   n/a        Greylock, others
Series A            Feb 2026      $125M   $1.0B      Lightspeed
Series A Extension  Apr 16, 2026  $40M    $1.5B      DST Global, Salesforce Ventures

A 50% valuation step-up in 10 weeks—from a Series A that already minted unicorn status—signals that paying enterprise customers are expanding contracts, not just signing logos. DST Global rarely leads at this stage; their participation suggests revenue at a multiple that justifies the markup.

Salesforce Ventures' lead position is the more interesting signal. Salesforce is both an investor and a customer, which means Resolve is operational inside one of the largest enterprise SaaS infrastructures on the planet. That's a reference architecture that will sell itself to every CIO who's ever managed a Salesforce-scale incident.

The Customer Math: Why CFOs Should Care

For business leaders trying to understand whether AI SRE belongs on the capital plan, the publicly disclosed metrics tell a clean ROI story.

Coinbase: 72% reduction in time-to-investigate critical incidents.

Coinbase runs a 24/7 financial system where every minute of incident time has direct revenue impact, regulatory exposure, and customer trust cost. A 72% reduction in mean time to investigate (MTTI) doesn't just save engineering hours—it compresses the window during which a degraded system is bleeding revenue or accumulating compliance risk.

Zscaler: 30% reduction in engineers required per incident.

For a security infrastructure company running global traffic for thousands of enterprises, an incident typically pulls together SREs, security engineers, network engineers, and product engineers into a war room. Removing one in three of those bodies per incident is meaningful headcount leverage on the most expensive engineers in the building.

Translating to dollars: A typical enterprise SRE in the U.S. is fully loaded at $300K–$450K. A team of 30 SREs costs roughly $9M–$13.5M annually. If AI SRE removes even 25% of toil—the unglamorous incident investigation, paging, and runbook work—that's $2.3M–$3.4M of recovered capacity per year. For Fortune 500 organizations running 100+ SREs, the math scales linearly. Resolve AI's annual contract value is almost certainly less than the toil it removes, which is why the customer math closes.

This is the calculation CFOs should walk into the conversation with: what is your current MTTI, what is your current incidents-per-week volume, and what is your fully loaded SRE cost per hour? Three numbers are enough to size the opportunity without needing a vendor pitch deck.
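The back-of-envelope math above fits in two functions. A sketch using the article's own figures as sample inputs; function names and the specific sample values are illustrative, not vendor benchmarks.

```python
def recovered_capacity(team_size: int, loaded_cost_per_sre: float,
                       toil_recovered_pct: float) -> float:
    """Annual engineering capacity (in dollars) an AI SRE returns to the team."""
    return team_size * loaded_cost_per_sre * toil_recovered_pct

def annual_incident_cost(mtti_hours: float, incidents_per_week: float,
                         cost_per_eng_hour: float,
                         engineers_per_incident: float) -> float:
    """Yearly labor cost of incident investigation from the three CFO inputs."""
    return (mtti_hours * engineers_per_incident * cost_per_eng_hour
            * incidents_per_week * 52)

# 30 SREs at a $350K midpoint, 25% of toil recovered:
capacity = recovered_capacity(30, 350_000, 0.25)

# 4-hour MTTI, 20 incidents/week, $200/hr, 5 engineers per war room:
incident_cost = annual_incident_cost(4, 20, 200, 5)
```

Either number alone usually exceeds a plausible annual contract value, which is the point of walking in with your own inputs.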

Why This Category Is Suddenly Real

AI for production operations has been a "next year" category for five years. It became a real category in 2026 for three converging reasons:

1. Frontier models finally reason well enough across noisy operational data. GPT-5, Claude Opus 4.6, and Gemini 2.5 can hold long context, follow multi-step investigation plans, and reason through partial information without hallucinating fixes. The 2024-era LLMs could summarize an alert; the 2026 frontier models can investigate one.

2. Telemetry standardization caught up. OpenTelemetry has consolidated as the default emit format for metrics, logs, and traces across cloud-native stacks. That gives AI agents a stable substrate to reason over instead of negotiating with 30 vendor-specific schemas. Notably, Xanthos was an early OpenTelemetry contributor—Resolve's product surface is built on the standard he helped author.

3. The toil problem has gotten worse, not better. Enterprise infrastructure complexity has compounded as organizations adopt Kubernetes, multi-cloud, microservices, and now agentic AI workloads. Engineers report that 40–60% of their time goes to incident response, on-call, and reactive work. AI SRE is the only credible path to reduce that without cutting the team.
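The substrate argument in reason 2 is easier to see in miniature: once every vendor payload is normalized into one OTel-style record, an agent reasons over a single shape instead of thirty. The record fields and vendor payloads below are invented for illustration—the real OTLP data model is considerably richer.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    """A minimal OTel-style record: one shape for logs, metrics, and traces.
    Field names loosely echo OpenTelemetry conventions; this is an
    illustration, not the actual OTLP schema."""
    service: str
    kind: str          # "log" | "metric" | "trace"
    timestamp: float
    attributes: dict

def from_vendor_a(raw: dict) -> Signal:
    # Hypothetical vendor A payload: {"svc": ..., "ts": ..., "msg": ...}
    return Signal(raw["svc"], "log", raw["ts"], {"message": raw["msg"]})

def from_vendor_b(raw: dict) -> Signal:
    # Hypothetical vendor B payload: {"app": ..., "time": ..., "value": ...}
    return Signal(raw["app"], "metric", raw["time"], {"value": raw["value"]})
```

Every adapter the platform no longer has to maintain is investigation logic it can reuse across customers.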

How Resolve Compares: The Competitive Map

For CIOs and VPs of Engineering scoping a vendor evaluation, the AI SRE landscape splits into four buckets:

Pure-play AI SRE startups: Resolve AI, Cleric, Aisera (AIOps roots), Parity, and a handful of stealth-mode entrants. Resolve is currently the most-funded and has the most visible enterprise reference customers in the cohort.

Observability incumbents adding AI: Datadog (Bits AI), New Relic (NRQL AI), Splunk (now Cisco), and Honeycomb. These have the data—they're racing to add agentic capability before pure-plays steal the relationship.

ITSM incumbents adding AI: ServiceNow (incident routing AI), PagerDuty (Incident Workflows), Atlassian (Jira Service Management AI). Strong on workflow, weaker on autonomous investigation.

Hyperscaler-native: AWS DevOps Guru, Azure AI Operations, Google Cloud Operations Suite. Locked to a single cloud, which is fine for single-cloud shops and disqualifying for everyone else.

Where Resolve wins: Multi-cloud, multi-vendor, deep autonomous investigation, founders who built the observability category. Where incumbents win: existing data gravity, enterprise procurement relationships, and bundling. The next 18 months will determine whether the data gravity advantage of Datadog/Splunk overwhelms the architectural advantage of pure-play agents.

Resolve AI Labs: The Strategic Tell

The launch of Resolve AI Labs is the most strategically interesting part of this announcement. Most AI startups at this stage are spending capital on go-to-market. Resolve is spending capital on building production-specific foundation models.

Labs will focus on:

  • Domain-specific model post-training on operational telemetry corpora
  • Multi-signal reasoning across logs, metrics, traces, infrastructure events, and code
  • Reliability evaluation frameworks—measurable accuracy benchmarks for production AI
  • Synthetic data generation and simulated production environments for safe agent training
  • Governance, guardrails, and safe-action policies for autonomous remediation

This is the same playbook Cursor ran in code editing, Harvey ran in legal, and Hippocratic AI is running in clinical care: don't compete on the foundation model layer—compete on domain-specific post-training and the proprietary data flywheel.

For technical leaders, this signals Resolve intends to be defensible against an Anthropic or OpenAI deciding to ship "Claude SRE" next year. For business leaders, it signals the company is building moats that justify the valuation rather than coasting on hype.

The Risks Worth Naming

This is not a slam-dunk category, and any CIO conducting due diligence should pressure-test the following:

1. Autonomous remediation is a trust-and-blast-radius problem. Letting an AI agent restart a service is fine. Letting it failover a database in production is a career-ending decision the first time it gets it wrong. Resolve's "human involvement scaled to risk" model is the right design—but operationalizing the risk classifier is where this either succeeds at scale or generates a high-profile incident that sets the category back two years.

2. Data residency and compliance. AI agents reasoning over production logs are reading some of the most sensitive data in the company: customer PII, payment flows, internal credentials, and proprietary business logic. Enterprise contracts will require SOC 2 Type II, HIPAA where relevant, FedRAMP for public sector, and clear data residency guarantees. Validate these before signing.

3. Lock-in risk. Once an AI agent is trained on your incident history, runbooks, and tribal knowledge, switching costs go up materially. This is a feature for Resolve and a negotiation point for buyers. Ask about data portability and model export rights upfront.

4. The incumbent counterpunch. Datadog, Splunk (Cisco), and ServiceNow each have the data, the relationships, and the capital to ship competitive offerings. The question is whether they can ship credible autonomous agents fast enough to defend their accounts. Bet on the incumbent if your relationship is deep; bet on Resolve if you want the architecturally superior product.

What to Do This Quarter

For technical leaders:

  • Run a 90-day pilot on a non-customer-facing service tier. Pick something with high alert volume and low blast radius—internal tooling, batch pipelines, or staging environments. Measure MTTI, MTTR, and false-positive rate before and after.
  • Define your autonomous-action policy explicitly. Which runbooks can an agent execute without approval? Which require human-in-the-loop? Which are off-limits entirely? Write this down before the vendor does it for you.
  • Audit your telemetry foundations. Resolve—and every AI SRE—is only as good as the data it reasons over. If your logs are unstructured, your traces are sampled too aggressively, or your alert noise is 40% false positives, fix that first or in parallel.
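The before/after measurement in the first bullet can be a few lines of Python. A sketch assuming each incident is recorded as (detected, diagnosed, resolved) timestamps in minutes—that tuple layout is an assumption for illustration, not a standard incident schema.

```python
from statistics import mean

def mtti_mttr(incidents: list[tuple[float, float, float]]) -> tuple[float, float]:
    """Each incident: (detected, diagnosed, resolved) timestamps in minutes.
    MTTI = mean time from detection to diagnosis;
    MTTR = mean time from detection to resolution."""
    mtti = mean(diagnosed - detected for detected, diagnosed, _ in incidents)
    mttr = mean(resolved - detected for detected, _, resolved in incidents)
    return mtti, mttr
```

Run it against 90 days of incidents before the pilot and the same window during it; the delta is the number you report.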

For business leaders:

  • Calculate your incident cost. (Fully loaded engineer hours per incident × cost per engineer hour + revenue impact during degradation) × incident volume. This is the number that makes the procurement conversation easy.
  • Add AI SRE to your 2026 capital plan. Whether you choose Resolve, an incumbent, or a hyperscaler, this category is moving from "evaluate" to "deploy" inside the next 12 months. Budget accordingly.
  • Don't let the security team block the conversation. They will have legitimate concerns. Address them with vendor security reviews, not by punting the decision. The cost of waiting is measurable in toil, attrition, and outage minutes.

The Bigger Pattern

Resolve AI's $1.5B valuation is one data point in a larger pattern: enterprise AI is consolidating around agents that do specific, high-value work in specific, high-stakes domains. Cursor in code. Harvey in legal. Hippocratic in clinical. Resolve in production operations.

The losers in this pattern are the horizontal "AI assistant for everything" plays that look impressive in demos and produce ambiguous ROI in production. The winners are vertical agents with deep domain models, defensible data flywheels, and a clear answer to the question every CFO eventually asks: what does this replace, and what does it cost compared to what it replaces?

For Resolve, the answer is increasingly clean. It replaces the toil tax that every modern engineering organization pays. The cost is a fraction of the engineering capacity it returns.

That's a story enterprise buyers respond to in any market. In a market where engineering headcount is being scrutinized harder than at any point in the last decade, it's a story that closes deals.

Bottom Line

Resolve AI's $190M, $1.5B valuation, and customer roster confirm that AI SRE is a real, fundable, deployable enterprise category in 2026. CIOs who haven't started a pilot this quarter will be explaining the gap to their boards by Q4. CFOs who haven't modeled the toil-reduction ROI are leaving real capital on the table.

The technology is ready. The customer references are credible. The valuation is grounded in revenue, not hope. The only remaining variable is execution speed inside your own organization.


Want to calculate your own AI ROI? Try our AI ROI Calculator — takes 60 seconds and shows projected savings, payback period, and 3-year ROI.

THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe for weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.
