HP Chose Its AI Platform. 88% of Enterprises Won't Survive Theirs.

HP Inc. just scaled its OpenAI Frontier partnership across global operations after one engineer moved through 122 pull requests across 43 projects and a security team unlocked 82 hours per week of capacity. But the real story isn't HP — it's the enterprise AI platform war now forcing every CIO to make a decade-defining vendor decision. With 88% of AI proofs of concept never reaching production and 40% of agentic AI projects heading for cancellation, this article delivers the evaluation matrix and pilot-to-production readiness assessment every enterprise needs before committing.

By Rajesh Beri · July 1, 2026 · 15 min read

On June 28, 2026, HP Inc. quietly announced what may become the most consequential enterprise technology decision of the year. The company is scaling its OpenAI Frontier strategic partnership across global operations — customer-facing experiences, software development, cybersecurity, partner ecosystems, and internal workflows — after a four-month evaluation period that began in February 2026.

The numbers that convinced HP's leadership are specific. One engineer used OpenAI models to move through 122 pull requests across 43 different projects in weeks. A security team remediated software bugs in a single day that would have otherwise taken a month — unlocking an estimated 82 hours per week of security-team capacity. HP's channel ecosystem, with more than 80% of its business flowing through 100,000+ partners, will now run AI agents across store, partner, chat, and voice experiences through a single platform.

"With OpenAI there is an opportunity to fundamentally rethink how AI can deliver better outcomes," said Prakash Arunkundrum, HP's chief strategy and transformation officer. "It reflects the ambition of our AI strategy to deliver real-world outcomes at scale."

But the HP announcement is not, fundamentally, a story about HP. It is a signal of a decision that every Fortune 500 CIO will face — or has already faced — in 2026: which enterprise AI platform do you bet your company on?

And the data suggests that the vast majority will get it wrong.

The Platform War Nobody Prepared For

The enterprise AI platform market in mid-2026 is a five-front war. Each major vendor has built or acquired a platform designed to become the orchestration layer for autonomous AI agents across the enterprise:

  • OpenAI Frontier (launched February 2026): A standalone agentic-AI platform for building, deploying, and managing AI agents with shared business context, identity and permissions, governance, and evaluation. Custom enterprise pricing. HP is among the first Fortune 100 adopters at scale.

  • Salesforce Agentforce: Annual recurring revenue hit $800 million in fiscal year 2026, a 169% year-over-year increase. 29,000 deals closed since launch. 9,500 paid deals by Q3 FY26. Salesforce's own internal deployment achieves an 83% autonomous resolution rate for customer service queries. The deepest CRM integration in the market — but governance extends only as far as the Salesforce ecosystem.

  • Microsoft Copilot Studio: Over 120,000 custom Copilot agents deployed across enterprises by Q1 2026. Priced at $21-30 per user per month. The natural choice for Microsoft 365 organizations, with one caveat: as one analysis noted, "Copilot Studio governs custom agents only within the Microsoft boundary."

  • Google Gemini Enterprise Agent Platform: Cloud-native agent infrastructure for Google Cloud organizations. Leverages Gemini models with deep integration into Google Workspace and BigQuery. Best fit for organizations already standardized on Google Cloud's data estate.

  • Amazon Bedrock Agents and AgentCore: AWS-native agent deployment and orchestration. Natural fit for organizations with existing AWS infrastructure. AgentCore focuses on production deployment patterns with built-in observability.

Each platform is designed to become the connective tissue between your data, your people, and your autonomous agents. Each creates a gravity well that makes switching progressively harder. And each vendor is betting that the platform decision — not the model decision — is what creates durable competitive advantage.

"The teams that buy a single enterprise platform tend to be the ones whose buyer is already standardized on the bundle vendor," observed one industry analysis. "Salesforce shops buy Agentforce. Microsoft shops buy Copilot Studio."

The problem is that this decision is being made with the same rigor enterprises applied to choosing a collaboration tool in 2020 — which is to say, not much rigor at all.

The 80% Failure Rate Nobody Is Talking About

HP's successful pilot-to-production trajectory is notable precisely because it is rare. The data on enterprise AI deployment success rates in 2026 is brutal:

  • 88% of AI proofs of concept never reach production (IDC)
  • 80% of AI projects fail, double the rate of non-AI IT projects (RAND Corporation)
  • 42% of companies abandoned most AI initiatives by mid-2025, up from 17% the prior year (industry surveys)
  • 56% of CEOs report no financial impact from AI investment (PwC 2026 CEO Survey)
  • 40%+ of agentic AI projects will be canceled by end of 2027 (Gartner)

Read that last number again. Gartner is predicting that more than 40% of all agentic AI projects, the exact category HP just bet on, will be canceled within 18 months. The primary drivers: escalating costs, unclear business value, and inadequate risk controls.

The enterprise AI adoption rate did double in 2026 — to 24%, up from 12% in 2025. But only 31% of enterprises have even one AI agent in production. And only 25% have moved at least 40% of their AI experiments into production environments.

This is the landscape in which HP's decision to go all-in on a single platform vendor is being made. It is also the landscape in which every other enterprise is making — or avoiding — the same decision.

What HP Actually Did Differently

The HP case study is instructive not because of the vendor they chose, but because of the process they followed. Strip away the press release language and three structural decisions emerge:

1. They evaluated at the platform level, not the model level.

HP didn't choose OpenAI because GPT-5.6 scored higher on a benchmark. They chose Frontier because it offered an operating model — "connecting access, context, deployment, and evaluation" — that could scale from pilots to production. As OpenAI's CRO Denise Dresser put it, HP was "turning early value from OpenAI APIs and tools like ChatGPT and Codex into repeatable systems."

This distinction matters. Most enterprises are still making model-level decisions — comparing GPT-5.6 vs Claude Opus vs Gemini on accuracy benchmarks — when the actual strategic question is which platform will govern, deploy, monitor, and evaluate their agents at scale.

2. They started with measurable pilot outcomes before scaling.

The four-month evaluation period from February to June 2026 wasn't a sandbox experiment. HP ran pilots with specific, measurable outcomes: 122 PRs across 43 projects in software engineering, and single-day remediation of security bugs that would otherwise have taken a month. These weren't "exploring AI" exercises; they were production-adjacent workloads with clear before-and-after metrics.

This is the pattern that separates the 12% that reach production from the 88% that don't. The botsitting research shows that employees save 11 hours per week with AI but waste 6.4 hours babysitting it. HP's pilots measured net productivity, not gross capability.
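The gap between gross and net productivity is simple arithmetic, sketched below using the two botsitting figures quoted above. The function name is hypothetical, and the calculation assumes both figures are hours per employee per week.

```python
def net_hours_saved(gross_hours_saved: float, oversight_hours: float) -> float:
    """Weekly hours saved after subtracting time spent supervising the AI."""
    return gross_hours_saved - oversight_hours


# Figures quoted in the article: 11 hours saved, 6.4 hours spent babysitting.
GROSS_HOURS = 11.0
OVERSIGHT_HOURS = 6.4

net = net_hours_saved(GROSS_HOURS, OVERSIGHT_HOURS)
share_lost = OVERSIGHT_HOURS / GROSS_HOURS

print(f"Net hours saved per week: {net:.1f}")
print(f"Share of gross savings lost to oversight: {share_lost:.0%}")
```

On these numbers, more than half of the headline saving goes to oversight, which is why a pilot that reports only gross hours saved can overstate its case.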

3. They treated governance as an adoption accelerator, not a barrier.

HP's press release repeatedly mentions governance: "data integration, governance, and security" as enterprise standards, Frontier as a layer for "how actions are governed." This is the opposite of the pattern behind the 88% of enterprises that have had AI agent security incidents. Most organizations deploy agents first and govern later. HP built governance into the platform selection criteria.

The Lock-In Trap: 19-34% Switching Costs

Every enterprise AI platform creates lock-in. The question isn't whether you'll be locked in — it's whether you've measured the cost.

Research from Swfte AI estimates that AI vendor switching costs range from 19% to 34% of total deployment cost — encompassing direct migration expenses, productivity loss during transition, retraining, and the value of lost institutional knowledge. VaasBlock's analysis found that most enterprises aren't measuring these costs at all.
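To make the 19% to 34% range concrete, here is a minimal sketch that applies it to a deployment budget. The $5 million figure is a hypothetical input chosen for illustration, not a number from HP or from the cited research, and the function name is likewise an assumption.

```python
# Switching-cost range quoted in the article, as fractions of total deployment cost.
SWITCHING_COST_LOW = 0.19
SWITCHING_COST_HIGH = 0.34


def switching_cost_range(total_deployment_cost: float) -> tuple[float, float]:
    """Return the (low, high) estimated cost of leaving a platform."""
    return (
        total_deployment_cost * SWITCHING_COST_LOW,
        total_deployment_cost * SWITCHING_COST_HIGH,
    )


# Hypothetical example: a $5M total deployment cost.
low, high = switching_cost_range(5_000_000)
print(f"Estimated switching cost: ${low:,.0f} to ${high:,.0f}")
```

Running the same calculation against your own projected deployment cost gives a figure to put next to the exit terms in the contract negotiation.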

The lock-in mechanics differ by platform:

  • Data lock-in: Your enterprise data flows through the vendor's context layer. Agent behaviors, evaluation data, and usage patterns accumulate in formats that don't port cleanly. The longer you run, the deeper the moat.

  • Workflow lock-in: As agents integrate with your CRM, ERP, ITSM, and custom systems through the platform's connectors, migrating means rebuilding every integration.

  • Knowledge lock-in: Agent behaviors, prompt engineering, evaluation criteria, and institutional knowledge get encoded in platform-specific formats. This is the hardest to replicate.

  • Identity and governance lock-in: Permissions, audit trails, and compliance configurations built on one platform's identity model don't transfer to another.

HP made a deliberate bet. With more than 80% of its business flowing through partners and 100,000+ partners on its portal, the scale of lock-in is enormous. But so is the cost of perpetual pilot purgatory.

The lesson isn't "avoid lock-in." It's "measure lock-in, negotiate exit terms, and choose with open eyes." Salesforce's own Connectivity Report found that 50% of AI agents currently operate in isolated silos. The irony: organizations that refuse to commit to a platform to avoid lock-in often end up with fragmented agent sprawl that creates a different, more expensive form of lock-in — one with no governance at all.

Framework #1: The Enterprise AI Platform Evaluation Matrix

HP evaluated OpenAI Frontier across "technical capabilities, use cases, and strategic alignment." Most enterprises don't have a structured framework for this decision. Here is one.

Score each platform you're evaluating on a 1-5 scale across these 12 dimensions. Weight each dimension based on your organization's priorities.

Dimension | What to Evaluate | Weight Guide
1. Agent Lifecycle Management | Can the platform handle agent creation, deployment, monitoring, versioning, and retirement in one workflow? | Critical for orgs planning 10+ agents
2. Model Flexibility | Can you swap underlying models (GPT, Claude, Gemini, open-weight) without rebuilding agents? | Critical if you need model-agnostic architecture
3. Enterprise Data Integration | How deeply does the platform connect to your existing data estate (CRM, ERP, data warehouse, knowledge bases)? | Weight by data estate complexity
4. Identity and Permissions | Per-agent identity credentials, granular permissions, audit trails traceable to human sponsors? | Critical for regulated industries
5. Governance and Compliance | Runtime enforcement (not just flagging), policy-as-code, regulatory framework alignment (EU AI Act, state laws)? | Critical for EU-facing orgs (August 2, 2026 deadline)
6. Observability | Real-time monitoring of all agent actions, behavioral baselines, anomaly detection? | Critical for production agents with data access
7. Evaluation Framework | Built-in tools to measure agent performance, accuracy, cost per task, business impact? | Critical for proving ROI
8. Multi-Agent Orchestration | Can agents coordinate, hand off tasks, and communicate with each other under governed protocols? | Important for complex workflows
9. Security Architecture | Supply chain security, prompt injection defenses, output filtering, adversarial testing? | Non-negotiable
10. Ecosystem and Integrations | Pre-built connectors to your existing tech stack (ServiceNow, Jira, Slack, SAP, etc.)? | Weight by integration count
11. Pricing Transparency | Can you forecast costs at scale? Per-agent, per-token, per-user, or consumption-based? | Critical for budget predictability
12. Exit Terms and Portability | Contractual data portability, agent export formats, knowledge migration support? | Critical: 19-34% switching costs if missed

Scoring guidance:

  • Total Score 48-60 (Strong fit): Platform aligns with your infrastructure, governance requirements, and strategic priorities. Proceed to pilot.
  • Total Score 36-47 (Moderate fit): Gaps exist but may be addressable. Identify the specific gaps and assess whether the vendor's roadmap closes them within your deployment timeline.
  • Total Score 24-35 (Weak fit): Significant misalignment. The platform may work for narrow use cases but will create friction at enterprise scale.
  • Total Score <24 (Poor fit): Do not deploy. The governance, integration, or security gaps will compound as you scale.
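The scoring bands above assume a 12 to 60 total, which is what an unweighted sum of twelve 1-5 scores produces. The article does not say how to combine weights with those bands, so the sketch below makes one assumption: take the weighted average of the scores and rescale it to the same 12 to 60 range. The short dimension keys, the weights, and the sample scores are all hypothetical.

```python
DIMENSIONS = [
    "lifecycle", "model_flexibility", "data_integration", "identity",
    "governance", "observability", "evaluation", "orchestration",
    "security", "ecosystem", "pricing", "exit_terms",
]


def weighted_total(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted average of 1-5 scores, rescaled to the 12-60 band range.

    Assumption: rescaling by the number of dimensions keeps the article's
    bands usable; with equal weights this equals the plain sum.
    """
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name}: score must be between 1 and 5")
    weight_sum = sum(weights[name] for name in DIMENSIONS)
    average = sum(scores[name] * weights[name] for name in DIMENSIONS) / weight_sum
    return average * len(DIMENSIONS)


def fit_band(total: float) -> str:
    """Map a total onto the article's four scoring bands."""
    if total >= 48:
        return "Strong fit"
    if total >= 36:
        return "Moderate fit"
    if total >= 24:
        return "Weak fit"
    return "Poor fit"


# Hypothetical platform: excellent everywhere except security and exit terms.
scores = {name: 5 for name in DIMENSIONS}
scores["security"] = 1
scores["exit_terms"] = 1

equal_weights = {name: 1.0 for name in DIMENSIONS}
risk_weights = dict(equal_weights, security=3.0, exit_terms=3.0)

unweighted = weighted_total(scores, equal_weights)
weighted = weighted_total(scores, risk_weights)
print(f"Unweighted: {unweighted:.0f} ({fit_band(unweighted)})")
print(f"Risk-weighted: {weighted:.0f} ({fit_band(weighted)})")
```

In this example the same platform drops from a strong fit to a moderate fit once the two weak dimensions are weighted three times as heavily, which is the kind of shift the weighting step is meant to surface.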

The HP signal: HP's evaluation took four months. They tested pilots with measurable outcomes before committing. If your enterprise AI platform evaluation took less than 90 days or involved no production-adjacent workloads, you haven't actually evaluated — you've chosen a vendor based on a demo.

Framework #2: The Pilot-to-Production Readiness Assessment

88% of AI proofs of concept never reach production. This 10-point assessment identifies where the failure will occur before you invest in scaling.

For each factor, assess your organization as Red (not ready), Yellow (partially ready), or Green (ready). You need 8+ Greens to scale successfully.

Factor 1: Business Ownership

  • 🟢 A named P&L owner has committed budget and headcount to the AI initiative
  • 🟡 A sponsor exists but budget comes from a shared IT pool
  • 🔴 The initiative is owned by "the AI team" or "innovation lab" with no P&L attachment

Factor 2: Success Metrics

  • 🟢 KPIs are defined, baselined, and measured before the pilot begins (e.g., "reduce resolution time from 14 minutes to 4 minutes")
  • 🟡 KPIs exist but baseline measurements haven't been taken
  • 🔴 Success is defined as "explore AI capabilities" or "increase productivity"

Factor 3: Data Readiness

  • 🟢 The data required by agents is clean, accessible via API, governed, and has documented lineage
  • 🟡 Data exists but requires ETL work, manual cleanup, or permission negotiations
  • 🔴 Agents will need data from systems with no API access, unclear ownership, or known quality issues

Factor 4: Integration Architecture

  • 🟢 Target systems have documented APIs, the platform has pre-built connectors, and integration has been tested
  • 🟡 APIs exist but connectors must be custom-built; integration is planned but untested
  • 🔴 Target systems are legacy with no API layer, or integration dependencies are undefined

Factor 5: Governance Framework

  • 🟢 Per-agent identity, permissions, monitoring, and incident response procedures are documented and tested
  • 🟡 Governance policies exist on paper but haven't been operationalized
  • 🔴 Governance is planned for "after we prove the concept works"

Factor 6: Change Management

  • 🟢 End users have been involved in pilot design, trained on the new workflow, and have a feedback channel
  • 🟡 Training is planned but hasn't started; users are aware but not involved
  • 🔴 The AI initiative is being built "for" users who haven't been consulted

Factor 7: Cost Model

  • 🟢 Per-agent and per-task costs are measured during the pilot; total cost of ownership (including human oversight) is modeled for production scale
  • 🟡 Token costs are tracked but human oversight, integration maintenance, and governance costs are not
  • 🔴 Costs are estimated based on vendor pricing pages, not actual usage patterns

Factor 8: Rollback Capability

  • 🟢 The previous workflow can be restored within hours; agents can be halted individually without affecting other systems
  • 🟡 Rollback is possible but would take days and require manual intervention
  • 🔴 No rollback plan exists, or reverting would require rebuilding the previous workflow from scratch

Factor 9: Security Validation

  • 🟢 Agent actions, data access patterns, and output quality have been red-teamed; supply chain dependencies are inventoried
  • 🟡 Basic security review has been completed but no adversarial testing
  • 🔴 Security review is scheduled for after the pilot proves value

Factor 10: Scaling Architecture

  • 🟢 The pilot was designed on the same platform, with the same governance, that production will use; scaling requires configuration, not re-architecture
  • 🟡 The pilot platform differs from the production target but migration path is documented
  • 🔴 The pilot was built on a different platform, with different integrations, than production will require

Scoring:

  • 8-10 Greens: Ready to scale. Your pilot was designed for production from day one.
  • 5-7 Greens: Proceed with caution. Address the Yellow/Red factors before committing production budget. Each unresolved factor increases failure probability by approximately 15-20%.
  • 3-4 Greens: Not ready. You're in the 88% that will stall. Fix the Reds before spending another dollar on the pilot.
  • 0-2 Greens: This isn't a pilot — it's a demo. Start over with a production-first design.
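The assessment can be tallied in a few lines. The sketch below counts Greens, applies the four bands above, and lists every factor that is not yet Green so the gaps are explicit. The function name, the lowercase rating strings, and the sample ratings are assumptions made for illustration.

```python
FACTORS = [
    "Business Ownership", "Success Metrics", "Data Readiness",
    "Integration Architecture", "Governance Framework", "Change Management",
    "Cost Model", "Rollback Capability", "Security Validation",
    "Scaling Architecture",
]
VALID_RATINGS = {"green", "yellow", "red"}


def assess_readiness(ratings: dict[str, str]) -> tuple[str, list[str]]:
    """Return (verdict, factors that are not yet green) for the ten factors."""
    unrated = [f for f in FACTORS if ratings.get(f) not in VALID_RATINGS]
    if unrated:
        raise ValueError(f"Missing or invalid ratings: {unrated}")

    greens = sum(1 for f in FACTORS if ratings[f] == "green")
    gaps = [f for f in FACTORS if ratings[f] != "green"]

    if greens >= 8:
        verdict = "Ready to scale"
    elif greens >= 5:
        verdict = "Proceed with caution"
    elif greens >= 3:
        verdict = "Not ready"
    else:
        verdict = "Demo, not a pilot"
    return verdict, gaps


# Hypothetical pilot: seven Greens, two Yellows, one Red.
sample = {factor: "green" for factor in FACTORS}
sample["Governance Framework"] = "yellow"
sample["Cost Model"] = "yellow"
sample["Rollback Capability"] = "red"

verdict, gaps = assess_readiness(sample)
print(verdict)
print("Address before scaling:", ", ".join(gaps))
```

Returning the list of gaps alongside the verdict keeps the focus on which specific factors to fix rather than on the count alone.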

The Decision That Defines the Decade

The enterprise AI platform decision is not a technology decision. It is an operating model decision — on par with choosing your cloud provider, your ERP system, or your CRM platform. Those decisions took years to evaluate and decades to play out. The AI platform decision is being made in quarters.

HP took four months. They ran production-adjacent pilots. They measured specific outcomes. They evaluated governance alongside capability. They committed to a vendor knowing the lock-in implications.

That process — not the vendor they chose — is the model worth studying.

Because the data is clear: 100% of CIOs are budgeting for AI. Half already blew their budgets. 80% of AI projects fail. 88% of enterprises have had AI agent security incidents. And 40% of agentic AI projects will be canceled by the end of next year.

The enterprises that survive this wave won't be the ones that chose the best AI model. They'll be the ones that chose a platform, evaluated it rigorously, governed it from day one, and measured what mattered before they scaled.

HP just showed what that looks like. Now the question is whether anyone is paying attention.


Rajesh Beri is Head of AI Engineering at Zscaler. Views expressed are his own.


© 2026 Rajesh Beri. All rights reserved.

  • 🟡 Governance policies exist on paper but haven't been operationalized
  • 🔴 Governance is planned for "after we prove the concept works"

Factor 6: Change Management

  • 🟢 End users have been involved in pilot design, trained on the new workflow, and have a feedback channel
  • 🟡 Training is planned but hasn't started; users are aware but not involved
  • 🔴 The AI initiative is being built "for" users who haven't been consulted

Factor 7: Cost Model

  • 🟢 Per-agent and per-task costs are measured during the pilot; total cost of ownership (including human oversight) is modeled for production scale
  • 🟡 Token costs are tracked but human oversight, integration maintenance, and governance costs are not
  • 🔴 Costs are estimated based on vendor pricing pages, not actual usage patterns

Factor 8: Rollback Capability

  • 🟢 The previous workflow can be restored within hours; agents can be halted individually without affecting other systems
  • 🟡 Rollback is possible but would take days and require manual intervention
  • 🔴 No rollback plan exists, or reverting would require rebuilding the previous workflow from scratch

Factor 9: Security Validation

  • 🟢 Agent actions, data access patterns, and output quality have been red-teamed; supply chain dependencies are inventoried
  • 🟡 A basic security review has been completed, but no adversarial testing has been done
  • 🔴 Security review is scheduled for after the pilot proves value

Factor 10: Scaling Architecture

  • 🟢 The pilot was designed on the same platform, with the same governance, that production will use; scaling requires configuration, not re-architecture
  • 🟡 The pilot platform differs from the production target, but the migration path is documented
  • 🔴 The pilot was built on a different platform, with different integrations, than production will require

Scoring:

  • 8-10 Greens: Ready to scale. Your pilot was designed for production from day one.
  • 5-7 Greens: Proceed with caution. Address the Yellow/Red factors before committing production budget. Each unresolved factor increases failure probability by approximately 15-20%.
  • 3-4 Greens: Not ready. You're in the 88% that will stall. Fix the Reds before spending another dollar on the pilot.
  • 0-2 Greens: This isn't a pilot — it's a demo. Start over with a production-first design.
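
The tally above is simple enough to keep in a script next to the pilot plan. The sketch below is one possible encoding under stated assumptions: the names `Rating`, `readiness_verdict`, and `blockers` are hypothetical, and the verdict strings are shortened versions of the four bands in the scoring list.

```python
from enum import Enum
from typing import List, Mapping, Tuple


class Rating(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# The 10 factors of the readiness assessment.
FACTORS = (
    "Business Ownership",
    "Success Metrics",
    "Data Readiness",
    "Integration Architecture",
    "Governance Framework",
    "Change Management",
    "Cost Model",
    "Rollback Capability",
    "Security Validation",
    "Scaling Architecture",
)


def readiness_verdict(ratings: Mapping[str, Rating]) -> Tuple[int, str]:
    """Count Greens and map the count to the four scoring bands."""
    missing = [f for f in FACTORS if f not in ratings]
    if missing:
        raise ValueError(f"unrated factors: {missing}")
    greens = sum(1 for f in FACTORS if ratings[f] is Rating.GREEN)
    if greens >= 8:
        verdict = "Ready to scale"
    elif greens >= 5:
        verdict = "Proceed with caution"
    elif greens >= 3:
        verdict = "Not ready"
    else:
        verdict = "Demo, not a pilot"
    return greens, verdict


def blockers(ratings: Mapping[str, Rating]) -> List[str]:
    """List the factors still rated Yellow or Red, Reds first."""
    open_items = [f for f in FACTORS if ratings[f] is not Rating.GREEN]
    return sorted(open_items, key=lambda f: ratings[f] is not Rating.RED)


if __name__ == "__main__":
    # Hypothetical pilot: strong on ownership and metrics, weak on governance.
    ratings = {f: Rating.GREEN for f in FACTORS}
    ratings["Governance Framework"] = Rating.RED
    ratings["Cost Model"] = Rating.YELLOW
    ratings["Security Validation"] = Rating.YELLOW
    greens, verdict = readiness_verdict(ratings)
    print(f"{greens} Greens: {verdict}")
    print("Fix first:", blockers(ratings))
```

Listing the Reds ahead of the Yellows mirrors the advice in the scoring bands: the count tells you which band you are in, but the list of open factors is what goes on the remediation plan.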

The Decision That Defines the Decade

The enterprise AI platform decision is not a technology decision. It is an operating model decision — on par with choosing your cloud provider, your ERP system, or your CRM platform. Those decisions took years to evaluate and decades to play out. The AI platform decision is being made in quarters.

HP took four months. They ran production-adjacent pilots. They measured specific outcomes. They evaluated governance alongside capability. They committed to a vendor knowing the lock-in implications.

That process — not the vendor they chose — is the model worth studying.

Because the data is clear: 100% of CIOs are budgeting for AI. Half already blew their budgets. 80% of AI projects fail. 88% of enterprises have had AI agent security incidents. And Gartner predicts that more than 40% of agentic AI projects will be canceled by the end of 2027.

The enterprises that survive this wave won't be the ones that chose the best AI model. They'll be the ones that chose a platform, evaluated it rigorously, governed it from day one, and measured what mattered before they scaled.

HP just showed what that looks like. Now the question is whether anyone is paying attention.



Rajesh Beri is Head of AI Engineering at Zscaler. Views expressed are his own.

