AI Budget Crisis: Uber's 2026 Spend Gone by April

Uber burned its 2026 AI budget in 4 months. Microsoft exits Claude. Token costs spiral—what CFOs must do now to avoid budget blowouts.

By Rajesh Beri·May 27, 2026·7 min read
THE DAILY BRIEF

Enterprise AI · AI Costs · Token Economics · Budget Planning · Vendor Management

Enterprise AI adoption just hit a wall—and it's made of money. Uber Technologies burned through its entire 2026 AI budget in four months. Microsoft is winding down most of its Claude Code usage by June 30. GitHub is abandoning flat-rate pricing for usage-based billing. And one developer's monthly cost just jumped from €67 to €966.

This isn't a pricing adjustment. It's a full-blown budget crisis that's forcing CFOs and CTOs to rethink vendor contracts, usage policies, and ROI calculations they thought were locked in for the year.

The Numbers Behind the Blowout

Uber's AI spending trajectory tells the story. The company's CTO, Praveen Neppalli Naga, confirmed that Claude Code adoption surged from 32% to 84% of the company's 5,000-engineer organization in Q1 2026. Monthly API costs per engineer ranged from $500 to $2,000 as agentic AI workflows replaced traditional autocomplete usage.

The math is brutal. At the conservative end—$500 per engineer per month for 4,200 engineers (84% of 5,000)—that's $2.1 million per month, or $8.4 million over four months. At the high end, it's closer to $33.6 million. Either way, Uber is "back to the drawing board" on AI budgeting, according to Naga's statement to The Information.
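The range above can be reproduced with a few lines of arithmetic. This is a back-of-envelope sketch using only the figures quoted in the article. It assumes 84% adoption held for all four months, which likely overstates the total, since adoption was still ramping up from 32% during Q1. The function name is illustrative.

```python
# Figures quoted in the article.
ENGINEERS = 5_000
ADOPTION = 0.84
MONTHS = 4


def spend_estimate(cost_per_engineer_month: int) -> tuple[int, int]:
    """Return (monthly spend, four-month spend) in dollars."""
    # Simplifying assumption: adoption is flat at 84% for the whole period.
    active_engineers = round(ENGINEERS * ADOPTION)
    monthly = active_engineers * cost_per_engineer_month
    return monthly, monthly * MONTHS


low = spend_estimate(500)     # conservative end of the per-engineer range
high = spend_estimate(2_000)  # high end of the per-engineer range
print(low, high)
```

The spread between the two results is a factor of four, which is the budgeting problem in miniature: per-engineer cost is the dominant unknown.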

This isn't unique to Uber. Microsoft's Experiences and Devices division, which covers Windows, Microsoft 365, Outlook, Teams, and Surface, is shutting down most Claude Code usage by its fiscal year end on June 30. While Microsoft cited platform consolidation toward GitHub Copilot CLI as the primary driver, financial considerations also influenced the decision. In plain terms: token costs forced a decision that would otherwise have come more slowly.

Why Token Costs Exploded

The cost structure of frontier AI models explains the blowout. Tokens are the chunks of text a model reads and writes, and they are the unit vendors bill by: every prompt, every response, every long-context codebase analysis consumes them. According to Anthropic's official documentation, Claude Code costs an average of $6 per developer per day, with daily costs staying below $12 for 90% of users.

But that average obscures the tail risk. Agentic workflows—where developers delegate entire tasks to AI rather than just accepting autocomplete suggestions—consume far more tokens per session than single-turn completions. The unit economics that looked reasonable at pilot stage stop working at adoption stage.
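To see how far the tail sits from the average, the documented daily figures can be converted to a monthly number and compared with the $500 to $2,000 per engineer reported for Uber. This is a rough sketch; the 21 working days per month is an assumption, not a figure from either source.

```python
WORKING_DAYS_PER_MONTH = 21  # assumption, not from the article

# Daily figures quoted from Anthropic's documentation.
avg_monthly = 6 * WORKING_DAYS_PER_MONTH    # average developer
p90_monthly = 12 * WORKING_DAYS_PER_MONTH   # ceiling for 90% of developers

# Monthly per-engineer range reported for Uber.
uber_low, uber_high = 500, 2_000

# How many times the documented average the reported range represents.
multiples = (
    round(uber_low / avg_monthly, 1),
    round(uber_high / avg_monthly, 1),
)
print(avg_monthly, p90_monthly, multiples)
```

Under that assumption, even the low end of the reported range is about four times the documented average, and roughly double the 90th-percentile ceiling.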

The infrastructure cost driving token prices is no mystery. On-demand pricing for NVIDIA H100 GPUs ranges from $1.49 per hour on specialized providers to $6.98 per hour on Microsoft Azure. AI labs must run thousands of these GPUs simultaneously to serve enterprise customers at scale. Those costs flow directly into API token pricing.

The Vendor Economics Are Breaking

Here's the problem OpenAI and Anthropic are facing: the cost gap between American frontier models and cheaper alternatives is widening, not narrowing. AI benchmarking firm Artificial Analysis runs every major model through the same evaluations and tracks total cost. For each lab's most capable model:

  • Anthropic's Claude: $4,811
  • OpenAI's ChatGPT: $3,357
  • DeepSeek (Chinese): $1,071
  • Kimi (Chinese): $948
  • Zhipu GLM (Chinese): $544

Claude is nearly nine times more expensive than the cheapest Chinese alternative for the same workload. That's not a rounding error—it's a structural disadvantage that enterprise buyers are starting to notice.
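The multiples follow directly from the benchmark totals listed above. A short sketch, using only those five figures:

```python
# Total evaluation cost per lab's most capable model, as listed above (USD).
eval_cost_usd = {
    "Claude": 4811,
    "ChatGPT": 3357,
    "DeepSeek": 1071,
    "Kimi": 948,
    "Zhipu GLM": 544,
}

cheapest = min(eval_cost_usd.values())

# Each model's cost as a multiple of the cheapest one.
ratios = {name: round(cost / cheapest, 1) for name, cost in eval_cost_usd.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```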

On OpenRouter, a marketplace that lets developers access hundreds of AI models through a single interface, Chinese models went from about 1% of usage in 2024 to more than 60% in May 2026. The shift is accelerating.

Google is making the same case. At Google I/O 2026, CEO Sundar Pichai said "many companies are already blowing through their annual token budgets, and it's only May." If the largest Google Cloud customers shifted 80% of their workloads from frontier models to Gemini 3.5 Flash, Pichai said, they would save more than $1 billion annually.

Google can offer cheaper models for structural reasons OpenAI and Anthropic can't easily replicate. First, Google builds its own Tensor Processing Units, reducing dependence on third-party GPU pricing. Second, Google's developers were processing roughly half a trillion tokens per day inside its internal Antigravity platform by March 2026, with that figure surging past three trillion by mid-May. That internal scale creates a data flywheel that improves model efficiency and reduces per-token serving costs over time.

The GitHub Copilot Pricing Shock

GitHub's pricing change crystallizes the budget problem. Starting June 1, 2026, the GitHub Copilot AI coding assistant is moving from flat-rate subscriptions to usage-based billing. The change replaces premium request units with GitHub AI Credits tied to token consumption.

One developer reported their projected monthly cost rising from roughly €67 in April to around €966 under the new model. That's a 14x increase—and it removes predictability from enterprise budgets at exactly the moment those budgets are already under pressure.

For enterprises with hundreds or thousands of developers using Copilot, the math compounds fast. A company with 500 developers paying €67/month was spending €33,500 monthly. Under the new model, if even half of those developers end up at €966, that half alone costs €241,500 a month. Add the other 250 developers still at €67 and the total is about €258,000, roughly a 7.7x increase.
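The full blended bill for this scenario can be checked with a short sketch. The heavy half accounts for €241,500 on its own, and the sketch adds the remaining developers at the old €67 rate. The 50/50 split and the assumption that lighter users stay at €67 are illustrative, and the function name is hypothetical.

```python
def blended_monthly_cost(devs: int, heavy_share: float,
                         heavy_cost: int, light_cost: int) -> int:
    """Total monthly cost when a share of developers become heavy users."""
    heavy = round(devs * heavy_share)
    light = devs - heavy
    return heavy * heavy_cost + light * light_cost


before = 500 * 67                                  # flat-rate baseline
after = blended_monthly_cost(500, 0.5, 966, 67)    # half heavy, half unchanged
multiple = after / before

print(before, after, round(multiple, 1))
```

Changing `heavy_share` is the quickest way to stress-test a forecast, since the share of developers who move to agentic workflows drives the total far more than the flat-rate baseline does.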

What CFOs and CTOs Should Do Now

The token pricing crisis is forcing a fundamental reassessment of enterprise AI strategy. Here's what financial and technical leaders need to do:

1. Audit Current Usage Immediately

Don't wait for the next invoice. Pull API logs and analyze which teams, projects, and workflows are consuming the most tokens. You can't manage what you don't measure.
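A first-pass audit can be as simple as grouping token counts by team and by workflow. The sketch below uses made-up records, and the field names (`team`, `workflow`, `tokens`) are hypothetical. Real vendor exports will look different and need mapping first.

```python
from collections import Counter

# Hypothetical usage records; real exports will use other field names.
usage_log = [
    {"team": "payments", "workflow": "agentic-refactor", "tokens": 1_200_000},
    {"team": "payments", "workflow": "autocomplete", "tokens": 50_000},
    {"team": "maps", "workflow": "agentic-refactor", "tokens": 400_000},
    {"team": "maps", "workflow": "code-review", "tokens": 150_000},
]


def top_consumers(records, key, n=3):
    """Sum tokens per value of `key` and return the n largest consumers."""
    totals = Counter()
    for record in records:
        totals[record[key]] += record["tokens"]
    return totals.most_common(n)


by_team = top_consumers(usage_log, "team")
by_workflow = top_consumers(usage_log, "workflow")
print(by_team)
print(by_workflow)
```

Grouping by workflow as well as by team matters because the same team can be cheap on autocomplete and expensive on agentic tasks.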

2. Implement Tiered Model Strategies

The technique enterprises are deploying is called an "advisor model." A cheap open-source model handles the bulk of work as the default. When it hits a task it can't solve, it calls out to a frontier model from OpenAI or Anthropic for help. Databricks CEO Ali Ghodsi said enterprises using this approach "can curb costs really well."
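The routing logic behind this pattern can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: both models are stand-in functions, and the rule for when the cheap model "can't solve" a task (here, returning `None`) is an assumption. A real system would need its own failure signal, such as a failed test run or a low confidence score.

```python
from typing import Callable, Optional


def make_router(cheap: Callable[[str], Optional[str]],
                frontier: Callable[[str], str]):
    """Build a router that tries the cheap model first and escalates on failure."""
    stats = {"cheap": 0, "frontier": 0}

    def route(task: str) -> str:
        answer = cheap(task)
        if answer is not None:
            stats["cheap"] += 1
            return answer
        # The cheap model gave up, so pay for the frontier model.
        stats["frontier"] += 1
        return frontier(task)

    return route, stats


# Stand-in models for illustration only.
def cheap_stub(task: str) -> Optional[str]:
    return None if "refactor" in task else f"cheap:{task}"


def frontier_stub(task: str) -> str:
    return f"frontier:{task}"


route, stats = make_router(cheap_stub, frontier_stub)
first = route("rename variable")
second = route("refactor billing module")
print(first, second, stats)
```

Tracking the `stats` counters is the point of the design: the escalation rate is what determines whether the tiered setup actually saves money.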

3. Renegotiate Vendor Contracts

If you locked in annual contracts based on 2024 usage assumptions, those assumptions are now wrong. Get ahead of the Q2 budget review and renegotiate volume commitments, rate cards, or usage caps before you blow through allocated spend.

4. Set Hard Usage Caps

Token budgets should have guardrails. Implement spending caps at the team level, project level, or developer level. Uber didn't set hard limits and burned through a full year's budget in four months. Don't make the same mistake.
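A hard cap differs from an alert in that it refuses the spend rather than reporting it afterwards. The sketch below shows the idea at team level. The class and exception names are hypothetical, and a production version would also need persistence, period resets, and an override path.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push a team past its cap."""


class BudgetGuard:
    def __init__(self, caps_usd: dict[str, float]):
        self.caps = dict(caps_usd)
        self.spent = {team: 0.0 for team in caps_usd}

    def charge(self, team: str, cost_usd: float) -> float:
        """Record a charge and return the remaining budget, or refuse it."""
        projected = self.spent[team] + cost_usd
        if projected > self.caps[team]:
            # Reject before spending, so the cap is never crossed.
            raise BudgetExceeded(f"{team} would exceed its cap of ${self.caps[team]:.2f}")
        self.spent[team] = projected
        return self.caps[team] - projected


guard = BudgetGuard({"payments": 100.0})
remaining = guard.charge("payments", 60.0)

try:
    guard.charge("payments", 50.0)
    blocked = False
except BudgetExceeded:
    blocked = True

print(remaining, blocked, guard.spent)
```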

5. Evaluate Cost-Efficient Alternatives

If security and compliance allow, test Google's Gemini Flash, open-source models like Llama 3.3, or cost-optimized vendors like Cohere. DeepSeek's latest preview model matches or nearly matches OpenAI, Anthropic, and Google on coding, agentic, and knowledge benchmarks—at a fraction of the cost.

6. Model the Long-Term Trajectory

Token costs are falling year-over-year, but usage is rising faster. NVIDIA's Rubin platform targets a 10x reduction in inference token costs compared to its Blackwell architecture. Average cost per million tokens across major providers fell from roughly $10 to $2.50 in a single year, according to enterprise spending data from Ramp.

But that long-term trend doesn't solve the near-term problem. Enterprises that planned budgets around 2024 token rates are finding that agentic AI workflows at 2026 adoption levels consume multiples of what the spreadsheet projected.
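The tension between falling prices and rising usage reduces to one multiplication: spend changes by the price ratio times the usage growth. The sketch below uses the $10 to $2.50 price drop quoted above. The 10x usage growth is a hypothetical input, not a reported figure.

```python
def spend_multiple(old_price: float, new_price: float, usage_growth: float) -> float:
    """Change in total spend when price per token and token usage both move."""
    return (new_price / old_price) * usage_growth


# Usage growth at which spend stays flat after the quoted price drop.
break_even_growth = 10.0 / 2.50

flat = spend_multiple(10.0, 2.50, usage_growth=4.0)
agentic = spend_multiple(10.0, 2.50, usage_growth=10.0)  # hypothetical 10x usage

print(break_even_growth, flat, agentic)
```

With prices down 4x, any usage growth above 4x raises the bill, which is why a falling price curve alone does not protect a budget.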

The IPO Valuation Question

The enterprise cost strain is a real variable shaping the IPO math for OpenAI and Anthropic. Both companies are projected to go public at valuations north of $800 billion. Those numbers assume they'll hold market share and pricing power—that competitors can't easily catch up, and that enterprise customers will keep paying a premium because there's no real alternative.

But the data is pointing the other way. Cutting-edge AI is becoming abundant and cheap. If major customers hit budget ceilings and scale back usage, the growth rates both labs are projecting for the second half of 2026 become harder to sustain.

Anthropic is raising between $30 billion and $50 billion at a valuation of up to $950 billion, according to the New York Times. OpenAI's latest reported valuation is $850 billion. For investors, the enterprise budget crisis is the most consequential variable in those numbers.

The Bottom Line

The enterprise AI budget crisis is forcing a reckoning. The unit economics that looked reasonable at pilot stage don't scale at adoption stage. Token costs that felt manageable at 32% adoption become untenable at 84%. Flat-rate pricing that worked for autocomplete doesn't work for agentic workflows.

CFOs who thought they had AI spending under control are discovering they don't. CTOs who locked in vendor contracts based on 2024 assumptions are renegotiating them. And developers who got used to unlimited AI assistance are about to hit hard usage caps.

The companies that survive this transition will be the ones that treat AI budgets like infrastructure budgets—with clear usage policies, tiered vendor strategies, and real-time monitoring. The ones that don't will be back to the drawing board by Q3.

How are you managing AI token costs? What's working—and what's blowing up your budget? Connect on LinkedIn or follow on Twitter.



THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe for weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.


