Ramp Hits $44B as CFOs Drown in 13x AI Token Bills

Ramp raised $750M at $44B as AI token spend jumps 13x in 18 months. CFO framework + 90-day playbook to control runaway AI costs before they wreck margins.

By Rajesh Beri·June 6, 2026·16 min read

THE DAILY BRIEF

CFO · AI Spend Management · Ramp · FinOps · Token Economics


Ramp just raised $750 million at a $44 billion valuation, nearly triple its valuation from less than a year ago, and the pitch deck has almost nothing to do with corporate cards. It has everything to do with the AI bill your CFO can't see, can't predict, and can't control.

On June 4, 2026, ICONIQ, GIC, and Ontario Teachers' Pension Plan led a Series F into Ramp that values the spend management company higher than Snowflake's market cap at IPO. The capital isn't funding more credit card swipes. CEO Eric Glyman told investors Ramp is now infrastructure for what he calls "the third pillar" of business spend—the first being payroll, the second traditional procurement, and the third, tokens consumed by AI.

He's not wrong about the third pillar. Ramp's own customer data shows average monthly AI token spend has grown 13x since January 2025, with top spenders watching costs jump 50% or more in roughly one of every four months. A single Series B SaaS company with 50 engineers went from $1,200 to $18,500 in monthly AI bills. A 200-engineer fintech climbed from $8,000 to $127,000. Most CFOs don't see it until month-end reconciliation, by which point 40-60% of the spend has already run through personal credit cards and untracked API keys.

This is the story of why the smartest money in enterprise software is betting that AI cost governance is the next $100 billion category, what's actually driving the explosion, and how to control it before your next board meeting.

What Changed: A Spend Management Company Just Became an AI Governance Platform

Ramp's Series F closed on June 4, 2026, raising $750 million at a $44 billion post-money valuation—nearly tripling from its August 2025 valuation of $15 billion. The round was led by ICONIQ, GIC, and Ontario Teachers' Pension Plan, joined by Goldman Sachs Alternatives, D.E. Shaw & Co., Morgan Stanley Investment Management, Generation Investment Management, Insight Partners, and BroadLight Capital. Existing investors Founders Fund, Sequoia, Thrive, and Khosla Ventures also doubled down. Total equity raised now exceeds $3 billion, according to TechCrunch.

The growth that justified the markup is real. Per Ramp's press release, the company crossed $1 billion in annualized revenue with positive free cash flow, processing $200 billion in annualized purchase volume across 70,000+ customers. Total payment volume (TPV) grew approximately 170% year-over-year in March 2026. The enterprise book deepened to 3,200+ customers spending $100K+ annually, including names like Visa, Uber, Shopify, Anduril, Figma, Notion, and Stanford Athletics.

But the strategic shift is what matters for CIOs and CFOs. In April 2026, Ramp launched Token Spend Management—a product that ingests usage data from OpenAI, Anthropic, Google, Microsoft, AWS Bedrock, and custom inference endpoints, then maps token consumption to teams, projects, environments, and individual employees. It works the same way Ramp's expense tools work, except the "vendor" is a model and the "receipt" is a tokenized API call.

Ramp also rolled out a corporate credit card designed for AI agents, with payment controls and audit trails that allow autonomous agents to transact on behalf of users without creating shadow spend. According to American Banker's coverage, Ramp product leadership told investors that "our largest AI spenders at Ramp are seeing their spend double month-over-month."

Glyman framed it for investors plainly: "Finance is going through the biggest structural change since the spreadsheet." Tokens are becoming a third financial primitive alongside dollars and headcount. The company that owns the rails to track, allocate, and govern that primitive will own the CFO desktop for the next decade.

This is also why the round commanded a $44 billion valuation. Investors are not paying for a credit card business. They are paying for the bet that AI token spend will become a 5-7% line item on every Fortune 500 income statement within 36 months, and that whoever sits between the model providers and the CFO captures rent on the entire flow.

Why This Matters: Two Audiences, One Existential Problem

For Technical Leaders (CIO, CTO, VP Engineering)

Your engineering org just became a finance org, whether you wanted that or not. The shift to consumption-based pricing means every architectural decision is now a budget decision. Three forces are converging to make AI cost governance an engineering-leadership accountability:

Output tokens cost 3-15x more than input tokens. When Claude Sonnet 4.6 generates 50,000 lines of code in an agent loop, that's not the same economics as a chatbot query. Agent loops compound costs through retry logic (10-20 iterations on failed code), repeated structured-output tool calls, and repository-scale context reads without prompt caching. A single OAuth implementation task can cost $12 on a frontier model—and a team running 200 of those per day burns $72,000/month on one workflow.
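The $72,000 figure above is simple multiplication, and it is worth making the hidden input explicit. A minimal sketch, assuming a 30-day month (the article does not state the day count); the function name is illustrative only.

```python
def monthly_workflow_cost(cost_per_task: float, tasks_per_day: int,
                          days_per_month: int = 30) -> float:
    """Monthly spend for one workflow: per-task cost x daily volume x days."""
    return cost_per_task * tasks_per_day * days_per_month


# The OAuth example from the text: $12 per task, 200 tasks per day.
oauth_monthly = monthly_workflow_cost(cost_per_task=12.0, tasks_per_day=200)
print(f"${oauth_monthly:,.0f} per month")  # $72,000 per month
```

Counting only working days instead would lower the total, which is why the day-count assumption belongs in any forecast built on numbers like these.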

40-60% of AI costs sit outside your finance team's visibility until month-end reconciliation, according to Ramp's analysis. Engineers expense personal Anthropic and OpenAI subscriptions. Teams stand up untracked Bedrock endpoints. Cursor and Copilot bills get split across 14 different cost centers. The shadow AI tax isn't a future risk—it's already on this quarter's books.

Hidden cost amplifiers add 40-60% on top of the line-item bill. Retry logic, retrieval augmentation, context window management, embedding generation, and observability all stack invisibly. Engineering leaders who present "AI cost" without these layered in are presenting numbers that won't match reality 90 days later.
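As a rough illustration of that adjustment, the sketch below grosses up a line-item bill by the 40-60% range quoted above. The function name is hypothetical, and the $127,000 input simply reuses the fintech example from earlier in the article.

```python
def amplified_cost_range(line_item_bill: float, low: float = 0.40,
                         high: float = 0.60) -> tuple[float, float]:
    """Return (low, high) estimates of true cost once hidden amplifiers are layered in."""
    return line_item_bill * (1 + low), line_item_bill * (1 + high)


low_estimate, high_estimate = amplified_cost_range(127_000)
print(f"${low_estimate:,.0f} to ${high_estimate:,.0f}")  # $177,800 to $203,200
```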

The architectural counter-move is intelligent routing—Haiku for classification, Sonnet for code generation, Opus only for hard reasoning—paired with aggressive prompt caching (which can deliver 90% cost reduction on repeated context) and reusable skills that amortize design cost across thousands of invocations.
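The routing idea can be sketched as a lookup from task type to model tier. This is an illustration under stated assumptions, not any vendor's gateway: the tier names, the task labels, and the choice to default unknown tasks to the cheapest tier are all hypothetical.

```python
from enum import Enum


class Tier(Enum):
    SMALL = "small"        # Haiku-class: classification
    MID = "mid"            # Sonnet-class: code generation
    FRONTIER = "frontier"  # Opus-class: hard reasoning only


# Hypothetical task labels; a real gateway would derive these from request metadata.
ROUTES = {
    "classification": Tier.SMALL,
    "code_generation": Tier.MID,
    "hard_reasoning": Tier.FRONTIER,
}


def route(task_type: str) -> Tier:
    """Return the model tier for a task, defaulting to the cheapest tier."""
    return ROUTES.get(task_type, Tier.SMALL)


print(route("classification").value)  # small
print(route("hard_reasoning").value)  # frontier
```

Defaulting to the cheapest tier means an unlabeled workload has to be explicitly promoted before it can reach a frontier model, which keeps the expensive path opt-in.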

For Business Leaders (CFO, CMO, COO)

The CFO problem is simpler to describe and harder to solve: AI is the first major budget line where you cannot fix the unit cost. Every other technology purchase—SaaS, cloud, telecom, even compute—eventually settles into predictable per-user, per-seat, or per-instance pricing. AI doesn't. A single "good idea" from a product team can 5x consumption in a week.

Three statistics tell the CFO story:

The deeper issue is forecast volatility. CFOs cannot give a board credible AI guidance when input costs behave variably, when spend scattered across SaaS, cloud, and services invoices creates margin leakage, and when reactive infrastructure decisions introduce capital timing risk. Deloitte's framework calls this the "earnings narrative risk": the danger that you cannot explain to investors what token consumption is buying.

This is why Ramp's Token Spend Management isn't a feature. It's a survival tool.

Market Context: The $2.59 Trillion Race for the CFO Desktop

Worldwide AI spending will hit $2.59 trillion in 2026—a 47% year-over-year increase, per Gartner's May 2026 forecast. Meanwhile, the enterprise software category as a whole will grow 15.2%, with roughly 9 points of that growth tied to price increases on existing software and 4-5 points to net-new AI applications.

Capital is repositioning around this reality fast.

Ramp's chief competitor disappeared in January 2026 when Capital One acquired Brex for $5.15 billion. The deal gave Capital One a vertically integrated B2B fintech stack—bank, payment network (Discover), and software—and signaled that AI-native financial operations is a strategic asset banks will pay to own. That made Ramp's independent path harder and more valuable simultaneously.

Procurement is consolidating. Vertice, a London-based AI procurement platform, acquired Vendr to combine 2 million+ price points with Vendr's 250,000 negotiated contracts and Vertice's $75 billion spend dataset. The combined entity is now training autonomous negotiation agents on the world's largest software pricing corpus. Tropic delivered $56 million in verified savings on $362 million in customer spend (15.5% average) in H1 2025. Pivot is building enterprise-grade source-to-pay with AI agents on structured data. Coupa, GEP, and Ivalua are racing to match.

The unbundling-rebundling thesis is back. Brex bundled cards + software. Ramp is bundling cards + software + AI governance + procurement agents. Rippling bundles spend + HR + IT + payroll. The winners will be the platforms that own enough categories to make AI cost data meaningful—because tokens alone aren't actionable. You need them sitting next to vendor contracts, accounting codes, and budget envelopes to act.

Gartner's analysts have been blunt: 2026 is the "trough of disillusionment" year for enterprise AI ROI. CFOs are being asked to double AI spending while simultaneously making cost optimization their top priority—the exact paradox Ramp is selling itself as the answer to.

Framework #1: The AI Spend Governance Maturity Assessment

Score your organization across five dimensions. Each dimension is rated 1-5. Maximum score is 25. This framework synthesizes Deloitte's CFO AI tokenomics guidance, Ramp customer data, and the FinOps Foundation's 2026 maturity model.

Dimension 1: Visibility (1-5 points)

How much of your AI spend can you see in real time?

  • 1 point — Spend only visible at month-end reconciliation; >40% lives on employee credit cards
  • 2 points — Quarterly review cycle; major vendor invoices tracked, side spend invisible
  • 3 points — Monthly dashboard covering top 3 providers; shadow spend estimated at <25%
  • 4 points — Weekly visibility across all major providers + team-level allocation
  • 5 points — Real-time token-level telemetry across every provider, environment, project, and employee

Dimension 2: Allocation (1-5 points)

Can you attribute AI cost to a team, product, or revenue stream?

  • 1 point — All AI cost sits in one central IT or R&D bucket
  • 2 points — Allocated by business unit, no project-level breakdown
  • 3 points — Allocated by team + top 5 projects
  • 4 points — Allocated by team, project, environment, and feature
  • 5 points — Full chargeback model—every dollar tied to a P&L owner with margin visibility

Dimension 3: Forecast Accuracy (1-5 points)

How close was your last AI budget forecast to actuals?

  • 1 point — Actuals exceeded forecast by >50% (joins the 73% of enterprises)
  • 2 points — Variance of 25-50%; reforecast required mid-quarter
  • 3 points — Variance of 10-25%; modeled scenarios but missed
  • 4 points — Variance of <10%; predictive models calibrated quarterly
  • 5 points — Variance of <5%; real-time burn-rate alerts and rolling 90-day forecasts
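The Dimension 3 rubric can be turned into a small scoring helper. This is a sketch under assumptions: the rubric does not say which band owns the exact boundaries (5%, 10%, 25%, 50%), so the cutoffs below are one reasonable reading, and variance is taken as an absolute percentage of forecast even though the 1-point row speaks only of overruns.

```python
def forecast_variance_pct(forecast: float, actual: float) -> float:
    """Absolute gap between actuals and forecast, as a percentage of forecast."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    return abs(actual - forecast) * 100 / forecast


def forecast_score(variance_pct: float) -> int:
    """Map a variance percentage onto the 1-5 rubric (boundary handling is assumed)."""
    if variance_pct > 50:
        return 1
    if variance_pct >= 25:
        return 2
    if variance_pct >= 10:
        return 3
    if variance_pct >= 5:
        return 4
    return 5


variance = forecast_variance_pct(forecast=100_000, actual=130_000)
print(variance, forecast_score(variance))  # 30.0 2
```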

Dimension 4: Controls (1-5 points)

What stops a runaway agent or experiment from blowing the budget?

  • 1 point — No budget caps; engineers self-police
  • 2 points — Soft caps at the org level; no per-user or per-project controls
  • 3 points — Hard budget caps at team level; approval gates for high-tier model use
  • 4 points — User-level budgets, approval gates on agent loops, automatic throttling at thresholds
  • 5 points — Policy-as-code controls across every API call; circuit breakers on cost anomalies

Dimension 5: ROI Measurement (1-5 points)

Can you connect token consumption to business outcomes?

  • 1 point — No ROI measurement beyond "engineers say they are faster"
  • 2 points — Anecdotal ROI; pilots tracked, production unclear
  • 3 points — Productivity metrics tracked (PRs/week, tickets closed) but not dollar-attributed
  • 4 points — ROI dollar-attributed for top 5 use cases; net of token cost
  • 5 points — Every AI workflow has a revenue uplift, cost reduction, or productivity dollar value tracked monthly

How to Read Your Score

  • 5-10 points: Not Ready. You will hit a budget surprise within 90 days. Stop scaling and instrument first.
  • 11-14 points: Low Maturity. Standard for most enterprises today. High risk of margin compression in 2-3 quarters.
  • 15-19 points: Medium Maturity. You can scale AI without breaking the budget, but ROI conversations remain difficult.
  • 20-23 points: High Maturity. You are in the top 15% of enterprises. Use this as a competitive lever in board conversations.
  • 24-25 points: Best in Class. You are doing what only a handful of public companies do today. Consider productizing your internal playbook.
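The scoring bands above translate directly into a lookup. A minimal sketch: the dimension keys and function name are hypothetical, while the band boundaries and labels are taken from the list.

```python
DIMENSIONS = ("visibility", "allocation", "forecast_accuracy", "controls", "roi_measurement")

# (upper bound of band, label), in ascending order, as listed above.
BANDS = [
    (10, "Not Ready"),
    (14, "Low Maturity"),
    (19, "Medium Maturity"),
    (23, "High Maturity"),
    (25, "Best in Class"),
]


def maturity_band(scores: dict) -> tuple:
    """Validate five 1-5 scores and return (total, band label)."""
    for dimension in DIMENSIONS:
        if dimension not in scores:
            raise ValueError(f"missing score for {dimension}")
        if not 1 <= scores[dimension] <= 5:
            raise ValueError(f"{dimension} must be scored 1-5")
    total = sum(scores[dimension] for dimension in DIMENSIONS)
    for upper_bound, label in BANDS:
        if total <= upper_bound:
            return total, label
    raise AssertionError("unreachable: total cannot exceed 25")


example = {"visibility": 3, "allocation": 2, "forecast_accuracy": 2,
           "controls": 3, "roi_measurement": 2}
print(maturity_band(example))  # (12, 'Low Maturity')
```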

If you scored below 15, you are exactly who Ramp's $44B valuation is betting on. The product roadmap of every spend management vendor in 2026 is being built to drag enterprises from a score of 11 to a score of 18 within a single fiscal year.

Framework #2: The 90-Day AI Cost Control Playbook

Most enterprises overcomplicate the rollout of AI cost governance. The data from Ramp, Deloitte, and FinOps Foundation customers converges on a phased 90-day pattern that delivers a typical 20-50% cost reduction without slowing engineering productivity.

Days 1-30: Instrumentation

Week 1 — Inventory

  • List every AI provider in use (OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, third-party SaaS with AI features, code assistants)
  • Identify every cost center charging AI spend, including expense reports and shadow corporate cards
  • Capture top 10 spenders by team

Week 2-3 — Tagging

  • Apply team / project / environment tags on every API key
  • Migrate personal subscriptions onto corporate billing with attribution
  • Stand up a single dashboard pulling from all providers (Ramp, Vantage, CloudZero, or in-house)
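Once keys carry team, project, and environment tags, allocation is a group-by. The sketch below uses hypothetical usage records (the field names and figures are invented for illustration, not any provider's export format) and surfaces the untagged share, which is the number the tagging weeks are meant to drive toward zero.

```python
from collections import defaultdict

# Hypothetical usage records, as a provider export might look after tagging.
usage_records = [
    {"team": "payments", "project": "fraud-triage", "environment": "prod", "cost_usd": 400.0},
    {"team": "payments", "project": "fraud-triage", "environment": "dev", "cost_usd": 100.0},
    {"team": "growth", "project": "email-copy", "environment": "prod", "cost_usd": 250.0},
    {"cost_usd": 250.0},  # spend from an API key nobody tagged
]


def allocate(records, tag):
    """Sum cost by one tag; records missing the tag land in 'untagged'."""
    totals = defaultdict(float)
    for record in records:
        totals[record.get(tag, "untagged")] += record["cost_usd"]
    return dict(totals)


by_team = allocate(usage_records, "team")
untagged_share = by_team.get("untagged", 0.0) / sum(by_team.values())
print(by_team)
print(f"untagged share: {untagged_share:.0%}")  # untagged share: 25%
```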

Week 4 — Baseline

  • Lock the previous 60 days of spend as your baseline
  • Calculate true cost per use case (with hidden amplifiers like retry, RAG, embeddings)
  • Identify top 5 cost drivers

Days 31-60: Optimization

Week 5-6 — Routing

  • Implement model routing: Haiku/Flash for classification, Sonnet/Pro for code generation, Opus/Ultra for hard reasoning only
  • Set guardrails on agent loops (max retries, max context size)
  • Deploy prompt caching on all repeated-context workloads
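The agent-loop guardrails in this step can be sketched as a wrapper that refuses oversized context and stops retrying once an attempt or dollar limit trips. Everything named here is hypothetical: the limit values, the `step` callback shape, and the exception class are illustrative, not a real agent framework's API.

```python
from dataclasses import dataclass


class GuardrailExceeded(RuntimeError):
    """Raised when an agent task hits one of its limits."""


@dataclass
class LoopLimits:
    max_attempts: int = 5               # hypothetical default
    max_context_tokens: int = 100_000   # hypothetical default
    max_cost_usd: float = 5.0           # hypothetical default


def run_agent_task(step, context_tokens: int, limits: LoopLimits):
    """Call step(attempt) -> (succeeded, cost_usd) until success or a limit trips."""
    if context_tokens > limits.max_context_tokens:
        raise GuardrailExceeded("context exceeds max_context_tokens")
    spent = 0.0
    for attempt in range(1, limits.max_attempts + 1):
        succeeded, cost_usd = step(attempt)
        spent += cost_usd
        if succeeded:
            return attempt, spent
        if spent >= limits.max_cost_usd:
            raise GuardrailExceeded(f"spent ${spent:.2f} without success")
    raise GuardrailExceeded(f"no success after {limits.max_attempts} attempts")


# A stand-in step that succeeds on its third try at $0.50 per attempt.
print(run_agent_task(lambda attempt: (attempt == 3, 0.5), 20_000, LoopLimits()))  # (3, 1.5)
```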

Week 7 — Skills and Reusables

  • Identify your top 10 repeat workflows
  • Encode them as reusable skills, templates, or cached prompts
  • Break-even occurs after ~10 invocations; expect 30-70% per-task cost reduction at scale
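The break-even claim follows from dividing the one-time cost of building a skill by the saving it delivers per run. A sketch with assumed inputs: the $60 design cost is invented to illustrate the ~10-invocation figure, the $12 ad hoc cost reuses the OAuth example from earlier, and the reduction values span the 30-70% range quoted above.

```python
import math


def break_even_invocations(design_cost: float, adhoc_cost_per_task: float,
                           reduction: float) -> int:
    """Invocations needed before per-task savings repay the one-time design cost."""
    saving_per_task = adhoc_cost_per_task * reduction
    if saving_per_task <= 0:
        raise ValueError("reduction must produce a positive saving")
    return math.ceil(design_cost / saving_per_task)


# Hypothetical $60 design cost against a $12 ad hoc task.
for reduction in (0.3, 0.5, 0.7):
    print(reduction, break_even_invocations(60.0, 12.0, reduction))
```

At a 50% reduction this lands on 10 invocations; a weaker saving pushes break-even later and a stronger one pulls it earlier.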

Week 8 — Controls

  • Set user-level budgets in your code assistants (now generally available in GitHub Copilot's June 2026 pricing model)
  • Implement approval gates on agent runs above a dollar threshold
  • Configure automatic throttling on anomalies
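One simple way to define an anomaly for throttling is a jump over a trailing average. The sketch below is an assumption-laden illustration: the 50% threshold echoes the spend jumps cited earlier in the article but is not a recommended setting, and the function name is hypothetical.

```python
from statistics import mean


def should_throttle(trailing_daily_spend, today_spend: float,
                    jump_threshold: float = 0.5) -> bool:
    """Flag today's spend if it exceeds the trailing average by more than the threshold."""
    if not trailing_daily_spend:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(trailing_daily_spend)
    if baseline <= 0:
        return today_spend > 0
    return today_spend > baseline * (1 + jump_threshold)


print(should_throttle([100, 110, 90], 200))  # True
print(should_throttle([100, 110, 90], 120))  # False
```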

Days 61-90: Governance and ROI

Week 9-10 — Allocation

  • Move from cost-center aggregation to true chargeback by team and project
  • Surface AI spend in monthly business reviews alongside other variable costs
  • Run a closing-the-books exercise that compares forecast to actuals

Week 11 — ROI Attribution

  • For top 5 workflows, calculate net ROI (value created minus token cost)
  • Identify the bottom 20% of workflows that are not pulling their weight—either fix or sunset
  • Establish a quarterly ROI review cadence
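Net ROI as defined in this step is value created minus token cost, and the bottom 20% falls out of a sort. The workflow names and dollar figures below are hypothetical, chosen only to show the ranking.

```python
import math

# Hypothetical monthly figures per workflow: (value_created_usd, token_cost_usd).
workflows = {
    "support-triage": (40_000, 6_000),
    "code-review": (25_000, 9_000),
    "contract-summaries": (12_000, 2_000),
    "sales-email-drafts": (5_000, 4_500),
    "meeting-notes": (3_000, 3_500),
}


def net_roi(value_created: float, token_cost: float) -> float:
    """Value created minus token cost."""
    return value_created - token_cost


def bottom_workflows(items: dict, fraction: float = 0.2) -> list:
    """Names of the lowest-net-ROI workflows, at least one, up to the given fraction."""
    ranked = sorted(items, key=lambda name: net_roi(*items[name]))
    count = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:count]


print(bottom_workflows(workflows))  # ['meeting-notes']
```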

Week 12 — Operating Cadence

  • Publish an internal "AI FinOps" charter with named accountabilities
  • Establish a CFO + CIO + Head of Eng monthly review meeting
  • Set rolling 90-day forecasts with confidence intervals
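A forecast with a confidence range can start very simply. The sketch below is a deliberately naive model, not a recommended method: it assumes month-over-month growth ratios are roughly normal, projects one month ahead, and uses invented spend figures. The function name and the z value of 1.96 (an approximate 95% range) are illustrative choices.

```python
from statistics import mean, stdev


def forecast_next_month(monthly_spend, z: float = 1.96):
    """Return (low, point, high) for next month from historical growth ratios."""
    if len(monthly_spend) < 3:
        raise ValueError("need at least three months of history")
    # Month-over-month growth ratios; assumes every month has positive spend.
    growth = [later / earlier for earlier, later in zip(monthly_spend, monthly_spend[1:])]
    growth_mean, growth_sd = mean(growth), stdev(growth)
    last = monthly_spend[-1]
    low = max(0.0, last * (growth_mean - z * growth_sd))
    return low, last * growth_mean, last * (growth_mean + z * growth_sd)


low, point, high = forecast_next_month([10_000, 12_000, 15_000, 18_000])
print(f"next month: ${point:,.0f} (range ${low:,.0f} to ${high:,.0f})")
```

The wider the range, the less stable recent growth has been, which is itself a useful signal to bring to the monthly review.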

Customers running this playbook—per Ramp and FinOps Foundation data—typically reduce AI spend 20-50% while preserving or accelerating velocity. The wins come less from cost-cutting and more from killing wasted experimentation, eliminating shadow spend, and re-platforming the top workflows onto cheaper models without quality loss.

Case Study: AT&T's 90% Token Cost Reduction

The most-cited case study in CFO AI governance circles right now is AT&T. According to Deloitte's AI tokenomics framework, AT&T scaled internal AI usage from 8 billion tokens daily to 27 billion tokens daily after implementing multi-agent orchestration—a 237% increase in volume.

Most enterprises would expect their AI bill to more than triple alongside that growth. AT&T's didn't. The company achieved 90% cost savings on a per-task basis through three architectural moves:

  1. Tiered model routing. Light reasoning and classification went to small open-source models hosted internally. Mid-tier work routed to mid-sized commercial models. Frontier models only handled the hardest 5-10% of queries.

  2. Aggressive caching across the orchestration layer. Common context payloads were cached at the agent-mesh layer rather than at each model call, eliminating redundant token spend across multi-agent workflows.

  3. Reusable agent skills. Common workflows were encoded as deterministic skills that amortized token cost across hundreds of thousands of invocations, dropping per-invocation cost by 60-80%.
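To see how volume growth and unit-cost compression interact, the case-study figures can be combined in a few lines. This rests on a loud assumption that the source does not confirm: that token volume is a fair proxy for task volume and that the 90% per-task saving applies uniformly across it.

```python
tokens_before = 8e9    # tokens per day before multi-agent orchestration
tokens_after = 27e9    # tokens per day after
per_task_savings = 0.90

volume_multiplier = tokens_after / tokens_before
volume_increase_pct = (volume_multiplier - 1) * 100

# Assumption: cost scales with volume, and every task gets the 90% saving.
bill_multiplier = volume_multiplier * (1 - per_task_savings)

print(f"volume: {volume_multiplier:.3f}x (+{volume_increase_pct:.1f}%)")
print(f"bill under this assumption: {bill_multiplier:.2f}x the original")
```

Under that assumption the total bill shrinks to roughly a third of its starting level even as volume more than triples.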

The lesson is simple and counterintuitive: the path to controlling AI cost is rarely "use AI less." It is "use AI smarter." The best operators are scaling token volume while compressing unit cost—exactly the opposite of the panicked cost-cutting most boards reach for first.

A second case study cited in Deloitte's research tells the cautionary version: a healthcare enterprise consumed 1 trillion tokens over six months, translating into more than $6 million in unplanned costs before the finance team even understood what was driving it. The root cause wasn't bad strategy—it was no instrumentation. The bill simply showed up.

Two companies, similar token volumes, opposite outcomes. The difference is governance maturity, not engineering talent.

What To Do About It

For CIOs and CTOs

  • Stand up FinOps for AI this quarter. If you have not appointed an owner for AI cost governance, do it before fiscal Q4 planning. The longer you wait, the harder the cleanup.
  • Make routing a first-class architectural pattern. Build (or buy) a model gateway that enforces tier-appropriate routing. Hard-code Opus-class models only into the workflows that justify them.
  • Mandate prompt caching and reusable skills. Every team should have a cached-context strategy and a skills library. Productivity tools without these are 5-10x more expensive than they need to be.

For CFOs

  • Run the 25-point maturity assessment with your CIO this month. If you score below 15, AI cost discipline is now a top-3 priority for the remaining fiscal year.
  • Move AI from a single line item to a third financial primitive. Track token consumption next to headcount and revenue. Build forecasting models that include hidden amplifiers (retry, RAG, embedding).
  • Demand ROI attribution for the top 10 AI workflows. "We are saving engineering time" is not an answer your board will accept in 2027. Net-of-token-cost dollar values are.

For Business Leaders

  • Treat shadow AI as a margin issue, not an IT issue. If 40-60% of your AI spend is invisible until month-end, that is a margin leak masquerading as a tooling problem.
  • Sponsor cross-functional AI FinOps. This requires CFO + CIO + Business Unit alignment. Without an executive sponsor, the playbook stalls in week 5.
  • Use the AI spend conversation to re-anchor ROI dialogue. The companies winning the AI narrative in 2026 are not the biggest spenders. They are the ones who can show net-of-token-cost outcomes per dollar deployed.

The Ramp valuation isn't a fintech story. It's a CFO story. And the operators who win the next 24 months will be the ones who treat tokens like the third pillar of business spend—not a line item to argue about at month-end.



LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.

Ramp Hits $44B as CFOs Drown in 13x AI Token Bills

Photo by Pixabay on Pexels

Ramp just raised $750 million at a $44 billion valuation—triple its valuation from a year ago—and the pitch deck has almost nothing to do with corporate cards. It has everything to do with the AI bill your CFO can't see, can't predict, and can't control.

On June 4, 2026, ICONIQ, GIC, and Ontario Teachers' Pension Plan led a Series F into Ramp that values the spend management company higher than Snowflake's market cap at IPO. The capital isn't funding more credit card swipes. CEO Eric Glyman told investors Ramp is now infrastructure for what he calls "the third pillar" of business spend—the first being payroll, the second traditional procurement, and the third, tokens consumed by AI.

He's not wrong about the third pillar. Ramp's own customer data shows average monthly AI token spend has jumped 13x since January 2025, with top spenders watching costs jump 50% or more in roughly one of every four months. A single Series B SaaS company with 50 engineers went from $1,200 to $18,500 in monthly AI bills. A 200-engineer fintech climbed from $8,000 to $127,000. Most CFOs don't see it until month-end reconciliation—when 40-60% of the spend was already burning through personal credit cards and untracked API keys.

This is the story of why the smartest money in enterprise software is betting that AI cost governance is the next $100 billion category, what's actually driving the explosion, and how to control it before your next board meeting.

What Changed: A Spend Management Company Just Became an AI Governance Platform

Ramp's Series F closed on June 4, 2026, raising $750 million at a $44 billion post-money valuation—nearly tripling from its August 2025 valuation of $15 billion. The round was led by ICONIQ, GIC, and Ontario Teachers' Pension Plan, joined by Goldman Sachs Alternatives, D.E. Shaw & Co., Morgan Stanley Investment Management, Generation Investment Management, Insight Partners, and BroadLight Capital. Existing investors Founders Fund, Sequoia, Thrive, and Khosla Ventures also doubled down. Total equity raised now exceeds $3 billion, according to TechCrunch.

The growth that justified the markup is real. Per Ramp's press release, the company crossed $1 billion in annualized revenue with positive free cash flow, processing $200 billion in annualized purchase volume across 70,000+ customers. TPV grew approximately 170% year-over-year in March 2026. The enterprise book deepened to 3,200+ customers spending $100K+ annually—names like Visa, Uber, Shopify, Anduril, Figma, Notion, and Stanford Athletics.

But the strategic shift is what matters for CIOs and CFOs. In April 2026, Ramp launched Token Spend Management—a product that ingests usage data from OpenAI, Anthropic, Google, Microsoft, AWS Bedrock, and custom inference endpoints, then maps token consumption to teams, projects, environments, and individual employees. It works the same way Ramp's expense tools work, except the "vendor" is a model and the "receipt" is a tokenized API call.

Ramp also rolled out a corporate credit card designed for AI agents, with payment controls and audit trails that allow autonomous agents to transact on behalf of users without creating shadow spend. According to American Banker's coverage, Ramp product leadership told investors that "our largest AI spenders at Ramp are seeing their spend double month-over-month."

Glyman framed it for investors plainly: "Finance is going through the biggest structural change since the spreadsheet." Tokens are becoming a third financial primitive alongside dollars and headcount. The company that owns the rails to track, allocate, and govern that primitive will own the CFO desktop for the next decade.

This is also why the round commanded a $44 billion valuation. Investors are not paying for a credit card business. They are paying for the bet that AI token spend will become a 5-7% line item on every Fortune 500 income statement within 36 months, and that whoever sits between the model providers and the CFO captures rent on the entire flow.

Why This Matters: Two Audiences, One Existential Problem

For Technical Leaders (CIO, CTO, VP Engineering)

Your engineering org just became a finance org, whether you wanted that or not. The shift to consumption-based pricing means every architectural decision is now a budget decision. Three forces are converging to make AI cost governance an engineering-leadership accountability:

Output tokens cost 3-15x more than input tokens. When Claude Sonnet 4.6 generates 50,000 lines of code in an agent loop, that's not the same economics as a chatbot query. Agent loops compound costs through retry logic (10-20 iterations on failed code), repeated structured-output tool calls, and repository-scale context reads without prompt caching. A single OAuth implementation task can cost $12 on a frontier model—and a team running 200 of those per day burns $72,000/month on one workflow.

40-60% of AI costs sit outside your finance team's visibility until month-end reconciliation, according to Ramp's analysis. Engineers expense personal Anthropic and OpenAI subscriptions. Teams stand up untracked Bedrock endpoints. Cursor and Copilot bills get split across 14 different cost centers. The shadow AI tax isn't a future risk—it's already on this quarter's books.

Hidden cost amplifiers add 40-60% on top of the line-item bill. Retry logic, retrieval augmentation, context window management, embedding generation, and observability all stack invisibly. Engineering leaders who present "AI cost" without these layered in are presenting numbers that won't match reality 90 days later.

The architectural counter-move is intelligent routing—Haiku for classification, Sonnet for code generation, Opus only for hard reasoning—paired with aggressive prompt caching (which can deliver 90% cost reduction on repeated context) and reusable skills that amortize design cost across thousands of invocations.

For Business Leaders (CFO, CMO, COO)

The CFO problem is simpler to describe and harder to solve: AI is the first major budget line where you cannot fix the unit cost. Every other technology purchase—SaaS, cloud, telecom, even compute—eventually settles into predictable per-user, per-seat, or per-instance pricing. AI doesn't. A single "good idea" from a product team can 5x consumption in a week.

Three statistics tell the CFO story:

The deeper issue is forecast volatility. CFOs cannot give a board credible AI guidance when input cost behavior is variable, distribution across SaaS-cloud-services invoices creates margin leakage, and reactive infrastructure decisions introduce capital timing risk. Deloitte's framework calls this the "earnings narrative risk"—the danger that you cannot explain to investors what token consumption is buying.

This is why Ramp's Token Spend Management isn't a feature. It's a survival tool.

Market Context: The $2.59 Trillion Race for the CFO Desktop

Worldwide AI spending will hit $2.59 trillion in 2026—a 47% year-over-year increase, per Gartner's May 2026 forecast. Meanwhile, the enterprise software category as a whole will grow 15.2%, with roughly 9 points of that growth tied to price increases on existing software and 4-5 points to net-new AI applications.

Capital is repositioning around this reality fast.

Ramp's chief competitor disappeared in January 2026 when Capital One acquired Brex for $5.15 billion. The deal gave Capital One a vertically integrated B2B fintech stack—bank, payment network (Discover), and software—and signaled that AI-native financial operations is a strategic asset banks will pay to own. That made Ramp's independent path harder and more valuable simultaneously.

Procurement is consolidating. Vertice, a London-based AI procurement platform, acquired Vendr to combine 2 million+ price points with Vendr's 250,000 negotiated contracts and Vertice's $75 billion spend dataset. The combined entity is now training autonomous negotiation agents on the world's largest software pricing corpus. Tropic delivered $56 million in verified savings on $362 million in customer spend (15.5% average) in H1 2025. Pivot is building enterprise-grade source-to-pay with AI agents on structured data. Coupa, GEP, and Ivalua are racing to match.

The unbundling-rebundling thesis is back. Brex bundled cards + software. Ramp is bundling cards + software + AI governance + procurement agents. Rippling bundles spend + HR + IT + payroll. The winners will be the platforms that own enough categories to make AI cost data meaningful—because tokens alone aren't actionable. You need them sitting next to vendor contracts, accounting codes, and budget envelopes to act.

Gartner's analysts have been blunt: 2026 is the "trough of disillusionment" year for enterprise AI ROI. CFOs are being asked to double AI spending while simultaneously making cost optimization their top priority—the exact paradox Ramp is selling itself as the answer to.

Framework #1: The AI Spend Governance Maturity Assessment

Score your organization across five dimensions. Each dimension is rated 1-5. Maximum score is 25. This framework synthesizes Deloitte's CFO AI tokenomics guidance, Ramp customer data, and the FinOps Foundation's 2026 maturity model.

Dimension 1: Visibility (1-5 points)

How much of your AI spend can you see in real time?

  • 1 point — Spend only visible at month-end reconciliation; >40% lives on employee credit cards
  • 2 points — Quarterly review cycle; major vendor invoices tracked, side spend invisible
  • 3 points — Monthly dashboard covering top 3 providers; shadow spend estimated at <25%
  • 4 points — Weekly visibility across all major providers + team-level allocation
  • 5 points — Real-time token-level telemetry across every provider, environment, project, and employee

Dimension 2: Allocation (1-5 points)

Can you attribute AI cost to a team, product, or revenue stream?

  • 1 point — All AI cost sits in one central IT or R&D bucket
  • 2 points — Allocated by business unit, no project-level breakdown
  • 3 points — Allocated by team + top 5 projects
  • 4 points — Allocated by team, project, environment, and feature
  • 5 points — Full chargeback model—every dollar tied to a P&L owner with margin visibility

Dimension 3: Forecast Accuracy (1-5 points)

How close was your last AI budget forecast to actuals?

  • 1 point — Actuals exceeded forecast by >50% (joins the 73% of enterprises)
  • 2 points — Variance of 25-50%; reforecast required mid-quarter
  • 3 points — Variance of 10-25%; modeled scenarios but missed
  • 4 points — Variance of <10%; predictive models calibrated quarterly
  • 5 points — Variance of <5%; real-time burn-rate alerts and rolling 90-day forecasts

Dimension 4: Controls (1-5 points)

What stops a runaway agent or experiment from blowing the budget?

  • 1 point — No budget caps; engineers self-police
  • 2 points — Soft caps at the org level; no per-user or per-project controls
  • 3 points — Hard budget caps at team level; approval gates for high-tier model use
  • 4 points — User-level budgets, approval gates on agent loops, automatic throttling at thresholds
  • 5 points — Policy-as-code controls across every API call; circuit breakers on cost anomalies

Dimension 5: ROI Measurement (1-5 points)

Can you connect token consumption to business outcomes?

  • 1 point — No ROI measurement beyond "engineers say they are faster"
  • 2 points — Anecdotal ROI; pilots tracked, production unclear
  • 3 points — Productivity metrics tracked (PRs/week, tickets closed) but not dollar-attributed
  • 4 points — ROI dollar-attributed for top 5 use cases; net of token cost
  • 5 points — Every AI workflow has a revenue uplift, cost reduction, or productivity dollar value tracked monthly

How to Read Your Score

  • 5-10 points: Not Ready. You will hit a budget surprise within 90 days. Stop scaling and instrument first.
  • 11-14 points: Low Maturity. Standard for most enterprises today. High risk of margin compression in 2-3 quarters.
  • 15-19 points: Medium Maturity. You can scale AI without breaking the budget, but ROI conversations remain difficult.
  • 20-23 points: High Maturity. You are in the top 15% of enterprises. Use this as a competitive lever in board conversations.
  • 24-25 points: Best in Class. You are doing what only a handful of public companies do today. Consider productizing your internal playbook.

If you scored below 15, you are exactly who Ramp's $44B valuation is betting on. The product roadmap of every spend management vendor in 2026 is being built to drag enterprises from a score of 11 to a score of 18 within a single fiscal year.
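As a rough illustration, the rubric above can be tallied in a few lines of Python. The dimension keys and the function name `maturity_band` are hypothetical; only the score bands come from the framework itself.

```python
from typing import Dict, Tuple

# Score bands copied from "How to Read Your Score": (low, high, label).
BANDS = [
    (5, 10, "Not Ready"),
    (11, 14, "Low Maturity"),
    (15, 19, "Medium Maturity"),
    (20, 23, "High Maturity"),
    (24, 25, "Best in Class"),
]

# Hypothetical keys, one per dimension of the assessment.
DIMENSIONS = (
    "visibility",
    "allocation",
    "forecast_accuracy",
    "controls",
    "roi_measurement",
)


def maturity_band(scores: Dict[str, int]) -> Tuple[int, str]:
    """Return (total, label) for a dict holding one 1-5 score per dimension."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for dim in DIMENSIONS:
        if not 1 <= scores[dim] <= 5:
            raise ValueError(f"{dim} must be between 1 and 5")
    total = sum(scores[d] for d in DIMENSIONS)
    for low, high, label in BANDS:
        if low <= total <= high:
            return total, label
    raise RuntimeError("unreachable: the bands cover every total from 5 to 25")
```

Calling `maturity_band` with a score of 3 on every dimension yields a total of 15, the bottom of the Medium Maturity band.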

Framework #2: The 90-Day AI Cost Control Playbook

Most enterprises overcomplicate the rollout of AI cost governance. The data from Ramp, Deloitte, and FinOps Foundation customers converges on a phased 90-day pattern that delivers a typical 20-50% cost reduction without slowing engineering productivity.

Days 1-30: Instrumentation

Week 1 — Inventory

  • List every AI provider in use (OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, third-party SaaS with AI features, code assistants)
  • Identify every cost center charging AI spend, including expense reports and shadow corporate cards
  • Capture top 10 spenders by team

Weeks 2-3 — Tagging

  • Apply team / project / environment tags on every API key
  • Migrate personal subscriptions onto corporate billing with attribution
  • Stand up a single dashboard pulling from all providers (Ramp, Vantage, CloudZero, or in-house)
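Once every usage record carries team, project, and environment tags, attribution becomes a simple rollup. The sketch below assumes a hypothetical export format with `team`, `project`, `environment`, and `cost_usd` fields; it is not any provider's or Ramp's actual schema.

```python
from collections import defaultdict

# Hypothetical tag names, matching the tagging step above.
TAG_KEYS = ("team", "project", "environment")


def rollup_by_tags(records, keys=TAG_KEYS):
    """Sum cost_usd per tag combination; missing tags become 'untagged'."""
    totals = defaultdict(float)
    for rec in records:
        tag = tuple(rec.get(k) or "untagged" for k in keys)
        totals[tag] += rec["cost_usd"]
    return dict(totals)


def untagged_share(records, keys=TAG_KEYS):
    """Fraction of spend with at least one tag missing (a shadow-spend proxy)."""
    total = sum(r["cost_usd"] for r in records)
    if total == 0:
        return 0.0
    untagged = sum(
        r["cost_usd"] for r in records if not all(r.get(k) for k in keys)
    )
    return untagged / total
```

Tracking `untagged_share` week over week gives a simple signal of whether the tagging rollout is actually closing the visibility gap.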

Week 4 — Baseline

  • Lock the previous 60 days of spend as your baseline
  • Calculate true cost per use case (including hidden amplifiers such as retries, RAG, and embeddings)
  • Identify top 5 cost drivers
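The "true cost per use case" step is simple arithmetic once the amplifiers are itemized. The sketch below assumes retries scale model spend linearly and that RAG and embedding costs are billed separately; the function names and any figures passed in are illustrative, not measured data.

```python
def true_cost_usd(model_usd, retry_rate, rag_usd=0.0, embedding_usd=0.0):
    """Line-item model spend plus hidden amplifiers.

    retry_rate is the average number of extra attempts per call
    (0.25 means one retry for every four calls).
    """
    return model_usd * (1 + retry_rate) + rag_usd + embedding_usd


def amplifier_uplift(model_usd, total_usd):
    """Share of spend added on top of the line-item bill (0.45 means +45%)."""
    if model_usd <= 0:
        raise ValueError("model_usd must be positive")
    return (total_usd - model_usd) / model_usd
```

For example, a $1,000 line item with a 25% retry rate, $150 of retrieval, and $50 of embeddings comes to $1,450, an uplift of 45%.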

Days 31-60: Optimization

Weeks 5-6 — Routing

  • Implement model routing: Haiku/Flash for classification, Sonnet/Pro for code generation, Opus/Ultra for hard reasoning only
  • Set guardrails on agent loops (max retries, max context size)
  • Deploy prompt caching on all repeated-context workloads
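A minimal sketch of the routing and guardrail ideas, assuming three generic tiers rather than specific model IDs. The task labels, the fallback tier, and the default limits are all hypothetical choices that a real gateway would tune.

```python
# Task type to model tier, following the routing bullet above.
TIER_FOR_TASK = {
    "classification": "small",      # Haiku/Flash class
    "code_generation": "mid",       # Sonnet/Pro class
    "hard_reasoning": "frontier",   # Opus/Ultra class
}


def route(task_type):
    """Pick a model tier; unknown task types fall back to 'mid' (an assumption)."""
    return TIER_FOR_TASK.get(task_type, "mid")


def within_guardrails(attempt, context_tokens,
                      max_retries=5, max_context_tokens=100_000):
    """True while an agent loop stays inside both the retry and context limits."""
    return attempt <= max_retries and context_tokens <= max_context_tokens
```

An agent loop would call `within_guardrails` before each iteration and stop, or escalate to a human, as soon as it returns False.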

Week 7 — Skills and Reusables

  • Identify your top 10 repeat workflows
  • Encode them as reusable skills, templates, or cached prompts
  • Break-even occurs after ~10 invocations; expect 30-70% per-task cost reduction at scale
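The break-even claim can be checked with a one-line formula: the one-time build cost divided by the per-invocation saving. The helper below is a sketch, and any dollar figures fed into it are made-up inputs rather than Ramp or Deloitte data.

```python
import math


def break_even_invocations(build_cost_usd, adhoc_cost_usd, skill_cost_usd):
    """Invocations needed before a reusable skill pays back its build cost.

    Returns None when the skill is not cheaper per call than the ad hoc prompt.
    """
    saving = adhoc_cost_usd - skill_cost_usd
    if saving <= 0:
        return None
    return math.ceil(build_cost_usd / saving)


def per_task_reduction(adhoc_cost_usd, skill_cost_usd):
    """Fractional per-task cost reduction once the skill is in place."""
    return (adhoc_cost_usd - skill_cost_usd) / adhoc_cost_usd
```

With a $5.00 build cost, a $1.00 ad hoc task, and a $0.50 skill invocation, break-even lands at 10 invocations and the per-task reduction is 50%, inside the 30-70% range quoted above.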

Week 8 — Controls

  • Set user-level budgets in your code assistants (now generally available in GitHub Copilot's June 2026 pricing model)
  • Implement approval gates on agent runs above a dollar threshold
  • Configure automatic throttling on anomalies
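The three controls above reduce to a small decision function. This is a sketch under stated assumptions: the $50 approval threshold, the 3x anomaly multiplier, and the function names are hypothetical defaults, not settings from Copilot, Ramp, or any other product.

```python
from statistics import mean


def needs_approval(estimated_cost_usd, threshold_usd=50.0):
    """Approval gate: agent runs priced above the threshold wait for sign-off."""
    return estimated_cost_usd > threshold_usd


def is_anomalous(today_usd, trailing_daily_usd, multiplier=3.0):
    """Flag a day whose spend exceeds `multiplier` times the trailing average."""
    if not trailing_daily_usd:
        return False  # no history yet, so nothing to compare against
    return today_usd > multiplier * mean(trailing_daily_usd)


def decide(spent_usd, budget_usd, today_usd, trailing_daily_usd):
    """Return 'block', 'throttle', or 'allow' for a user's next request."""
    if spent_usd >= budget_usd:
        return "block"       # user-level budget exhausted
    if is_anomalous(today_usd, trailing_daily_usd):
        return "throttle"    # slow down while someone investigates
    return "allow"
```

Blocking on budget before checking for anomalies keeps the hard cap authoritative; the anomaly check only softens behavior for users who still have budget left.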

Days 61-90: Governance and ROI

Weeks 9-10 — Allocation

  • Move from cost-center aggregation to true chargeback by team and project
  • Surface AI spend in monthly business reviews alongside other variable costs
  • Run a closing-the-books exercise that compares forecast to actuals
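The forecast-to-actuals comparison can be tied back to Dimension 3 of the maturity assessment. The sketch below assumes variance is measured as an absolute fraction of forecast and treats the band edges (5%, 10%, 25%, 50%) as inclusive on the lower-scoring side; the rubric does not specify either detail.

```python
def forecast_variance(actual_usd, forecast_usd):
    """Absolute variance as a fraction of forecast (0.30 means 30% off)."""
    if forecast_usd <= 0:
        raise ValueError("forecast must be positive")
    return abs(actual_usd - forecast_usd) / forecast_usd


def variance_points(variance):
    """Map a variance fraction to the 1-5 Forecast Accuracy scale."""
    if variance > 0.50:
        return 1
    if variance >= 0.25:
        return 2
    if variance >= 0.10:
        return 3
    if variance >= 0.05:
        return 4
    return 5
```

The mapping covers only the numeric variance. The 4- and 5-point levels also call for calibrated models, burn-rate alerts, and rolling forecasts, which a number alone cannot confirm.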

Week 11 — ROI Attribution

  • For top 5 workflows, calculate net ROI (value created minus token cost)
  • Identify the bottom 20% of workflows that are not pulling their weight—either fix or sunset
  • Establish a quarterly ROI review cadence
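Net ROI as defined above (value created minus token cost) lends itself to a simple ranking. The workflow records and field names below are hypothetical, and how `value_usd` gets estimated is the hard part that this sketch does not address.

```python
import math


def net_roi_usd(workflow):
    """Value created minus token cost for one workflow record."""
    return workflow["value_usd"] - workflow["token_cost_usd"]


def bottom_fifth(workflows):
    """Names of the lowest-ROI 20% of workflows (at least one if any exist)."""
    if not workflows:
        return []
    ranked = sorted(workflows, key=net_roi_usd)
    count = math.ceil(len(ranked) / 5)
    return [w["name"] for w in ranked[:count]]
```

The workflows returned by `bottom_fifth` are the candidates for the fix-or-sunset decision in the quarterly review.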

Week 12 — Operating Cadence

  • Publish an internal "AI FinOps" charter with named accountabilities
  • Establish a CFO + CIO + Head of Eng monthly review meeting
  • Set rolling 90-day forecasts with confidence intervals
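One deliberately naive way to produce a rolling forecast with a confidence interval is to project the trailing daily average and widen it by the observed day-to-day spread. The sketch assumes daily spend is independent with a stable mean and uses a normal approximation (z = 1.96 for roughly 95%); when usage is growing quickly, a trend-aware model would be needed instead.

```python
from math import sqrt
from statistics import mean, stdev


def rolling_forecast(daily_usd, horizon_days=90, z=1.96):
    """Return (low, expected, high) total spend over the horizon.

    daily_usd is the trailing window of daily spend, most recent last.
    """
    if len(daily_usd) < 2:
        raise ValueError("need at least two days of history")
    expected = mean(daily_usd) * horizon_days
    # Under independence, the spread of a sum grows with sqrt(number of days).
    margin = z * stdev(daily_usd) * sqrt(horizon_days)
    return max(0.0, expected - margin), expected, expected + margin
```

Re-running this every week on the latest 30 days of spend gives the "rolling" behavior; a widening interval is itself a signal worth raising in the monthly review.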

Customers running this playbook—per Ramp and FinOps Foundation data—typically reduce AI spend 20-50% while preserving or accelerating velocity. The wins come less from cost-cutting and more from killing wasted experimentation, eliminating shadow spend, and re-platforming the top workflows onto cheaper models without quality loss.

Case Study: AT&T's 90% Token Cost Reduction

The most-cited case study in CFO AI governance circles right now is AT&T. According to Deloitte's AI tokenomics framework, AT&T scaled internal AI usage from 8 billion tokens daily to 27 billion tokens daily after implementing multi-agent orchestration—a 237% increase in volume.

Most enterprises would expect their AI bill to more than triple alongside that growth. AT&T's didn't. The company achieved 90% cost savings on a per-task basis through three architectural moves:

  1. Tiered model routing. Light reasoning and classification went to small open-source models hosted internally. Mid-tier work was routed to mid-sized commercial models. Frontier models handled only the hardest 5-10% of queries.

  2. Aggressive caching across the orchestration layer. Common context payloads were cached at the agent-mesh layer rather than at each model call, eliminating redundant token spend across multi-agent workflows.

  3. Reusable agent skills. Common workflows were encoded as deterministic skills that amortized token cost across hundreds of thousands of invocations, dropping per-invocation cost by 60-80%.
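A back-of-envelope check shows why the bill need not track volume. The token volumes and the 90% figure come from the case study; the assumption that per-task savings carry over one-for-one to cost per token (that is, tokens per task stayed constant) is an added simplification and may not hold.

```python
def growth(before, after):
    """Return (multiple, percent increase) between two volumes."""
    multiple = after / before
    return multiple, (multiple - 1) * 100


def spend_index(volume_multiple, unit_cost_savings):
    """Spend relative to the starting bill (1.0 means unchanged)."""
    return volume_multiple * (1 - unit_cost_savings)


# 8 billion to 27 billion tokens per day, with 90% unit-cost savings.
multiple, pct = growth(8e9, 27e9)
index = spend_index(multiple, 0.90)
print(f"{multiple:.3f}x volume, +{pct:.1f}%, spend index {index:.2f}")
```

Under that simplification, a 3.375x rise in volume combined with a 90% lower unit cost leaves total spend at roughly a third of the starting bill rather than more than triple it.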

The lesson is simple and counterintuitive: the path to controlling AI cost is rarely "use AI less." It is "use AI smarter." The best operators are scaling token volume while compressing unit cost—exactly the opposite of the panicked cost-cutting most boards reach for first.

A second case study cited in Deloitte's research tells the cautionary version: a healthcare enterprise consumed 1 trillion tokens over six months, translating into more than $6 million in unplanned costs before the finance team even understood what was driving it. The root cause wasn't bad strategy—it was no instrumentation. The bill simply showed up.

Two companies, both consuming tokens by the trillion, opposite outcomes. The difference is governance maturity, not engineering talent.

What To Do About It

For CIOs and CTOs

  • Stand up FinOps for AI this quarter. If you have not appointed an owner for AI cost governance, do it before fiscal Q4 planning. The longer you wait, the harder the cleanup.
  • Make routing a first-class architectural pattern. Build (or buy) a model gateway that enforces tier-appropriate routing. Hard-code Opus-class models only into the workflows that justify them.
  • Mandate prompt caching and reusable skills. Every team should have a cached-context strategy and a skills library. Productivity tools without these are 5-10x more expensive than they need to be.

For CFOs

  • Run the 25-point maturity assessment with your CIO this month. If you score below 15, AI cost discipline is now a top-3 priority for the remainder of the fiscal year.
  • Move AI from a single line item to a third financial primitive. Track token consumption next to headcount and revenue. Build forecasting models that include hidden amplifiers (retries, RAG, embeddings).
  • Demand ROI attribution for the top 10 AI workflows. "We are saving engineering time" is not an answer your board will accept in 2027. Net-of-token-cost dollar values are.

For Business Leaders

  • Treat shadow AI as a margin issue, not an IT issue. If 40-60% of your AI spend is invisible until month-end, that is a margin leak masquerading as a tooling problem.
  • Sponsor cross-functional AI FinOps. This requires CFO + CIO + Business Unit alignment. Without an executive sponsor, the playbook stalls in week 5.
  • Use the AI spend conversation to re-anchor ROI dialogue. The companies winning the AI narrative in 2026 are not the biggest spenders. They are the ones who can show net-of-token-cost outcomes per dollar deployed.

The Ramp valuation isn't a fintech story. It's a CFO story. And the operators who win the next 24 months will be the ones who treat tokens like the third pillar of business spend—not a line item to argue about at month-end.



THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe for weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.
