CloudZero just published numbers that should terrify every CFO planning 2027 AI budgets. Companies allocate 30-36% of cloud spend to AI initiatives, but line items explicitly labeled AI account for just 2.5% of that spend. The other 97.5% is what they call "ghost spend": real AI costs buried under generic compute and storage that no one tracks.
This isn't rounding error territory. This is the difference between thinking you spent $500K on AI and discovering you actually spent $20M. And it's happening because AI pricing has fractured into six fundamentally incompatible billing models that make traditional cost management obsolete.
Here's the core problem: OpenAI charges per token. Microsoft charges per seat. AWS charges per GPU-hour. Salesforce charges per conversation. Your engineering team just spun up a fine-tuning job on SageMaker, and next month's invoice will teach you what "consumption-based pricing" really means. (Spoiler: it means "more than you budgeted.")
Why AI Costs Disappear Into Cloud Bills
Traditional software had one license, one price, one invoice. AI pricing combines multiple billing dimensions in a single product: tokens consumed, compute hours used, data processed, features accessed, and seats provisioned.
The application layer is visible. ChatGPT Plus subscriptions ($20/month), API credits ($0.15 to $15.00 per million tokens), per-seat licenses (Copilot at $18-$30/user/month), and outcome-based charges (Salesforce Agentforce at $2/conversation). These show up as vendor invoices that someone approved.
The infrastructure layer is invisible. GPU compute runs at $1-$55/hour depending on instance type. Model training jobs, data pipelines, vector databases, and self-hosted inference all get billed as generic cloud resources. They appear in your AWS/Azure/GCP bill under "Compute" or "Storage" with no AI context attached.
CloudZero processes over $15 billion in managed cloud and AI spend. Their data shows organizations budget 30-36% of total cloud expenditure for AI, but only 2.5% of that AI spend appears in budget line items explicitly labeled "AI." The remaining 97.5% is scattered across infrastructure categories that finance teams have no way to tie back to AI initiatives.
This creates a measurement gap that kills ROI tracking. If your CFO thinks the AI coding assistant program costs $180K/year (based on per-seat licenses for 500 developers at $30/month), but the actual cost is $2.1M when you include the inference infrastructure, vector database, fine-tuning runs, and data pipeline — you're not measuring ROI. You're guessing.
Six Incompatible Pricing Models (And None Are Comparable)
Every major AI provider uses a different pricing model, and comparing them requires normalizing fundamentally incompatible billing structures.
Per-token pricing (OpenAI, Anthropic, Google): You pay for input tokens processed and output tokens generated. OpenAI's GPT-4o costs $2.50 per 1M input tokens and $10.00 per 1M output tokens. Anthropic's Claude Sonnet 4.6 is $3.00/$15.00 per 1M tokens. Google's Gemini 2.5 Pro is $1.25/$10.00 per 1M tokens.
Forecasting difficulty: High. Costs scale with usage, and token consumption varies wildly based on prompt engineering, context window usage, and output verbosity. A team that switches from 200-word prompts to 2,000-word prompts can 10x its input token costs without increasing the number of API calls.
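To make that sensitivity concrete, here is a minimal Python sketch of per-token cost using the GPT-4o list prices quoted above. The function name, the 1.3 tokens-per-word ratio, and the call volume are illustrative assumptions, not vendor figures.

```python
# Prices per 1M tokens, taken from the GPT-4o figures quoted above.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00

# Rough assumption for English text; real tokenizers vary by model and content.
TOKENS_PER_WORD = 1.3


def monthly_token_cost(calls, prompt_words, output_words):
    """Estimate monthly (input, output) API cost in dollars for a fixed call volume."""
    input_tokens = calls * prompt_words * TOKENS_PER_WORD
    output_tokens = calls * output_words * TOKENS_PER_WORD
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost, output_cost


# Hypothetical workload: 1M calls per month, 150-word responses.
short_in, short_out = monthly_token_cost(1_000_000, 200, 150)
long_in, long_out = monthly_token_cost(1_000_000, 2_000, 150)

print(f"200-word prompts:   input ${short_in:,.0f}, output ${short_out:,.0f}")
print(f"2,000-word prompts: input ${long_in:,.0f}, output ${long_out:,.0f}")
```

The number of calls is identical in both runs. Only the prompt length changes, and the input side of the bill scales with it.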
Per-seat subscriptions (Microsoft Copilot): Flat monthly fee per user, independent of actual usage. Copilot Business costs $18/user/month. Copilot Enterprise costs $30/user/month plus a Microsoft 365 base subscription.
Forecasting difficulty: Low for seat counts, but usage-blind. You pay the same whether an employee uses Copilot 100 times a day or never logs in. This creates cost efficiency problems at scale: a 10,000-employee company paying $300K/month for Copilot Enterprise licenses (10,000 seats × $30) has no visibility into which teams drive value and which don't.
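One rough way to put a number on "usage-blind" is cost per active user. The sketch below assumes you can get an active-user count from the vendor's admin reporting. The seat count and activity levels are hypothetical.

```python
def cost_per_active_user(seats, price_per_seat, active_users):
    """Monthly license spend divided by the users who actually use the tool."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return seats * price_per_seat / active_users


# Hypothetical team: 1,000 seats at the $30/user/month rate quoted above.
SEATS = 1_000
PRICE = 30

for active in (1_000, 500, 250):
    effective = cost_per_active_user(SEATS, PRICE, active)
    print(f"{active:>5} active users -> ${effective:,.2f} per active user")
```

The invoice is the same in all three cases. What changes is how much each person who actually uses the tool is costing you.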
Consumption-based GPU compute (AWS SageMaker, Azure ML): You're billed by GPU type and runtime hours. AWS charges $55.04/hour for an H100 instance. If you run a training job for 72 hours, that's $3,962.88 — before data transfer, storage, and inference costs.
Forecasting difficulty: High. Training duration depends on model size, dataset size, hyperparameter tuning, and early stopping criteria. A team that estimates "a few days" for training can easily burn through $20K-$50K if jobs don't converge or require multiple retraining runs.
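The GPU arithmetic is simple, which is exactly why it gets underestimated. This sketch uses the $55.04/hour rate quoted above. The retry count is a hypothetical scenario, and the function covers compute only.

```python
from decimal import Decimal

H100_HOURLY = Decimal("55.04")  # hourly instance rate quoted above


def training_cost(hours, runs=1, hourly_rate=H100_HOURLY):
    """Compute-only cost; excludes data transfer, storage, and inference."""
    return hourly_rate * hours * runs


single_run = training_cost(72)
# Hypothetical scenario: the job fails to converge and is rerun 8 times in total.
with_retries = training_cost(72, runs=8)

print(f"One 72-hour run:    ${single_run:,.2f}")
print(f"Eight 72-hour runs: ${with_retries:,.2f}")
```

Decimal is used here so the dollar figures come out exact rather than picking up floating-point noise.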
Per-conversation or per-resolution (Salesforce Agentforce, Intercom Fin): Billed per completed customer interaction. Agentforce charges $2 per conversation or $0.10 per action (depending on plan structure).
Forecasting difficulty: Medium. Volume is predictable if customer interaction patterns are stable, but costs spike during product launches, support incidents, or seasonal traffic. A SaaS company handling 50,000 support conversations per month pays $100K/month — but if a critical bug doubles conversation volume, costs double too.
Tiered/freemium models (ChatGPT Free/Plus/Pro): Free tier with usage limits, then paid capacity upgrades. ChatGPT offers Free (rate-limited), Plus ($20/month for faster access), and Team/Enterprise (custom pricing).
Forecasting difficulty: Medium. Individual costs are fixed, but "upgrade creep" is real. Teams start with free accounts, hit rate limits, upgrade to Plus, then realize they need API access for integrations, and suddenly the $0 pilot program costs $50K/year in API credits.
Hybrid models (subscription + usage overages): Base subscription fee plus per-token or per-conversation overages. Common in enterprise contracts where vendors offer a committed minimum spend with overage billing.
Forecasting difficulty: Highest. You have a cost floor (the base subscription) and a moving ceiling (usage-based overages). A company signing a $500K annual commitment might think costs are capped, then discover they hit $850K after overages from unexpected usage spikes.
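Here is a minimal sketch of why a commitment is a floor and not a cap. The contract terms below (included units, overage rate) are hypothetical, chosen so the overage case lands on the $850K figure above. Real contracts vary.

```python
def hybrid_annual_cost(commitment, included_units, used_units, overage_rate):
    """Commitment is the floor; usage beyond the included units is billed on top."""
    overage_units = max(0, used_units - included_units)
    return commitment + overage_units * overage_rate


# Hypothetical contract: $500K commitment covering 250,000 conversations,
# with overages billed at $2 each.
COMMITMENT = 500_000
INCLUDED = 250_000
OVERAGE_RATE = 2

under = hybrid_annual_cost(COMMITMENT, INCLUDED, 200_000, OVERAGE_RATE)
over = hybrid_annual_cost(COMMITMENT, INCLUDED, 425_000, OVERAGE_RATE)

print(f"Usage below commitment: ${under:,}")
print(f"Usage above commitment: ${over:,}")
```

Using less than the included volume does not lower the bill, and using more raises it without limit. That asymmetry is what makes this model the hardest to forecast.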
The Real-World Cost Gap: What Finance Thinks vs. What Engineering Spent
Here's what the cost gap looks like in practice for a mid-size enterprise (5,000 employees, $2B annual revenue) deploying AI across customer support, sales, and engineering:
Finance's view (budget line items):
- Microsoft Copilot Enterprise: 1,200 seats × $30/month × 12 months = $432,000
- OpenAI API credits (committed spend): $150,000
- Salesforce Agentforce: 25,000 conversations/month × $2 × 12 = $600,000
- Total budgeted AI spend: $1,182,000
Engineering's reality (actual costs, including infrastructure):
- Copilot licenses: $432,000 (matches budget)
- OpenAI API usage (actual): $340,000 (127% over committed spend due to usage spikes)
- Agentforce conversations: $720,000 (20% above forecast after product launch)
- AWS SageMaker training runs: $580,000 (fine-tuning custom models for vertical-specific use cases)
- GPU inference infrastructure: $1,200,000 (self-hosted inference for latency-sensitive applications)
- Vector database (Pinecone/Weaviate): $180,000 (RAG pipeline for customer support knowledge base)
- Data pipeline and ETL: $320,000 (preparing training data, cleaning embeddings)
- S3 storage for model artifacts: $45,000
- Total actual AI spend: $3,817,000
The CFO approved a $1.18M AI budget. The company spent $3.82M, a 223% cost overrun that won't show up in quarterly AI spend reports because about 61% of the actual spend ($2.33M) is categorized as "cloud infrastructure."
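If you want to check this reconciliation yourself, the sketch below reproduces it from the two lists above. Treating the five infrastructure rows as "not labeled AI" is my assumption about how they land in the cloud bill. The dictionary layout is illustrative only.

```python
# Figures copied from the two lists above (annual, in dollars).
budgeted = {
    "Copilot licenses": 432_000,
    "OpenAI API": 150_000,
    "Agentforce": 600_000,
}

# (amount, appears on an invoice or line item labeled as AI?)
# The True/False flags are an assumption for illustration.
actual = {
    "Copilot licenses": (432_000, True),
    "OpenAI API": (340_000, True),
    "Agentforce": (720_000, True),
    "SageMaker training": (580_000, False),
    "GPU inference": (1_200_000, False),
    "Vector database": (180_000, False),
    "Data pipeline and ETL": (320_000, False),
    "S3 model artifacts": (45_000, False),
}

total_budget = sum(budgeted.values())
total_actual = sum(amount for amount, _ in actual.values())
unlabeled = sum(amount for amount, labeled in actual.values() if not labeled)

overrun_pct = (total_actual - total_budget) / total_budget * 100
unlabeled_pct = unlabeled / total_actual * 100

print(f"Budgeted: ${total_budget:,}")
print(f"Actual:   ${total_actual:,}")
print(f"Overrun:  {overrun_pct:.0f}%")
print(f"Not labeled as AI: ${unlabeled:,} ({unlabeled_pct:.0f}% of actual)")
```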
What CFOs Should Demand From Their Teams
If you're a CFO or finance leader trying to get control of AI spend, here's what you need to implement before the end of Q3 2026:
Unified cost attribution across all AI layers. Every AI cost — whether it's a ChatGPT subscription, an API token, or a GPU training run — needs to be tagged and attributed to a specific initiative, team, or product. CloudZero and similar FinOps platforms can ingest costs from OpenAI, Anthropic, AWS, GCP, Azure, and SaaS vendors into one normalized cost model.
This eliminates the ghost spend problem. Instead of seeing "$2.3M cloud compute" with no context, you see "$1.4M attributed to AI model training, $600K to inference infrastructure, $300K to data pipelines" with drill-down visibility into which teams and projects drove those costs.
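Under the hood, attribution is mostly a group-by on tags. The sketch below shows the idea with made-up billing rows. The service names, the "ai_initiative" tag key, and the untagged row are hypothetical, and real billing exports use different schemas.

```python
from collections import defaultdict

# Hypothetical billing rows: (service, cost in dollars, tags).
billing_rows = [
    ("GPU instances", 1_400_000, {"ai_initiative": "model-training"}),
    ("GPU instances", 600_000, {"ai_initiative": "inference"}),
    ("ETL and storage", 300_000, {"ai_initiative": "data-pipelines"}),
    ("General compute", 250_000, {}),  # untagged: cannot be attributed
]


def attribute_costs(rows, tag_key="ai_initiative"):
    """Sum cost per tag value; rows without the tag land in 'untagged'."""
    totals = defaultdict(int)
    for _service, cost, tags in rows:
        totals[tags.get(tag_key, "untagged")] += cost
    return dict(totals)


by_initiative = attribute_costs(billing_rows)
for initiative, cost in sorted(by_initiative.items(), key=lambda kv: -kv[1]):
    print(f"{initiative:<16} ${cost:>12,}")
```

The "untagged" bucket is the number to watch. Anything that lands there is spend you cannot tie back to an initiative.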
Unit economics for every AI use case. Don't measure AI spend in aggregate. Measure cost per outcome: cost per customer support ticket resolved, cost per sales email generated, cost per code suggestion accepted. This tells you which AI investments deliver ROI and which are burning budget.
A customer support team using Agentforce at $2/conversation needs to know whether AI resolution costs less than human agent resolution (typically $8-$12 per ticket). If the team handles 50,000 tickets/month, AI resolves 60% of them, and each AI resolution saves $6, that's $180K/month in savings against $100K/month in Agentforce costs: a clear win. But if the resolution rate drops to 30%, savings fall to $90K/month, less than the Agentforce bill, and the economics flip.
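The Agentforce example reduces to two small functions. This sketch assumes, as the numbers above do, that every ticket incurs the $2 conversation fee whether or not the AI resolves it. The function names are mine.

```python
def monthly_net_savings(tickets, resolution_rate, saving_per_resolved, price_per_conversation):
    """Savings from AI-resolved tickets minus what the AI agent costs."""
    savings = tickets * resolution_rate * saving_per_resolved
    cost = tickets * price_per_conversation
    return savings - cost


def break_even_resolution_rate(saving_per_resolved, price_per_conversation):
    """Resolution rate at which savings exactly cover the per-conversation fee."""
    return price_per_conversation / saving_per_resolved


TICKETS = 50_000
SAVING = 6   # dollars saved per AI-resolved ticket
PRICE = 2    # dollars per conversation

for rate in (0.60, 0.30):
    net = monthly_net_savings(TICKETS, rate, SAVING, PRICE)
    print(f"Resolution rate {rate:.0%}: net ${net:,.0f}/month")

print(f"Break-even resolution rate: {break_even_resolution_rate(SAVING, PRICE):.1%}")
```

The break-even rate depends only on the ratio of price to savings per resolved ticket, not on volume. That makes it a useful number to track month to month.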
Pre-approval thresholds for infrastructure experiments. Engineering teams should not be able to spin up $50K+ GPU training runs without finance visibility. Implement spend alerts and approval workflows for high-cost infrastructure: H100 instances, multi-day training jobs, production inference clusters.
This doesn't slow down innovation. It prevents the scenario where a team "tries out" a training approach that burns through $80K in GPU costs before anyone realizes the job is still running.
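A pre-approval gate can be as simple as a worst-case cost check before a job launches. The sketch below is one possible shape, not a feature of any specific cloud platform. The class, field names, and job examples are hypothetical, and the key idea is requiring a hard runtime cap up front.

```python
from dataclasses import dataclass

APPROVAL_THRESHOLD = 50_000  # dollars; mirrors the $50K figure above


@dataclass
class JobRequest:
    name: str
    hourly_rate: float  # dollars per instance-hour
    instances: int
    max_hours: float    # hard runtime cap requested by the team

    @property
    def worst_case_cost(self) -> float:
        """Cost if the job runs on every instance until the runtime cap."""
        return self.hourly_rate * self.instances * self.max_hours


def needs_finance_approval(job: JobRequest, threshold: float = APPROVAL_THRESHOLD) -> bool:
    """Flag jobs whose worst-case cost meets or exceeds the threshold."""
    return job.worst_case_cost >= threshold


small = JobRequest("prompt-eval", hourly_rate=55.04, instances=1, max_hours=72)
large = JobRequest("fine-tune-v2", hourly_rate=55.04, instances=4, max_hours=240)

for job in (small, large):
    status = "needs approval" if needs_finance_approval(job) else "auto-approved"
    print(f"{job.name}: worst case ${job.worst_case_cost:,.2f} -> {status}")
```

Because the check uses the runtime cap and not the team's estimate, a forgotten job can't silently exceed what was approved, as long as the cap is enforced at the infrastructure level.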
Monthly AI cost reviews with engineering and product leaders. Finance, engineering, and product need to meet monthly to reconcile forecasted AI spend against actuals, identify cost anomalies, and adjust budgets. If API token costs spiked 40% month-over-month, you need to know whether that's from a product launch (expected), inefficient prompt engineering (fixable), or a bot attack (urgent).
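For the anomaly part of that review, a month-over-month check is a reasonable starting point. The sketch below flags any month where spend grew by at least 40%. The spend figures are hypothetical, and the check only tells you that a spike happened, not why.

```python
def month_over_month_alerts(monthly_costs, threshold_pct=40.0):
    """Return (month, pct_change) pairs where cost grew by at least threshold_pct."""
    alerts = []
    months = list(monthly_costs.items())
    for (_, prev), (month, curr) in zip(months, months[1:]):
        if prev <= 0:
            continue  # no meaningful baseline to compare against
        change = (curr - prev) / prev * 100
        if change >= threshold_pct:
            alerts.append((month, round(change, 1)))
    return alerts


# Hypothetical API token spend by month, in dollars (chronological order).
token_spend = {
    "2026-04": 20_000,
    "2026-05": 21_000,
    "2026-06": 31_500,  # 50% jump over May
    "2026-07": 32_000,
}

for month, pct in month_over_month_alerts(token_spend):
    print(f"{month}: token spend up {pct}% month-over-month, investigate")
```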
Talking to CFOs over the past few months, the consistent theme is surprise. Not at the scale of AI investments — most companies expected to spend big on AI in 2026 — but at the gap between what they thought they were spending and what they actually spent. The ghost spend problem is fixable, but only if finance and engineering collaborate on cost visibility before Q4 budget planning starts.
— Rajesh
