Dell Technologies just made the biggest on-premises AI infrastructure bet in history. At Dell Technologies World 2026 in Las Vegas, CEO Michael Dell and NVIDIA CEO Jensen Huang announced that 67% of enterprise AI workloads now run outside the cloud — and projected AI infrastructure spending could hit $3-4 trillion by 2030.
The headline product: Dell PowerRack, a fully integrated rack-scale AI platform that Dell says deploys in 6.5 hours and cuts cost per token by up to 10x (the spec list below frames that figure as a comparison with NVIDIA Blackwell for agentic AI inferencing). Eli Lilly, Honeywell, and Samsung Electronics are already running production workloads on Dell's AI Factory with NVIDIA.
For CIOs and CFOs navigating the build-vs-buy infrastructure decision, this announcement signals a massive strategic shift: enterprise AI is moving back behind the corporate firewall.
The Numbers Behind the On-Prem Shift
Dell's internal AI adoption survey (cited at the keynote) found that 88% of enterprises are now running at least one AI workload on-premises — whether in their own data centers, at the edge, in colocation facilities, or on local devices.
The reasons are straightforward: cost control, data sovereignty, and performance predictability.
Jensen Huang's keynote quote captured the market shift: "We've now arrived at the era of useful AI, which is the reason why demand is going parabolic, utterly parabolic. What took months now takes weeks. What took weeks now takes days. And what takes days now takes hours."
The compute requirements for this productivity leap are staggering. Dell projects token consumption will grow 3,400% by 2030. For enterprises running agentic AI at scale, the cost of cloud-based token consumption becomes unsustainable fast.
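To put the 3,400% projection in perspective, the sketch below converts it into an overall multiple and an implied yearly growth rate. It reads "3,400% growth" as a 3,400% increase (35x the starting volume), and it assumes a four-year horizon from 2026 to 2030; the base year is not stated here, so both the horizon and the function names are illustrative assumptions.

```python
def growth_multiple(percent_increase: float) -> float:
    """A p% increase means the final value is (1 + p/100) times the start."""
    return 1 + percent_increase / 100


def implied_annual_rate(multiple: float, years: float) -> float:
    """Constant yearly growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1 / years) - 1


multiple = growth_multiple(3400)
# Assumption: growth is measured from 2026 to 2030 (4 years).
rate = implied_annual_rate(multiple, 4)
print(f"{multiple:.0f}x overall, about {rate:.0%} per year")
```

Under those assumptions, token volume would need to grow by roughly 143% per year, or about 2.4x annually. A different base year changes the yearly rate but not the 35x total.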
Bottom line for CFOs: When your AI workloads scale from pilot to production, per-token cloud costs add up. On-premises infrastructure shifts the cost curve from variable OpEx to predictable CapEx — with full control over data and model IP.
Dell PowerRack: The Technical Architecture
Dell PowerRack is a rack-scale platform that integrates compute, networking, and storage into a single engineered system. Unlike traditional component-based deployments, PowerRack arrives pre-integrated with thermal design, power management, and software optimization built in from the ground up.
Deployment speed: 6.5 hours from delivery to production-ready. Compare that to traditional rack assembly timelines (weeks to months) and the operational advantage is clear.
What's inside PowerRack:
- Compute: Dell PowerEdge XE9812 servers built on the NVIDIA Vera Rubin NVL72 platform (up to 10x lower cost per token than NVIDIA Blackwell for agentic AI inferencing)
- CPU: NVIDIA Vera CPUs with 1.2 TB/s memory bandwidth (50% faster agentic workloads vs traditional x86, 3x faster database queries)
- Networking: Dell PowerSwitch with NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-6 Ethernet (liquid-cooled, co-packaged optics)
- Storage: Dell Exascale storage (245 TB drives, releasing H2 2026)
Thermal management: PowerRack uses Dell PowerCool CDU C7000 liquid cooling to handle massive-scale AI workloads without the operational overhead of retrofitting air-cooled data centers.
For CTOs: PowerRack is designed for enterprises that need predictable performance under sustained load. Traditional cloud burst capacity works for occasional spikes, but when your AI agents run 24/7, you need infrastructure you control.
Real Enterprise Deployments: Eli Lilly, Honeywell, Samsung
Eli Lilly is using Dell AI Factory with NVIDIA to accelerate drug discovery and life sciences R&D. Diogo Rau, Lilly's EVP and Chief Information and Digital Officer, joined the keynote to describe technology as key to delivering cutting-edge science at scale.
His quote: "I think we're on the verge of maybe being able to end disease as we know it. Something like that was completely unimaginable 20 years ago, but today we can imagine it."
Lilly's AI workloads run in environments where failure has severe consequences — regulatory compliance, patient safety, and intellectual property protection. Cloud-based AI doesn't meet these requirements. On-premises AI infrastructure does.
Honeywell moved from public cloud to on-premises AI specifically for industrial use cases, digital twins, and automation. Suresh Venkatarayalu, Honeywell's CTO, explained the decision in the keynote: "For me, partnering with Dell and NVIDIA is not just about getting infrastructure. It's the full AI stack: scalable, secured, and trusted by customers."
Honeywell's AI workloads span from data center to the edge — connecting factory floors, supply chains, and real-time control systems. Cloud latency and connectivity dependencies don't work for time-sensitive industrial automation. On-premises AI does.
Samsung Electronics is running R&D chip design and manufacturing AI workloads on Dell AI Factory with NVIDIA. Chip design requires massive compute for simulations, verification, and optimization — workloads that benefit from dedicated on-premises infrastructure rather than shared cloud resources.
Bottom line for business leaders: These aren't pilot projects. These are multi-year, production-scale deployments where failure has severe consequences. That's why enterprises are choosing on-premises AI infrastructure.
The Cloud vs On-Prem Cost Equation
Cloud-based AI pricing is typically billed per token or per API call. This works well for experimentation and low-volume use cases. But when your enterprise deploys agentic AI at scale, token consumption grows exponentially.
Example scenario: An enterprise running 10,000 AI agents (customer service, internal automation, code generation, data analysis) might generate billions of tokens per month. At $0.01 per 1,000 tokens (a typical cloud rate for large models), that's $10,000 per billion tokens, or $100,000 per month at 10 billion tokens, a moderate-scale deployment.
Over 36 months, that's $3.6 million in OpEx. A comparable on-premises PowerRack deployment might cost $1.2 million in CapEx (estimated based on rack-scale AI infrastructure pricing). At $100,000 per month in avoided cloud spend, that outlay is recovered after 12 months, and years 2 and 3 are savings before on-prem running costs such as power, cooling, and staffing.
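The arithmetic in this scenario can be checked with a short sketch. It uses the article's illustrative figures ($0.01 per 1,000 tokens, $1.2 million CapEx) and the 10 billion tokens per month implied by a $100,000 monthly bill. The function names and the optional on-prem running-cost parameter are assumptions added for illustration, not vendor pricing.

```python
import math
from typing import Optional


def monthly_cloud_cost(tokens_per_month: float, usd_per_1k_tokens: float) -> float:
    """Cloud bill for one month of per-token pricing."""
    return tokens_per_month / 1_000 * usd_per_1k_tokens


def break_even_month(capex: float, monthly_cloud: float,
                     monthly_onprem_opex: float = 0.0) -> Optional[int]:
    """First month in which cumulative cloud spend avoided covers the CapEx.

    Returns None if on-prem running costs meet or exceed the cloud bill,
    in which case the CapEx is never recovered.
    """
    monthly_saving = monthly_cloud - monthly_onprem_opex
    if monthly_saving <= 0:
        return None
    # Round before ceil so float noise cannot push 12.0 up to 13.
    return math.ceil(round(capex / monthly_saving, 9))


cloud = monthly_cloud_cost(10e9, 0.01)       # 10 billion tokens per month
print(f"Monthly cloud cost: ${cloud:,.0f}")
print(f"36-month cloud cost: ${cloud * 36:,.0f}")
print("Break-even month, no on-prem OpEx:", break_even_month(1_200_000, cloud))
# Hypothetical $20,000/month for power, cooling, and staff.
print("Break-even month, with OpEx:", break_even_month(1_200_000, cloud, 20_000))
```

With zero running costs the CapEx is recovered in month 12. Adding the hypothetical $20,000 per month of on-prem OpEx pushes break-even to month 15, which is why the running-cost assumption deserves as much scrutiny as the hardware price.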
Additional on-prem advantages:
- Zero cloud egress fees (moving data in and out of cloud AI services can cost thousands per month)
- No API rate limits (cloud throttling doesn't exist when you own the infrastructure)
- Full model IP control (train proprietary models without exposing data to third-party cloud providers)
For CFOs: The on-prem cost advantage increases the longer you run production AI workloads. Cloud makes sense for experiments. On-prem makes sense for strategic, high-volume deployments.
Data Sovereignty and Confidential Computing
67% of enterprises cite data sovereignty as a top concern when deploying AI workloads in the cloud. This includes:
- Regulatory compliance: GDPR, HIPAA, financial regulations that restrict where data can be processed
- Intellectual property protection: Proprietary models, training data, and business logic that can't leave the enterprise perimeter
- Customer trust: Industries like healthcare, finance, and defense require on-premises AI for contractual and security reasons
Dell and NVIDIA addressed this directly with NVIDIA Confidential Computing — a hardware-based security layer that protects AI models and sensitive data in use, even when running frontier models like Google Gemini 3.0 or SpaceXAI on-premises.
What this enables:
- Run the world's best proprietary AI models (Gemini, SpaceXAI, etc.) inside your own data center without exposing model IP or enterprise data
- Meet regulatory requirements for data residency and sovereignty
- Deploy AI workloads in regulated industries (finance, healthcare, government) without compromising on model quality
Google Distributed Cloud (GDC) with Gemini 3.0 is now available in preview on Dell PowerEdge XE9780 servers, accelerated by NVIDIA Blackwell and secured by NVIDIA Confidential Computing.
For CIOs: Confidential Computing solves the "we want frontier models but can't send data to the cloud" problem. You get both: world-class models running on-premises with full data control.
The Agentic AI Performance Advantage
NVIDIA Vera CPU (announced at the keynote) is purpose-built for agentic AI workloads. Unlike traditional CPUs optimized for parallel throughput, Vera is designed for sequential, tool-calling agent workloads where each step waits on the last.
Performance benchmarks:
- 50% faster agentic workloads vs traditional x86 CPUs (data pipelines, sandboxed tools, code execution)
- 3x faster database queries (Starburst, DuckDB) — critical when agents are constantly querying enterprise data
- 1.2 TB/s memory bandwidth, plus what was billed as the highest single-threaded performance of any CPU in the world (the two are separate metrics)
Why this matters for enterprises:
- Agentic AI deployments (customer service agents, sales automation, internal workflow bots) spend most of their time waiting on databases and APIs
- Cloud-based agents run on shared infrastructure with unpredictable latency
- On-premises agents on Vera CPUs deliver consistent, fast response times under sustained load
Dell PowerEdge M9822 and R9822 servers bring NVIDIA Vera CPUs to the enterprise AI factory, enabling faster agent responses and shorter feedback loops at scale.
For business leaders: Faster agents = better customer experience and higher employee productivity. The ROI shows up in reduced support costs, faster sales cycles, and more efficient internal operations.
5,000 Enterprises Already Deployed
Dell announced that 5,000 enterprises are already running AI workloads on Dell AI Factory with NVIDIA. This isn't early adoption — this is mainstream enterprise deployment.
Industries include:
- Life sciences: Eli Lilly (drug discovery, clinical trials)
- Industrial automation: Honeywell (digital twins, factory optimization)
- Semiconductors: Samsung Electronics (chip design, manufacturing)
- Financial services: Hudson River Trading (algorithmic trading, AI-driven research)
Bottom line: The shift from cloud to on-prem AI is already happening. Enterprises that move now gain competitive advantage in cost structure, data control, and performance predictability.
What This Means for Enterprise AI Strategy
Three strategic takeaways:
- Cloud vs on-prem is not binary: Hybrid deployments make sense. Use cloud for experiments, on-prem for production. But if 67% of workloads already run outside the cloud, the default assumption should flip.
- Data sovereignty is now table stakes: Regulators, customers, and boards are asking where your AI runs and who has access to your data. On-premises infrastructure answers these questions definitively.
- Cost predictability beats OpEx flexibility at scale: Cloud pricing works when token consumption is low. At production scale, on-premises CapEx delivers better unit economics.
For CIOs and CTOs:
- Evaluate PowerRack and competitive on-prem AI platforms (HPE, Lenovo, Supermicro) for production AI workloads
- Plan rack-scale deployments with 6.5-hour deployment windows (not months)
- Architect hybrid cloud strategies with on-prem as the default, cloud as the exception
For CFOs and business leaders:
- Model out cloud token costs vs on-prem CapEx for your expected AI scale (3-year TCO comparison)
- Factor in data sovereignty and compliance risk when choosing cloud vs on-prem
- Expect ROI from on-prem AI to show up in cost savings (cloud avoidance) and productivity gains (faster agents, better performance)
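One way to start the 3-year TCO comparison is a small model in which token volume grows month over month. The sketch below is a starting point under stated assumptions: the `Scenario` fields, the 5% monthly growth rate, and the $25,000 monthly on-prem OpEx are hypothetical placeholders, and it assumes the on-prem system has enough capacity for the whole period, which real growth may not allow.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    start_tokens_per_month: float   # token volume in month 1
    monthly_growth: float           # e.g. 0.05 = 5% month over month (assumed)
    usd_per_1k_tokens: float        # cloud price per 1,000 tokens
    onprem_capex: float             # one-time hardware outlay
    onprem_opex_per_month: float    # power, cooling, staff (assumed)


def tco(s: Scenario, months: int = 36) -> tuple[float, float]:
    """Return (cloud_total, onprem_total) over the given number of months."""
    cloud_total = 0.0
    tokens = s.start_tokens_per_month
    for _ in range(months):
        cloud_total += tokens / 1_000 * s.usd_per_1k_tokens
        tokens *= 1 + s.monthly_growth   # volume compounds each month
    # Assumption: on-prem cost is flat because capacity is never exceeded.
    onprem_total = s.onprem_capex + s.onprem_opex_per_month * months
    return cloud_total, onprem_total


scenario = Scenario(10e9, 0.05, 0.01, 1_200_000, 25_000)
cloud, onprem = tco(scenario)
print(f"Cloud 36-month total:   ${cloud:,.0f}")
print(f"On-prem 36-month total: ${onprem:,.0f}")
```

With growth set to zero the cloud total is the flat $3.6 million from the earlier example. At the assumed 5% monthly growth it rises to roughly $9.6 million, against $2.1 million on-prem, so the growth assumption drives the result more than any other input.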
Dell's bet, set against a projected $3-4 trillion in AI infrastructure spending by 2030, is that enterprise AI infrastructure will follow the same path as enterprise databases: cloud for flexibility, on-prem for production at scale.
The early evidence suggests Dell is right.
About the Author:
Rajesh Beri writes THE DAILY BRIEF, a twice-weekly newsletter on Enterprise AI for technical and business leaders. Follow him on LinkedIn, Twitter/X, and Facebook.
