Modal

by Modal Labs, Inc.

Infrastructure & Cloud · Developer Tools · AI Models & APIs

The production cloud for AI

usage-based · per-second compute (separate GPU, CPU, memory, storage rates) · Added June 23, 2026 · Updated June 23, 2026

© 2026 Rajesh Beri. All rights reserved.

Modal is a Python-native serverless cloud platform that lets developers run AI/ML inference, training, fine-tuning, batch jobs, and sandboxed code on GPUs and CPUs with sub-second cold starts and instant autoscaling. Hardware and container images are defined directly in Python — no YAML, Dockerfiles, or DevOps required.

At a Glance

Category: Infrastructure & Cloud
Pricing: usage-based, per-second compute (separate GPU, CPU, memory, storage rates)
Target Market: AI startups, ML/AI engineering teams, data teams, generative-AI app builders (LLM, image, audio, video), research labs and academics, enterprises running production AI inference
Founded: 2021
Headquarters: New York, NY, USA
Customers: Thousands of customers (no exact public count)

Key Features

  • Serverless GPU access with instant autoscaling

    On-demand access to T4, A10G, L40S, A100, H100, H200, and B200 GPUs; scales from 0 to 1000+ GPUs in seconds with no capacity planning or commitments.

  • Sub-second cold starts

    A proprietary Rust container runtime and image builder boots containers and loads large models in under a second, even on in-demand GPU types.

  • Python-native infrastructure-as-code

    Define container images, hardware, secrets, and scaling in pure Python via decorators, with no YAML, Dockerfiles, or separate DevOps scripts.

  • Broad workload support

    LLM and multimodal inference, fine-tuning (SFT, LoRA, full), multi-node training, batch jobs, scheduled/cron jobs, web endpoints, and Sandboxes for untrusted code and agents.

  • Built-in storage and data primitives

    High-performance distributed Modal Volumes plus the ability to mount S3, GCS, and other cloud storage directly into functions, with integrated logging and observability.
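
The "infrastructure-as-code via decorators" idea can be illustrated with a toy sketch. This is not Modal's SDK: `ToyApp`, `FunctionSpec`, and the parameter names `gpu`, `pip_packages`, and `max_containers` are hypothetical names invented for this example, and nothing here provisions real hardware. It only shows the general pattern, in which a decorator records hardware and image requirements next to the function they apply to, so no separate YAML or Dockerfile is needed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class FunctionSpec:
    """Requirements declared alongside the code they apply to (hypothetical)."""
    name: str
    gpu: Optional[str] = None
    pip_packages: List[str] = field(default_factory=list)
    max_containers: int = 1


class ToyApp:
    """Hypothetical registry standing in for a real platform SDK."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.specs: Dict[str, FunctionSpec] = {}

    def function(
        self,
        *,
        gpu: Optional[str] = None,
        pip_packages: Sequence[str] = (),
        max_containers: int = 1,
    ) -> Callable[[Callable], Callable]:
        """Return a decorator that records the spec and leaves the function unchanged."""
        def decorator(fn: Callable) -> Callable:
            self.specs[fn.__name__] = FunctionSpec(
                name=fn.__name__,
                gpu=gpu,
                pip_packages=list(pip_packages),
                max_containers=max_containers,
            )
            return fn
        return decorator


app = ToyApp("demo")


@app.function(gpu="H100", pip_packages=["torch"], max_containers=10)
def generate(prompt: str) -> str:
    # Placeholder body; a real function would run model inference here.
    return f"echo: {prompt}"


@app.function()
def preprocess(text: str) -> str:
    # CPU-only helper with default requirements.
    return text.strip().lower()


if __name__ == "__main__":
    for spec in app.specs.values():
        print(spec)
```

A real platform would read the recorded specs to build images and schedule containers. The sketch stops at the registry, since that is the part the feature description covers.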

Capabilities

text generation
image generation
video generation
code generation
workflow automation
api access
audio generation
fine tuning
agent orchestration

Use Cases

  • AI music and media generation at scale

    Suno generates AI music and Substack transcribes podcasts at scale on Modal.

  • LLM inference and fine-tuning for production apps

    Ramp deploys open-source LLMs and Lovable runs tens of thousands of concurrent containers on Modal.

  • Sandboxed execution of AI-generated code

    Developers and agents run untrusted, AI-generated code in isolated sandboxes; over a billion sandboxes have been launched, including thousands of concurrent sandboxes for reinforcement learning.

Ideal For

Best For

  • AI/ML inference at scale
  • Fine-tuning and training open-source models
  • Batch processing and embeddings
  • Running AI-generated or untrusted code in sandboxes
  • Cron and scheduled compute jobs

Integrations

SDK available: Python

Market & Ratings

Estimated Customers

Thousands of customers (no exact public count)

Market Analysis

Premium, developer-loved, AI-native serverless GPU/compute cloud. A unicorn competing at the high-velocity end of the AI infrastructure market.

Pros

  • Excellent developer experience (Python-native, no DevOps)
  • Genuinely fast cold starts
  • True scale-to-zero and pay-per-second with no idle cost
  • Covers inference, training, batch, and sandboxes
  • Strong, named enterprise and AI-startup customer base
  • Well-funded and rapidly growing

Cons

  • You build the application layer yourself (infrastructure, not turnkey)
  • Usage-based GPU costs can add up at scale
  • Cloud-only with no on-prem or self-hosted option
  • Thin presence on traditional enterprise review sites
  • Region markups and non-preemptible premiums

Pricing

Free Trial Available

Starter

$0/month + compute

  • $30/month free compute credits
  • 3 workspace seats
  • 100 containers
  • 10 GPU concurrency

Team

$250/month + compute

  • $100/month free credits
  • Unlimited seats
  • 1,000 containers
  • 50 GPU concurrency
  • Custom domains
  • Static IP
  • Deployment rollbacks

Enterprise

Custom

  • Volume discounts
  • Higher GPU concurrency
  • HIPAA
  • Okta SSO
  • Audit logs
  • Private Slack support
  • Embedded ML engineering

Pay only for active compute time, billed per second with no charge for idle resources. Sample rates: B200 $0.001736/sec, H100 $0.001097/sec, A100-80GB $0.000694/sec, T4 $0.000164/sec; Volumes $0.09/GiB/month with 1 TiB free. All new accounts get $30/month in recurring free compute credits; startup grants up to $25K and academic grants up to $10K are available.
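
The per-second rates above are easier to compare once converted to hourly and per-job figures. The sketch below uses only the rates and plan figures quoted in this listing. It makes three assumptions: free credits are subtracted from compute usage before billing, the 1 TiB Volume allowance is subtracted before storage is charged, and CPU and memory (which the listing says are billed separately) are ignored. The function names are illustrative.

```python
# Per-second GPU rates (USD) quoted in the pricing summary.
GPU_RATE_PER_SEC = {
    "B200": 0.001736,
    "H100": 0.001097,
    "A100-80GB": 0.000694,
    "T4": 0.000164,
}

# Monthly plan fee and free compute credits (USD) from the plan list.
PLANS = {
    "Starter": {"fee": 0.0, "credits": 30.0},
    "Team": {"fee": 250.0, "credits": 100.0},
}

VOLUME_RATE_PER_GIB_MONTH = 0.09  # USD per GiB per month
VOLUME_FREE_GIB = 1024            # 1 TiB free


def hourly_rate(gpu: str) -> float:
    """Convert a per-second GPU rate to an hourly rate."""
    return GPU_RATE_PER_SEC[gpu] * 3600


def gpu_cost(gpu: str, seconds: float, gpus: int = 1) -> float:
    """GPU-only cost of a job; CPU and memory charges are not included."""
    return GPU_RATE_PER_SEC[gpu] * seconds * gpus


def volume_cost(gib: float) -> float:
    """Monthly Volume cost, assuming the free allowance is deducted first."""
    return max(0.0, gib - VOLUME_FREE_GIB) * VOLUME_RATE_PER_GIB_MONTH


def monthly_bill(plan: str, compute_usd: float) -> float:
    """Plan fee plus compute usage beyond the free credits (assumed to offset usage)."""
    p = PLANS[plan]
    return p["fee"] + max(0.0, compute_usd - p["credits"])


if __name__ == "__main__":
    for gpu in GPU_RATE_PER_SEC:
        print(f"{gpu}: ${hourly_rate(gpu):.2f}/hour")
    usage = gpu_cost("H100", seconds=20 * 3600)  # 20 GPU-hours on one H100
    for plan in PLANS:
        print(f"{plan}: ${monthly_bill(plan, usage):.2f}/month")
```

At these rates an H100 works out to about $3.95 per hour and a T4 to about $0.59 per hour. Because the Team plan's $250 fee exceeds its $100 in credits, the choice between Starter and Team turns on seats, container limits, and GPU concurrency rather than on credits alone.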
