Replicate

Name: Replicate
Author: Replicate, Inc.

by Replicate, Inc.

Machine Learning Infrastructure · Model Inference API · MLOps

Run AI with an API

Usage-based · Enterprise · Added July 2, 2026 · Updated July 2, 2026

Cloud platform to run, fine-tune, and deploy open-source and custom machine learning models through a simple API, without managing GPU infrastructure.

At a Glance

Category: Machine Learning Infrastructure
Pricing: Usage-based, Enterprise
Target Market: Enterprise, Startups, Developers
Founded: 2019
Headquarters: San Francisco, California, United States

Key Features

✓Run models via API
✓Cog
✓Fine-tuning
✓Deployments & auto-scaling
✓Client SDKs
✓Per-second usage billing

Capabilities

✓api access

✓fine tuning

✓model deployment

✓auto scaling

✓sdk available

✓custom model hosting

✓usage based billing

Use Cases

•Add AI features to applications
•Deploy custom and fine-tuned models
•Prototype and experiment with ML

Ideal For

Best For

✓Developers integrating AI models into apps via API
✓Running open-source models without managing GPU infrastructure
✓Fine-tuning and deploying custom or proprietary models to production

Not Ideal For

✗Teams that require fully on-premise or air-gapped model hosting

Pricing

Pay-as-you-go (usage-based)

Per-second hardware billing, e.g. CPU $0.000100/sec (~$0.36/hr), Nvidia T4 $0.000225/sec (~$0.81/hr), Nvidia A100 80GB $0.001400/sec (~$5.04/hr), Nvidia H100 $0.001525/sec (~$5.49/hr); some models billed per token/per image (e.g. FLUX 1.1 Pro $0.04/image)

✓Billed by processing time on public models
✓Per-second GPU/CPU rates that scale with multi-GPU configs (2x/4x/8x)
✓Private/custom models billed for setup, idle, and active time
✓Official Python and JavaScript SDKs
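
As a rough illustration of per-second billing, the sketch below estimates the cost of a run from the example rates quoted in this listing. The function names and the hardware labels used as dictionary keys are hypothetical and not part of any Replicate SDK. It assumes a multi-GPU configuration is billed at the single-GPU rate multiplied by the GPU count, and the rates themselves may be out of date, so check the pricing page before relying on them.

```python
# Example per-second rates in USD, copied from this listing.
# The keys are illustrative labels, not official hardware identifiers.
RATES_PER_SECOND = {
    "cpu": 0.000100,
    "nvidia-t4": 0.000225,
    "nvidia-a100-80gb": 0.001400,
    "nvidia-h100": 0.001525,
}


def hourly_rate(hardware: str) -> float:
    """Convert a per-second rate into the approximate hourly rate."""
    return RATES_PER_SECOND[hardware] * 3600


def estimate_run_cost(hardware: str, seconds: float, gpu_count: int = 1) -> float:
    """Estimate the cost of one run billed by processing time.

    Assumption: a 2x/4x/8x configuration costs gpu_count times the
    single-GPU rate.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    if gpu_count not in (1, 2, 4, 8):
        raise ValueError("gpu_count must be 1, 2, 4, or 8")
    return RATES_PER_SECOND[hardware] * gpu_count * seconds
```

Under these assumptions, a 12-second run on a single T4 works out to 12 × $0.000225, or about $0.0027.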

Enterprise

Custom

✓Dedicated account manager
✓Priority support
✓Higher GPU limits
✓Performance SLAs
✓Volume discounts

Pricing is usage-based. Most public models are billed by processing time at per-second hardware rates that vary by GPU/CPU tier (e.g. CPU ~$0.36/hr, Nvidia T4 ~$0.81/hr, L40S ~$3.51/hr, A100 80GB ~$5.04/hr, H100 ~$5.49/hr), with multi-GPU options scaling proportionally. Certain models (notably large language models and some image models) are billed per input/output token or per output image instead of by time. Private/custom models run on dedicated hardware and are billed for setup, idle, and active processing time. Enterprise adds a dedicated account manager, priority support, higher GPU limits, performance SLAs, and volume discounts. Exact per-model rates are listed on each model page and on the pricing page.
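
The difference between the billing modes described above can be sketched in a few lines. This is an illustration only: the class and function names are hypothetical, and it assumes that setup, idle, and active seconds on a private model are all charged at the same per-second hardware rate, which the listing does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class PrivateModelUsage:
    """Seconds of dedicated hardware time, split by billing phase."""
    setup_seconds: float
    idle_seconds: float
    active_seconds: float

    def billable_seconds(self) -> float:
        # Private/custom models are billed for all three phases.
        return self.setup_seconds + self.idle_seconds + self.active_seconds


def private_model_cost(usage: PrivateModelUsage, rate_per_second: float) -> float:
    """Cost of a private model. Assumes one flat rate across all phases."""
    return usage.billable_seconds() * rate_per_second


def per_image_cost(images: int, price_per_image: float) -> float:
    """Cost of a model that is billed per output image instead of by time."""
    return images * price_per_image


# L40S at roughly $3.51/hr, expressed per second.
L40S_RATE = 3.51 / 3600
```

For instance, 60 seconds of setup, 600 seconds idle, and 240 seconds of active processing add up to 900 billable seconds, so idle time can dominate the bill for a lightly used private model.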

Key Features

✓Run models via API
Access thousands of community-contributed, production-ready models (image, video, speech, music, and LLMs) and call them with as little as one line of code.
✓Cog
Open-source tool that packages machine learning models into containers and automatically generates an API server for them, leaving cloud infrastructure to the platform.
✓Fine-tuning
Customize existing models on your own data to create specialized versions for particular tasks.
✓Deployments & auto-scaling
Deploy your own models on dedicated hardware that scales up and down automatically with traffic. Per the pricing details, dedicated hardware is billed for setup, idle, and active time.
✓Client SDKs
Official Python and JavaScript/Node.js client libraries for interacting with the platform programmatically.
✓Per-second usage billing
Transparent per-second GPU and CPU pricing (plus per-token or per-image rates for certain models), so you are billed only for what you use.
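
To make "run models via API" more concrete, here is a minimal sketch that builds an HTTP request to start a prediction using only the Python standard library. The endpoint path, the `{"input": ...}` payload shape, and the bearer-token header reflect the public HTTP API as best understood here and should be checked against the official API reference. The helper names and the model identifier `example-owner/example-model` are hypothetical. In practice the official Python client wraps this kind of call, for example through `replicate.run(...)`.

```python
import json
import urllib.request

API_BASE = "https://api.replicate.com/v1"


def build_prediction_request(
    model: str, model_input: dict, token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a prediction.

    `model` is an "owner/name" string. The URL layout is an assumption
    to verify against the API reference.
    """
    owner, name = model.split("/", 1)
    body = json.dumps({"input": model_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/models/{owner}/{name}/predictions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def start_prediction(model: str, model_input: dict, token: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_prediction_request(model, model_input, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the sketch easy to inspect without network access or a real API token.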

Use Cases

•Add AI features to applications
Integrate image generation, video, speech, music, and language models into products through a single API without building ML infrastructure.
•Deploy custom and fine-tuned models
Package models with Cog and deploy them to production with managed, auto-scaling GPU infrastructure.
•Prototype and experiment with ML
Quickly test and iterate on open-source models in the cloud without provisioning or managing GPUs.

Integrations

✓API Support

✓SDK Available

SDK: Python, JavaScript/Node.js

Deployment

✗Self-Hosted

✓Cloud-Hosted

✗On-Premise

Cloud-hosted API
