Mistral Forge: Why 'Build-Your-Own AI' Could Beat the API Model

Mistral's new Forge platform lets enterprises train custom AI models from scratch on proprietary data. At $1B ARR, they're betting that companies owning their AI will beat those renting it.

By Rajesh Beri·March 18, 2026·10 min read

THE DAILY BRIEF

Enterprise AI · AI Models · AI Infrastructure · Deployment · Data Centers · Security


Mistral AI launched Forge this week — a platform that lets enterprises train AI models from scratch using their own data. Announced at NVIDIA GTC, it's a direct challenge to OpenAI and Anthropic's "rent our model" approach. The French AI startup, on track to hit $1 billion in annual recurring revenue this year, is betting that the companies with the hardest AI problems are the ones least served by generic models trained on the public internet.

⚡ When Custom Training Makes Sense

  • Your competitive advantage lives in proprietary data → Generic models never saw your internal documents, workflows, or domain knowledge
  • You need data sovereignty → Train on-premises with zero data exposure to third-party clouds
  • You work with specialized domains → Ancient manuscripts, proprietary code, regulated financial models
  • You're building mission-critical agents → Custom models align to internal policies through reinforcement learning

Why Fine-Tuning APIs Plateau

Most enterprise AI adoption follows a pattern: pick GPT-4, Claude, or Gemini, then fine-tune through a cloud API for specific tasks. This works for proofs-of-concept and many production use cases. But Elisa Salamanca, Mistral's head of product, argues this approach fundamentally plateaus when you try to solve your hardest problems.

"We had a fine-tuning API relying on supervised fine-tuning. I think it was kind of what was the standard a couple of months ago," Salamanca told VentureBeat. "It gets you to a proof-of-concept state. Whenever you actually want to have the performance that you're targeting, you need to go beyond. AI scientists today are not using fine-tuning APIs. They're using much more advanced tools, and that's what Forge is bringing to the table."

Forge packages the training methodology Mistral's own AI scientists use internally — data mixing strategies, synthetic data pipelines, distributed computing optimizations, and battle-tested training recipes. Salamanca drew a sharp line: "There's no platform out there that provides you real-world training recipes that work. Other open-source repositories or other tools can give you generic configurations or community tutorials, but they don't give you the recipe that's been validated — that we've been doing for all of our flagship models today."

What Forge does differently: It supports full-cycle training — pre-training on large internal datasets, post-training through supervised fine-tuning, DPO, ODPO, and critically, reinforcement learning pipelines that align models with internal policies and operational objectives over time. This goes beyond what fine-tuning APIs offer.
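To make the post-training stages concrete: DPO (Direct Preference Optimization) trains a policy model on preference pairs against a frozen reference model. The sketch below shows the per-pair DPO objective in plain Python — a generic illustration of the technique, not Forge's implementation or API.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy being trained (pi_*) and a frozen reference model
    (ref_*). Minimizing this pushes the policy to prefer the chosen
    response more strongly than the reference does; beta controls how
    far the policy may drift from the reference.
    """
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log(sigmoid)

# The loss shrinks as the policy widens its preference margin
# relative to the reference model:
loose = dpo_loss(pi_chosen=-10.0, pi_rejected=-10.0, ref_chosen=-10.0, ref_rejected=-10.0)
tight = dpo_loss(pi_chosen=-8.0, pi_rejected=-12.0, ref_chosen=-10.0, ref_rejected=-10.0)
```

In a real pipeline these log-probabilities come from forward passes over tokenized responses; the reinforcement learning stages Forge adds on top replace the static preference pairs with rewards computed during training.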

Early Use Cases: From Ancient Texts to Hedge Fund Code

In conversations with TechCrunch, Mistral shared real-world examples that show where off-the-shelf models break:

Ancient manuscripts: A public institution had ancient texts with missing sections from damage. "The models that were available were not able to do this because they've never seen the data," Salamanca explained. "Digitization was not very good. There were some unique patterns and characters, and so we actually created a model for them to fill in the spans."
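The span-filling task Salamanca describes resembles fill-in-the-middle training, where a model sees the text surrounding a gap and learns to emit the missing middle. Below is a hypothetical data-preparation sketch; the sentinel token names are placeholders, not any particular tokenizer's vocabulary.

```python
def make_fim_example(text, span_start, span_end,
                     prefix_tok="<PRE>", suffix_tok="<SUF>", middle_tok="<MID>"):
    """Turn a document with a known damaged or held-out span into a
    fill-in-the-middle training pair: the prompt contains the prefix and
    suffix around the gap, and the target is the missing span itself."""
    prefix = text[:span_start]
    middle = text[span_start:span_end]
    suffix = text[span_end:]
    prompt = f"{prefix_tok}{prefix}{suffix_tok}{suffix}{middle_tok}"
    return prompt, middle

# Hold out characters 7..16 as the "damaged" span:
prompt, target = make_fim_example("In the beginning was the word.", 7, 16)
```

For a manuscript corpus, undamaged passages would be corrupted this way at scale to produce training pairs, so the model learns the script's patterns before being asked to fill genuinely missing sections.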

Telecom code migration: Ericsson partnered with Mistral to customize its Codestral model for legacy-to-modern code translation. Ericsson built up five years of proprietary knowledge around an internal calling language — a codebase so specialized no off-the-shelf model has ever seen it. "The concrete impact is like turning a year-long manual migration process, where each engineer needs six months of onboarding, to something that's really more scalable and faster."

Hedge fund quant languages: Financial firms worked with Mistral to build models for proprietary quantitative languages — the kind of deeply guarded intellectual property these firms keep on-premises and never expose to cloud-hosted AI services. Using Forge's reinforcement learning capabilities, Mistral helped one fund develop custom benchmarks and train models to outperform on them, producing "a unique model that was able to give them the competitive edge that was needed."
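Training a model "to outperform" on a custom benchmark via reinforcement learning requires turning each benchmark case into a scalar reward. The sketch below is the simplest possible stand-in — exact match with token-overlap partial credit — whereas a real quant-language benchmark would score outputs through an execution harness; the example cases are invented.

```python
def benchmark_reward(model_output: str, expected: str) -> float:
    """Scalar reward for an RL loop: 1.0 for an exact match against the
    benchmark's reference answer, partial credit for token overlap."""
    if model_output.strip() == expected.strip():
        return 1.0
    out_tokens = set(model_output.split())
    ref_tokens = set(expected.split())
    return len(out_tokens & ref_tokens) / len(ref_tokens) if ref_tokens else 0.0

# Hypothetical cases in an invented internal quant language:
cases = [("px = close / ref", "px = close / ref"),   # exact match
         ("px = close * ref", "px = close / ref")]   # wrong operator
scores = [benchmark_reward(out, exp) for out, exp in cases]
```

The RL loop then samples model outputs for each benchmark prompt, scores them with a function like this, and updates the policy toward higher-reward completions.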

Why this matters for finance leaders and IT leaders: If your competitive moat is built on decades of internal IP — proprietary workflows, domain-specific data, or custom code — generic models trained on public data leave value on the table. The question isn't "can GPT-4 work?" but "does it capture the nuances that differentiate us?"

The Business Model: Licenses, Data Pipelines, and Embedded Scientists

Forge's revenue model reflects the complexity of enterprise training. For customers running on their own GPU clusters — a common requirement in regulated or IP-sensitive industries — Mistral doesn't charge for compute. Instead, the company charges:

  1. License fee for the Forge platform
  2. Optional data pipeline services (acquisition, curation, synthetic data generation)
  3. Forward-deployed scientists — embedded AI researchers who work alongside customer teams

"No competitor out there today is kind of selling this embedded scientist as part of their training platform offering," Salamanca said. This has echoes of Palantir's early playbook, where forward-deployed engineers bridged the gap between powerful software and messy enterprise reality. Most organizations lack the internal expertise to design effective training recipes, curate data at scale, or navigate distributed GPU training.

Training can happen on Mistral's own clusters, on Mistral Compute (the company's dedicated infrastructure offering), or entirely on-premises. "We have all these different cases, and we support everything," Salamanca said.

Revenue at scale: Mistral CEO Arthur Mensch says the company is on track to surpass $1 billion in annual recurring revenue this year. Early Forge customers include ASML (the Dutch semiconductor-equipment maker that led Mistral's Series C at a €11.7B valuation), Ericsson, the European Space Agency, Singapore's DSO National Laboratories, and Italy's Reply consulting group.

Data Sovereignty: The Sharpest Selling Point

When customers train on their own infrastructure, Mistral never sees the data. "It's on their clusters, it's with their data — we don't see anything of it, and so it's completely under their control," Salamanca said. "I think this is something that sets us apart from the competition, where you actually need to upload your data, and you have a black box effect."

This matters enormously in defense, intelligence, financial services, and healthcare — sectors where legal and reputational risks of exposing proprietary data to third-party clouds can be deal-breakers. Mistral's customer roster suggests they're deliberately targeting the most data-sensitive corners of the enterprise market.

For security leaders and compliance leaders: If your data residency requirements prevent uploading training data to AWS, Azure, or GCP, cloud-based fine-tuning APIs are non-starters. Forge allows you to train entirely on-premises while still accessing Mistral's training recipes and tooling.

Why Custom Models Still Matter in the Agent Era

The AI industry in 2026 has been consumed by agents — autonomous systems that use tools, navigate workflows, and take actions. If the future belongs to agents, why does the underlying model matter? Can't companies plug into the best frontier model through an API and focus on orchestration?

Salamanca pushed back: "The customers that we've been working on — some of these specific problems are things that no MCP server would ever solve. You actually need that intelligence. You actually need to create that model that will help you solve your most critical business problem."

She also argued that model customization is essential even in purely agentic architectures: "There are some agentic behaviors that you need to bring to the model. It can be about reasoning patterns, specific types of documentation, making sure that you have the right reasoning traces. Even in these cases where people are going completely agentic, you still need model customization — like reinforcement learning techniques — to actually get the right level of performance."

Mistral's announcement makes this explicit: custom models make enterprise agents more reliable by providing deeper understanding of internal environments — more precise tool selection, more dependable multi-step workflows, and decisions that reflect internal policies rather than generic assumptions.

For VP Engineering and AI/ML leads: If your agent platform calls a generic LLM that doesn't understand your internal tools, workflows, or business logic, you'll spend months building brittle prompt engineering workarounds. Training a model on your internal data means the agent starts with institutional knowledge baked in.
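"Baking in" institutional knowledge concretely means converting logged internal workflows into supervised training records that teach the model which in-house tool to call and how. The message schema and tool name below are illustrative, not any vendor's format.

```python
import json

def trace_to_sft_record(user_request, tool_name, tool_args, final_answer):
    """Convert one logged internal workflow into a supervised fine-tuning
    record: the user's request, the correct internal tool call, and the
    final answer the agent should produce."""
    return {
        "messages": [
            {"role": "user", "content": user_request},
            {"role": "assistant", "tool_call": {"name": tool_name,
                                                "arguments": tool_args}},
            {"role": "assistant", "content": final_answer},
        ]
    }

record = trace_to_sft_record(
    "Refund order 1142 per our returns policy",
    "refunds.create",                        # hypothetical internal tool
    {"order_id": 1142, "policy": "RET-30"},  # hypothetical policy code
    "Refund issued under the 30-day policy.",
)
line = json.dumps(record)  # one JSONL line of a fine-tuning dataset
```

Thousands of such records, mined from real traces, replace the brittle prompt-engineering layer that would otherwise have to describe every internal tool on every request.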

Mistral vs the Hyperscalers

Forge enters a crowded market. Amazon Bedrock, Microsoft Azure AI Foundry, and Google Cloud Vertex AI all offer model training and customization. But Salamanca argues these offerings are fundamentally limited:

  1. Cloud-only: "In one set of cases, it's very easy to answer — they want to run this on their premises, and so all these tools that are available on the cloud are just not available for them."
  2. Simplified interfaces: Hyperscalers' training tools offer API interfaces that don't provide the depth of control serious model training requires.
  3. Dependency risk: One customer described how a new closed-source model release — more verbose than its predecessor — crashed their production pipelines. "When you're relying on closed-source models, you are also super dependent on the updates of the model that have side effects."
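The dependency-risk anecdote — a chattier model update crashing production pipelines — argues for validating model output before it reaches downstream code, regardless of whose model you run. A minimal, generic guard (field names and limits are illustrative):

```python
import json

def validate_model_output(raw: str, required_keys: set, max_chars: int = 2000):
    """Guard a production pipeline against model-update side effects:
    reject responses that blow a length budget, aren't parseable JSON,
    or drop required fields — failing loudly at the boundary instead of
    letting a more verbose model version break downstream consumers."""
    if len(raw) > max_chars:
        raise ValueError(f"response too verbose: {len(raw)} > {max_chars} chars")
    data = json.loads(raw)  # raises if the model wrapped chatter around the payload
    missing = required_keys - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data

ok = validate_model_output('{"ticket": 77, "action": "close"}', {"ticket", "action"})
```

Checks like this don't remove the dependency, but they convert a silent behavioral drift into an explicit, alertable failure.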

Mistral has released models under permissive Apache 2.0 licenses since its founding. Salamanca confirmed that while Forge currently works with Mistral's own models, support for other open-source architectures is planned: "We're deeply rooted into open source. This has been part of our DNA since the beginning, and we have been building Forge to be an open platform — it's just a question of a matter of time that we'll be opening this to other open-source models."

The Verdict: Who Should Consider Forge?

YES, if you:

  • Have proprietary data that's core to your competitive advantage
  • Need data sovereignty for regulatory, security, or IP protection reasons
  • Work in specialized domains (telecom, finance, healthcare, defense)
  • Want to build mission-critical agents aligned to internal policies
  • Already operate GPU clusters or plan to for AI workloads

NO, if you:

  • Are building generic enterprise apps (customer support, content generation, summarization)
  • Don't have unique domain data or workflows
  • Prefer cloud-hosted services with minimal ops burden
  • Are satisfied with fine-tuning APIs for your use cases

For finance leaders: The ROI question is whether the delta between a generic model and a custom-trained one translates to measurable business value. If your use case is generic, renting GPT-4 through an API is cheaper. If your competitive moat is built on proprietary knowledge, owning the model could be worth millions.

For enterprise leaders: The architectural question is whether you're willing to own the training, ops, and continuous improvement burden. Forge provides tooling and embedded scientists, but you're still responsible for data quality, infrastructure, and model lifecycle management.



Want more enterprise AI insights like this? Subscribe to THE DAILY BRIEF — Tuesday + Thursday mornings, delivered to your inbox.



THE DAILY BRIEF

Enterprise AI insights for technology and business leaders, twice weekly.

thedailybrief.com

Subscribe at thedailybrief.com/subscribe for twice-weekly AI insights delivered to your inbox.

LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.

Mistral Forge: Why 'Build-Your-Own AI' Could Beat the API Model

Photo by [Tara Winstead](https://www.pexels.com/@tara-winstead) on Pexels

Mistral AI launched Forge this week — a platform that lets enterprises train AI models from scratch using their own data. Announced at NVIDIA GTC, it's a direct challenge to OpenAI and Anthropic's "rent our model" approach. The French AI startup, on track to hit $1 billion in annual recurring revenue this year, is betting that the companies with the hardest AI problems are the ones least served by generic models trained on the public internet.

⚡ When Custom Training Makes Sense

  • Your competitive advantage lives in proprietary data → Generic models never saw your internal documents, workflows, or domain knowledge
  • You need data sovereignty → Train on-premises with zero data exposure to third-party clouds
  • You work with speciali[zed](/tools/zed) domains → Ancient manuscripts, proprietary code, regulated financial models
  • You're building mission-critical agents → Custom models align to internal policies through reinforcement learning

Why Fine-Tuning APIs Plateau

Most enterprise AI adoption follows a pattern: pick GPT-4, Claude, or Gemini, then fine-tune through a cloud API for specific tasks. This works for proofs-of-concept and many production use cases. But Elisa Salamanca, Mistral's head of product, argues this approach fundamentally plateaus when you try to solve your hardest problems.

"We had a fine-tuning API relying on supervised fine-tuning. I think it was kind of what was the standard a couple of months ago," Salamanca told VentureBeat. "It gets you to a proof-of-concept state. Whenever you actually want to have the performance that you're targeting, you need to go beyond. AI scientists today are not using fine-tuning APIs. They're using much more advanced tools, and that's what Forge is bringing to the table."

Forge packages the training methodology Mistral's own AI scientists use internally — data mixing strategies, synthetic data pipelines, distributed computing optimizations, and battle-tested training recipes. Salamanca drew a sharp line: "There's no platform out there that provides you real-world training recipes that work. Other open-source repositories or other tools can give you generic configurations or community tutorials, but they don't give you the recipe that's been validated — that we've been doing for all of our flagship models today."

What Forge does differently: It supports full-cycle training — pre-training on large internal datasets, post-training through supervised fine-tuning, DPO, ODPO, and critically, reinforcement learning pipelines that align models with internal policies and operational objectives over time. This goes beyond what fine-tuning APIs offer.

Early Use Cases: From Ancient Texts to Hedge Fund Code

In conversations with TechCrunch, Mistral shared real-world examples that show where off-the-shelf models break:

Ancient manuscripts: A public institution had ancient texts with missing sections from damage. "The models that were available were not able to do this because they've never seen the data," Salamanca explained. "Digitization was not very good. There were some unique patterns and characters, and so we actually created a model for them to fill in the spans."

Telecom code migration: Ericsson partnered with Mistral to customize its Codestral model for legacy-to-modern code translation. Ericsson built up five years of proprietary knowledge around an internal calling language — a codebase so specialized no off-the-shelf model has ever seen it. "The concrete impact is like turning a year-long manual migration process, where each engineer needs six months of onboarding, to something that's really more scalable and faster."

Hedge fund quant languages: Financial firms worked with Mistral to build models for proprietary quantitative languages — the kind of deeply guarded intellectual property these firms keep on-premises and never expose to cloud-hosted AI services. Using Forge's reinforcement learning capabilities, Mistral helped one fund develop custom benchmarks and train models to outperform on them, producing "a unique model that was able to give them the competitive edge that was needed."

Why this matters for finance leaders and IT leaders: If your competitive moat is built on decades of internal IP — proprietary workflows, domain-specific data, or custom code — generic models trained on public data leave value on the table. The question isn't "can GPT-4 work?" but "does it capture the nuances that differentiate us?"

The Business Model: Licenses, Data Pipelines, and Embedded Scientists

Forge's revenue model reflects the complexity of enterprise training. For customers running on their own GPU clusters — a common requirement in regulated or IP-sensitive industries — Mistral doesn't charge for compute. Instead, the company charges:

  1. License fee for the Forge platform
  2. Optional data pipeline services (acquisition, curation, synthetic data generation)
  3. Forward-deployed scientists — embedded AI researchers who work alongside customer teams

"No competitor out there today is kind of selling this embedded scientist as part of their training platform offering," Salamanca said. This has echoes of Palantir's early playbook, where forward-deployed engineers bridged the gap between powerful software and messy enterprise reality. Most organizations lack the internal expertise to design effective training recipes, curate data at scale, or navigate distributed GPU training.

Training can happen on Mistral's own clusters, on Mistral Compute (the company's dedicated infrastructure offering), or entirely on-premises. "We have all these different cases, and we support everything," Salamanca said.

Revenue at scale: Mistral CEO Arthur Mensch says the company is on track to surpass $1 billion in annual recurring revenue this year. Early Forge customers include ASML (the Dutch chipmaker that led Mistral's Series C at a €11.7B valuation), Ericsson, the European Space Agency, Singapore's DSO National Laboratories, and Italy's Reply consulting group.

Data Sovereignty: The Sharpest Selling Point

When customers train on their own infrastructure, Mistral never sees the data. "It's on their clusters, it's with their data — we don't see anything of it, and so it's completely under their control," Salamanca said. "I think this is something that sets us apart from the competition, where you actually need to upload your data, and you have a black box effect."

This matters enormously in defense, intelligence, financial services, and healthcare — sectors where legal and reputational risks of exposing proprietary data to third-party clouds can be deal-breakers. Mistral's customer roster suggests they're deliberately targeting the most data-sensitive corners of the enterprise market.

For security leaders and compliance leaders: If your data residency requirements prevent uploading training data to AWS, Azure, or GCP, cloud-based fine-tuning APIs are non-starters. Forge allows you to train entirely on-premises while still accessing Mistral's training recipes and tooling.

Why Custom Models Still Matter in the Agent Era

The AI industry in 2026 has been consumed by agents — autonomous systems that use tools, navigate workflows, and take actions. If the future belongs to agents, why does the underlying model matter? Can't companies plug into the best frontier model through an API and focus on orchestration?

Salamanca pushed back: "The customers that we've been working on — some of these specific problems are things that no MCP server would ever solve. You actually need that intelligence. You actually need to create that model that will help you solve your most critical business problem."

She also argued that model customization is essential even in purely agentic architectures: "There are some agentic behaviors that you need to bring to the model. It can be about reasoning patterns, specific types of documentation, making sure that you have the right reasoning traces. Even in these cases where people are going completely agentic, you still need model customization — like reinforcement learning techniques — to actually get the right level of performance."

Mistral's announcement makes this explicit: custom models make enterprise agents more reliable by providing deeper understanding of internal environments — more precise tool selection, more dependable multi-step workflows, and decisions that reflect internal policies rather than generic assumptions.

For VP Engineering and AI/ML leads: If your agent platform calls a generic LLM that doesn't understand your internal tools, workflows, or business logic, you'll spend months building brittle prompt engineering workarounds. Training a model on your internal data means the agent starts with institutional knowledge baked in.

Mistral vs the Hyperscalers

Forge enters a crowded market. Amazon Bedrock, Microsoft Azure AI Foundry, and Google Cloud Vertex AI all offer model training and customization. But Salamanca argues these offerings are fundamentally limited:

  1. Cloud-only: "In one set of cases, it's very easy to answer — they want to run this on their premises, and so all these tools that are available on the cloud are just not available for them."
  2. Simplified interfaces: Hyperscalers' training tools offer API interfaces that don't provide the depth of control serious model training requires.
  3. Dependency risk: One customer described how a new closed-source model release — more verbose than its predecessor — crashed their production pipelines. "When you're relying on closed-source models, you are also super dependent on the updates of the model that have side effects."

Mistral has released models under permissive Apache 2.0 licenses since its founding. Salamanca confirmed that while Forge currently works with Mistral's own models, support for other open-source architectures is planned: "We're deeply rooted into open source. This has been part of our DNA since the beginning, and we have been building Forge to be an open platform — it's just a question of a matter of time that we'll be opening this to other open-source models."

The Verdict: Who Should Consider Forge?

YES, if you:

  • Have proprietary data that's core to your competitive advantage
  • Need data sovereignty for regulatory, security, or IP protection reasons
  • Work in specialized domains (telecom, finance, healthcare, defense)
  • Want to build mission-critical agents aligned to internal policies
  • Already operate GPU clusters or plan to for AI workloads

NO, if you:

  • Are building generic enterprise apps (customer support, content generation, summarization)
  • Don't have unique domain data or workflows
  • Prefer cloud-hosted services with minimal ops burden
  • Are satisfied with fine-tuning APIs for your use cases

For finance leaders: The ROI question is whether the delta between a generic model and a custom-trained one translates to measurable business value. If your use case is generic, renting GPT-4 through an API is cheaper. If your competitive moat is built on proprietary knowledge, owning the model could be worth millions.

For enterprise leaders: The architectural question is whether you're willing to own the training, ops, and continuous improvement burden. Forge provides tooling and embedded scientists, but you're still responsible for data quality, infrastructure, and model lifecycle management.

Continue Reading


Want more enterprise AI insights like this? Subscribe to THE DAILY BRIEF — Tuesday + Thursday mornings, delivered to your inbox.


Continue Reading

Related articles:

Share:

THE DAILY BRIEF

Enterprise AIAI ModelsAI InfrastructureDeploymentData CentersSecurity

Mistral Forge: Why 'Build-Your-Own AI' Could Beat the API Model

Mistral's new Forge platform lets enterprises train custom AI models from scratch on proprietary data. At $1B ARR, they're betting that companies owning their AI will beat those renting it.

By Rajesh Beri·March 18, 2026·10 min read

Mistral AI launched Forge this week — a platform that lets enterprises train AI models from scratch using their own data. Announced at NVIDIA GTC, it's a direct challenge to OpenAI and Anthropic's "rent our model" approach. The French AI startup, on track to hit $1 billion in annual recurring revenue this year, is betting that the companies with the hardest AI problems are the ones least served by generic models trained on the public internet.

⚡ When Custom Training Makes Sense

  • Your competitive advantage lives in proprietary data → Generic models never saw your internal documents, workflows, or domain knowledge
  • You need data sovereignty → Train on-premises with zero data exposure to third-party clouds
  • You work with speciali[zed](/tools/zed) domains → Ancient manuscripts, proprietary code, regulated financial models
  • You're building mission-critical agents → Custom models align to internal policies through reinforcement learning

Why Fine-Tuning APIs Plateau

Most enterprise AI adoption follows a pattern: pick GPT-4, Claude, or Gemini, then fine-tune through a cloud API for specific tasks. This works for proofs-of-concept and many production use cases. But Elisa Salamanca, Mistral's head of product, argues this approach fundamentally plateaus when you try to solve your hardest problems.

"We had a fine-tuning API relying on supervised fine-tuning. I think it was kind of what was the standard a couple of months ago," Salamanca told VentureBeat. "It gets you to a proof-of-concept state. Whenever you actually want to have the performance that you're targeting, you need to go beyond. AI scientists today are not using fine-tuning APIs. They're using much more advanced tools, and that's what Forge is bringing to the table."

Forge packages the training methodology Mistral's own AI scientists use internally — data mixing strategies, synthetic data pipelines, distributed computing optimizations, and battle-tested training recipes. Salamanca drew a sharp line: "There's no platform out there that provides you real-world training recipes that work. Other open-source repositories or other tools can give you generic configurations or community tutorials, but they don't give you the recipe that's been validated — that we've been doing for all of our flagship models today."

What Forge does differently: It supports full-cycle training — pre-training on large internal datasets, post-training through supervised fine-tuning, DPO, ODPO, and critically, reinforcement learning pipelines that align models with internal policies and operational objectives over time. This goes beyond what fine-tuning APIs offer.

Early Use Cases: From Ancient Texts to Hedge Fund Code

In conversations with TechCrunch, Mistral shared real-world examples that show where off-the-shelf models break:

Ancient manuscripts: A public institution had ancient texts with missing sections from damage. "The models that were available were not able to do this because they've never seen the data," Salamanca explained. "Digitization was not very good. There were some unique patterns and characters, and so we actually created a model for them to fill in the spans."

Telecom code migration: Ericsson partnered with Mistral to customize its Codestral model for legacy-to-modern code translation. Ericsson built up five years of proprietary knowledge around an internal calling language — a codebase so specialized no off-the-shelf model has ever seen it. "The concrete impact is like turning a year-long manual migration process, where each engineer needs six months of onboarding, to something that's really more scalable and faster."

Hedge fund quant languages: Financial firms worked with Mistral to build models for proprietary quantitative languages — the kind of deeply guarded intellectual property these firms keep on-premises and never expose to cloud-hosted AI services. Using Forge's reinforcement learning capabilities, Mistral helped one fund develop custom benchmarks and train models to outperform on them, producing "a unique model that was able to give them the competitive edge that was needed."

Why this matters for finance leaders and IT leaders: If your competitive moat is built on decades of internal IP — proprietary workflows, domain-specific data, or custom code — generic models trained on public data leave value on the table. The question isn't "can GPT-4 work?" but "does it capture the nuances that differentiate us?"

The Business Model: Licenses, Data Pipelines, and Embedded Scientists

Forge's revenue model reflects the complexity of enterprise training. For customers running on their own GPU clusters — a common requirement in regulated or IP-sensitive industries — Mistral doesn't charge for compute. Instead, the company charges:

  1. License fee for the Forge platform
  2. Optional data pipeline services (acquisition, curation, synthetic data generation)
  3. Forward-deployed scientists — embedded AI researchers who work alongside customer teams

"No competitor out there today is kind of selling this embedded scientist as part of their training platform offering," Salamanca said. This has echoes of Palantir's early playbook, where forward-deployed engineers bridged the gap between powerful software and messy enterprise reality. Most organizations lack the internal expertise to design effective training recipes, curate data at scale, or navigate distributed GPU training.

Training can happen on Mistral's own clusters, on Mistral Compute (the company's dedicated infrastructure offering), or entirely on-premises. "We have all these different cases, and we support everything," Salamanca said.

Revenue at scale: Mistral CEO Arthur Mensch says the company is on track to surpass $1 billion in annual recurring revenue this year. Early Forge customers include ASML (the Dutch chipmaker that led Mistral's Series C at a €11.7B valuation), Ericsson, the European Space Agency, Singapore's DSO National Laboratories, and Italy's Reply consulting group.

Data Sovereignty: The Sharpest Selling Point

When customers train on their own infrastructure, Mistral never sees the data. "It's on their clusters, it's with their data — we don't see anything of it, and so it's completely under their control," Salamanca said. "I think this is something that sets us apart from the competition, where you actually need to upload your data, and you have a black box effect."

This matters enormously in defense, intelligence, financial services, and healthcare — sectors where legal and reputational risks of exposing proprietary data to third-party clouds can be deal-breakers. Mistral's customer roster suggests they're deliberately targeting the most data-sensitive corners of the enterprise market.

For security leaders and compliance leaders: If your data residency requirements prevent uploading training data to AWS, Azure, or GCP, cloud-based fine-tuning APIs are non-starters. Forge allows you to train entirely on-premises while still accessing Mistral's training recipes and tooling.

Why Custom Models Still Matter in the Agent Era

The AI industry in 2026 has been consumed by agents — autonomous systems that use tools, navigate workflows, and take actions. If the future belongs to agents, why does the underlying model matter? Can't companies plug into the best frontier model through an API and focus on orchestration?

Salamanca pushed back: "The customers that we've been working on — some of these specific problems are things that no MCP server would ever solve. You actually need that intelligence. You actually need to create that model that will help you solve your most critical business problem."

She also argued that model customization is essential even in purely agentic architectures: "There are some agentic behaviors that you need to bring to the model. It can be about reasoning patterns, specific types of documentation, making sure that you have the right reasoning traces. Even in these cases where people are going completely agentic, you still need model customization — like reinforcement learning techniques — to actually get the right level of performance."
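Mistral hasn't published its reward designs, but the general shape of reinforcement learning on agent traces is straightforward to sketch. A toy, purely illustrative reward function (tool names and weights are invented) that scores a trajectory on correct tool choice, internal-policy compliance, and brevity:

```python
# Hypothetical reward shaping for RL on agent trajectories. Nothing here
# reflects Mistral's actual training recipes; it illustrates the idea of
# aligning agentic behavior (tool selection, policy compliance) via reward.

def trajectory_reward(trace: list[dict],
                      correct_tool: str,
                      forbidden_tools: set[str],
                      step_penalty: float = 0.05) -> float:
    reward = 0.0
    for step in trace:
        tool = step["tool"]
        if tool in forbidden_tools:   # violates internal policy
            return -1.0               # hard failure, no partial credit
        if tool == correct_tool:      # picked the right internal tool
            reward += 1.0
        reward -= step_penalty        # prefer shorter workflows
    return reward

trace = [{"tool": "search_wiki"}, {"tool": "query_crm"}]
trajectory_reward(trace, correct_tool="query_crm",
                  forbidden_tools={"delete_record"})  # ≈ 0.9
```

In a real recipe this signal would feed a policy-gradient update over the model's reasoning traces; the sketch only shows why "the right level of performance" is a training-time objective rather than a prompt-time one.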

Mistral's announcement makes this explicit: custom models make enterprise agents more reliable by providing deeper understanding of internal environments — more precise tool selection, more dependable multi-step workflows, and decisions that reflect internal policies rather than generic assumptions.

For VPs of Engineering and AI/ML leads: If your agent platform calls a generic LLM that doesn't understand your internal tools, workflows, or business logic, you'll spend months building brittle prompt-engineering workarounds. Training a model on your internal data means the agent starts with institutional knowledge baked in.
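"Baking in" institutional knowledge usually starts with supervised fine-tuning on logged interactions. A hypothetical sketch of turning an internal agent trace into a chat-style JSONL training record (the schema below is a common open convention, not Forge's documented format, and the tool name is invented):

```python
import json

# Hypothetical: convert one logged internal agent interaction into a
# chat-style fine-tuning record, one JSON object per JSONL line.
def to_sft_record(user_request: str, tool_call: dict, final_answer: str) -> str:
    record = {
        "messages": [
            {"role": "user", "content": user_request},
            # The assistant turn that chose an internal tool.
            {"role": "assistant", "content": None,
             "tool_calls": [{"name": tool_call["name"],
                             "arguments": tool_call["arguments"]}]},
            # The tool's observed result, as logged in production.
            {"role": "tool", "content": tool_call["result"]},
            # The grounded final answer the model should learn to produce.
            {"role": "assistant", "content": final_answer},
        ]
    }
    return json.dumps(record)

line = to_sft_record(
    "What is the renewal date for the Acme contract?",
    {"name": "query_crm",
     "arguments": {"account": "Acme"},
     "result": "renewal_date: 2026-09-30"},
    "The Acme contract renews on 30 September 2026.",
)
```

Thousands of records like this, drawn from real workflows, are what let a custom model pick the right internal tool on the first try instead of being coaxed there by prompts.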

Mistral vs the Hyperscalers

Forge enters a crowded market. Amazon Bedrock, Microsoft Azure AI Foundry, and Google Cloud Vertex AI all offer model training and customization. But Salamanca argues these offerings are fundamentally limited:

  1. Cloud-only: "In one set of cases, it's very easy to answer — they want to run this on their premises, and so all these tools that are available on the cloud are just not available for them."
  2. Simplified interfaces: Hyperscalers' training tools offer API interfaces that don't provide the depth of control serious model training requires.
  3. Dependency risk: One customer described how a new closed-source model release — more verbose than its predecessor — crashed their production pipelines. "When you're relying on closed-source models, you are also super dependent on the updates of the model that have side effects."

Mistral has released models under permissive Apache 2.0 licenses since its founding. Salamanca confirmed that while Forge currently works with Mistral's own models, support for other open-source architectures is planned: "We're deeply rooted in open source. This has been part of our DNA since the beginning, and we have been building Forge to be an open platform — it's just a matter of time until we open this to other open-source models."

The Verdict: Who Should Consider Forge?

YES, if you:

  • Have proprietary data that's core to your competitive advantage
  • Need data sovereignty for regulatory, security, or IP protection reasons
  • Work in specialized domains (telecom, finance, healthcare, defense)
  • Want to build mission-critical agents aligned to internal policies
  • Already operate GPU clusters or plan to for AI workloads

NO, if you:

  • Are building generic enterprise apps (customer support, content generation, summarization)
  • Don't have unique domain data or workflows
  • Prefer cloud-hosted services with minimal ops burden
  • Are satisfied with fine-tuning APIs for your use cases

For finance leaders: The ROI question is whether the delta between a generic model and a custom-trained one translates to measurable business value. If your use case is generic, renting GPT-4 through an API is cheaper. If your competitive moat is built on proprietary knowledge, owning the model could be worth millions.

For enterprise leaders: The architectural question is whether you're willing to own the training, ops, and continuous improvement burden. Forge provides tooling and embedded scientists, but you're still responsible for data quality, infrastructure, and model lifecycle management.

Want more enterprise AI insights like this? Subscribe to THE DAILY BRIEF — Tuesday + Thursday mornings, delivered to your inbox.



LinkedIn: linkedin.com/in/rberi  |  X: x.com/rajeshberi

© 2026 Rajesh Beri. All rights reserved.
