The National Security Agency is testing Anthropic's Mythos AI model to identify security vulnerabilities in Microsoft products, marking a significant shift in how government agencies approach cybersecurity defense. According to Bloomberg, NSA officials have been "impressed by its speed and efficiency" in finding potential security flaws that human security researchers would take days to discover.
This isn't theoretical. The UK AI Security Institute (AISI) conducted rigorous government evaluations of Mythos and found it achieved a 73% success rate on expert-level capture-the-flag (CTF) cybersecurity challenges—tasks that no AI model could complete before April 2025. More striking: Mythos became the first AI model to autonomously complete "The Last Ones," a 32-step corporate network attack simulation that takes human security professionals an estimated 20 hours to finish.
The dual-use dilemma is stark. While the NSA is testing Mythos to find vulnerabilities before adversaries do, the same capability that makes it valuable for defense also makes it a powerful offensive tool. And here's the problem every CISO should understand: according to one source, more than 99% of the vulnerabilities Mythos discovered were, as of April 7, 2026, still unpatched in widely deployed systems.
What Mythos Can Do That Traditional Security Tools Can't
Mythos doesn't just scan for known vulnerabilities—it discovers them. Traditional automated security tools rely on signature-based detection and known vulnerability databases (CVEs). Mythos, by contrast, can:
- Execute multi-stage attacks across network segments (spanning reconnaissance, lateral movement, privilege escalation, and exfiltration)
- Discover and exploit zero-day vulnerabilities in major operating systems and web browsers, including flaws that have existed for decades
- Autonomously chain dozens of attack steps together without human guidance
- Scale performance with increased compute (UK AISI testing showed continued improvement up to 100 million token budgets)
In the UK government's evaluation, Mythos completed an average of 22 out of 32 steps in the corporate network attack simulation across all attempts—compared to the next-best model (Claude Opus 4.6), which averaged only 16 steps.
For context: Expert-level CTF challenges test specific cybersecurity skills in isolation (code injection, privilege escalation, network pivoting). Real-world attacks require chaining these skills across multiple systems over hours or days. Mythos is the first AI model to bridge that gap in controlled testing environments.
The Enterprise Angle: Defense vs. Offense
The NSA isn't the only organization testing Mythos for defensive purposes. Project Glasswing—a consortium including Microsoft, AWS, Apple, and Google—is using Mythos to "harden software against potential threats," according to Anthropic. The goal: use AI to find vulnerabilities before attackers do, then patch them at scale.
But there's a catch. The UK AISI evaluation noted important limitations in their testing:
- Ranges lacked active defenders and defensive tooling (real enterprise environments have EDR, SIEM, SOC monitoring)
- No penalties for actions that would trigger security alerts (in production, attackers face detection risk)
- Focused on weakly defended and vulnerable systems (not hardened enterprise environments)
This means Mythos can autonomously attack "small, weakly defended and vulnerable enterprise systems where access to a network has been gained," but we don't yet know how it would perform against well-defended Fortune 500 infrastructure with mature security programs.
The business impact for CISOs and security teams is twofold:
- Defensive opportunity: AI-powered vulnerability discovery can dramatically accelerate security testing cycles—from quarterly penetration tests to continuous AI-assisted scanning
- Offensive risk: Adversaries with access to similar AI models can automate reconnaissance and exploitation at scale, making "time to exploit" a more critical metric than ever
Cost and ROI: AI-Powered Security Testing vs. Traditional Pentesting
Traditional penetration testing costs enterprises $15,000–$50,000+ per engagement (depending on scope), with testing cycles typically running quarterly or annually. A senior security consultant charges $200–$400/hour, and a comprehensive enterprise pentest can take 200–400 hours.
AI-powered security testing changes the economics:
- Continuous testing instead of point-in-time assessments
- Automated vulnerability chaining (finding exploit paths across systems)
- Inference cost scales with compute (Anthropic hasn't published Mythos API pricing, but inference costs for frontier models typically run $5–$30 per million tokens)
The ROI equation for security leaders: If Mythos can autonomously execute what takes a senior security researcher 20 hours in a fraction of the time, the cost per vulnerability discovered drops dramatically—even accounting for AI inference costs and false positive triage.
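That ROI equation can be sketched in a few lines. This is a back-of-the-envelope comparison using illustrative midpoints of the figures above (consultant rates, the 20-hour simulation, the 100M-token budget, and typical frontier-model inference pricing); the triage-hours figure is an assumption, not a published number.

```python
# Back-of-the-envelope: human pentest hours vs. AI inference plus triage.
# All constants are illustrative assumptions drawn from ranges in the text,
# not published Mythos pricing.

HUMAN_RATE_PER_HOUR = 300        # midpoint of the $200-$400/hr range
HUMAN_HOURS = 20                 # estimated human time for the 32-step simulation
TOKENS_USED = 100_000_000        # the 100M-token budget from UK AISI scaling tests
COST_PER_M_TOKENS = 15           # midpoint of the $5-$30 per million tokens range
TRIAGE_HOURS = 4                 # assumed human time to validate AI findings

human_cost = HUMAN_RATE_PER_HOUR * HUMAN_HOURS
ai_cost = (TOKENS_USED / 1_000_000) * COST_PER_M_TOKENS \
          + TRIAGE_HOURS * HUMAN_RATE_PER_HOUR

print(f"Human engagement: ${human_cost:,.0f}")  # $6,000
print(f"AI run + triage:  ${ai_cost:,.0f}")     # $2,700
```

Even with generous triage overhead, the per-run cost advantage compounds once testing becomes continuous rather than quarterly.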
But there's a hidden cost: deploying AI-powered security testing requires model access governance, prompt engineering expertise, and integration with existing security workflows. Early adopters will face integration challenges that traditional pentest vendors handle out of the box.
Vendor Comparison: Mythos vs. Traditional Security Tools
| Capability | Mythos (AI-Powered) | Traditional Pentesting | Automated Scanners (e.g., Nessus, Qualys) |
|---|---|---|---|
| Zero-day discovery | ✅ Finds novel vulnerabilities | ✅ Expert-dependent | ❌ Signature-based only |
| Multi-step attack chaining | ✅ Autonomous 32-step attacks | ✅ Manual expertise required | ❌ Single-step detection |
| Speed | ✅ Hours (vs. days for humans) | ⚠️ Days to weeks | ✅ Minutes to hours |
| Cost (per engagement) | ⚠️ TBD (inference + integration) | ❌ $15K–$50K+ | ✅ $2K–$10K/year (subscription) |
| False positive rate | ⚠️ Unknown (early-stage) | ✅ Low (expert validation) | ❌ High (requires triage) |
| Compliance reporting | ❌ Not yet integrated | ✅ Standard deliverables | ✅ Compliance templates |
Key takeaway: AI-powered security testing excels at speed and autonomous discovery but lacks the compliance integration and human validation workflows that mature security programs require. Expect hybrid approaches: AI for continuous discovery, humans for validation and remediation prioritization.
What Security Leaders Should Do Now
The UK AISI's advice to enterprises is blunt: "This highlights the importance of cybersecurity basics." Specifically:
- Patch aggressively. If 99% of Mythos-discovered vulnerabilities remain unpatched, your attack surface is wider than you think.
- Implement Zero Trust access controls. Mythos succeeds when it gains network access and can move laterally—segment networks and enforce least-privilege access.
- Invest in comprehensive logging. AI-powered attacks still leave digital footprints—but you'll only see them if you're logging comprehensively and monitoring actively.
- Adopt the UK Cyber Essentials framework. Security updates, robust access controls, security configuration, and logging are table stakes.
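The logging advice above is concrete enough to sketch. A simple heuristic that comprehensive logs enable is flagging a single source that authenticates to many distinct hosts in a short window—a classic lateral-movement signal. The event format, thresholds, and sample data here are all illustrative assumptions, not a production detection rule.

```python
from collections import defaultdict

# Hypothetical auth-log events: (timestamp_minute, source_ip, target_host).
events = [
    (0, "10.0.0.5", "db-01"),
    (1, "10.0.0.5", "db-02"),
    (2, "10.0.0.5", "web-01"),
    (3, "10.0.0.5", "web-02"),
    (4, "10.0.0.9", "db-01"),
]

WINDOW_MINUTES = 10   # illustrative window, not a tuned value
HOST_THRESHOLD = 3    # distinct hosts from one source within the window

def flag_lateral_movement(events):
    """Return source IPs that touch many distinct hosts within the window."""
    hosts_by_source = defaultdict(set)
    for ts, src, host in events:
        if ts <= WINDOW_MINUTES:
            hosts_by_source[src].add(host)
    return sorted(src for src, hosts in hosts_by_source.items()
                  if len(hosts) >= HOST_THRESHOLD)

print(flag_lateral_movement(events))  # ['10.0.0.5']
```

The point isn't this specific rule—it's that without centralized auth logs, no rule like it can run at all.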
For forward-looking CISOs, consider these strategic questions:
- Can we use AI-powered security testing defensively? (Pilot programs with Anthropic, OpenAI, or similar models for internal vulnerability discovery)
- How do we detect AI-powered attacks in our environment? (Traditional signature-based detection won't catch novel AI-discovered exploits)
- What's our mean time to patch (MTTP) for critical vulnerabilities? (If it's measured in weeks, you're vulnerable to AI-accelerated exploitation)
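The MTTP question is easy to answer if patch data is tracked. A minimal sketch, assuming records are simple (id, disclosed, patched) tuples—the vulnerability IDs and dates here are made up for illustration:

```python
from datetime import date

# Hypothetical patch records: (vulnerability id, disclosed, patched).
records = [
    ("VULN-A", date(2026, 1, 5), date(2026, 1, 12)),
    ("VULN-B", date(2026, 1, 10), date(2026, 2, 14)),
    ("VULN-C", date(2026, 2, 1), date(2026, 2, 4)),
]

def mean_time_to_patch(records):
    """Average days from disclosure to patch across all records."""
    deltas = [(patched - disclosed).days for _, disclosed, patched in records]
    return sum(deltas) / len(deltas)

print(f"MTTP: {mean_time_to_patch(records):.1f} days")  # MTTP: 15.0 days
```

If the number that comes out is measured in weeks, the article's warning applies: AI-accelerated exploitation will get there first.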
The dual-use nature of Mythos means enterprises face both opportunity and risk. The same AI that can harden your defenses can also be used by adversaries to find your weaknesses faster than ever.
The Geopolitical Context: Why the NSA Is Involved
The political backdrop is messy. The Trump administration previously designated Anthropic as a "supply chain risk" due to disagreements over AI use for autonomous weapons and mass surveillance. This theoretically restricts federal agencies from using Anthropic's models—yet Bloomberg reports the NSA is actively testing Mythos anyway.
Why? Because the cybersecurity implications are too significant to ignore. If an AI model can autonomously discover vulnerabilities in critical infrastructure (Microsoft products are deployed across government and enterprise networks), the national security calculus changes. The White House is reportedly exploring ways for federal agencies to access Mythos while maintaining supply chain restrictions.
For enterprise leaders, this signals a broader trend: AI-powered cybersecurity is transitioning from research novelty to operational necessity. If the NSA is testing Mythos despite political complications, it's because the defensive value outweighs bureaucratic friction.
What Comes Next: The AI Security Arms Race
Mythos represents a capability threshold. For the first time, an AI model can autonomously execute complex, multi-step cyberattacks that previously required human expertise. This isn't a lab demo—it's a government-validated capability that has already found thousands of unpatched vulnerabilities in production systems.
The implications for enterprise security leaders are urgent:
- Assume adversaries have similar capabilities. If Mythos can do this, so can models built by nation-state actors or well-funded cybercrime groups.
- Accelerate patch cycles. The window between vulnerability discovery and exploitation is shrinking from months to days (or hours).
- Invest in AI-powered defense. The only way to defend against AI-powered attacks at scale is with AI-powered defense at scale.
The good news: The same organizations testing Mythos offensively (NSA, UK AISI) are also exploring defensive use cases. Project Glasswing's collaboration with Microsoft, AWS, Apple, and Google suggests the industry recognizes the dual-use opportunity.
The bad news: We're in an arms race. Enterprises that treat this as a distant future threat will find themselves defending against AI-accelerated attacks with yesterday's tools.
Want to calculate your own AI ROI? Try our AI ROI Calculator — takes 60 seconds and shows projected savings, payback period, and 3-year ROI.
Continue Reading
- Microsoft Agent 365 Ships: The $99 SKU Is Not Your Bill — Enterprise AI governance and security sprawl
- AI Workforce Automation: Real Enterprise ROI Data from 2026 Deployments — Cost savings and productivity benchmarks
- Zero Trust Architecture: The 2026 Enterprise Playbook — Network segmentation and least-privilege access
Sources:
- Bloomberg: NSA Testing Anthropic's Mythos to Find Flaws in Microsoft Tech (April 30, 2026)
- UK AI Security Institute: Our evaluation of Claude Mythos Preview's cyber capabilities
- Anthropic: Project Glasswing