Security firm Intruder just published the largest public scan of AI infrastructure ever attempted: 2 million hosts probed, 1 million exposed AI services catalogued. The headline findings will make every CISO uncomfortable: 31% of Ollama servers answered unauthenticated requests, with no credentials required; 518 frontier-class models from Anthropic, OpenAI, Google, DeepSeek, and Moonshot were running on the open internet; and 90+ exposed agent management instances were discovered across government, marketing, and finance sectors.
If you are a CIO who approved a "let's experiment with self-hosted LLMs" budget in 2024 or 2025, this is the moment your experiment quietly turned into a compliance event. If you are a CFO, this is the briefing that explains why your AI breach insurance premium is about to triple.
The findings, published May 5, 2026 in The Hacker News, land on top of a broader pattern: a Cloud Security Alliance and Token Security report earlier this spring documented that 65% of enterprises experienced at least one AI agent-driven cybersecurity incident in the past year. Add IBM's 2025 Cost of a Data Breach finding that shadow-AI incidents add $670,000 to average breach cost, and you have the makings of the most underestimated category of enterprise risk in 2026.
This is not a theoretical AI risk piece. It's a stack-by-stack reckoning—and it comes with two practical frameworks you can run on your own infrastructure today.
What the Scan Actually Found
Intruder's research team used certificate transparency logs and standard internet-wide scanning to enumerate AI-specific infrastructure. The methodology is conservative—they only counted services that responded affirmatively to known AI framework fingerprints. The numbers are still staggering.
Topline statistics from the Intruder scan (May 2026):
- 1,000,000+ exposed AI services discovered across 2 million hosts
- 5,200+ Ollama servers identified globally, 31% responded to unauthenticated requests
- 518 frontier models running on unsecured instances (Anthropic, OpenAI, Google, DeepSeek, Moonshot)
- 90+ exposed agent management instances across government, marketing, and finance
- AI services were "more vulnerable, exposed, and misconfigured than any other software category" Intruder has previously investigated
The exposed surfaces revealed user conversation histories, business logic workflows, hardcoded credentials in plaintext, API keys disclosed in configuration files, applications running as root, weak Docker security configurations, and—most worryingly—arbitrary code execution paths in code-interpretation tools.
A Cisco Security Group case study published earlier this year using Shodan tells the same story with different math. Cisco identified 1,139 vulnerable Ollama instances globally: 36.6% in the United States, 22.5% in China, and 8.9% in Germany. Eighty percent were "dormant" but still exploitable, and 88.89% used OpenAI-compatible API schemas, meaning attackers could swap them into existing toolchains with minimal effort.
This was not a one-off finding. It was a confirmation. And it was followed almost immediately by two of the most consequential AI infrastructure vulnerabilities of 2026:
CVE-2026-7482 ("Bleeding Llama"): A critical out-of-bounds heap read in Ollama's GGUF model quantization pipeline, disclosed by Cyera in March 2026. The exploit requires three unauthenticated API calls, leaves no error in the logs, and leaks the running process memory—system prompts, conversation history across all users, API keys, environment variable secrets, and proprietary code. Approximately 300,000 Ollama servers were vulnerable at disclosure. The patch shipped in v0.17.1 on February 25, 2026, but Ollama did not flag it as a security release. The window between patch availability and public awareness was nearly three months.
CVE-2025-59528 (Flowise): A maximum-severity (CVSS 10.0) remote code execution flaw in the Flowise AI agent builder. Roughly 12,000–15,000 instances remain exposed online. Flowise instances commonly hold API keys for OpenAI, Anthropic, and Azure OpenAI, plus credentials for databases, vector stores, and internal business systems. VulnCheck confirmed in-the-wild exploitation in early April 2026 from a Starlink IP address. A companion bug, CVE-2026-41278, exposes credential IDs, plaintext API keys, and password-type fields via unauthenticated GET requests to public chatflow endpoints.
If you are running any of these stacks in production—or, more likely, in a "lab" environment that quietly reached production—the math has changed. The blast radius of a misconfigured AI service is no longer a single chatbot. It is every credential that chatbot has access to.
Why This Matters
Technical Implications (CTO, CIO, CISO)
The first uncomfortable truth: AI infrastructure breaks the standard enterprise security model in three ways. First, AI services bundle compute, data, and identity into a single process. A single Ollama or Flowise instance is simultaneously a workload, a credential store, and a knowledge base. Compromise one, you get all three.
Second, common deployment configurations favor speed of experimentation over security. Ollama listens on 127.0.0.1 by default, but container images and setup guides routinely set OLLAMA_HOST to 0.0.0.0, which exposes the API on every network interface. Flowise ships with permissive public chatflow endpoints. Most self-hosted stacks ship with no authentication and require explicit configuration to enable it. The "secure by default" software engineering norm that took the security industry 20 years to establish has been quietly abandoned in the rush to ship AI tooling.
Third, conventional vulnerability scanners don't speak AI. Port 11434 (Ollama's default) is not in most enterprise vulnerability scanner playbooks. Neither are Flowise's port 3000, Open WebUI's 8080 in development mode, or vLLM's default 8000. CISOs who assume their existing tooling will catch these endpoints are wrong, and many will not discover that until an incident retroactively forces an inventory exercise.
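To make the port list above concrete, here is a minimal sketch of a reachability check for those default ports. It only tests whether a TCP connection succeeds; it does not fingerprint the service behind the port. The function name `open_ai_ports` and the `AI_PORTS` table are illustrative assumptions, not part of any scanner product.

```python
import socket

# Default ports named in the text; adjust for your own deployments.
AI_PORTS = {
    11434: "Ollama",
    3000: "Flowise",
    8080: "Open WebUI",
    8000: "vLLM",
}


def open_ai_ports(host: str, ports=AI_PORTS, timeout: float = 1.0) -> list[tuple[int, str]]:
    """Return (port, label) pairs on `host` that accept a TCP connection."""
    found = []
    for port, label in sorted(ports.items()):
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                found.append((port, label))
        except OSError:
            continue
    return found


if __name__ == "__main__":
    # Only scan hosts you own or are authorized to test.
    for port, label in open_ai_ports("127.0.0.1"):
        print(f"127.0.0.1:{port} is reachable (default port for {label})")
```

An open port is a lead, not a finding: port 3000 or 8080 may belong to an unrelated web app, so each hit still needs a follow-up check against the actual service.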
The Proofpoint 2026 State of AI in Security report (April 28, 2026) makes this concrete. Of 1,400 security professionals across 12 countries: 52% experienced confirmed or suspected AI-related incidents despite having controls in place, and 52% are not fully confident their AI security controls would detect compromised AI systems even though 63% report having those controls deployed. Only one-third feel fully prepared to investigate AI-related incidents spanning multiple systems.
Business Implications (CFO, COO, Board)
The financial framing has crystallized over the last six months. IBM's 2025 Cost of a Data Breach report quantified shadow-AI incidents at $670,000 above the average breach cost, with U.S. breaches now averaging over $10 million, largely driven by regulatory penalties. Shadow AI incidents represent 20% of all breaches vs 13% for sanctioned AI systems, and 65% involve compromise of customer PII, well above the 53% global average.
For boards, the question is no longer "what's our AI strategy?" It's "what's our AI exposure?" That's a different question, and most enterprises cannot answer it. Of organizations breached via AI in IBM's data, 97% lacked proper AI access controls, and 63% had no completed AI governance framework. The Vercel breach in April 2026, in which attackers compromised a small third-party AI tool (Context.ai) via a Lumma Stealer infection and then pivoted into Vercel's environment variables via OAuth tokens, shows how thin the line is between "experimentation" and "supply chain compromise."
The audit angle is the kicker. State-level AI legislation in Colorado, Connecticut, and Illinois already references the NIST AI RMF as a compliance safe harbor. The EU AI Act stratifies AI by risk profile with explicit technical documentation requirements per tier. Compliance Week's 2026 survey shows 83% of organizations using AI tools but only 25% with strong governance frameworks in place. The gap between "we use AI" and "we can defend our use of AI to a regulator" is enormous, and it is widening.
Market Context: The CISO Toolchain Forms Around the Gap
The vendor ecosystem responding to this is moving fast. In the last 12 months we've seen Palo Alto acquire Portkey to form a unified AI gateway; ConductorOne ship AI tool provisioning with shadow-AI blocking across 3,000+ MCP servers; ServiceNow extend AI Control Tower across both first-party and third-party agents; and Wiz, Cycode, JFrog, and Cisco AI Defense all release AI Bill of Materials (AI-BOM) capabilities to inventory models, datasets, frameworks, and dependencies.
The framework layer is consolidating around four standards, each solving a different piece of the puzzle:
- NIST AI RMF (AI 100-1) — 4 functions, 19 categories, 72 subcategories. The governance baseline. Updated April 7, 2026 with a Critical Infrastructure Profile.
- OWASP Top 10 for LLM Applications (2025) + Agentic Top 10 (2026) — Developer-facing taxonomy of attack patterns.
- MITRE ATLAS — Adversary tactics and techniques specific to AI systems. Used for red teaming and threat modeling.
- ISO/IEC 42001 — The first certifiable international AI management system standard. Increasingly cited in enterprise procurement.
Gartner has been the loudest analyst voice here, with multiple briefings in May 2026 emphasizing that AI agent security incidents are now the #1 reason agentic AI pilots fail to reach production. The May 5, 2026 Gartner note "Autonomous Business and AI Layoffs May Create Budget Room, but Do Not Deliver Returns" surveyed 350 global executives at $1B+ revenue companies and concluded the ROI gap is widening—80% report workforce reductions tied to AI, but those reductions do not translate to ROI, primarily because the savings get consumed by incident remediation and governance retrofit.
Forrester's 2026 enterprise software predictions go further: half of enterprise ERP vendors will launch autonomous governance modules in 2026, combining explainable AI, automated audit trails, and real-time compliance monitoring. Translation: governance is becoming a product category, not a checkbox.
Framework #1: The AI Infrastructure Security Readiness Audit (25-Point Scale)
If you do nothing else after reading this article, run this audit on the AI services in your environment. Score each dimension 1–5. Total possible score: 25.
Dimension 1: Asset Inventory (1–5)
- 1: No inventory exists. AI infrastructure is wherever someone in the org has launched it.
- 2: Manual spreadsheet maintained by one person, updated quarterly at best.
- 3: Automated discovery for sanctioned platforms only.
- 4: Continuous asset discovery across cloud accounts, including unsanctioned regions/accounts.
- 5: Live AI-BOM (AI Bill of Materials) covering models, datasets, frameworks, dependencies, and integrations, refreshed daily, mapped to business owners.
Dimension 2: Network Exposure (1–5)
- 1: AI services bound to 0.0.0.0 with no firewall rules.
- 2: VPC-level firewall rules, but ports are reachable from the public internet for some services.
- 3: Internal-only by default; external access requires an exception.
- 4: All AI endpoints behind authenticated reverse proxies or API gateways with IP allow-listing.
- 5: Zero-trust network architecture; AI endpoints reachable only via authenticated, context-evaluated sessions with continuous re-evaluation.
Dimension 3: Authentication & Credentials (1–5)
- 1: Default deployments with no authentication; API keys in plaintext config files.
- 2: API keys exist but are shared across teams; no rotation policy.
- 3: Per-service API keys, rotated quarterly, stored in a secrets manager.
- 4: Short-lived tokens, automated rotation, separation between read/write/admin scopes.
- 5: Identity-based access for every agent and tool invocation, with purpose limitations enforced at the data layer.
Dimension 4: Monitoring & Audit Trail (1–5)
- 1: No AI-specific logging. Logs sampled at the load-balancer level only.
- 2: Application logs exist but not retained per regulatory minimums.
- 3: Centralized SIEM ingestion with AI-aware parsers for the top three frameworks in use.
- 4: Per-prompt, per-tool-call audit logs with user, intent, and data lineage captured.
- 5: Evidence-quality audit trails covering all data channels agents touch, exportable for HIPAA, PCI-DSS, EU AI Act, and CMMC compliance.
Dimension 5: Incident Response & Containment (1–5)
- 1: No AI-specific incident playbooks.
- 2: General incident response covers AI as a footnote.
- 3: Dedicated AI incident playbook exists; tabletop run annually.
- 4: Documented containment capability—ability to terminate a misbehaving agent within minutes, isolate compromised credentials, and revoke OAuth tokens granted to AI tools.
- 5: Automated containment integrated with the SIEM and identity provider; kill-switches tested quarterly.
Score Interpretation:
- 5–9 (Critical Risk): You are likely in IBM's "97% without controls" cohort. Probability of a material AI-related incident in the next 12 months is high. Action: convene a security tiger team within 30 days, freeze all new AI deployments until baseline inventory is complete.
- 10–14 (High Risk): Foundational controls exist but are insufficient for sanctioned production AI. Action: prioritize Dimensions 2 and 3—network exposure and authentication—as these address 70% of the exposed-service problem documented by Intruder, Cisco, and Cyera.
- 15–19 (Moderate Risk): Above industry median (CSA reports the median enterprise sits at 12). Action: invest in Dimensions 4 and 5—monitoring and incident response—where most enterprises are weakest.
- 20–25 (Production-Ready): You are in roughly the top 10% of enterprise AI security maturity. Action: extend governance to third-party AI tools and supply chain (the Vercel/Context.ai failure mode).
Run this audit, then run it again in 90 days. If your score hasn't moved by at least three points, your governance program is theater.
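For teams that want to track the audit in a script rather than a spreadsheet, the scoring rules above can be sketched as follows. The dimension keys and function names are illustrative assumptions; the band boundaries and the three-point re-run threshold come straight from the text.

```python
DIMENSIONS = (
    "asset_inventory",
    "network_exposure",
    "authentication_credentials",
    "monitoring_audit_trail",
    "incident_response",
)

# Lower bound of each band, highest first, as given in the score interpretation.
BANDS = (
    (20, "Production-Ready"),
    (15, "Moderate Risk"),
    (10, "High Risk"),
    (5, "Critical Risk"),
)


def audit_total(scores: dict[str, int]) -> int:
    """Sum the five dimension scores, rejecting missing or out-of-range values."""
    if set(scores) != set(DIMENSIONS):
        raise ValueError(f"expected exactly these dimensions: {DIMENSIONS}")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {value}")
    return sum(scores.values())


def risk_band(total: int) -> str:
    """Map a total score (5-25) to its interpretation band."""
    for floor, label in BANDS:
        if total >= floor:
            return label
    raise ValueError("total is below the minimum possible score of 5")


def moved_enough(previous_total: int, current_total: int, minimum_gain: int = 3) -> bool:
    """The 90-day re-run test: has the score improved by at least three points?"""
    return current_total - previous_total >= minimum_gain
```

For example, an organization scoring 2 on every dimension totals 10 and lands in the High Risk band; re-scoring at 12 ninety days later would fail the three-point test.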
Framework #2: The 30/60/90 AI Infrastructure Hardening Checklist
The audit tells you where you stand. This checklist tells you what to do in the next 90 days. It is sequenced by impact-per-hour-of-effort, based on the failure modes documented across the Intruder, Cisco, Cyera, and Proofpoint research.
Days 1–30: Stop the Bleeding (Network & Authentication)
- Run an external scan of your public IP space for ports 11434 (Ollama), 3000 (Flowise), 7860 (Gradio), 8000 (vLLM), 8080 (Open WebUI). Use Shodan or your existing ASM tool with custom queries. Any hit is an exposed asset.
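- Bind Ollama and similar local-first frameworks to 127.0.0.1. Ollama's own default is loopback, so check for OLLAMA_HOST=0.0.0.0 overrides in container images, systemd units, and compose files; a 0.0.0.0 bind is the single most common root cause across the 175,000 exposed instances Indusface documented.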
- Patch Ollama to v0.17.1 or later to remediate CVE-2026-7482. Assume compromise if your server was internet-accessible at any point; rotate all credentials reachable from it.
- Upgrade Flowise to v3.1.1 or later to remediate CVE-2025-59528 and CVE-2026-41278.
- Place every AI endpoint behind an authenticating reverse proxy (Caddy, Traefik, nginx with auth_request, or a commercial AI gateway). No exceptions for "internal" services.
- Rotate every API key stored on or routed through self-hosted AI infrastructure. Treat all of them as potentially exposed.
- Implement IP allow-listing for any AI endpoint that must remain reachable across networks.
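Once the external scan turns up a host on port 11434, the next question is whether it answers without credentials. The sketch below sends a plain GET to Ollama's model-listing endpoint (`/api/tags`) and reports whether a model list comes back. The function name is a hypothetical helper, and the check assumes that an authenticating proxy would reject the request with a 401 or 403.

```python
import http.client
import json
import urllib.error
import urllib.request


def ollama_answers_without_auth(base_url: str, timeout: float = 3.0) -> bool:
    """True if GET {base_url}/api/tags returns a model list with no credentials sent."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return False
            body = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Unreachable, rejected (401/403 raise HTTPError), or not JSON.
        return False
    return isinstance(body, dict) and "models" in body


if __name__ == "__main__":
    # Only probe hosts you own or are authorized to test.
    target = "http://127.0.0.1:11434"
    print(target, "unauthenticated:", ollama_answers_without_auth(target))
```

A `True` result for any address reachable from outside your network belongs in the "assume compromise" category described above: patch, then rotate every credential the host could reach.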
Days 31–60: Build the Audit Trail (Monitoring & AI-BOM)
- Deploy AI-aware logging for the top three frameworks in your environment. Capture prompt, model, user identity, tool invocation, and downstream data access. Pipe to your SIEM.
- Generate an initial AI Bill of Materials for sanctioned platforms. Inventory: models in use, training data sources, framework versions, dependencies, integrations.
- Map AI services to data classifications. Any AI service touching regulated data (PII, PHI, financial, source code) gets tagged as "Tier 1" and receives enhanced monitoring.
- Establish baseline metrics: inference request volume per service, GPU utilization, anomalous prompt patterns, off-hours access. Anything outside baseline triggers alerts.
- Audit OAuth grants issued by employees to third-party AI tools. The Vercel/Context.ai breach traced back to a single employee granting "Allow All" permissions to a Google Workspace OAuth app.
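As a starting point for the logging item above, here is a minimal sketch of a per-request audit record that captures the fields the checklist names and serializes them one JSON object per line for SIEM ingestion. The class name, field names, and the 08:00 to 18:00 UTC working window are all illustrative assumptions; real deployments will need their own schema, weekend handling, and a decision on whether to store prompts verbatim.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AIAuditRecord:
    user: str
    model: str
    prompt: str
    tool_calls: list[str] = field(default_factory=list)
    data_accessed: list[str] = field(default_factory=list)
    # ISO 8601 timestamp in UTC, filled in at creation time unless supplied.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        """Serialize as a single JSON object (JSON Lines style)."""
        return json.dumps(asdict(self), sort_keys=True)


def is_off_hours(timestamp: str, start_hour: int = 8, end_hour: int = 18) -> bool:
    """Flag events outside a hypothetical 08:00-18:00 working window."""
    hour = datetime.fromisoformat(timestamp).hour
    return not (start_hour <= hour < end_hour)


if __name__ == "__main__":
    record = AIAuditRecord(
        user="alice",
        model="llama3.1:70b",
        prompt="Summarize Q1 treasury exposure",
        tool_calls=["snowflake.query"],
        data_accessed=["finance.positions"],
    )
    print(record.to_json_line())
    print("off hours:", is_off_hours(record.timestamp))
```

The off-hours flag is the simplest of the baseline signals listed above; request volume and prompt-pattern anomalies need historical data before a threshold means anything.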
Days 61–90: Operationalize Governance (Incident Response & Compliance)
- Draft an AI-specific incident playbook. Include containment steps: how to terminate a misbehaving agent, revoke OAuth tokens, rotate keys, preserve forensic evidence.
- Run a tabletop exercise simulating an Ollama/Flowise compromise. Measure mean time to detect, contain, and recover. The CSA/Token Security data suggests most enterprises take 7+ days longer to resolve AI incidents than non-AI incidents.
- Map controls to NIST AI RMF GOVERN/MAP/MEASURE/MANAGE functions. This is the single highest-leverage framework alignment for U.S. regulatory safe harbor.
- Add AI security review to procurement workflow. Any new AI tool—first-party or third-party—gets evaluated against the 25-point readiness audit before purchase.
- Brief the board. Frame the conversation around exposure (number of AI services, classification of data they touch, audit readiness), not vendor names.
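Most enterprises trying to execute this list will discover that three of the Days 1–30 items (the external scan, the loopback bind, and the authenticating reverse proxy, items 1, 2, and 5) resolve 60–70% of their material exposure on their own. The remaining 30–40% is governance work, which is slower but lower-urgency.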
Case Study: A $4.6M Breach That Started With a "Lab" Environment
A mid-market financial services firm—Fortune 1000 but not Fortune 500, with revenue between $1.5B and $3B annually—launched an internal generative AI assistant pilot in mid-2025. Per their post-incident disclosure to industry peers (anonymized in CSA reporting), the pilot was a Flowise instance fronted by an Ollama deployment running Llama 3.1 70B on three AWS g5.12xlarge instances. The team that built it included one senior engineer, one ML platform engineer, and a contractor. It was tagged as a "lab environment" and was never moved into the SOC's monitored asset perimeter.
Over the following nine months, the lab quietly became production. Internal teams in compliance, treasury, and customer service began routing real workflows through the chatbot. Credentials for Snowflake, the firm's CRM, and an internal document repository were added to the Flowise configuration to enable retrieval-augmented generation. By February 2026, the assistant was processing roughly 4,200 queries per day from 380 internal users.
In April 2026, attackers exploited CVE-2025-59528 against the externally-reachable Flowise instance (which the team did not realize had a public IP attached after a routing change three months earlier). The Flowise public chatflows endpoint also leaked plaintext credentials via CVE-2026-41278. Within 48 hours, the attackers had pivoted to Snowflake, exfiltrated 2.1M customer records, and posted a sample for sale on a dark-web forum.
Outcome data, from the firm's internal post-mortem:
- Total incident cost: $4.6M (above IBM's $4.44M global average, in line with IBM's $4.63M shadow-AI average).
- Time to detect: 11 days (vs. IBM's 175-day global mean, but only because a customer noticed their data on the forum—not because of internal detection).
- Time to contain: 26 days.
- Regulatory: state-level breach notification in 14 states; one open FTC inquiry as of publication.
- Personnel: two senior leaders departed; AI Risk Committee chartered with board-level reporting.
The post-mortem is unsparing: "We had a security program that did not see AI. We had an AI program that did not see security. The gap between them is where this incident happened."
This is not an edge case. CSA's 65% incident rate suggests that some version of this story has played out at roughly two-thirds of large enterprises in the last 12 months. The only thing that varies is whether the post-mortem has been written yet.
What To Do About It
For CIOs: Your immediate priority is inventory. You cannot defend what you cannot enumerate. Within 30 days, you should have a complete list of every AI service running on infrastructure your organization pays for—sanctioned or not. The AI-BOM vendors (Wiz, Cycode, JFrog, Cisco AI Defense, Palo Alto/Portkey) are credible starting points if you have no internal capability. Tier these services by data classification, not by who launched them.
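One way to picture the inventory is as a list of records, each carrying the AI-BOM fields and a tier derived from the data it touches rather than from who launched it. The sketch below is an assumption-laden illustration: the `AIBomEntry` fields, the `REGULATED` set, and the fallback "Tier 2" label are hypothetical, not a vendor schema.

```python
from dataclasses import dataclass, field

# Regulated categories named in the hardening checklist.
REGULATED = {"PII", "PHI", "financial", "source code"}


@dataclass
class AIBomEntry:
    service: str
    owner: str            # accountable business owner
    sanctioned: bool      # recorded for reporting, deliberately ignored by tiering
    models: list[str] = field(default_factory=list)
    frameworks: dict[str, str] = field(default_factory=dict)  # name -> version
    integrations: list[str] = field(default_factory=list)
    data_classes: set[str] = field(default_factory=set)

    @property
    def tier(self) -> int:
        """Tier 1 if the service touches any regulated data class, else Tier 2."""
        return 1 if self.data_classes & REGULATED else 2


def tier_one(inventory: list[AIBomEntry]) -> list[str]:
    """Names of services that need enhanced monitoring, sanctioned or not."""
    return sorted(entry.service for entry in inventory if entry.tier == 1)


if __name__ == "__main__":
    inventory = [
        AIBomEntry("lab-assistant", "treasury", sanctioned=False,
                   models=["llama3.1:70b"], integrations=["snowflake", "crm"],
                   data_classes={"PII", "financial"}),
        AIBomEntry("docs-bot", "marketing", sanctioned=True,
                   data_classes={"public"}),
    ]
    print(tier_one(inventory))
```

Note that the unsanctioned lab service is the one that lands in Tier 1 here, which is the point of tiering by data classification.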
For CFOs: Reframe the AI security spend. The math has shifted. A $200K–$500K annual investment in AI governance tooling is now demonstrably cheaper than the $4.6M cost of a single incident. IBM's data lets you build this case directly: expected loss = probability × impact. Taking 13% as a rough annual incident probability and IBM's $4.63M shadow-AI average as the impact gives 0.13 × $4.63M ≈ $602K of expected loss per organization per year. Most enterprises are not budgeting against this number. Start.
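The expected-loss arithmetic is simple enough to put in a few lines so the inputs can be swapped for your own. This sketch uses the article's figures; treating 13% as an annual incident probability is the article's simplifying assumption, not an actuarial estimate, and the function name is illustrative.

```python
def expected_annual_loss(probability: float, impact_usd: float) -> float:
    """Expected loss = probability of an incident in a year x cost if it happens."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * impact_usd


# Figures from the text: 13% probability, $4.63M shadow-AI breach average.
loss = expected_annual_loss(0.13, 4_630_000)
print(f"Expected annual loss: ${loss:,.0f}")

# Compare against the $200K-$500K governance tooling range quoted in the text.
for budget in (200_000, 500_000):
    print(f"${budget:,} budget is {budget / loss:.0%} of expected annual loss")
```

Both ends of the tooling range come in under the expected annual loss, which is the budgeting argument in one line; substituting your own incident probability and breach cost changes the numbers but not the method.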
For Business Leaders: Resist the temptation to centralize AI under a "ban list." Shadow AI exists because sanctioned AI is too slow. The CISOs producing the best outcomes are pairing tight infrastructure governance (this article) with frictionless sanctioned alternatives—internal model gateways, pre-approved tool catalogs, and 60-second provisioning workflows. ConductorOne, Portkey, and Zscaler's AI Guard are credible vendors. Pilot one. Measure user adoption. Iterate.
The pattern across the data is consistent. The enterprises pulling ahead in AI are not the ones with the most ambitious agent strategies. They are the ones who can produce, on demand, a defensible inventory of every model, dataset, framework, credential, and integration in their AI stack. The Intruder scan is a snapshot of what happens when that capability does not exist. Build it.
