Cybersecurity

AI Agent Marketplaces Have a 13% Critical Vulnerability Rate

The dirty secret of AI agent marketplaces: a 13% critical vulnerability rate More than one in eight skills available across AI agent marketplaces contains a critical vulnerability. That number comes from the Agent Skills project, an open-source registry built specifically because no one else was tracking it — and mainstream AI coverage has treated it ... Read more

AI Agent Marketplaces Have a 13% Critical Vulnerability Rate
Illustration · Newzlet

The dirty secret of AI agent marketplaces: a 13% critical vulnerability rate

More than one in eight skills available across AI agent marketplaces contains a critical vulnerability. That number comes from the Agent Skills project, an open-source registry built specifically because no one else was tracking it — and mainstream AI coverage has treated it as a rounding error.

The exposure is structural. Skills are packaged instructions and resources that extend what an AI coding agent can do — think plugins that teach Claude Code, Cursor, Copilot, or Antigravity new workflows and specialized capabilities. Unlike a suspicious npm package that a developer might isolate or sandbox, agent skills operate inside the developer environment with elevated trust baked in from the start. A compromised skill doesn’t just corrupt a dependency tree. It can reach live codebases, harvest API credentials, and interact with external services, all while the developer watches the agent appear to work normally.

The growth curve made this inevitable. Claude Code, Cursor, and GitHub Copilot have each seen adoption accelerate sharply over the past 18 months. Every new user base creates demand for more skills, and community marketplaces have responded by publishing faster than any security review process can follow. The result is an ecosystem that structurally resembles the early npm registry — high volume, low friction, minimal gatekeeping — except the blast radius of a bad actor is significantly larger when the attack surface is an autonomous agent with file system and network access.

The Agent Skills project frames this as the next supply-chain disaster waiting to happen, and the comparison to npm is deliberate. The 2021 ua-parser-js compromise and the broader left-pad class of incidents showed what happens when developers extend implicit trust to third-party packages at scale. Agent skills carry that same implicit trust, compounded by the fact that most developers have no visibility into what a skill actually does once an AI agent starts executing it. Auditing a CLAUDE.md instruction file or a Cursor rule set requires a different mental model than reading a JavaScript module — and most security tooling doesn’t cover it at all.

What ‘skills’ actually are — and why they are a bigger attack surface than most developers realise

Agent skills are packaged instructions and resources that extend what an AI coding agent can do autonomously. Think of them as plugins — a skill teaches an agent new workflows, patterns, and specialized knowledge, then steps back while the agent acts on that teaching inside a live development environment. Agents like Claude Code, Cursor, Copilot, and Antigravity all consume skills in this way, pulling in modular capabilities that expand their reach across codebases, infrastructure, and external services.

That architecture creates a problem most developers have not fully reckoned with. A browser extension that misbehaves crashes a tab. A skill that misbehaves sits inside the agent’s decision-making loop — the layer that determines what code gets written, what commands get run, and what API calls get made. A malicious or poorly written skill does not announce itself with an error. It quietly shapes the agent’s choices, potentially redirecting actions against production infrastructure or injecting patterns into code that ships to real users. The failure mode is silent and downstream.

The scale of exposure is already measurable. Data from the agent-skills project puts the rate of critical vulnerabilities in marketplace skills above 13 percent. That figure applies to skills already circulating in ecosystems developers trust and use daily. The vulnerabilities are not hypothetical — they exist in packages that agents are already loading and acting on.

The problem compounds because of standardisation. The agent-skills project includes MCP server integration, which means it operates on top of the Model Context Protocol — an emerging interoperability layer that lets skills work across multiple agents from different vendors. MCP turns skills from a tool-specific quirk into a shared attack surface. A vulnerability pattern that works against one MCP-compatible agent works against all of them. That is the same dynamic that made npm and PyPI supply-chain attacks so damaging: one compromised package, many downstream victims. The skill ecosystem is heading toward the same structural risk, and most teams are not treating it with the same scrutiny they apply to their package dependencies.

How Agent Skills is trying to be the ‘verified app store’ the ecosystem desperately needs

The Tech Leads Club’s Agent Skills project takes direct aim at the credibility problem by operating as a hardened registry — every skill is explicitly verified and tested before inclusion. That’s a meaningful structural difference from the status quo, where most extension ecosystems lean on community star ratings or vendor self-attestation to signal trustworthiness. Neither approach has a strong track record. Agent Skills treats verification as the entry requirement, not a badge applied after the fact.

The registry’s architecture makes this commitment visible. A dedicated Security & Trust section sits at the top level of the project’s documentation — not buried in a contributing guide or footnoted in a README. In most package repositories, security handling has historically been reactive: vulnerabilities get flagged after users encounter them. Agent Skills inverts that default. Vetting happens upstream, before any skill reaches a developer’s environment.

Scope also signals intent. The registry currently supports Antigravity, Claude Code, Cursor, and GitHub Copilot — four of the most widely used AI coding agents in professional development workflows. Supporting that cross-section of the market positions Agent Skills as infrastructure rather than a companion tool for one platform’s users. A developer switching between Cursor and Claude Code doesn’t need to re-evaluate the skill library’s safety guarantees. The verification carries across.

The project’s own framing sharpens the stakes: it operates against a backdrop where over 13% of marketplace skills contain critical vulnerabilities. That figure, cited directly in the project’s documentation, gives the hardened-registry model a concrete problem to solve rather than a hypothetical one. Agent Skills isn’t pitching security as a differentiator for its own sake — it’s responding to a failure rate that already exists in the wild and that grows more consequential as AI agents gain deeper access to codebases, CI pipelines, and production systems.

What most coverage is missing: this is a supply-chain risk story, not a product launch story

Tech media has covered AI coding agents almost exclusively through a productivity lens. Benchmark comparisons, token throughput, code completion rates — the coverage treats Cursor, GitHub Copilot, and Claude Code as software tools to be evaluated on output quality. The extensibility ecosystems beneath those tools have received almost no scrutiny.

That framing has a cost. The skill and plugin marketplaces that extend these agents mirror conditions that security researchers now recognize as precursors to supply-chain disasters. The npm ecosystem looked like a productivity story too — until malicious packages started exfiltrating credentials from millions of developer machines. SolarWinds looked like an enterprise IT story until it became a national security incident. The pattern is consistent: a fast-growing dependency ecosystem, minimal provenance controls, and professional users who trust the toolchain by default.

The Agent Skills project surfaces a number. Over 13% of marketplace skills available for AI coding agents contain critical vulnerabilities. That figure applies to tools developers are running inside active codebases, connecting to internal APIs, and operating with elevated permissions inside CI/CD pipelines. It is not a theoretical exposure. It is a measured vulnerability rate in software that enterprise engineering teams are already using.

Most coverage of Agent Skills has treated this as a product launch — a new registry with a quick-start guide. That framing buries the actual news. A validated skill registry exists precisely because the absence of one produced a significant, quantifiable security problem. The registry is the response to a crisis, not the announcement of a feature.

Enterprise security and DevSecOps teams have not caught up. Formal policies governing AI agent skill provenance — who authored a skill, how it was validated, what permissions it requests — do not exist in most organizations. Security teams that have spent years hardening their software supply chains against malicious npm packages and compromised CI integrations have not yet applied the same scrutiny to the agent skill layer. Agent Skills is forcing that conversation into the open. The alternative is waiting for a high-profile incident to force it instead — and the historical precedent for how that plays out is not encouraging.

The open questions that will determine whether this solution scales

The Agent Skills registry carries real promise, but three unresolved questions will determine whether it hardens the ecosystem or simply adds a new layer of false confidence.

First, governance. A GitHub-hosted repository is only as trustworthy as the humans and processes behind it. The project currently markets itself as “secure” and “validated,” but it has not published a clear answer to the questions every enterprise security team will ask: Who performs the vetting? What specific criteria does a skill have to meet to earn a verified badge? What is the revocation process when a previously approved skill is found to contain malicious logic after the fact? Without documented, enforceable answers to those questions, the registry risks becoming exactly the kind of trusted-but-unaudited source that made the npm and PyPI supply-chain attacks so damaging — a place where the appearance of safety substitutes for actual safety.

Second, distribution. Agent Skills already lists Cursor, Claude Code, Copilot, and Antigravity as supported agents. But listing compatibility is not the same as integration. If Cursor does not surface verified skills by default inside its interface, if Anthropic does not steer Claude Code users toward the registry, and if Microsoft does not embed a trust signal into Copilot’s extension layer, most developers will continue pulling skills from wherever is fastest — a random gist, a Discord post, a blog tutorial. Convenience has always beaten security hygiene at the distribution layer, and that dynamic does not change unless the major vendors make friction-free access to safe skills a product decision, not a user responsibility.

Third, protocol versus registry. The project exposes an MCP server integration, which points toward a more architecturally ambitious future: skill security enforced at the protocol level rather than managed through a centralized list. That approach could be significantly more robust — a compromised skill would fail validation at the agent runtime rather than slip through because a registry update lagged. But the MCP ecosystem has not coalesced around a shared security standard, and fragmentation at the protocol layer would produce a patchwork of incompatible trust models that is harder to reason about than the current problem it is meant to solve.

AI-Assisted Content — This article was produced with AI assistance. Sources are cited below. Factual claims are verified automatically; uncertain claims are flagged for human review. Found an error? Contact us or read our AI Disclosure.

More in Cybersecurity

See all →