Project Glasswing Shows AI Vulnerability Hunting Cuts Both Ways


What Project Glasswing actually is — and why it’s different from previous security AI trials

Project Glasswing is Anthropic’s controlled programme for releasing Mythos Preview — its latest AI model — to a curated set of organisations specifically for offensive and defensive security testing. This is not a general API rollout. Anthropic issued invitations selectively, limiting access to participants who could test the model against real infrastructure and report back on its capabilities and failure modes.

The invitation-only structure marks a deliberate break from how AI labs typically release powerful models. Standard practice is broad access: publish an API, let the market experiment, iterate on feedback. Anthropic has done the opposite here, keeping Mythos Preview out of general circulation while using Project Glasswing to gather structured, high-signal data from environments where the stakes are real. That restraint is itself a signal. It suggests Anthropic recognises that a model capable enough to find genuine vulnerabilities in production systems carries risks that a benchmark leaderboard cannot capture.

Newzlet received an invitation to participate. Over several weeks, we pointed Mythos Preview at more than fifty of our own code repositories — live, production infrastructure, not sandboxed replicas or deliberately seeded test environments. That distinction matters. Synthetic benchmarks measure performance against known problems with known answers. Running a model against your actual codebase surfaces something harder to fake: whether it finds things your own engineers missed, how it reasons about context across large codebases, and whether its outputs are actionable or noise.

What we observed across those repositories shapes everything that follows in this article. Mythos Preview performed differently from every other security-focused model we have tested in recent months — not uniformly better across all tasks, but distinctly more capable in specific areas that matter to attackers. Understanding what those areas are, and what the programme’s controlled structure implies about Anthropic’s own risk calculus, is the starting point for any serious assessment of where AI-assisted vulnerability hunting is heading.

What Mythos Preview found — and how fast it found it

Across those fifty-plus repositories, Mythos Preview produced findings faster and covered more ground than any security-focused LLM we had previously tested on the same infrastructure.

Speed was the first thing that stood out. Where earlier models worked through systems methodically and flagged issues in sequence, Mythos Preview generated findings across multiple repositories in parallel, compressing what had previously taken days of scanning into a fraction of that time. The breadth matched the pace — it did not concentrate on obvious targets and ignore the rest.

The more significant capability was chaining. Mythos Preview took individually low-severity issues, the kind of findings that rarely trigger alerts in isolation, and connected them into coherent, higher-impact attack paths. A misconfigured header here, a permissive scope there, a service boundary with predictable behaviour: the model assembled these into sequences that a human attacker could plausibly walk through. Rule-based scanners do not do this. They flag individual conditions against known signatures; they do not reason about what becomes possible when several conditions exist together.
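To make the idea of chaining concrete, here is a minimal sketch in Python. It is not how Mythos Preview works internally, which has not been disclosed. It simply assumes each finding can be modelled as a step from one attacker position to another and searches for sequences that link an entry point to a sensitive target. The `Finding` class, the `find_chains` function, and every finding and position name below are hypothetical, chosen to mirror the examples in the paragraph above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One finding, modelled as a step an attacker could take (hypothetical model)."""
    name: str
    severity: str  # severity the finding would receive in isolation
    start: str     # position the attacker needs before using it
    end: str       # position the attacker holds afterwards


def find_chains(findings, entry, target):
    """Breadth-first search for loop-free sequences of findings from entry to target."""
    chains = []
    queue = deque([(entry, [])])
    while queue:
        position, path = queue.popleft()
        if position == target:
            chains.append(path)
            continue
        # Positions already reached on this path; revisiting one would form a loop.
        visited = {entry} | {step.end for step in path}
        for finding in findings:
            if finding.start == position and finding.end not in visited:
                queue.append((finding.end, path + [finding]))
    return chains


# Illustrative data only: none of these are real findings from the trial.
FINDINGS = [
    Finding("misconfigured header", "low", "internet", "user session"),
    Finding("permissive token scope", "low", "user session", "internal API"),
    Finding("predictable service boundary", "low", "internal API", "admin function"),
    Finding("verbose error page", "info", "internet", "stack details"),
]

for chain in find_chains(FINDINGS, "internet", "admin function"):
    print(" -> ".join(step.name for step in chain))
```

Run against the illustrative data, the search links three findings that are each low severity on their own into a single path from the public internet to an administrative function, while the fourth finding leads nowhere. A signature-based scanner would report the same four items as a flat list; the graph view is what makes the combined risk visible.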

Several findings fell into exposure classes that Newzlet’s existing tooling had not surfaced at all. These were not signature misses — the findings did not correspond to known CVEs the scanners had failed to match. They reflected the model reasoning about system architecture and the relationships between components, identifying risk that emerges from how systems are configured relative to each other rather than from any single broken line of code.

That combination — speed, chaining, and architectural reasoning — is what separates Mythos Preview from the generation of tools that preceded it. It also explains why the Project Glasswing findings carry implications well beyond Newzlet’s own security posture.

The missing context most coverage ignores: the same model works for attackers

The capabilities that make Mythos Preview valuable to defenders make it equally dangerous in other hands. Speed, contextual reasoning across large codebases, and attack-path chaining do not become inert when the person running the model has malicious intent. A threat actor with access to the same model — or a sufficiently similar one — can use it to find the same vulnerabilities, chain the same exploits, and move faster than any human analyst reviewing code manually. The tool does not know who is asking.

The security industry habitually frames AI-assisted vulnerability research as a win for defenders, and the framing contains a buried assumption: that defenders get access first, get it broadly, and operationalise it before attackers do. Project Glasswing’s invitation-only structure cuts directly against the "broadly" part of that assumption. The organisations testing Mythos Preview were selected, which means most defenders have no access at all. That is not a criticism of the approach, since controlled rollouts exist for good reasons, but it exposes how narrow the current defensive advantage actually is. Invitation-only access is not a moat. It is a delay.

The more serious gap is what remains undisclosed. There is no public information about what safety mitigations, if any, are built into Mythos Preview to prevent offensive use. No published refusal policies specific to the security domain. No disclosed red-teaming results. No explanation of how the model distinguishes between a security engineer auditing their own infrastructure and someone probing a target they do not own. The organisations participating in Project Glasswing noted that the model shows what attackers will be able to do with the latest models, and they treated that as context, not as a problem requiring a stated solution.

That transparency gap matters enormously. When a model demonstrates genuine capability at finding exploitable vulnerabilities across more than fifty repositories, the question of what guardrails govern its use stops being theoretical. Right now, the answer is: unknown. Coverage of Project Glasswing has focused on what Mythos Preview found. The harder question is what happens when the next version of that capability ships without an invitation list.

What this means for organisations that were not invited to Project Glasswing

The organisations that matter most right now are the ones that were not in the room. While Project Glasswing participants spent weeks running Mythos Preview across their own repositories and learning its failure modes, every other business continued operating under the same threat landscape — except that landscape had quietly shifted.

Attackers do not wait for controlled rollouts. Comparable AI-assisted vulnerability discovery tools are already circulating through underground markets and less scrupulous API resellers. The defenders at most organisations do not have equivalent capability. That gap is not theoretical. It is active.

The problem lands hardest on mid-sized organisations: companies large enough to run complex infrastructure across dozens of services, but not large enough to employ the security teams required to act on what AI generates at volume. Project Glasswing’s own participants flagged this directly: pointing Mythos Preview at more than fifty repositories produced findings faster than human reviewers could triage them, and that was inside a well-resourced organisation that had been invited into the trial. A 200-person software company with two security engineers does not have a triage bottleneck problem. It has a triage impossibility problem. Giving that team access to a high-output vulnerability model without the surrounding process and headcount does not make them safer. It buries them.

Project Glasswing’s phased approach was designed to prevent exactly the kind of reckless deployment that creates new risks. That reasoning is sound. But the practical effect is a capability window — a period during which large, well-connected organisations refine their use of Mythos Preview, build the workflows around it, and fix the vulnerabilities it finds, while everyone else stands still. Windows like that close eventually. The question is what gets exploited before they do.

The harder questions Anthropic and the industry need to answer

Anthropic has not published a timeline for moving Mythos Preview out of its closed trial phase, and that silence is a problem. Project Glasswing participants signed up to test the model against their own infrastructure, but the same capabilities that helped one security team scan more than fifty repositories in weeks will be just as available to threat actors the moment access broadens without adequate controls. Anthropic needs to state publicly when broader availability begins, what access tiers will exist, and exactly what verification gates stand between a new applicant and a working vulnerability-hunting agent.

The verification question cuts deeper than onboarding. Project Glasswing has no publicly documented audit mechanism to confirm that participants are using discovered findings to patch rather than stockpile. Organisations invited into the programme are trusted by assumption. That assumption breaks the moment one participant suffers a breach. An attacker who compromises a Glasswing member inherits not just their data but their query history: a detailed map of exploitable weaknesses, produced by one of the most capable security-focused models available. Anthropic has not explained what happens in that scenario, whether findings are logged centrally, or whether compromised participants are required to disclose.

The wider industry problem is structural. Responsible disclosure frameworks built over the past two decades assume a human researcher finds a vulnerability and reports it to a vendor. AI-assisted hunting breaks both assumptions. The model discovers vulnerabilities faster than any patch cycle, generates findings across dozens of codebases simultaneously, and is itself a new attack surface — a system that can be queried, probed, and potentially manipulated to surface weaknesses on demand. No shared framework exists that treats the AI model as a threat vector in the disclosure chain, not just a tool within it.

CISA, ENISA, and major coordinated vulnerability disclosure bodies have not updated their guidance to address AI-generated findings at scale. Until they do, every organisation running a programme like Glasswing is operating on improvised rules. Anthropic’s closed trial is not a solution to that gap. It only delays the moment when the gap becomes impossible to ignore.

AI-Assisted Content: This article was produced with AI assistance. Factual claims are verified automatically; uncertain claims are flagged for human review.
