Mythos Preview Builds PoC Exploits for Automated Vulner
Anthropic’s Mythos Preview, a security-focused AI model, marks a critical threshold in automated vulnerability research. It moves beyond simply identifying vulnerabilities, now demonstrating the ability to chain them into functional proof-of-concept (PoC) exploits.
That’s the finding from Cloudflare’s security team, which spent several weeks running the model against more than fifty internal repositories as part of Anthropic’s invite-only Project Glasswing.
The results are a meaningful signal for both defenders and attackers: an AI model can now close the gap between “we found a flaw” and “here is a working exploit.”
Previous frontier models tested by Cloudflare could identify individual vulnerabilities and write coherent descriptions of why they mattered.
What they consistently failed to do was finish the job, leaving exploit chains incomplete and exploitability unproven. Mythos Preview changes that in two concrete ways.
Mythos Preview Builds PoC Exploits
Exploit chain construction allows the model to take multiple low-severity primitives, such as a use-after-free bug, an arbitrary read/write, or a return-oriented programming (ROP) gadget, and reason about how they combine into a single, higher-severity working exploit.

Bugs that would have sat invisible in a security backlog become actionable attack paths.
Proof generation means the model writes code to trigger a suspected bug, compiles it in a sandboxed environment, runs it, reads the failure, adjusts its hypothesis, and iterates until it either confirms or rules out exploitability.
A confirmed finding arrives with a PoC attached, significantly reducing triage time.
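The write, run, read, adjust cycle described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not Cloudflare's or Anthropic's implementation: `prove`, `write_poc`, `run_in_sandbox`, `RunResult`, and `Verdict` are hypothetical names, the two callbacks stand in for the model call and the sandboxed compile-and-run step, and the attempt budget is an invented parameter.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunResult:
    crashed: bool  # True if the candidate PoC triggered the suspected bug
    log: str       # compiler or runtime output that is fed back to the model


@dataclass
class Verdict:
    confirmed: bool
    poc: Optional[str]  # the PoC source that triggered the bug, if any
    attempts: int


def prove(
    hypothesis: str,
    write_poc: Callable[[str, str], str],          # hypothetical: (hypothesis, feedback) -> PoC source
    run_in_sandbox: Callable[[str], RunResult],    # hypothetical: compiles and runs the PoC in isolation
    max_attempts: int = 5,                         # assumed budget, not a figure from the article
) -> Verdict:
    """Iterate write -> run -> read failure -> adjust until confirmed or out of budget."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        poc = write_poc(hypothesis, feedback)
        result = run_in_sandbox(poc)
        if result.crashed:
            # A confirmed finding carries its PoC with it.
            return Verdict(confirmed=True, poc=poc, attempts=attempt)
        # Otherwise the failure output becomes input to the next attempt.
        feedback = result.log
    return Verdict(confirmed=False, poc=None, attempts=max_attempts)
```

In this sketch, running out of attempts only means the finding is unconfirmed. Ruling out exploitability, as the article describes, would need a separate judgment from the model or a reviewer.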
Even with Mythos Preview’s improvements, noise remains a challenge. Two factors dominate false positive rates: programming language (C and C++ codebases produced significantly more noise than memory-safe languages like Rust) and model bias (models are tuned to report speculatively, flooding triage queues with “possibly,” “potentially,” and “could in theory” findings).
Mythos Preview noticeably reduces this problem. Its output arrives with fewer hedged conclusions, clearer reproduction steps, and PoC code that makes the fix-or-dismiss decision considerably faster.
Cloudflare found that pointing any AI model directly at a repository produces poor coverage. Real vulnerability research requires a custom execution harness built around several principles:
- Narrow scope: scoping each agent task to a specific function, attack class, and trust boundary produces far sharper findings than broad repository-wide prompts
- Adversarial review: a second independent agent, using a different prompt and model, reviews findings specifically to disprove them, catching a significant fraction of the noise the first agent misses
- Chain splitting: asking “is this code buggy?” and “can an attacker reach this from outside?” as separate tasks produces better reasoning on both
- Parallel narrow tasks: running approximately fifty concurrent agents on tightly scoped hypotheses, then deduplicating the results, outperforms any single exhaustive agent
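Three of these principles (narrow scope, parallel tasks with deduplication, and adversarial review) can be illustrated together in a short sketch. Everything here is an assumption made for illustration: `Task`, `Finding`, `build_tasks`, and `run_harness` are hypothetical names, `hunter` and `reviewer` stand in for calls to two different agents, and deduplicating on function plus attack class is one simple choice rather than the method Cloudflare used.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Task:
    function: str        # one function per task keeps the scope narrow
    attack_class: str    # e.g. "overflow", "use-after-free"
    trust_boundary: str  # e.g. "network"


@dataclass(frozen=True)
class Finding:
    function: str
    attack_class: str
    summary: str


def build_tasks(functions: Sequence[str], attack_classes: Sequence[str],
                trust_boundary: str) -> List[Task]:
    """Expand a target list into one tightly scoped task per (function, attack class)."""
    return [Task(f, a, trust_boundary) for f, a in product(functions, attack_classes)]


def run_harness(
    tasks: Sequence[Task],
    hunter: Callable[[Task], List[Finding]],  # hypothetical first agent
    reviewer: Callable[[Finding], bool],      # hypothetical second agent: True means "disproved"
    max_workers: int = 50,                    # mirrors the roughly fifty concurrent agents
) -> List[Finding]:
    # Run the narrow tasks concurrently; map() returns results in task order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(hunter, tasks))

    # Deduplicate: keep the first finding seen for each (function, attack class).
    unique: Dict[Tuple[str, str], Finding] = {}
    for batch in batches:
        for finding in batch:
            unique.setdefault((finding.function, finding.attack_class), finding)

    # Adversarial review: drop anything the second agent manages to disprove.
    return [f for f in unique.values() if not reviewer(f)]
```

Chain splitting would fit the same shape: the "is it buggy?" and "is it reachable?" questions become two separate task types instead of one prompt.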
Their full pipeline includes recon, hunt, validate, gapfill, dedupe, trace, feedback, and report stages. The trace stage determines whether attacker-controlled input can actually reach a confirmed bug from outside the system.
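The article does not say how the trace stage works internally. As a rough sketch of the underlying question, reachability can be modeled as a breadth-first search over a call graph from externally exposed entry points to the function containing the bug. The call graph format, the `trace_path` name, and the example function names are all hypothetical, and a real trace would also need to follow data flow, not just calls.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional


def trace_path(
    call_graph: Dict[str, List[str]],  # caller -> list of callees (assumed format)
    entry_points: Iterable[str],       # functions reachable from outside the system
    sink: str,                         # function containing the confirmed bug
) -> Optional[List[str]]:
    """Return a shortest call path from any entry point to the sink, or None."""
    parents: Dict[str, Optional[str]] = {}
    queue: deque = deque()
    for entry in entry_points:
        if entry not in parents:
            parents[entry] = None
            queue.append(entry)

    while queue:
        node = queue.popleft()
        if node == sink:
            # Walk parent links back to the entry point to rebuild the path.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for callee in call_graph.get(node, []):
            if callee not in parents:
                parents[callee] = node
                queue.append(callee)
    return None


graph = {
    "http_handler": ["parse_request"],
    "parse_request": ["decode_header"],
    "admin_cli": ["rebuild_index"],
}
print(trace_path(graph, ["http_handler"], "decode_header"))
print(trace_path(graph, ["http_handler"], "rebuild_index"))
```

A bug in `decode_header` is reachable from the network-facing handler in this toy graph, while one in `rebuild_index` is not, which is the kind of distinction that separates an actionable finding from a low-priority one.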
Despite operating under reduced safeguards within Project Glasswing, Mythos Preview exhibited organic refusals, declining to write demonstration exploits in some cases while completing equivalent tasks when they were framed differently.
Cloudflare flagged this inconsistency directly: emergent guardrails alone are not a reliable safety boundary, and any future general availability of capable cyber-focused models will require additional, consistent safeguards layered on top.
Cloudflare is explicit about the dual-use reality: the same capabilities that accelerated internal bug discovery will accelerate attacks against internet-facing applications.
The architectural response (defenses that sit in front of applications, limit blast radius, and enable simultaneous global patch rollout) is increasingly urgent as the gap between vulnerability disclosure and exploitation continues to shrink.
Disclaimer: HackersRadar reports on cybersecurity threats and incidents for informational and awareness purposes only. We do not engage in hacking activities, data exfiltration, or the hosting or distribution of stolen or leaked information. All content is based on publicly available sources.


