
AI-generated image. No copyright claimed or implied.
Yesterday Anthropic published a report that stopped a lot of people in their tracks.
Claude Mythos Preview can autonomously find zero-day vulnerabilities in every major operating system and browser. It writes working exploits. It chains multiple vulnerabilities together to escape a browser sandbox.
It turned a 27-year-old OpenBSD memory safety bug, long considered unexploitable in practice, into a reliable remote crash. No human involvement after the initial prompt.
The security industry will spend this week writing hot takes. We want to say something different.
We saw this coming.
Not Claude Mythos Preview specifically. But the shape of it, the trajectory. That autonomous systems would eventually move faster than human defenders. That the threat surface would shift from “what vulnerabilities exist” to “how quickly can they be found and weaponised.” That the window between disclosure and exploitation would collapse entirely.
Helixar was founded on that premise.
- 27 years: the age of the OpenBSD bug Mythos Preview exploited from scratch
- 0 human decisions required after the initial prompt
- Every major OS and browser in scope for autonomous zero-day discovery
Where the Idea Came From
In early 2025, the security conversation hadn't caught up to where AI was heading. Most vendors were mapping AI to existing threat categories: AI-assisted phishing, deepfakes, slightly smarter malware. Incremental threats layered onto the existing model.
We weren't convinced that was the right frame.
The more interesting question wasn't “what new attacks does AI enable?” It was “what happens when the attacker is an agent?” When the thing probing your systems doesn't sleep, doesn't have a budget, can run a thousand parallel hypotheses simultaneously, and can chain together techniques that no human would have the patience to combine?
That is a qualitatively different problem. It requires a qualitatively different defence.
So we started building one.
What We Built and Why
Helixar has three core components. Each one exists because we anticipated a specific failure mode in how organisations would try to defend against agentic threats.
The Helixar Architecture
Endpoint layer
Sees what is actually happening on the machine: not just what processes report, but what the hardware is doing. A compromised agent that behaves correctly at the application layer can still leave traces elsewhere. You need visibility across the full stack.
API security layer
Purpose-built for agentic traffic. An MCP server isn't a web application. An agent-to-agent call doesn't look like a human user session. The threat model is different: prompt injection, session hijacking, orchestration abuse, tool poisoning. We were built for this. Existing HTTP tooling was not.
Correlation engine
Ties the layers together. An attack that starts at the API layer and pivots through the endpoint is invisible to tools that only see one surface. The kill chain spans layers. Detection has to as well.
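To make the correlation idea concrete, here is a minimal Python sketch of grouping events from different layers into a single chain. The `Event` shape, the session-keyed grouping, and the five-minute window are illustrative assumptions for this example, not Helixar's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Event:
    layer: str    # e.g. "api" or "endpoint" (illustrative layer names)
    session: str  # session / agent identifier shared across layers
    ts: float     # unix timestamp
    action: str

def correlate(events, window=300.0):
    """Group events that share a session id and fall within `window`
    seconds of the previous event, then keep only the chains that
    span more than one layer."""
    chains = {}
    for ev in sorted(events, key=lambda e: e.ts):
        chain = chains.setdefault(ev.session, [])
        if chain and ev.ts - chain[-1].ts > window:
            chain.clear()  # too much idle time: start a fresh chain
        chain.append(ev)
    # A cross-layer chain touches more than one surface
    return {s: c for s, c in chains.items()
            if len({e.layer for e in c}) > 1}
```

A single-surface tool sees only one of these event streams; the point of the sketch is that the cross-layer chain only becomes visible once both streams are keyed to the same session.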
Underneath all of it: a detection philosophy built around behaviour, not signatures. The specific vulnerability being exploited is less important than the pattern of what the attacker is trying to accomplish. Preparation. Positioning. Expansion. Objective. These stages are invariant regardless of the initial access method. Catch the behaviour, and the specific exploit becomes less relevant.
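As a rough illustration of behaviour-first detection, the stage-based idea can be sketched in a few lines of Python. The action names, the stage mapping, and the three-stage threshold below are invented for the example; they are not Helixar's detection logic.

```python
# Hypothetical mapping from observed actions to behavioural stages.
STAGE_OF = {
    "recon_scan": "preparation",
    "credential_read": "positioning",
    "lateral_move": "expansion",
    "data_exfil": "objective",
}
ORDER = ["preparation", "positioning", "expansion", "objective"]

def stages_reached(actions):
    """Return, in kill-chain order, the stages observed in a session."""
    seen = {STAGE_OF[a] for a in actions if a in STAGE_OF}
    return [s for s in ORDER if s in seen]

def is_attack_pattern(actions, threshold=3):
    """Flag a session that progresses through `threshold` or more
    stages, regardless of which exploit produced the actions."""
    return len(stages_reached(actions)) >= threshold
```

Note what the sketch does not contain: any reference to a CVE or signature. Swapping the initial exploit changes the first action, not the overall pattern.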
This is not an accident of design. It is the reason Helixar exists.
We Didn't Just Build a Product. We Published the Research.
Building a commercial platform wasn't enough. We wanted to contribute to the standards and open tooling the industry would need, even if most organisations hadn't accepted they needed them yet.
In March 2026 we published HDP, the Human Delegation Provenance Protocol, a CC BY 4.0 open specification for cryptographically recording and verifying human authorisation in agentic AI systems. When an AI agent acts on your behalf, there is today no standard record of what you actually authorised it to do. Claude Mythos Preview operating autonomously makes that gap acute: you need a verifiable chain of authority between what a human instructed and what the model did. HDP is the first open protocol designed to close that gap.
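To make the idea of a verifiable authorisation record concrete, here is a minimal Python sketch. HDP itself presumably specifies asymmetric signatures and a richer schema; the HMAC construction, field names, and scope check below are simplifying assumptions for a self-contained example, not the protocol.

```python
import hashlib
import hmac
import json
import time

def make_delegation(secret: bytes, human: str, agent: str, scope: list):
    """Create a tamper-evident record of what a human authorised an
    agent to do (HMAC stands in for a real signature here)."""
    record = {"human": human, "agent": agent, "scope": scope,
              "issued_at": time.time()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["mac"] = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return record

def verify_delegation(secret: bytes, record: dict, action: str) -> bool:
    """Check the record is intact, then check the requested action
    falls inside the authorised scope."""
    body = {k: v for k, v in record.items() if k != "mac"}
    payload = json.dumps(body, sort_keys=True).encode()
    intact = hmac.compare_digest(
        record["mac"],
        hmac.new(secret, payload, hashlib.sha256).hexdigest())
    return intact and action in record["scope"]
```

The useful property is the failure mode: an agent that drifts outside its authorised scope, or a record whose scope has been altered after issuance, both verify as false.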
We also publish ReleaseGuard, an MIT-licensed artifact security tool that scans release packages for embedded secrets, source maps, and unsigned binaries, then signs and attests the output with a full SBOM. Agents running in agentic infrastructure will be compromised through their tools and dependencies before they are compromised through their prompts. ReleaseGuard is the supply-chain integrity layer that sees this coming.
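The secret-scanning portion of that kind of tool can be illustrated in a few lines of Python. The patterns below are a deliberately tiny illustrative subset; a real scanner, ReleaseGuard included, would carry a far larger rule set plus entropy checks, and these regexes are not its rules.

```python
import re

# Illustrative detection rules only (assumed for this example).
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_token": re.compile(
        r"(?i)(?:api[_-]?key|token)\s*[:=]\s*['\"][A-Za-z0-9_\-]{20,}['\"]"),
}

def scan_text(name: str, text: str):
    """Return (filename, rule) findings for secrets embedded in text."""
    return [(name, rule) for rule, pat in SECRET_PATTERNS.items()
            if pat.search(text)]
```

Run over every file in a release package before signing, a scan like this blocks the attestation step whenever it returns a non-empty finding list.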
These are not marketing artefacts. They are substantive contributions to the open infrastructure a post-Mythos world will need, published before the report existed and available to anyone to implement.
Helixar open research
- HDP (CC BY 4.0): Human Delegation Provenance Protocol, v0.1. Open specification for cryptographic authorisation chains in agentic AI.
- ReleaseGuard (MIT licence): artifact scanning, secret detection, and SBOM attestation for AI developer tooling, available on the GitHub Marketplace.
- Zero breaches: PentAGI, an open-source multi-agent autonomous penetration testing framework, ran twenty AI agents against the Helixar platform. None got in.
The Validation We Didn't Want to Be Right About
The Anthropic report on Claude Mythos Preview contains a sentence that should be pinned to the wall of every CISO office in the world:
“Mitigations whose security value comes primarily from friction rather than hard barriers may become considerably weaker against model-assisted adversaries.”
That is the exact architectural argument we have been making since we started.
Friction doesn't work against a frictionless attacker.
An autonomous agent doesn't get tired. It doesn't make mistakes under time pressure. It doesn't abandon a campaign because it's taking too long. Every defence that relies on making the attack inconvenient rather than impossible gets weaker as attacker capability increases.
This is why we built for behavioural enforcement rather than faster signature databases. For session-level intent detection rather than perimeter controls. For cross-layer visibility rather than point solutions that see one surface at a time.
We hoped we were being paranoid. It turns out we were being accurate.
The core failure mode
Every defence built on signature matching, perimeter enforcement, or friction assumes an attacker with limited time, limited patience, and limited ability to iterate. Claude Mythos Preview eliminates all three constraints simultaneously. The window between vulnerability existence and active exploitation, which defenders have historically relied on to patch, is no longer a planning assumption you can build a security posture around.
What the Capability Curve Actually Means
Anthropic is not making Claude Mythos Preview generally available. They are explicit that the model's autonomous offensive capabilities are being restricted under their Responsible Scaling Policy while safety mitigations catch up.
But they are equally explicit that models with similar capabilities will become broadly available over time. The trajectory is clear: just months ago, frontier models couldn't find nontrivial vulnerabilities in production codebases at all. Now one is chaining zero-days to escape browser sandboxes autonomously.
The capability curve is steep. And it will not stop at Claude Mythos Preview.
The DARPA comparison
In 2016, the DARPA Cyber Grand Challenge demonstrated that AI systems could autonomously find and patch vulnerabilities in a controlled capture-the-flag environment. At the time, it was remarkable. It was also slow, heavily scaffolded, and limited to known vulnerability classes in artificial targets.
Ten years later, Claude Mythos Preview is doing the offensive version of that, against real operating systems, in real browsers, without scaffolding, without human guidance after the initial prompt. The rate of improvement dwarfs anything the CGC predicted.
Who Needs to Act Now
If you are running agentic infrastructure (autonomous agents with real credentials, MCP servers, AI systems with genuine permissions to read and write production data), you are operating with an attack surface that was designed before this threat existed.
The defenders who wait for the threat to become obvious before building infrastructure to address it will already be behind. Claude Mythos Preview isn't widely deployed. But the model that is widely deployed in six months will have capabilities directionally similar, and the organisations that haven't built detection infrastructure by then will face a gap they cannot close quickly.
We are entering paid pilots now. If you are running agentic infrastructure and want to understand your exposure before the next capability jump, we would like to talk.
This is the moment we built Helixar for.
References
- Anthropic. (2026). Claude Mythos Preview: Autonomous Vulnerability Research Capabilities Assessment. Anthropic Research Blog, April 2026. anthropic.com
- WIRED. (2026). Anthropic's New AI Finds and Exploits Zero-Days Without Human Guidance. WIRED, April 2026. wired.com
- The Hacker News. (2026). Claude Mythos Preview: Anthropic AI Chains Vulnerabilities to Escape Browser Sandbox. The Hacker News, April 2026. thehackernews.com
- Ars Technica. (2026). What Claude Mythos's Zero-Day Capabilities Mean for Enterprise Security. Ars Technica, April 2026. arstechnica.com
- Anthropic. (2025). Responsible Scaling Policy v1.3. Anthropic. anthropic.com/responsible-scaling-policy
- DARPA. (2016). Cyber Grand Challenge: Final Event Results. Defense Advanced Research Projects Agency. darpa.mil
- Google Project Zero. (2025). Automated Vulnerability Research: A Year in Review. Google Security Blog, December 2025. googleprojectzero.blogspot.com
- CrowdStrike. (2026). 2026 Global Threat Report. CrowdStrike Inc. crowdstrike.com
- Helixar Research Team. (2026). HDP: The Open Protocol That Gives AI Agents a Verifiable Chain of Authority. Helixar, March 2026. helixar.ai
- Helixar Labs. (2026). ReleaseGuard, Sentinel, and the MCP Security Checklist. Helixar Labs, March 2026. helixar.ai