
AI-generated image. No copyright claimed or implied.
Yesterday Anthropic published a report that stopped a lot of people in their tracks.
Claude Mythos Preview can autonomously find zero-day vulnerabilities in every major operating system and browser. It writes working exploits. It chains multiple vulnerabilities together to escape a browser sandbox.
It turned a 27-year-old OpenBSD memory safety bug, long considered unexploitable in practice, into a reliable remote crash. No human involvement after the initial prompt.
The security industry will spend this week writing hot takes. We want to say something different.
We saw this coming.
Not Claude Mythos Preview specifically. But the shape of it, the trajectory. That autonomous systems would eventually move faster than human defenders. That the threat surface would shift from “what vulnerabilities exist” to “how quickly can they be found and weaponised.” That the window between disclosure and exploitation would collapse entirely.
Helixar was founded on that premise.
- 27 years: the age of the OpenBSD bug Mythos Preview exploited from scratch
- 0 human decisions required after the initial prompt
- Every major OS and browser in scope for autonomous zero-day discovery
Where the Idea Came From
In early 2025, the security conversation hadn't caught up to where AI was heading. Most vendors were mapping AI to existing threat categories: AI-assisted phishing, deepfakes, slightly smarter malware. Incremental threats layered onto the existing model.
We weren't convinced that was the right frame.
The more interesting question wasn't “what new attacks does AI enable?” It was “what happens when the attacker is an agent?” When the thing probing your systems doesn't sleep, doesn't have a budget, can run a thousand parallel hypotheses simultaneously, and can chain together techniques that no human would have the patience to combine?
That is a qualitatively different problem. It requires a qualitatively different defence.
So we started building one.
What We Built and Why
Helixar has three core components. Each one exists because we anticipated a specific failure mode in how organisations would try to defend against agentic threats.
The Helixar Architecture
Endpoint layer
Sees what is actually happening on the machine: not just what processes report, but what the hardware is doing. A compromised agent that behaves correctly at the application layer can still leave traces elsewhere. You need visibility across the full stack.
API security layer
Purpose-built for agentic traffic. An MCP server isn't a web application. An agent-to-agent call doesn't look like a human user session. The threat model is different: prompt injection, session hijacking, orchestration abuse, tool poisoning. We were built for this. Existing HTTP tooling was not.
Correlation engine
Ties the layers together. An attack that starts at the API layer and pivots through the endpoint is invisible to tools that only see one surface. The kill chain spans layers. Detection has to as well.
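To make the correlation idea concrete, here is a minimal Python sketch of grouping events from different layers into a single chain. The `Event` shape, the session-keyed grouping, and the five-minute window are illustrative assumptions for this example, not Helixar's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Event:
    layer: str    # e.g. "api" or "endpoint" (illustrative layer names)
    session: str  # session / agent identifier shared across layers
    ts: float     # unix timestamp
    action: str

def correlate(events, window=300.0):
    """Group events that share a session id and fall within `window`
    seconds of the previous event, then keep only the chains that
    span more than one layer."""
    chains = {}
    for ev in sorted(events, key=lambda e: e.ts):
        chain = chains.setdefault(ev.session, [])
        if chain and ev.ts - chain[-1].ts > window:
            chain.clear()  # too much idle time: start a fresh chain
        chain.append(ev)
    # A cross-layer chain touches more than one surface
    return {s: c for s, c in chains.items()
            if len({e.layer for e in c}) > 1}
```

A single-surface tool sees only one of these event streams; the point of the sketch is that the cross-layer chain only becomes visible once both streams are keyed to the same session.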
Underneath all of it: a detection philosophy built around behaviour, not signatures. The specific vulnerability being exploited is less important than the pattern of what the attacker is trying to accomplish. Preparation. Positioning. Expansion. Objective. These stages are invariant regardless of the initial access method. Catch the behaviour, and the specific exploit becomes less relevant.
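As a rough illustration of behaviour-first detection, the stage-based idea can be sketched in a few lines of Python. The action names, the stage mapping, and the three-stage threshold below are invented for the example; they are not Helixar's detection logic.

```python
# Hypothetical mapping from observed actions to behavioural stages.
STAGE_OF = {
    "recon_scan": "preparation",
    "credential_read": "positioning",
    "lateral_move": "expansion",
    "data_exfil": "objective",
}
ORDER = ["preparation", "positioning", "expansion", "objective"]

def stages_reached(actions):
    """Return, in kill-chain order, the stages observed in a session."""
    seen = {STAGE_OF[a] for a in actions if a in STAGE_OF}
    return [s for s in ORDER if s in seen]

def is_attack_pattern(actions, threshold=3):
    """Flag a session that progresses through `threshold` or more
    stages, regardless of which exploit produced the actions."""
    return len(stages_reached(actions)) >= threshold
```

Note what the sketch does not contain: any reference to a CVE or signature. Swapping the initial exploit changes the first action, not the overall pattern.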
This is not an accident of design. It is the reason Helixar exists.
We Didn't Just Build a Product. We Published the Research.
Building a commercial platform wasn't enough. We wanted to contribute to the standards and open tooling the industry would need, even if most organisations hadn't accepted they needed them yet.
In March 2026 we published HDP, the Human Delegation Provenance Protocol, a CC BY 4.0 open specification for cryptographically recording and verifying human authorisation in agentic AI systems. When an AI agent acts on your behalf, there is today no standard record of what you actually authorised it to do. Claude Mythos Preview operating autonomously makes that gap acute: you need a verifiable chain of authority between what a human instructed and what the model did. HDP is the first open protocol designed to close that gap.
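To make the idea of a verifiable authorisation record concrete, here is a minimal Python sketch. HDP itself presumably specifies asymmetric signatures and a richer schema; the HMAC construction, field names, and scope check below are simplifying assumptions for a self-contained example, not the protocol.

```python
import hashlib
import hmac
import json
import time

def make_delegation(secret: bytes, human: str, agent: str, scope: list):
    """Create a tamper-evident record of what a human authorised an
    agent to do (HMAC stands in for a real signature here)."""
    record = {"human": human, "agent": agent, "scope": scope,
              "issued_at": time.time()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["mac"] = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return record

def verify_delegation(secret: bytes, record: dict, action: str) -> bool:
    """Check the record is intact, then check the requested action
    falls inside the authorised scope."""
    body = {k: v for k, v in record.items() if k != "mac"}
    payload = json.dumps(body, sort_keys=True).encode()
    intact = hmac.compare_digest(
        record["mac"],
        hmac.new(secret, payload, hashlib.sha256).hexdigest())
    return intact and action in record["scope"]
```

The useful property is the failure mode: an agent that drifts outside its authorised scope, or a record whose scope has been altered after issuance, both verify as false.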
We also publish ReleaseGuard, an MIT-licensed artifact security tool that scans release packages for embedded secrets, source maps, and unsigned binaries, then signs and attests the output with a full SBOM. Agents running in agentic infrastructure will be compromised through their tools and dependencies before they are compromised through their prompts. ReleaseGuard is the supply-chain integrity layer that sees this coming.
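The secret-scanning portion of that kind of tool can be illustrated in a few lines of Python. The patterns below are a deliberately tiny illustrative subset; a real scanner, ReleaseGuard included, would carry a far larger rule set plus entropy checks, and these regexes are not its rules.

```python
import re

# Illustrative detection rules only (assumed for this example).
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_token": re.compile(
        r"(?i)(?:api[_-]?key|token)\s*[:=]\s*['\"][A-Za-z0-9_\-]{20,}['\"]"),
}

def scan_text(name: str, text: str):
    """Return (filename, rule) findings for secrets embedded in text."""
    return [(name, rule) for rule, pat in SECRET_PATTERNS.items()
            if pat.search(text)]
```

Run over every file in a release package before signing, a scan like this blocks the attestation step whenever it returns a non-empty finding list.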
These are not marketing artefacts. They are substantive contributions to the open infrastructure a post-Mythos world will need, published before the report existed and available to anyone to implement.
Helixar open research
- HDP (CC BY 4.0): Human Delegation Provenance Protocol, v0.1. Open specification for cryptographic authorisation chains in agentic AI.
- ReleaseGuard (MIT licence): artifact scanning, secret detection, and SBOM attestation for AI developer tooling, available on the GitHub Marketplace.
- Zero breaches: PentAGI, an open-source multi-agent autonomous penetration testing framework, ran twenty AI agents against the Helixar platform. None got in.
The Validation We Didn't Want to Be Right About
The Anthropic report on Claude Mythos Preview contains a sentence that should be pinned to the wall of every CISO office in the world:
“Mitigations whose security value comes primarily from friction rather than hard barriers may become considerably weaker against model-assisted adversaries.”
That is the exact architectural argument we have been making since we started.
Friction doesn't work against a frictionless attacker.
An autonomous agent doesn't get tired. It doesn't make mistakes under time pressure. It doesn't abandon a campaign because it's taking too long. Every defence that relies on making the attack inconvenient rather than impossible gets weaker as attacker capability increases.
This is why we built for behavioural enforcement rather than faster signature databases. For session-level intent detection rather than perimeter controls. For cross-layer visibility rather than point solutions that see one surface at a time.
We hoped we were being paranoid. It turns out we were being accurate.
The core failure mode
Every defence built on signature matching, perimeter enforcement, or friction assumes an attacker with limited time, limited patience, and limited ability to iterate. Claude Mythos Preview eliminates all three constraints simultaneously. The window between vulnerability existence and active exploitation, which defenders have historically relied on to patch, is no longer a planning assumption you can build a security posture around.
What the Capability Curve Actually Means
Anthropic is not making Claude Mythos Preview generally available. They are explicit that the model's autonomous offensive capabilities are being restricted under their Responsible Scaling Policy while safety mitigations catch up.
But they are equally explicit that models with similar capabilities will become broadly available over time. The trajectory is clear: just months ago, frontier models couldn't find nontrivial vulnerabilities in production codebases at all. Now one is chaining zero-days to escape browser sandboxes autonomously.
The capability curve is steep. And it will not stop at Claude Mythos Preview.
The DARPA comparison
In 2016, the DARPA Cyber Grand Challenge demonstrated that AI systems could autonomously find and patch vulnerabilities in a controlled capture-the-flag environment. At the time, it was remarkable. It was also slow, heavily scaffolded, and limited to known vulnerability classes in artificial targets.
Ten years later, Claude Mythos Preview is doing the offensive version of that, against real operating systems, in real browsers, without scaffolding, without human guidance after the initial prompt. The rate of improvement dwarfs anything the CGC predicted.
Who Needs to Act Now
If you are running agentic infrastructure (autonomous agents with real credentials, MCP servers, AI systems with genuine permissions to read and write production data), you are operating with an attack surface that was designed before this threat existed.
The defenders who wait for the threat to become obvious before building infrastructure to address it will already be behind. Claude Mythos Preview isn't widely deployed. But the model that is widely deployed in six months will have capabilities directionally similar, and the organisations that haven't built detection infrastructure by then will face a gap they cannot close quickly.
We are entering paid pilots now. If you are running agentic infrastructure and want to understand your exposure before the next capability jump, we would like to talk.
This is the moment we built Helixar for.
References
- Anthropic. (2026). Claude Mythos Preview: Autonomous Vulnerability Research Capabilities Assessment. Anthropic Research Blog, April 2026. anthropic.com
- WIRED. (2026). Anthropic's New AI Finds and Exploits Zero-Days Without Human Guidance. WIRED, April 2026. wired.com
- The Hacker News. (2026). Claude Mythos Preview: Anthropic AI Chains Vulnerabilities to Escape Browser Sandbox. The Hacker News, April 2026. thehackernews.com
- Ars Technica. (2026). What Claude Mythos's Zero-Day Capabilities Mean for Enterprise Security. Ars Technica, April 2026. arstechnica.com
- Anthropic. (2025). Responsible Scaling Policy v1.3. Anthropic. anthropic.com/responsible-scaling-policy
- DARPA. (2016). Cyber Grand Challenge: Final Event Results. Defense Advanced Research Projects Agency. darpa.mil
- Google Project Zero. (2025). Automated Vulnerability Research: A Year in Review. Google Security Blog, December 2025. googleprojectzero.blogspot.com
- CrowdStrike. (2026). 2026 Global Threat Report. CrowdStrike Inc. crowdstrike.com
- Helixar Research Team. (2026). HDP: The Open Protocol That Gives AI Agents a Verifiable Chain of Authority. Helixar, March 2026. helixar.ai
- Helixar Labs. (2026). ReleaseGuard, Sentinel, and the MCP Security Checklist. Helixar Labs, March 2026. helixar.ai