The CrowdStrike 2026 Global Threat Report, published in February 2026, documented the fastest-ever recorded adversary breakout time: 27 seconds from initial access to lateral movement inside a target network. The median breakout time across all intrusions fell to under three minutes. At that speed, human-paced detection and response, the model on which virtually all enterprise security operations are built, is structurally insufficient. And behind those numbers is a force multiplier that was not part of the threat model when those security stacks were designed: agentic AI.
APT31 and Google Gemini: Confirmed AI-Enabled Reconnaissance
In February 2026, The Register and The Hacker News reported on a confirmed disclosure from Google's Threat Intelligence Group: APT31, the Chinese state-affiliated threat actor, had been observed using Google Gemini to assist in cyberattack planning and reconnaissance operations targeting US government and defence infrastructure.
The disclosed activity showed APT31 operators prompting Gemini with a “cybersecurity expert” persona, using it to research attack methodologies, enumerate target infrastructure details from publicly available sources, and assist in drafting attack scripts. The model had been fine-tuned with safety guidelines designed to prevent exactly this use — and APT31 operators had developed prompting strategies that partially circumvented those guidelines, obtaining operationally useful output.
Google stated that it had blocked the relevant accounts and reported the activity to the appropriate authorities. But the disclosure confirmed what the security community had been anticipating: nation-state actors with significant resources were actively integrating commercial AI capabilities into their offensive operations, and the primary constraint on this usage was not the effectiveness of AI safety guidelines but the creativity of the adversary's prompting strategy.
This was not the first such disclosure. Microsoft and OpenAI had jointly published in February 2024 that state-affiliated actors, including groups associated with Russian, Chinese, North Korean, and Iranian intelligence services, had been observed using LLMs for scripting, vulnerability research, and phishing content generation. The 2026 APT31 report represented an escalation: from LLM-assisted attack planning to documented operational use in a confirmed active intrusion campaign.
Anthropic's November 2025 Disclosure: Thirty Targets, Zero Traditional Indicators
In November 2025, Anthropic published a transparency report disclosing that it had identified and disrupted approximately thirty active intrusion campaigns against enterprise targets in the technology, financial services, and government sectors, in which the attackers used Claude to assist their operations.
The technical detail in the disclosure was notable for what it revealed about the operational structure of these attacks. The intrusion chains involved, on average, four to six human decision points across the entire operation: the initial targeting decision, the access strategy, the exfiltration prioritisation, the output review, and the exit. Everything between those decision points was executed autonomously by the AI agent.
This is a fundamental change to the attacker's operational tempo. Human attackers operating manually, even skilled and well-tooled threat actors, require time at each step of an intrusion: reconnaissance, access establishment, enumeration, exfiltration, clean-up. That time is the window in which defenders can detect and respond. An agentic operation that requires only four to six human decisions and executes the rest autonomously compresses that window toward zero.
“Four to six human decision points across an entire enterprise intrusion. The rest: fully automated reconnaissance, lateral movement, exfiltration, and clean-up. This is not the attacker model any current SIEM was designed to detect.”
CrowdStrike 2026: The Numbers Are Not Ambiguous
The CrowdStrike 2026 Global Threat Report documented an 89% year-over-year increase in AI-enabled cyberattacks. The report, described by CrowdStrike as representing the most significant documented escalation in adversary capability in the company's operational history, attributed the increase to three structural factors:
- Access democratisation: Commercial AI platforms have eliminated the skill floor for sophisticated attack operations. Capabilities that previously required nation-state-level technical expertise (realistic phishing content, vulnerability script generation, automated reconnaissance) are now accessible to commodity eCrime groups.
- Operational velocity: AI-assisted and agentic operations execute significantly faster than human-operated intrusions. The 27-second breakout time documented in the report was enabled by AI-assisted lateral movement tooling.
- Evasion sophistication: AI-generated malware, credential scripts, and social engineering content are substantially harder for signature-based detection to classify than human-authored equivalents, because they lack the characteristic patterns that detection models were trained on.
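The evasion point can be made concrete with a toy sketch. The builder below is purely illustrative (it is not how LAMEHUG or any real malware works): it shows why a hash-based signature learned from one variant never matches the next, even when the two are behaviourally identical.

```python
import hashlib

def make_variant(core: bytes, seed: int) -> bytes:
    """Toy stand-in for a polymorphic builder: the functional core is
    identical across deployments, but each build carries unique bytes.
    Real polymorphic malware mutates the code itself; the effect on
    hash-based signatures is the same."""
    return core + seed.to_bytes(4, "big")

core = b"PAYLOAD_CORE_LOGIC"      # same behaviour in every variant
a = make_variant(core, 1)
b = make_variant(core, 2)

# The AV vendor "learns" variant a and adds its hash to the signature DB.
signature_db = {hashlib.sha256(a).hexdigest()}

# Variant b shares a's behaviour but matches no known signature.
hit = hashlib.sha256(b).hexdigest() in signature_db
print(hit)  # False
```

One hash per variant means the defender's signature database can never catch up: the attacker mints a new "unknown" binary for every target at negligible cost.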
CSO Online, in its analysis of the report, characterised AI-enabled attacks as the top enterprise security threat of 2026, above ransomware, supply chain attacks, and nation-state intrusions in isolation, precisely because AI acts as a force multiplier for all of the above.
The Named Actors: FANCY BEAR, PUNK SPIDER, FAMOUS CHOLLIMA
The 2026 CrowdStrike report and associated disclosures from CISA named specific threat actor groups deploying AI capabilities in documented operations:
FANCY BEAR (APT28, Russia): The Russian GRU-affiliated group was documented using AI-generated credential extraction scripts against European government and defence targets. The scripts were generated by an LLM and subsequently refined through multiple automated iterations, a development cycle that compressed what would previously have taken days of skilled operator time into hours.
PUNK SPIDER (eCrime): The PUNK SPIDER group deployed LAMEHUG malware, documented in the CrowdStrike 2026 report as the first commercially distributed malware with AI-generated, per-target polymorphic payloads. Each deployment generated a unique binary variant, rendering signature-based antivirus detection ineffective across the entire fleet of variants in circulation.
FAMOUS CHOLLIMA (North Korea): The Lazarus Group-affiliated actor continued its pattern of financially motivated intrusions with a notable escalation: the February 2025 Bybit exchange breach, attributed to FAMOUS CHOLLIMA, resulted in a confirmed $1.46 billion cryptocurrency theft, the largest single crypto theft on record. The operation used AI-generated insider personas (fraudulent IT workers with AI-enhanced credentials and interview performance) to establish legitimate access to target organisations, bypassing the pre-employment vetting that organisations rely on as a first line of defence.
The Common Structural Gap
Across every documented nation-state and advanced eCrime operation in this wave, a single structural characteristic is consistent: the attacks use authorised AI APIs, produce no malware binaries that signature detection can classify, and look, from every conventional security monitoring perspective, like legitimate agentic operations.
FANCY BEAR's credential scripts were generated by a legitimate LLM API and executed by a process that looked like developer tooling. PUNK SPIDER's polymorphic payloads were generated by AI and produced no signature match against any existing database. FAMOUS CHOLLIMA's insider personas passed standard background check procedures precisely because the AI enhancement was applied to legitimate human identities, not fabricated from nothing.
The CISA December 2025 joint advisory, co-signed by the NSA, FBI, and equivalent agencies across the Five Eyes alliance, acknowledged this structural shift explicitly: existing detection paradigms were designed for attacks that leave observable artefacts in process behaviour, file system state, and network traffic. AI-native attacks are designed, whether intentionally or as a byproduct of how AI tooling works, to leave no such artefacts. The advisory called for new detection approaches grounded in behavioural analysis rather than indicator matching.
What Detection Looks Like at This Threat Level
The detection challenge posed by AI-enabled nation-state intrusions is not a harder version of the existing problem. It is a different problem.
Signature-based detection fails because AI-generated attack artefacts have no known signature. Rule-based correlation fails because the operations comply with the surface-level rules — they use legitimate APIs, legitimate credentials, and legitimate processes. Volume-based anomaly detection fails because the operations are designed to operate within expected parameters until a specific phase of the intrusion is complete.
The detection surface that remains is behavioural sequence: the relationship between what an agent was authorised to do and what it actually did; the relationship between the pattern of actions taken and what a legitimate operation would look like; the correlation of signals across the endpoint and API layers that, individually, appear normal, but together describe a chain that no legitimate operation would produce.
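The sequence idea can be sketched in a few lines, assuming a hypothetical agent whose normal workflow is known. Every action name below is invented for illustration; the point is that each action is individually authorised, and only the ordering is anomalous.

```python
# Transitions observed during the agent's normal operation (its baseline).
# Each individual action is authorised; the baseline captures *sequence*.
BASELINE_BIGRAMS = {
    ("read_ticket", "query_db"),
    ("query_db", "draft_reply"),
    ("draft_reply", "send_reply"),
}

def anomalous_transitions(actions):
    """Return every consecutive action pair never seen in the baseline."""
    return [
        (a, b) for a, b in zip(actions, actions[1:])
        if (a, b) not in BASELINE_BIGRAMS
    ]

# Legitimate session: every transition matches the baseline.
ok = anomalous_transitions(
    ["read_ticket", "query_db", "draft_reply", "send_reply"])
print(ok)   # []

# Redirected agent: same authorised actions, intrusion-shaped ordering
# (repeated enumeration, then output with no drafting step).
bad = anomalous_transitions(
    ["read_ticket", "query_db", "query_db", "send_reply"])
print(bad)  # [('query_db', 'query_db'), ('query_db', 'send_reply')]
```

A production system would model sequences statistically rather than with an exact bigram set, but the detection surface is the same: not *what* the agent did, but the shape of the chain it did it in.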
This is the architecture Helixar was built to provide. Helixar Vigil establishes the behavioural baseline for every agent in the deployment environment and monitors continuously for deviation, capturing the process chain, the credential access sequence, the API call pattern. Helixar Shield applies intent analysis at the API boundary, distinguishing between calls that fit the agent's established operational profile and calls that suggest the agent has been redirected. Nexus correlates signals across both layers and escalates the full chain (trigger, execution, outcome) to a human operator, with every piece of evidence required to make an informed decision, before irreversible action completes.
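The cross-layer correlation pattern described above can be illustrated with a minimal sketch. This is not Helixar's actual implementation, and all identifiers, signal names, and the 60-second window are invented: it shows only the core logic of escalating when the endpoint layer and the API layer flag the same agent within one window.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    agent_id: str
    layer: str      # "endpoint" or "api" (illustrative layer names)
    detail: str
    ts: float       # seconds since epoch

WINDOW = 60.0  # correlation window in seconds; arbitrary for the sketch

def correlate(signals):
    """Return (agent_id, chain) for each agent flagged on both layers
    within WINDOW. Either signal alone stays below the alert threshold."""
    escalations = []
    by_agent: dict[str, list[Signal]] = {}
    for s in sorted(signals, key=lambda s: s.ts):
        chain = by_agent.setdefault(s.agent_id, [])
        chain.append(s)
        recent = [c for c in chain if s.ts - c.ts <= WINDOW]
        if {c.layer for c in recent} >= {"endpoint", "api"}:
            escalations.append((s.agent_id, recent))
    return escalations

alerts = correlate([
    Signal("agent-7", "endpoint", "credential read outside baseline", 10.0),
    Signal("agent-7", "api", "bulk-export call off operational profile", 35.0),
    Signal("agent-3", "api", "single off-profile call, no endpoint signal", 40.0),
])
print([agent for agent, chain in alerts])  # only agent-7 escalates
```

The escalation carries the full chain of contributing signals, which is what lets a human operator judge the trigger, execution, and intended outcome in one review rather than reconstructing them from separate logs.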
The Anthropic disclosure documented an intrusion model with four to six human decision points. Helixar's detection model is designed to surface an anomaly to a human operator before the second decision point is reached.
References
- The Register. (2026). APT31 used Google Gemini to plan cyberattacks against US targets. The Register, February 2026. theregister.com
- The Hacker News. (2026). Google: State-Backed Hackers Used Gemini AI for Cyberattack Reconnaissance. The Hacker News, February 2026. thehackernews.com
- CrowdStrike. (2026). 2026 CrowdStrike Global Threat Report. CrowdStrike Inc. crowdstrike.com
- BusinessWire. (2026). CrowdStrike 2026 Global Threat Report: 89% Increase in AI-Enabled Attacks. BusinessWire, February 2026. businesswire.com
- CSO Online. (2026). Top 5 AI-Enabled Cybersecurity Threats of 2026. CSO Online, February 2026. csoonline.com
- CISA / NSA / FBI et al. (2025). Joint Advisory: AI-Enabled Threats to Critical Infrastructure — Detection and Mitigation Guidance. December 2025. cisa.gov
- Anthropic. (2025). Transparency Report: Disrupted AI-Enabled Intrusion Campaigns. Anthropic, November 2025.
- Microsoft & OpenAI. (2024). Disrupting malicious uses of AI by state-affiliated threat actors. Microsoft Security Blog, February 2024.