Open-Source Project Reverse Engineers Anthropic's Most Dangerous AI

A developer known as Kye Gomez has released OpenMythos, an open-source project that reconstructs the architecture of Anthropic's Claude Mythos, the company's most powerful but locked-away AI model.

Mythos first surfaced in late March when Anthropic accidentally revealed documents describing it. During Mozilla testing, Mythos autonomously discovered 271 Firefox vulnerabilities and completed a 32-step corporate network attack simulation, making it the first AI to achieve such feats. Anthropic responded by sealing it inside Project Glasswing, a coalition of around 40 partners including Microsoft, Apple, Amazon, and the NSA.

OpenMythos speculates that Mythos uses a Recurrent-Depth Transformer, also known as a looped transformer. This architecture runs a smaller stack of layers multiple times per forward pass, enabling deeper reasoning in continuous latent space before emitting a token. This would explain Mythos's exceptional problem-solving skills but uneven memorization-a hallmark of looped designs.

The project draws from recent research including Parcae, a University of California San Diego paper that solved instability in looped models, and incorporates DeepSeek's Multi-Latent Attention and a Mixture-of-Experts setup. However, OpenMythos contains no trained weights; it's theoretical code awaiting someone with substantial computing resources to train it.

This follows a recent study by Vidoc Security that replicated Mythos's vulnerability discoveries using off-the-shelf models for under $30 per scan. Together, these efforts suggest that the barriers around Mythos's capabilities may be lower than Anthropic suggests.