2 stories tagged #SWE-Bench Pro

  1. Anthropic's Claude Fable 5 Shatters AI Benchmarks, Surpassing OpenAI's Leading Model
    tech

    Anthropic's new 'Mythos-class' Claude Fable 5 achieves a 161 on the Epoch Capabilities Index and dominates in software engineering tests, signaling a major shift in the enterprise AI landscape.

    last wk. 1 min read
  2. Kimi WebBridge Lets AI Agents Drive Your Browser Locally
    tech

    Moonshot AI launches Kimi WebBridge, a browser extension that runs AI agent automation locally and keeps data private.

    last mo. 1 min read