3 stories tagged #SWE-bench Pro

  1. Anthropic's Claude Sonnet 5 Challenges Opus Model at a Lower Price Point
    tech

    Anthropic's new Claude Sonnet 5 model matches the performance of its top-tier Opus 4.8 on key benchmarks, offering developers near-equivalent intelligence at a significantly reduced cost.

    14h ago 2 min read
  2. Anthropic's Claude Fable 5 Shatters AI Benchmarks, Surpassing OpenAI's Leading Model
    tech

    Anthropic's new 'Mythos-class' Claude Fable 5 achieves a 161 on the Epoch Capabilities Index and dominates in software engineering tests, signaling a major shift in the enterprise AI landscape.

    2w ago 1 min read
  3. Kimi WebBridge Lets AI Agents Drive Your Browser Locally
    tech

    Moonshot AI launches Kimi WebBridge, a browser extension that lets AI agents automate your browser locally while keeping data private.

    last mo. 1 min read