12 stories tagged #GPT-5.4

  1. AI Models Disagree on Two-Thirds of Fact-Checks, Lenz Research Study Finds
    tech

    Lenz Research tests five AI models on 1,000 real claims and finds they disagree on 67% of them, raising questions about AI reliability in markets.

    4d ago 1 min read
  2. AI Models Disagree on Basic Facts Two-Thirds of the Time, Study Finds
    tech

    Five frontier AI systems agreed on only 328 out of 1,000 real-world fact-check claims, revealing deep reliability issues in the models' understanding of basic factual information.

    5d ago 2 min read
  3. Kimi WebBridge Lets AI Agents Drive Your Browser Locally
    tech

    Moonshot AI launches Kimi WebBridge, a browser extension for local AI agent automation, keeping data private.

    2w ago 1 min read
  4. OpenAI's GPT Image 2 vs. Google's Nano Banana 2: The Battle for AI Image Dominance
    tech

    OpenAI launches GPT Image 2 with native reasoning and superior text accuracy. We pit it against Google's Nano Banana 2 in a seven-category shootout.

    last mo. 3 min read
  5. OpenAI Launches ChatGPT for Clinicians, Claims Superior Performance in Medical Tasks
    health

    OpenAI introduces a specialized AI tool for healthcare professionals, claiming it outperforms human doctors on clinical tasks in benchmark tests.

    last mo. 1 min read
  6. Moonshot AI Unveils Kimi-K2.6: A New Frontier in AI with 1 Trillion Parameters
    tech

    Moonshot AI releases Kimi-K2.6, an advanced AI model with 1T parameters and optimized attention mechanisms, claiming superior performance over leading competitors.

    last mo. 1 min read
  7. AI Vulnerability Discovery Now Cheap and Accessible, Researchers Find
    tech

    A new study demonstrates that sophisticated AI exploits, once thought exclusive to advanced models, can now be replicated with readily available AI tools, raising security concerns.

    last mo. 1 min read
  8. Cloudflare Unveils New Tools for AI Agent Development and Scaling
    tech

    Cloudflare expands Agent Cloud with Dynamic Workers, Artifacts, and Sandboxes to enable developers to build and scale AI agents for production.

    last mo. 1 min read