GPT-5.4
-
tech: AI Models Disagree on Two-Thirds of Fact-Checks, Lenz Research Study Finds
Lenz Research tests five AI models on 1,000 real claims, finding 67% disagreement, with implications for AI reliability in markets.
-
tech: AI Models Disagree on Basic Facts Two-Thirds of the Time, Study Finds
Five frontier AI systems agreed on only 328 out of 1,000 real-world fact-check claims, pointing to deep reliability problems in how the models handle basic factual information.
-
tech: Kimi WebBridge Lets AI Agents Drive Your Browser Locally
Moonshot AI launches Kimi WebBridge, a browser extension for local AI agent automation, keeping data private.
-
tech: OpenAI's GPT Image 2 vs. Google's Nano Banana 2: The Battle for AI Image Dominance
OpenAI launches GPT Image 2 with native reasoning and superior text accuracy. We pit it against Google's Nano Banana 2 in a seven-category shootout.
-
health: OpenAI Launches ChatGPT for Clinicians, Claims Superior Performance in Medical Tasks
OpenAI introduces a specialized AI tool for healthcare professionals, asserting that it outperforms human doctors on clinical tasks in benchmark tests.
-
tech: Moonshot AI Unveils Kimi-K2.6: A New Frontier in AI with 1 Trillion Parameters
Moonshot AI releases Kimi-K2.6, an advanced AI model with 1T parameters and optimized attention mechanisms, claiming superior performance over leading competitors.
-
tech: AI Vulnerability Discovery Now Cheap and Accessible, Researchers Find
A new study demonstrates that sophisticated AI exploits, once thought exclusive to advanced models, can now be replicated with readily available AI tools, raising security concerns.
-
tech: Cloudflare Unveils New Tools for AI Agent Development and Scaling
Cloudflare expands Agent Cloud with Dynamic Workers, Artifacts, and Sandboxes to enable developers to build and scale AI agents for production.