Google's AI Overviews feature, powered by Gemini, delivers inaccurate information at scale. A New York Times investigation with the startup Oumi found that the system gets about 90% of answers right, but that still means hundreds of thousands of false responses every minute.

Oumi assessed the AI with the SimpleQA benchmark, a test of factual accuracy, using more than 4,000 verified questions. When the system ran on Gemini 2.5, accuracy stood at 85%; after the upgrade to Gemini 3, it rose to 91%.

Even at 91% accuracy, however, roughly one answer in eleven is wrong, and at the volume of queries Google handles, that error rate translates into an enormous number of false answers.
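
To make that arithmetic concrete, here is a minimal Python sketch of how an error rate scales with volume. The accuracy figures (85% and 91%) come from the testing described above. The volume of 3,000,000 answers per minute is a hypothetical number chosen only for illustration and is not a figure from the investigation. The function name `wrong_answers_per_minute` is likewise an illustrative helper.

```python
def wrong_answers_per_minute(accuracy: float, answers_per_minute: int) -> int:
    """Expected number of incorrect answers per minute at a given accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    # Error rate is the complement of accuracy; round to a whole number of answers.
    return round((1.0 - accuracy) * answers_per_minute)


# ASSUMPTION: hypothetical volume for illustration only, not a reported figure.
ASSUMED_ANSWERS_PER_MINUTE = 3_000_000

for label, accuracy in [("Gemini 2.5", 0.85), ("Gemini 3", 0.91)]:
    wrong = wrong_answers_per_minute(accuracy, ASSUMED_ANSWERS_PER_MINUTE)
    print(f"{label}: {accuracy:.0%} accurate -> about {wrong:,} wrong answers per minute")
```

Under this assumed volume, moving from 85% to 91% accuracy cuts the number of wrong answers by 40%, yet the total still runs to hundreds of thousands per minute. A different assumed volume changes the totals but not that proportion.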

Examples of such errors include giving wrong dates related to Bob Marley’s former home and incorrectly denying the existence of the Classical Music Hall of Fame.

Despite ongoing improvements, Google’s AI remains prone to significant factual errors, raising concerns about trust in AI-generated search results.