Nearly Half of AI Health Advice Is Wrong, Study Finds

A new peer-reviewed study published in BMJ Open finds that nearly half of health answers from popular AI chatbots are wrong, misleading, or dangerously incomplete. Researchers from UCLA, the University of Alberta, and Wake Forest tested five major chatbots (Gemini, DeepSeek, Meta AI, ChatGPT, and Grok) on 250 health questions covering cancer, vaccines, nutrition, and athletic performance.

Overall, 49.6% of responses were rated problematic, including 19.6% rated highly problematic. Chatbots don't reason or weigh evidence; they pattern-match text from the internet, where misinformation spreads quickly.

Grok performed worst: 58% of its responses were rated problematic, significantly more than expected. Its poor performance is linked to X, a platform known for spreading health misinformation.

Questions about nutrition and athletic performance fared worst. No chatbot produced a fully accurate reference list; the median completeness score was just 40%. All responses were difficult to read, written above the recommended sixth-grade level.

Researchers call for public education, professional training, and regulatory oversight to ensure AI supports public health rather than erodes it.