Study Urges Caution on AI Health Tools: ChatGPT Health Misses Critical Emergencies

Millions turn to AI for medical advice, but a new study urges caution. ChatGPT Health, while performing well for textbook cases, struggles with serious emergencies.

Research published in Nature found the AI tool underestimated the need for emergency care in more than half of critical situations. Lead author Ashwin Ramaswamy stated, "We wanted to answer... if someone is experiencing a real medical emergency and turns to ChatGPT Health for help, will it clearly tell them to go to the emergency room?"

While it recognized clear emergencies such as stroke, the model faltered when the danger was not immediately obvious. In one asthma scenario, it advised waiting even though it had identified early warning signs of respiratory failure.

The study also examined responses to users expressing self-harm intentions and found the guidance inconsistent. The intended safety net, a banner encouraging help-seeking, appeared unreliably, and in some cases showed up more reliably for users who had not specified a means of self-harm than for those who had.

Researchers do not advocate abandoning AI health tools. Co-author Alvira Tyagi said such tools must be integrated thoughtfully into care rather than treated as substitutes for clinical judgment. Experts advise seeking direct medical care for concerning symptoms such as chest pain, shortness of breath, severe allergic reactions, or changes in mental status, rather than relying solely on chatbot guidance. Because AI models are constantly evolving, ongoing review is needed to ensure that performance improvements translate into safer patient care.