An artificial intelligence chatbot designed to provide antibiotic prescribing advice using local hospital protocols shows potential but is not yet safe for routine clinical work.
The system uses a retrieval-augmented generation approach, restricting a large language model to answer questions based solely on hospital-specific antimicrobial guidelines.
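As a rough illustration of the retrieval-augmented idea described above, the sketch below picks the guideline passage that best matches a question and builds a prompt restricting the model to that passage. It is a toy under stated assumptions, not the study's implementation: the passage texts, `GUIDELINE_PASSAGES`, `MIN_OVERLAP`, and every function name are hypothetical, the word-overlap scoring stands in for whatever retrieval method the real system uses, and no language model is actually called.

```python
import re

# Hypothetical guideline snippets; placeholders only, not real prescribing advice.
GUIDELINE_PASSAGES = {
    "uti-adult": "Uncomplicated urinary tract infection in adults: see local first-line agent table.",
    "cap-adult": "Community-acquired pneumonia in adults: assess severity score before choosing therapy.",
}

MIN_OVERLAP = 2  # assumed cut-off for treating a question as in scope


def tokenize(text):
    """Lowercase the text and return its alphabetic words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question):
    """Return (passage_id, overlap) for the passage sharing the most words with the question."""
    q = tokenize(question)
    scored = [(pid, len(q & tokenize(text))) for pid, text in GUIDELINE_PASSAGES.items()]
    return max(scored, key=lambda pair: pair[1])


def build_prompt(question):
    """Build a guideline-restricted prompt, or return None when no passage matches well enough."""
    pid, overlap = retrieve(question)
    if overlap < MIN_OVERLAP:
        return None  # treated as out of scope: decline instead of letting the model guess
    return (
        "Answer using ONLY the guideline excerpt below. "
        "If it does not cover the question, say so.\n\n"
        f"[{pid}] {GUIDELINE_PASSAGES[pid]}\n\n"
        f"Question: {question}"
    )
```

A crude word-overlap threshold like this would misfire on common words, so it only shows where a scope check could sit in such a pipeline.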
In an evaluation of 200 simulated clinical cases, the chatbot attempted to answer 93% of queries. Of those attempted answers, 87% were fully correct and 5% were entirely incorrect. Accuracy dropped significantly in complex cases involving renal impairment.
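Because the correctness figures apply only to the queries the chatbot attempted, it can help to convert the percentages above into approximate case counts. The short calculation below does that. The counts are back-computed from rounded percentages, so they are estimates and not figures reported by the study, and the variable names are illustrative.

```python
TOTAL_CASES = 200  # simulated clinical cases in the evaluation

# Percentages as reported in the summary; counts are rounded estimates.
attempted = round(TOTAL_CASES * 0.93)      # queries the chatbot tried to answer
fully_correct = round(attempted * 0.87)    # share of attempted answers that were fully correct
incorrect = round(attempted * 0.05)        # share of attempted answers that were incorrect
uncategorized = attempted - fully_correct - incorrect  # remainder not broken down in the summary

print(f"attempted: {attempted} of {TOTAL_CASES}")
print(f"fully correct: {fully_correct}")
print(f"incorrect: {incorrect}")
print(f"remaining attempted answers: {uncategorized}")
```

On these estimates, about 186 queries were attempted, of which roughly 162 were fully correct and about 9 were incorrect, leaving around 15 attempted answers that the summary does not classify.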
A second test using 66 real questions from infection specialists found 81% of answers were fully correct. However, some responses included inappropriate antibiotic choices or dosing errors. The model also failed to recognize when a question was outside its scope roughly half the time.
Notably, the guideline-grounded system vastly outperformed general AI models. A locally deployed model without retrieval support produced only 11% fully correct answers, while a more advanced general-purpose model reached 46% when prompted for local advice.
Response generation typically took 10 to 15 seconds, which suggests the tool could be practical for real-time clinical use if its safety improves.