A head-to-head UK study of seven commercially available AI tools for detecting lung cancer on chest radiographs has revealed dramatic differences in diagnostic accuracy, raising critical questions for hospitals considering deployment.
The retrospective study, conducted at a single UK center, analyzed over 5,200 chest X-rays from primary care patients. Lung cancer was confirmed in 1.4% of cases. Performance varied widely across the AI systems: sensitivity ranged from 20.8% to 77.8%, specificity from 58.9% to 98.4%, and positive predictive value from just 1.5% to 28.4%.
Area under the ROC curve varied from 0.80 to 0.94. The number of false-positive findings also varied widely, from 10 to over 2,000 additional cases flagged incorrectly, depending on the system. Agreement between devices was minimal, with different tools often flagging different patients as suspicious.
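To see why positive predictive value stays low even for tools with high specificity, it helps to work through the arithmetic at a 1.4% cancer rate. The sketch below is illustrative only. The function names are made up for this example, the scan count is rounded to 5,200, and the two sensitivity/specificity pairings are hypothetical: the study summary reports only ranges across devices, not which sensitivity went with which specificity. The false-positive totals it prints are raw counts under those assumptions and are not the "additional" false positives reported in the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence                  # share of all scans: cancer and flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # share of all scans: no cancer but flagged
    return true_pos / (true_pos + false_pos)


def expected_false_positives(specificity, prevalence, n_scans):
    """Expected number of cancer-free scans flagged as suspicious."""
    return (1.0 - specificity) * (1.0 - prevalence) * n_scans


PREVALENCE = 0.014  # 1.4% confirmed lung cancer, from the study summary
N_SCANS = 5200      # assumption: "over 5,200" rounded down

# Hypothetical pairings built from the ends of the reported ranges.
scenarios = {
    "high sensitivity, low specificity": (0.778, 0.589),
    "low sensitivity, high specificity": (0.208, 0.984),
}

for label, (sens, spec) in scenarios.items():
    print(
        f"{label}: PPV = {ppv(sens, spec, PREVALENCE):.1%}, "
        f"expected false positives = "
        f"{expected_false_positives(spec, PREVALENCE, N_SCANS):.0f}"
    )
```

Under these assumptions, the high-sensitivity pairing flags roughly 2,100 cancer-free scans and yields a PPV under 3%, while the high-specificity pairing flags fewer than 100 but catches only about one cancer in five. Both PPV values fall inside the range reported above, which illustrates how strongly a low prevalence drags PPV down.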
The findings come as the NHS expands AI investment, including a £21 million government allocation in 2023 for imaging AI. The UK faces a 29% radiologist shortfall, and AI is seen as a potential solution for prioritizing urgent scans and improving early detection. However, the study suggests these tools should not be considered interchangeable, and that higher sensitivity often comes with a significant trade-off in false positives, potentially increasing unnecessary follow-ups and workload.