Artificial Intelligence (AI)-Assisted Systems for Pneumothorax Detection: Latest Cureus Update
A recent proof-of-concept study concluded that AI systems are promising for pneumothorax detection on chest radiographs but exhibit distinct diagnostic biases that must be carefully matched to the clinical context. Balanced performance models may be suitable for general screening, whereas high-sensitivity models may better support the triage workflow. The findings highlight that rigorous validation, integration strategies, and human supervision remain essential before deployment in real-world clinical practice.
This proof-of-concept study was published in December 2025 in the journal Cureus.
Introduction
Artificial intelligence (AI) systems for detecting pneumothorax continue to improve and, in some reports, can be faster and more accurate than clinicians reading radiographs. However, most such tools work with images alone, and their performance does not always carry over between settings: in one instance, a neural network model performed well on test images but not on real hospital images. This highlights the challenge of making AI generalize across clinical settings. Newer AI tools built on platforms such as Google Cloud Vertex AI have been reported to work well even for small pneumothoraces and can provide a second opinion. Beyond images, newer multimodal models incorporate text, other data, and even audio, and may assist with report writing and answering radiology questions. A survey found that although these models show potential, they also face challenges, including inadequate data, inaccuracies, and usability issues.
Study Overview
The authors conducted a preliminary comparative evaluation of two general-purpose multimodal models, GPT-4o and Gemini 2.5 Pro, for detecting pneumothorax on chest radiographs. This study aimed to benchmark the strengths and limitations of these models and assess the potential clinical contexts in which each approach may be valuable. The evaluation involved 2,000 frontal chest radiographs, evenly divided between cases with and without clinically diagnosed pneumothorax.
Both models were queried with a standardized prompt: "Given a frontal chest radiograph, analyze and determine evidence of pneumothorax. Look for visible pleural line without lung markings, deep sulcus sign, lung translucency asymmetry, and collapse signs."
- The models generated binary predictions for each radiograph: 0 (no pneumothorax) or 1 (pneumothorax present).
- The predictions were compared with the ground-truth labels to generate confusion matrices showing true positives, true negatives, false positives, and false negatives.
- The performance metrics included accuracy, precision, recall, and F1 score.
- 95% confidence intervals were computed for accuracy and recall, and bootstrapping was used to estimate the uncertainty of the F1 scores.
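The evaluation steps listed above can be sketched in a few lines of Python. This is a minimal illustration on made-up toy labels, not the authors' code: the function names are hypothetical, the article does not say which interval method was used (a Wilson score interval is assumed here), and the percentile bootstrap with 1,000 resamples is likewise an assumption.

```python
import math
import random

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = pneumothorax, 0 = none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def bootstrap_f1_interval(y_true, y_pred, n_resamples=1000, seed=0):
    """Percentile bootstrap interval for F1, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        tp, _tn, fp, fn = confusion_counts(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
        scores.append(f1_score(tp, fp, fn))
    scores.sort()
    return scores[int(0.025 * n_resamples)], scores[int(0.975 * n_resamples) - 1]

# Toy data for illustration only (not from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
recall_ci = wilson_interval(tp, tp + fn)       # interval for recall
accuracy_ci = wilson_interval(tp + tn, len(y_true))
f1_ci = bootstrap_f1_interval(y_true, y_pred)
```

With only eight toy cases the intervals are very wide; on a 2,000-image set such as the one described here they would be much narrower.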
Key findings:
Table 1 shows the comparative performance of the GPT-4o and Gemini models for pneumothorax detection. GPT-4o's confusion matrix shows more balanced performance, with moderate precision and specificity but a higher number of false negatives, which raises concerns for clinical triage scenarios where missed cases are critical. Gemini's confusion matrix shows substantially higher recall, indicating stronger sensitivity and an emphasis on minimizing missed cases of pneumothorax. Its fewer false negatives make it potentially more suitable for early screening contexts, although this comes at the cost of more false positives and lower precision than GPT-4o.
Table 1: Comparative performance of the GPT-4o and Gemini models for pneumothorax detection (2,000 chest radiographs: 1,000 with pneumothorax and 1,000 without)
| Performance Metrics | GPT-4o model | Gemini model |
|---|---|---|
| Overall Accuracy | 64% | 62% |
| Precision | 66% | 55% |
| Recall | 57% | 88% |
| F1 Score | 61% | 68% |
| True Positives | 571 pneumothorax cases correctly detected | 879 pneumothorax cases correctly detected |
| True Negatives | 710 no-pneumothorax cases correctly ruled out | 275 no-pneumothorax cases correctly ruled out |
| False Positives | 290 no-pneumothorax cases flagged as pneumothorax | 725 no-pneumothorax cases flagged as pneumothorax |
| False Negatives | 429 pneumothorax cases missed | 121 pneumothorax cases missed |
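As a quick consistency check, the percentage metrics can be recomputed from the confusion-matrix counts. The sketch below is illustrative only; the `metrics` helper is a hypothetical name, and Gemini's error counts are taken as 725 false positives and 121 false negatives, the assignment that is consistent with 1,000 cases per class and with the reported 88% recall and 55% precision.

```python
def metrics(tp, tn, fp, fn):
    """Derive standard classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # also called sensitivity
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Counts from Table 1; the Gemini FP/FN assignment is an assumption (see above).
gpt4o = metrics(tp=571, tn=710, fp=290, fn=429)
gemini = metrics(tp=879, tn=275, fp=725, fn=121)

for name, m in (("GPT-4o", gpt4o), ("Gemini", gemini)):
    print(name, {k: f"{v:.1%}" for k, v in m.items()})
```

Under these assumptions, the GPT-4o figures reproduce the table (about 64% accuracy, 66% precision, 57% recall, 61% F1), and Gemini's precision, recall, and F1 also match. Gemini's accuracy, however, works out to roughly 57.7% rather than the 62% listed, so that figure may be worth checking against the original paper.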
Potential Clinical Implications
The authors highlighted the complementary strengths of GPT-4o and Gemini 2.5 Pro in pneumothorax detection. Pneumothorax is an urgent condition in which delayed recognition can lead to cardiorespiratory collapse, so early identification on chest radiographs, especially in emergency settings, is vital. AI assistance may expedite detection and reduce errors during high-workload periods. GPT-4o's balanced performance seems suitable for general screening, whereas Gemini's high sensitivity is better suited to triage workflows, where minimizing missed diagnoses is critical. This shows the need to tailor AI deployment to the clinical context. Regulatory preparedness, validation, integration strategies, and human supervision are essential before clinical deployment.
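To make the trade-off concrete, the sketch below estimates how many cases each model would miss and how many false alarms it would raise per 1,000 radiographs. It is an illustration under stated assumptions, not a result from the study: the 5% prevalence is a hypothetical value chosen for the example (the study set was balanced at 50%), the function name is made up, and the sensitivity and specificity values are derived from the true-positive and true-negative counts in Table 1.

```python
def expected_errors(sensitivity, specificity, prevalence, n_studies=1000):
    """Expected missed cases and false alarms in a batch of radiographs."""
    positives = n_studies * prevalence
    negatives = n_studies - positives
    detected = positives * sensitivity
    missed = positives - detected
    false_alarms = negatives * (1 - specificity)
    flagged = detected + false_alarms
    return {
        "missed": missed,
        "false_alarms": false_alarms,
        # Share of flagged studies that truly show pneumothorax (PPV).
        "ppv": detected / flagged if flagged else 0.0,
    }

PREVALENCE = 0.05  # hypothetical assumption, not taken from the study

gpt4o = expected_errors(sensitivity=0.571, specificity=0.710, prevalence=PREVALENCE)
gemini = expected_errors(sensitivity=0.879, specificity=0.275, prevalence=PREVALENCE)

for name, r in (("GPT-4o", gpt4o), ("Gemini", gemini)):
    print(f"{name}: missed={r['missed']:.1f}, "
          f"false alarms={r['false_alarms']:.1f}, PPV={r['ppv']:.1%}")
```

At this assumed prevalence, the higher-sensitivity model misses fewer cases but flags far more normal studies, which is the kind of workload cost a triage deployment would need to absorb through human review.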
Reference: Chetla N, Patel S, Sharma S, et al. Evaluating AI Models for Pneumothorax Detection on Chest Radiographs: Diagnostic Accuracy and Clinical Trade-Offs. Cureus 17(12): e99298. Published 2025 Dec 15. DOI 10.7759/cureus.99298
Dr Bhumika Maikhuri is an orthodontist with 2 years of clinical experience. She is also working as a medical writer and anchor at Medical Dialogues. She has completed her BDS from Dr D.Y. Patil Medical College and Hospital and MDS from Kalinga Institute of Dental Sciences. She has a few publications and patents to her credit. Her diverse background in clinical dentistry and academic research uniquely positions her to contribute meaningfully to our team.