ChatGPT-3.5 Shows Moderate Accuracy on Medical Genetics Exam, Research Reveals

Written By :  Dr Riya Dave
Medically Reviewed By :  Dr. Kamal Kant Kohli
Published On 2025-08-01 14:45 GMT   |   Update On 2025-08-01 14:45 GMT

Researchers have found in a new study that ChatGPT-3.5 performed with moderate accuracy and stable results on a specialist medical genetics exam, indicating potential for educational support. However, its limitations in handling complex, domain-specific reasoning highlight the need for continued advancement before wider application. The study was published in the journal Laboratory Medicine by Klaudia Paruzel and colleagues.

To test the performance of ChatGPT-3.5, the researchers selected 456 questions from the Polish national specialist exam in medical laboratory genetics that are available online. The questions were classified by topic (genetic changes, diagnostic techniques, clinical cases, and calculations) and by complexity (simple or complex). Each question was submitted to ChatGPT three times to assess not only the correctness but also the consistency of responses across repeated rounds of interaction. Statistical tests were then used to compare differences in performance by category, complexity level, and repetition.


Key Findings

  • Overall Accuracy: ChatGPT correctly answered 59% of the questions, a result that was statistically significant (P < .001).

By Category:

  • Calculation-based questions: 71% accuracy

  • Genetic methods and genetic changes: ~60% accuracy

  • Clinical case–based questions: 37% accuracy

By Complexity:

  • Simple questions: 63% accuracy

  • Complex questions: 43% accuracy (P = .001)

  • Consistency: The AI model performed consistently across the three repeated sessions (P = .43), reflecting reliable output even under repeated questioning.

The research concluded that although ChatGPT-3.5 demonstrates moderate accuracy and stable performance in answering medical laboratory genetics exam questions, it falls short in complex and clinical case–based reasoning. Consequently, the current version may assist in education but is not yet adequate for advanced or high-stakes applications in genetic medicine. Further advances in AI reasoning and domain adaptation will be required before such tools can be introduced into professional medical education or practice with confidence.

Reference:

Klaudia Paruzel, Michał Ordak. Assessment of ChatGPT-3.5 performance on the medical genetics specialist exam. Laboratory Medicine. 2025;lmaf038. https://doi.org/10.1093/labmed/lmaf038

Article Source : Laboratory Medicine
