Researching the Reliability of ChatGPT-4, 4 Plus and Google Gemini in Responding to Questions About the Management of Lower Urinary Tract Trauma: Cross-Sectional Study

Yunus ÇOLAKOĞLU; Ali AYTEN

doi:10.5336/urology.2024-107872


Journal of Reconstructive Urology


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.



Yunus ÇOLAKOĞLU^a , Ali AYTEN^b
^aBahçelievler Memorial Hospital, Clinic of Urology, İstanbul, Türkiye
^bGaziosmanpaşa Training and Research Hospital, Clinic of Urology, İstanbul, Türkiye

J Reconstr Urol. 2024;14(3):102-6


Article Language: EN

Free Access

ABSTRACT
Objective: The emergence of artificial intelligence chatbots in recent years has dramatically changed the landscape of education and access to knowledge. Because these large language models (LLMs) are so new, their role within medicine is still relatively unclear, and whether such systems can replace the knowledge and experience of trained doctors is highly debated. Lower urinary tract trauma is a complex emergency to manage, and first-responding physicians often need consultation. In this study, we aimed to investigate whether three LLMs (ChatGPT-4, ChatGPT-4 Plus and Google Gemini) are sufficiently reliable in the management of lower urinary tract trauma.

Material and Methods: The recommendation tables of the bladder, urethral and genital trauma management section of the European Association of Urology 2024 guidelines were analyzed. Recommendations rated strong or weak were converted into questions. The questions were posed in English to ChatGPT-4, ChatGPT-4 Plus and Google Gemini. The answers were evaluated by two surgeons experienced in urogenital trauma, and the mean scores were calculated and subjected to statistical analysis.

Results: In total, 27 questions were included in the study. ChatGPT-4 and ChatGPT-4 Plus were more successful than Google Gemini in urethral trauma management, whereas Google Gemini and ChatGPT-4 Plus performed statistically better in bladder trauma (p<0.001). Overall, ChatGPT-4 and Google Gemini gave correct and sufficient answers to 81.4% of the questions, while this rate was 88.8% for ChatGPT-4 Plus (p=0.618).

Conclusion: Our study confirms that ChatGPT-4, ChatGPT-4 Plus and Google Gemini are reliable resources in lower urinary tract trauma management. In the future, they may become a platform that helps healthcare professionals determine the emergency treatment plan and manage trauma.

Keywords: Lower urinary tract traumas; ChatGPT; Google Gemini
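As a reader-side sanity check, the overall rates in the Results can be related back to the 27 questions. The following is a minimal sketch, not the authors' analysis: it assumes that each answer was counted as either correct-and-sufficient or not, and that percentages were truncated (not rounded) to one decimal place. Neither assumption is stated in the abstract, and the function name `candidate_counts` is hypothetical.

```python
TOTAL_QUESTIONS = 27  # number of questions stated in the abstract


def candidate_counts(reported_pct, total=TOTAL_QUESTIONS):
    """Return every count k (0..total) whose percentage k/total,
    truncated to one decimal place, equals reported_pct.

    Assumption (not from the source): percentages were truncated,
    and each answer was scored as correct-and-sufficient or not.
    """
    target_tenths = round(reported_pct * 10)  # e.g. 81.4 -> 814
    matches = []
    for k in range(total + 1):
        # Integer arithmetic avoids floating-point edge cases:
        # k * 1000 // total is the percentage in tenths, truncated.
        if k * 1000 // total == target_tenths:
            matches.append(k)
    return matches


if __name__ == "__main__":
    for label, pct in [("ChatGPT-4 / Google Gemini", 81.4),
                       ("ChatGPT-4 Plus", 88.8)]:
        print(label, pct, candidate_counts(pct))
```

Under these assumptions, 81.4% is consistent with 22 of 27 questions and 88.8% with 24 of 27. The sketch does not attempt to reproduce the reported p-values, because the abstract does not name the statistical test that was used.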


