Objective: To evaluate how the success of the freely accessible artificial intelligence chatbots ChatGPT-3.5, Copilot, and Gemini on multiple-choice questions in the field of ocular inflammation and uveitis varies with the language of the questions. Material and Methods: Thirty-six questions on ocular inflammation and uveitis were included in the study. After each question was translated into Turkish by a certified translator, both the English and Turkish versions were submitted to the artificial intelligence programs. The answers were classified as correct or incorrect by comparison with the answer key, and the rates of correct answers were compared statistically. Results: ChatGPT-3.5, Copilot, and Gemini answered 63.9%, 63.9%, and 50% of the questions asked in English correctly, respectively. For the questions asked in Turkish, ChatGPT-3.5, Copilot, and Gemini answered 52.8%, 52.8%, and 66.7% correctly, respectively. ChatGPT-3.5, Copilot, and Gemini produced different answers to 22.2%, 30.6%, and 25% of the questions asked in English and Turkish, respectively. Of the questions to which ChatGPT-3.5, Copilot, and Gemini produced different answers, 75%, 63.6%, and 22.2%, respectively, were answered correctly when asked in English but incorrectly when asked in Turkish. Although the artificial intelligence programs achieved different rates of correct answers on the English and Turkish questions, the difference in their success was not statistically significant (p>0.05). Conclusion: Although artificial intelligence programs show promise in the field of uveitis, their knowledge levels and their language translation, comprehension, and response capabilities need further improvement.
Keywords: ChatGPT-3.5; Copilot; Gemini; English and Turkish; ocular inflammation and uveitis
Objective: To evaluate how the success of the freely accessible artificial intelligence chatbots ChatGPT-3.5, Copilot, and Gemini on multiple-choice questions in the field of ocular inflammation and uveitis changes with the language in which the questions are asked. Material and Methods: Thirty-six questions regarding ocular inflammation and uveitis were included in the study. After each question was translated into Turkish by a native speaker, both the English and Turkish versions were submitted to the artificial intelligence programs. The answers were classified as correct or incorrect by comparison with the answer key, and the rates of correct answers were compared statistically. Results: ChatGPT-3.5, Copilot, and Gemini correctly answered 63.9%, 63.9%, and 50% of the questions asked in English, respectively, and 52.8%, 52.8%, and 66.7% of the questions asked in Turkish. ChatGPT-3.5, Copilot, and Gemini produced different answers to 22.2%, 30.6%, and 25% of the questions asked in English and Turkish, respectively. Of the questions to which they produced different answers, 75% (ChatGPT-3.5), 63.6% (Copilot), and 22.2% (Gemini) were answered correctly when asked in English but incorrectly when asked in Turkish. Although the artificial intelligence programs achieved different rates of correct answers on the English and Turkish questions, the difference in their success was not statistically significant (p>0.05). Conclusion: Although artificial intelligence programs show promise in the field of uveitis, their knowledge levels and their language translation, comprehension, and response capabilities need further improvement.
Keywords: ChatGPT-3.5; Copilot; Gemini; English and Turkish; ocular inflammation and uveitis
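The abstract reports per-language accuracy and a paired comparison over the same 36 questions but does not name the statistical test used. The Python sketch below is a minimal illustration of that analysis, assuming McNemar's exact test for paired proportions (an assumption, not stated in the source); the answer vectors are reconstructed from the ChatGPT-3.5 percentages reported above and are illustrative only.

```python
# A minimal sketch of the paired analysis described in Material and Methods.
# Assumption: McNemar's exact test is used here as a plausible stand-in,
# since the abstract does not name the statistical test.
from statsmodels.stats.contingency_tables import mcnemar

def accuracy(flags):
    """Fraction of correctly answered questions (1 = correct, 0 = incorrect)."""
    return sum(flags) / len(flags)

def compare_languages(english, turkish):
    """Compare correctness of the same questions asked in English and Turkish.

    Builds the 2x2 concordance table
        [[both correct,          correct EN / wrong TR],
         [wrong EN / correct TR, both wrong           ]]
    and runs McNemar's exact test, which depends only on the discordant cells.
    """
    table = [[0, 0], [0, 0]]
    for e, t in zip(english, turkish):
        table[1 - e][1 - t] += 1
    return mcnemar(table, exact=True)

# Illustrative vectors reconstructed from the ChatGPT-3.5 results above:
# 23/36 (63.9%) correct in English, 19/36 (52.8%) in Turkish; of the 8
# questions (22.2%) answered differently, 6 (75%) flipped correct -> wrong.
english = [1] * 17 + [1] * 6 + [0] * 2 + [0] * 11
turkish = [1] * 17 + [0] * 6 + [1] * 2 + [0] * 11
print(f"EN: {accuracy(english):.1%}  TR: {accuracy(turkish):.1%}")
print(compare_languages(english, turkish))  # p > 0.05: no significant difference
```

With these reconstructed counts the exact test yields p ≈ 0.29, consistent with the reported lack of a statistically significant difference between the two languages.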