Evaluating the Performances of ROC Curve Estimation Methods for Different Distributions and Different Kernel Functions

Deniz SIĞIRLI

doi:10.5336/biostatic.2021-85803

Turkiye Klinikleri Journal of Biostatistics

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Farklı Dağılımlar ve Farklı Çekirdek Fonksiyonları için ROC Eğrisi Tahmin Yöntemlerinin Performanslarının İncelenmesi

Deniz SIĞIRLI^a
^aDepartment of Biostatistics, Bursa Uludağ University Faculty of Medicine, Bursa, TURKEY

Turkiye Klinikleri J Biostat. 2021;13(3):225-35

Article Language: EN

ABSTRACT
Objective: The receiver operating characteristic (ROC) curve is a statistical method used to examine the actual effectiveness of a diagnostic test or a biomarker in a comprehensive and reliable way. Several methods have been proposed to estimate the ROC curve. The aim of the present study is to compare recent ROC curve estimation methods across different distributions and sample sizes. Material and Methods: ROC curve estimation based on log-concave and smooth log-concave density estimates, kernel-based ROC curve estimation with Gaussian, Epanechnikov, rectangular, and triangular kernels, and binormal ROC curve estimation were compared under different simulation scenarios. Results: The kernel-based ROC curve estimation methods performed best when the biomarker values of the non-diseased group were normally distributed and those of the diseased group were right-skewed, with a notable difference from the other methods. The Epanechnikov and rectangular kernel methods performed better than the other kernel methods in small samples, but this difference disappeared as the sample size increased. The methods based on kernel or log-concave density estimates gave their worst results in the simulation scenario where the data were non-normal but symmetric. Conclusion: The other methods examined in the study outperformed the binormal method when the data in both groups were highly skewed and when the distributions of the diseased and non-diseased populations were right-skewed and normal, respectively.

Keywords: Diagnostic test; receiver operating characteristic curve; kernel density estimation; log-concave density estimation
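
To make two of the compared estimator families concrete, the following is a minimal Python sketch of a Gaussian-kernel ROC estimate and a binormal ROC estimate. It is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the bandwidth is a normal-reference rule of thumb (the study's bandwidth selector may differ), and the simulated data only mimic the "normal non-diseased, right-skewed diseased" scenario described in the abstract.

```python
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def rule_of_thumb_bandwidth(sample):
    # Normal-reference bandwidth (assumption; not necessarily the study's choice).
    return 1.06 * stdev(sample) * len(sample) ** (-1 / 5)


def kernel_survival(sample, c, h):
    # Gaussian-kernel estimate of P(X > c): average of 1 - Phi((c - x_i) / h).
    return mean(1.0 - STD_NORMAL.cdf((c - x) / h) for x in sample)


def kernel_roc(non_diseased, diseased, thresholds):
    # Returns (false positive rate, true positive rate) pairs, one per threshold.
    h0 = rule_of_thumb_bandwidth(non_diseased)
    h1 = rule_of_thumb_bandwidth(diseased)
    return [
        (kernel_survival(non_diseased, c, h0), kernel_survival(diseased, c, h1))
        for c in thresholds
    ]


def binormal_roc(non_diseased, diseased, fpr_grid):
    # Binormal model: ROC(t) = Phi(a + b * Phi^{-1}(t)), with
    # a = (mean1 - mean0) / sd1 and b = sd0 / sd1; requires 0 < t < 1.
    a = (mean(diseased) - mean(non_diseased)) / stdev(diseased)
    b = stdev(non_diseased) / stdev(diseased)
    return [STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(t)) for t in fpr_grid]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the illustration is repeatable
    non_diseased = [rng.gauss(0.0, 1.0) for _ in range(50)]        # normal
    diseased = [rng.lognormvariate(0.5, 0.6) for _ in range(50)]   # right-skewed
    for fpr, tpr in kernel_roc(non_diseased, diseased, [0.0, 1.0, 2.0]):
        print(f"kernel   FPR={fpr:.3f}  TPR={tpr:.3f}")
    for t, tpr in zip([0.1, 0.5], binormal_roc(non_diseased, diseased, [0.1, 0.5])):
        print(f"binormal FPR={t:.3f}  TPR={tpr:.3f}")
```

Swapping the Gaussian kernel for an Epanechnikov, rectangular, or triangular kernel would only change the smoothed distribution function inside `kernel_survival`; the rest of the sketch stays the same.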

ÖZET
Amaç: Alıcı işletim karakteristiği [receiver operating characteristic (ROC)] eğrisi, bir tanı testinin veya bir biyobelirtecin gerçek etkinliğini kapsamlı ve güvenilir bir şekilde incelemek için kullanılan istatistiksel bir yöntemdir. ROC eğrisini doğru bir şekilde tahmin etmek için çeşitli yöntemler önerilmiştir. Bu çalışmanın amacı, farklı dağılım ve örneklem büyüklükleri için güncel ROC eğrisi tahmin yöntemlerini karşılaştırmaktır. Gereç ve Yöntemler: Log-konkav yoğunluk ve düzgün log-konkav yoğunluk tahmini tabanlı ROC eğrisi tahmin yöntemi, Gaussian, Epanechnikov, dikdörtgen, üçgen kernel fonksiyonu kullanan kernel tabanlı ROC eğrisi tahmin yöntemleri ve binormal ROC eğrisi tahmin yöntemleri farklı simülasyon senaryoları kullanılarak karşılaştırılmıştır. Bulgular: Kernel tahmincilerine dayanan ROC eğrisi tahmin yöntemleri, sağlıklı grubun biyobelirteç değerlerinin normal dağılım, hasta grubun biyobelirteç değerlerinin sağa çarpık dağılım gösterdiği durumda, diğer yöntemlerden büyük farkla en iyi performansı göstermiştir. Epanechnikov ve dikdörtgen kernel yöntemleri, diğer kernel yöntemlerinden küçük örneklemlerde daha iyi performans göstermekle birlikte aralarındaki fark, örneklem büyüklüğündeki artışla ortadan kalkmıştır. Kernel ve log-konkav yoğunluk tahminine dayalı yöntemler, verinin normal olmadığı fakat simetrik olduğu durumda en kötü sonucu vermişlerdir. Sonuç: Çalışmada incelenen yöntemlerin performansları, her iki grupta yüksek oranda çarpık veriler olması durumunda ve hasta ve sağlıklı popülasyonların dağılımları sırasıyla sağa çarpık dağılım ve normal dağılım olduğunda, binormal yöntemin performansını geçmiştir.

Anahtar Kelimeler: Tanı testi; alıcı işlem karakteristik eğrisi; kernel yoğunluk tahmini; log-konkav yoğunluk tahmini
