Objective: Associative classification is a method that builds a rule-based classifier from a categorical data set. Its main purpose is to create classification models with high performance while improving interpretability through the rules it generates. This study aims to classify and predict cervical cancer with associative classification methods and to determine the most important parameters and association rules related to the disease.
Material and Methods: The regularized class association rules (RCAR) and classification based on associations (CBA) methods were applied to the open-access data set named 'Cervical Cancer Behavioral Risk Data Set', and the results were compared. The numerical variables in the data set were discretized with the Ameva method, and the Boruta feature selection method was applied to determine the most important features related to cervical cancer. The performance of the resulting associative classification models was evaluated with accuracy, balanced accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), G-mean, diagnostic accuracy, Youden's index, positive predictive value, negative predictive value and F1 score.
Results: For the CBA model, sensitivity was 100%, specificity 98%, accuracy 98.6%, balanced accuracy 99%, Youden's index 98%, MCC 96.7%, diagnostic accuracy 98.6%, G-mean 97.7%, negative predictive value 100%, positive predictive value 95.5%, and F1 score 97.7%. For the RCAR model, sensitivity was 90.5%, specificity 98%, accuracy 95.8%, balanced accuracy 94.3%, Youden's index 88.5%, MCC 89.8%, diagnostic accuracy 95.8%, G-mean 95.6%, negative predictive value 96.2%, positive predictive value 95%, and F1 score 92.7%.
Conclusion: The results indicate that the CBA model classifies cervical cancer more successfully than the RCAR model. In addition, the associative classification models created in this study and the rules obtained for the disease are promising for use in early diagnosis and preventive medicine practices for cervical cancer.
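The criteria listed above all follow from the four cells of a binary confusion matrix, so a short sketch may help in reading the reported figures. The function name `binary_metrics` and the counts used below are assumptions made for illustration. The abstract does not report a confusion matrix; the counts were chosen only because they reproduce the rounded CBA values for sensitivity, specificity, accuracy, positive predictive value, F1 score and MCC. With zero false negatives, the negative predictive value works out to 1.0, that is, 100%. G-mean is left out because its definition varies between sources.

```python
from math import sqrt


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute common diagnostic metrics from confusion-matrix counts.

    Assumes every row and column of the matrix is non-empty, so that
    none of the denominators below is zero.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    balanced_accuracy = (sensitivity + specificity) / 2
    youden = sensitivity + specificity - 1
    f1 = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "youden": youden,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts (not taken from the study): 21 positives, all
# detected, and 51 negatives with one false alarm.
example = binary_metrics(tp=21, fn=0, fp=1, tn=50)
for name, value in example.items():
    print(f"{name}: {value * 100:.1f}%")
```

Computing every criterion from one set of counts also makes internal inconsistencies easy to spot: a sensitivity of 100% implies no false negatives, which in turn fixes the negative predictive value at 100%.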
Keywords: Cervical cancer; regularized class association rules; classification based on association rules; associative classification methods