Objective: Simulation studies conducted for the same purpose often report noticeably different results because their experimental conditions differ. This study aims to re-evaluate the results of 35 simulation studies carried out for the same purpose by using graphical techniques. Material and Methods: The material of this study consisted of Type I error estimates from 35 simulation studies carried out under different experimental conditions (pairing type, variance ratio, sample size ratio, number of simulations, skewness, kurtosis, total sample size, equality of sample sizes, and number of groups). Two graphical techniques, Automatic Linear Modeling and Regression Tree Analysis, were used to evaluate the Type I error estimates of these studies. Results: The statistical analyses indicated that the results of the simulation studies were affected by several factors, and that these factors should be taken into account to obtain more reliable and stable estimates. The most important factors affecting the Type I error estimates were pairing type, variance ratio, number of simulations, and sample size ratio. These factors can therefore be regarded as the primary sources of the differences among the studies' results. Conclusion: Both methods are promising and can be used efficiently to identify the factors that affect a response variable in a large and complex data set.
Keywords: Classification and regression trees; data mining; interaction; simulation
Objective: When the results of simulation studies are examined, important differences are seen among them because of differences in the experimental conditions. This study aims to re-evaluate, using graphical techniques, the results of 35 simulation studies conducted for the same purpose. Material and Methods: The Type I error estimates of 35 simulation studies carried out under different experimental conditions, such as pairing type, variance ratio, sample size ratio, number of simulations, skewness and kurtosis coefficients, total sample size, equality of sample sizes, and number of groups, were used as the material of this study. Two graphical techniques, Automatic Linear Modeling and Regression Tree Analysis, were used to analyze the data set. Results: The statistical analysis showed that the results of the simulation studies considered were affected by different factors, and that the identified factors should be taken into account in order to obtain more reliable and stable estimates. The most important factors affecting the Type I error estimates were, in order, pairing type (that is, the relationship between the variance ratios and the sample sizes), variance ratio, the number of simulations used in the study, and sample size ratio. Based on these findings, it can be concluded that these 4 factors play the primary role in the differences among the Type I error probabilities obtained from the 35 simulation studies considered. Conclusion: Both methods are promising and can be used efficiently to determine the factors that affect the response variable when the data set is large and complex.
Keywords: Classification and regression trees; data mining; interaction; simulation
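The idea behind ranking experimental conditions with a regression tree can be illustrated with a minimal sketch. It is not the procedure or software used in the study: it searches only for the root split of a CART-style tree, choosing for each factor the binary split that most reduces the sum of squared errors (SSE) in the Type I error estimate. The data are synthetic, and the factor names, the numeric coding of pairing (-1, 0, 1), and the assumed effect sizes are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, factor, target="type1_error"):
    """Best binary split on one numeric factor: (cut point, SSE reduction)."""
    total = sse([r[target] for r in rows])
    levels = sorted({r[factor] for r in rows})
    best_cut, best_gain = None, 0.0
    for low, high in zip(levels, levels[1:]):
        cut = (low + high) / 2  # midpoint between adjacent observed levels
        left = [r[target] for r in rows if r[factor] <= cut]
        right = [r[target] for r in rows if r[factor] > cut]
        gain = total - sse(left) - sse(right)
        if gain > best_gain:
            best_cut, best_gain = cut, gain
    return best_cut, best_gain


def make_synthetic_rows(seed=0):
    """Invented data for illustration only; NOT the 35 studies' estimates."""
    rng = random.Random(seed)
    rows = []
    for pairing in (-1, 0, 1):  # hypothetical coding: inverse, none, direct
        for variance_ratio in (1, 4, 9, 16):
            for n_simulations in (1000, 10000, 100000):
                # Assumed effect: pairing shifts the rate only when variances differ
                shift = -0.02 * pairing if variance_ratio > 1 else 0.0
                rows.append({
                    "pairing": pairing,
                    "variance_ratio": variance_ratio,
                    "n_simulations": n_simulations,
                    "type1_error": 0.05 + shift + rng.gauss(0, 0.002),
                })
    return rows


rows = make_synthetic_rows()
factors = ("pairing", "variance_ratio", "n_simulations")
# Rank factors by the SSE reduction of their best root split, largest first
ranking = sorted(
    ((f, *best_split(rows, f)) for f in factors),
    key=lambda item: item[2],
    reverse=True,
)
for factor, cut, gain in ranking:
    print(f"{factor}: split at {cut}, SSE reduction {gain:.5f}")
```

In this synthetic setup the variance ratio influences the error rate only through its interaction with pairing, so it shows little gain at the root. A full regression tree would repeat the same search recursively within each branch, which is how such interactions would surface in deeper splits.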