Show simple item record

dc.contributor.author: Akkaya, Berke
dc.date.accessioned: 2021-12-10T10:11:45Z
dc.date.available: 2021-12-10T10:11:45Z
dc.identifier.citation: Akkaya B., "The Effect of Recursive Feature Elimination with Cross-Validation Method on Classification Performance with Different Sizes of Datasets", 4th International Conference on Data Science & Applications, İstanbul, Türkiye, 4 - 6 June 2021, pp. 142-152
dc.identifier.other: av_2febfa84-2e47-4e77-a1ee-177d9f9ef7b3
dc.identifier.other: vv_1032021
dc.identifier.uri: http://hdl.handle.net/20.500.12627/169394
dc.identifier.uri: https://avesis.istanbul.edu.tr/api/publication/2febfa84-2e47-4e77-a1ee-177d9f9ef7b3/file
dc.description.abstract: The high-dimensionality problem, one of the difficulties encountered in classification tasks, arises when a dataset contains too many features. It reduces the success of classification models and increases training time. Feature selection is one of the methods used to address this problem; it is defined as selecting the best subset of features that can represent the original dataset. The goal is to reduce the dimensionality of the data by keeping only the features that are most useful and important for the problem at hand. In this study, the performance of various classification algorithms on datasets of different sizes was compared using recursive feature elimination with cross-validation, a feature selection method that seeks the most accurate result by repeatedly eliminating the least important features under cross-validation. The study used datasets containing balanced binary classification problems. Accuracy, ROC-AUC score, and fit time were used as evaluation metrics, while Logistic Regression, Support Vector Machines, Naive Bayes, k-Nearest Neighbors, Stochastic Gradient Descent, Decision Tree, AdaBoost, Multilayer Perceptron, and XGBoost classifiers were used as classification algorithms. Examining the results of recursive feature elimination with cross-validation showed that, on average, accuracy increased by 5%, the ROC-AUC score increased by 5.3%, and fit time decreased by about 5.1 seconds. Naive Bayes and Multilayer Perceptron were the classifiers most sensitive to feature selection, since their classification performance improved the most after it.
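The procedure the abstract describes, recursive feature elimination with cross-validation over a balanced binary classification problem, scored by ROC-AUC, can be sketched with scikit-learn's `RFECV`. This is a minimal illustration, not the study's actual pipeline: the synthetic dataset, the choice of logistic regression as the base estimator, and all parameter values are assumptions for demonstration.

```python
# Sketch of recursive feature elimination with cross-validation (RFECV).
# Synthetic data and estimator choice are illustrative assumptions only.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# A balanced binary classification dataset, as in the study's setting.
X, y = make_classification(n_samples=500, n_features=25, n_informative=5,
                           n_redundant=5, weights=[0.5, 0.5], random_state=0)

selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,                  # eliminate one least-important feature per round
    cv=StratifiedKFold(5),   # cross-validation guides when to stop eliminating
    scoring="roc_auc",       # ROC-AUC, one of the study's evaluation metrics
)
selector.fit(X, y)

print("optimal number of features:", selector.n_features_)
print("selected feature mask:", selector.support_)
```

After fitting, `selector.support_` is a boolean mask over the original features, and `selector.transform(X)` yields the reduced dataset on which the downstream classifiers would then be trained and timed.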
dc.language.iso: eng
dc.subject: MULTIDISCIPLINARY SCIENCES
dc.subject: PSYCHOLOGY, MATHEMATICAL
dc.subject: Statistics
dc.subject: Statistical Analysis and Applications
dc.subject: Basic Sciences
dc.subject: Natural Sciences, General
dc.subject: Multidisciplinary
dc.subject: Psychology
dc.subject: Basic Sciences (SCI)
dc.title: The Effect of Recursive Feature Elimination with Cross-Validation Method on Classification Performance with Different Sizes of Datasets
dc.type: Conference paper
dc.contributor.department: İstanbul Üniversitesi, Faculty of Business Administration, Department of Business Administration
dc.contributor.firstauthorID: 2725033


Files in this item

There are no files associated with this item.