Application of the Synthetic Minority Oversampling Technique to Binary Logistic Regression Modeling of the Study Success of IPB Master's Program Students
Keywords: binary logistic regression, SMOTE, imbalanced data
The Graduate School of IPB (Bogor Agricultural University) maintains high academic standards, and its competitive graduates work both at home and abroad. In this study, binary logistic regression was used to determine the factors that influence the study success of master's students at the IPB Graduate School. The data cover IPB Graduate School students who graduated from 2011 to 2015. The response variable is the student's study success status (graduated or did not graduate). Nine explanatory variables were used: gender, marital status, admission status at entry to the master's program (S2), status of the undergraduate (S1) institution, source of S2 funding, employer institution group, S2 study program group, age at entry to S2, and S1 GPA. The data are imbalanced: the percentage of students who graduated is much larger than the percentage who did not. Because an unhandled imbalance tends to cause misclassification, it was addressed with SMOTE. Classification results were compared on the testing data. The model before SMOTE has an area under the curve (AUC) of 0.6760, an accuracy of 88.77%, a sensitivity of 99.09%, and a specificity of 4.63%. The model after 600% SMOTE oversampling has an AUC of 0.6642, an accuracy of 78.36%, a sensitivity of 83.65%, and a specificity of 35.18%. Although accuracy and sensitivity were higher before SMOTE, specificity was higher after SMOTE, which means that the model after SMOTE is better at predicting the minority class (students who did not graduate).
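To make the oversampling step concrete, the following is a minimal sketch of the SMOTE interpolation idea, not the implementation used in the study. It assumes that "600%" follows the usual SMOTE convention of six synthetic observations per original minority observation, and it handles numeric features only, whereas most of the study's explanatory variables are categorical and would need separate treatment. The function name `smote`, its parameters, and the toy values are hypothetical.

```python
import math
import random


def smote(minority, percent=600, k=5, seed=0):
    """Return synthetic minority samples built by SMOTE-style interpolation.

    Assumption: percent=600 means 6 synthetic points per original point.
    Works on numeric feature vectors only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    per_sample = percent // 100
    k = min(k, len(minority) - 1)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbours of x (Euclidean distance, x excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # random position on the segment from x to nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Hypothetical toy minority class: (age at entry to S2, S1 GPA).
# These values are illustrative only and are not taken from the study.
minority = [[24.0, 2.9], [27.0, 3.0], [31.0, 2.8], [35.0, 3.1]]
synthetic = smote(minority, percent=600, k=3)
print(len(minority), "original ->", len(synthetic), "synthetic")
```

Each synthetic point lies on the line segment between a minority observation and one of its nearest minority neighbours, so the minority class grows without exact duplicates. In a setting like the one described here, such oversampling would normally be applied to the training data only, so that results on the testing data still reflect the original class proportions.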