A Study of the Logistic Model Tree Method with Imbalanced Data Handling

Authors

  • Akmala Firdausi, Department of Statistics, IPB University
  • Aam Alamudi, Department of Statistics, IPB University
  • Kusman Sadik, Department of Statistics, IPB University

DOI:

https://doi.org/10.29244/xplore.v11i2.922

Keywords:

imbalanced data handling, logistic model tree, ROSE, SMOTE, undersampling

Abstract

The logistic model tree is a nonparametric modelling method that combines a decision tree with linear logistic regression. It handles multicollinearity well but is not immune to the problems caused by data imbalance. This study was carried out to compare the performance of undersampling, SMOTE, and ROSE in handling imbalanced data when used in tandem with the logistic model tree. The simulation data were obtained by generating random numbers following the Bernoulli distribution as the response variable and the bivariate normal distribution as the explanatory variables, at five different imbalance levels. Comparisons of AUC values showed that, at every tested level of imbalance, logistic model trees built with an imbalance-handling method classified objects better than logistic model trees built without one. Among those methods, ROSE gave better performance than the others. On datasets with a low level of imbalance, the performance of logistic model trees built with ROSE and with undersampling did not differ significantly.
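
The simulation design and the AUC comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say how the response is linked to the explanatory variables, so the class-dependent mean shift (`shift`), the correlation (`rho`), the sample size, and the 5% minority rate are all hypothetical choices made for illustration. Only random undersampling is shown; SMOTE, ROSE, and the logistic model tree itself are omitted, and a simple linear score stands in for a fitted model.

```python
import math
import random


def simulate(n, minority_rate, rho=0.5, shift=1.0, seed=0):
    """Draw y ~ Bernoulli(minority_rate) and (x1, x2) from a bivariate normal.

    rho and shift are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < minority_rate else 0
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Correlate two standard normals, then shift the mean for class 1.
        x1 = z1 + shift * y
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2 + shift * y
        rows.append((x1, x2, y))
    return rows


def undersample(rows, seed=0):
    """Random undersampling: keep every minority row, sample equally many majority rows."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[2] == 1]
    neg = [r for r in rows if r[2] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return minority + rng.sample(majority, len(minority))


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Example run with a hypothetical 5% minority class.
data = simulate(n=1000, minority_rate=0.05)
balanced = undersample(data)
labels = [y for _, _, y in data]
scores = [x1 + x2 for x1, x2, _ in data]  # stand-in score, not a fitted model
print(len(data), len(balanced), auc(scores, labels))
```

In a fuller comparison, `balanced` would be the training set for the classifier, and the AUC would be computed on held-out data that keeps the original class imbalance.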

Published

2022-05-31

How to Cite

Firdausi, A., Alamudi, A., & Sadik, K. (2022). Kajian Metode Pohon Model Logistik (Logistic Model Tree) dengan Penanganan Ketakseimbangan Data. Xplore: Journal of Statistics, 11(2), 157–167. https://doi.org/10.29244/xplore.v11i2.922

Section

Articles
