Application of Adaptive Synthetic Nominal and Extreme Gradient Boosting Methods in Determining Factors Affecting Obesity: A Case Study of Indonesian Basic Health Research Survey 2013

Authors

  • Yoris Rombe, Department of Statistics, Universitas Hasanuddin, Indonesia
  • Sri Astuti Thamrin, Department of Statistics, Universitas Hasanuddin, Indonesia
  • Armin Lawi, Department of Mathematics, Universitas Hasanuddin, Indonesia & Institut Teknologi Bacharuddin Jusuf Habibie, Indonesia (ORCID: https://orcid.org/0000-0003-1023-6925)

DOI:

https://doi.org/10.29244/ijsa.v6i2p309-317

Keywords:

ADASYN-N, feature importance, information gain, obesity, XGBoost

Abstract

Obesity is the excessive accumulation of body fat to a degree that can harm health. According to recent studies, factors contributing to the rising prevalence of obesity in Indonesia include poor diet, low consumption of vegetables and fruit, high consumption of fast food, area of residence, and lack of physical activity. Psychological factors, high consumption of alcohol and cigarettes, cultural differences, and stress can also trigger obesity. The rapid development of the medical field is closely tied to increasingly accessible data and growing medical knowledge. As a result, machine learning is increasingly needed to recognize patterns in very large medical datasets, including obesity data. This study determines the factors that influence obesity status in Indonesia using Extreme Gradient Boosting (XGBoost), a classification method that scales better and is more efficient than its predecessors. Because the data are imbalanced, the Adaptive Synthetic Nominal (ADASYN-N) algorithm is used to balance the classes and improve prediction accuracy. Both ADASYN-N and XGBoost are applied to obesity data from the 2013 Indonesian Basic Health Research Survey. The results show that being female is the factor most strongly associated with obesity status in Indonesia, with the highest gain value (37%). In addition, age 35-54 years, strenuous activity, and eating vegetables on 6 days are also risk factors for obesity.
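The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' ADASYN-N implementation. It assumes Hamming distance between nominal records, the adaptive weighting of ADASYN (minority records surrounded by more majority-class neighbours receive more synthetic samples), and a per-feature majority vote among minority neighbours in the spirit of SMOTE-N. The function names, the toy records, and the parameters `k` and `seed` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions at which two nominal records differ."""
    return sum(x != y for x, y in zip(a, b))


def adasyn_nominal(X, y, minority=1, k=3, seed=0):
    """ADASYN-style oversampling for nominal features (illustrative sketch)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    # Number of synthetic records needed to fully balance the two classes.
    n_needed = (len(y) - len(min_idx)) - len(min_idx)
    if n_needed <= 0 or not min_idx:
        return []

    # r_i: share of majority-class records among the k nearest neighbours
    # (over the whole data set) of each minority record.
    ratios = []
    for i in min_idx:
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: hamming(X[i], X[j]))
        ratios.append(sum(y[j] != minority for j in others[:k]) / k)

    total = sum(ratios)
    if total == 0:
        weights = [1 / len(min_idx)] * len(min_idx)  # fall back to uniform
    else:
        weights = [r / total for r in ratios]

    # Assumption: seeds are drawn with probability proportional to r_i
    # (rather than rounding g_i) so exactly n_needed records are produced.
    synthetic = []
    for i in rng.choices(min_idx, weights=weights, k=n_needed):
        neighbours = sorted((j for j in min_idx if j != i),
                            key=lambda j: hamming(X[i], X[j]))[:k]
        pool = [X[i]] + [X[j] for j in neighbours]
        # Each feature takes the most common category in the pool.
        synthetic.append(tuple(Counter(col).most_common(1)[0][0]
                               for col in zip(*pool)))
    return synthetic


# Hypothetical toy records: (sex, age group, strenuous activity); 1 = obese.
X = [
    ("male", "15-34", "no"), ("male", "15-34", "yes"),
    ("male", "35-54", "no"), ("female", "15-34", "no"),
    ("male", "55+", "no"), ("female", "55+", "yes"),
    ("female", "35-54", "yes"), ("female", "35-54", "no"),
]
y = [0, 0, 0, 0, 0, 0, 1, 1]

synthetic = adasyn_nominal(X, y)
X_balanced = X + synthetic
y_balanced = y + [1] * len(synthetic)
print(Counter(y_balanced))
```

In a workflow like the one the abstract describes, a classifier such as XGBoost would then be trained on the balanced records, and gain-based feature importance would be read from the fitted model; that step is not shown here.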

Published

2022-08-31

How to Cite

Rombe, Y., Thamrin, S. A., & Lawi, A. (2022). Application of Adaptive Synthetic Nominal and Extreme Gradient Boosting Methods in Determining Factors Affecting Obesity: A Case Study of Indonesian Basic Health Research Survey 2013. Indonesian Journal of Statistics and Its Applications, 6(2), 309–317. https://doi.org/10.29244/ijsa.v6i2p309-317
