A Metaheuristic for Fuzzy Density Based SVM and Confidence SMOTE for Early Prediction of Diabetes

SDB-SVM

  • Asma Driouich Engineering, Mathematics and Informatics Laboratory, Faculty of Sciences, UIZ, Agadir, Morocco
  • ABDELLATIF EL OUISSARI LaR2A Laboratory, Faculty of Sciences, Abdelmalek Essaadi University, Tetouan, Morocco
  • Karim EL MOUTAOUAKIL Engineering Science Laboratory (LSI), Polydisciplinary Faculty of Taza, USMBA, Morocco
  • Ismail Akharraz Engineering, Mathematics and Informatics Laboratory, Faculty of Sciences, UIZ, Agadir, Morocco
Keywords: DB-Support vector machine, Class Imbalance, Classification, Diabetes, Machine learning

Abstract

Diabetes is a chronic disease that affects millions of people worldwide. In this work, we propose a confidence-aware version of the density-based support vector machine (DBSVM) for early detection of diabetes. The proposed method, called SMOTE Density Based Support Vector Machine (SDB-SVM), targets imbalanced data sets. First, we clean the diabetes datasets using DBSVM, which is highly effective at detecting harmful samples. Then, we apply SMOTE to balance the datasets based on the confidence of each synthetic point. In this way, DBSVM allows SMOTE to produce synthetic data that is plausible for the minority class. We test the proposed system on several imbalanced diabetes datasets, including the PIMA and Germany datasets, and compare our method with well-known classifiers. The experimental results show that the proposed algorithm outperforms these classifiers and is efficient.
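
The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the DBSVM cleaning stage is omitted, and the confidence score used here (the fraction of a synthetic point's nearest real samples that belong to the minority class) is an assumption standing in for the paper's confidence measure, which the abstract does not define. The function names and the parameters `k`, `min_conf`, and `seed` are hypothetical.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote_with_confidence(minority, majority, n_new, k=3, min_conf=0.5, seed=0):
    """Generate up to n_new synthetic minority points, SMOTE-style.

    A candidate is kept only if its confidence reaches min_conf.
    ASSUMPTION: confidence = share of the k nearest real samples that are
    minority; the paper's own confidence measure may differ.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    synthetic = []
    attempts = 0
    while len(synthetic) < n_new and attempts < 100 * n_new:
        attempts += 1
        # Standard SMOTE step: pick a minority sample and one of its
        # minority-class neighbours, then interpolate between them.
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        mate = rng.choice(k_nearest(base, others, k))
        gap = rng.random()
        candidate = tuple(b + gap * (m - b) for b, m in zip(base, mate))
        # Hypothetical confidence filter (see docstring).
        nearest = sorted(labelled, key=lambda s: math.dist(candidate, s[0]))[:k]
        conf = sum(label for _, label in nearest) / len(nearest)
        if conf >= min_conf:
            synthetic.append(candidate)
    return synthetic


# Toy data: 4 minority points near the origin, 6 majority points far away.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
majority = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0), (5.5, 5.5), (6.0, 7.0)]
new_points = smote_with_confidence(minority, majority, n_new=2)
print(len(minority) + len(new_points), len(majority))
```

Because each synthetic point is a convex combination of two minority samples, it always lies on the segment between them; the filter only decides whether that location is surrounded by enough minority samples to be kept.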

Published
2024-12-27
How to Cite
Driouich, A., EL OUISSARI, A., EL MOUTAOUAKIL, K., & Akharraz, I. (2024). A Metaheuristic for Fuzzy Density Based SVM and Confidence SMOTE for Early Prediction of Diabetes. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-1348
Section
Research Articles