A Mathematical Programming Model with Cost Sensitivity in the Objective Function for Imbalanced Datasets Challenges

  • Redouane HAKIMI Institut National de Statistique et d'Economie Appliquée (INSEA) https://orcid.org/0000-0001-8901-3980
  • Badreddine Benyacoub Institut National de Statistique et d'Economie Appliquée
  • Mohamed Ouzineb
Keywords: Mathematical programming, Binary classification, Imbalanced data, Cost sensitivity, MSD

Abstract

This paper introduces CS-MSD, a cost-sensitive deviation minimization model designed to address class imbalance in binary classification, a major problem in machine learning tasks across many areas. Imbalanced datasets, in which one class significantly outnumbers the other, frequently produce biased models that neglect the minority class; this is a serious concern in practical sectors such as health services and financial services. Traditional re-sampling techniques, including under-sampling and over-sampling, have well-known limitations, such as information loss and over-fitting, respectively. CS-MSD addresses these limitations by combining external deviations with cost sensitivity in the objective function, which balances minority and majority class costs. The model outperforms Decision Tree and Radial SVM, achieving a Recall of 0.958 on the win dataset, alongside Specificity and G-mean metrics. With CPU and wall times of 0.052 s and 0.054 s, respectively, CS-MSD is also faster than Random Forest and Bagging, making it well suited to time-sensitive tasks. Its combination of flexibility and processing speed makes CS-MSD a useful option for improving classification performance across diverse domains.
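
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of how a cost-weighted sum of external deviations could be evaluated for a given linear rule, together with the Recall, Specificity and G-mean metrics. It is not the authors' formulation: the sign convention (minority class 1 should score at or above the cutoff), the inverse-class-frequency costs, the toy data, and every function and variable name are assumptions made for illustration. The sketch only evaluates the objective for fixed weights; it does not solve the underlying linear program.

```python
import math


def external_deviations(X, y, w, c):
    """Distance by which each point falls on the wrong side of cutoff c.

    Assumed convention: class 1 (minority) should have w.x >= c,
    class 0 (majority) should have w.x <= c. Correct points get 0.
    """
    devs = []
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi))
        if yi == 1:
            devs.append(max(0.0, c - score))
        else:
            devs.append(max(0.0, score - c))
    return devs


def cost_weighted_objective(devs, y, cost_minority, cost_majority):
    """Sum of deviations, each weighted by the cost of its class."""
    return sum(
        (cost_minority if yi == 1 else cost_majority) * d
        for d, yi in zip(devs, y)
    )


def recall_specificity_gmean(y_true, y_pred):
    """Recall (true positive rate), Specificity (true negative rate), G-mean."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, specificity, math.sqrt(recall * specificity)


# Hypothetical toy data: four majority (0) points, two minority (1) points.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [1.0, 2.0], [4.0, 4.0], [2.0, 1.0]]
y = [0, 0, 0, 0, 1, 1]
w, c = [1.0, 1.0], 5.0  # hypothetical weights and cutoff, not fitted

devs = external_deviations(X, y, w, c)

# Assumed cost scheme: inverse class frequency (n / class size).
n, n_min = len(y), sum(y)
cost_minority = n / n_min
cost_majority = n / (n - n_min)

plain_msd = sum(devs)
objective = cost_weighted_objective(devs, y, cost_minority, cost_majority)

y_pred = [1 if sum(wj * xj for wj, xj in zip(w, xi)) >= c else 0 for xi in X]
recall, specificity, gmean = recall_specificity_gmean(y, y_pred)

print(plain_msd, objective, recall, specificity, round(gmean, 4))
```

In this toy case the unweighted sum of deviations treats the misclassified minority point and the misclassified majority point alike, while the cost-weighted version penalizes the minority error more heavily, which is the kind of pressure a cost-sensitive objective puts on the fitted rule.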

Author Biography

Redouane HAKIMI, Institut National de Statistique et d'Economie Appliquée (INSEA)
PhD student in Mathematics at the National Institute of Statistics and Applied Economics (INSEA) in Rabat, specializing in data science and linear programming. Also an adjunct lecturer at the same institution, teaching courses in mathematics for data science and linear programming.
Published
2025-06-03
How to Cite
HAKIMI, R., Benyacoub, B., & Ouzineb, M. (2025). A Mathematical Programming Model with Cost Sensitivity in the Objective Function for Imbalanced Datasets Challenges. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-2549
Section
I2CEAI24