Optimization of the K-Nearest Neighbor Algorithm to Predict Bank Churn

  • Sonia Akakpo
  • Patrick Dambra
  • Rachell Paz
  • Timothy Smyth
  • Frank Torre
  • Chunhui Yu
Department of Mathematics, Farmingdale State College, State University of New York, USA
Keywords: Bank Churn, K-Nearest Neighbors, Random Forests, Optimization, Logistic Regression, Machine Learning

Abstract

Bank churn occurs when customers switch from one bank to another. Although some customer loss is unavoidable, banks have a strong incentive to prevent voluntary churn, since it is easier and cheaper to keep an existing customer than to gain a new one. In this paper, we train and optimize a k-nearest neighbors algorithm to predict whether a customer will leave their bank, using existing demographic and financial information. A reliable method for predicting churn lets banks prioritize certain customer groups in an effort to increase retention rates. We compare the accuracy of our algorithm with that of other machine learning models, such as random forest and logistic regression, and increase the accuracy of the k-nearest neighbors algorithm by optimizing the k value used in our model and by using 10-fold cross-validation. We also determine the most important attributes and weight them appropriately. After optimization, the model predicts whether a customer will churn with 85.72% accuracy.
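
As an illustration of the general idea of choosing k by 10-fold cross-validation, the following is a minimal sketch and not the authors' implementation. It uses synthetic two-feature data rather than the bank dataset, plain Euclidean distance with no attribute weighting, and hypothetical helper names (make_toy_data, knn_predict, cv_accuracy, best_k) that do not come from the paper.

```python
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data (an assumption; NOT the paper's bank dataset)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label else 0.0
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(
        train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x))
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(data, k, folds=10):
    """Mean accuracy over `folds` held-out folds (every folds-th row per fold)."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [r for j, r in enumerate(data) if j % folds != i]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds


def best_k(data, candidates=range(1, 22, 2), folds=10):
    """Return the candidate k with the highest cross-validated accuracy."""
    # Odd k values avoid tied votes in a two-class problem.
    results = {k: cv_accuracy(data, k, folds) for k in candidates}
    return max(results, key=results.get), results


if __name__ == "__main__":
    k, results = best_k(make_toy_data())
    print("selected k:", k, "cross-validated accuracy:", round(results[k], 4))
```

The sketch covers only the k selection step. The comparison with random forest and logistic regression and the attribute weighting described in the abstract are not reproduced here.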
Published
2024-07-10
How to Cite
Akakpo, S., Dambra, P., Paz, R., Smyth, T., Torre, F., & Yu, C. (2024). Optimization of the K-Nearest Neighbor Algorithm to Predict Bank Churn. Statistics, Optimization & Information Computing, 12(5), 1397-1408. https://doi.org/10.19139/soic-2310-5070-2098
Section
Research Articles