Ensemble Learning and K-means Models for Lung and Colon Cancer Classification

Keywords: Histopathological Image, Lung and Colon Cancer, Deep Learning, Ensemble Learning, K-means Clustering, Gamma Correction

Abstract

According to World Health Organization (WHO) statistics, cancer remains one of the leading causes of death worldwide. Lung cancer causes the most cancer-related deaths, approximately 1.8 million (18.7%), followed by colorectal cancer with around 900,000 (9.3%). These death rates are increasing in developing countries, where the human, material, and technological resources for early detection are sometimes limited. In Morocco, for example, WHO statistics put the lung cancer mortality rate at 21.6%. Examining histopathological images is one of the most effective ways to confirm or rule out these cancers. Traditionally, pathologists perform this analysis manually, which makes the process time-consuming and leaves the outcome largely dependent on the pathologist's expertise. Automating the process can enable early detection and considerably increase the chances of cure. Encouraged by the remarkable results of machine learning techniques, many research projects have sought to apply them to automate cancer detection and improve its accuracy. Despite advances in deep learning-based classification, achieving consistently high accuracy remains a challenge. In this paper, we propose a new approach that uses a K-means model and a gamma correction function to preprocess histopathological images from the LC25000 dataset, and transfer learning and ensemble learning to enhance classification performance. We combine two models built on the pre-trained VGG16 and DenseNet networks. This approach achieves an accuracy of 99.96%, which illustrates the value of combining unsupervised models, transfer learning, and ensemble learning to improve the accuracy of histopathological image classification.
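
The three building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the gamma value, the number of clusters, how the K-means output is used, or how the two networks' predictions are fused, so the values `gamma=0.8` and `k=3`, the colour-quantization use of K-means, and the probability-averaging (soft voting) rule below are all assumptions chosen for illustration. The function names are hypothetical, and the probability arrays merely stand in for the outputs of the VGG16-based and DenseNet-based models.

```python
import numpy as np


def gamma_correct(image, gamma=0.8):
    """Power-law correction on a uint8 image; gamma=0.8 is an assumed value."""
    normalized = image.astype(np.float64) / 255.0
    corrected = np.power(normalized, gamma)
    return np.round(corrected * 255.0).astype(np.uint8)


def assign_labels(pixels, centroids):
    """Index of the nearest centroid (Euclidean distance) for every pixel."""
    distances = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans_quantize(image, k=3, n_iter=10, seed=0):
    """Plain Lloyd's k-means on pixel colours (k=3 is an assumption).

    Returns the image with each pixel replaced by its cluster centroid,
    plus the per-pixel cluster labels.
    """
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    # Initialise centroids from k distinct pixel positions.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign_labels(pixels, centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    labels = assign_labels(pixels, centroids)
    quantized = np.round(centroids[labels]).astype(np.uint8).reshape(image.shape)
    return quantized, labels.reshape(image.shape[:2])


def soft_vote(prob_a, prob_b):
    """Average two models' class probabilities and take the arg-max class.

    Soft voting is an assumed fusion rule, not one stated in the abstract.
    """
    mean_prob = (np.asarray(prob_a, dtype=np.float64)
                 + np.asarray(prob_b, dtype=np.float64)) / 2.0
    return mean_prob.argmax(axis=1)


if __name__ == "__main__":
    # Random array standing in for an RGB histopathology tile.
    tile = np.random.default_rng(1).integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    segmented, labels = kmeans_quantize(gamma_correct(tile))
    print(segmented.shape, labels.shape)
```

In a full pipeline, the preprocessed tiles would be passed to the two fine-tuned networks, and `soft_vote` would combine their per-class probabilities; other fusion rules, such as weighted averaging or majority voting, would fit the same interface.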
Published
2025-08-24
How to Cite
Oubaalla, A., El Moubtahij, H., & EL AKKAD, N. (2025). Ensemble Learning and K-means Models for Lung and Colon Cancer Classification. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-2833
Section
Research Articles