Lung Cancer Segmentation and Classification with Multi-Dataset Integration

  • Hozan Abdulqader, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq; Akre University for Applied Science, Technical College of Informatics, Akre, Department of Information Technology
  • Adnan Abdulazeez, Duhok Polytechnic University, Kurdistan Region, Iraq, https://orcid.org/0000-0002-4357-7331
Keywords: Lung Cancer, Segmentation, Classification, Deep Learning, Multi-Dataset Integration

Abstract

Accurate computer-aided lung cancer diagnosis depends on two sequential tasks: precise nodule segmentation and reliable malignancy classification. To this end, we curated the largest open-source CT benchmark to date by unifying five public repositories, resulting in 7,061 annotated slices from 571 patients for segmentation and 17,351 slices from 1,208 patients for classification. A standardized pre-processing pipeline was developed to harmonize voxel spacing, intensity windows, and label conventions. For segmentation, six encoder–decoder architectures were evaluated, with the hybrid UNet++ achieving the highest validation performance (Dice coefficient = 98.5%), demonstrating that attention-augmented dense skip pathways enable more accurate boundary detection of lung nodules. The resulting segmentation masks were then used to drive a two-phase classification strategy: models were first trained using ground-truth masks and then fine-tuned on predicted masks to emulate real-world deployment scenarios. Our proposed NoduleHyperFusionNet, a dual-stream EfficientNetV2-S architecture, achieved the best overall discrimination (Accuracy = 92%, F1-score = 89%, AUC = 91%). The EfficientNet-B3 model also performed strongly, reaching an AUC of 94%. Overall, this study demonstrates that combining attention-enhanced segmentation with lightweight multichannel fusion architectures can significantly improve automated lung cancer workflows, reducing diagnostic error rates without incurring prohibitive computational costs.
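For readers unfamiliar with the segmentation metric quoted above, the following is a minimal sketch of how a Dice coefficient can be computed for a pair of binary masks. It is not the authors' evaluation code: the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|) for two binary 2-D masks.

    `pred` and `truth` are hypothetical nested lists of 0/1 values
    (rows of pixels); any truthy value is treated as foreground.
    """
    # Flatten both masks into 0/1 sequences.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_truth = [int(bool(v)) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 true pixels, 2 of them shared.
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
reference_mask = [[0, 1, 0],
                  [0, 1, 1]]
score = dice_coefficient(predicted_mask, reference_mask)
```

In this toy case the score is 2 * 2 / (3 + 3), or about 0.667; a reported value of 98.5% corresponds to near-complete overlap between predicted and reference masks.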
Published
2025-09-28
How to Cite
Abdulqader, H., & Abdulazeez, A. (2025). Lung Cancer Segmentation and Classification with Multi-Dataset Integration. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-2814
Section
Research Articles