Abstract
Lung cancer is an uncontrolled growth of tissue that forms a lump in the human lung. Early detection can increase the survival rate, so a computationally efficient multi-class approach to lung nodule detection is required. In this paper, a multi-class approach to lung nodule detection and classification is proposed using artificial intelligence on computed tomography (CT) scan images. Preprocessing steps are applied to resize, smooth, and enhance the CT images. Two feature extraction approaches are then proposed: VGG16 transfer learning and morphological segmentation. Morphological segmentation isolates the region of interest, from which distinct features are extracted. Finally, the proposed deep learning architecture and seven machine learning (ML) algorithms are applied to the preprocessed data and the extracted features to classify lung nodules into three classes: malignant, benign, and normal. The stacked ensemble of the deep learning convolutional neural network (CNN) and VGG16 transfer learning models (CNN+VGG16) achieves 99.55% accuracy on the preprocessed data. All the ML algorithms perform with reasonably high accuracy on the low-dimensional morphological features. In fivefold cross-validation, logistic regression achieves 99.36% accuracy in 23.71 s on the preprocessed data, whereas on the morphological features, k-nearest neighbor and the support vector machine achieve the highest accuracy of 99.76% with much lower computational times of 0.017 s and 0.008 s, respectively.
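The fivefold cross-validation of a k-nearest neighbor classifier on low-dimensional features, as summarized above, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional synthetic features stand in for the morphological features, and the function names, the choice of three neighbors, and the random seeds are hypothetical. It is not the authors' implementation and does not reproduce the reported accuracies or timings.

```python
import random
from collections import Counter


def make_synthetic_features(n_per_class=30, seed=0):
    """Generate toy 2-D feature vectors for three classes (NOT the paper's data)."""
    rng = random.Random(seed)
    # Hypothetical, well-separated class centers chosen only for illustration.
    centers = {"normal": (0.0, 0.0), "benign": (4.0, 0.0), "malignant": (0.0, 4.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
            y.append(label)
    return X, y


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, query, n_neighbors=3):
    """Predict by majority vote among the n_neighbors closest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, k=5, n_neighbors=3, seed=0):
    """Return the accuracy on each held-out fold of a k-fold split."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], n_neighbors) == y[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return accuracies


X, y = make_synthetic_features()
fold_accuracies = cross_validate(X, y)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print("per-fold accuracy:", fold_accuracies)
print("mean accuracy:", round(mean_accuracy, 4))
```

Each sample is held out exactly once, so the mean of the five fold accuracies estimates how the classifier generalizes to unseen scans. With only a handful of features per sample, the distance computations are cheap, which is consistent with the short run times reported for the morphological features.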
Availability of supporting data
The IQ-OTH/NCCD Lung Cancer Dataset is available at the Mendeley Data repository and the Kaggle data archive.
References
Radhika P, Nair RA, Veena G (2019) A comparative study of lung cancer detection using machine learning algorithms. In: IEEE International Conference on Electrical, Computer and Communication Technologies. IEEE, pp 1–4
Differences between a malignant and benign tumor. http://www.differencebetween.net/science/health/difference-between-benign-and-malignant/, [Online accessed 2022-04-17]
Kaushal C, Bhat S, Koundal D, Singla A (2019) Recent trends in computer assisted diagnosis (CAD) system for breast cancer diagnosis using histopathological images. Irbm, Elsevier 40(4):211–227
Sniadanko N (2022) ML-based system or why we use ? Computer-aided systems in healthcare. https://vitechteam.com/computer-aided-systems-in-healthcare/, [Online accessed 2022-05-03]
Chaturvedi P, Jhamb A, Vanani M, Nemade V (2021) Prediction and classification of lung cancer using machine learning techniques. In: IOP Conference Series: Materials Science and Engineering, vol 1099, no 1. IOP Publishing, p 012059
Günaydin Ö, Günay M, Şengel Ö (2019) Comparison of lung cancer detection algorithms. In: Scientific Meeting on Electrical-Electronics and Biomedical Engineering and Computer Science. IEEE, pp 1–4
Punithavathy K, Poobal S, Ramya M (2019) Performance evaluation of machine learning techniques in lung cancer classification from PET/CT images. FME Trans 47(3):418–423
Tao Z, Bingqiang H, Huiling L, Zaoli Y, Hongbin S (2020) NSCR-based DenseNet for lung tumor recognition using chest CT image. BioMed Research International, Hindawi, vol 2020
Pradhan K, Chawla P (2020) Medical internet of things using machine learning algorithms for lung cancer detection. J Manage Anal 7(4):591–623
Hu H, Li Q, Zhao Y, Zhang Y (2020) Parallel deep learning algorithms with hybrid attention mechanism for image segmentation of lung tumors. IEEE Trans Ind Informat 17(4):2880–2889
Moitra D, Mandal RK (2020) Classification of non-small cell lung cancer using one-dimensional convolutional neural network. Exp Syst Appl 159:113564
Boban BM, Megalingam RK (2020) Lung diseases classification based on machine learning algorithms and performance evaluation. In: International Conference on Communication and Signal Processing. IEEE, pp 0315–0320
Abdullah DM, Abdulazeez AM, Sallow AB (2021) Lung cancer prediction and classification based on correlation selection method using machine learning techniques. Qubahan Acad J 1(2):141–149
Nawreen N, Hany U, Islam T (2021) Lung cancer detection and classification using CT scan image processing. In: International Conference on Automation, Control and Mechatronics for Industry (ACMI). IEEE, pp 1–6
Nanglia P, Kumar S, Mahajan AN, Singh P, Rathee D (2021) A hybrid algorithm for lung cancer classification using SVM and Neural Networks. ICT Express 7(3):335–341
Kareem HF, Al-Huseiny MS, Mohsen FY, Al-Yasriy K (2021) Evaluation of SVM performance in the detection of lung cancer in marked CT scan dataset. Indonesian J Electrical Eng Comput Sci 21(3):1731
Pandian R, Vedanarayanan V, Kumar DR, Rajakumar R (2022) Detection and classification of lung cancer using CNN and Google net. Measurement: Sensors, vol 24, p 100588
Alyasriy H, Muayed A (2021) The IQ-OTH/NCCD lung cancer dataset. Mendeley Data 1:2020
The IQ-OTH/NCCD lung cancer dataset. https://www.kaggle.com/datasets/antonixx/the-iqothnccd-lung-cancer-dataset, [Online accessed 2022-04-19]
James M (2022) Hands-on transfer learning with Keras and the VGG16 Model. https://www.learndatasci.com/tutorials/hands-on-transfer-learning-keras/, [Online accessed 2022-07-20]
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint: arXiv:1409.1556
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybernet 9(1):62–66
Raheem KR, Shabat HA (2023) An Otsu thresholding for images based on a nature-inspired optimization algorithm. Indonesian J Electrical Eng Comput Sci 31(2):933–944
Dash J, Bhoi N (2018) Retinal blood vessel segmentation using Otsu thresholding with principal component analysis. In: 2018 2nd International Conference on Inventive Systems and Control (ICISC), pp 933–937
Rokach L, Maimon O (2005) Decision trees. In: Data mining and knowledge discovery handbook. Springer, pp 165–192
Guo G, Wang H, Bell D, Bi Y, Greer K (2003) KNN model-based approach in classification. In: OTM Confederated International Conferences on the Move to Meaningful Internet Systems. Springer, pp 986–996
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3–42
Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 785–794
Zhang Y (2012) Support vector machine classification algorithm and its application. In: International Conference on Information Computing and Applications. Springer, pp 179–186
Wright RE (1995) Logistic regression. American Psychological Association
Brownlee J (2023) A gentle introduction to k-fold cross-validation. In: Statistics. https://machinelearningmastery.com/k-fold-cross-validation/, [Online accessed 2023-10-04]
Aayush B (2022) Performance metrics in machine learning [Complete Guide]. https://neptune.ai/blog/performance-metrics-in-machine-learning-complete-guide, [Online accessed 2022-07-21]
Rose W (2022) Cross-entropy loss and its applications in deep learning. https://neptune.ai/blog/cross-entropy-loss-and-its-applications-in-deep-learning, [Online accessed 2022-09-02]
Acknowledgements
We thank the contributor of the dataset available at the Mendeley Data repository and the IQ-OTH/NCCD lung cancer dataset archive of Kaggle. We also thank the Department of Electrical and Electronic Engineering, Ahsanullah University of Science and Technology, for providing the necessary resources to conduct the research.
Funding
The research is supported by the AUST Internal Research Grant (Round 3) of Ahsanullah University of Science and Technology, Bangladesh.
Author information
Contributions
NMH developed the deep learning CNN architecture, the transfer learning VGG16 model, and the stacked ensemble model; applied preprocessing, deep learning, and machine learning classifiers; and prepared Figures 10 and 12–17. UH developed the morphological segmentation and feature extraction; applied preprocessing, machine learning classifiers, and the stacked ensemble model; prepared Figures 1–9, 11, and 18; contributed to the literature review; and wrote the manuscript. TI wrote the "Machine learning classifiers" subsection and the "Performance Evaluation" section, contributed to the literature review, and reviewed the manuscript. NN developed five morphological operations in MATLAB and contributed to the literature review. AAM collected datasets from Kaggle and reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
We have used lung cancer datasets collected in the Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD) over a period of three months. It includes CT scans of patients diagnosed with lung cancer in different stages, as well as healthy subjects. The IQ-OTH/NCCD slides were marked by oncologists and radiologists. The source of the collected IQ-OTH/NCCD lung cancer dataset has been cited in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Muhtasim, N., Hany, U., Islam, T. et al. Artificial intelligence for detection of lung cancer using transfer learning and morphological features. J Supercomput 80, 13576–13606 (2024). https://doi.org/10.1007/s11227-024-05942-z
DOI: https://doi.org/10.1007/s11227-024-05942-z