Prediction Models Applied to Lung Cancer Using Data Mining

  • Conference paper
Intelligent Distributed Computing XV (IDC 2022)

Abstract

Lung cancer is the most common cause of cancer death in men and the second leading cause of cancer death in women worldwide. Because early detection can aid in a complete cure of the disease, demand is growing for techniques that detect cancer nodules at an early stage. The cure rate and prognosis depend primarily on early detection and diagnosis. Knowledge discovery and data mining have numerous applications in the business and scientific domains and can provide useful information in healthcare systems. The present work therefore aimed to compare several prediction models, as well as the features to be used, with the help of the Weka and RapidMiner tools. Both classification and association rule techniques were implemented. The results obtained were quite satisfactory, with emphasis on the Naive Bayes model, which obtained an accuracy of 95.03% with 10-fold cross-validation and 94.59% with a 66% percentage split.
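The two evaluation protocols named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' Weka or RapidMiner workflow: the toy data is synthetic, the feature values are hypothetical, and the helper names (`train_nb`, `predict_nb`, `percentage_split`, `k_fold`) are introduced here only for illustration. The sketch trains a small categorical Naive Bayes classifier with Laplace smoothing and scores it once with 10-fold cross-validation and once with a 66% train / 34% test split.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Fit a categorical Naive Bayes model (counts only)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value frequencies
    values_seen = defaultdict(set)       # feature index -> distinct training values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen


def predict_nb(model, row):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for j, value in enumerate(row):
            num = value_counts[(j, label)][value] + 1
            den = count + len(values_seen[j])
            score += math.log(num / den)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(train_idx, test_idx, rows, labels):
    """Train on train_idx, return the fraction of test_idx predicted correctly."""
    model = train_nb([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    hits = sum(predict_nb(model, rows[i]) == labels[i] for i in test_idx)
    return hits / len(test_idx)


def percentage_split(n, train_fraction=0.66, seed=0):
    """Shuffle indices, then use the first train_fraction for training."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


def k_fold(n, k=10, seed=0):
    """Yield (train, test) index lists; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Synthetic toy data (NOT the dataset used in the paper): the label follows
# the first feature, flipped with 10% probability; the second feature is noise.
rng = random.Random(42)
rows, labels = [], []
for _ in range(200):
    a, b = rng.choice(["yes", "no"]), rng.choice(["yes", "no"])
    flipped = rng.random() < 0.1
    rows.append((a, b))
    labels.append("positive" if (a == "yes") != flipped else "negative")

train_idx, test_idx = percentage_split(len(rows))
split_acc = accuracy(train_idx, test_idx, rows, labels)
fold_accs = [accuracy(tr, te, rows, labels) for tr, te in k_fold(len(rows))]
cv_acc = sum(fold_accs) / len(fold_accs)
print(f"10-fold CV accuracy: {cv_acc:.3f}")
print(f"66% split accuracy:  {split_acc:.3f}")
```

Cross-validation averages the accuracy over ten disjoint test folds, so every record is used for testing exactly once, whereas the percentage split evaluates on a single held-out portion. That difference is one reason the two protocols can report slightly different accuracies for the same model.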

This work has been supported by FCT-Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020.

Author information

Correspondence to Hugo Peixoto.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Sousa, R., Sousa, R., Peixoto, H., Machado, J. (2023). Prediction Models Applied to Lung Cancer Using Data Mining. In: Braubach, L., Jander, K., Bădică, C. (eds) Intelligent Distributed Computing XV. IDC 2022. Studies in Computational Intelligence, vol 1089. Springer, Cham. https://doi.org/10.1007/978-3-031-29104-3_22
