Risk Prediction-Based Breast Cancer Diagnosis Using Personal Health Records and Machine Learning Models

Moturi, Sireesha; Tirumala Rao, S. N.; Vemuru, Srikanth

doi:10.1007/978-981-15-9516-5_37

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1280)

Abstract

Breast cancer is most common among middle-aged women and is the fourth most dangerous cancer compared with other cancers. The number of breast cancer patients has increased significantly in recent years, so early diagnosis has become a necessary task in cancer research to facilitate the subsequent clinical management of patients. Early detection of the tumor is the key to prevention: it can stop tumor growth and save lives. In machine learning classification, cancer patients are classified into two classes, benign or malignant. Preprocessing techniques such as filling missing values, applying the correlation coefficient and the synthetic minority oversampling technique (SMOTE) are implemented, and tenfold cross-validation is used to obtain the accuracy. The main aim of this study is to identify key features in the dataset and to evaluate the performance of different machine learning algorithms: random forest classifier, logistic regression, support vector machine, decision tree, Gaussian Naive Bayes and k-nearest neighbors. Based on the results, the classification model that gives the highest accuracy is used as the best model for cancer prediction.
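The preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names (fill_missing, pearson, smote, kfold_indices), the mean-based fill strategy, the neighbour count k and the toy values are all assumptions chosen for illustration. In practice, libraries such as scikit-learn and imbalanced-learn provide tested versions of these steps.

```python
import math
import random
from statistics import mean

def fill_missing(column):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def kfold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle indices and split them into n_folds (train, test) pairs."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(n_folds)
    ]

# Toy data (hypothetical): one feature with a gap; labels 0 = benign, 1 = malignant.
feature = fill_missing([1.0, 2.0, None, 4.0, 8.0, 9.0])
labels = [0, 0, 0, 0, 1, 1]
print("correlation with label:", pearson(feature, labels))
print("synthetic minority rows:", smote([[8.0, 1.0], [9.0, 2.0], [10.0, 1.5]], n_new=2))
print("number of folds:", len(kfold_indices(20)))
```

A feature whose correlation with the label is close to zero would be a candidate for removal, and each (train, test) pair from kfold_indices would be used to fit and score one of the classifiers. When oversampling is combined with cross-validation, it is usually applied to the training fold only, so that synthetic rows do not leak into the test fold.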

Author information

Authors and Affiliations

KLEF, Vaddeswaram, India
Sireesha Moturi & Srikanth Vemuru
Narasaraopeta Engineering College, Narasaraopet, India
Sireesha Moturi & S. N. Tirumala Rao

Corresponding author

Correspondence to Sireesha Moturi.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, K. L. University, Guntur, Andhra Pradesh, India
Debnath Bhattacharyya
Department of Computer Science and Engineering, Vignan’s Institute of Information Technology, Visakhapatnam, Andhra Pradesh, India
N. Thirupathi Rao

Cite this paper

Moturi, S., Tirumala Rao, S.N., Vemuru, S. (2021). Risk Prediction-Based Breast Cancer Diagnosis Using Personal Health Records and Machine Learning Models. In: Bhattacharyya, D., Thirupathi Rao, N. (eds) Machine Intelligence and Soft Computing. Advances in Intelligent Systems and Computing, vol 1280. Springer, Singapore. https://doi.org/10.1007/978-981-15-9516-5_37

DOI: https://doi.org/10.1007/978-981-15-9516-5_37
Published: 21 January 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9515-8
Online ISBN: 978-981-15-9516-5
eBook Packages: Intelligent Technologies and Robotics (R0)
