Abstract
In this paper, a deep learning-based approach is applied to high-dimensional, high-volume, and highly sparse medical data to identify critical causal attributions that might affect the survival of a breast cancer patient. The Surveillance, Epidemiology, and End Results (SEER) breast cancer data set is explored in this study. It contains accumulated patient-level and treatment-level information, such as cancer site, cancer stage, treatment received, and cause of death. Restricted Boltzmann machines (RBMs) are proposed for dimensionality reduction in the analysis. The RBM is a popular deep learning model that can be used to extract features from a given data set and to transform the data non-linearly into a lower-dimensional space for further modelling. In this study, a group of RBMs is trained to sequentially transform the original data into a very low-dimensional space, and k-means clustering is then conducted in this space. The resulting cluster memberships of the data samples are mapped back to the original sample space for interpretation and insight creation. The analysis demonstrates that essential features relating to breast cancer survival can be effectively extracted and carried forward into a much lower-dimensional space formed by RBMs.
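The pipeline described above (stacked RBMs, then k-means in the reduced space, then mapping cluster labels back to the original attributes) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the network sizes, training procedure, number of clusters, or preprocessing, so the layer sizes, the one-step contrastive divergence (CD-1) training, the value of `k`, and the synthetic binary data standing in for encoded SEER records are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def transform(self, v):
        # Hidden-unit activation probabilities p(h = 1 | v).
        return sigmoid(v @ self.W + self.b_h)

    def fit(self, data, epochs=20, lr=0.1, batch_size=32):
        n = data.shape[0]
        for _ in range(epochs):
            order = self.rng.permutation(n)
            for start in range(0, n, batch_size):
                v0 = data[order[start:start + batch_size]]
                h0 = self.transform(v0)
                # Sample binary hidden states, then reconstruct the visibles.
                h_sample = (self.rng.random(h0.shape) < h0).astype(float)
                v1 = sigmoid(h_sample @ self.W.T + self.b_v)
                h1 = self.transform(v1)
                m = v0.shape[0]
                self.W += lr * (v0.T @ h0 - v1.T @ h1) / m
                self.b_v += lr * (v0 - v1).mean(axis=0)
                self.b_h += lr * (h0 - h1).mean(axis=0)
        return self

def kmeans(x, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster label per row of x."""
    centers = x[rng.choice(x.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(0)

# Synthetic stand-in for sparse, one-hot-encoded patient records (NOT SEER data).
X = (rng.random((300, 60)) < 0.1).astype(float)

# Hypothetical layer sizes: each RBM is trained on the previous RBM's output.
layer_sizes = [20, 8, 2]
codes = X
for n_hidden in layer_sizes:
    rbm = RBM(codes.shape[1], n_hidden, rng).fit(codes)
    codes = rbm.transform(codes)

# Cluster in the low-dimensional space (k = 3 is an arbitrary choice here).
labels = kmeans(codes, k=3, rng=rng)

# Map memberships back: per-cluster mean of every original attribute.
profiles = {int(j): X[labels == j].mean(axis=0) for j in np.unique(labels)}
for j, profile in profiles.items():
    top = np.argsort(profile)[::-1][:5]
    print(f"cluster {j}: size={int((labels == j).sum())}, top attributes={top.tolist()}")
```

Because the cluster profiles are computed on the original columns of `X`, each cluster can be read in terms of the original attributes rather than the learned hidden units, which is the interpretive step the abstract refers to.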
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, D. et al. (2021). Deep Learning Causal Attributions of Breast Cancer. In: Arai, K. (eds) Intelligent Computing. Lecture Notes in Networks and Systems, vol 285. Springer, Cham. https://doi.org/10.1007/978-3-030-80129-8_10
DOI: https://doi.org/10.1007/978-3-030-80129-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80128-1
Online ISBN: 978-3-030-80129-8
eBook Packages: Intelligent Technologies and Robotics (R0)