Abstract
Advanced data analysis tools and bioinformatics are essential for uncovering the nature of breast cancer, the leading cause of cancer death among women. This study aims to identify potential genomic biomarkers that significantly affect four prognostic factors: tumour size, lymph node involvement, metastasis, and overall survival status. The Random Forest algorithm was trained on data from The Cancer Genome Atlas Breast Cancer dataset, which contains the expression values of 19,737 genes. To obtain the optimal learning model, the process was repeated 20 times for each indicator, and only genes with a p-value < 0.05 were considered further. Several performance metrics (e.g., F1 score) were calculated to check the algorithm's reliability. As a result, 97 and 7 genes were included in the extended and final databases, respectively. The chosen genes have been shown to play a critical role in cancer-related pathways, such as Toll-like receptor and NF-κB signalling, and to affect cell proliferation, tumour formation, and angiogenesis. Thus, this study demonstrates the potential of machine learning analyses for biomedical purposes and provides machine-generated insights into breast cancer development, laying the groundwork for further in vitro examinations to validate the prognostic potential of these biomarkers.
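As a rough illustration of the F1 score mentioned among the performance metrics, the sketch below computes it from binary labels using only the Python standard library. The function name `f1_score`, the label encoding (1 = event present, 0 = absent), and the toy label vectors are assumptions made for this example. They are not taken from the study, whose exact evaluation code is not described here.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return the F1 score (harmonic mean of precision and recall) for `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # No correct positive predictions: F1 is conventionally taken as 0 here.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for one prognostic indicator (e.g. 1 = metastasis present).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In a repeated-training setup such as the 20 runs per indicator described above, a score like this could be computed for each run and then summarised, for instance by its mean.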
Acknowledgements
This work was partly supported by EMBO IG 4728-2020, Jacek Arct and ‘New Technologies for Women’ scholarships, Kyiv School of Economics «Talents for Ukraine» for NK, and Alexander von Humboldt Stiftung (Philipp Schwartz-Initiative) for HF.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kasianchuk, N., Tsvyk, D., Siemens, E., Ostash, V., Falfushynska, H. (2023). Genomic Data Machined: The Random Forest Algorithm for Discovering Breast Cancer Biomarkers. In: Dovgyi, S., Trofymchuk, O., Ustimenko, V., Globa, L. (eds) Information and Communication Technologies and Sustainable Development. ICT&SD 2022. Lecture Notes in Networks and Systems, vol 809. Springer, Cham. https://doi.org/10.1007/978-3-031-46880-3_25
DOI: https://doi.org/10.1007/978-3-031-46880-3_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46879-7
Online ISBN: 978-3-031-46880-3
eBook Packages: Intelligent Technologies and Robotics (R0)