Abstract
Determining the subcellular localization of mRNA is an important problem in biotechnology. Knowing where an mRNA localizes can shed light on genetic regulatory mechanisms, gene expression patterns, and the evolution of cellular physiological and developmental processes. Experimental determination of mRNA subcellular localization, however, is time-consuming and resource-intensive. Although several algorithms and predictive models for mRNA subcellular localization have been developed, their reported performance remains limited. In this paper, we introduce a novel hybrid approach that classifies mRNA into five subcellular locations: the cytoplasm, endoplasmic reticulum, extracellular region, mitochondria, and nucleus. Our model combines ensemble learning with a hybrid methodology, incorporating multiple biologically relevant features extracted from the input sequencing data. In addition, the model dynamically adjusts the weights assigned to functions and to the minority class by modulating the weight ratio of the individual models as they contribute to the principal model. Overall, our model delivers promising results, with an average accuracy of 0.89 on an independent dataset for classifying mRNA subcellular localization into five classes. This represents a significant improvement over preceding algorithms, particularly where the classes are adequately sampled.
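The abstract does not specify how the individual models are weighted when they contribute to the principal model. As a rough illustration only, the sketch below shows one common way such a combination can be done: weighted soft voting, where each base model's class probabilities are averaged using normalized per-model weights. The function names, the example probabilities, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# The five subcellular locations named in the abstract.
CLASSES = [
    "cytoplasm",
    "endoplasmic reticulum",
    "extracellular region",
    "mitochondria",
    "nucleus",
]


def weighted_soft_vote(
    probas: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Average per-model class probabilities using normalized model weights.

    Assumption: each inner sequence holds one base model's probabilities,
    ordered as in CLASSES. This is an illustrative scheme, not the paper's.
    """
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    combined = [0.0] * n_classes
    for model_probs, weight in zip(probas, weights):
        if len(model_probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for k in range(n_classes):
            combined[k] += (weight / total) * model_probs[k]
    return combined


def predict_label(
    probas: Sequence[Sequence[float]], weights: Sequence[float]
) -> str:
    """Return the class name with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return CLASSES[best]


# Hypothetical outputs of three base models for a single mRNA sequence.
model_a = [0.50, 0.10, 0.10, 0.10, 0.20]
model_b = [0.20, 0.10, 0.10, 0.10, 0.50]
model_c = [0.30, 0.10, 0.10, 0.10, 0.40]
models = [model_a, model_b, model_c]

equal_vote = predict_label(models, [1.0, 1.0, 1.0])
reweighted_vote = predict_label(models, [3.0, 1.0, 1.0])
```

With equal weights the combined vote favors the nucleus, while tripling the weight of the first model shifts the prediction to the cytoplasm. Adjusting the weight ratio in this way is one means of changing how strongly each model influences the final call; whether the paper tunes its ratios like this is not stated in the abstract.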
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, TT., Nguyen, VN., Tran, TX., Le, NQK. (2023). Enhanced Prediction of mRNA Subcellular Localization Using a Novel Ensemble Learning and Hybrid Approach. In: Nghia, P.T., Thai, V.D., Thuy, N.T., Son, L.H., Huynh, VN. (eds) Advances in Information and Communication Technology. ICTA 2023. Lecture Notes in Networks and Systems, vol 847. Springer, Cham. https://doi.org/10.1007/978-3-031-49529-8_7
DOI: https://doi.org/10.1007/978-3-031-49529-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49528-1
Online ISBN: 978-3-031-49529-8
eBook Packages: Intelligent Technologies and Robotics (R0)