Abstract
This chapter addresses the prediction of molecular recognition features (MoRFs) within intrinsically disordered protein sequences. Disordered proteins contain MoRF regions that readily bind partner proteins. On binding, these regions undergo a disorder-to-order transition, which makes them essential for various biological functions. The project therefore extracts structural information from disordered protein sequences and applies machine learning techniques to predict the MoRF regions in them. The proposed method relies on programming and simulation analysis in MATLAB, where the structural information is extracted from the disordered protein sequences and then used to train and test support vector machine (SVM) models. Two test methods are used to evaluate the performance of the trained SVM models. The analysis shows that the cross-validation test yields better performance than the independent test.
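The two evaluation protocols named in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the chapter, and it does not reproduce the chapter's structural features or its SVM training. The function names, the flank size, the padding symbol, and the fold count are all hypothetical choices made for illustration. The sketch shows how a labelled sequence might be cut into per-residue windows, and how proteins might be divided for a k-fold cross-validation test versus a single independent (hold-out) test.

```python
import random


def window_samples(sequence, labels, flank=2, pad="X"):
    """Cut one labelled sequence into per-residue (window, label) samples.

    Each residue gets a window of 2 * flank + 1 characters centred on it;
    the sequence ends are padded with `pad` (an assumed placeholder symbol).
    """
    padded = pad * flank + sequence + pad * flank
    width = 2 * flank + 1
    return [(padded[i:i + width], labels[i]) for i in range(len(sequence))]


def k_fold_indices(n_items, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every item is tested exactly once."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx


def holdout_split(n_items, test_fraction=0.25, seed=0):
    """Return one (train_idx, test_idx) pair for an independent test."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    n_test = max(1, int(round(n_items * test_fraction)))
    return order[n_test:], order[:n_test]


if __name__ == "__main__":
    # Toy example: 1 marks a MoRF residue, 0 a non-MoRF residue (made-up data).
    for window, label in window_samples("ACDE", [0, 1, 1, 0], flank=1):
        print(window, label)

    # Split at the protein level so windows from one protein never appear
    # in both the training and the test portion.
    for train_idx, test_idx in k_fold_indices(n_items=10, k=5):
        print("train proteins:", sorted(train_idx), "test proteins:", sorted(test_idx))
```

In the cross-validation test every protein serves as test data once, whereas the independent test scores the model on a single held-out set that is never used for training.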
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Elisha, D., Sanau, J., Assaf, M.H., Kumar, R.R., Sharma, B., Sharma, R. (2023). Molecular Recognition and Feature Extraction System. In: Yadav, A., Nanda, S.J., Lim, MH. (eds) Proceedings of International Conference on Paradigms of Communication, Computing and Data Analytics. PCCDA 2023. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-99-4626-6_43
DOI: https://doi.org/10.1007/978-981-99-4626-6_43
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4625-9
Online ISBN: 978-981-99-4626-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)