Abstract
We present an easy protocol for evolutionary analysis of proteins, with an emphasis on studying the evolutionary dynamics of disordered regions. Using the p53 protein family as an example, we provide a guide for finding homologous sequences in a database and refining a dataset before constructing the evolutionary context by building a phylogenetic tree. We show how a multiple sequence alignment and phylogeny for a protein family can be further partitioned into smaller datasets in order to investigate the changes in disorder content across the phylogeny. Based on the evolutionary context, we also investigate site-specific conservation of disorder. Last, we address how to evaluate the evolutionary dynamics of disorder-to-order transitions.
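The site-specific step can be illustrated with a minimal sketch. It assumes a toy alignment and per-residue binary disorder calls (1 = disordered, 0 = ordered) of the kind a disorder predictor would supply. The sequences, the calls, and all function names are hypothetical illustrations, not the authors' implementation. The sketch projects each sequence's calls onto alignment columns, computes the disorder fraction per sequence, and scores each column by the share of non-gap sequences that agree with the majority state.

```python
from typing import Dict, List, Optional

GAP = "-"


def map_to_alignment(aligned_seq: str, calls: List[int]) -> List[Optional[int]]:
    """Project per-residue disorder calls onto alignment columns (None at gaps)."""
    n_residues = sum(1 for c in aligned_seq if c != GAP)
    if n_residues != len(calls):
        raise ValueError("number of calls must equal number of residues")
    remaining = iter(calls)
    return [None if c == GAP else next(remaining) for c in aligned_seq]


def disorder_fraction(calls: List[int]) -> float:
    """Fraction of residues in one sequence that are called disordered."""
    return sum(calls) / len(calls) if calls else 0.0


def column_disorder_conservation(
    mapped: Dict[str, List[Optional[int]]]
) -> List[Optional[float]]:
    """Per column: share of non-gap sequences in the majority state (0.5 to 1.0)."""
    rows = list(mapped.values())
    scores: List[Optional[float]] = []
    for col in range(len(rows[0])):
        states = [row[col] for row in rows if row[col] is not None]
        if not states:
            scores.append(None)  # column contains only gaps
            continue
        frac_disordered = sum(states) / len(states)
        scores.append(max(frac_disordered, 1.0 - frac_disordered))
    return scores


# Hypothetical toy data: three aligned sequences and made-up binary calls.
alignment = {
    "seqA": "MK-TE",
    "seqB": "MKLTE",
    "seqC": "M--SE",
}
calls = {
    "seqA": [1, 1, 0, 0],
    "seqB": [1, 1, 0, 0, 0],
    "seqC": [0, 0, 0],
}

mapped = {name: map_to_alignment(alignment[name], calls[name]) for name in alignment}
fractions = {name: disorder_fraction(calls[name]) for name in calls}
conservation = column_disorder_conservation(mapped)
```

Splitting the `alignment` dictionary by clade before calling these functions would give the per-partition comparison described in the abstract. A real analysis would also need to decide how to treat gap-rich columns, which this sketch simply ignores.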
Copyright information
© 2020 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Nunez-Castilla, J., Siltberg-Liberles, J. (2020). An Easy Protocol for Evolutionary Analysis of Intrinsically Disordered Proteins. In: Kragelund, B.B., Skriver, K. (eds) Intrinsically Disordered Proteins. Methods in Molecular Biology, vol 2141. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0524-0_7
DOI: https://doi.org/10.1007/978-1-0716-0524-0_7
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0523-3
Online ISBN: 978-1-0716-0524-0
eBook Packages: Springer Protocols