Abstract
The recurrent cycle of whole genome duplication (WGD) followed by massive duplicate gene loss (fractionation) differentiates plant evolutionary history from that of most other phylogenetic domains, where WGD has occurred relatively rarely, even on an evolutionary time scale. We discuss the mechanism of WGD and its biological consequences. We survey the prevalence of WGD in the flowering plants. We outline some of the major kinds of combinatorial optimization problems arising in computational biology for analyzing WGD. Fractionation and its consequences are the subject of mathematical modeling questions and further combinatorial algorithms. A strong connection is made between WGD in phylogenetic context and the theory of gene trees and species trees. We illustrate the analysis of WGD with studies involving a large number of sequenced plant genomes, including grape, the crucifers and other rosids, the asterid tomato, the eudicot Nelumbo nucifera and pineapple, a monocot.
Acknowledgements
Research supported in part by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC) and by National Science Foundation grant IOS-1339156. DS holds the Canada Research Chair in Mathematical Genomics.
Competing Interests
The authors declare that they have no competing interests.
Copyright information
© 2018 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Sankoff, D., Zheng, C. (2018). Whole Genome Duplication in Plants: Implications for Evolutionary Analysis. In: Setubal, J., Stoye, J., Stadler, P. (eds) Comparative Genomics. Methods in Molecular Biology, vol 1704. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7463-4_10
DOI: https://doi.org/10.1007/978-1-4939-7463-4_10
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7461-0
Online ISBN: 978-1-4939-7463-4
eBook Packages: Springer Protocols