7.1 Introduction to Plant Repetitive DNA

There is, approximately, a 2,400-fold range in genome size (GS; measured as the amount of DNA in giga base pairs [Gbp] in the haploid unreplicated 1C nucleus) across angiosperms, ranging from ~0.061 Gbp/1C in Genlisea aurea to ~148 Gbp/1C in Paris japonica (Pellicer et al. 2010), a larger range than that observed for any comparable group of eukaryotes (Leitch and Leitch 2012; Pellicer et al. 2018). This variation in GS arises from two sources: polyploidy (whole-genome duplication) and the balance between processes that amplify repetitive DNA (i.e., (retro)transposition and recombination-based amplification) and those that delete repeats and other types of non-essential DNA (i.e., recombination-based deletion) (Kejnovsky et al. 2012).
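The fold range quoted above follows directly from the two extreme 1C-values. As a minimal sketch, the arithmetic can be checked as follows; the function and constant names are illustrative and do not come from the cited studies.

```python
# Genome-size extremes quoted in the text (Gbp per 1C nucleus).
SMALLEST_GS_GBP = 0.061   # Genlisea aurea
LARGEST_GS_GBP = 148.0    # Paris japonica


def fold_range(smallest: float, largest: float) -> float:
    """Return how many times larger the largest value is than the smallest."""
    if smallest <= 0:
        raise ValueError("smallest genome size must be positive")
    return largest / smallest


if __name__ == "__main__":
    ratio = fold_range(SMALLEST_GS_GBP, LARGEST_GS_GBP)
    print(f"Angiosperm genome sizes span roughly a {ratio:.0f}-fold range")
```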

Despite the large range in GS found across angiosperms, most species have a small GS (modal GS 0.587 Gbp/1C, mean GS 5.020 Gbp/1C, n = 10,768 species, Pellicer et al. 2018). This is interesting given the context of multiple ancestral polyploid events in most, if not all, angiosperm lineages. For example, within seed plants, the lineage leading to Solanaceae is predicted to have been through multiple rounds of whole-genome duplication, such that all species are at least paleo-36-ploid (Wendel 2015). Such high ploidal levels are not generally reflected in large genomes, most likely as a consequence of genome downsizing that typically follows polyploidy and involves the deletion of both genes and repeats during the process of post-polyploidization diploidization (Leitch and Bennett 2004; Dodsworth et al. 2016). Another general feature reported for repeats in plants with a small to medium GS (<10 Gbp/1C) is that many are highly dynamic, with amplification and deletion processes occurring so rapidly that the half-life of any particular repeat is reported to be only 3–4 million years in Poaceae, Brassicaceae, Fabaceae, and Vitaceae (Ma et al. 2004; El Baidouri and Panaud 2013).
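To give a feel for what a repeat half-life of 3–4 million years implies, the sketch below assumes simple first-order (exponential) decay, which is a simplifying assumption and not necessarily the model used in the cited studies. The 6-million-year elapsed time is chosen only for illustration, and the function name is hypothetical.

```python
def remaining_fraction(elapsed_my: float, half_life_my: float) -> float:
    """Fraction of an initial repeat cohort still present after elapsed_my
    million years, ASSUMING first-order exponential decay."""
    if half_life_my <= 0:
        raise ValueError("half-life must be positive")
    if elapsed_my < 0:
        raise ValueError("elapsed time cannot be negative")
    return 0.5 ** (elapsed_my / half_life_my)


if __name__ == "__main__":
    elapsed = 6.0  # illustrative time span in million years (assumption)
    for half_life in (3.0, 4.0):  # half-life range quoted in the text
        fraction = remaining_fraction(elapsed, half_life)
        print(f"half-life {half_life} My: {fraction:.1%} remaining after {elapsed} My")
```

Under this assumption, only about a quarter to a third of a repeat cohort would survive 6 million years, which illustrates why repeat profiles can be largely replaced over such time frames.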

The proportion of a plant genome that is repetitive is highly variable between taxa. This partly reflects differences in the balance between amplification and deletion processes, which in turn influence the occurrence and longevity of (micro)satellite repeats, (retro)transposable elements, and their truncated derivatives. For example, conifers typically have larger GS than most lineages of angiosperms, probably reflecting contrasting genome dynamics between gymnosperms and angiosperms (Leitch and Leitch 2012). There are also differences in repetitive content of the genome depending on GS, such that species with the smallest GS have the smallest genome proportion occupied by repeats. For example, the estimated genome proportion in Utricularia gibba with a GS of 0.088 Gbp/1C is just 3% (Ibarra-Laclette et al. 2013) versus a reported 91.6% genome proportion in Aegilops tauschii, with a GS of 4.3 Gbp/1C (Li et al. 2004).

7.2 Introduction to Nicotiana

There are 42 species of Nicotiana that are cytogenetically diploid, falling into eight sections: Alatae with eight species (haploid chromosome count, n = 9–10), Noctiflorae with six species (n = 12), Petunioides with eight species (n = 12), Paniculatae with seven species (n = 12), Sylvestres with a single species (n = 12), Tomentosae with five species (n = 12), Trigonophyllae with two species (n = 12), and Undulatae with five species (n = 12). Homoploid hybrids have been reported to form between some of the sections, particularly in species belonging to Petunioides and Noctiflorae (Knapp et al. 2004; Kelly et al. 2010).

In addition, there are six allopolyploid Nicotiana sections, thought to involve species from different diploid sections, and providing an opportunity to study and compare repeat divergence in diploid lineages and the genomic melting pot generated by allopolyploidy itself. All the allopolyploids retain a signature of polyploidy in chromosome counts, and evidence of interspecific hybridization is seen in phylogenetic patterns from multiple nuclear markers, including the internal transcribed spacer sequences of ribosomal DNA (rDNA) (Clarkson et al. 2010, 2017), paralogs of the nuclear genes ADH, GS (glutamine synthetase; not to be confused with genome size), FLO/LFY, WAXY, and MADS1/FUL (Kelly et al. 2010), and comparisons between these and plastid sequences from the maternal genome donor (Clarkson et al. 2004, 2010). Figure 7.1 summarizes the phylogenetic relationships in genus Nicotiana.

Fig. 7.1

Summary of phylogenetic relationships in the genus Nicotiana, including the dates of formation (Clarkson et al. 2017) and number of species in each of the allopolyploid sections (Knapp et al. 2004). The diploid phylogenetic tree was reconstructed from combined plastid and nrITS data and is summarized from Clarkson et al. (2004) and Kelly et al. (2010). Haploid chromosome numbers (n) are given for each section. Section names are given in bold. There are three polyploid sections containing multiple species (Nicotiana sections Suaveolentes, Repandae, and Polydicliae) and two monotypic polyploid sections, Nicotiana section Nicotiana (N. tabacum) and Rusticae (N. rustica). The polyploid species N. arentsii is in Nicotiana section Undulatae, which also includes diploid species

(1) The oldest allopolyploid section, Nicotiana section Suaveolentes, contains at least 40 species (Knapp et al. 2004). This section has radiated substantially in Australia, and there are likely to be many more taxa than are currently recognized (four new species were described recently by Chase et al. 2018). The section includes N. benthamiana, a species widely used in studies of plant–pathogen interactions, and whose genome has been fully sequenced (Bombarely et al. 2012). Time-calibrated phylogenetic reconstructions suggest that the section probably formed about 6 million years ago (Mya). The diploid progenitors of section Suaveolentes are difficult to determine, likely because of their formation from one or more diploid hybrid taxa. Phylogenetic evidence suggests the involvement of several diploid sections (Noctiflorae, Alatae, and Petunioides) as the maternal genome donor, with N. sylvestris (section Sylvestres) as the most closely related extant relative of the paternal genome donor (Clarkson et al. 2010, 2017; Schiavinato et al. 2020). Chromosome numbers vary from the expected n = 24 (the sum of the parental chromosome counts, for an allotetraploid) down to n = 15, depending on the species. The species with lower chromosome numbers are presumed to have arisen from post-polyploidization dysploidy involving chromosome rearrangements, as part of a broader diploidization process.

(2) Nicotiana section Repandae has four allotetraploid species, each with n = 24 chromosomes, with two species that are endemic to the Revillagigedo Islands. The section is thought to have formed about 4 Mya (Clarkson et al. 2017) from a maternal parent most closely related to extant N. sylvestris and a paternal species related to Nicotiana section Trigonophyllae (e.g., N. obtusifolia var. obtusifolia/N. obtusifolia var. palmeri).

(3) Nicotiana section Polydicliae has two species, each with n = 24 chromosomes, one of which, N. quadrivalvis, was cultivated by native North Americans, presumably for recreational use. The section formed about 1.4 Mya (Clarkson et al. 2017) from progenitors of N. obtusifolia (section Trigonophyllae; maternal genome donor) and a member of section Petunioides as the paternal genome donor.

(4) Nicotiana rustica (section Rusticae, n = 24) formed about 0.7 Mya (Clarkson et al. 2017) from progenitors of N. undulata (section Undulatae; paternal) and section Paniculatae (either N. paniculata or N. knightiana) as the maternal progenitor.

(5) Nicotiana tabacum (section Nicotiana, n = 24), the best-known Nicotiana allopolyploid, formed about 0.6 Mya (Clarkson et al. 2017) from progenitors of N. sylvestris (section Sylvestres; maternal genome donor) and N. tomentosiformis (section Tomentosae; paternal genome donor).

(6) Nicotiana arentsii (section Undulatae, n = 24) formed about 0.4 Mya (Clarkson et al. 2017) from two other species in section Undulatae, most closely related to extant N. undulata (section Undulatae; maternal parent) and N. wigandioides (section Undulatae; paternal parent).

7.3 Genome Size Variation in Nicotiana

An analysis of angiosperm GSs, grouped into presumed ploidal levels based on chromosome counts (from diploid to octoploid), indicated that higher ploidal levels were not associated with larger GS, as might be predicted (Leitch and Bennett 2004). These findings suggested that following polyploidy, there is a tendency toward the loss of genomic DNA, termed genome downsizing, perhaps caused by selection against large GS in angiosperms (Leitch and Leitch 2012; Guignard et al. 2016). Genome downsizing is predicted as a likely occurrence given enough time post-polyploidization. Serial genome rearrangements associated with genome downsizing lead to the so-called “wondrous cycles of polyploidy” reported for many angiosperm lineages (Wendel 2015).

To predict the direction of GS change following ancient polyploidy, it is necessary to reconstruct the size of ancestral diploid genomes at the point of polyploid formation. Such ancestral GSs can be estimated by summing the GSs of the most closely related extant diploids (Leitch et al. 2008). Alternatively, they can be derived by reconstructing the ancestral GSs of the diploid lineages at the point of allopolyploidy. This is achieved using ancestral reconstruction approaches, which require a confident phylogenetic assignment of parental lineages. With estimates of the ages of polyploid events, rates of genome size change can be estimated. The predicted GS of ancestral polyploids can then be compared with the actual GS of the extant polyploid taxa. When such an analysis is conducted on Nicotiana, some polyploid species are predicted to have experienced genome upsizing, while most experienced genome downsizing.
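The simplest version of this comparison, in which the expected polyploid GS is taken as the sum of the GSs of the closest extant diploid relatives, can be sketched as follows. This is an illustrative outline only and not the procedure of Leitch et al. (2008); the function and class names are hypothetical, and the numbers in the usage example are placeholders, not measured 1C-values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GenomeSizeChange:
    expected_gbp: float    # sum of parental 1C-values (additivity assumption)
    percent_change: float  # negative = downsizing, positive = upsizing
    percent_per_my: float  # average rate of change per million years


def genome_size_change(parent_gs_gbp: Sequence[float],
                       observed_gs_gbp: float,
                       age_my: float) -> GenomeSizeChange:
    """Compare an observed polyploid GS with the sum of its parental GSs."""
    if not parent_gs_gbp or any(gs <= 0 for gs in parent_gs_gbp):
        raise ValueError("parental genome sizes must be positive")
    if age_my <= 0:
        raise ValueError("polyploid age must be positive")
    expected = sum(parent_gs_gbp)
    percent = (observed_gs_gbp - expected) / expected * 100.0
    return GenomeSizeChange(expected, percent, percent / age_my)


if __name__ == "__main__":
    # Placeholder values (Gbp/1C and million years), NOT real measurements.
    result = genome_size_change([2.0, 3.0], observed_gs_gbp=4.8, age_my=0.5)
    direction = "downsizing" if result.percent_change < 0 else "upsizing"
    print(f"expected {result.expected_gbp:.2f} Gbp/1C, "
          f"{abs(result.percent_change):.1f}% {direction}, "
          f"{result.percent_per_my:.1f}% per My")
```

Replacing the summed extant values with reconstructed ancestral GSs would change only the `parent_gs_gbp` inputs; the comparison itself stays the same.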

Given the GS of the progenitor diploid taxa, the allotetraploids N. tabacum and N. arentsii are thought to have undergone genome downsizing of ~3.7% and ~3.9%, respectively, over ~0.4–0.6 million years. Similarly, N. rustica is likely to have downsized by ~1.9–5.4% over a similar time frame. In contrast, in section Polydicliae, divergence over a longer time frame of ~1.4 million years is thought to have given rise to approximately 2.5% (N. clevelandii) and 7.5% (N. quadrivalvis) increases in GS. Both genome upsizing and downsizing have been reported in species in section Repandae, over ~4 million years, with genome upsizing in N. repanda (~26.6%), N. nesophila (~19.1%), and N. stocktonii (~19.1%), and downsizing in N. nudicaulis (~14.3%) (Leitch et al. 2008). In the oldest section, Suaveolentes, which originated ~6 Mya, almost all species have undergone substantial genome downsizing associated with their divergence (Fig. 7.2). This has generated genomes that are similar in size to the mean GS of extant Nicotiana diploids (i.e., ~3.2 Gbp/1C, 34 species) and not much larger than N. sylvestris (2.6 Gbp/1C, paternal parent) and the average GS of species in sections Noctiflorae, Alatae, and Petunioides (about 3.4 Gbp/1C, 14 species) of which an extinct relative or a homoploid hybrid is likely to have been the maternal parent. Such genome downsizing likely arises as part of the diploidization process that includes chromosome rearrangements and dysploidy (reduction from n = 24 to n = 15 chromosomes in some taxa).

Fig. 7.2

Box plots showing the distribution of genome sizes (1C-values) across diploid and polyploid Nicotiana taxa. Young polyploids show near-additive genome sizes and/or genome upsizing, whereas older polyploid groups tend to show genome downsizing. Analysis based on data from Leitch et al. (2008) for diploids (nine taxa), young polyploids (N. tabacum, N. rustica, N. arentsii, N. clevelandii, and N. quadrivalvis), and the 4-Mya section Repandae (all four taxa), and unpublished data for the 6-Mya section Suaveolentes (five taxa from the core Australian group: N. simulans, N. velutina, N. maritima, N. truncata, and N. goodspeedii)

7.4 Repeat Divergence in Diploid Nicotiana

In concert with GS variation, diploid Nicotiana genomes contain variable amounts of repetitive elements. The types, abundance, and patterns of element accumulation within the genomes of Nicotiana are, however, fairly consistent across the different diploid taxa thus far investigated. Nicotiana genomes are dominated by retrotransposons, particularly Ty3/Gypsy and Ty1/Copia elements. The most abundant retroelements in all Nicotiana genomes studied are Ty3/Gypsy elements, most of which are chromoviruses, particularly Tekay elements (Fig. 7.3). Most of the retrotransposon families found in Nicotiana are also present in genus Symonanthus, which is thought to be the closest relative to Nicotiana, based on phylogenetic evidence available to date (Clarkson et al. 2004, 2010; Kelly et al. 2010; Särkinen et al. 2013). Nevertheless, some elements are unique to Symonanthus (e.g., Ty1/Copia elements of the Ikaros family) and do not appear to be present in Nicotiana genomes (Fig. 7.3).

Fig. 7.3

Comparison of the repetitive DNA content in diploid Nicotiana species (with Symonanthus as the outgroup) as analyzed using high-throughput sequence data (Illumina) and RepeatExplorer2 de novo clustering (Novák et al. 2010, 2013). The abundance of repetitive elements is shown as a proportion of the genome and annotations are shown for major repeat types using REXdb (Neumann et al. 2019) as implemented in RepeatExplorer2. Abundant categories are shown with enlarged squares in the legend. Phylogenetic relationships are summarized and section names are shown in bold

Xu et al. (2017) examined repeats in four diploid Nicotiana species, as well as in potato (Solanum tuberosum) and tomato (Solanum lycopersicum), and suggested there was an expansion of Ty3/Gypsy retroelements in Nicotiana that correlated with its increased GS. In Fig. 7.3, we present further evidence that GS is correlated with the abundance of retroelements, particularly Ty3/Gypsy retrotransposons, across a broader set of eight Nicotiana diploid species (including representatives from all diploid sections). Xu et al. (2017) also suggested there was an expansion of a Solanaceae-specific subgroup of MITE elements (called DTT-NIC1) in N. attenuata that were enriched within a 1-kbp region upstream of the genes involved in nicotine biosynthesis. They proposed that these elements may have been involved in the up-regulation of genes involved in nicotine biosynthesis. Other class II DNA transposons that are present in all Nicotiana genomes analyzed to date include hAT and MuDR_Mutator elements (Fig. 7.3), while EnSpm/CACTA elements were found in the genomes of about half of the species analyzed.

Using molecular and cytogenetic approaches, much research has been conducted on tandemly repeated satellite DNAs of Nicotiana, including well-studied families such as GRS (Gazdová et al. 1995), NTRS (Matyasek et al. 1997), geminivirus-related DNA (GRD) (Bejarano et al. 1996), and HRS60 (e.g., Fajkus et al. 1995b; Gazdová et al. 1995) (Table 7.1). These repetitive elements were isolated predominantly from telomeric and subtelomeric domains (Fajkus et al. 1995a) and centromeric and pericentromeric domains (Shibata et al. 2013), and their overall structure and typically high levels of cytosine methylation have been studied (Kovarik et al. 2000). In addition, there has been considerable focus on sequence divergence, activity, cytosine methylation, and distribution of 5S (Fulneček et al. 2002; Matyasek et al. 2002) and 35S rDNA loci (Kovarik et al. 2004).

Table 7.1 Repeats that characterize the chromosome domains of Nicotiana tabacum (common tobacco)

The structure of the intergenic spacer and the sub-repeats contained in 35S rDNA have been further studied because they are thought to be involved in driving homogenization of rDNA unit arrays and the regulation of rDNA expression (Borisjuk et al. 1997; Volkov et al. 1999; Lim et al. 2004b; Kovarik et al. 2008). One 135-bp sub-repeat termed A1/A2, which was isolated from the 26–18S rDNA intergenic spacer of N. tomentosiformis, was shown to be dispersed across several chromosomes (Lim et al. 2004b). A comparison of the distribution of A1/A2 repeats in species of section Tomentosae showed that they were present in low copy numbers in all species except N. tomentosiformis and likely expanded in the lineage leading to this species.

The HRS60 family of repeats occurs at a subtelomeric location in all diploid Nicotiana sections examined (i.e., Alatae, Sylvestres, Undulatae, and Paniculatae), but not in section Tomentosae where the HRS60 repeats are interstitial; the significance of this is currently unclear (Lim et al. 2000b, 2004b, 2005, 2006). All the HRS60 sequences are clearly related, and variants have diverged and undergone sequence homogenization such that each Nicotiana section has its own characteristic member of this family (Koukalova et al. 2010). The predominantly subtelomeric location of the HRS60 family of sequences and rDNA loci in Nicotiana means that the sequences interface directly with telomeric and degenerate telomeric motifs. The colocalization of these apparently distinct domains may have consequences for plant genome evolution, and they certainly share some interesting biology; for example, they can generate non-coding transcripts, promote large-scale chromosomal rearrangements (e.g., Robertsonian fusions), and be associated with nucleoli (Dvořáčková et al. 2015).

The GRD tandem repeat family is of particular interest and is thought to have derived from a free-living geminivirus (with which there is much sequence similarity) that integrated into the genome. This integration is proposed to have occurred in an ancestor of diploid species in section Tomentosae, where it amplified in a tandem array (GRD5 sequences on homologous chromosome 2). Later, specifically in the lineage leading to N. tomentosiformis, but before the formation of N. tabacum, a related sequence integrated de novo at a new chromosomal locus (GRD3 on chromosome 2) where it too amplified in a tandem array. Sequence analysis suggests that a large number of synonymous changes occurred in GRD3 compared with the number of synonymous changes in GRD5, indicative of purifying selection between integration events (Murad et al. 2002, 2004). The best explanation of these data is that there was a recombination event involving GRD5 and another free-living geminivirus, which generated the new GRD3 sequence that was subsequently re-integrated into the genome in a process that may have involved a helitron transposable element (Murad et al. 2004). Thus, N. tabacum inherited both GRD3 and GRD5 from N. tomentosiformis at its formation, and subsequently a variant sequence (GRD53) integrated at a third locus in the tobacco genome, potentially also involving a helitron (Murad et al. 2004).

7.5 Divergence of Repeats in Allopolyploid Nicotiana

Allopolyploids have formed in Nicotiana over widely different timescales (<0.5–6 Mya), frequently involving the same diploid parental lineages, and creating multiple species in section Suaveolentes. In addition, multiple synthetic polyploid lines have been developed. This makes the Nicotiana genus an ideal model to determine the fate of repetitive sequences and genome restructuring subsequent to polyploidization over a range of timescales. Using sequence-specific amplified polymorphisms to characterize the distribution of Copia-like retrotransposons, Petit et al. (2007) compared N. tabacum with its progenitor diploids and found that, in N. tabacum, the different Copia-like elements studied showed unique patterns that arose subsequent to the formation of N. tabacum. The frequencies of losses and gains of the observed insertion sites reflect retroelement mobility and sequence losses. Such changes are also apparent in the S4 generation of synthetic tobacco polyploids, where significant amplifications of Tnt1 Copia elements have been observed (Petit et al. 2010). As first proposed by McClintock (1984), the genomic shock of polyploidy, which is thought to arise from the merger of two distinct parental genomes in the nucleus, potentially stimulated rapid and dynamic genomic changes early in N. tabacum divergence (Petit et al. 2010).

In addition to changes in the mobility of retroelements, the targeted loss of tandemly repeated and dispersed repeats in N. tabacum, particularly from the N. tomentosiformis-derived subgenome, has been reported from tobacco accessions (Skalicka et al. 2005; Renny-Byfield et al. 2011) and, to a lesser extent, from synthetic tobacco lines (Skalicka et al. 2003). For example, the NicCL3 tandem repeat sequence has a high copy number in N. tomentosiformis as in other diploids in section Tomentosae (Renny-Byfield et al. 2012). In contrast, in N. tabacum the NicCL3 tandem repeat sequence has a much lower copy number than in N. tomentosiformis, indicative of its loss from the N. tomentosiformis-derived subgenome of N. tabacum (Renny-Byfield et al. 2012). Similarly, there are fewer copies of an endogenous pararetrovirus-like sequence (Matzke et al. 2004) and of A1/A2 tandem repeats (Lim et al. 2004b) in N. tabacum than in N. tomentosiformis. The loss of the repeats from the paternally derived N. tomentosiformis progenitor’s genome is an example of asymmetric loss of DNA from parental subgenomes, a phenomenon called biased fractionation (cf. Wendel 2015). This may have arisen from incompatibilities between the maternally derived cytoplasm and the biparentally derived nucleus at the time of allopolyploid formation (Leitch et al. 2006), as proposed by the nuclear–cytoplasmic interaction hypothesis (Gill and Friebe 2013).

A comparison of six transposable elements using sequence-specific amplified polymorphisms in synthetic polyploids of N. tabacum, N. rustica, and N. arentsii and multiple accessions of their diploid progenitors revealed that although element losses were evident in newly formed polyploids (Mhiri et al. 1997, 2019), element mobility is not apparent until later generations, suggesting that meiosis stimulates mobility. Nevertheless, it should be noted that the dynamics of the elements were not the same in the three synthetic polyploids and that the most active elements in the diploid progenitors were those that had the greatest impact on the early genome divergence of the polyploids (Mhiri et al. 2019).

Unlike the repeats described in the previous paragraphs, fluorescent in situ hybridization to metaphase spreads of N. tabacum, N. rustica, and N. arentsii revealed complete additivity in the number and distribution of rDNA loci to those found in the parental diploids (Lim et al. 2004a). In some S4 generation synthetic tobacco lines, there was perfect additivity of rDNA loci and rDNA units to that found in the parents; however, in other lines, new loci had evolved and new units amplified. Furthermore, in some of these lines, new units with close similarities to those in N. tomentosiformis have completely overwritten parental units. This process represents astonishingly rapid homogenization that has affected thousands of genes in just a few generations (Skalicka et al. 2003). In natural allopolyploids, rDNA units show varying degrees of homogenization depending on the allopolyploid species studied. Nicotiana arentsii (formed ~0.4 Mya) shows complete homogenization toward an rDNA unit that is similar to that observed in the N. undulata parental genome, this process having thus replaced all the units derived from the N. wigandioides parent. In contrast, in the older polyploid N. rustica (~0.7 Mya), we see the least change in the rDNA unit structure, with the different accessions analyzed showing only different ratios of parental rDNA unit types (Dadejová et al. 2007). In N. tabacum, there is large-scale, but incomplete, replacement of N. sylvestris-type units by units that are most similar to those of the N. tomentosiformis parent (Lim et al. 2000a). More recently, an analysis of rDNA diversity in N. tabacum using high-throughput sequence read data revealed that sequences in the 35S rDNA genic component of the rDNA unit are the most homogeneous, whereas those of the intergenic spacer are more diverse, indicating a more complete homogenization of the functional domains, or less subsequent divergence in these regions following homogenization (Lunerová et al. 2017).

There is no evidence to suggest that nuclear–cytoplasmic interactions affect the direction of rDNA unit homogenization, because the direction of replacement/bias is not always toward the maternal progenitor (Volkov et al. 1999; Lim et al. 2000a). This suggests that these processes involve different mechanisms to those that affect the loss of other high-copy repeats, as reported by Renny-Byfield et al. (2011) for N. tabacum. Thus, as described previously in nucleolar dominance studies, there is no indication that the parental direction of the cross has any effect on homogenization processes (Chen and Pikaard 1997). Instead, based on an analysis of rRNA transcripts in N. tabacum and N. rustica, it appears that the rDNA units that are most actively transcribed are also those that were amplified (Lunerová et al. 2017). Kovarik et al. (2008) proposed that epigenetic changes triggered by polyploid formation establish patterns of nucleolar dominance that affect the expression of rDNA units, rendering the active units vulnerable to homogenization and unit conversion. Conversely, silent units, which are mostly condensed at interphase, may be less vulnerable to recombination processes that may occur in the nucleolus, and are the most likely to mutate and be lost. In this way, variant types that are active are prone to amplify across the rDNA loci, first across intralocus tandem arrays, then less frequently between arrays (Kovarik et al. 2008; Lunerová et al. 2017).

In older polyploids in sections Polydicliae and Repandae, which formed approximately 1.4 and 4 Mya, respectively, the numbers of rDNA loci have dropped to two or three, as found in the progenitor diploids (Clarkson et al. 2005). In section Polydicliae, subtelomeric satellite DNAs from both parents can be found on the same chromosomes, a characteristic that is not observed in younger polyploids, which tend to maintain subtelomeric satellites intact (Kenton et al. 1993; Lim et al. 2004a, 2005; Matyasek et al. 2011). The distribution in section Polydicliae could only have arisen if recombination-based processes occurred between both parental subgenomes (Koukalova et al. 2010). In the even older polyploid section Repandae (~4 Mya), three species (N. nesophila, N. stocktonii, and N. repanda) have a clade-specific satellite (NNES10) that has replaced other existing satellites found in the progenitor diploids, revealing the emergence, amplification, and homogenization of sequences over time frames of a few million years (Koukalova et al. 2010; Dodsworth et al. 2017).

Genomic in situ hybridization used to identify the parental origin of the chromosomes of polyploid species in section Repandae largely failed in early experiments (Lim et al. 2007). This was presumed to be due to large-scale replacement of the high copy repeats across the genome. Such genome turnover, in under 5 million years, is consistent with the half-life reported for repeats in rice, of ~6 million years (Ma et al. 2004). Modifications to genomic in situ hybridization protocols have enabled the discrimination of parental subgenomes in all Repandae species (Dodsworth et al. 2017), albeit with differences between taxa. These experiments probably localize the lower copy repeat fractions more effectively and, in doing so, reveal intergenomic translocations (Dodsworth et al. 2017). As noted in Sect. 7.3, three of the species in section Repandae (N. nesophila, N. stocktonii, and N. repanda) show genome upsizing, largely arising from the expansion of some high-copy repeats, whereas a fourth species (N. nudicaulis) shows genome downsizing and is sister to the rest of the species in this section (Renny-Byfield et al. 2013; Dodsworth et al. 2017). The divergence of section Repandae is thus associated with a changing profile of repeats (Dodsworth et al. 2017) that reflects the evolutionary history of this and other sections of Nicotiana (Koukalova et al. 2010).

7.6 Lag-Phase Hypothesis and Nicotiana

Much has been written about the advantages of polyploidy, including, for example, fixed heterozygosity, the release from selection of redundant gene duplicates that can then take up new functions (subfunctionalization/neofunctionalization), and the mix-and-match of induced redundancy in biochemical pathways that allows for the generation of new biochemistry (Soltis and Soltis 2000; Wendel 2015). In the polyploid events ancestral to Solanaceae and Nicotiana, there were duplications of the polyamine and nicotinamide adenine dinucleotide pathways that led to gene redundancy, reducing selection pressure on these pathways, and the evolution of a new alkaloid, nicotine, which characterizes the genus (Xu et al. 2017).

Despite the advantages of polyploidy, there may also be both severe and subtle disadvantages. For example, in the early stages of polyploid formation, there is a potential for reduced fitness, perhaps associated with nuclear–cytoplasmic interactions and the aberrant segregation of chromosomes at meiosis, the latter seen in the synthetic tetraploid N. tabacum (Burk 1973; Leitch et al. 2006) and in other well-studied tetraploids such as Arabidopsis arenosa (Yant et al. 2013). It has also been proposed that the extra burden of DNA associated with polyploidy may lead to resource limitation, particularly nitrogen and phosphorus, because nucleic acids are particularly demanding for these elements and many soils contain only small amounts of these nutrients (Šmarda et al. 2013; Guignard et al. 2016). Furthermore, increased GS is associated with larger cell sizes, longer cell cycle times, and negative impacts on photosynthesis (Greilhuber and Leitch 2013), all of which may act as a selection pressure against larger GSs, potentially leading to genome downsizing.

Given the high occurrence of polyploidy in the ancestry of most angiosperm lineages at a variety of nested phylogenetic levels (Wood et al. 2009), it has been proposed that the advantages likely outweighed disadvantages, especially from a long-term evolutionary viewpoint. Nonetheless, in the short term, polyploids are likely to show higher extinction rates (Mayrose et al. 2011), and, possibly, higher diversification rates than diploids (Soltis et al. 2014; Mayrose et al. 2015). Further analyses of diversification rates among angiosperms suggest nested shifts in diversification that do not correlate perfectly with predicted paleopolyploidy events; instead, increased diversification rates appear to follow a lag phase (Tank et al. 2015; Landis et al. 2018). The lag-phase hypothesis (Schranz et al. 2012) posits that over several millions or tens of millions of years, polyploid species can undergo rapid radiations and speciation depending on subsequent evolutionary events. These events may include the diploidization process itself (e.g., genome downsizing and gene neo- and sub-functionalization) that occurs subsequent to polyploidy (Dodsworth et al. 2016). Potentially, such diploidization processes also facilitate adaptation to new environments and the radiation of diploidized taxa into new niches. This phenomenon is certainly occurring in section Suaveolentes, where, despite the origin of the section being ~6 Mya (Clarkson et al. 2017), most species have evolved more recently (2–3 Mya; Chase et al. 2018), concomitant with moves into new habitats across the arid zones of Australia. A summary of the events that are likely to have occurred in that lag phase from a genomic viewpoint is given in Table 7.2, including an outline of the genomic processes affecting repeat dynamics over shorter timescales in polyploid Nicotiana.

Table 7.2 A hypothesis for genome evolution in polyploids, with reference, where possible, to Nicotiana species