Introduction

Transposable elements (TEs) are fragments of DNA that are able to “move” and proliferate within the host genome. TEs are commonly divided into two main groups: (1) Class I or retrotransposons, which replicate through a “copy and paste” mechanism by first producing an RNA molecule from a promoter in a terminal repeat (LTR retrotransposons) or from an internal promoter (non-LTR retrotransposons), then reverse transcribing the RNA, and finally integrating the resulting copy into the host genomic DNA; and (2) Class II or DNA transposons, which move via a “cut and paste” mechanism, producing double-strand breaks in the process. TEs are further subdivided into subclasses, orders, superfamilies, and families depending on their insertion mechanism, structure, and protein-coding similarities (Wicker et al. 2007).

TEs make up a large fraction of most genomes. For example, they compose over 45 % of the human genome (Alexander et al. 2010), ~40 % of the rice genome (Tenaillon et al. 2010), ~80 % of the maize genome (Schnable et al. 2009), and ~90 % of the wheat genome (Charles et al. 2008). Class I non-LTR retrotransposons predominate in mammalian genomes, with class II making up less than 5 % of the TE fraction (Deininger and Batzer 2002). For example, ~30 % of the human genome is derived from just two non-LTR retrotransposon families, termed LINE 1 (L1) and SINE (Alu) (Hancks et al. 2011; Lander et al. 2001). The yeast Saccharomyces cerevisiae has only LTR retrotransposons, termed Ty elements (Bleykasten-Grosshans et al. 2011). Flowering plants, including monocots (e.g., grasses) and dicots (e.g., tomato and Arabidopsis), have a rich collection of both TE classes, with a clear predominance of class I LTR retrotransposons (Kumar and Bennetzen 1999; Tenaillon et al. 2010). The Drosophila genome also contains TEs from both classes, but mainly class I elements (Kaminker et al. 2002). In wheat (Triticum-Aegilops superfamily), several reports have shown copy number variation of TEs, either by using a limited number of BAC sequences (Charles et al. 2008; Sabot et al. 2005) or by analyzing specific TE families in several wheat accessions (Asakura et al. 2008; Tomita et al. 2008; Yaakov and Kashkush 2012). The variability of TE proliferation in Triticum and Aegilops revealed their usefulness as genetic markers (Queen et al. 2004). The variability of TE content in wheat might impact wheat evolution (Charles et al. 2008; Yaakov and Kashkush 2012).

In plants, transposons are usually silenced by epigenetic means through DNA methylation, chromatin modifications, and small RNAs (Cantu et al. 2010; Slotkin and Martienssen 2007). However, TEs may be reactivated in situations of cell stress, such as wide hybridization and polyploidization (Grandbastien et al. 2005; Lisch 2009; Mansour 2007). The activity of transposons may be one mechanism to create the genetic variability needed for speciation of reproductively isolated populations (Dubcovsky and Dvorak 2007; Hosid et al. 2012). The heritable variability brought about by TEs includes genetic changes resulting from transpositional activity or homologous (Xuan et al. 2012), nonhomologous and illegitimate recombination of TEs (Devos et al. 2002), and epigenetic changes affecting neighboring genes resulting from production of anti-sense RNA (Kashkush and Khasdan 2007; Puig et al. 2004) and changes in DNA methylation and chromatin modifications.

Allopolyploidy involves the hybridization of genetically distinct but related genomes and whole-genome duplication. Allopolyploidy is a common occurrence in plants, found in most economically significant crops such as cotton and wheat (Feldman and Levy 2005).

The evolutionary history of wheat involves two separate allopolyploidization events: the first includes wild wheat T. urartu (2n = 2x = 14, genome AA) and a species from section Sitopsis (2n = 2x = 14, genome BB) to produce T. turgidum ssp. dicoccoides (2n = 4x = 28, genome BBAA) and the second includes T. turgidum ssp. dicoccoides and Ae. tauschii (2n = 2x = 14, genome DD) to produce bread wheat T. aestivum (2n = 6x = 42, genome BBAADD).

For the study of mechanisms underlying the immediate consequences of polyploidy-induced genomic changes (revolutionary changes), wheat has been a useful model, as its historical allopolyploidization events can be mimicked in the lab by hybridizing the known ancestors of wheat polyploids (Adams and Wendel 2005; Kashkush et al. 2003; Liu et al. 1998; Ma et al. 2004; Ozkan et al. 2001; Shaked et al. 2001; Yaakov and Kashkush 2011a, b). The impact of transposable elements on the genomic evolution of plants is not completely clear; however, the use of recently domesticated species, such as wheat, is essential for the study of rapidly changing genetic elements (Comai 2005). Previous studies have shown that newly formed polyploids undergo rapid and reproducible genomic and epigenomic changes (Chen 2007; Comai 2005; Feldman and Levy 2009). Furthermore, transcriptional activation of an LTR retrotransposon (WIS2-1A) was observed as a result of polyploidization in wheat and was found to influence adjacent gene products (Kashkush et al. 2003). Polyploidization in wheat also affected the methylation status (Kraitshtein et al. 2010; Parisod et al. 2009, 2010; Yaakov and Kashkush 2011a, b; Zhao et al. 2011), associated siRNAs (Kenan-Eichler et al. 2011), and rearrangements (Bento et al. 2008; Kraitshtein et al. 2010; Petit et al. 2010) of transposable elements.

In this study, we assessed in detail the genomic composition of 16 TE families, including LTR and non-LTR retrotransposons and DNA transposons, in 22 Triticum and Aegilops accessions, including diploid (AA, BB, and DD), tetraploid (BBAA), and hexaploid (BBAADD) species. We used our quantitative PCR protocol (Baruch and Kashkush 2012; Kraitshtein et al. 2010; Yaakov et al. 2013) to assess the relative copy number of each TE in all species at relatively high resolution. The use of species that donated the AA, BB, and DD genomes to modern wheat, together with newly synthesized wheat allohexaploids, facilitated the detailed analysis of the evolutionary and revolutionary (genetic and epigenetic changes that occur in the first generations of the nascent allopolyploid species) dynamics of TEs. The possible activity of TEs during the evolutionary history of wheat, and their reactivation due to genomic stress resulting from the allopolyploidization process, are discussed.

Materials and methods

Plant material

In this study, we used 22 wheat accessions (see details in Supplemental Table 1): five accessions of Aegilops searsii; nine accessions of Ae. speltoides, Ae. sharonensis, Ae. longissima; two accessions of Triticum urartu, Aegilops tauschii, Triticum turgidum ssp. dicoccoides, Triticum turgidum ssp. durum, and Triticum aestivum. In addition, four generations (S1–S4) of newly formed wheat allohexaploid (genome BBAADD) and its parental lines, Triticum turgidum ssp. durum (accession TTR19; genome BBAA) and Aegilops tauschii (accession TQ27; genome DD), were used. Note that, to prevent confusion, we refer to all Sitopsis species as containing the B genome (equivalent to the S genome), because the genomic composition of the polyploid species is considered BBAA and BBAADD. The newly formed allohexaploid resembles natural hexaploid wheat (Ozkan et al. 2001). DNA was extracted from young leaves (4 weeks post germination) using the DNeasy plant kit (Qiagen).

Quantitative PCR

We used a qPCR assay (Baruch and Kashkush 2012; Kraitshtein et al. 2010; Yaakov et al. 2012; Yaakov et al. 2013) to relatively quantify 13 retrotransposons (including 11 LTR and 2 non-LTR families) and three DNA transposons (see details in Supplemental Table 2) from the Triticeae Repeat Sequence Database (http://wheat.pw.usda.gov/ITMI/Repeats/). We compared each reaction to amplification of the VRN1 gene, as this gene is found in one copy in each wheat genome ($\Delta C_t = C_{t,\mathrm{Target}} - C_{t,\mathrm{VRN1}}$) (Kraitshtein et al. 2010). The normalized quantities were then compared to the quantity in Ae. tauschii (accession TQ27), such that the relative quantity (RQ) in this sample was 1 [$\Delta\Delta C_t = \Delta C_{t,\mathrm{Sample}} - \Delta C_{t,\mathrm{TQ27}}$; see (Kraitshtein et al. 2010)]. The efficiency of the PCR reactions (E), as determined by a standard curve through serial dilutions of mixed templates (Supplemental Fig. 1a), as well as the relative ploidy level of each sample (P; diploids = 1, tetraploids = 2, and hexaploids = 3), were taken into account. The calculations can be summarized in the equation: $RQ = P \times (2 \times E)^{-\Delta\Delta C_t}$ (Livak and Schmittgen 2001). As validation, we ran the PCR products obtained with primers for Fatima (see Supplemental Table 2) in all 9 analyzed species and Triticum monococcum (accession TMB02, genome AmAm) on an agarose gel, and observed amplification in all cases (Supplemental Fig. 1b). In addition, we ran all the samples again, using two different primer sets for Angela-A (Copia retrotransposon) and Fatima (Gypsy retrotransposon), and observed results very similar to the ones presented below.
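
As an illustration of the calculation described above, the following minimal Python sketch computes RQ from threshold cycle (Ct) values. The function name, argument names, and all Ct values are hypothetical and are not taken from the study; the sketch assumes that E is expressed such that E = 1 corresponds to perfect doubling per cycle, as implied by the (2 × E) term.

```python
def relative_quantity(ct_target, ct_vrn1, ct_target_cal, ct_vrn1_cal,
                      efficiency=1.0, ploidy=1):
    """Return RQ = P * (2 * E) ** (-ddCt) for one sample.

    ct_target, ct_vrn1         : Ct of the TE and of VRN1 in the sample
    ct_target_cal, ct_vrn1_cal : the same two Ct values in the calibrator
                                 (Ae. tauschii TQ27 in the text)
    efficiency                 : E (assumed: 1.0 means perfect doubling)
    ploidy                     : P (diploid = 1, tetraploid = 2, hexaploid = 3)
    """
    delta_ct_sample = ct_target - ct_vrn1          # dCt of the sample
    delta_ct_cal = ct_target_cal - ct_vrn1_cal     # dCt of the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_cal
    return ploidy * (2.0 * efficiency) ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
rq_calibrator = relative_quantity(20.0, 25.0, 20.0, 25.0)           # calibrator vs itself
rq_hexaploid = relative_quantity(17.0, 25.0, 20.0, 25.0, ploidy=3)  # ddCt = -3
print(rq_calibrator, rq_hexaploid)
```

With these made-up inputs, the calibrator compared with itself gives an RQ of 1, and a hexaploid sample whose ΔΔCt is −3 gives 3 × 2^3 = 24.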

Results and discussion

Copy number variation of TEs in Triticum and Aegilops species

Complete analysis of the copy number variation (CNV) of TE content requires the availability of fully sequenced genomes. In the absence of a complete assembled sequence for wheat, various strategies have been used to estimate the copy number of TEs in wheat. For example, Sabot et al. (2005) and Charles et al. (2008) have analyzed a limited number of BAC sequences (representing ~0.3 % of the wheat genome) to estimate the copy number of TEs, while Asakura et al. (2008) have used Southern blot analysis to estimate the copy number of one Mutator-like TE family in various wheat species. The availability of whole-genome shotgun sequencing for hexaploid wheat using 454-pyrosequencing (Brenchley et al. 2012) allowed the analysis of the copy number of miniature TEs such as MITEs (Yaakov et al. 2013). The small size of MITEs in wheat (55–300 bp in length) facilitated the retrieval of intact MITE elements from the 454 databases (average sequence size of ~400 bp). However, retrieving long TEs, such as the ones used in this study from the 454 database is very challenging because of the huge redundancy of TE sequences and the inability to properly differentiate between TE sequences. In a very recent study, Senerchia et al. (2013) used low coverage 454-sequences of Ae. cylindrica and Ae. geniculata to analyze the abundance of several LTR retrotransposon families.

We have previously shown the efficiency of our qPCR assay by analyzing rice TEs and comparing the relative quantities to the absolute copy numbers from the fully sequenced rice species (Baruch and Kashkush 2012). In addition, we have performed a similar analysis for MITEs in wheat (Yaakov et al. 2013).

In this study, we have assessed the relative quantity of six Gypsy, four Copia and one Copia-like LTR retrotransposon, two non-LTR retrotransposons, and three DNA transposons. It is very important to note that the primer pairs that were used for each TE family were designed from conserved sequences specific to each family (based on the analysis of multiple sequence alignment of several members of each family retrieved from the NCBI database). In addition, for some cases, we used different primer combinations to validate the results (see “Materials and methods”).

Relative quantity of Gypsy LTR retrotransposons

The analysis of the relative quantities of six Gypsy families (Fig. 1a; Supplemental Fig. 2a–e) revealed that Fatima is very abundant in Ae. searsii and Ae. speltoides, while it is least abundant in Ae. tauschii and T. urartu. Fatima content in Ae. searsii is ~4,300 times its content in Ae. tauschii and ~93 times its content in T. urartu. Interestingly, the relative quantity of Fatima in tetraploid and hexaploid species was dramatically lower (~10 times less) than the one observed in Ae. searsii. The high abundance of Fatima in Ae. speltoides was also detected using FISH analysis (Salina et al. 2011). For each TE, we have calculated the coefficient of variation (CV) parameter, a statistical value that is based on the observed standard deviation in RQ of all species divided by the average RQ of all species, to reveal the level of variability of each TE, which in turn indicates the level of dynamics (activity) for each TE (Wright and Schoen 2000). In the Gypsy superfamily, Fatima together with Erika (specifically proliferated in the A genome, Fig. 1a) showed relatively higher CV values (1.56 and 1.66, respectively, Supplemental Table 3) in all species, indicating their relatively higher dynamics in wheat. In addition, Sabrina and BAGY2 showed relatively lower CV values in all species (0.41 and 0.72, respectively), indicating their lower dynamics in wheat compared to other Gypsy TE families. Finally, the Latidu family showed specific proliferation in Ae. sharonensis: ~73 times its content in Ae. tauschii and Ae. speltoides, twice its content in Ae. longissima, and three times its content in Ae. searsii (Supplemental Fig. 2c).
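
The CV described above (standard deviation of RQ across species divided by the mean RQ) can be sketched in a few lines of Python. The RQ values below are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(rq_values):
    """Return SD / mean of the RQ values of one TE family across species.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return stdev(rq_values) / mean(rq_values)


# Hypothetical RQ values of one TE family in three species.
example_rq = [1.0, 2.0, 3.0]
cv = coefficient_of_variation(example_rq)
print(round(cv, 2))
```

For these made-up values the standard deviation is 1 and the mean is 2, so the CV is 0.5; a family whose RQ differs strongly between species yields a larger CV.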

Fig. 1

Relative quantification (compared to Ae. tauschii, set as 1, see materials and methods) of four transposable elements in various wheat species. The elements presented are examples of a genome-specific proliferation (Erika), b Ae. speltoides as donator of the B genome (Paula) and nonadditive values in c tetraploid T. turgidum ssp. dicoccoides (Balduin), and d hexaploid T. aestivum (Charon) wheat. All relative quantities are normalized to Ae. tauschii. Standard deviation was calculated based on three technical replicates

Relative quantity of Copia LTR retrotransposons

The analysis of relative quantity of five Copia families (Supplemental Fig. 2f–j) revealed that Angela, Barbara, and WIS-A showed similar levels of variability among the wheat species (CVs of 0.41, 0.35, and 0.36, respectively, Supplemental Table 3). Interestingly, all three elements showed relatively higher content in one of the seven Ae. speltoides accessions, TS01 (Supplemental Fig. 2f–h). Among the Copia superfamily, BARE1C showed significantly higher variability (CV of 1.1, Supplemental Table 3) compared to the other three elements. Finally, the Veju element [Copia-like element, (Kraitshtein et al. 2010)], showed low variability (CV of 0.45) among species. In addition, Ae. tauschii (genome DD) showed significantly lower quantities compared to the A and B genomes, similar to what was estimated by Kraitshtein et al. (2010).

Relative quantity of non-LTR retrotransposons

The analysis of relative quantity of two non-LTR retrotransposon families (Fig. 1b; Supplemental Fig. 2l) revealed that both elements showed relatively high variability among species (CV of 1.32 for Ramona and 1.23 for Paula, Supplemental Table 3). The two elements also show greater quantities in the tetraploid T. turgidum ssp. dicoccoides, but Paula has much less relative quantity in the hexaploid T. aestivum compared with Ramona.

Relative quantity of DNA transposons

The analysis of relative quantity of three DNA element families (Fig. 1c, d; Supplemental Fig. 2k) revealed that Balduin had the lowest CV values, while Rong had the highest CV values (Supplemental Table 3), except for the CV value for Ae. speltoides, which was elevated due to a specific proliferation of Charon in one accession (TS41) from Israel (Fig. 1d; Supplemental Table 1). In addition, Rong had high relative quantities in Ae. sharonensis and in a particular accession of Ae. speltoides (542274, Supplemental Table 1) from Adiyaman in Turkey; Balduin had high relative quantities in T. urartu and T. aestivum; and Charon had high relative quantities in T. urartu, Ae. longissima, and T. turgidum ssp. dicoccoides.

Variation across all species

The CV of all the species ranged from 0.354 (Barbara, a Copia family) to 1.661 (Erika, a Gypsy family), indicating that the former is the least active and the latter is the most active element during the evolutionary history of wheat (Supplemental Table 3). Furthermore, the Gypsy superfamily, except for Sabrina, seemed to have higher CV values than the Copia superfamily, except for BARE1C. Our data indicate that TE dynamics are specific to each TE. For example, while the CV values of the Erika and Fatima families from the Gypsy superfamily indicate that they are two of the most active elements, the CV value of the Sabrina family from the same superfamily showed that it is one of the least active elements. Thus, in contrast to what was reported by Sabot et al. (2005), our data indicate that one cannot draw a general conclusion regarding the comparison of TE activity among different superfamilies.

Genome-specific proliferation

We examined the genomic uniqueness of each element to certain genomes and observed that Erika (Gypsy retrotransposon; Fig. 1a) had proliferated in the A genome, as it appeared to a much greater extent in T. urartu (genome AuAu), T. turgidum ssp. dicoccoides (genome BBAA), and T. aestivum (genome BBAADD). Fatima (Gypsy retrotransposon), Rong (PIF/Harbinger DNA transposon), and Paula (non-LTR retrotransposon) had proliferated in the B genome, as they appeared in very low levels in Ae. tauschii (genome DD) and T. urartu (genome AuAu).

Relative quantity among Ae. searsii and Ae. speltoides accessions: the candidate donors of the B genome to wheat

We studied the relative quantities of TEs in five accessions of Ae. searsii and nine accessions of Ae. speltoides, as these are the two candidate species for the contribution of the B genome to form tetraploid wheat (T. turgidum ssp. dicoccoides). The results showed elements with proliferation specific to each species (t test p < 0.05): Latidu (Gypsy retrotransposon; Supplemental Fig. 2c), Sabrina (Gypsy retrotransposon; Supplemental Fig. 2a), BAGY2 (Gypsy retrotransposon; Supplemental Fig. 2d), and Charon (Mutator DNA transposon) were specific to Ae. searsii, whereas Erika (albeit at low levels), Angela-A (Copia retrotransposon; Supplemental Fig. 2f), and Ramona (non-LTR retrotransposon; Supplemental Fig. 2l) were specific to Ae. speltoides. These data, together with the finding that all the tested elements are found in the polyploid species, indicate that we cannot favor either of the two species as the donor of the B genome. As previously reported, the true donor of the B genome might be a common ancestor of the two species (Salse et al. 2008; Yaakov and Kashkush 2012).

Evolutionary-scale changes in TE quantity in polyploids

In order to assess TE dynamics following polyploidization events, we examined the changes from the expected RQ for the tetraploid and hexaploid species (T. turgidum ssp. dicoccoides and T. aestivum, respectively), which may be calculated by combining the RQ for T. urartu (genome AA) and any accession of Ae. speltoides or Ae. searsii (genome BB) for the tetraploid (genome BBAA), and the RQ for T. turgidum ssp. dicoccoides (genome BBAA) and Ae. tauschii (genome DD) for the hexaploid (genome BBAADD). A deviation from the expected RQs in the tetraploid was counted only if no combination of accessions could add up to the observed RQs in the tetraploid. We observed a deviation from the expected RQs for the tetraploid T. turgidum ssp. dicoccoides, which showed higher than expected values for Erika (Fig. 1a; p = 0.0002) and lower than expected values for Balduin (CACTA DNA transposon; Fig. 1c; p = 0.0001). When comparing the additive values of the natural tetraploid T. turgidum ssp. dicoccoides (genome BBAA) and Ae. tauschii (genome DD) with the values of the natural hexaploid T. aestivum (genome BBAADD), the hexaploid T. aestivum showed higher than expected values for Rong (p = 0.0001) and lower than expected values for Geneva (Gypsy retrotransposon; p = 0.0001), BARE1C (Copia retrotransposon; Supplemental Fig. 2; p = 0.0001), Charon (Fig. 1d; p = 0.0001), and Paula (p = 0.0001). Fatima and Latidu also had higher than expected values in the hexaploid, but their p values were just above the significance threshold (p = 0.061 and 0.0563, respectively). Importantly, the only Triticum-specific element (Erika) showed a deviation from the expected values in the allotetraploid, which was also observed for Triticum-specific Stowaway-like MITEs (Minos and Fortuna) (Yaakov et al. 2013). Thus, the genome-specific elements may play a role in the differentiation of sub-genomes following polyploidization, via transcriptional, transpositional, or recombinational events evoked by epigenetic changes (Mirouze et al. 2012).
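
The additivity criterion described above can be sketched as follows. This is a simplified Python illustration with hypothetical RQ values and hypothetical function names; it checks only whether the observed polyploid RQ falls outside the range spanned by all parental combinations and does not reproduce the statistical tests behind the reported p values.

```python
from itertools import product


def expected_additive_range(rq_a_accessions, rq_b_accessions):
    """Return (min, max) of RQ_A + RQ_B over all pairs of accessions."""
    sums = [a + b for a, b in product(rq_a_accessions, rq_b_accessions)]
    return min(sums), max(sums)


def classify_deviation(observed_rq, rq_a_accessions, rq_b_accessions):
    """Report a deviation only if no parental combination can explain
    the observed RQ (simplification: replicate error is ignored)."""
    low, high = expected_additive_range(rq_a_accessions, rq_b_accessions)
    if observed_rq > high:
        return "higher than expected"
    if observed_rq < low:
        return "lower than expected"
    return "additive"


# Hypothetical RQ values: two A-genome and three B-genome accessions.
rq_a = [2.0, 2.5]
rq_b = [1.0, 1.5, 3.0]
print(expected_additive_range(rq_a, rq_b))   # sums span 3.0 to 5.5
print(classify_deviation(9.0, rq_a, rq_b))
print(classify_deviation(4.0, rq_a, rq_b))
```

Taking the full range across accessions is the conservative choice: a polyploid value is flagged only when it lies beyond every possible parental sum.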

TE dynamics in the first generations of newly formed allohexaploid

Because the examination of relative TE quantity in natural polyploids results in an assessment of TE activity within long time scales [~10,000 years for the hexaploid and ~0.5 million years for the tetraploid; see (Feldman and Levy 2005)], we decided to investigate the immediate effects of polyploidization on the relative quantity of each TE by performing the same qPCR analysis on a system of newly formed allohexaploid wheat (see “Materials and methods”).

The relative copy number values in the newly formed polyploid offspring (S1–S4 generations) were then compared to the expected additive parental copy number (the sum of the copy numbers of both parents), as the polyploid harbors both parental genomes, and each generation of the polyploid was compared to the successive generation. Thus, for each transposon, a pattern of increase or decrease in relative copy number could be observed. For 6 of the 16 elements [Fatima (Fig. 2a; p = 0.0041), Angela-A (Supplemental Fig. 3a; p = 0.0001), BARE1C (Supplemental Fig. 3b; p = 0.0142), Paula (Supplemental Fig. 3c; p = 0.0001), Rong (Supplemental Fig. 3d; p = 0.0004), and WIS-A (Supplemental Fig. 3e; p = 0.0001)], a significant decrease in relative copy number (between the additive parental values and the first generation of the newly formed allohexaploid, S1) was seen. However, the subsequent generations showed different patterns of change between these elements, consisting mostly of decreases in relative copy number in different generations. Another large group (9 of 16 elements) showed a significant increase in relative copy number, followed mostly by a decrease at different times in later generations. These elements included: Erika (Fig. 2b; p = 0.0001), Balduin (Supplemental Fig. 3f; p = 0.0001), Ramona (Supplemental Fig. 3g; p = 0.0109), Barbara (Supplemental Fig. 3h; p = 0.0001), Charon (Supplemental Fig. 3i; p = 0.0005), Veju (Supplemental Fig. 3j; p = 0.0001), Geneva (Supplemental Fig. 3k; p = 0.0001), BAGY2 (Supplemental Fig. 3l; p = 0.0021), and Latidu (Fig. 2c; p = 0.0002). Only one element (Sabrina; Fig. 2d) showed no change in its relative quantities in S1, but did show an 11.4 % increase between generations S2 and S3. These results indicate that the genomic changes that occur in transposable element sequences following polyploidization are unique to each family of transposable element. This is in agreement with our observation of the level of TE dynamics that was calculated based on their variability among the different species (see above), where the dynamics might vary dramatically from one TE family to another in the same superfamily.
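
The generation-by-generation comparison described above can be sketched as follows, assuming the expected value is the plain sum of the two parental RQs. All names and RQ values in this Python example are hypothetical, and the sketch reports percent changes only; it does not test their significance.

```python
def percent_change(previous, current):
    """Percent change from one value to the next."""
    return (current - previous) / previous * 100.0


def generation_changes(rq_tetraploid_parent, rq_diploid_parent, rq_generations):
    """Percent changes for: additive expectation -> S1, S1 -> S2, and so on.

    rq_generations: RQ values for S1, S2, ... in order.
    """
    expected_additive = rq_tetraploid_parent + rq_diploid_parent
    series = [expected_additive] + list(rq_generations)
    return [percent_change(p, c) for p, c in zip(series, series[1:])]


# Hypothetical RQ values: parents 3.0 (BBAA) and 1.0 (DD), then S1-S4.
changes = generation_changes(3.0, 1.0, [2.0, 2.0, 3.0, 1.5])
print(changes)   # [-50.0, 0.0, 50.0, -50.0]
```

Each entry corresponds to one comparison in the text: the first to the additive parental value versus S1, and the remaining ones to successive generations.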

Fig. 2

Relative quantification (RQ) of two parental species (T. turgidum ssp. durum and Ae. tauschii) and their newly formed polyploid offspring (S1–S4) for a Fatima, b Erika, c Latidu, and d Sabrina. The numbers at the top of each bar indicate the relative quantity of the element compared to Ae. tauschii. The error bars represent standard deviations from three technical replicates. The genome composition of each species is indicated at the bottom

Another interesting observation is that a specific group of three Gypsy superfamily elements, Erika (Fig. 2b), Geneva (Supplemental Fig. 3k), and BAGY2 (Supplemental Fig. 3l), showed a particular pattern: an increase in relative copy number between the expected additive parental values and the S1 generation, followed by a decrease between S1 and S2, an increase between S2 and S3, and a decrease between S3 and S4. This phenomenon might indicate the unique dynamics of the Gypsy superfamily in wheat.

Finally, we have assessed the timing of TE dynamics and found that ~94 % of the elements showed a significant change in their quantity in S1 compared to the additive value of the parental lines, while ~75 % of the elements showed a significant change between S1 and S2, ~68 % of the elements showed a significant change between S2 and S3, and 65 % of the elements showed a significant change between S3 and S4.

Interestingly, we found that the Charon, Rong, Paula, BARE1C, Fatima, Latidu, and Geneva families showed significant changes both in the natural hexaploid species (compared to its diploid and tetraploid parental species) and in the newly formed allohexaploid (compared to the TTR19 and TQ27 parental lines). These data suggest that these early changes in element copy numbers impact the subsequent evolution of each element in the genome.

In conclusion, in this study, we have performed a genome-wide analysis of the relative quantity of 16 large TE families representing different superfamilies from both TE classes in a large number of Triticum and Aegilops species, including accessions from the donors of the AA, BB, and DD genomes of polyploid wheat. In addition, the use of natural wheat allopolyploids and newly formed allopolyploids allowed us to assess the TE dynamics both at revolutionary and evolutionary scales. Furthermore, we have analyzed several accessions of the potential donors of the B genome (Ae. searsii and Ae. speltoides), which has not been performed before, allowing us to track the possible evolutionary trajectory of these wheat species.

Based on our observations, we have reached the following conclusions: (1) there exists copy number variation of TEs among Triticum and Aegilops species, which might be the result of different activity levels; (2) long elements were active in specific genomes during the evolutionary history of wheat, contributing to the diversification of diploid wheat species; (3) Ae. speltoides by itself cannot be the only contributor of the B genome to polyploid wheat; (4) elements which proliferate in specific genomes are, apparently, reactivated (or undergo rearrangements) following polyploidization and might play a role in the genetic differentiation of polyploid homeologous chromosomes; (5) the changes that occur following polyploidization events are unique to each TE family; and (6) early changes in TE copy numbers impact the subsequent genomic evolution of that element.