Introduction

The plant genus Agave L. (sensu stricto) is endemic to the American continent, with 75% of its species distributed in Mexico. Cytologically, the genus includes species with ploidy levels ranging from diploid (2n = 2x = 60) to octoploid (2n = 8x = 240); it has a basic chromosome number of n = 30 and a bimodal haploid karyotype (5 long acrocentric chromosomes and 25 small metacentric or submetacentric chromosomes) (Palomino et al. 2005, 2015; Moreno-Salazar et al. 2007; Robert et al. 2008) and is of possible allopolyploid origin (McKain et al. 2012). Currently, few studies are available on the evolutionary relationships, speciation processes, genetic variation, and diversity of species of Agave. Knowledge of the evolutionary relationships in the genus is important because of the biotechnological interest in obtaining different Agave products. In addition, the physiological and morphological characteristics of its species, which may confer resistance to biotic and abiotic stress factors, position Agave as a genus of interest for facing global climate change (Tamayo-Ordóñez et al. 2016a).

The analysis of ribosomal DNA (rDNA) genes has been used to estimate the genetic variation and diversity of several species of Agave and as a genetic marker of ploidy level. It has been observed that 5S and 45S rDNA loci with additive patterns are located in long acrocentric and short metacentric or submetacentric chromosomes, respectively, and that the number of rDNA loci increases proportionally to the ploidy level (Robert et al. 2008). Gómez-Rodríguez et al. (2013) described that in some diploid species of Agave (A. tequilana Weber ‘Azul’, A. cupreata Trelease et Berger and A. angustifolia Haworth), the 5S rDNA regions may have an irregular chromosomal distribution pattern, possibly due to divergence and a reduced homogenization rate in these regions. In the plant family Asparagaceae, Garcia et al. (2010) stated that the organization and grouping of rDNA genes depend on the taxon; therefore, they are possibly the product of evolutionary mechanisms and of genetic events acting on rDNA regions within each taxon (such as unequal recombination and transposition). Khaliq et al. (2012) reported that the main genomic components of A. tequilana are the retrotransposons Ty1-Copia (approximately 50–80% of the total), and the high activity of these retroelements may play a relevant role in the diversity, organization, and evolution of redundant, highly represented genes such as rDNA genes. The genus Agave is of recent origin (6–8 and 1.5–3 Mya) and has a high index of species diversity (0.32–0.56 species per million years) relative to angiosperms in general (0.07–0.089 species per million years), and according to the model for evolution of rDNA regions proposed by Kovarik et al. (2008), its rDNA regions are undergoing an evolutionary stage of fixation of certain rDNA copies.
This evolutionary process suggests that rDNA regions in Agave are subjected to genetic events, including recombination, amplification, duplication, transposition, and gene loss, that could have resulted in the high variability and diversity of these genomic regions. According to Good-Avila et al. (2006), the process of speciation in Agave has been coincident with increasing aridity in central Mexico, which suggests that high retrotransposition activity in response to water deficit may have had an important role in speciation. Additionally, Tamayo-Ordóñez et al. (2016b) found that in some species of Agave, there is differential regulation of genes associated with biotic and abiotic stress depending on their habitat and proposed that environmental stress factors could have contributed to genetic diversity expressed as speciation and species’ adaptation.

Tamayo-Ordóñez et al. (2015) previously studied the variability, functionality, and regulation of the 18S and 5S rDNA regions in some species of Agave having different ploidy levels. The authors found that the diploid species A. tequilana (2n = 2x = 60) has a lower number of copies and more non-redundant (possibly functional) haplotypes than the species with higher ploidy levels A. fourcroydes Lemaire (2n = 3x = 90 and 2n = 5x = 150) and A. angustifolia (2n = 2x = 60 and 2n = 6x = 180). The same study found differential expression of 5S rDNA gene allelic groups in A. tequilana, suggesting a rapid adaptive strategy resulting from strong selection pressures acting on this commercially important species, which is therefore experiencing different evolutionary pressures than A. fourcroydes and A. angustifolia (Tamayo-Ordóñez et al. 2015). The presence of non-redundant haplotypes in A. tequilana implies weakening or loss of genetic homogenization in the ribosomal unit that is associated with increased genetic diversity. However, despite our previous work (Tamayo-Ordóñez et al. 2015) demonstrating the functionality of 5S rDNA haplotypes in each studied species by in silico analyses, the effect of genetic and epigenetic factors involved in loss of functionality, diversity, and differential expression of haplotypes was not fully analyzed. In this regard, the organization and transcription of 18S, 5.8S, 26S, and 5S rDNA genes in plants have been related to the diversity of these genomic regions, but the genetic regulation of rDNA regions depends on the activation and function of genetic events that are important for expression and involve promoter sequences, transcription factors, and enzymes relevant for the formation of the transcription complex (Lemon and Tjian 2000; Paule and White 2000). 
Additionally, epigenetic changes, such as siRNA, remodeling of chromatin, histone modification, TFs such as TFIIIA, and, mainly, DNA methylation, can influence transcriptional regulation of some copies of rDNA genes, in particular the 5S rDNA genes (Lawrence et al. 2004; Preuss and Pikaard 2007; McStay and Grummt 2008; Martinez et al. 2013). To accommodate higher demands for protein synthesis in each cell, there is a stoichiometric relation in the number of rRNA copies. DNA methylation events have been shown to contribute to the control of rRNA transcription that slows down the homogenization process and influences the evolution of these sequences, therefore increasing the diversity of ribosomal regions (Lawrence et al. 2004; Mayer et al. 2006; Eickbush and Eickbush 2007; Song and Chen 2015).

Our objective in this study was to generate knowledge about three important aspects of rDNA regions in Agave: (i) determining the organization of rDNA regions in the genus; (ii) exploring epigenetic (DNA methylation) regulation mechanisms in the 5S rDNA haplotypes and allelic groups known from polyploid species of Agave (A. tequilana, A. angustifolia, and A. fourcroydes); and (iii) assessing the contribution of retroelements to the diversity of rDNA regions in the genus. The knowledge about these three aspects will allow us to better understand the behavior of rDNA regions in terms of their evolution and to understand the evolutionary processes currently experienced by rDNA regions in the genus Agave, the impacts of environmental conditions on this process, the differentiation processes caused by artificial selection in the genus, and the variability and diversity of these regions of the genome.

Methods

Biological Material

Five accessions of three species of Agave were studied, for which the ploidy level had been previously established by Robert et al. (2008): A. tequilana Weber ‘Azul’ (2n = 2x = 60), A. fourcroydes Lem. ‘Kitam ki’ (2n = 3x = 90), A. fourcroydes ‘Sac ki’ (2n = 5x = 150), A. angustifolia Haw. ‘Marginata’ (2n = 2x = 60), and A. angustifolia ‘Chelem ki’ (2n = 6x = 180). A. tequilana ‘Azul’ has been propagated in vitro and adapted to greenhouse conditions for commercial purposes. The studied accessions are maintained in the Roger Orellana Regional Botanic Garden of the Yucatan Center for Scientific Research (Centro de Investigación Científica de Yucatán, A.C.; CICY) in Merida, Yucatan, Mexico.

Determination of the Organization of 5S rDNA Regions in A. tequilana ‘Azul’

Agave L. has a medium-sized genome (4312 Mbp; Banerjee and Sharma 1987) compared with other plant species, such as Citrus clementina (367 Mbp) (Terol et al. 2008) and Pinus pinaster (23,000 Mbp) (Bautista et al. 2008). Due to the size and complexity of its genome, its genomic structure and organization remain unknown. For this reason, we followed two strategies to determine the organization of the 5S rDNA regions in Agave L. The first strategy consisted of using the information obtained by Tamayo-Ordóñez et al. (2012). These authors selected clones positive for rDNA regions by dot-blot hybridization and partially sequenced these clones by next generation sequencing (NGS) technology at Macrogen Korea. The information obtained from these sequences is reported in this manuscript and denoted BIBAC. As the second strategy, to obtain greater sequencing coverage of rDNA regions, we carried out a second selection of clones positive for rDNA regions from the BIBAC library of A. tequilana ‘Azul.’

The presence of 5S rDNA regions in the selected clones was confirmed by conventional PCR of plasmid DNA extracted by the alkaline lysis method of Birnboim and Doly (1979) and purified by affinity chromatography using PureLink kits (Invitrogen Corp., Carlsbad, CA, USA).

Amplification of 5S rDNA was conducted using the primer pair forward 5′-CGATCATACCAGCACTAAAGCACC-3′ and reverse 5′-ATGCAACACGAGGACTTCCCAG-3′. The primers used for amplifying the non-transcribed spacers (NTS) were forward 5′-GATCCCATCAGAACTCC-3′ and reverse 5′-GGTGCTTTAGTGCTGGTAT-3′ (Additional file 1). Afterwards, clones to be sequenced were selected following two approaches: (a) clones presenting a characteristic tandem repeat ApoI restriction enzyme pattern according to findings of Tamayo-Ordóñez et al. (2012) and (b) clones having a higher absolute number of copies as determined by qPCR relative to other analyzed clones (Tamayo-Ordóñez et al. 2016b). Selected clones were digested with NotI in order to produce inserts of approximately 150 kb, which were then purified by electroelution following the procedures described by Tamayo-Ordóñez et al. (2012). The resulting inserts from two clones were sequenced in the Roche GS-FLX platform, and in parallel, the inserts from ten clones with the same pattern were sequenced by NGS at Macrogen Korea.

Sequences were cleaned and aligned in the software BioEdit Sequence Alignment Editor 5.0.6 (Hall 1999) and MEGA 5.1 (Tamura et al. 2011). Contig building and identification of the coding region (CR) of the genes of interest were performed using the software GS De Novo Assembler version 2.9 (http://www.454.com/products/analysis-software/), UGENE (Okonechnikov et al. 2012) and Geneious R8 (http://www.geneious.com/). Finally, the identity and similarity of the CR present in contigs were determined online using BlastN (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) and BlastX (http://blast.ncbi.nlm.nih.gov/blast/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome).

Mapping of the 5S CR in the contigs was performed by selecting sequences from related plants based on identity level (> 90%). Sequences from the class Liliopsida were used for the 5S-1 rDNA contig (Supplementary Fig. 1a). For contig 5S-2 rDNA, sequences from the class Magnoliopsida were used together with sequences from Zea mays (DQ351339) and Lilium tsingtauense (KM117262), belonging to the class Liliopsida (Supplementary Fig. 1b). For contig 5S-3 rDNA, sequences of classes Liliopsida and Rosopsida were used (Supplementary Fig. 1c). Finally, for contig 5S-4 rDNA, the sequences used were from Gossypium schwendimanii (U32037.1) in the class Magnoliopsida, Cephalotaxus sinensis (JF829969.1) in the class Pinopsida, and Lilium tsingtauense (KM117262.1), Agave tequilana (GQ983554.1), Avena occidentalis (EF071653.1), and Zea mays (DQ351339.1) in the class Liliopsida (Supplementary Fig. 1d).

Isolation of NTS Regions in Agave Accessions

Genomic DNA was isolated from young leaves by the silica method (Echevarría-Machado et al. 2005). Conditions for amplification of the 5S rDNA NTS were as follows: initial denaturation for 3 min at 94 °C, followed by 35 cycles of 30 s each at 94 °C (denaturation), 60 °C (annealing), and 72 °C (extension) and a final extension step at 72 °C for 10 min. The primers designed for amplifying the NTS region (450 bp) were forward 5′-GATCCCATCAGAACTCC-3′ and reverse 5′-GGTGCTTTAGTGCTGGTAT-3′. The PCR product was cloned into the pGEM-T Easy vector according to the manufacturer’s instructions (Promega, Madison, WI, USA). The selected NTS clones were sequenced by Macrogen (Seoul, Korea). To minimize sequencing errors, each clone was sequenced using forward and reverse M13 universal primers. We verified that the forward and reverse sequences of each fragment were identical. Only sequences showing no nucleotide errors were included in the analyses. Sequence identity was confirmed by comparison with sequences in the GenBank database (BLASTn). Nucleotide sequences obtained in this study were deposited in the GenBank database.
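The forward/reverse concordance check described above can be illustrated with a minimal sketch. It assumes that both reads have already been trimmed to the same insert and that the reverse read is reported 5′ to 3′ on the opposite strand; the function names are hypothetical and do not correspond to any software named in this study.

```python
# Hypothetical helpers; not the software used in the study.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_agree(forward_read, reverse_read):
    """True when the reverse read, once reverse-complemented, equals the forward read."""
    return forward_read.upper() == reverse_complement(reverse_read).upper()


# Toy example using the NTS forward primer sequence as a stand-in for a read.
fwd = "GATCCCATCAGAACTCC"
rev = reverse_complement(fwd)
print(reads_agree(fwd, rev))
```

In practice the two reads would first be aligned, and any position where they disagree would lead to the clone being discarded, as stated in the text.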

Assessment of Total and Specific Methylation Ratios of 5S rDNA and NTS Regions

Two approaches were followed for determining the total methylation ratios of the studied accessions of Agave: determining the total percentage of methylation in the complete genome of each polyploid accession and determining site-specific methylation in the haplotypes and NTS groups of 5S rDNA.

Genomic DNA was extracted according to the procedures of Echevarría-Machado et al. (2005). To determine the total DNA methylation, nucleic acids were digested and nucleosides separated following the methods of De-la-Peña et al. (2012). For each sample, 5 µg of DNA was hydrolyzed and mixed with 5 µL of acetic acid, 10× DNA digestion buffer (200 mM glycine, 50 mM magnesium chloride, 5 mM zinc acetate, and 2 mM calcium chloride adjusted to pH 5.3 with sodium hydroxide), 2 µL of 10 U/mL DNase I (Sigma-D2821), and 1 µL of 1.25 U/mL nuclease P1 (Sigma-N8630). Mixtures were incubated overnight at 37 °C. Digested samples were then mixed with 5 µL of 100 mM NaOH and 2 µL of 1 U/mL alkaline phosphatase (Sigma-P4879), incubated for 3.5 h at 37 °C, and mixed with mobile phase D (50 mM ammonium phosphate dibasic and 15 mM ammonium acetate adjusted to pH 4.1 with phosphoric acid). Finally, samples were centrifuged at 18,000 × g and analyzed by high-performance liquid chromatography (HPLC; Agilent 1200 Series). The total methylation percentage of the samples was determined from HPLC results according to the protocol of Nic-Can et al. (2013), which consisted of obtaining the reverse phase chromatograms, using the areas of the peaks for estimating the concentration of 2′-deoxycytidine (dC) and 5-methyl-2′-deoxycytidine (5mdC) in samples, and finally calculating the percent total methylation ratio (TM%) as follows: TM% = 100 × C5mdC/(C5mdC + CdC), where C is the concentration of the corresponding nucleoside.
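As a worked illustration of the TM% formula, the following minimal sketch computes the ratio from two nucleoside concentrations. The function name and the input values are hypothetical; in the study the concentrations are estimated from HPLC peak areas.

```python
def total_methylation_percent(c_5mdc, c_dc):
    """TM% = 100 * C_5mdC / (C_5mdC + C_dC); both concentrations in the same unit."""
    total = c_5mdc + c_dc
    if total <= 0:
        raise ValueError("the summed concentration must be positive")
    return 100.0 * c_5mdc / total


# Hypothetical concentrations (arbitrary but identical units).
print(total_methylation_percent(4.0, 6.0))  # 40.0
```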

Specific 5S rDNA methylation of the promoter region and of haplotypes was determined by sodium bisulfite conversion of non-methylated cytosine according to the following procedures using QIAGEN EpiTect Bisulfite Kits (Hetzl et al. 2007; Foerster et al. 2010). Initially, genomic DNA was deaminated with sodium bisulfite in a reaction mix prepared with 1 ng of genomic DNA, 85 µL of Bisulfite mix, and 35 µL of DNA Protect Buffer (as provided in kits) to a final volume of 140 µL. Deaminated samples were then denatured for 7 min at 95 °C, incubated for 25 min at 60 °C, denatured for 7 min at 95 °C, and incubated for 175 min at 60 °C. Afterwards, genomic DNA treated with bisulfite was purified in the provided spin columns following instructions in the kit. The final product of this chemical conversion is uracil replacing non-methylated cytosine.
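The outcome of this conversion can be mimicked in silico. The sketch below is only an illustration under simplifying assumptions: it treats a single strand, assumes complete conversion, and reports the sequence as it would be read after PCR (uracil read as T). The function name and the toy template are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T (via uracil); methylated C stays C.
    `methylated_positions` holds 0-based indices of methylated cytosines.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Toy template: only the cytosine at index 1 is methylated.
print(bisulfite_convert("ACGTCCGA", {1}))  # ACGTTTGA
```

Comparing such a converted read with the untreated sequence is what allows methylated and non-methylated cytosines to be told apart.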

To amplify regions of rDNA after the sodium bisulfite treatment, we followed the specifications described by Henderson et al. (2010). Forward primers were aligned to the complementary strand; the Gs were replaced with As, and the GC regions included the degenerate base R (G or A). For reverse primers, the Cs were replaced by Ts, except for CG regions, where the degenerate base Y (C or T) was included. The sequences of the primers were as follows: for the 5S rDNA gene, forward 5′-CAATCATACCARCACTAAARCACC-3′ and reverse 5′-ATGTAACAYGAGGACTTCTCAG-3′. The primers used for amplifying the NTS were forward 5′-AATCCCATCAAAACTCC-3′ and reverse 5′-GGTGTTTTAGTGTTGGTAT-3′. The next step was PCR amplification of the 5S rDNA haplotypes and NTS regions in each accession. Reaction mixtures and amplification conditions for the 5S rDNA gene were as described by Tamayo-Ordóñez et al. (2015), and conditions for amplification of the 5S rDNA NTS were the same as those described in the previous section, with the exception of an annealing time of 45 s.

Additionally, the 5S rDNA haplotypes and NTS regions were amplified from genomic DNA that was not deaminated with sodium bisulfite. These PCR products were used as a control for comparison between methylated and non-methylated DNA.

The amplification products obtained from the non-deaminated genomic DNA were separated by electrophoresis in 1.3% agarose gels. Afterwards, the desired bands were gel purified using the Wizard® SV Gel and PCR Clean-Up System (Promega, USA) and cloned into the 3 kb vector pGEM-T (Promega, USA) in competent HB101 cells transformed by thermal shock. This strain of E. coli retains methylation of 5-methylcytosine during replication. Transformants were selected by resistance to ampicillin (50 µg/mL), and plasmids were extracted by the alkaline lysis method (Birnboim and Doly 1979). The presence of inserts in transformants was verified by digestion with EcoRI and amplification of the insert with the universal M13 (-24/-40) primers located at the ends of the multiple cloning site of the vector.

The next step was selecting clones, some of which were sequenced by NGS (ABI 3730 XL) at Macrogen Korea, while others were sequenced in an ABI PRISM model 377 automatic sequencer in the Center for Biotechnology and Genomics of the National Polytechnic Institute (IPN) at Reynosa, Tamaulipas, Mexico. For the latter approach, fragments of interest were preamplified from 30 ng of PCR product in the following reaction mix: 4 µL of 5× BigDye v3.1 buffer, 10 µL of Milli-Q sterile water, 10 pmol M13 (-40/-24) primer, and 4 µL of BigDye v3.1 ready mix in a final volume of 20 µL. The amplification conditions were as follows: initial denaturation for 2 min at 94 °C, followed by 30 cycles of 30 s each at 94 °C (denaturation), 58 °C (annealing), and 72 °C (extension), and a final extension step at 72 °C for 7 min. The PCR products were purified in 10 µL of X-terminator and 45 µL of SAM, incubated for 30 min at 25 °C with agitation, and centrifuged at 12,000 × g. A total of 20 µL of the supernatant was sequenced in the ABI PRISM model 377 sequencer. Finally, methylation at symmetric and asymmetric sites was assessed in the software Cytosine Methylation Analysis Tool for Everyone (cyMATE) (Foerster et al. 2010).

Haplotype, RNA Fold Prediction, and Functionality Analysis

Minimum haplotypes of 5S rDNA and NTS regions were constructed using the median-joining method implemented in Network v.4.6 software (http://www.fluxus-engineering.com; Bandelt et al. 1999), assuming an epsilon of 0 and a transversion/transition ratio of 1:2. The 5S rDNA regions obtained from PCR amplification of methylated and non-methylated DNA were subjected to an analysis of secondary structure modeling and functionality. RNA fold prediction was carried out using the RNA structure web server (http://rna.urmc.rochester.edu/RNAstructureWeb/Servers/Predict1/Predict1.html) (Reuter and Mathews 2010) at a folding temperature of 37 °C. The RNAz program of Vienna RNAweb servers (http://rna.tbi.univie.ac.at/) (Gruber et al. 2007) was used to test the probability of the functionality of predicted secondary structures. Sequences were analyzed in the reverse orientation as a control.

Results

Determination of the Organization of 5S rDNA Regions in A. tequilana ‘Azul’

Massive sequencing of inserts in the clones selected from a BIBAC library resulted in 65,099 kb with a total of 153,934 reads of an average size of 423 bp. Afterwards, 28,768 contigs were generated from 153,922 reads spanning 64,700 kb for assembly in the GS De Novo Assembler version 2.9 software. In that analysis, 19% (28,768 reads) assembled correctly, 18% (27,198 reads) assembled partially (only a portion of each read assembled correctly), 62% (96,014 reads) were singletons (reads that did not overlap any other read), and only 1% (1394 reads) were outliers (i.e., the software detected problems in the reads). In general, 2618 contigs (those longer than ~500 bp) had an average length of 866 bp, of which 15% (396 contigs) contained regions reported in GenBank. Also identified were chloroplast (23 contigs), mitochondrial (96 contigs), and, most abundantly, nuclear (263 contigs) DNA regions. Analysis of rDNA regions found seven contigs with the CR of the 5S, 18S, and 26S rDNA genes of different lengths ranging from 590 to 2060 bp (not shown in the results). Four contigs of A. tequilana had 5S rDNA regions that were designated 5S-1, 5S-2, 5S-3, and 5S-4 (Fig. 1), as described below.

Fig. 1

Identification of the 5S rDNA region of A. tequilana. Schematic representation of the 5S rDNA tandems identified in four contigs of A. tequilana. a In contig 5S-1 rDNA (MH547303) we delimited a partial and a complete unit of 5S rDNA (NTS-5S) with lengths of 347 and 452 bp, respectively. Termination sites and the TATA box were also identified. The arrow shows the GTkACA-3′ consensus sequence of the 3′ TIR of the Cassandra retrotransposon located in base pair 655 adjacent to the 5S rDNA. b In contig 5S-2 rDNA (MH547304), a complete unit and the vestigial remains of the 5S rDNA gene are shown. The arrows show the 5′-TGYYABA and GTkACA-3′ consensus sequences of the TIR of the Cassandra retrotransposon located in base pairs 551–566 and 171–184, respectively. c In contig 5S-3 rDNA (MH547305) two partial units of 5S rDNA are shown, the arrow indicating the consensus sequence GTkACA-3′ of the TIR of Cassandra retrotransposon located in base pairs 541–554, near the partial sequence of the 5S rDNA. d Contig 5S-4 rDNA (MH547306) containing an open reading frame (ORF) of the 5S rDNA gene and a 590 bp long NTS. The arrow corresponds to the GTkACA-3′ consensus sequence of the TIR of the Cassandra retrotransposon of the 5S rDNA unit

Contig 5S-1 rDNA

Contig 5S-1 was 1149 bp long and contained two units of 5S rDNA: a complete sequence (including the promoter region and the gene) of 425 bp and an incomplete sequence of 347 bp. Three regions corresponding to the NTS were identified in contig 5S-1 sized 232 bp (NTS-1–I at position 1–232), 358 bp (NTS-1–II at position 348–706), and 349 bp (NTS-1–III at position 800–1149). Two 5S rDNA CR were identified in the contig with lengths of 114 bp (CR-1–I at position 233–347) and 92 bp (CR-1–II at position 707–799; Fig. 1a).

Contig 5S-2 rDNA

Contig 5S-2 was 1092 bp long and contained 340 bp of a partial 5S rDNA sequence (CR-2–I), followed by a 606 bp long NTS and a 51 bp long partial 5S rDNA gene (3′-terminus of the gene). Contig 5S-2 included three NTS regions, two with partial sequences of 92 bp in positions 1–217 (NTS-2–I) and 998–1092 (NTS-2–III) and a complete 604 bp long NTS in position 341–945 (NTS-2–II; Fig. 1b).

Contig 5S-3 rDNA

Sized at 666 bp, contig 5S-3 contained two CR of the 5S rDNA gene, one 120 bp long at position 109–229 (CR-3–I) and another 72 bp long at base pairs 594–666 (CR-3–II; Fig. 1c). Contig 5S-3 rDNA also contained two NTS regions, one 109 bp in length at base pairs 1–108 (NTS-3–I) and another one 365 bp in length located at base pairs 230–593 (NTS-3–II).

Contig 5S-4 rDNA

With a size of 590 bp, contig 5S-4 contained a partial, vestigial 5S rDNA gene. Two NTS occupied 75% (443 bp) of the contig’s length (Fig. 1d), one 95 bp long at base pairs 1–95 (NTS-4–I) and another one 443 bp long at base pairs 147–590 (NTS-4–II).

Comparison of the CR of 5S rDNA of each contig with the haplotype sequences reported by Tamayo-Ordóñez et al. (2015) indicated that in general, the 5S rDNA gene region of the four contigs of A. tequilana corresponded to that in HAP-01 and was shown to be regulated by different types of NTS (Fig. 1). The in silico analysis of sequences revealed the presence of conserved elements, such as consensus sequences of boxes BOX-A and BOX-C, and the internal element (IE) characteristic of the 5S rDNA region of eukaryotes (Supplementary Fig. S2). The maintenance of the redundant haplotype HAP-01 suggests conservation and preferential amplification of this allele in species of Agave.

Identification of the Cassandra TRIM Recognition Sites in the 5S rDNA Region of Agave Accessions

Because the Cassandra TRIM sequence was found in all the consensus sequences of 5S rDNA contigs, we conducted a detailed analysis to describe the intraspecific variability of the 5S rDNA spacer and to determine the implications of these regulatory elements in the diversity of 5S rDNA regions. For these goals, we followed two approaches, described below.

One strategy was based on the isolation of two 5S rDNA units (NTS and 5S rDNA gene) with sizes of 476 and 506 bp from a BIBAC library of A. tequilana (Tamayo-Ordóñez et al. 2012), one with the redundant haplotype 01 (HAP-01) and another with the non-redundant haplotype 08 (HAP-08) specific to A. tequilana (Tamayo-Ordóñez et al. 2015). These units were designated NTS-BIBAC HAP-01 (accession number submitted) and NTS-BIBAC HAP-08 (accession number submitted). Comparison of these two sequences of 5S rDNA units revealed high variability of the NTS. Likewise, we found consensus elements, such as the TATA box, TSS, BOX-A, and BOX-C, and the internal element (IE) characteristic of the 5S rDNA regions of eukaryotes; with the exception of the TATA box, these were conserved (Supplementary Fig. S2).

The second approach consisted of using genomic DNA from five accessions of Agave with different ploidy levels for isolating the rDNA NTS by amplification with universal primers for the class Liliopsida. The results of this approach showed inter- and intraspecific variability in the NTS of A. tequilana (At2X), A. angustifolia (Aa2X, Aa6X), and A. fourcroydes (Af3X, Af5X). The variants of the NTS we identified were classified into four groups of NTS (I–IV) according to their length, grouping by haplotype (Fig. 2), number and percentage of cytosine residues, and theoretical restriction profile. The NTS-I, NTS-II, NTS-III, and NTS-IV presented sizes of 395, 409, 412, and 442 bp, respectively (Fig. 2); the four groups of spacers presented high homology with the NTS-BIBAC HAP-01. Each group had similar AT and GC contents. The AT contents in groups NTS-I, NTS-II, NTS-III, and NTS-IV were, respectively, 64, 70, 64, and 65%, and their corresponding GC contents were 36, 30, 36, and 35%. High AT contents were concentrated between base pairs 75 and 350. The relative frequency indicated that NTS groups IV and I were more represented in the Agave genome than NTS groups II and III (Table 1).
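The AT and GC percentages reported for each NTS group can be obtained with a simple base count. The following is a minimal sketch, assuming that ambiguous bases and gaps are excluded from the denominator; the function name and the toy sequence are hypothetical.

```python
def base_composition(seq):
    """Return (AT %, GC %) over unambiguous bases A, C, G, T."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in counted if b in "AT")
    at_pct = 100.0 * at_count / len(counted)
    return at_pct, 100.0 - at_pct


# Toy sequence: three of four bases are A or T.
print(base_composition("AATG"))  # (75.0, 25.0)
```

Applying the same count in a sliding window would show where along the spacer the AT-rich stretch lies.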

Fig. 2

Minimum haplotype network based on NTS from different Agave L. accessions. The haplotype colors representing the different Agave L. accessions are shown in the upper right corner. Missing haplotypes are indicated as small black squares. Each line connecting haplotypes represents one mutational step. Circle area is proportional to the methylated ratio of each NTS. Dotted circles show NTS of the same size (bp) range, and NTS obtained from the massive sequencing of BIBAC clones of A. tequilana are illustrated as gray circles

Table 1 Methylation of symmetric and asymmetric sites in the NTS in species of Agave L.

Additionally, we performed a comparative analysis of the sequence similarity of the NTS in the BIBAC library of A. tequilana and found that contig 5S-3 rDNA contains a spacer from group I (NTS-I), while contigs 5S-1 rDNA, 5S-2 rDNA, and 5S-4 rDNA had low similarity values of 80, 57, and 11%, respectively, grouping separately from the NTS groups I, II, III, and IV; this suggests new groups and a specialized divergence of the NTS in contigs 5S-1 rDNA, 5S-2 rDNA, and 5S-4 rDNA (Fig. 3).

Fig. 3

Comparative in silico analysis of the 5S rDNA NTS of Agave. a The upper panel illustrates the general alignment diagram of the NTS sequences. The dendrogram resulting from the analysis of sequence similarity of the NTS is shown to the left. The analysis consisted of a global alignment, with an identity matrix of 1.0/0.0, of NTS sequences of about 600 bp using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA). The numbers in the dendrogram represent the genetic similarity of the NTS sequences, with values of 0 and 1 indicating the lowest and highest similarity, respectively. Accession numbers corresponding to the 18 sequences are MH547285–MH547302. b Consensus sequences of the Cassandra terminal-repeat retrotransposon in miniature (TRIM) identified in the different contigs

Table 2 Total methylation in accessions of Agave L.

In addition, the analysis of the NTS region in the four 5S rDNA contigs of A. tequilana and of the NTS sequences classified in the groups I–IV in Agave (NTS-I, -II, -III, and -IV) revealed the presence of consensus sequences of the 5S rDNA-dependent Cassandra TRIM (Fig. 3). The Cassandra TRIM contains a 5′-TG…CA-3′ universal terminal sequence and a terminal inverted repeat (TIR) varying in length from 6 to 12 bp that is typical of the long terminal repeats (LTR) of retrotransposons. The TIR sequence GTkACA-3′ of the Cassandra TRIM was found at base pair 450 in all NTS groups of Agave (I–IV) and in all contigs of A. tequilana except for NTS-BIBAC HAP-08. In contrast, the TIR sequence 5′-TGYYABA at the approximate position of base pair 310 was only found in contig 5S-2 rDNA (Fig. 3).
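A search for the two TIR consensus sequences can be sketched by expanding their IUPAC ambiguity codes (Y = C or T; K = G or T; B = C, G, or T) into a regular expression. This is a minimal illustration that scans only the strand supplied; the helper names and the toy sequence are hypothetical, and this is not the procedure used to produce Fig. 3.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(seq, motif):
    """Return 1-based start positions of all matches, overlapping ones included."""
    body = "".join(IUPAC[code] for code in motif.upper())
    pattern = re.compile("(?=(" + body + "))")  # lookahead keeps overlapping hits
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]


# Toy sequence containing one copy of each TIR consensus.
toy = "AAATGCTACAGGGTTACAAA"
print(find_motif(toy, "TGYYABA"))  # [4]
print(find_motif(toy, "GTKACA"))   # [13]
```

Because such short degenerate motifs can occur by chance, a hit would normally be interpreted together with its position relative to the 5S rDNA gene, as done in the text.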

In general, the comparative similarity analysis of the NTS sequences showed that the NTS in contigs 5S-1 rDNA and 5S-4 rDNA and in NTS-BIBAC HAP-08 are probably divergent and have different transcription regulation elements relative to the more abundant NTS-BIBAC HAP-01.

DNA Methylation in Regions of 5S rDNA

Because of the differences in expression reported for some rDNA haplotypes of Agave L. (Tamayo-Ordóñez et al. 2015), we decided to analyze the role of DNA methylation in the diversity and regulation of 5S rDNA units in Agave. We analyzed four major aspects in each studied accession of the genus: (i) total DNA methylation in the genome of accessions of Agave, (ii) specific DNA methylation patterns of the 5S rDNA gene and NTS regions, (iii) methylation of symmetric and asymmetric sites in the 5S rDNA gene and NTS regions, and (iv) methylation effect on RNA fold and functionality of 5S rDNA.

Total DNA Methylation

The percent DNA methylation values determined by HPLC were similar in all studied species of Agave (36–41%) (Fig. 4). However, considering differences in ploidy level and genome size in each polyploid accession, the proportion of methylated genome was higher in polyploid species A. fourcroydes (2n = 5x = 150) and A. angustifolia (2n = 6x = 180), which had 1.5 (13,212 Mbp) and 3 (18,081 Mbp) times more methylated genome, respectively, than their lower ploidy level counterparts (Table 2).

DNA Methylation in the 5S rDNA Gene and NTS Regions

The DNA methylation of the 5S rDNA regions (including the 5S gene and NTS) determined by the sodium bisulfite method in each accession of Agave (Table 2) indicated that the percentage of methylation of the 5S rDNA gene ranged from 30 to 32%, and that of the NTS ranged from 26 to 29%. Both ribosomal regions (5S and NTS) have similar percentages of methylation in the polyploid species, suggesting that methylation of both ribosomal regions might play a relevant role in regulation and organization of the 5S rDNA region; additionally, the high total methylation of the genome could be significantly contributing to genome condensation and organization of the complex genome of Agave L.

Methylation of Symmetric and Asymmetric Sites in Haplotypes’ 5S rDNA and NTS Regions

Specific 5S rDNA methylation patterns in the haplotypes determined by the bisulfite method were used for establishing symmetric and asymmetric sites of 5-methylcytosine (5mC) in 5S rDNA haplotypes of the studied accessions of Agave (Tamayo-Ordóñez et al. 2015). Our strategy was based on identifying percentages of CGN, CHG, and CHH methylation events as either independent or combined events relative to the total number of cytosines (100%).
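As a hedged illustration of how cytosines can be assigned to sequence contexts and called from bisulfite data, the sketch below compares a reference sequence with a bisulfite-converted read of the same strand. It assumes ungapped sequences of equal length containing only A, C, G, and T; a cytosine retained as C is treated as methylated and a C read as T as unmethylated. The function names and the example sequences are hypothetical, and the label `CG` corresponds to the CGN sites of the text.

```python
def cytosine_context(seq: str, i: int) -> str:
    """Return 'CG', 'CHG', or 'CHH' for the cytosine at index i.

    'NA' is returned when the sequence ends before the context is complete.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    nxt = seq[i + 1] if i + 1 < len(seq) else None
    nxt2 = seq[i + 2] if i + 2 < len(seq) else None
    if nxt is None:
        return "NA"
    if nxt == "G":
        return "CG"
    if nxt2 is None:
        return "NA"
    return "CHG" if nxt2 == "G" else "CHH"


def call_methylation(reference: str, converted: str):
    """List (index, context, methylated) for each cytosine in the reference."""
    if len(reference) != len(converted):
        raise ValueError("sequences must have the same length")
    calls = []
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), converted.upper())):
        if ref_base != "C":
            continue
        if read_base not in ("C", "T"):
            continue  # neither retained nor converted: uninformative position
        calls.append((i, cytosine_context(reference, i), read_base == "C"))
    return calls


# Hypothetical sequences, for illustration only.
reference = "ACGTCAGCTTC"
converted = "ACGTTAGCTTT"
calls = call_methylation(reference, converted)
print(calls)
```

Only one strand is examined here; a full analysis would also evaluate the complementary strand, where the symmetric partner of each CG and CHG site lies.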

Fig. 4 Total methylation and specific methylation of 5S rDNA units in different species of Agave L. Percentage of total methylation determined by the HPLC technique (left side) and percentage of methylation obtained by the sodium bisulfite technique (right side). Different letters indicate significantly different values (Student’s t test; p < 0.05)

In this study, we identified 11 haplotypes previously reported by Tamayo-Ordóñez et al. (2015) and two new haplotypes (HAP-18 and HAP-20) (Table 3). The results of the haplotype methylation analysis indicated that 5mC events occurred both in symmetric CGN sites and in asymmetric CHG and CHH sites.

Table 3 Site specific methylation of 5S rDNA haplotypes in species of Agave L.

Independent methylation events occurred in CGN at average percentages of 88, 100, and 94% for the II, III, and IV allelic groups, respectively. The percentages of independent methylation events in CHG sites were 45–100%, and the lowest percentages of methylation (23–56%) were found in CHH sites, suggesting that this type of methylation is a less frequent methylation event found in the 5S rDNA haplotypes of Agave (Table 3).

The results of the analysis of combined 5mC events (CGN, CHG, and CHH) showed an average total methylation of 94, 95, and 90% for the II, III, and IV allelic groups, respectively. Methylation values were lowest at CHH sites (8–23%), followed by CGN (15–37%) and CHG (28–63%) sites. Interestingly, allelic group IV included the haplotypes with the lowest percentages of methylation (HAP-18 and HAP-10), whereas the haplotypes that presented 99% methylation included HAP-01, HAP-09, HAP-15, and HAP-17 (allelic group IV) and HAP-08 (allelic group II).
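The distinction between independent and combined events can be sketched as follows. This is one possible reading, stated here as an assumption: an independent percentage is taken relative to the cytosines of a single context, whereas a combined percentage is taken relative to all cytosines, so that the combined values of the three contexts add up to the total methylation. The counts in the example are hypothetical and the function name is illustrative.

```python
from collections import Counter


def methylation_percentages(calls):
    """Summarize per-context methylation from (context, methylated) pairs.

    Returns (independent, combined, total):
      independent: % methylated among the cytosines of each context
      combined:    % of all cytosines that are methylated in each context
      total:       sum of the combined percentages
    """
    total_cytosines = 0
    per_context = Counter()
    methylated = Counter()
    for context, is_methylated in calls:
        total_cytosines += 1
        per_context[context] += 1
        if is_methylated:
            methylated[context] += 1
    independent = {c: 100.0 * methylated[c] / n for c, n in per_context.items()}
    combined = {c: 100.0 * methylated[c] / total_cytosines for c in per_context}
    return independent, combined, sum(combined.values())


# Hypothetical haplotype with 10 cytosines, for illustration only:
# 4 CG (4 methylated), 3 CHG (2 methylated), 3 CHH (1 methylated).
example = (
    [("CG", True)] * 4
    + [("CHG", True)] * 2 + [("CHG", False)]
    + [("CHH", True)] + [("CHH", False)] * 2
)
independent, combined, total = methylation_percentages(example)
print(independent, combined, total)
```

Under this reading, a context can be fully methylated as an independent event and still contribute a modest combined percentage when it holds few of the cytosines in the haplotype.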

On the other hand, methylation analysis of the NTS regions showed similar methylation patterns among the four groups of spacers analyzed. When sites were considered as independent events, the methylation percentage varied among the four groups of spacers (12–98%). By site, NTS-II and NTS-III showed higher percentages of methylation in CGN sites (> 94%), and NTS-II and NTS-IV showed the highest percentages in CHG sites (> 91%). NTS-I and NTS-IV exhibited values ranging from 35 to 41% in CHH sites. The average symmetric and asymmetric methylation of NTS groups showed the same frequency trend, CGN > CHG > CHH, in all groups of spacers analyzed (Table 1). When estimated as combined events of symmetric and asymmetric methylation, the highest values and widest ranges of methylation percentages were found in CGN sites (22–66%), followed by the CHG (12–42%) and CHH (8–27%) sites. The average total percentages of methylation of NTS groups were similar in NTS-I, II, and IV (> 88%). NTS-III (only present in A. angustifolia) had the lowest total methylation value of 38%. Similar methylation trends in rDNA regions have been reported for the family Asteraceae (Garcia et al. 2012) and Cycas revoluta Thunb (Wang et al. 2016).

The comparative analysis of methylation of the NTS (analyzed as combined events) among polyploid species showed that in triploid and pentaploid accessions of A. fourcroydes, methylation ranged from 68 to 92%. The diploid A. angustifolia ‘Marginata’ had the highest (99%; NTS-02 of group NTS-I) and lowest (38%; NTS-08 of group NTS-III) total methylation values (Table 1). The hexaploid accession of A. angustifolia showed higher methylation percentages (83%). Finally, the diploid accessions of A. tequilana and A. angustifolia showed the widest ranges of methylation, from 38 to 99%. In all species of Agave having different ploidy levels, the total methylation percentages of the NTS-I, -II, -III, and -IV groups were higher in CGN sites (> 22%) compared with CHG and CHH sites (> 8%) (Table 1). Few differences in methylation density were found among the four groups of NTS and when analyzed by species, which could suggest that DNA methylation is associated with the diversity of rDNA regions.

Methylation Effect on RNA Fold and Functionality of 5S rDNA

As has been described previously, transcription of 5S rDNA is not inhibited by DNA methylation in Arabidopsis (Mathieu et al. 2002), and post-transcriptional regulation through the methylation of the 2′-hydroxyl group of ribonucleotides has been described in 18S rRNA (Granneman and Baserga 2005; Xue and Barna 2012; Fu et al. 2014; Yang et al. 2015). To determine how methylation of RNA transcripts could influence the conformation of the secondary structure of the ribosomal RNAs (already transcribed) and possibly their functionality, we performed in silico RNA fold prediction and functionality analyses. Modeling of the unmethylated haplotypes HAP-01, HAP-03, HAP-05, HAP-08, and HAP-10 showed secondary structures that fold into the structure derived from X-ray crystallography of ribosomal genes described in species of the class Liliopsida, such as Elymus sibiricus L. and Hordeum chilense (Szymanski et al. 2002; Szymański et al. 2003; Tamayo-Ordóñez et al. 2015) (Fig. 5). HAP-18 and HAP-20 exhibited multiple secondary structures (Fig. 5e, f). Interestingly, the haplotype HAP-03 (Fig. 5b) could lose the conformation of its secondary structure if post-transcriptional methylation was present. Additionally, this haplotype has been shown to have low percentages of methylation (70%), suggesting that perhaps the post-transcriptional methylation could regulate the transcription of rDNA genes. HAP-10 had methylation percentages similar to HAP-03 (68%), and in this haplotype post-transcriptional methylation could contribute to the acquisition of secondary structure conformations closer to the structure derived from X-ray crystallography reported by Szymanski et al. (2003); this result suggests that post-transcriptional methylation in 5S rDNA genes may function as a second transcriptional control barrier of these highly represented genes in the Agave genome, which have not been methylated at the DNA level. Haplotypes with lower percentages of methylation may be transcribed and subjected to post-transcriptional methylation, as suggested by haplotypes HAP-03 and HAP-10. In general, all secondary structures showed ΔG values ranging from −43.1 to −17.3; HAP-01 and HAP-02 showed lower values of free energy, indicating a greater stability of these structures.

Fig. 5 Secondary structures of methylated haplotypes (HAP) 5S rDNA of Agave L. Haplotypes 5S rDNA, allelic group IV: a HAP-01 redundant; non-redundant; b HAP-03; c HAP-05; d HAP-10; e HAP-18; f HAP-20. Allelic group II (g) HAP-08 non-redundant. HAP-01, HAP-05, HAP-08 showed 100% of methylation and HAP-03, HAP-10, HAP-18, HAP-20 had the highest and lowest percentage of methylation (> 75%). The haplotypes of Agave L. were complemented with the terminal sequences 5′-GGAUG and 3′-CCC. Values on the lower right border of each plot show free energy (ΔG)

In silico functionality analysis showed that HAP-01, HAP-15, HAP-18, and HAP-20 had high probabilities of being functional (> 99%) (Supplementary Table 1). Meanwhile, the HAP-14, HAP-05, and HAP-08 haplotypes were less likely to be functional (< 67%). No relationship was found between the functionality of the haplotypes and the percentages of methylation in each haplotype. This result suggests that methylation could impact rDNA gene regulation but not functionality.

Discussion

Involvement of Retroelements in Organization and Variability of 5S rDNA Regions in Agave L.

Massive sequencing of clones from the BIBAC genomic library of A. tequilana (2n = 2x = 60) (Tamayo-Ordóñez et al. 2012) verified that 5S rDNA units have a tandem organization, as has been described in species of Nicotiana and Arabidopsis and in Rosa rugosa, among other plant species (Cloix et al. 2000; Kovarik et al. 2008; Tynkevich and Volkov 2014), and the size of the CR of the 5S rDNA genes (111–120 bp) is similar to that reported for other members of the class Liliopsida, such as Zea mays, Lilium tsingtauense, and Helictotrichon marginatum. In general, all of the CRs we identified in our 5S rDNA contigs were identical to those in haplotype 01 (HAP-01), previously reported by Tamayo-Ordóñez et al. (2015).

In Agave, it has been determined that even though the genes 45S rDNA and 5S rDNA are related to protein synthesis, they are located in completely independent loci and increase according to the ploidy level of species (Robert et al. 2008). Additionally, in the family Asparagaceae, the organization and clustering of rDNA genes is not independent of taxon but could be the product of evolutionary processes on these genes and of the genetic events they are experiencing (Garcia et al. 2010). In Agave, Gómez-Rodríguez et al. (2013) demonstrated that the 18S rDNA locus is constant across ploidy levels, but the 5S rDNA region is increased in diploid species, suggesting that these 5S rDNA regions could be subject to more frequent genetic imbalances, such as unequal exchange or transposition events, in comparison to 18S rDNA regions. Retroelements might have a primary function relating to events of transposition in the Agave genus (Khaliq et al. 2012; Hertweck 2013).

Tandem organization of 5S rDNA in A. tequilana would suggest that tandem organization of related genes in Agave can be directly related to gene divergence, gene function, and gene regulation mechanisms and can be affected by all genetic factors to which they have been exposed. While these factors have a closer relationship with chromosome organization, these genes will have the same organization in the genome of a species (Tamayo-Ordóñez et al. 2016b). The identification of 21 haplotypes of 5S rDNA that are possibly regulated by four different NTS in Agave, together with the observation that the 5S rDNA regions on metaphase chromosomes form two independent loci (Robert et al. 2008), suggests a chromosomal reorganization in 5S rDNA regions in which haplotypes with similar functions and regulation mechanisms are organized in a single locus. A better understanding of the organization of 5S rDNA genes in the polyploid genome may help gain insight into the regulation, organization, and function of the new haplotypes, which at present are unknown in Agave polyploids.

Interestingly, consensus sequences of the 5S rDNA-dependent terminal repeat of the Cassandra TRIM repeat were present in all four contigs of A. tequilana and in the four groups of NTS from all the studied species of Agave (Fig. 3). In all 5S rDNA NTS, the TIR sequence of the Cassandra retrotransposon GTkACA-3′ was located at base pair 450. Only in the 5S-2 rDNA contig was the TIR sequence 5′-TGYYABA located at base pair 310 (Fig. 1). The presence of vestigial sequences of the Cassandra retrotransposons in the genome of the genus Agave suggests participation of this mobile element in the increase in the number of copies and genetic diversity of 5S rDNA in the genome of Agave L. The origin of the Cassandra retroelements remains undetermined, but because they are present in ferns, tree ferns, and angiosperms, Kalendar et al. (2008) claimed that these mobile elements originated from an evolutionary radiation rather than from horizontal transfer. The coevolution of retroelements Cassandra and Athila with 5S rDNA regions has been described in Triticum spp., Hordeum spp., Oryza sativa, Avena sativa, and Arabidopsis, among other plants (Cloix et al. 2000; Wicke et al. 2011). It has been suggested that the Cassandra retrotransposon originated from retrotransposition of SINE elements derived from 5S rDNA. The function of the Cassandra retroelement is unknown, but it has been shown to play a role in the diversification of 5S rDNA as the result of conservation of its vestiges in the ribosomal region (Cloix et al. 2000; Kalendar et al. 2008; Cioffi et al. 2010; Wicke et al. 2011; Sampath and Yang 2014). Because recognition sites for polymerase III are found in the LTR, it is thought that Cassandra belongs to the group of non-autonomous retroelements dependent on the 5S rDNA transcription mechanism. The presence of the autonomous replication Cassandra TRIM in A. tequilana could be contributing to the reorganization and diversity of the 5S ribosomal region in this diploid species, in which other 5S rDNA haplotypes have been found (Tamayo-Ordóñez et al. 2015).

The rDNA regions of polyploid species in the genus Agave have been used as chromosomal markers. According to Robert et al. (2008), these regions show conservation and additivity in species with different ploidy levels. However, Gómez-Rodríguez et al. (2013) reported an irregular distribution pattern of 5S rDNA loci in diploid species of Agave, including A. tequilana, suggesting that this reorganization of loci is possibly related to the divergence of this genomic region. Genetic events, such as unequal recombination and retrotransposition in rDNA regions, have been widely reported in monocots, such as Alstroemeria L. (Chacón et al. 2012), and in gymnosperms (Garcia and Kovařík 2013). Additionally, Khaliq et al. (2012) recently described that the Ty1-Copia retrotransposons are the main component of the genome of A. tequilana (approximately 50–80%), and Tamayo-Ordóñez et al. (2012), by means of the partial characterization of a genomic BIBAC library of A. tequilana, found a high representation of retroelements, indicating that these mobile elements could play an important role in the organization, regulation, and evolution of redundant, highly represented genes in the genome, such as the rDNA genes.

The monocotyledonous plant Agave has a large angiosperm genome, which has been shown to vary in size and structure (karyotype) (Cavallini et al. 1995; Castorena-Sánchez et al. 1991; Moreno-Salazar et al. 2007; Robert et al. 2008). In plants, genome size increases primarily arise from a combination of polyploidy and proliferation of TEs, which are balanced by various methods of genome downsizing (Bennetzen 2002). Molecular approaches in maize, rice, Oryza, Gossypium, Vicia, and Pinus (with larger genomes) (Sanmiguel and Bennetzen 1998; Hill et al. 2005; Hawkins et al. 2007; Piegu et al. 2006; Zuccolo et al. 2007; Morse et al. 2009) have attributed large genome sizes to LTR retrotransposons, and recently, a comparative study of the abundance of retroelements in the order Asparagales has indicated that the Ty1-Copia, Gypsy, and SINE retroelements could be responsible for the variation in genome size in this order (Hertweck 2013). Within the family Asparagaceae, the representation of retroelements was variable between subfamilies (Aphyllanthoideae, Lomandroideae, Nolinoideae, Asparagoideae, Scilloideae, Brodiaeoideae, and Agavoideae). This result indicates that the abundance of retroelements in different genomes may be influenced by hybridization and (or) polyploidy (Tamayo-Ordóñez et al. 2016a). To date, there is evidence for at least two polyploid events in the history of Agavoideae, which is consistent with the origin of bimodal karyotypes in the subfamily (McKain et al. 2012). Genome downsizing following allopolyploidy in Nicotiana was partially explained by transposon removal (Renny-Byfield et al. 2011). It is possible that in subfamily Agavoideae, TE dynamics from repeated hybridization and (or) polyploid events resulted in a large genome and a bimodal karyotype (Hertweck 2013). The Cassandra retroelement identified in species of Agave might be linked to a decrease of the homogenization rate of ribosomal regions and be contributing to the diversity of the 5S rDNA region, and as described above, it could have a major role in the increase in the size of the genome and contribute to the processes of pseudogenization, subfunctionalization, and neofunctionalization (Chen 2007, 2010; Wicke et al. 2011) that impact the Agave genus evolutionarily.

Impact of DNA Methylation on the Diversity of 5S rDNA in Agave L.

According to our results, the average percentage of total methylation in the genome of studied accessions of Agave was 40%, for which the 5S rDNA and NTS regions were 30 and 28% methylated, respectively (Table 2). This shows that specific methylation of the 5S rDNA regions may have a relevant role in the regulation and diversity of the rDNA region (Tamayo-Ordóñez et al. 2015).

In plants, 5S rDNA methylation can occur either in symmetric (CG and CXG) or in asymmetric (CXX) sites characteristic of peri-centromeric regions (Vanyushin 2006; Vaillant et al. 2007, 2008). Total methylation in the 5S-II, 5S-III, and 5S-IV allelic groups was approximately 89–94% (Table 3). In the 5S-IV allelic group, we identified the largest number of haplotypes compared with the other groups, and 40% of the total haplotypes represented in this group demonstrated low percentages of methylation. Haplotypes with lower percentages of methylation (analyzed as combined events) within this group were HAP-10 and HAP-18 (< 85%). This trend could be related to the expression of redundant genes in this allelic group and the need for better regulation in transcription of rDNA genes because there are a greater number of haplotypes with functional redundancy. Expression analyses indicated that in all studied species of Agave, the 5S-IV rDNA group was the most highly expressed group (Tamayo-Ordóñez et al. 2015), and in this study, we demonstrated that in this group, there are haplotypes with lower percentages of methylation. We suggest that methylation may be playing an important role in regulating the transcription of haplotypes belonging to group IV in Agave.

Regarding the functionality of the haplotypes in each 5S rDNA allelic group (Tamayo-Ordóñez et al. 2015), we found no relationship between functionality and percentage of methylation of variants of the 5S rDNA region in the studied accessions of Agave. However, the finding of haplotypes (HAP-03 and HAP-10) that can modify their secondary structures if there is post-transcriptional methylation suggests that perhaps, post-transcriptional methylation could regulate the transcription of rDNA genes and may function as a second transcriptional control barrier of highly represented genes in the Agave genome, which have not been methylated at the DNA level.

Nevertheless, the highest methylation percentage (37% as a combined event and 82% as an independent event) and a wider range of methylation percentage occurred in the symmetric site CGN, which suggests that CGN methylation events play an important role in regulation of 5S rDNA in this genus. Thus, transcriptional regulation of certain allelic groups of 5S could be controlled by symmetric/asymmetric methylation events, mainly in CGN sites. In Nicotiana, Garcia et al. (2012) reported that methylation of symmetric or asymmetric sites can be independent or combined events. The apparent trend of methylation in symmetric and asymmetric sites in Agave is CGN>CHG>CHH (groups 5S-III and 5S-IV), and similar trends have been reported in members of the Asteraceae family (Garcia et al. 2012). These results are evidence that intense dynamics are occurring in the 5S rDNA of Agave.

The main roles played by ribosomal genes in cell metabolism are well documented, and in plants, these genes are present in a large number of copies in order to respond to greater demands for protein synthesis during development and for stabilizing cell functions. Epigenetic events, such as DNA methylation, have an important function in maintaining control of rDNA gene transcription according to the metabolic demands of each species (Vaillant et al. 2007; Cokus et al. 2008; Kobayashi 2011). Additionally, in allopolyploid species of Nicotiana, rDNA methylation has been related to the capability of these species to lower the rate of homogenization, therefore contributing to divergence and diversity of rDNA (Kovarik et al. 2008).

Additionally, the high percentage of symmetric methylation (in CGN sites) could be associated with maintaining plasticity of plant responses against environmental factors. In a study of transcriptionally active 5S rDNA units in Arabidopsis thaliana, two variants differing in one or two base substitutions and size were evaluated, and it was found that hypomethylation of 5S rDNA in symmetric CG or GNC sites was correlated with repression of 5S rDNA genes dependent on the methylation-independent pathway (MOM) (Mathieu et al. 2002, 2005; Vaillant et al. 2007). It has been hypothesized that symmetric methylation in plants is strongly correlated with repression of certain 5S rDNA genes and with prevention of rapid epigenetic deregulation in response to abiotic factors (Mittelsten-Scheid et al. 2002; Kinoshita and Seki 2014). Thus, methylation of 5S rDNA in Agave could be associated with the continuous transcriptional dynamics of ribosomal regions.

In a previous study, Tamayo-Ordóñez et al. (2015) showed that redundant and most non-redundant 5S rDNA haplotypes are probably functional, and in the present study, the percentage of methylation and functionality of 5S rDNA haplotypes were unrelated in the accessions of Agave that we studied. Methylation of rDNA has been related to proliferation of pseudogenes contributing to higher condensation of chromatin and heterochromatin, preventing these pseudogenes from being eliminated from the genome so that they are transmitted to new generations in high copy numbers, which contributes to the divergence of these ribosomal regions (Kobayashi 2006, 2011; Chen 2007; Kovarik et al. 2008). We here identified two new methylated 5S rDNA haplotypes (HAP-18 and HAP-20), which probably indicates that genetic dynamics is strongly correlated with 5S rDNA methylation in Agave. These epigenetic events play an important role in chromatin condensation and may be involved in the maintenance of non-functional haplotypes and contribute to divergence of 5S rDNA genes in subsequent generations, as has been described in Nicotiana (Kovarik et al. 2008).

Regarding the frequency of specific methylation events in the 5S rDNA NTS of the studied accessions of Agave, we identified and described four NTS groups. In NTS-I and NTS-IV, the most represented groups and the ones present in all studied species of Agave, the percentage of methylation in CGN sites was > 82% for independent events and > 37% for combined events. Average symmetric and asymmetric methylation of NTS groups showed the same frequency trend, CGN>CHG>CHH, in all spacer groups (Table 1).

Few differences in the density of methylation among NTS clones could be associated with high levels of condensation in the 5S rDNA regions of the studied accessions of Agave, which prevents access of transcription machinery such as TFs and inhibits their homogenization (Pikaard 2000; Lawrence et al. 2004; Orioli et al. 2012).

On the other hand, it is important to note that transcription of 5S rDNA genes by RNA polymerase III (RNA pol III) requires an internal promoter located inside the gene and is regulated by factors TFIIIA, B, and C. In the polyploid species of Agave that we studied, cytosines were absent from the elements BOX-A, -C, and IE in the 5S rDNA region. Therefore, methylation is possibly acting by protecting these genetic regions from recombination, thus contributing to divergence of rDNA units and, to a lesser extent, to transcription of non-functional units. Overall, DNA methylation of the whole 5S rDNA unit (including the promoter, NTS, and gene) could be important for chromatin condensation, transcriptional regulation of genetic units highly represented in the genome, and genomic organization of loci in the studied species of Agave (Probst et al. 2004; Layat et al. 2012; Orioli et al. 2012; Tamayo-Ordóñez et al. 2016b).

Finally, it is known that changes in cytosine methylation density in promoters and specific histone modifications dictate the on and off states of rRNA genes (Lawrence et al. 2004). Recently, variation in ribosomal RNA (rRNA) sequences, in conjunction with diversity in the composition and post-translational modifications of subsets of ribosomal proteins and variation in binding to distinct ribosome-associated factors, has been associated with the occurrence of specialized ribosomes in different cell types (Xue and Barna 2012). In Agave, variations identified in rDNA sequences and different profiles of methylation in NTS and the 5S rDNA gene might suggest the presence of specialized ribosomes that provide an important new layer for the spatio-temporal control of genes in complex polyploid genomes (Xue and Barna 2012; Tamayo-Ordóñez et al. 2016b).

Genetic Diversity of 5S rDNA Is Related to Evolution and Ploidy Level in the Genus Agave

Ribosomal genes constitute a multigene family of sequences repeated in tandem and are highly represented in the genome. The 18S and 5S rRNAs are part of the small and large subunits of the ribosome, respectively. The regulation of these genes is essential for the metabolism, development, and maintenance of different types of cells (Tamayo-Ordóñez et al. 2015; Xue and Barna 2012). In Agave, the dynamics of the ribosomal regions are still unknown. The information generated from this research, in conjunction with the comparison of our results with those in other allopolyploid models, leads us to suggest the following model of evolution and diversity of 5S rDNA in Agave.

In Agavoideae, there is evidence for at least two polyploid events, which is consistent with the origin of bimodal karyotypes in Agave (McKain et al. 2012). This suggests that if the origin of Agave is from the combination of two different genomes (allopolyploid), it could have led to genomic shock and involved changes in DNA sequence, epigenetic events, karyotype, and gene transcription levels. It is known that in allopolyploid plant species, repetitive DNA in particular is subjected to changes in sequence, copy number, and the occurrence of genetic and epigenetic events, which could affect the regulation and genetic diversity of these repetitive regions (Fig. 6) (Adams et al. 2003; Kovarik et al. 2008; Song and Chen 2015; Tamayo-Ordóñez et al. 2016b).

Fig. 6 Graphic model of the behavior of 5S rDNA in species of the genus Agave. Agave has a bimodal karyotype in which rDNA loci are located in different chromosomes. The 5S rDNA locus is located in the peri-centromeric region of a small submetacentric chromosome and locus 35S rDNA is located in acrocentric chromosomes. In plants and other organisms, the 5S rDNA and 35S rDNA loci are organized in tandem and contain genes coding for the small and large ribosomal subunits, respectively. In the allopolyploid genome (resulting from combination of two different genomes) of Agave different 5S rDNA alleles have been identified. The divergence, representation, differential expression, and functionality of each allelic group of 5S rDNA differed among species of Agave due to the presence of the Cassandra retrotransposon, transcription factors, and mainly, the participation of symmetric/asymmetric methylation in the NTS controlling each allelic group

Tamayo-Ordóñez et al. (2015) described possible functional alleles of 5S rDNA in the genome of Agave that are widely represented in all studied species and at all ploidy levels (A. tequilana, 2n = 2x = 60; A. fourcroydes, 2n = 3x = 90 and 2n = 5x = 150; A. angustifolia, 2n = 2x = 60 and 2n = 6x = 180). Specifically, 5S rDNA regions of A. tequilana, A. fourcroydes, and A. angustifolia have variable copy numbers and differential expression among accessions (Tamayo-Ordóñez et al. 2015). Additionally, the presence of divergent allelic groups and NTS of 5S rDNA that we report here could have originated by genetic events, such as rearrangements, deletions, unequal recombination, gene conversion, and transposition. Ty1-Copia and Cassandra retroelements also play an important role in the variability of the genome and diversity of 5S rDNA genes (Chen 2007; Kovarik et al. 2008; Cioffi et al. 2010; Wicke et al. 2011; Tamayo-Ordóñez et al. 2012; Sampath and Yang 2014). We also found high proportions of methylation in 5S rDNA haplotypes; methylation events might favor fixation of some 5S rDNA alleles, resulting in higher copy numbers, divergence, and differential expression of 5S rDNA genes. The 5S rDNA units that were methylated prior to the processes of homogenization could have been transmitted to subsequent generations (Fig. 6) (Kovarik et al. 2008; McStay and Grummt 2008).

The differential expression of 5S rDNA allelic groups in the genus Agave could also be associated with species-dependent functions and demands of these genes. Expression of the different 5S rDNA alleles might be additive or non-additive (nucleolar dominance) and may be influenced by both genetic and epigenetic events, such as symmetric/asymmetric methylation and the presence of retroelements, as suggested by our present work (Fig. 6) (Kovarik et al. 2008; Secco et al. 2015; Martinez et al. 2013).

It is possible that the high divergence we observed in NTS may be an effect of the fixation of certain haplotypes due to the rapid evolution to which commercially important species of Agave are subjected, in particular A. tequilana. However, A. tequilana has the highest number of non-functional haplotypes and NTS groups with high genetic variability (Mayer et al. 2006). Finally, the genetic diversity we found in the 5S rDNA regions of the studied accessions of Agave might be related to the evolution and complexity of the genome of this genus, and given its recent origin (< 10 Mya), it is possibly undergoing a process of sequence homogenization due to concerted evolution of ribosomal units (Fig. 6) (Kovarik et al. 2008).

In conclusion, the 5S rDNA regions of Agave tequilana are organized in tandem, and the presence of at least 13 haplotypes classified into 4 allelic groups, together with the identification of 4 NTS groups, suggests that some copies of these genes can be differentially regulated in the polyploid species of Agave.

Our results also indicated that the divergence found between NTS groups and haplotypes described in Agave can be influenced by genetic changes such as transposition, and the identification of the remains of the Cassandra retroelement suggests active participation of these mobile elements in the diversity of 5S rDNA regions in Agave.