Introduction

Most angiosperms have symmetric karyotypes (Stebbins 1971; Weiss-Schneeweiss and Schneeweiss 2013), with chromosomes similar in size and morphology. However, several lineages have been known to have asymmetrical karyotypes, characterised by centromeres at different positions along chromosomes, or chromosomes of different sizes. Karyotypes with two sets of chromosomes markedly different in size are called bimodal and represent the extreme of karyotype asymmetry (Stebbins 1971; McKain et al. 2012). Three main hypotheses were suggested for the origin of bimodal karyotypes. The first mechanism involves chromosomal rearrangements such as fusion-fission events. Thus, larger chromosomes can be the product of small chromosome fusions or small chromosomes the result of fission of larger ones (Schubert and Lysak 2011). Lygosoma bowringii Günther 1864, a lizard of the Scincidae family, has 2n = 32, with 18 macrochromosomes and 14 microchromosomes. As in most reptiles, macrochromosomes have originated from the fusion of microchromosomes (Lisachov et al. 2018, 2020). The second mechanism is allopolyploidy involving parental species with different chromosomal sizes, suggested for some genera such as Agave L. (McKain et al. 2012). The third possibility is the progressive amplification of repetitive DNA sequences in one set of chromosomes (de la Herrán et al. 2001). In Eleutherine Herb. (Iridaceae), for example, an accumulation of different families of repetitive sequences in the largest chromosome pair was suggested to be the cause of the differences between both chromosome sets (Báez et al. 2019).

Repetitive DNA constitutes a large fraction of plant genomes and can be found either organised in tandem (micro-, mini- and satellite DNAs), or dispersed through the genome (transposons and retrotransposons) (Heslop-Harrison and Schwarzacher 2011). Transposable elements are capable of moving within the genome, impacting genome structure and even the function of genes (Bourque et al. 2018). The highly abundant repetitive sequences are frequently associated with heterochromatin located at the (peri-)centromeres, subtelomeres and in interstitial heterochromatic blocks (Barros e Silva et al. 2010; Van-Lume et al. 2019). The effects of transposable element and satellite DNA accumulation on genomes are dynamic and can lead to a significant increase in genome size. In species of the genus Zea L., the accumulation of repetitive DNA families, mainly LTR retrotransposons like Ty3/Gypsy, resulted in a twofold larger genome in Zea luxurians (Durieu) R. M. Bird in relation to Z. mays L. and Z. diploperennis Iltis, Doebley & R. Guzmán in less than two million years (Estep et al. 2013).

Satellite DNAs are composed of monomers that are oriented head-to-tail and can vary in length, nucleotide composition, sequence complexity and abundance. These sequences frequently form clusters that can rapidly change in number, position and size (Garrido-Ramos 2015; Biscotti et al. 2015). A mutation that occurs within a monomer can spread among the repeat units or be eliminated by homogenization (Plohl et al. 2012). The mechanism of concerted evolution, for example, can generate varied patterns of repetitive DNA families, producing in general homogeneity within species and diversity between species. Thus, different species can have different families of satellite DNAs or these satellites can be shared between related species (Feliner and Rosselló 2012; Plohl et al. 2012). Furthermore, these tandem repeats can be species specific or even chromosome specific. Some tandem repeats are highly conserved among species, such as the 5S and 35S ribosomal DNAs (rDNAs), which encode the ribosomal RNAs. However, most repetitive families are non-coding sequences that evolve rapidly and generate genomic differentiation (Biscotti et al. 2015).

The parasitic genus Cuscuta L. (Convolvulaceae Juss.) includes some 200 species, divided into four subgenera: Grammica (Lour.) Peter, Engl. & Prantl, Pachystigma (Engelm.) Baker & C.H. Wright, Cuscuta and Monogynella (Des Moul.) Peter, Engl. & Prantl (García et al. 2014; Costea et al. 2015a). Subgenus Grammica, with about 150 species, has almost exclusive distribution in the Americas. Pachystigma includes only five species, all endemic to South Africa, and Cuscuta is native to Europe, Africa and Asia, with a few species introduced and naturalised in the Americas, Australia and New Zealand. Subgenus Monogynella had its origin in Central Asia from where it dispersed to S and E Asia, Europe, Africa, and one species, C. exaltata Engelm., is native to south-eastern North America (García et al. 2014; Costea et al. 2015b).

The genus Cuscuta shows high cytogenetic variation in chromosome number (2n = 8 to 2n = 150), chromosome size (1.66 to 21.60 μm) and genome size (1C = 0.270 Gbp to 1C = 34.73 Gbp). The genus also presents symmetric to bimodal karyotypes, as well as monocentric and holocentric chromosomes (García and Castroviejo 2003; Guerra and García 2004; McNeal et al. 2007; Sun et al. 2018; Ibiapino et al. 2019, 2020; García et al. 2019; Oliveira et al. 2020; Neumann et al. 2021). Each Cuscuta subgenus seems to have different karyotypic features. Species of subgenus Monogynella have the largest genome sizes and the largest chromosomes. Subgenus Cuscuta is the only one that has species with exclusively holocentric chromosomes. Subgenus Grammica presents the largest variation in chromosome number and size. This subgenus has at least five cases of interspecific hybridization, which can contribute to this chromosome number variation (Fogelberg 1938; Pazy and Plitmann 1994; García 2001; García and Castroviejo 2003; McNeal et al. 2007; Ibiapino et al. 2019; García et al. 2019). A preliminary study of two species of Pachystigma revealed bimodal karyotypes and extensive heterochromatic blocks in the larger chromosomes, suggesting the influence of repetitive DNA in the emergence of bimodality in this subgenus (García et al. 2019). An asymmetrical karyotype was also reported for some populations of the holocentric C. epithymum (L.) L. (subgenus Cuscuta), with 2n = 14 individuals showing a bimodal karyotype and 2n = 16 individuals having symmetric karyotypes (García and Castroviejo 2003).

Species of the genus Cuscuta also vary in heterochromatin content, ranging from species with few bands and few rDNA sites, such as C. denticulata Engelm. (Ibiapino et al. 2019), up to species with numerous bands, where heterochromatin may have contributed to the expansion of the genome size, as in C. monogyna Vahl. and C. indecora Choisy (Ibiapino et al. 2020; Oliveira et al. 2020). In the latter two species, heterochromatin may have contributed to maintaining karyotype symmetry, since both have similar karyotypes, but belong to different subgenera (Ibiapino et al. 2020). Repeat DNA composition was investigated in 12 Cuscuta species, demonstrating that the extensive variation in genome size in species of this genus is caused by the differential accumulation of repetitive sequences (Neumann et al. 2021). However, no representatives of subgenus Pachystigma were included in that study.

Our current work investigates heterochromatin distribution in three of the five species of the subgenus Pachystigma (C. nitida E. Mey. ex Choisy, C. africana Thunb. and C. angulata Engelm.) and evaluates the repetitive DNA composition of C. nitida genome, in order to better understand the role played by repetitive DNA sequences in the emergence of bimodal karyotypes within this subgenus.

Materials and methods

Material

Flower buds of two accessions of C. africana, one of C. angulata and three of C. nitida (subgenus Pachystigma) were collected in November 2017 from the Cape region of South Africa, where they are endemic (Table 1). Vouchers were deposited at the herbaria of the University of Toronto Mississauga (TRTE) and Wilfrid Laurier University (WLU), Canada.

Table 1 Collection sites for three Cuscuta species of the subgenus Pachystigma

Slide preparation and CMA/DAPI double staining

Slides were prepared using flower buds collected and fixed in the field in Carnoy (ethanol:acetic acid, 3:1, v/v). The material was washed in distilled water, digested in an enzymatic solution containing 2% cellulase (Onozuka) and 20% pectinase (Sigma) for 40 min. The slides were prepared by air drying, mainly using the ovary wall, as described by De Carvalho and Saraiva (1993), with small modifications. After the material was macerated and dried, the slides were dipped in 60% acetic acid for up to 5 min to clear the cytoplasm. Finally, the slides were left at 37 °C until completely dry.

For double CMA/DAPI staining, the slides were aged at room temperature for three days, stained with 8 μL of 0.1 mg/μL chromomycin A3 (CMA) for 60 min, mounted in 8 μL of 1 μg/mL 4′,6-diamidino-2-phenylindole (DAPI) in mounting medium (glycerol:McIlvaine buffer pH 7.0, 1:1, v/v) and aged again for three days at room temperature. The images were captured with a COHU CCD camera attached to a Leica DMLB fluorescence microscope equipped with Leica QFISH software. After image capture, slides were destained for 30 min in Carnoy, for 1 h in absolute ethanol and stored at −20 °C for in situ hybridization.

For chromosomal measurements, five metaphases of C. nitida were used. Chromosomes were measured with the ruler tool in Adobe Photoshop CS3 version 10.0.

DNA extraction and in silico repetitive DNA analysis

Cuscuta nitida genomic DNA was extracted following the protocol of Doyle and Doyle (1987). Sequencing of the total genomic DNA generated low-coverage (ca. 0.1×), 250-bp paired-end reads on an Illumina HiSeq 2500 (BGI, Hong Kong, China). Repetitive DNA analysis was performed with the RepeatExplorer pipeline (https://galaxy-elixir.cerit-sc.cz/; Novák et al. 2013), in which reads showing at least 95% similarity over at least 55% of their length were clustered together.
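The clustering criterion described above can be pictured as a filter applied to pairwise read alignments. The following is a minimal sketch only, not the RepeatExplorer implementation: the function name reads_are_linked and its arguments are hypothetical, and it assumes that the 55% length criterion is evaluated against the longer of the two reads, which the text does not specify.

```python
def reads_are_linked(identity_pct, aligned_len, len_a, len_b,
                     min_identity=95.0, min_fraction=0.55):
    """Return True if two reads meet both thresholds stated in the text.

    Assumption: the aligned length is compared with the longer read.
    """
    longer = max(len_a, len_b)
    return identity_pct >= min_identity and aligned_len >= min_fraction * longer


# Two 250-bp reads: 140 aligned bases is 56% of the read length.
print(reads_are_linked(96.0, 140, 250, 250))  # True
# 130 aligned bases is only 52% of the read length.
print(reads_are_linked(96.0, 130, 250, 250))  # False
# A long overlap with identity below 95% is also rejected.
print(reads_are_linked(93.0, 200, 250, 250))  # False
```

Reads that pass this filter would be connected in the similarity graph from which clusters are derived.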

Clusters showing an abundance greater than 0.01% were automatically annotated and manually checked. Clusters similar to plastomes or mitogenomes were considered putative contamination and excluded from the final annotation. All contigs with tandem repetitions identified by Tandem Repeat Analyzer (TAREAN) (Novák et al. 2017), as well as other satellites not identified by this tool, but which presented typical satellite graph layouts after clustering, were confirmed with dot-plot (DOTTER software) (Sonnhammer and Durbin 1995) (Fig. 1). High abundance dispersed elements had their integrase domain identified using the NCBI Conserved Domain Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). The consensus sequences of the satellites and the integrase domains of the transposable elements were used for primer design using the primer design tool implemented in Geneious version 7.1.9 (Kearse et al. 2012).

Fig. 1

Sequence comparisons of hybridised satellites in C. nitida. a, b Superfamily 1; c, d superfamily 2; e, f superfamily 3; g satellite CnSat10-1400; h CnSat12-1060. a, c, e, g, h, Dot-plots and cluster graphics; b, d, f The alignments and similarity between the satellite subunits that constitute superfamilies 1, 2 and 3, respectively. b, g Similarity of the subunits with the 35S rDNA previously assembled from C. campestris for superfamily 1 and CnSat10-1400, respectively

The consensus sequences of all identified satellites were compared to verify their homology. The consensus monomers that showed similarity in dot-plot were aligned using Muscle in Geneious. Different satellite families were considered as part of the same superfamily when monomer sequences showed identity between 50 and 80%. Sequences with 80–95% similarity were considered subfamilies of the same family, and sequences with similarity greater than 95% were considered variants of the same family (Ruiz-Ruano et al. 2016). As two of these satellites showed similarity with transposable elements, the consensus satellite sequences were aligned with the most similar transposable element domains indicated by RepeatExplorer. Two of the satellites that showed similarity with transposable elements also showed in situ colocalization with the 35S rDNA cluster. Therefore, a comparison of the satellite consensus sequence with a putative C. campestris Yunk. (GenBank accession number PRJEB19879) 35S rDNA consensus sequence, assembled using the NOVOPlasty algorithm (Dierckxsens et al. 2017), was included. This assembly was made using Illumina reads obtained from Vogel et al. (2018). After assembly, the complete 35S rDNA was aligned with the satellite consensus sequence using Muscle in Geneious (Fig. 1). Satellites were named as follows: code referring to the species name (Cn), followed by “Sat”, a number referring to the abundance order and the size of the consensus monomer in base pairs.
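The identity thresholds and the naming rule described above can be expressed compactly. The sketch below is illustrative only: the function names are hypothetical, the assignment of the boundary values (exactly 80% and exactly 95%) to the subfamily class is an assumption, and the label returned for identities below 50% is a placeholder, since the text gives no category for that case.

```python
def classify_relationship(identity_pct):
    """Map pairwise identity between consensus monomers to a category."""
    if identity_pct > 95:
        return "variants of the same family"
    if identity_pct >= 80:  # assumption: 80% and 95% fall in this class
        return "subfamilies of the same family"
    if identity_pct >= 50:
        return "families of the same superfamily"
    return "no grouping"  # placeholder label, not defined in the text


def satellite_name(species_code, abundance_rank, monomer_bp):
    """Build a name such as CnSat1-213 from code, rank and monomer size."""
    return f"{species_code}Sat{abundance_rank}-{monomer_bp}"


# 68.6% is the identity reported in the Results for the two SF1 satellites.
print(classify_relationship(68.6))    # families of the same superfamily
print(satellite_name("Cn", 1, 213))   # CnSat1-213
```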

Repeat amplification, probe preparation and in situ hybridization (FISH)

Polymerase chain reaction (PCR) for repeat amplification was performed in 50 μL reactions containing 20 ng of C. nitida genomic DNA, 1× PCR buffer (20 mM Tris-HCl pH 8.4, 50 mM KCl), 2 mM MgCl2, 0.1 mM dNTPs, 0.4 μM of each primer, 0.4× TBT (750 mM trehalose, 1 mg/mL BSA, 1% Tween 20, 8.5 mM Tris hydrochloride) and 0.6 μL of a homemade Taq Polymerase. Amplification programme was 1× 94°C for 3 min, plus 30 cycles of 94 °C for 1 min, 55–65 °C for 1 min (see Supplementary Table 1 for annealing temperatures of each primer pair) and 72 °C for 1 min, followed by a final extension of 7 min at 72 °C. Primers for amplification of satellite superfamilies SF1 and SF2 annealed in a conserved region shared between the satellite variants from each superfamily. All satellite DNA primers were designed facing outwards, so amplification was additional evidence of a tandem organization. The transposable element primers were designed using the sequences of the integrase domain of each element (the sequences of the integrase domains can be found in Supplementary Table 2).
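The reasoning behind the outward-facing primers can be illustrated with a toy in silico PCR. This is a sketch under simplifying assumptions (exact primer matches, a linear template, only the first forward site considered); the 40-bp monomer is an invented sequence, not one of the C. nitida satellites, and the function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def amplicon_length(template, fwd, rev):
    """Length of the product primed by fwd and rev, or None if there is none.

    The reverse primer must bind downstream of the forward primer.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    site = revcomp(rev)  # reverse-primer binding site on the top strand
    end = template.find(site, start + len(fwd))
    if end == -1:
        return None
    return end + len(site) - start


monomer = "GATTACAGGCTTCAAGTCCGATGGTACCTTGAACGCTAGA"  # invented 40-bp repeat
fwd = monomer[28:36]          # near the monomer end, pointing towards the junction
rev = revcomp(monomer[4:12])  # near the monomer start, pointing away from fwd

print(amplicon_length(monomer, fwd, rev))      # None
print(amplicon_length(monomer * 2, fwd, rev))  # 24
```

A single copy yields no product because the primers point away from each other, whereas two adjacent copies yield a product spanning the junction between monomers.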

The PCR products were sequenced to confirm their identity and labelled in a total reaction volume of 12.5 μL, containing 1 μg of amplified DNA, 1× Nick Translation buffer (0.5 M Tris HCl pH 7.5; 50 mM MgCl2), dNTP mix (0.016 mM each of dATP, dCTP, dGTP), 0.08 mM Cy3-dUTP or Alexa-dUTP, 7.5 U of DNA Polymerase I and 0.006 U of DNase I. The mixture was incubated at 15 °C for 1 h or longer if needed, until most fragments were under 500 bp, and reactions were stopped using 0.5 M EDTA.

For rDNA probes, the plasmids D2 of Lotus japonicus (Regel) K. Larsen (5S rDNA) and pTa71 of wheat (25-28S, 5.8S and 18S rDNA) were used (Pedrosa et al. 2002; Sousa et al. 2011). Probes were labelled by nick translation with Cy3-dUTP (5S), as described above, and digoxigenin 11-dUTP (35S) with a Nick Translation kit (Invitrogen, Oregon, USA).

Fluorescence in situ hybridizations followed Pedrosa et al. (2002). The hybridization mixture, composed of 50% formamide, 10% dextran sulphate, 2× SSC and 5 ng/μL probe, was denatured at 75 °C for 10 min. Slides were denatured for 5 min with the hybridization mixture and hybridised for 18–20 h at 37 °C in a humid chamber. Final stringency was 76% for 5S and 35S rDNA and satellite DNAs. The 35S rDNA probe was detected with anti-digoxigenin produced in sheep, conjugated with FITC (Roche, Basel, Switzerland) and the signal amplified with anti-sheep IgG produced in rabbit conjugated with FITC (Serotec, California, USA). The slides were mounted as described above. With 76% stringency, transposable element probes did not show strong, clear signals. For this reason, these elements were hybridised with lower stringency (40%), as described in Ribeiro et al. (2017).

To verify the putative localization of telomeric sequences at interstitial chromosome sites, the ND-FISH protocol described by Cuadrado et al. (2009) was applied. Thirty microlitres of the hybridization solution containing 2 pmol (25 ng) of the diluted probe (TTTAGGGTTTAGGGTTTAGGGTTTAGGGT5 directly labelled with Cy3, Macrogen, Seoul, Korea) in 2× SSC was added per slide and covered with a coverslip. The slide was incubated 2 h at room temperature protected from light. The coverslip was removed with 2× SSC, washed in 4× SSC/0.2% Tween 20 at room temperature for 10 min under agitation and mounted in DAPI with mounting medium as described above. The satellites and rDNA images were captured as previously described. For transposable elements, images were captured using an epifluorescence Leica DMLB microscope equipped with a COHU 4912-5010 CCD Camera using the Leica QFISH software.

Results

Bimodal karyotypes are a typical feature of the subgenus Pachystigma, with large chromosomes enriched in heterochromatin

All three samples of C. nitida analysed showed 2n = 28 and bimodal karyotypes (4L+24S), with two larger chromosome pairs (average sizes of 12.34 μm and 8.19 μm) and 12 smaller chromosome pairs (average size of 2.67 μm) (Fig. 2a–c). The second largest pair harbours a proximal nucleolus organizing region (NOR), as evidenced by the large decondensed region between both chromosome arms (Fig. 3a–c). The total haploid complement size was 107.6 μm.

For heterochromatin characterization, CMA/DAPI double staining was performed in the three species. Cuscuta nitida showed only two pairs of heterochromatic bands, a large CMA+/DAPI− band in the short arm of the largest chromosome pair (Fig. 2a–c) and a second CMA+/DAPI− band colocalised with the NOR in the second chromosome pair (Fig. 2b–c). In situ hybridization with 5S and 35S rDNA revealed one pair of large 5S sites, colocalised with the CMA+ band at the short arm of the largest pair (Fig. 3a–d). One major and one minor pair of 35S rDNA sites were observed, both in the large chromosome pairs. The major 35S site was highly decondensed and colocalised with a CMA+ band in the second largest chromosome pair, while the minor site, not always visible, was observed proximally in the largest pair (Fig. 3a–d).

Fig. 2

Metaphases of C. nitida (a, b, c), C. africana (d, e, f) and C. angulata (g, h, i) stained with CMA (yellow) and DAPI (blue). Overlapping in c, f and i. Arrowheads in c, f and i highlight heterochromatic bands in each karyotype

Fig. 3

Metaphases of C. nitida showing colocalization of CMA+ bands (a, c, e) with 5S rDNA in red (b) and 35S rDNA in green (b, d) and SF1 superfamily in green (f). Insets in b and d show smaller 35S sites. In f, the telomeric probe detected only terminal loci. In h, the 5S rDNA (in red) is flanked by the SF2 superfamily sites (in green). In g, CnSat10-1400 satellite signals in red; inset shows a detail of the satellite signal on one of the chromosomes of the largest pair. Chromosomes were counterstained with DAPI (blue, a, c and e, or grey)

Cuscuta africana and C. angulata also showed bimodal karyotypes (Fig. 2d–i). Cuscuta africana exhibited karyotype similarities with C. nitida (2n = 28, 4L+24S), but the heterochromatic band present in the largest pair was DAPI+/CMA− (Fig. 2d–f). On the other hand, C. angulata (2n = 30, 10L+20S) presented ten large chromosomes in its karyotype. These chromosomes have numerous heterochromatic bands, mainly DAPI+/CMA−. In addition, the set of smaller chromosomes of this species has pericentromeric bands, mainly DAPI+/CMA− (Fig. 2g–i). Thus, the heterochromatin characterization evidenced the presence of bimodal karyotypes in the three analysed species of Pachystigma, with a high number of heterochromatic bands in the large chromosome pairs, with different compositions, GC or AT rich, depending on the species.

The repetitive fraction of C. nitida genome is rich in tandem repeats

To understand the composition of the repetitive DNA fraction and its relation to the heterochromatin content, we performed genome skimming in C. nitida and characterised the most abundant DNA repeats. A total of 5,156,846 reads were generated, of which 1,173,600 reads were randomly sampled by RepeatExplorer for analysis. A total of 53,117 clusters were identified and 330 clusters, containing at least 0.01% of genome abundance, were grouped into 322 superclusters (Fig. S1) and annotated. Five and 33 clusters that showed similarity to mitochondrial and plastid sequences, respectively, were excluded from further analysis (Table 2).
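The read counts above imply a sampling fraction and a minimum cluster size that can be worked out directly. The calculation below is a rough sketch that assumes cluster abundance is expressed as a share of the analysed reads; the variable names are illustrative.

```python
import math

reads_generated = 5_156_846
reads_analysed = 1_173_600
abundance_threshold_pct = 0.01  # clusters at or above this share were annotated

# Share of all generated reads that entered the clustering step.
sampled_fraction = reads_analysed / reads_generated

# Smallest whole number of reads that reaches the 0.01% threshold.
min_reads_per_cluster = math.ceil(reads_analysed * abundance_threshold_pct / 100)

print(f"sampled fraction: {sampled_fraction:.1%}")
print(f"minimum reads per annotated cluster: {min_reads_per_cluster}")  # 118
```

Under this assumption, roughly a fifth of the generated reads were analysed, and a cluster needed about 118 reads to be annotated.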

Table 2 General annotation of the repeat type and genome proportion (%) of Cuscuta nitida

The repetitive fraction corresponded to 42.83% of the C. nitida genome. It was possible to annotate 198 of the 330 clusters; the rest (6.56% of the total genome) remained unclassified (Table 2). The dispersed repetitive DNA sequences corresponded to 22.01% of the total genome. LTR-retrotransposons from the Ty1/Copia superfamily comprised 5.6% of the genome, while Ty3/Gypsy elements were 1.6 times more abundant (8.86%). Within Ty1/Copia, the SIRE lineage was the most abundant with 3.5%, while the Tekay lineage was the most represented among Ty3/Gypsy with 4.19%. LTR elements without a clear lineage classification corresponded to 5.13% of the total genome. LINEs corresponded to 0.96%, while class 2 transposable elements corresponded to 1.46%, with CACTA being the most abundant (1.05%).
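The dispersed-repeat proportions quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the five categories listed in the text are non-overlapping and together make up the whole dispersed fraction; the dictionary keys are informal labels rather than the exact row names of Table 2.

```python
# Genome proportions (%) of dispersed repeats, as quoted in the text.
dispersed = {
    "Ty1/Copia": 5.60,
    "Ty3/Gypsy": 8.86,
    "LTR, unclassified lineage": 5.13,
    "LINE": 0.96,
    "Class 2": 1.46,
}

dispersed_total = round(sum(dispersed.values()), 2)
gypsy_to_copia = dispersed["Ty3/Gypsy"] / dispersed["Ty1/Copia"]

print(dispersed_total)           # 22.01
print(round(gypsy_to_copia, 1))  # 1.6
```

Both values agree with the 22.01% total and the 1.6-fold Ty3/Gypsy to Ty1/Copia ratio stated in the paragraph.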

Among tandem repeats, the 5S rDNA showed a large abundance, comprising 4.75% of the C. nitida genome, even larger than the 35S rDNA (3.1%). Other tandem repeats (satDNA) corresponded to 6.4% of the genome (Table 2). TAREAN identified six clusters with high-confidence satellites and six with low-confidence satellites. Another six clusters, not identified by TAREAN, showed typical circular graphs and were confirmed as tandem repeats by a self-similarity dot-plot analysis. The comparative dot-plot with the consensus sequences of the 18 identified satDNAs revealed some sequences with similarity to each other, which were grouped into three superfamilies (Table 3; Fig. 1; Fig. S2 and Fig. S3).

Table 3 Satellite DNA families and superfamilies identified in C. nitida genome, showing genome proportion (%), percentage of Guanine and Cytosine (%GC), TAREAN output and similarity to other repeats. First number in the name of the satellite represents its order of abundance, while the second number, the consensus size of its monomer sequence

Superfamily 1 (SF1), with 2.44% abundance, is composed of two satellites: CnSat1-213, classified by TAREAN, and CnSat2-295, not identified by TAREAN, showing 68.6% similarity between consensus sequences. The superfamily 2 (SF2) is composed of three satellites classified with high-confidence by TAREAN, CnSat3-111, CnSat4-115 and CnSat8-125, showing 72.3% similarity among consensus sequences and representing together 1.08% of the genome (Fig. 1). The third superfamily is composed of two satellites, one classified with low confidence and one not identified by TAREAN, CnSat11-1357 and CnSat15-990, together corresponding to 0.07% of the genome (Table 3). The consensus sequences are provided in Supplementary Table 2. In addition to these satellites, some microsatellites were identified and described in Supplementary Table 3. Two satellites classified with high confidence by TAREAN showed similarity to retrotransposon domains. The alignment of the consensus sequence of these satellites with these domains showed 32.2% identity between CnSat10-1400 and the Reverse Transcriptase (RT) domain of a LINE element, while CnSat12-1060 showed 39.5% identity to the Ribonuclease (RH) domain of Ty1/Copia Reina lineage.

Mapping repetitive sequences

Different repeats were selected for investigating their chromosomal distribution and putative association with heterochromatin and the largest chromosome pairs. Apart from the 5S and 35S rDNA, four other satDNAs were selected: superfamily 1 (SF1), the most abundant among satDNAs, superfamily 2 (SF2) and two tandem repeats that showed similarity with mobile elements. Superfamily SF1 signals colocalised with 35S rDNA in both large chromosome pairs. The SF1 signals, however, were stronger and more extended than the rDNA signal, occupying the proximal region on the long arm of the largest pair (Fig. 3e–f). Superfamily SF2 is also located in the largest chromosome pair, presenting two signals on each homologue. These signals flanked the 5S rDNA sites (Fig. 3g). The 5S rDNA and SF2 sites occupy most of the short arm of the largest chromosome pair. The CnSat10-1400 satellite showed a small signal in the pericentromeric region of the short arm of the largest pair and a larger signal in the distended region of the second pair, similar to the 35S rDNA (Fig. 3h). Despite the colocalization of satellites SF1 and CnSat10-1400 with the 35S rDNA, these satellites did not show any significant in silico similarity with the 35S rDNA assembled from C. campestris (2.4% to CnSat1-213, 3.6% to CnSat2-295 and 29.9% to CnSat10-1400, identity to the aligned sequence, Fig. 1). CnSat12-1060, on the other hand, showed no evident chromosome hybridization (data not shown). The ND-FISH with the telomeric probe showed terminal signals in all chromosomes of the complement, but no interstitial signals that could suggest previous chromosome fusions (Fig. 3f).

The most abundant LTR retrotransposon lineages, SIRE (Ty1/Copia), Tekay and Retand (Ty3/Gypsy), as well as the putative centromeric CRM lineage (Ty3/Gypsy), were also selected for hybridization in C. nitida chromosomes, showing slightly different patterns. The SIRE element showed signals along the entire length of the largest chromosome pair, with a gap in the 5S rDNA site. In addition, it also labelled the distal regions of the second largest pair, not including the proximal NOR (Fig. 4a). The Retand element showed a similar pattern, but it also displayed dispersed signals along the distended region of the NOR and along the 5S rDNA cluster (Fig. 4b). CRM labelled the largest chromosome pairs, but also showed a weak labelling of the small chromosomes, slightly enriched in the pericentromeric region at least in some of them (Fig. 4c). The Tekay element showed scattered proximal signals on the largest chromosome pairs and no signal in the NOR (Fig. 4d).

Fig. 4

Metaphases of C. nitida showing the distribution of LTR-retrotransposons with an enrichment in the largest chromosome pairs. In a, the element Ty1/Copia SIRE (violet); in b, Ty3/Gypsy Retand (green); in c, Ty3/Gypsy CRM (yellow); and in d, Ty3/Gypsy Tekay (red). Chromosomes were counterstained with DAPI (grey)

Combined, these data demonstrate the enrichment of the large chromosome pairs of the C. nitida karyotype with tandem (rDNAs and satDNAs) and dispersed (LTR retrotransposons) repetitive sequences (Fig. 5). Several of these repeats colocalise in the largest pairs, evidencing a complex organization of these large chromosomes and indicating repeat accumulation as the cause of the bimodal karyotype in this species.

Discussion

All three analysed species (C. nitida, C. africana and C. angulata) presented bimodal karyotypes. Although the remaining two species, C. gerrardii Baker and C. natalensis Baker, should be analysed in the future for confirmation, the presence of a bimodal karyotype is likely a synapomorphy of Cuscuta subgenus Pachystigma. The phylogenetic relationships within the subgenus resolved C. nitida as sister to a clade with C. natalensis and C. gerrardii (García et al. 2014), with this clade sister to C. africana + C. angulata. This suggests that this karyotypic feature was maintained in the whole clade, thus supporting the hypothesis that all species of this subgenus share bimodal karyotypes. In fact, the four subgenera of Cuscuta are not only delimited by phylogenetic, biogeographic and morphological data, but each also presents unique cytogenetic peculiarities, such as the presence of holocentric chromosomes in the subgenus Cuscuta (García et al. 2014; Costea et al. 2015; García et al. 2019).

Most cytogenetic studies in the genus Cuscuta are restricted to conventional staining techniques with an emphasis on chromosome counting. Fewer studies conducted more detailed cytogenetic analyses, such as CMA/DAPI banding for characterization of heterochromatin. The latter studies revealed a numerical variation of these bands, from species with few bands, like C. denticulata with only one pair of evident CMA+/DAPI− bands, to species like C. monogyna, with approximately 90 CMA+/DAPI− bands and 80 DAPI+/CMA− bands in pachytene (Ibiapino et al. 2019, 2020). Despite this variation, our previous unpublished data showed that karyotypes with fewer bands are more frequent, even in polyploid species. In most of those cases, CMA+ bands are found in pericentromeric regions and are colocalised with 5S and 35S rDNA. In the species with a higher number of CMA+/DAPI− or DAPI+/CMA− bands, they are localised in interstitial regions. There is no evidence that this banding pattern is different for each subgenus and there may be similar patterns between different subgenera (Ibiapino et al. 2020). In the three species of the subgenus Pachystigma analysed here, the multiple CMA/DAPI bands were mainly present at the largest pairs. These bands do not differ much in number and position from those already reported in the genus; however, in these bimodal karyotypes, the bands are larger. In C. nitida, for example, the largest CMA+/DAPI− band occupies a large part of the short arm of the largest chromosome pair. Cuscuta africana showed a similar pattern; however, the largest heterochromatic band was DAPI+/CMA−. The large number of heterochromatic bands on the smallest chromosomes of C. angulata is not observed in the other two species of the subgenus Pachystigma. This characteristic may indicate an incipient accumulation of heterochromatin in these chromosomes, which could eventually lead to a less asymmetrical karyotype, such as the amplification observed in other unrelated Cuscuta species, like C. indecora (subgenus Grammica) and C. monogyna (subgenus Monogynella) (Ibiapino et al. 2019), indicating that this character is homoplastic in the genus. Alternatively, karyotype asymmetry in Pachystigma may be maintained by an unknown mechanism.

Bimodality in Pachystigma is not due to chromosome fusion. Although C. nitida has 2n = 28, lower than the basic number proposed for the genus Cuscuta, which is x = 15 (Pazy and Plitmann 1995), the sizes of the two largest pairs cannot be explained by a single fusion of two pairs of small chromosomes. In addition, the ND-FISH with telomeric probe did not provide evidence for any interstitial sites in C. nitida, which may indicate that there was no fusion event in the origin of this karyotype. Furthermore, Cuscuta angulata presented 2n = 30, showing no reduction in chromosome number and a bimodal karyotype. Many Cuscuta species have 2n = 30, but there are species with 2n = 8, 10, 14, 16, 18, 20, 28, 30, 32 and 34 and polyploids with 2n = 28, 42, 44, 56, 60, 90 and 150. Cuscuta epithymum (subgenus Cuscuta), for instance, shows an intraspecific variation which could be attributed to chromosome fusions and polyploidy, with 2n = 14, 16, 28, 30, 32 and 34. Individuals with 2n = 14 and 2n = 32 are bimodal, while 2n = 16 and 2n = 34 are symmetric (García and Castroviejo 2003; García et al. 2019). Three species of subgenus Cuscuta (holocentric), C. epithymum (2n = 14), C. europaea L. (2n = 14) and C. epilinum Weihe (2n = 6x = 42) had a reduction in the chromosome number. In the case of this subgenus, there may have been chromosomal fusion events, since holocentric chromosomes have diffuse kinetochores, and consequently these chromosomes can stabilise fragments or fused chromosomes favouring rearrangements (Mandrioli and Manicardi 2020).

Bimodal karyotypes may also originate through interspecific hybridization, as proposed for the genus Agave, in which allopolyploid species might have chromosomes of different sizes inherited from different parents (McKain et al. 2012). In Cuscuta, there are numerous cases of interspecific hybridization and polyploidy (reviewed by García et al. 2014). For example, C. veatchii Brandegee is an allopolyploid that originated from the hybridization of C. denticulata and C. nevadensis I.M. Johnst. With 2n = 60, C. veatchii possesses 30 smaller chromosomes and 30 slightly larger chromosomes with very evident centromeres, characteristic of C. denticulata and C. nevadensis, respectively (Ibiapino et al. 2019). However, molecular phylogenetic analyses have shown that reticulate evolution occurs mainly in the subgenus Grammica (e.g. Stefanović and Costea 2008; Costea and Stefanović 2010; García et al. 2014; Costea et al. 2015a). There is also preliminary phylogenetic evidence suggesting that some species of subgenus Cuscuta may have a hybrid origin. Different accessions of C. approximata Bab., for example, have polymorphism in the ITS, and the location of C. kurdica Engelm. differed between the ITS and trnL trees (García and Martín 2007). These contrasting topologies may indicate hybridization events similar to those reported in subgenus Grammica (e.g. Stefanović and Costea 2008; García et al. 2014). Evidence such as this has not been observed in C. africana, C. angulata and C. nitida with ITS, 26S, trnL or rbcL sequence analyses (García and Martín 2007). Chromosome number and size, as well as the number of rDNA sites in C. nitida, were within the range of variation already reported for species of the genus Cuscuta. So far, most Cuscuta species have shown few rDNA sites, varying from two to 36 sites of 5S rDNA and from two to 30 sites of 35S rDNA (Fogelberg 1938; García 2001; García and Castroviejo 2003; Guerra and García 2004; McNeal et al. 2007; Ibiapino et al. 2019, 2020; García et al. 2019). Thus, our results suggest neither hybridization nor polyploidy as the cause of bimodality in Pachystigma.

Cuscuta africana presented a chromosome number that diverged from the previous report (2n = 30, García et al. 2019). Intraspecific variation is unlikely because the samples were collected from the same population. It is more likely that the 2n = 30 reported earlier was a miscount, since conventional staining may leave the proximal, distended NOR unnoticed. NORs are more easily identified as CMA+/DAPI bands (see, for example, Fig. 2d and 2f). Similar miscounts have been registered for Passiflora foetida L., which was first described as having 2n = 22 (Snow and MacDougal 1993) and later corrected to 2n = 20 (De Melo and Guerra 2003).

Because chromosome fusions and interspecific hybridization seem less probable, repetitive sequence accumulation in specific chromosome pairs is a more likely mechanism for karyotype asymmetry in subgenus Pachystigma. Indeed, repetitive DNA in Cuscuta is involved in the expansion of the genome, causing an increase in chromosome size, such as in C. monogyna and C. indecora (Ibiapino et al. 2020; Neumann et al. 2021). In these two cases, however, chromosomes increased proportionally in size, maintaining karyotype symmetry and resulting in similar karyotypes, although C. monogyna and C. indecora belong to different subgenera (Monogynella and Grammica, respectively). All 12 Cuscuta species sequenced by Neumann et al. (2021) showed a greater abundance of LTR-type elements, with SIRE being the most dominant among Ty1/Copia lineages and Tekay the most dominant among Ty3/Gypsy, ranging from 8.5 to 30.8% for Ty1/Copia and 7.3 to 28.5% for Ty3/Gypsy. Many small-genome species such as C. pentagona Engelm. showed a higher proportion of class II elements, representing 12.6% of the genome. A large fraction of these genomes was also composed of satellite DNA, reaching up to 18% in C. europaea. In this species, the satDNA CUS-TR24 is the major constituent of the heterochromatic bands (Vondrak et al. 2021). Similar results were observed for C. nitida, with 3.5% SIRE and 4.19% Tekay. Satellites also made up a significant percentage of the C. nitida genome, 6.4%. However, class II elements showed a low proportion, 1.46%. These data demonstrate that repetitive accumulation is a common mechanism in the evolution of genomes within the genus Cuscuta, increasing the size of chromosomes within a particular karyotype.

Bimodal karyotypes can maintain a differential composition of DNA sequences between large and small chromosomes, thus creating subgenomes different in structure and function (Báez et al. 2019). In chicken, for example, the recombination rate in microchromosomes is much higher than in macrochromosomes. The high density of genes favours meiotic recombination and could lead to an increase of gene density in small chromosomes (Smith et al. 2000; Rodionov et al. 2002). On the other hand, large chromosomes may be richer in heterochromatin. Within these regions, repetitive sequences are more prone to accumulate because new insertions are not selected against and are less likely to be removed by recombination (Biscotti et al. 2015). The evident accumulation of repetitive DNA sequences in the largest chromosomal pairs of C. nitida supports the influence of heterochromatin in the karyotype asymmetry of Cuscuta. In the bimodal karyotypes of subgenus Pachystigma, the most evident heterochromatic bands are restricted to the largest pairs. The 5S rDNA corresponded to 4.75% of the genome of C. nitida and was colocalised with the largest CMA+ band of that species. In addition, all hybridised satellite DNAs, as well as the 35S rDNA sites and most of the transposable elements, are restricted to or highly enriched in the largest chromosomal pairs (Fig. 5). In Muscari Mill. (Asparagaceae), a massive amplification of the MCSAT satDNA family occurred in only one chromosome pair. This single satDNA family corresponds to 5% of the total genome of M. comosum (L.) Mill. and contributed to the progressive increase in the karyotype asymmetry of Muscari species (de la Herrán et al. 2001). In Eleutherine, two of the E. bulbosa Urb. satellites, Ebusat1 and Ebusat4, occur in the interstitial region of the largest pair of E. bulbosa and E. latifolia (Standl. & L.O. Williams) Ravenna, both with bimodal karyotypes.
In addition, the four most abundant retrotransposons also showed accumulation in the largest pair. This demonstrates that accumulation of repetitive sequences can generate an increase of only part of the chromosomes of a karyotype and lead to a change in karyotype symmetry (Báez et al. 2019). This suggests that the bimodality of subgenus Pachystigma could also originate from the asymmetric expansion of multiple repetitive DNA lineages (Fig. 5).

Fig. 5

Schematic representation of the distribution of heterochromatic bands and repetitive sequences in the bimodal karyotype of C. nitida. Ribosomal DNAs (5S and 35S) and satellite superfamily SF1 colocalised with CMA+ heterochromatin in the largest chromosome pairs, with SF1 and SF2 also adjacent to it. LTR retrotransposons were enriched in the largest pairs, except for the CRM element, which was also present in the smallest chromosomes.

The SF1 and CnSat10-1400 signals colocalised with the 35S rDNA but showed no similarity with the 35S rDNA of C. campestris, which is a species of the subgenus Grammica. This may suggest that this satellite DNA unit has originated from tandem duplications of a less-conserved, intergenic region of Cuscuta rDNA, such as the IGS, or that it was inserted in C. nitida rDNA locus after the divergence between subgenera Grammica and Pachystigma. In Phaseolus L. (Fabaceae), jumper satDNA was inserted into the NTS region of 5S rDNA (Ribeiro et al. 2017). However, the 35S rDNA and CnSat10-1400 formed independent clusters, suggesting that this satDNA has not become part of the rDNA unit but is rather interspersed along the rDNA site. Nevertheless, no 35S rDNA cluster from C. nitida presented a circular graph, indicating its incompleteness. Therefore, the association of this satellite with the 35S of C. nitida cannot be excluded and awaits further investigations using long reads. Furthermore, CnSat10-1400 showed higher similarity with the reverse transcriptase domain of a LINE element, which may indicate a possible origin of this satellite from a TE, and later interspersion within the 35S rDNA loci, or the insertion of LINEs in this satDNA, as observed in Cuscuta europaea for CUS-TR24 (Vondrak et al. 2021).

The C. nitida satellites CnSat10-1400 and CnSat12-1060 showed similarity with the transposable elements LINE and Reina, respectively. Some transposable elements and repetitive genes can contribute to the formation and dissemination of satellite DNAs. In Lathyrus sativus L., most of the satellites originated from small tandem repetitions present in the 3′ untranslated region of the Ogre retrotransposons (Vondrak et al. 2020). MITE transposable elements were proposed as generators of satellite DNA in bivalve molluscs and Drosophila Fallén (Miller et al. 2000; Pons 2004). Similarly, in ants of the genus Messor Forel, a Mariner element gave rise to the expansion of satellite DNA IRE-130 (Palomeque and Lorite 2008). In fish, copies of the 5S rDNA originated the satellite 5SHindIII, while ancestors of tRNA were probably responsible for the formation of tandemly repeated sequences in higher plants (López-Flores and Garrido-Ramos 2012). In humans, a quarter of all mini/satellites have been found to be derived from transposable elements. TE-derived satellites usually have monomers longer than the standard 500 bp and generally occupy pericentromeric regions (Meštrović et al. 2015). This is the case of CnSat10-1400 and CnSat12-1060, with monomers of 1400 bp and 1060 bp, respectively. In Pisum sativum L., variants of the satellite PisTR-A are incorporated into Ty3/Gypsy Ogre elements. The untranslated region that separates the 3′ gag-pol domains from the LTR is highly variable in the pea Ogre elements and carries several other tandem repeats (Macas et al. 2009). In maize, the CRM1TR and CRM4TR tandem repeats are entirely derived from centromeric retrotransposons (CRM) (Sharma et al. 2013). None of the satellites found in C. nitida showed similarity to satellites previously described by Oliveira et al. (2020) and Neumann et al. (2021), either in species of the closest subgenus, Cuscuta, or in the other subgenera. This indicates that these satellites have independent origins and that the composition of heterochromatic bands differs between the holocentric and monocentric chromosomes of the genus and between different subgenera.

The well-supported clade of subgenus Pachystigma is characterised by the presence of bimodal karyotypes in all species analysed. Although the three species had different CMA/DAPI band patterns, these bands were mostly enriched in the larger chromosomes of the three karyotypes. The genomic organization of the repetitive fraction of C. nitida suggested a differential chromosomal accumulation of diverse repetitive families, mainly satDNA, rDNA and retrotransposons, as the probable mechanism of origin for the bimodal karyotypes within this subgenus. This indicates that the increase in chromosome size, which led to the emergence of bimodality in this clade, was associated with the accumulation of repetitive sequences in heterochromatin. The composition of this heterochromatin may differ among species. In Cuscuta, the amount and diversity of repetitive DNA are high, and satellite DNAs can originate from transposable elements and potentially be incorporated into or interspersed with the rDNA.