Introduction

In most eukaryotes, the nuclear ribosomal RNA genes are present in multiple copies as tandem repeats separated by intergenic spacers (IGS). The rRNA genes are usually arranged as two distinct units, the 5S and the main rDNA unit, transcribed by RNA polymerases III and I, respectively (Orioli et al. 2012; Goodfellow and Zomerdijk 2013). The rDNA unit contains the 18S, 5.8S and 28S rDNA organized in a single operon, where the genes are separated by two internal transcribed spacers (ITS-1 and ITS-2), with the regulatory elements for transcription by RNA polymerase I located in a non-transcribed spacer (NTS) (Goodfellow and Zomerdijk 2013; Henras et al. 2015). The rDNA unit is transcribed into a long 45S pre-rRNA that undergoes a complex maturation process to release the three rRNAs, in which folding interactions involving ITS-1 and ITS-2 play a role in the action of various nucleolytic cleavage factors (Fernández-Pevida et al. 2015; Henras et al. 2015; de la Cruz et al. 2018).

Microsporidia are a notable exception to such an organization because their genes are considerably reduced in size and structure (e.g. 16S-like SSU), and the 5.8S gene is fused to the 5′ end of the LSU, giving an overall prokaryote-like resemblance (Torres-Machorro et al. 2010). This is certainly related to the evolution of microsporidians as extreme intracellular parasites, and it does not reflect their ancestral nature, as originally thought. They are indeed characterized by highly compact genomes and unusual morphological features (Corradi 2015; Han and Weiss 2017). The 5.8S/LSU fusion occurred with the loss of ITS-2. The reduction in size and structure can be traced from the mitochondriate ancestors of Microsporidia (these have DNA-free mitosomes) within the Rozellomycota (Corsaro et al. 2014, 2016), a basal lineage of zoosporic fungi (chytrids). Some of these organisms (i.e. Mitosporidium, Nucleophaga, Paramicrosporidium) indeed have microsporidia-like morphologies while preserving fungal genomic traits (Haag et al. 2014; Quandt et al. 2017). Although their SSU rRNAs are typically eukaryotic in size (~ 1800 nt) and structure, the length of their ITS-2 is reduced by almost half (~ 150 nt) compared to that of other chytrids (~ 300 nt). A greater reduction in size and structure of ITS-2 is present in the most basal members of Microsporidia, the Chytridiopsida and Metchnikovellida, whose SSU, however, retains a eukaryotic structure despite its reduced size (Corsaro et al. 2014, 2019a). Microsporidia and Rozellomycota form a basal branch of the holomycotan lineage, which includes the kingdom Fungi and their closest protist relatives (Aphelidea, Nucleariidae) (Liu et al. 2009; Corsaro et al. 2014).

Surprisingly, the recent release in Genbank of the entire rDNA sequence of Mitosporidium daphniae (Genbank ID MF278562) shows that a much longer ITS-2 (722 nt) is present in this organism. Nevertheless, our preliminary analysis proved to be incompatible with the information reported in the database, suggesting instead the possible presence of a group I intron within the ITS.

Group I introns are mobile genetic elements that insert into genes encoding RNAs and proteins in viral, prokaryotic, nuclear and organellar genomes (Haugen et al. 2005; Hedberg and Johansen 2013). They are self-splicing RNAs, folding into complex structures that generally consist of ten paired regions (P1–P10). On the basis of structure and sequence variations, the introns can be divided into five main classes: IA, IB, IC, ID and IE (Michel and Westhof 1990; Suh et al. 1999). Some introns can further encode different types of endonucleases that increase their spread (Chevalier and Stoddard 2001), although reverse splicing also plays an important role in their mobility (Bhattacharya et al. 2005). Introns have been widely studied, especially those occurring in the rRNA genes (i.e. SSU and LSU rDNA), but they have never been found in the ITS. The only note on this subject comes from communications to annual meetings by the team of Rogers, who reported the possible presence of introns in the nuclear ITS of chondrichthyan fishes. The authors named them spintrons (for spacer introns) and hypothesized that they invaded a common ancestor of the extant sharks (Shivji et al. 1997; Walker et al. 1999). We therefore included some of these shark sequences in our analysis, along with holomycotan rDNA. Our results indicate the possible presence of spacer introns (spintrons) in various organisms.

Materials and methods

Retrieval of ITS sequences

Along with the sequence of Mitosporidium, we retrieved from Genbank a set of sequences containing the complete portions of ITS-1, ITS-2 and 5.8S for other members of Rozellomycota. Since the Rozellomycota form a basal group near the root of the Fungi (Corsaro et al. 2014), we included in the comparison the sequences of additional chytrids (Chytridiomycota) and of two other basal holomycotan groups, the Aphelidea and Nucleariida (Table 1). We also selected ITS-1 and ITS-2 sequences from five shark species (Chondrichthyes, Elasmobranchii, Selachimorpha) that have been shown to form distinct lineages (Cooper 2018). The species are: Hexanchus griseus (JN003428, JN003446) and Notorynchus cepedianus (JN003429, JN003447) (Squalimorphii, Hexanchiformes); Carcharhinus leucas (JN003436, JN003437) (Galeomorphii, Carcharhiniformes); Isurus oxyrinchus (JN003432, JN003441) and Lamna nasus (JN003431, JN003440) (Galeomorphii, Lamniformes).

Table 1 Summary of holomycotan ITS sequences analysed

Sequence analyses

Boundaries between the different rRNA genes were identified by locating in the alignments a set of highly conserved primers targeting known positions near the terminal parts of 18S (euk1520F) and 28S (euk28-1R). Positions were further checked by conformational studies (Corsaro et al. 2019a) and by searches in the ITS-1 (http://www.itsonedb.cloud.ba.infn.it/) and ITS-2 (http://www.its2.bioapps.biozentrum.uni-wuerzburg.de/) databases (Ankenbrand et al. 2015; Santamaria et al. 2018). The selected sequences were screened for the presence of repeats using Tandem Repeats Finder (Benson 1999), and they were analysed by performing separate multiple alignments with MAFFT that included different groups of ITS-rDNA or introns. A Blast search for possible similarities was also carried out for each sequence, both over its entire length and on partial fragments.
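The primer-based boundary search described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in this study, which located the primer sites in alignments; it only shows how a primer carrying IUPAC degenerate bases can be located in a raw sequence, on either strand. The function names, the toy sequence and the toy primer are hypothetical and do not correspond to euk1520F or euk28-1R, whose sequences are not given here.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of every IUPAC code (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string, IUPAC codes included."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primer_sites(seq, primer, reverse=False):
    """Return 0-based (start, end) spans where the primer matches seq exactly.

    Degenerate primer bases match any base they stand for. With
    reverse=True the reverse complement of the primer is searched, as
    needed for a reverse primer annealing to the opposite strand.
    """
    query = reverse_complement(primer) if reverse else primer.upper()
    body = "".join(f"[{IUPAC[base]}]" for base in query)
    # The lookahead lets overlapping sites be reported as well.
    pattern = re.compile(f"(?=({body}))")
    return [(m.start(1), m.end(1)) for m in pattern.finditer(seq.upper())]


# Toy data, invented for illustration only.
toy_seq = "TTAGTCGTAACAAGGTTTCCGTAGGTGAA"
toy_primer = "GTAACAAGGTYTCC"  # Y stands for C or T
print(find_primer_sites(toy_seq, toy_primer))
```

Exact matching is a deliberate simplification; real primer sites may carry mismatches, which is one reason why locating them in an alignment is more robust.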

Intron characterization and molecular phylogeny

The reference group I intron sequence alignments (Michel and Westhof 1990; Suh et al. 1999; Li and Zhang 2005) were used as a guide to identify the conserved core paired helices P3–P7. Secondary core structures were thus inferred manually, and the remaining portions of the molecules were determined using Mfold. Final structures were drawn using RnaViz (De Rijk et al. 2003). The conserved core pairs (P3–P8) of spacer introns were manually aligned with those of representatives of each intron class (IA–IE), including mostly introns located in the SSU or LSU rDNA of bacteria (b), nucleus or organelles (c, chloroplast; m, mitochondria) of different eukaryotes. The molecular phylogenetic tree was then inferred as described previously (Corsaro et al. 2017, 2019a, b), by using maximum likelihood (ML, GTR + Γ model; 1000 replicates), neighbour joining (NJ, Jukes-Cantor model; 2000 replicates) and maximum parsimony (MP, 1000 replicates) with TBR (tree-bisection-reconnection, search level 1 with 10 random additions), and using IE introns as the outgroup.
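As an illustration of the distance correction underlying the NJ analysis, the Jukes-Cantor distance between two aligned sequences is d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. The following Python sketch computes it for a pair of aligned strings. It is a minimal example and not the software used here; the function names are hypothetical, and skipping gap columns is an assumption about gap handling.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# One difference over four sites gives p = 0.25.
print(jukes_cantor_distance("ACGT", "ACGA"))
```

The correction grows faster than p because multiple substitutions at the same site become more likely as sequences diverge, which matters for short and variable regions such as the intron core.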

ITS-2

For M. daphniae, Amoeboaphelidium occidentale and Nuclearia simplex, the identified spacer introns were removed and secondary structures of ITS-2 were inferred manually as previously described (Corsaro et al. 2014, 2019a).

Results

ITS length

Some of the selected holomycotan sequences revealed large differences in the size of the ITS. Mitosporidium daphniae and Polychytrium aggregatum have ITS-1 two and four times longer, respectively, than those of the other members. For ITS-2, sizes exceeding two to five times the lengths of the other members were found in Mitosporidium, A. occidentale and N. simplex (Table 1). These unusually long sequences were analysed further. The ITS-1 of Polychytrium and the ITS-2 of Nuclearia have no repeats, whereas repeats are present in both ITS of Mitosporidium. Its ITS-1 contains two repeats, one long (11 bases: GATGGTGGATC) and one short (2 bases: AC), for total lengths of 38 and 64 bp, respectively. Its ITS-2 has a single short repeat (4 bases: ATCA; total length of 34 bp). The ITS-2 of A. occidentale is peculiar because it contains a long portion of 409 nt repeated twice, the two copies being separated by 22 nt corresponding to the first bases of 28S. The true beginning of the 28S is, however, very likely located after this double repetition, where we can find, as expected, the 28-1R primer site preceded by this same terminal portion. Another abnormal situation occurs in the ITS-2 of Aphelidium desmodesmi, where the 5′-end of 28S, consisting of the first 20 nt followed by the 28-1R primer site, is repeated twice. We treat these sequences as genuine, but a possible error in their amplification/assembly cannot be ruled out.

Repeats were not found in the shark sequences, except for the ITS-2 sequences of C. leucas, Hexanchus griseus and N. cepedianus (consensus patterns of 20–26 bases for total lengths of 41–63 bp).
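To make the notion of a tandem repeat and its total length concrete, the following Python sketch scans a sequence for the longest run of exact, back-to-back copies of a given unit. It is a naive illustration and not the algorithm of Tandem Repeats Finder, which also detects approximate and partial copies; this may be why a reported total length such as 38 bp for an 11-base unit is not a whole multiple of the unit length. The function name and the toy sequence are hypothetical.

```python
def longest_tandem_run(seq, unit):
    """Return (start, copies) for the longest run of exact tandem copies.

    start is the 0-based position of the run and copies the number of
    complete, adjacent copies of unit. Returns (-1, 0) if unit is absent.
    """
    if not unit:
        raise ValueError("unit must not be empty")
    seq, unit = seq.upper(), unit.upper()
    k = len(unit)
    best_start, best_copies = -1, 0
    for i in range(len(seq) - k + 1):
        copies = 0
        j = i
        # Count how many whole copies follow each other from position i.
        while seq.startswith(unit, j):
            copies += 1
            j += k
        if copies > best_copies:
            best_start, best_copies = i, copies
    return best_start, best_copies


# Toy sequence built around the 11-base unit reported for Mitosporidium ITS-1;
# the flanking bases and the copy number are invented.
toy = "TTT" + "GATGGTGGATC" * 3 + "GA"
start, copies = longest_tandem_run(toy, "GATGGTGGATC")
print(start, copies, copies * len("GATGGTGGATC"))
```

Only whole exact copies are counted here, so the total length returned is always a multiple of the unit length.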

Spacer introns identification and deduced ITS-2 in the holomycotans

The selected sequences do not show significant similarities with any other available sequence. On the other hand, for each of them it was possible to identify short regions (< 10 nt) corresponding to conserved paired regions of group I introns. The 2D reconstructions of the molecules according to the pairs identified proved to be in accordance with the group I intron model (Michel and Westhof 1990; Cech et al. 1994). The complete set of P1–P10 pairs was found (Figs. 1a, b, 2a, b, 3). All have a wobble u:G at the 5′ splice site, and most have a complementary pair at the G-binding site in P7, normally formed by the highly conserved C:G or the rarer U:G or U:A. While the ITS-1 intron of Polychytrium possesses the rare U as terminal residue at the 3′-end, all the others have the most common G (Gω). Furthermore, the ITS-2 intron of A. occidentale contains an open reading frame in the P9 loop (Fig. 2a), and its translation in frame-2 gives a theoretical product of 44 amino acids containing a His-Cys box (Fig. 4). The presence of such a highly degenerate gene would be congruent with the intronic nature of these elements because the nuclear introns of several protists, targeting the rRNA genes, can indeed contain homing endonucleases of the His-Cys box family (Haugen et al. 2004; Corsaro and Venditti 2018). For the ITS-2 introns of Mitosporidium, A. occidentale and N. simplex, the putative insertion site is located in helix 2 of the ITS-2 deduced after intron removal, near the pyrimidine bulge. Indeed, once the introns were removed from the original sequences of the three organisms, it was possible to reconstruct 2D models of ITS-2 that appear more consistent with those expected (Figs. 1c, 2c, d). All have a multi-helix structure typical of ITS-2 containing a helix 2 with a characteristic pyrimidine bulge. The ITS-2 of Mitosporidium has only 148 nt, a length fully compatible with that expected for this group of microsporidia-like organisms (Corsaro et al. 2014, 2019a).
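The search for an open reading frame such as the one reported in the P9 loop can be sketched in Python as a translation of the three forward reading frames with the standard genetic code. This is a minimal illustration and not the tool used in this study. The function names are hypothetical, and reading "frame-2" as the second forward frame (starting at the second nucleotide) is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq, frame=1):
    """Translate a DNA or RNA sequence in forward frame 1, 2 or 3.

    Stop codons are written '*'; codons with ambiguous bases give 'X'.
    A trailing incomplete codon is ignored.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_open_stretch(protein):
    """Longest run of residues not interrupted by a stop codon."""
    return max(protein.split("*"), key=len)


# Toy sequence, invented for illustration only.
toy = "ATGCATTGTTAA"
for frame in (1, 2, 3):
    print(frame, translate(toy, frame))
```

A candidate product can then be inspected for conserved histidine and cysteine residues, as done for the His-Cys box in the alignment of Fig. 4.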

Fig. 1

Predicted secondary structures of the spacer introns present in the ITS-1 (a) and ITS-2 (b) of Mitosporidium daphniae. Sequences of the introns are in uppercase, and portions of the ITS serving as flanking exons are in lowercase. Paired elements (P1–P10) and the 5′ and 3′ splice sites (black arrowheads) are indicated. Predicted secondary structure of the ITS-2 once the spacer intron has been removed (c). Ends of 5.8S and LSU rDNA are in uppercase, and the sequence of the ITS-2 is in lowercase. Roman numerals label ITS helices, and the arrowhead indicates the insertion site of the spacer intron. Schematic drawing of the rDNA operon of Mitosporidium (d). The rRNA genes are indicated by white boxes, and the ITS by thin lines. Spacer introns are represented by thick lines. Numbers refer to positions in the M. daphniae rDNA sequence (Genbank ID MF278562)

Fig. 2

Predicted secondary structures of (top) the spacer introns present in the ITS-2 and of (bottom) the ITS-2 after intron removal, of Amoeboaphelidium occidentale (a, c) and Nuclearia simplex (b, d). Labelling as in Fig. 1

Fig. 3

Predicted secondary structure of the spacer intron present in the ITS-1 of Polychytrium aggregatum. Labelling as in Fig. 1

Fig. 4

The deduced amino acid sequence (44 aa) of the putative homing endonuclease found in the ITS-2 intron of Amoeboaphelidium occidentale (Aocc), aligned using COBALT with the sequences of His-Cys endonucleases present in nuclear introns of various protists. Selected sequences were from three heterolobosean amoebae, Naegleria jamiesoni (Njam), Naegleria gruberi (Ngru) and Allovahlkampfia spelaea (Aspe), the zoopagomycotan fungus Coemansia mojavensis (Cmoj), the rhodophyte Porphyra umbilicalis (Pumb), the myxogastrid Physarum polycephalum (Ppol) and Acanthamoeba sp. KA/E5 genotype T4 (Acan) (see Corsaro and Venditti 2018). The sequence carried by the aphelid is very small, but an almost complete His-Cys box can be identified. Asterisks and open circles mark the conserved zinc-binding and active site residues, respectively

Spacer introns in sharks

As indicated above, the suggestion that the shark ITS could contain introns was advanced by Rogers and colleagues. These works have, to our knowledge, remained unpublished, and they are only available as conference papers or master's theses. In the present study, we were able to reconstruct secondary structures corresponding to the group I intron models from the ITS-1 and ITS-2 of five shark species (Fig. 5), which would support the spintron hypothesis.

Fig. 5

Examples of predicted secondary structures of spacer introns present in the ITS-1 and ITS-2 of chondrichthyan fishes, Carcharhinus leucas (a, b) and Isurus oxyrinchus (c, d)

Intron phylogeny

To trace the phylogenetic relationships of the spacer introns, a set of rDNA introns, occurring in prokaryotes and eukaryote nuclear and organellar genomes and representing the five intron classes IA–IE, was selected. Introns present in the DNA viruses infecting protists were also included. Phylogenetic analysis was performed on the basis of conserved regions P3–P8.

The spacer introns seem to form a unique group within the IC class, showing some relationships with introns from viruses of amoebae (Tupanvirus) and nuclear introns of Acanthamoeba lenticulata and Physarum polycephalum (Fig. 6). Statistical support is weak, however, although all methods recovered the virus intron as the closest relative and the introns of Acanthamoeba and Physarum as the next closest relatives, albeit in different positions. The spacer introns are divided into subgroups almost in accordance with their location in ITS-1 or ITS-2. The ITS-1 and ITS-2 introns of Mitosporidium cluster with the ITS-1 intron of Polychytrium and the shark ITS-2 introns, respectively. The ITS-2 intron of A. occidentale is unrelated to the S943 intron located in the 18S rDNA of the same organism, which we previously showed to belong to class IC (Corsaro et al. 2019b); it appears instead to form a lineage with the ITS-2 intron of Nuclearia, while shark ITS-1 introns form a paraphyletic basal group (holophyletic in NJ and MP). This could be due to the fact that for sharks we only included five of these introns. Indeed, according to Rogers and colleagues, shark introns should form two sister groups that, inherited vertically, have diversified over the evolutionary history of the hosts. In addition, it is possible that the internal topology is affected by the artificial position of the L2449 intron of Physarum and that, therefore, ITS-1 and ITS-2 introns actually form sister groups. Overall, although the present analysis probably does not reflect an accurate phylogeny, it would nonetheless support the view that these elements are group I introns.

Fig. 6

Phylogenetic relationships of spacer introns within group I introns. Classes IA to IE are indicated. The analysis includes mainly introns located in either bacterial (b), eukaryotic nuclear or organellar (c, chloroplast; m, mitochondria) rDNA, using IE introns as outgroups. The only introns located in protein-encoding genes considered here are those occurring in DNA viruses of different protists (green algae and amoebae). The alignment is based on the conserved core P3–P8. Bootstrap values for ML/NJ/MP are shown at the nodes (1000 replicates for ML and MP; 2000 for NJ). Asterisk, node < 50%; hyphen, node not supported

Discussion

Exceptionally long ITS have already been found in very distant organisms such as some oomycetes or arthropods, where the increase in size of the sequences is caused by the inclusion of tandem repeated elements (e.g. Hlinka et al. 2002; Thines 2007; Paredes-Esquivel and Townson 2014). These elements generally consist of about ≥ 100 nt, and their repetition gives ITS often exceeding 1.5 kb. Such a tandem repetition of long fragments does not seem to occur in the sequences analysed herein. Indeed, repeats are absent in most ITS of sharks as well as of Polychytrium and Nuclearia, and those identified have only short fragments for total lengths of about 50 nt. The only exception is the sequence of Amoeboaphelidium. On the other hand, portions of these sequences can be folded, giving congruent configurations with the 2D models of the group I introns (Figs. 1a, b, 2a, b, 3), and the phylogenetic analysis conducted on the conserved regions corroborates their membership to group I introns (Fig. 6). Residing in the ITS, which are much more variable compared to 18S and 28S rRNA, spacer introns could therefore be highly derived lineages representing new subtypes or even new classes. In addition, ITS-2 of plausible structure can be modelled for all three holomycotans after intron exclusion (Figs. 1c, 2c, d). These reconstructions indicate that a vestigial ITS-2 can theoretically form once the spacer intron is removed, which could occur naturally during the processing of the pre-rRNA if these elements are actually self-splicing introns. In the same way, self-splicing should also occur for introns in the ITS-1 by releasing a functional pre-rRNA intermediate. Therefore, assuming that these elements are true introns, they should behave as such, with little or no effect at the DNA level, whereas at the RNA level their self-splicing should allow the release of functional products from the invaded genes. 
The protistan nature of these organisms could also explain how the introns may have invaded them. Introns are indeed widespread, although scattered, among protists at the level of nuclear rRNA, and they are particularly common in fungi and algae (Bhattacharya et al. 1996; Haugen et al. 2004; Simon et al. 2005). Once the intron is inserted into the host DNA, it propagates by vertical inheritance to the entire lineage. An endonuclease, when present, and reverse splicing are responsible for its mobility at the DNA and RNA levels, respectively (Bhattacharya et al. 2005; Hedberg and Johansen 2013). Such mobility includes either a change of insertion site within the same host or a change of host by horizontal transfer, the latter generally requiring close cell-to-cell interactions that would promote gene transfer even between very distant organisms (Bhattacharya et al. 2001; Haugen et al. 2005).

More intriguing is the situation of the shark ITS introns. In animals, group I introns have been found only in the non-bilaterian basal lineages, i.e. poriferans, cnidarians and placozoans, and they are confined to the mitochondrial genomes, located in the cytochrome c oxidase subunit I (cox1) or the NADH dehydrogenase subunit 5 (nad5) genes. These introns are usually of class IB, and the cox1 intron often encodes a LAGLIDADG endonuclease (Goddard et al. 2006; Emblem et al. 2014; Celis et al. 2017; Osigus et al. 2017; Schuster et al. 2017). Studies have shown the occurrence of vertical inheritance and of horizontal transfers between species, including between sponges and corals, as well as the possible fungal origin of some of these introns. The intron spread could be explained by the absent or weak germline barrier in non-bilaterian animals. In bilaterians, segregation between germline and somatic cells is thought to be a major obstacle to horizontal gene transfers, although those involving functional genes are possible via symbionts or parasites, or during the early developmental stages exposed to foreign DNA (Huang 2013; Boto 2014; but see Ku and Martin 2016). In their analyses, Rogers and colleagues (Shivji et al. 1997; Walker et al. 1999; Cooper 2018) found that spacer introns are widespread among sharks, and they suggested that intron invasion probably occurred in an ancestor of present-day elasmobranchs (sharks, skates and rays), followed by vertical inheritance. A possible conformational role of the spintrons in assisting pre-rRNA processing could explain their conservation over the millions of years of shark evolution. One could also speculate that such an early intron invasion occurred via an infectious microorganism, such as a virus or protist, acting as a donor. Additional studies are obviously necessary.

It is likely that the great variability of the ITS does not offer many possibilities for intron insertion, which requires the recognition of specific sequences corresponding to the flanking exons. In the rRNA genes, by contrast, such recognition is ensured by their conserved portions. The existence of spacer introns, if any, should therefore be a rare event, probably due to fortuitous circumstances. However, the data presented herein seem consistent with the interpretation of these elements as spacer introns. In addition, we analysed only a very small number of ITS because our initial interest was to study the sequence of Mitosporidium, which was abnormally long according to our prediction (Corsaro et al. 2019a). More detailed and extended analyses, including a larger number of sequences, could thus shed light on the real existence of these putative spacer introns and on their distribution. Should these elements prove to be genuinely present, and in several other sequences, this would open up a new field of research. There may also be direct practical impacts, as the ITS region is considered a promising DNA barcode for fungi.