Introduction

Genes encoded by fungal mitochondrial genomes (mtDNA) can be categorized as follows: (1) rRNA and tRNA encoding genes, (2) genes coding for proteins involved in the respiratory chain, subunits of the NADH dehydrogenase, and components of the ATP synthase, and (3) in some cases ribosomal proteins (Bullerwell et al. 2003a; Hausner 2003). There is tremendous variation among the eukaryotes with regard to mtDNA encoded ribosomal proteins; for example, the jakobid Reclinomonas americana has 27 mtDNA encoded ribosomal proteins, but most eukaryotes encode few or none (Gray et al. 1998). No such genes are found in animal mtDNAs, but the ribosomal protein S3 (rps3) gene has been located in the mtDNAs of the choanoflagellate Monosiga verticillata and the ichthyosporean Amoebidium parasiticum (Burger et al. 2003; Bullerwell et al. 2003b). The latter two organisms may represent lineages that share ancestry with the metazoans and the fungi. The metazoans appear to have lost mtDNA encoded ribosomal protein genes, and it has been speculated that these mtDNA genes were transferred to the nuclear genomes (Smits et al. 2007).

Within the fungi, the first described mtDNA encoded ribosomal protein gene was VAR1 in yeast (Groot et al. 1979; Terpstra and Butow 1979), which was noted to encode an essential protein (Mason et al. 1996) that is a component of the small ribosomal subunit (Terpstra et al. 1979; Davis and Ellis 1995; Graack and Wittmann-Liebold 1998). This gene is highly polymorphic among strains of Saccharomyces cerevisiae due to variation in the number of AAT (asparagine) codons (Butow et al. 1985; Hudspeth et al. 1984) and insertions of G+C rich clusters (Wenzlau and Perlman 1990). Later, another mtDNA ribosomal gene, S5, was found encoded within an rnl group I intron (mL2449) in Neurospora crassa (Burke and RajBhandary 1982). Like the VAR1 protein, S5 was also shown to be a stoichiometric component of the mitochondrial ribosome (LaPolla and Lambowitz 1981). The var1 gene has been detected in other ascomycetous yeasts (Douglas and Butow 1976; Hoeben and Clark-Walker 1986; Groth et al. 2000) and the S5 gene has been located in a diverse set of species of filamentous ascomycetes (Burke and RajBhandary 1982; Bullerwell et al. 2003a, b, c; Hausner 2003). A gene with homology to the E. coli rps3 gene was discovered in the mtDNA of the chytridiomycete fungus A. macrogynus (Paquin and Lang 1996) and, within the zygomycetes, in M. verticillata and S. culisetae (Paquin et al. 1997). Overall the rps3 gene has a scattered distribution among the fungal mitochondrial genomes, suggesting frequent loss of mtDNA encoded ribosomal genes and the existence of nuclear encoded equivalents (Bullerwell et al. 2000, 2003a, c). While the VAR1, Rps3, and S5 ribosomal proteins appear to have few sequence similarities, they have been shown to share a novel amino acid motif at the C-terminus; this suggests that VAR1 and S5 are homologs of Rps3 (Bullerwell et al. 2000; Smits et al. 2007). Fungal mitochondrial encoded rps3 homologs are extremely variable in sequence and in mtDNA location (Bullerwell et al. 2003c; Formighieri et al. 2008), with the size of the encoded proteins ranging from 227 amino acids in the fission yeast Schizosaccharomyces pombe (urfa = rps3; Zimmer et al. 1990; NC_001326, Bullerwell et al. 2000) to 1453 amino acids in the basidiomycete Schizophyllum commune (NC_003049; Bullerwell et al. 2000).

In members of the Eurotiomycetes (e.g., Penicillium chrysogenum) and Sordariomycetes (e.g., Neurospora crassa) examined so far, the S5 (now rps3) gene is located within an rnl group I intron (mL2449). This unique arrangement would ensure the transcription of the rps3 gene along with the rRNAs, and it would also ensure the maintenance of the group I intron, even though group I introns are often viewed as neutral/optional or parasitic type elements.

In this study we examined the evolutionary dynamics of the mtDNA encoded rps3 gene within the fungi. The main focus of this study was to examine the mtDNA encoded rps3 gene for ophiostomatoid species belonging to genera such as Ceratocystis, Ophiostoma (including mitotic members), Grosmannia, Ceratocystiopsis, Gondwanamyces, and Sphaeronaemella. These genera include not only ecologically unique fungi (Harrington 1993a, b; Paine et al. 1997; Hausner and Reid 2004) but also economically important fungi such as blue stainers, which stain stored lumber and thus reduce its value, and tree pathogens such as Ophiostoma novo-ulmi (Dutch Elm Disease) (Wingfield et al. 1993; Hausner et al. 2005). We also examined the position of fungal rps3 homologues among their putative bacterial and eukaryotic counterparts. Currently few rps3 sequences are available for the ascomycetous fungi; during this study we determined mL2449 intron encoded rps3 sequences from 41 different fungal species.

Materials and Methods

Source and Maintenance of Fungal Cultures and DNA Extraction Protocols

The sources for all strains used in this study are listed in Table 1. All strains were cultured in petri dishes containing 2% malt extract agar [MEA; 20 g malt extract (Difco, Michigan) supplemented with 1 g yeast extract (YE; Gibco, Paisley, UK) and 20 g bacteriological agar (Gibco) per liter]. Methods for generating biomass and protocols for DNA extraction have been described previously (Hausner et al. 1992, 2005).

Table 1 List of strains and GenBank accession numbers

PCR Amplification, Cloning of PCR Products, and DNA Sequencing

A PCR-based survey utilizing primers IP1 and IP2 (Bell et al. 1996; Gibb and Hausner 2005; Sethuraman et al. 2008) was conducted to examine the mL2449 intron, which encodes rps3, in members of Ophiostoma and related taxa. About 50–100 ng of whole cell DNA served as a template for PCR reactions. Taq DNA polymerase, buffers, and dNTPs were obtained from Invitrogen (Life Technologies, Burlington, ON) and used according to the manufacturer’s recommendations. PCR conditions were as follows: an initial denaturation step of 94°C for 3 min was followed by 25 cycles of denaturing (93°C for 1 min), annealing (52.9°C for 1 min 30 s), and extension (70°C for 4 min 30 s), followed by cooling the reactions to 4°C. PCR fragments were separated by gel electrophoresis through a 1% agarose gel in TBE buffer (89 mM Tris–borate buffer with 10 mM EDTA at pH 8.0). DNA fragments were sized using the 1-kb plus DNA ladder (Invitrogen) and visualized by staining with ethidium bromide (0.5 μg/ml).

Some PCR products were cloned with the TOPO® TA cloning kit (Invitrogen) to facilitate DNA sequencing. The PCR products or plasmid DNAs were purified with the Wizard SV Gel and PCR clean-up system (Promega, Madison WI) or Wizard Plus Minipreps DNA purification system (Promega), respectively. Sequencing reactions contained about 100 ng of template DNA, 1 μl (3–4 pmol) of primer, 4 μl of the AB Big Dye terminator reagent (version 3.1, Applied Biosystems/Life Technologies, Foster City, CA, USA), and ultra-pure water to a final volume of 20 μl. The sequencing reactions were carried out as described in the AB Cycle sequencing kit (Applied Biosystems/Life Technologies). The extension products were purified by ethanol precipitation as recommended by the manufacturer. The purified sequencing products were resolved on an ABI 3700 DNA analyzer. Initially, sequencing employed the IP1 and IP2 primers or, when appropriate for cloned PCR products, the M13 forward and reverse primers; thereafter nested primers were designed as needed. DNA sequences were obtained for both strands of the template. Table 1 lists GenBank accession numbers for all sequences obtained in this study.

Sequence and Phylogenetic Analysis

The individual sequences were assembled manually into contigs using the GeneDoc program v2.5.010 (Nicholas et al. 1997). The ORF Finder program (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) was used (setting 4: genetic code for mtDNA of molds) to search for potential ORFs within the mL2449 intron sequences.
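The ORF search described above can be illustrated with a minimal Python sketch. It is not the ORF Finder implementation; it only assumes NCBI translation table 4 (in which TGA encodes tryptophan rather than a stop) and, as a simplification, accepts only ATG as a start codon. The function names and the `min_codons` threshold are hypothetical choices made for this illustration.

```python
# Sketch of an ORF scan under NCBI translation table 4 (mold mitochondrial
# code). Assumption: only ATG is treated as a start codon; ORF Finder itself
# may accept alternative initiation codons.
BASES = "TCAG"
# Amino acids in TCAG codon order; table 4 differs from the standard code
# only in reading TGA as W (Trp).
AA_TABLE4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AA_TABLE4[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=100):
    """Return (strand, frame, start, end, protein) for ATG...stop ORFs.

    start and end are 0-based offsets on the scanned strand; end is the
    position of the stop codon. ORFs that run off the end of the sequence
    without a stop codon are not reported.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                aa = CODON_TABLE.get(codon, "X")  # X marks ambiguous codons
                if start is None:
                    if codon == "ATG":
                        start = pos
                elif aa == "*":
                    if (pos - start) // 3 >= min_codons:
                        protein = "".join(
                            CODON_TABLE.get(s[p:p + 3], "X")
                            for p in range(start, pos, 3)
                        )
                        orfs.append((strand, frame, start, pos, protein))
                    start = None
    return orfs
```

Calling `find_orfs(intron_sequence, min_codons=100)` on an intron sequence would list candidate ORFs on both strands; under table 4 an in-frame TGA extends the ORF instead of terminating it, which is why the choice of genetic code matters for mitochondrial sequences.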

The online resource BLASTp (Altschul et al. 1990) was used to retrieve additional rps3 sequences. Nucleotide sequences were aligned with Clustal-X (Thompson et al. 1997) and the alignments were refined manually with the aid of the GeneDoc program. Amino acid sequence alignments were generated with the online PRALINE multiple sequence alignment program (Simossis and Heringa 2003, 2005) and then refined further with GeneDoc.

For phylogenetic analysis, only those segments of the alignment where all sequences could be aligned unambiguously were retained. Phylogenetic estimates were generated by the programs contained within the PHYLIP package (Felsenstein 2006) and the MrBayes program v3.1 (Ronquist and Huelsenbeck 2003; Ronquist 2004). In PHYLIP, a phylogenetic tree was obtained by analyzing the alignment with the PROTPARS (protein parsimony algorithm, version 3.55c) program in combination with bootstrap analysis (SEQBOOT) and CONSENSE to obtain the majority rule consensus tree along with an estimate of confidence levels for the major nodes within the phylogenetic tree (Felsenstein 1985). Phylogenetic estimates were also generated within PHYLIP using the NEIGHBOR program using distance matrices generated by PROTDIST (setting: Dayhoff PAM250 substitution matrix; Dayhoff et al. 1978) or DNADIST (K84 setting).
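The bootstrap step can be sketched in a few lines of Python. This is only an illustration of column resampling with replacement, the general idea behind programs such as SEQBOOT; it is not PHYLIP code, and the function names, the dictionary input format, and the default of 100 replicates are assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all sequences must have the same aligned length")
    # The same resampled column indices are applied to every sequence so that
    # each pseudoreplicate column is a real column of the original alignment.
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


def bootstrap_replicates(alignment, n_replicates=100, seed=1):
    """Return a list of pseudoreplicate alignments (seeded for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

Each pseudoreplicate would then be analyzed with the tree-building method of choice, and the resulting trees summarized as a majority rule consensus to obtain support values for the nodes.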

The MrBayes program was used for Bayesian analysis and the parameters for amino acid alignments were as follows: mixed models and gamma distribution with four gamma rate parameters. The Bayesian inference of phylogenies was initiated from a random starting tree and four chains were run simultaneously for 1,000,000 generations; trees were sampled every 100 generations. The first 25% of trees generated were discarded (“burn-in”) and the remaining trees were used to compute the posterior probability values. For nucleotide sequence alignments the GTR model with gamma distribution was applied to the dataset and, as above, four chains were run simultaneously for 1,000,000 generations with a sample frequency of 100 and a “burn-in” corresponding to the first 25% of sampled trees.
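As a worked check of the sampling scheme, the following Python sketch computes how many trees are sampled, discarded, and retained for the settings given above. The helper name is hypothetical, and the count ignores the initial (generation 0) state that a program may also write to its tree file.

```python
def retained_samples(generations, sample_freq, burnin_fraction):
    """Return (sampled, discarded, retained) tree counts for an MCMC run."""
    n_sampled = generations // sample_freq
    n_burnin = int(n_sampled * burnin_fraction)
    return n_sampled, n_burnin, n_sampled - n_burnin
```

For 1,000,000 generations sampled every 100 generations with a 25% burn-in, `retained_samples(1_000_000, 100, 0.25)` gives 10,000 sampled trees, of which 2,500 are discarded and 7,500 contribute to the posterior probability values.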

Phylogenetic trees were drawn with the TreeView program (Page 1996) using PHYLIP tree outfiles or MrBayes tree files, and annotated with CorelDRAW™ (Corel Corporation and Corel Corporation Limited, Ottawa, ON).

To assess whether the Ophiostoma mtDNA rps3 genes are evolving under functional constraints or drifting (suggesting degeneration), the rates of synonymous (dS) and non-synonymous (dN) substitutions within the mL2449 encoded rps3 ORFs were calculated using the Syn-SCAN program (http://hivdb6.standford.edu/synscan/synscan.cgi; Gonzales et al. 2002). A set of 34 rps3 ORF sequences representing strains of Ophiostoma and closely related taxa was analyzed; rps3 ORF sequences from strains of Sordaria fimicola and Gelasinospora tetrasperma were included for comparison. The nucleotide sequences for these ORFs were aligned such that all codons were aligned against each other, and segments with indels were deleted from the alignment. The alignment analyzed consisted of 1179 positions (i.e., 393 codons).
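The distinction between synonymous and non-synonymous changes in a codon alignment can be illustrated with the Python sketch below. It is a deliberate simplification and not the Syn-SCAN algorithm: it drops codon columns containing gaps, then classifies only codon pairs that differ at exactly one nucleotide, and it does not normalize by the number of synonymous and non-synonymous sites as a full dS and dN estimate would. The function names are hypothetical, and translation table 4 is assumed.

```python
# Assumption: NCBI translation table 4 (TGA = Trp), codons in TCAG order.
BASES = "TCAG"
AA_TABLE4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AA_TABLE4[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def strip_gapped_codons(seq_a, seq_b):
    """Return codon pairs, skipping columns with gaps or ambiguous bases."""
    pairs = []
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        ca, cb = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        if ca in CODON_TABLE and cb in CODON_TABLE:
            pairs.append((ca, cb))
    return pairs


def count_single_step_changes(seq_a, seq_b):
    """Count (synonymous, non_synonymous, skipped) codon differences.

    Only codon pairs differing at exactly one position are classified;
    pairs differing at two or three positions are counted as skipped
    because their classification depends on the mutational pathway.
    """
    syn = nonsyn = skipped = 0
    for ca, cb in strip_gapped_codons(seq_a, seq_b):
        n_diff = sum(x != y for x, y in zip(ca, cb))
        if n_diff == 0:
            continue
        if n_diff > 1:
            skipped += 1
        elif CODON_TABLE[ca] == CODON_TABLE[cb]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn, skipped
```

A gene under functional constraint is expected to show a relative excess of synonymous changes; raw counts such as these only hint at that pattern, so a site-normalized method remains necessary for actual dS and dN estimates.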

Results and Discussion

Rps3

The Rps3 protein is of interest for several reasons: (1) it performs an essential function within the ribosome and thus in translation (Wilson and Nierhaus 2005); (2) this protein appears to be part of an ancient protein family (Copertino et al. 1991) and like other ribosomal proteins could be useful in phylogenetic inference (Müller and Wittmann-Liebold 1997; Martini et al. 2007; Verbruggen et al. 2007); (3) plastid and mitochondrial rps3 homologues appear to be extremely variable in size, position, and gene arrangements; and (4) some Rps3 proteins are “moonlighting” proteins, i.e., they can have two distinct functions. For example, when mutations were introduced within the C-terminus of the Rps3 homolog (227 urf a) found in S. pombe, a mitochondrial mutator phenotype was noted, suggesting involvement in DNA repair (Zimmer et al. 1991; Neu et al. 1998). The human and Drosophila melanogaster nuclear versions of the rps3 gene have been shown to be involved in DNA repair (Lyamouri et al. 2002; Jang et al. 2004). Also, in S. pombe the nuclear version of rps3 is involved in gene regulation (Kim et al. 2009), and endonuclease activity has been noted for some other Rps3 homologues (Jung et al. 2001).

Unusual rps3-like ORFs have been noted in organellar genomes; for example, the Chlamydomonas reinhardtii chloroplast (cp) ORF712 was shown to be a functional gene with homology to rps3. However, in the encoded protein only the first 117 amino acids and the last 133 amino acids show homology to Rps3; an intervening sequence of 462 amino acids shows no homology to any known protein (Liu et al. 1993). A similar situation was noted in the related alga Scenedesmus obliquus, where again an intervening sequence appears to be present within the cpDNA encoded rps3 ORF (de Cambiaire et al. 2006). In Euglena gracilis a 409 bp mixed group II/group III twintron is present within the cp rps3 gene (Copertino et al. 1991). A more dramatic example is found in Dictyostelium discoideum, where the mtDNA rps3 gene is split and both the 5′ and 3′ segments are fused to separate ORFs of unknown functions (ORF425 and ORF1740, respectively) (Iwamoto et al. 1998). Complex organization of mtDNA rps3 genes is also seen in Chara vulgaris (green algae, Charophyceae) (Turmel et al. 2003), Physcomitrella patens (moss, Bryophyta) (Terasawa et al. 2007), and in many angiosperms where the gene is interrupted by one group II intron (Laroche and Bousquet 1999); in the gymnosperm Cycas, two group II introns are inserted (Regina et al. 2005). Within some fungi the rps3 gene, as noted previously, is a component of an rnl group I intron (mL2449), and in a few examples these intron encoded rps3 ORFs have been invaded by homing endonuclease genes (HEGs) (Hausner et al. 1999; Gibb and Hausner 2005; Sethuraman et al. 2008).

Mosaic and Unusual mtDNA rps3 Genes Within the Fungi

The mitochondrial rps3 gene within the fungi can exist in various configurations (Fig. 1a, b). For example, in Schizophyllum commune this gene encodes an ORF of 1453 amino acids (AF402141, Bullerwell et al. 2003a); however, only the C-terminal 499 amino acid sequence appears to be similar to the Rps3 amino acid sequence of another basidiomycete, Moniliophthora perniciosa (AY376688, Bullerwell et al. 2003b). The origin of the 5′ segment of the S. commune rps3 gene is unknown and the authors viewed this gene as a hybrid gene (Forget et al. 2002; Lang, F.B.F. 2006, AAG10295). Also, in the basidiomycete Pleurotus ostreatus the rps3 gene encodes a 735 amino acid ORF (Wang et al. 2008; EF204913) and the N-terminal 444 amino acids do not share any homology with any other known protein. The origin of these extra sequences and their effect on the function of the Rps3 protein are unknown.

Fig. 1

Diversity found within rps3/var1 gene configurations. In general rps3 genes are highly variable in size and sequence; the E. coli rps3 homolog is shown for comparative purposes. Homology among the rps3 genes is usually restricted (or recognized) to the N- and C-terminus coding regions (Bullerwell et al. 2000). a Arrangements present in the freestanding var1 or rps3 gene. The S. commune rps3 gene potentially encodes a 1453 amino acid peptide (Bullerwell et al. 2000). However, only the C-terminal region (amino acid positions 1143–1449) of the S. commune Rps3 shows similarity to amino acid positions 187–499 of the M. perniciosa rps3 ORF (size: 499 amino acids; Formighieri et al. 2008). The P. ostreatus rps3 ORF encodes 735 amino acids (Wang et al. 2008) but only the C-terminus segment (amino acid positions 445–733) has similarities with other Rps3 amino acid sequences. In S. pombe the urfA gene encodes 227 amino acids (Zimmer et al. 1990; Schafer et al. 2005). In S. cerevisiae the mtDNA rps3 homolog, var1, usually encodes 398 amino acids (Saccharomyces Genome Database, yeast-curator@genome.stanford.edu). In P. nodorum (Hane et al. 2007) the rps3 gene potentially encodes a 771 amino acid peptide with the C-terminal section (positions 416–771) showing no similarity to other Rps3 homologues. This C-terminal section contains a short sequence (55 amino acids) that may have been derived from the cox1 gene (see text). b Variations present among the mL2449 intron encoded versions of rps3. In Hypocrea jecorina the Rps3 protein consists of 474 amino acids (Chambergo et al. 2002). A similar situation is present (408 amino acids) when no insertion elements are found within the rps3 gene in O. ulmi s.l. Three potential (A, B, and C) insertion sites have been noted within different Ophiostoma species (FJ717851, FJ717839, and AY275136) for double motif LAGLIDADG HEGs (Gibb and Hausner 2005; Sethuraman et al. 2008; Sethuraman 2009). In Sphaeronaemella fimicola the rps3 gene encodes an ORF of 779 amino acids of which only the N- and C-terminal segments show similarities with other mL2449 encoded Rps3 sequences. In addition the S. fimicola (strain WIN(M) 818) rps3 sequence contains an in-frame microsatellite (GCT)16.

An unusual mtDNA rps3 ORF arrangement can be found in Phaeosphaeria nodorum (syn. Stagonospora nodorum); here again the ORF appears quite long, encoding a 771 amino acid peptide (Hane et al. 2007; YP_001427397). Closer inspection shows that amino acid positions 45–417 align with other intron (mL2449) encoded Rps3 protein sequences (in particular with the Penicillium marneffei rps3 ORF; AAQ54923). Amino acid positions 418–771 of the P. nodorum rps3 ORF have no similarity with any other Rps3 homolog, but a small segment within this region, positions 432–487, appears on the basis of BLASTp analysis to be derived from the cytochrome oxidase subunit 1 gene. This inference is based on 80% identity at the amino acid level and 90% identity at the nucleotide level to a segment of the cox1 gene of P. nodorum (ABU49440; NC_009746 position 21800–21931). Although a free-standing ORF, this rps3 gene is related to the mL2449 intron encoded version of rps3. At this point one can only speculate that the recombination event that relocated the rps3 gene from its original position (one assumes the mL2449 intron) generated this mosaic gene, which incorporated a small segment of the cox1 gene.

Within the ophiostomatoid fungi we noted previously that the mL2449 intron encoded rps3 ORF can be invaded by three different HEGs (Sethuraman et al. 2008; Sethuraman 2009). One double LAGLIDADG type HEG inserts within the N-terminus region of the ORF, effectively splitting the rps3 gene into two pieces, and two HEG insertion sites were noted within the C-terminal components of the rps3 ORF (Sethuraman 2009). The latter two HEGs insert in such a manner that they displace a short segment of the host gene, and the HEG duplicates the displaced segment in order to fuse in-frame to the upstream rps3 coding region (see Gibb and Hausner 2005), effectively generating hybrid ORFs that could express fusion proteins that may have to be resolved post-translationally.

One more unusual mL2449 intron encoded rps3 gene was found among strains of Sphaeronaemella fimicola [WIN(M) 818 and UAMH8839 (=WIN(M)1402)] (Fig. 1b). These two strains were previously studied and shown to be closely related, although the WIN(M)818 strain may represent a new species of Sphaeronaemella (Hausner and Reid 2004; Hausner and Wang 2005). Both appear to encode relatively long rps3 ORFs, 779 and 688 amino acids for strains WIN(M) 818 and UAMH8839, respectively. This compares to rps3 ORFs for phylogenetically related fungi (Gondwanamyces proteae, Cornuvesica falcata, and Ceratocystis spp.) that are around 400 amino acids, showing that the size of the rps3 gene can be highly variable and can apparently change quickly within one evolutionary lineage. Based on BLASTp analysis, the Sph. fimicola rps3 ORFs show similarity at the amino acid level to other intron encoded rps3 ORFs only for an N-terminus segment of 120 amino acid positions and for about 100 amino acid positions at the C-terminus. Between these two segments about 400 amino acids in the case of WIN(M)818, or about 300 amino acids for strain UAMH8839, show no similarities to any other ribosomal proteins, or for that matter to any other protein within the NCBI database. This is similar to the unusual Chlamydomonas reinhardtii cp rps3 ORF712 described earlier, where an intervening sequence of unknown origin separates the recognizable N- and C-terminal Rps3 sequences. Also, when comparing the rps3 sequences of WIN(M) 818 and UAMH8839 against each other, we noted that within the Sph. fimicola WIN(M) 818 rps3 ORF an indel was present that consists of a microsatellite segment (GCT)16 that is in-frame and thus results in the presence of 16 consecutive alanines. This again demonstrates the extreme variability that can be encountered among rps3 genes and one mechanism by which the rps3 gene can rapidly expand in size.

It appears that the rps3 gene and its product are very tolerant to insertion of sequences. It is quite possible that post transcriptional or translational modifications occur that trim the Rps3 protein into a smaller functional version. Also one cannot exclude the possibility that indeed some mtDNA rps3 genes are either pseudogenes or they are redundant as nuclear encoded counterparts exist. Either a functional copy of the mtDNA encoded rps3 gene was at one time transferred to the nuclear genome or as in the case of some plant mitochondrial ribosomal proteins the loss of the mitochondrial copy was substituted by co-opting a duplicated nuclear ribosomal gene (Adams et al. 2002; Bonen and Calixte 2006).

Evolution of the rps3 Homologues

With BLASTp we extracted 207 Rps3 protein sequences from GenBank; these Rps3 sequences were of nuclear, mitochondrial, chloroplast, eubacterial, and archaeal origin. Among this range of Rps3 sequences similarities were restricted to the N- and C-terminal regions. We sampled among the 207 Rps3 sequences and restricted the final data set to 51 Rps3 sequences for more detailed analysis (Fig. 2). We aligned 218 amino acid sequence positions within the N- and C-terminus. A cyanobacterial Rps3 sequence was designated as the outgroup for this analysis, causing the chloroplast Rps3 sequences to be polarized toward the base of the tree. The topology of the tree follows the expected pattern, showing that the chloroplast Rps3 proteins are related to the cyanobacterial Rps3 proteins and that the Archaea are more closely related to the Eukarya than to the Eubacteria. This is in agreement with other studies showing that many key eukaryotic nuclear genes share common ancestry with archaeal genes (Allers and Mevarech 2005).

Fig. 2

Phylogenetic relationships among Rps3 amino acid sequences. The topology of the phylogenetic tree is based on Bayesian analysis (50% majority rule consensus tree). Two nodes are marked that identify the diversity within the mitochondrial Rps3 protein sequences. Node 1 identifies a potential clade that includes the mitochondrial encoded Rps3 protein sequences from the Kingdom Viridiplantae, whereas node 2 represents the clade that comprises the fungal mitochondrial Rps3 protein sequences. For this tree “nuclear” rps3 genes refers to eukaryotic-type cytosol ribosome homologs. The numbers at the nodes, on top of the line, indicate the level of support based on bootstrap analysis in combination with NJ and PROTPARS analysis, respectively. The value at the nodes below the line indicates the posterior probability values obtained from the 50% majority consensus tree generated using Bayesian analysis. Only the nodes that received >50% support are indicated. Deep branches with poor support were collapsed. For nodes to be considered statistically significant, bootstrap support numbers of >95% and posterior probability values of >99% are expected. NA indicates that a particular node did not receive significant values in one of the phylogenetic programs. Branch lengths are based on Bayesian analysis and are proportional to the number of substitutions per site (see scale bar).

The mitochondrial Rps3 amino acid sequences, however, appear to be quite diverse and grouped into two clades, one consisting of the Rps3 sequences belonging to the Kingdom Viridiplantae and a second that includes the fungal Rps3 sequences. Also, based on the sequence alignments and the observed branch lengths within the fungal mitochondrial Rps3 clade, these sequences are far more diverse than the other Rps3 sequences included in the analysis.

Evolutionary Relationships Among Fungal mtDNA Encoded rps3 Homologs

Rps3 sequences extracted from GenBank mostly include representatives for two major groups within the Pezizomycotina: the Eurotiomycetes and the Sordariomycetes (Blackwell et al. 2006). The Rps3 amino acid sequence from the zygomycete Mortierella verticillata served as the outgroup in this analysis. We also included sequences from members of the Basidiomycota and Schizosaccharomyces. The Rps3 dataset consisted of 72 sequences including 43 newly obtained Rps3 sequences from various ascomycete species plus sequences for meiotic (teleomorphs) and mitotic (anamorph) members of the genus Ophiostoma sensu lato (Fig. 3).

Fig. 3

Phylogeny of fungal mtDNA Rps3 amino acid sequences. Tree topology is based on a 50% majority rule consensus tree obtained from Bayesian inference. The numbers at the nodes, above the line, indicate the level of support based on bootstrap analysis in combination with NJ and PROTPARS analysis, respectively. The numbers below the line represent the posterior probability values obtained from Bayesian analysis. Numbers of support are only shown if a node received >50% support. NA indicates that a particular node failed to receive significant values with one of the tree reconstruction methods. The branch lengths shown are based on the number of substitutions per site (indicated as scale bar). Letters indicate the taxonomic position (Class level) for the filamentous ascomycetes members examined in this analysis: E Eurotiomycetes, D Dothideomycetes, L Leotiomycetes, S Sordariomycetes.

Species of filamentous ascomycetes examined so far belonging to the Eurotiomycetes and Sordariomycetes have their mtDNA rps3 gene encoded within the mL2449 group I intron (Burke and RajBhandary 1982; reviewed in Hausner 2003; Sethuraman et al. 2008). The potential importance of this intron/rps3 combination is demonstrated in strains of Verticillium dahliae, Metarhizium anisopliae var. anisopliae, and Lecanicillium muscarium, where the only mtDNA group I intron present is the mL2449 intron encoding rps3 (Pantou et al. 2006; Kouvelis et al. 2004).

The phylogenetic analysis of yeast VAR1 and the intron encoded Rps3 amino acid sequences indicated that they were in two distinct clades, although they shared a common ancestor (Fig. 3). However, not all filamentous ascomycetous fungi have intron encoded rps3 ORFs. Among the two species of the Pleosporales (Dothideomycetes) characterized so far, the mL2449 intron is missing. In the case of Phaeosphaeria nodorum (syn. Stagonospora nodorum, which causes leaf and glume blotch disease in wheat) rps3 is a free-standing gene, and in the second instance, Mycosphaerella graminicola (anamorph: Septoria tritici, causal agent of S. tritici leaf blotch foliar disease of wheat), both the mL2449 intron and rps3 are missing from the mtDNA. The rps3 present in P. nodorum groups with intron encoded versions of rps3, which implies that the P. nodorum rps3 gene was somehow relocated and the rnl intron mL2449 was lost.

The phylogenetic analysis of the Rps3 amino acid sequence data (Fig. 3) yielded a tree that essentially reflects the expected relationships among the species sampled based on previous rDNA studies (Hausner et al. 1993; Hausner and Reid 2004). The Rps3 data grouped all analyzed members of the Saccharomycetales into a single clade; however, deeper nodes that link to branches leading to Rps3 sequences from the Schizosaccharomycetales and Agaricales (Basidiomycota) were poorly resolved. The Rps3 data grouped together members of the Eurotiales and showed that the Onygenales share a common ancestor with species of the Eurotiales. The positions of the Dothideomycetes (Pleosporales, Phaeosphaeria nodorum) and Leotiomycetes (Helotiales, Sarcotrochila macrospora) among the various filamentous ascomycetes orders were not resolved in this analysis. However, the Rps3 data showed the connection between the members of the Microascales (i.e., Ceratocystis, Cornuvesica, Kernia pachypleura, and Sphaeronaemella) and species of the Hypocreales (Fig. 3), and showed that the genera Ophiostoma, Grosmannia, and Ceratocystiopsis, which contain species that morphologically resemble species of Ceratocystis, are phylogenetically only distantly related to the Microascales. Overall these findings on the convergent evolution of the ophiostomatoid fungi are in agreement with other studies based on rDNA analysis (Hausner et al. 1992, 1993; Spatafora and Blackwell 1994; Hausner and Reid 2004; Zipfel et al. 2006; Kolařík and Hulcr 2009).

Organellar intron encoded ORFs such as the group II intron encoded matK and matR ORFs have been used extensively in plant systematic studies (reviewed in Hausner et al. 2006). However, the utility of the intron encoded rps3 gene for resolving taxonomic relationships within the fungi might be problematic as these genes appear to be prone to insertions and deletions, can be a refuge for HEGs, and can be subject to recombination events that can generate unique fusion products that incorporate segments of other genes within the rps3 coding region.

Origin of the Intron Encoded rps3 Genes

Except for S. cerevisiae and N. crassa there are no functional studies confirming that mtDNA encoded rps3 genes are actually functional. Therefore, we examined a set of intron encoded rps3 genes in more detail to evaluate whether, based on the type of substitutions found within the gene, there is any evidence for selection, drift, or conservation of function. This was addressed by examining the frequencies of synonymous (dS) and non-synonymous (dN) changes in the rps3 ORFs (Fig. 4). Sequences from 36 different taxa including strains belonging to ophiostomatoid fungi were examined in a pairwise comparative analysis. The dS/dN ratio is suggested to be an indicator of the evolutionary pressures (if any) that may operate on a gene. For example, a dS/dN value >1.0 would suggest that the encoded protein is under functional constraint, i.e., natural selection is operating to minimize the number of amino acid changes, thereby maintaining the activity of the protein (Nei and Gojobori 1986). The dS/dN values obtained for all the rps3 sequences in relation to the outgroup Gelasinospora tetrasperma were consistently above 1.0 (Fig. 4), suggesting that the rps3 sequences are most likely not drifting and are probably under functional constraint. The conservation among the N- and C-termini of mtDNA encoded Rps3 protein sequences observed here and in previous studies (Bullerwell et al. 2000; Smits et al. 2007) would also support the view that these genes are functional. However, more experimental data are needed to show whether these rps3 genes produce functional proteins, how intron encoded rps3 genes are expressed, and how elements (such as HEGs) that have inserted into the rps3 ORF are removed, either post-transcriptionally or post-translationally, without diminishing the function of Rps3.

Fig. 4

Phylogenetic analysis of a codon-based rps3 nucleotide alignment for 36 different taxa covering 1179 positions (i.e., 393 codons). Tree topology shown is based on a 50% majority rule consensus tree obtained from Bayesian inference. Additional phylogenetic reconstruction methods used were distance (NJ), parsimony (DNAPARS) and Bayesian analysis. All three programs yielded similar tree topologies. The numbers at the nodes indicate the level of bootstrap support obtained with NJ and PARS, respectively, and the third number below the line represents the posterior probability values obtained from Bayesian analysis. The branch lengths are proportional to the number of substitutions per site (see scale bar). The table beside the phylogenetic tree shows the SynScan results. The dS/dN values entered to the right of the species/strain names are based on comparative analysis with the rps3 sequence of Gelasinospora tetrasperma, the latter also serving as the outgroup for the phylogenetic analysis.

Fig. 5

A schematic phylogenetic tree based on a multigene phylogeny (Blackwell et al. 2006) that provides an overview of relationships among some members of the Ascomycota. Indicated on the phylogenetic tree are potential gain and loss events for the rnl-U11 intron (mL2449) and its encoded rps3 gene. Within members of the Saccharomycetales the rnl-U11 region may contain a group I intron (mL2449), referred to as the omega intron, that encodes the homing endonuclease I-SceI (Dujon 1980; also see Goddard and Burt 1999), and var1 (rps3) is a free-standing gene. In contrast, within the examined members of the Sordariomycetes (node S; Sordariales, Ophiostomatales, Diaporthales, Microascales, Hypocreales) and Eurotiomycetes (node E; Eurotiales and Onygenales) the rps3 gene is encoded within the mL2449 group I intron. The sole member of the Helotiales examined so far also had the mL2449 intron version of rps3, but within the Pleosporales (Dothideomycetes; node D in Fig. 3) two events were noted: (1) in Phaeosphaeria nodorum, the loss of the mL2449 group I intron and a relocated free-standing rps3 gene; and (2) in Mycosphaerella graminicola, the loss of both the mL2449 intron and the rps3 gene (see text).

It is tantalizing to speculate on how the rps3 gene was incorporated within the mL2449 intron. In Saccharomyces species the mL2449 intron, if present, encodes the omega LAGLIDADG homing endonuclease gene and var1 is a free-standing gene (Dujon 1979, 1980; Goddard and Burt 1999), in contrast to many filamentous ascomycetous fungi where the mL2449 intron encodes the rps3 gene (Fig. 5). The mL2449 intron/rps3 ORF arrangement may be advantageous, as the ribosomal protein gene is co-transcribed (and co-regulated) with the rnl RNA, ensuring that the proper stoichiometric amounts of transcripts for both components needed to assemble ribosomes are produced.

The free-standing var1 gene in Schizosaccharomyces pombe (previously designated urf a) and the nuclear counterparts of rps3 in humans and Drosophila melanogaster have been shown to have multiple activities, such as involvement in DNA repair and potential endonuclease activity (Neu et al. 1998; Hedge et al. 2001; Jang et al. 2004). Thus it is possible that the ancestor of the intron encoded rps3 gene was a free-standing version whose product had endonuclease activity and could therefore act like a homing endonuclease, inserting a copy of itself into a group I intron via a recombination pathway. Eventually the free-standing ancestral copy of the intron encoded rps3 gene degenerated and was lost.

The mL2449 group I intron belongs to class IA1. It is transcribed along with the host gene and spliced out from the host transcript; thus the intron would be a neutral location for inserted elements. This intron might be unusual, as it has been shown for the N. crassa version of mL2449 that host factors are required for in vitro splicing of this intron from the rnl precursor RNA (Guo et al. 1991; Kittle et al. 1991; Mohr et al. 2002). Thus mL2449 may not be a self-splicing intron; it may have degenerated to the point that host factors are essential for splicing. The cost of maintaining this intron might be offset by the presence of the rps3 gene. The loss of the P. nodorum mL2449 intron might be due to the relocation of the rps3 gene, as selection would not favor an element that is not beneficial and requires host resources for its maintenance.

An alternative to the above is based on the observation that the intron encoded rps3 coding region is sometimes fused to a LAGLIDADG type HEG (Hausner et al. 1999; Gibb and Hausner 2005). Thus one could propose that a free-standing rps3 ancestor was invaded by a HEG and that the HEG later led to the mobilization/transposition of the entire fusion ORF (rps3/HEG) into a group I intron. The intron either was already located within the rnl-U11 region or at a later point inserted into the L2449 location. HEGs are optional elements, and thus the HEG may have eventually degenerated (Goddard and Burt 1999).

Intron encoded rps3 genes have so far been noted within the filamentous ascomycetous fungi, such as members of the Sordariomycetes and Eurotiomycetes, but many groups of fungi have yet to be analyzed for the presence of rps3. Only one member of the Helotiales was examined in this study (Sarcotrochila macrospora, a causative agent of needle blight in hard woods; Fig. 5) and it has an mL2449 encoded rps3 gene. However, within the Pleosporales the rps3 gene was either lost from the mtDNA (Mycosphaerella graminicola) or relocated to be a free-standing mtDNA gene (Phaeosphaeria nodorum). So it appears that the mtDNA genome is very dynamic and genes such as rps3 can be relocated, expanded in size, and periodically lost (Fig. 5). The latter suggests that there might be a certain degree of redundancy that allows nuclear genes to compensate for the loss of the mitochondrial encoded rps3 gene. Although this study examined only one mtDNA ribosomal gene, it might be representative of what can happen to ribosomal protein genes, and this may explain why fungal mtDNAs appear to be in constant flux with regard to size, gene arrangement, and the internal organization (introns/exons, expansion segments) of genes.