Introduction

The complete mitochondrial DNA (mtDNA) sequences of one bryophyte, Marchantia polymorpha (Oda et al. 1992), and four angiosperms, Arabidopsis thaliana (Unseld et al. 1997), Beta vulgaris (Kubo et al. 2000), Oryza sativa L. (Notsu et al. 2002), and Brassica napus (Handa 2003), are currently our blueprint for the comparative analysis of land plant mitochondrial genomes. This representation of land plants is clearly skewed towards angiosperms and model systems, and a broader perspective is needed to enhance our comparative understanding of the evolutionary dynamics of land plant mitochondrial genomes. Comparisons between liverwort and the aforementioned flowering plant mitochondrial genomes suggest an evolutionary trend towards a decrease in coding capacity (Oda et al. 1992; Handa 2003). Therefore, the basal groups of land plants are more likely to have retained a complement of mitochondrially encoded genes similar to that of their green algal ancestor (Turmel et al. 2003). These mitochondrial genes encode components of respiratory complexes (nad, sdh, cob, and cox), rRNAs and tRNAs, ribosomal proteins (rps and rpl), a group of proteins involved in cytochrome c biogenesis (ccb), and a number of other open reading frames (orfs) (Handa 2003 and references therein). During angiosperm evolution, distinct lineages have experienced frequent independent transfers of mitochondrial genes to the nucleus. This has led, particularly in the case of ribosomal proteins, to a variable distribution of mitochondrially encoded genes among extant angiosperms (Adams and Palmer 2003; Sandoval et al. 2004). Intriguingly, it has recently been shown that genes lost early during evolution can be reacquired by the mtDNA in individual angiosperm lineages either via intracellular gene transfer (IGT) or via plant-to-plant horizontal gene transfer (HGT) (Bergthorsson et al. 2003).

In land plant mitochondria, protein-coding genes frequently contain introns of groups I and II (Lehmann and Schmidt 2003). In the Marchantia mtDNA 32 introns of both types are present (Oda et al. 1992), while most of the approximately 25 introns identified to date in flowering plant mitochondria (Unseld et al. 1997; Kubo et al. 2000; Notsu et al. 2002; Handa 2003) have been categorized as group II introns. The sole group I intron reported to date in the mtDNA of all vascular plants is located in the gene encoding the subunit 1 of cytochrome oxidase (cox1) (Cho et al. 1998). Plant mitochondrial group II introns are found in either cis or trans configuration (Morawala-Patell et al. 1998) and can assume a canonical secondary structure with six helical domains (dI–dVI) radiating from a central core (Lehmann and Schmidt 2003 and references therein). However, group II introns can be further subdivided into groups IIA and IIB according to anatomical and structural differences (Lehmann and Schmidt 2003). The definition of functional domains of higher plant introns is based by analogy on data from yeast self-splicing intron models (Lehmann and Schmidt 2003). Plant mitochondrial group II introns are poorly catalytic RNAs and are incapable of self-splicing in vitro. Although many group II introns contain an orf, which encodes a multifunctional protein involved in splicing and mobility, plant mitochondrial group II introns are orf-less and most of them are derivatives of mobile introns (Lehmann and Schmidt 2003). Up to now only nad1i4 has been found to contain a maturase-type (matR) gene (Wahleithner et al. 1990). It is intriguing that several group II introns evolved from cis- to trans-splicing during pteridophyte and angiosperm evolution (Malek et al. 1997; Qiu and Palmer 2004).

The pattern of intron composition and distribution among mtDNAs of green algae and land plants is consistent with the idea that the great majority of introns originated after the emergence of land plants and that the majority of the liverwort introns arose independently from their counterparts in angiosperms (Turmel et al. 2003).

Although mitochondrial group II introns are mostly vertically inherited among angiosperms, the particular distribution of introns among land plants has been interpreted as suggesting plant-to-plant horizontal transfer (Won and Renner 2003). Uncovering the history of introns in plant mitochondria is thus complicated by their different modes of inheritance. Therefore, the evolutionary history of introns and the relative contributions of intron loss and acquisition to the evolution of mitochondrial genes remain, on the whole, poorly understood.

Regulation of land plant mitochondrial gene expression is posttranscriptionally mediated both at the level of splicing of intronic sequences and through an mRNA editing process that rectifies DNA genetic information so that functionally competent and evolutionarily conserved polypeptides are translated (reviewed by Gray 2003). These nucleotide amendments are enzymatic transformations of cytidines (Cs) to uridines (Us) and occasionally Us-to-Cs in the mRNAs (Gray 2003). Editing of mitochondrial transcripts is widespread within the land plants, occurring in all major groups, including the Bryophyta (Gray 2003). Nonetheless, as studies of RNA editing have been extended to a broader selection of land plants, some intriguing differences have started to emerge (Perrotta et al. 1996; Lu et al. 1998).

To date, no extensive molecular studies have considered the mitochondrial genomes of extant gymnosperms. These land plants occupy an evolutionary position between ferns and angiosperms, a key node in plant evolution, particularly with respect to the study of seed plant origins and early divergence (Pryer et al. 2001). Therefore, our analysis has been extended to one of the major, but understudied, land plant lineages: the cycads, historically treated as the most primitive extant seed plants and, thus, often termed living fossils (Rai et al. 2003).

Here we focus on the rps3 gene in Cycas revoluta mitochondria and report its structure, its genomic organization in a cluster with the rps19 and rpl16 genes, and its cotranscription with them. Furthermore, RT-PCR analysis demonstrated that the Cycas rps19rps3rpl16 transcripts undergo accurate processing by RNA editing and splicing. To gain insights into evolutionary affiliations between land plants, we also report here, for the first time, the rps3 locus in two distantly related flowering plants, Magnolia and Helianthus.

Unexpectedly, comparative analysis revealed that in contrast to other plant species, the rps3 orf in Cycas mitochondria harbors a novel second cis-splicing intron in addition to the group II intron present at the same location in angiosperms. This extra rps3 intron seems to be positionally conserved in other representatives of gymnosperms.

These original findings and their implications for the evolutionary history of the land plant mitochondrial genome and its group II introns are discussed.

Materials and Methods

Plant Material

Ovules and leaves of Cycas revoluta were mostly provided by the Dipartimento di Scienze Botaniche at the Università degli Studi of Pisa (Italy) and, partly, by the Orto Botanico of the Facoltà di Scienze at the Università degli Studi “Federico II” of Naples (Italy).

Isolation, Analysis, and Sequencing of Nucleic Acids

Total cellular DNA from Cycas leaves was extracted following the CTAB procedure according to Doyle and Doyle (1990). Mitochondrial nucleic acids were extracted from ovules of C. revoluta as described previously (Perrotta et al. 1996). All DNA and RNA manipulations were performed following standard techniques (Sambrook and Russell 2001).

PCR Amplification, cDNA Synthesis, and Sequencing

About 0.1 μg of Cycas mtDNA was amplified with external primers specific for the rps19rps3rpl16 locus (see supplementary Fig. 1) and AmpliTaq Gold (Perkin Elmer). Amplification conditions were as follows: denaturation for 9 min at 94°C, followed by 30 cycles of 30 s at 94°C, 30 s at 58°C, and 3 min at 72°C, and a final elongation step of 7 min at 72°C. PCR amplification products of Cycas mtDNA were purified using a QIAquick PCR Purification Kit or QIAquick Gel Extraction Kit (QIAGEN) and sequenced on both strands with an automated DNA sequencer (ABI Prism 310, Applied Biosystems, USA).
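As a quick check on the cycling program just described, the following minimal sketch adds up the programmed hold times. It assumes that ramping between temperatures is ignored; the function name `program_hold_time_s` is hypothetical and chosen only for illustration.

```python
def program_hold_time_s(initial_s, cycles, step_s, final_s):
    """Sum of programmed hold times in seconds (temperature ramping ignored)."""
    return initial_s + cycles * sum(step_s) + final_s


# Values taken from the text: 9 min at 94 °C, then 30 cycles of
# 30 s (94 °C), 30 s (58 °C), and 3 min (72 °C), then 7 min at 72 °C.
GENOMIC_PCR_S = program_hold_time_s(9 * 60, 30, (30, 30, 3 * 60), 7 * 60)

print(GENOMIC_PCR_S / 60)  # 136.0 minutes of programmed hold time
```

The real run time on a thermocycler would be somewhat longer, because heating and cooling between steps are not counted here.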

Figure 1

Comparison of the genomic organization of the rps19, rps3, and rpl16 gene cluster in Cycas, Magnolia, and Helianthus mitochondrial DNA. Coding regions for the three ribosomal protein genes are shown as open boxes, while grey triangles indicate the two rps3 intervening sequences (rps3i1 and rps3i2). Vertical lines below each coding region identify the editing sites found in the sequenced rps19rps3rpl16 cDNAs of Cycas, Magnolia, and Helianthus, while vertical bars with filled squares indicate the editing events found in the rps3rpl16 overlap in Cycas and Helianthus. Arrows show Cs found edited in all compared transcripts.

First-strand cDNA synthesis was performed on 1 to 5 μg of CsCl-purified total mitochondrial RNA from Cycas, after extensive treatment with RNase-free DNase I (GIBCO-BRL), with random hexamers and according to the instructions for the SuperScript Preamplification System for First Strand cDNA Synthesis. The resulting cDNAs were amplified by PCR with the same primers (see supplementary Fig. 1) and conditions as described for mtDNA amplification, except that the number of amplification cycles was raised to 35. RT-PCR products were directly sequenced as described above.

Sequence Analysis

The novel Cycas revoluta, Helianthus annuus and Magnolia liliiflora DNA and cDNA sequences were deposited in the database under accession nos. AY345867, AF319170, and AF319171, respectively. The Cycas as well as the Helianthus and Magnolia mtDNAs, cDNAs, and deduced amino acid sequences were compared with the sequences available in the databases (EMBL, GenBank).

To analyze the rps3i1 in land plants, nine sequences were retrieved from GenBank using the accession numbers listed in Table 1.

Table 1 Plant species sampled for this study and sizes and GenBank accession numbers of the rps3i1 sequences

The rps3i1 sequences were first analyzed pairwise with the DotPlot program of the GCG package version 9.1 to identify common regions. These results were used to manually refine ambiguous regions of a Clustal W multiple alignment using the sequence editor GeneDoc (Higgins et al. 1994; Nicholas et al. 1997). To determine the secondary structure model for the Cycas rps3i1, the “domain-by-domain” approach of Kelchner was adopted (Kelchner 2000). According to this approach, each domain was folded using the domain boundary sequences and specific structural elements available from the previously predicted secondary structure of the Alnus rps3 intron (Laroche and Bousquet 1999). In addition, the PFOLD program (Knudsen and Hein 1999) was used to validate each folded domain and to identify new base pairing where domain boundaries could not be identified.
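The idea behind a pairwise dot-plot comparison can be sketched as follows. This is a generic exact word-match version written for illustration, not the algorithm of the GCG DotPlot program; the function names and the `word` parameter are assumptions.

```python
from collections import defaultdict


def dotplot_hits(seq_a, seq_b, word=11):
    """Return (i, j) pairs where seq_a[i:i+word] equals seq_b[j:j+word]."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)
    # Look up each word of seq_a; every shared word becomes one dot.
    hits = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            hits.append((i, j))
    return hits


def diagonals(hits):
    """Group dots by diagonal offset (j - i); long runs mark shared regions."""
    by_offset = defaultdict(list)
    for i, j in hits:
        by_offset[j - i].append(i)
    return dict(by_offset)


# Toy sequences (not real intron data) sharing the word ACGTA.
print(dotplot_hits("GGGACGTACGG", "TTACGTATT", word=5))  # [(3, 2)]
```

Runs of dots on one diagonal correspond to regions common to both introns, whereas a shift from one diagonal to another indicates an indel between them.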

The complete sequences of the two rps3 introns from Cycas revoluta were submitted as independent queries to a BLAST (Altschul et al. 1997) search against a nonredundant primary database. In addition, FASTA3 (Pearson and Lipman 1988) and BLASTX (Altschul et al. 1997) were used to identify sequences related to rps3i2. Values of thermodynamic stability for some regions of the rps3i1 and for the Cycas rps3i2 dV were estimated by means of the energy minimization method implemented in Mfold by Zuker (2003). The clustering analysis for the same dV of the Cycas rps3i2 was performed with MEGA2 (Kumar et al. 2001). Finally, direct and inverted repeats within both Cycas rps3 intron sequences were identified by means of the Repfind program (Betley et al. 2002).

Results and Discussion

Genomic Environment and Sequence Analysis of the rps3 Locus in Cycas Mitochondria

The complete sequence of the rps3 gene and its flanking regions on the Cycas mtDNA (about 7 kb) was determined by a PCR-based strategy employing sets of rps3-specific primers (see supplementary Fig. 1). The sequences of the rps3 locus from Magnolia (more than 4 kb) and sunflower (6 kb) mitochondria were also determined for further comparison. Unexpectedly, sequence analysis revealed that, in contrast to the rps3 orf in Magnolia, sunflower, and other higher plants investigated to date, the Cycas rps3 coding region consists of three exons and two intervening sequences of 2984 and 1985 bp, respectively (Fig. 1). The three exons of 74 (exon 1), 193 (exon 2), and 1473 (exon 3) bp (Fig. 1) together specify a 579-amino acid S3 polypeptide. The two Cycas rps3 introns were named rps3i1 and rps3i2 by order of appearance in the orf (Pruchner et al. 2001). Exon–intron boundaries for each intron were determined by comparing Cycas genomic and cDNA sequences from the spliced rps3 transcripts (see below). Both Cycas introns are located near the 5′-end of the orf, but while rps3i1 is a phase 2 intron, rps3i2 is a phase 0 intron. The insertion site of this second rps3 intron is unique to the Cycas mitochondrial genome. The Cycas rps3i2 thus contributes a novel change in gene structure, giving rise to genetic variation upon which natural selection may act.
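The exon sizes given above are consistent with the stated intron phases and polypeptide length, as this short arithmetic sketch shows. It assumes that the 1473-bp exon 3 includes the stop codon; the function names are hypothetical.

```python
EXON_LENGTHS_BP = (74, 193, 1473)  # exon 1, exon 2, exon 3, as given in the text


def intron_phases(exon_lengths):
    """Phase of each intron: number of coding bases upstream of it, modulo 3."""
    phases = []
    upstream = 0
    for length in exon_lengths[:-1]:  # one intron follows every exon but the last
        upstream += length
        phases.append(upstream % 3)
    return phases


def protein_length(exon_lengths):
    """Amino acids encoded, assuming the last exon ends with a stop codon."""
    total = sum(exon_lengths)
    if total % 3:
        raise ValueError("coding length is not a multiple of 3")
    return total // 3 - 1  # subtract the stop codon


print(intron_phases(EXON_LENGTHS_BP))   # [2, 0]
print(protein_length(EXON_LENGTHS_BP))  # 579
```

The first intron follows 74 coding bases (74 mod 3 = 2, phase 2) and the second follows 267 (267 mod 3 = 0, phase 0). The 1740 coding bases give 580 codons, that is, 579 amino acids plus the stop codon.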

As deduced by Southern hybridization analyses, parallel sequencing, and nucleotide and amino acid sequence comparisons, the Cycas, Magnolia, and Helianthus mitochondrial rps3 genes share the same genomic context, with an upstream rps19 and a downstream rpl16 gene, maintaining a pattern that is highly conserved among prokaryotic, chloroplast, and plant mitochondrial genomes (Fig. 1) (Kumar 1995; Turmel et al. 2003). It is noteworthy that the rps19 encoded in sunflower mitochondria is most likely a nonfunctional pseudogene, because scattered deletions disturb the reading frame, introducing a frame shift and creating several consecutive stop codons. Likewise, the rps19 gene is disrupted in Arabidopsis mitochondria, where the functional S19 protein is nuclear-encoded and imported into the mitochondrion to sustain the translation apparatus of the organelle (Sánchez et al. 1996).

Altogether our findings suggest that the rps19rps3rpl16 cluster connects intact genes in the Cycas and Magnolia mitochondrial genome but not in Helianthus.

Processing of rps19–rps3–rpl16 Transcripts in Cycas Mitochondria

Cycas rps19–rps3–rpl16 cDNAs covering the entire coding region were obtained by RT-PCR using a specific primer set (see supplementary Fig. 1), sequenced, and then compared to the genomically encoded rps19, rps3, and rpl16 orfs.

RT-PCR analysis established that the rps19, rps3, and rpl16 genes were transcribed together as polycistronic mRNAs in Cycas, Magnolia, and Helianthus mitochondria (Fig. 1), as well as in other higher plants (Kumar 1995).

In addition, our RT-PCR approach (see supplementary Fig. 1) did not yield any detectable PCR amplicon derived from partially spliced rps3 mRNA molecules (data not shown), indicating that rps3 transcripts are efficiently spliced in Cycas as well as in Magnolia and Helianthus mitochondria (Fig. 1). Correct excision of both Cycas rps3 group II introns, rps3i1 and rps3i2, from the primary transcript in vivo appears, thus, to play an important role in the evolution of the rps3 gene in mitochondria and the host organism. It is likely that the insertion of an additional intron within the rps3 gene may be crucial for the mRNA stability or provide a new advantageous mechanism to control gene expression at the posttranscriptional level in Cycas mitochondria.

Sequence analysis of the cDNA population derived from the Cycas rps19rps3rpl16 transcripts established that transcripts undergo accurate mRNA processing by mRNA editing. Pronounced variations in the frequency of RNA edited sites were observed between the rps19rps3rpl16 transcripts from Cycas, Magnolia, and sunflower (Fig. 1). Forty-seven edited positions occur in the rps19rps3rpl16 transcript of Cycas, versus the 30 and 18 sites found in Magnolia and Helianthus, respectively (Fig. 1). Interestingly, the Cycas rps3 gene is the most extensively edited, involving 28 C-to-U nucleotide transitions with a rather uneven distribution along the three rps3 exons (Fig. 1). Most of the edits observed are nonsynonymous substitutions restoring phylogenetically conserved amino acid residues. As a result, the overall similarity of the Cycas encoded S19, S3, and L16 polypeptides increases in the comparison with their counterparts in other plant and nonplant mitochondria and in eubacteria (Bock et al. 1994).
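Editing sites of this kind are found by comparing the genomic coding sequence with the corresponding cDNA. The sketch below illustrates the comparison under stated assumptions: both sequences are spliced, in frame, and of equal length, and the standard genetic code applies. The function name and the toy sequences are hypothetical and are not the Cycas data.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_editing_sites(genomic_cds, cdna_cds):
    """List (position, kind, residue_before, residue_after) for each difference.

    Positions are 1-based. Residues are translated from the whole codon,
    so two edits in one codon are both reflected in residue_after.
    """
    genomic_cds = genomic_cds.upper().replace("U", "T")
    cdna_cds = cdna_cds.upper().replace("U", "T")
    if len(genomic_cds) != len(cdna_cds) or len(genomic_cds) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    kinds = {("C", "T"): "C-to-U", ("T", "C"): "U-to-C"}
    sites = []
    for i, (g, c) in enumerate(zip(genomic_cds, cdna_cds)):
        if g == c:
            continue
        start = i - i % 3  # first base of the affected codon
        before = CODON_TABLE[genomic_cds[start:start + 3]]
        after = CODON_TABLE[cdna_cds[start:start + 3]]
        sites.append((i + 1, kinds.get((g, c), "other"), before, after))
    return sites


# Toy example: one nonsynonymous (P to L) and one synonymous (F to F) edit.
for site in find_editing_sites("ATGCCATCATTC", "ATGCTATCATTT"):
    print(site)
```

A site whose residue changes between the two translations is nonsynonymous; one that leaves the residue unchanged is synonymous.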

This overall high degree of nucleotide conservation of the Cycas rps19rps3rpl16 transcripts reflects strong functional constraints on the encoded amino acid sequences. As a corollary, the Cycas rps19rps3rpl16 gene cluster whose transcript undergoes processing in the form of splicing and RNA editing is likely to be functional.

Our results confirm that mRNA editing occurs more frequently in the Cycas rps19, rps3, and rpl16 transcripts than in the counterparts of several flowering plants so far investigated, further supporting previous observations on the high frequency of RNA editing in gymnosperm mitochondria (Perrotta et al. 1996; Lu et al. 1998). All of the aforementioned evidence suggests that, in gymnosperms, mitochondrial gene expression is regulated much more efficiently at the RNA than at the DNA level.

It appears that the mitochondrial rps19rps3rpl16 gene cluster is evolving much faster in the long-lived Cycas than in the other investigated angiosperms. The editing-related RNA sequence evolution in gymnosperm mitochondria might induce an accelerated evolution of the Cycas rps19rps3rpl16 genomic locus by allowing accumulation of T-to-C transitions at the mtDNA level, compensating for the slowdown caused by the long generation time of these land plants (Perrotta et al. 1996; Lu et al. 1998).

Structural and Molecular Evolutionary Features of the Two rps3 Introns

A BLAST search (Altschul et al. 1997) indicated that the Cycas rps3i1 is most similar to the previously well-characterized group II intron present at the same position within the rps3 gene of several angiosperms (Fig. 1) (Laroche and Bousquet 1999). Interestingly, comparison among a set of 12 previously characterized mitochondrial rps3 introns from different monocots and dicots, including those of Magnolia and sunflower reported here, revealed that the rps3i1 of the gymnosperm Cycas, with a size of 2984 bp, is the largest rps3i1 discovered so far (Table 1).

Conversely, the rps3 intron sequence from the mitochondrial genome of Helianthus is only 976 bp long and appears to be the most reduced in size among the analyzed eudicots (Table 1).

The remarkable size heterogeneity between the rps3i1 sequences is mainly attributable to large indels. Only few differences in primary structure and a small number of differences in arrangements of direct and inverted repeats were detected among the compared mitochondrial rps3i1 sequences.

A secondary structure model for the rps3i1 intron sequence of Cycas was obtained by means of the “domain-by-domain” approach combined with the PFOLD program (Knudsen and Hein 1999) (see Materials and Methods). These combined approaches allowed us to identify for the Cycas mitochondrial rps3i1 a novel reliable secondary-structure model in complete accordance with the group IIA intron model proposed by Michel et al. (1989) (Fig. 2).

Figure 2

RNA secondary structure model of the mitochondrial rps3i1 intron in Cycas revoluta. The model was predicted from the rps3 gene nucleotide sequence. Roman numerals (I–VI) indicate the conserved domains of group II introns (dI to dVI in the text). According to the accepted secondary structure model for group IIA introns (Michel et al. 1989), the rps3i1 shows six major structural helices radiating from a central wheel of single-stranded segments. Exon and intron binding sites (EBS and IBS), the dVI bulging adenine (marked with an asterisk), and the γ-γ′ interactions are also depicted in the model. Large indels of 190, 155, 208, and 123 bp, which distinguish the Cycas rps3i1, were located in the loop of dIV, which appeared to be the largest and most variable domain. In contrast, dII was the smallest but the most conserved. Loops are not drawn to scale, and numbers inside the loops indicate their size. Compared with the dIII of the Alnus rps3 intron (Laroche and Bousquet 1999), a more specific and reliable base-pairing was identified for the corresponding domain in Cycas, showing a better fit with the mitochondrial consensus, as indicated by boldfaced nucleotides.

According to this model, dII was the smallest domain, while dIV, comprising 70% of the total intron sequence length, was the largest (Fig. 2).

Interestingly, several differences between the predicted folding of the dIII of the Cycas rps3i1 and the structural model previously proposed for the Alnus rps3 intron (Laroche and Bousquet 1999) were noted. However, given that this different base-pairing was the only one feasible and the most conserved for the dIII of the Cycas rps3i1, we believe that the dIII folding of the Alnus rps3 intron reported earlier (Laroche and Bousquet 1999) can be considered an exception rather than the rule.

Furthermore, a site similarity plot (Fig. 3) of the multiple structural alignment for rps3i1 from the plant species listed in Table 1 demonstrated that sites involved in domain base-pairing are among those most highly conserved. As a consequence, a different substitution pattern should be expected among the six domains of the rps3i1. A higher substitution rate was, indeed, detected by MEGA2 software (Kumar et al. 2001) in domains I, VI, and the large and variable dIV, whereas comparable evolutionary dynamics were detected for the II and III domains (Z-test, p < 0.05).

Figure 3

Similarity plot of mitochondrial rps3i1 sequences from land plants. The mitochondrial rps3i1 sequences listed in Table 1 were aligned according to the combined DotPlot and Clustal W approach (Higgins et al. 1994; Nicholas et al. 1997). The highest identity scores (>0.8) were found for the 3′-portion of the dI (P1), the dII and dIII (P2), and the dV and VI (P3) within the multiple alignment of rps3i1 sequences from the species listed in Table 1. Reduced similarity scores were detected in the dIV, where two hypervariable regions (Hp1–2) with identity less than 0.2 were identified. The unique large insertions (I1-4) of Cycas rps3i1 were also located in the large dIV and they corresponded to four regions with no identity in the similarity plot. A bar indicating nucleotide positions and domain boundaries is depicted above the plot.
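A similarity plot of this kind can be computed from a multiple alignment with a sliding window. The sketch below is one simple way to do so and is not necessarily the scoring behind Fig. 3: it assumes that column identity is the fraction of sequences carrying the most common residue, that gaps count as mismatches, and that the window size is a free parameter. All names are hypothetical.

```python
from collections import Counter


def column_identity(column):
    """Fraction of sequences sharing the most common non-gap residue."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def similarity_profile(alignment, window=3):
    """Mean column identity in each window along an alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    per_column = [
        column_identity([seq[i] for seq in alignment]) for i in range(length)
    ]
    return [
        sum(per_column[i:i + window]) / window
        for i in range(length - window + 1)
    ]


# Toy alignment of three sequences (not the rps3i1 data).
profile = similarity_profile(["ACGT", "ACGA", "AC-T"], window=2)
print([round(score, 3) for score in profile])
```

Windows scoring above 0.8 would correspond to conserved peaks such as P1 to P3 in the caption, and windows below 0.2 to hypervariable regions.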

On the whole, the overall number of substitutions per site calculated for the rps3i1 of the analyzed plant species (0.105 ± 0.006) (supplementary Table 1) was in agreement with the rates previously estimated for several mitochondrial group II introns (Laroche et al. 1997), but it was also comparable to the rate of nonsynonymous nucleotide substitutions per site of different mitochondrial exons (Laroche et al. 1997). Relative rate tests, conducted with the RRTREE program (Robinson-Rechavi and Huchon 2000) using Cycas as the reference taxon, revealed a higher substitution rate per site in monocots than in eudicots (p < 0.05) (see the matrix in supplementary Table 1).

In parallel, we analyzed the primary and secondary structural features of the additional 1.985-kb intervening sequence within rps3 in the Cycas mitochondrial genome (rps3i2). Sequence analysis revealed an unexpectedly high similarity (97%) between a 38-bp stretch of rps3i2 and rps3i1, showing a secondary structure (Fig. 4A) consistent with those previously published for dV of plant mitochondrial group II introns (Knoop et al. 1994).

Figure 4

A Similarity relationships between the dV regions of homologous land plant cis-group II introns detected by FASTA. B The secondary structure model of Cycas rps3i2 dV and dVI. The dV consists of one proximal and one distal stem, of 9 and 5 bp, respectively, joined by a dinucleotide insertion (AC), with the distal stem closed by a four-base loop. The strictly conserved positions (nucleotides 2–4), the AGC triads, and nucleotides 18 (A) and 31 (G), important for dV function, have also been detected (Qin and Pyle 1998). An asterisk indicates the bulged A residue in dVI.

Most surprisingly, FASTA results revealed that the 3′-portion of the Cycas rps3i2 was highly similar to other group II introns interrupting plant mitochondrial genes (Fig. 4A). The highest score was on a 233-bp stretch belonging to the cobi2 of Marchantia (Ohyama et al. 1993). Furthermore, matches higher than 70 and 80% were observed on a 120-bp stretch in the nad5i1 either in Vicia (Scheepers et al. 2001) or in Beta (Kubo et al. 2000) and on a 76-bp stretch of the nad7i2 of the moss Takakia mitochondria (Pruchner et al. 2001), respectively. Interestingly, secondary structure modeling of all the aforementioned sequences revealed the presence of an intact and well-conserved dV of group II introns (Fig. 4A). This close fit with the consensus (only 3 mismatches in 34 compared base pairs), together with the above-reported essential features of the dV in the newly identified Cycas rps3i2, strongly indicates that it belongs to the well-characterized group II introns. The significance of the identification of a dV is confirmed by the presence of an adjacent dVI (Fig. 4B) that, although less canonical, shows a six-nucleotide helix possessing a bulging adenosine, which allows prediction of the 3′-splice site (Lehmann and Schmidt 2003). Indeed, the helical dV and dVI near the splice site are the most highly conserved structures of group II introns and have already been successfully used as a specific marker for group II intron identification (Knoop et al. 1994). Unfortunately, despite a conserved sequence at the 5′- (GGGYG) and 3′- (AT) ends, an extensive search failed to identify other intron structural motifs or tertiary interactions such as EBS1–IBS1 and EBS2–IBS2, confirming the high structural flexibility existing among organellar group II intervening sequences (Lehmann and Schmidt 2003).

As a result, the lack of similarity at the rps3i2 5′-portion compared with any well-known group II introns registered in databases prevented the application of standard comparative methods to obtain a complete secondary structure for this Cycas rps3i2.

However, the nonrandom nature of the expected FASTA parameters (E values ranging from 2.6e-10 to 3.8e-05) together with the conservative nature in primary and secondary RNA structure of the retrieved stretches suggests a strong selective pressure on the rps3i2 dV due to functional constraints. Even in the absence of additional strong similarities with other intron domains, the significance of rps3i2 dV similarities could possibly be explained by a common origin for both rps3 introns and the other cis-splicing group II introns or, at least, for their 3′-region including dV (Fig. 4A). The evolutionary relationship between Cycas rps3i1 and rps3i2 dV as well as between these Cycas domains and their counterparts in Marchantia cobi2 (Ohyama et al. 1993) and Takakia nad7i2 (Pruchner et al. 2001) is depicted by the cluster analysis in Fig. 4A. Unlike the Cycas rps3i1, the second rps3 intervening sequence contains numerous repeats, mainly located in its 5′-region. Over a stretch of 123 nucleotides four direct repeats, each 28 bp long, have been found, suggesting a high potential for recombination for this intronic region.
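Direct repeats such as those described above can be located by indexing every word of a given length and keeping the words that occur more than once on the same strand. The sketch below is a simple exact-match illustration and is not the algorithm of the Repfind program; the function name and the toy sequence are hypothetical.

```python
from collections import defaultdict


def direct_repeats(seq, k, min_copies=2):
    """Map each k-mer occurring at least min_copies times to its 0-based starts."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {
        word: starts
        for word, starts in positions.items()
        if len(starts) >= min_copies
    }


# Toy sequence (not the intron itself) with three copies of AAGCTT.
print(direct_repeats("AAGCTTGGAAGCTTCCAAGCTT", k=6))  # {'AAGCTT': [0, 8, 16]}
```

For the intron region described in the text, one would set `k=28` and look for a word present in four copies within the 123-nucleotide stretch. Exact matching will miss imperfect repeat copies.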

In addition, an advanced search in protein databases, and further investigations by means of InterPro, showed a high similarity between a stretch of 214 bp upstream of the dV of the Cycas rps3i2 and the ORF760 present on the Chara mtDNA (Turmel et al. 2003), which harbors two functional domains, for a maturase and a reverse transcriptase. It thus appears that the ancestral group II intron that gave rise to the Cycas rps3i2 likely encoded a multifunctional orf that is now partly degenerated. Therefore, the Cycas rps3i2 could belong to a family of retroelements of still unknown origin (Lehmann and Schmidt 2003).

Evolutionary Considerations on the rps3 Intron Composition and Distribution

During the evolution of land plants, the rps3 locus has undergone multiple changes in its intron content. In the liverwort Marchantia (Oda et al. 1992) the rps3 gene is devoid of introns, while the rps3 genes of Chara and of the angiosperms investigated to date (Handa 2003; Turmel et al. 2003), including Magnolia and Helianthus reported here, harbor one positionally conserved intron. Only in Beta mtDNA is rps3 without introns (Kubo et al. 2000). Surprisingly, the rps3 orf in Cycas harbors two group II introns, rps3i1 and rps3i2. To the best of our knowledge, this is the first time that a second, novel group II intron has been found within the mitochondrial rps3 gene in plants.

According to a PCR assay with rps3i2-specific primers (see supplementary Fig. 1 and data not shown), the presence of rps3i2 appears to be a shared feature of the mitochondrial rps3 gene in Cycas and Ginkgo and, thus, a distinctive intron signature in gymnosperms. An upcoming larger-scale survey of additional gymnosperms, closest relatives to seed plants, and basal lineages of vascular plants for the presence of rps3i2 will allow us to determine the distribution pattern and to identify the point of acquisition of this intron.

Consistent with the results reported in this study, the currently known distribution of the rps3 introns raises several alternative evolutionary scenarios, although the lack of data for extant relatives of seed plants does not allow a selection between them.

The presence of closely related rps3i1 at identical positions in Chara and land plant rps3 gene suggests an earlier gain of this intron in a common ancestor of algae and land plants (Turmel et al. 2003). Although an intron positionally and structurally homologous with rps3i1 is not present in the Marchantia mtDNA (Oda et al. 1992), it has been suggested that the charalean rps3 intron might have given rise to its seed plant homolog via vertical descent (Turmel et al. 2003). To complete this picture, we have to consider a subsequent step of complete loss of the rps3i1 at least once during angiosperm evolution within the time frame of the evolutionary diversification of the lineage leading to Beta. Several cases of intron loss have, indeed, been documented for land plant species (Qiu et al. 1998; Beckert et al. 1999; Pruchner et al. 2002).

The serendipitous finding of a second stable intron (rps3i2) at a novel insertion site of the rps3 gene in gymnosperms seems to suggest that this intron was independently gained in gymnosperms, likely at the time of, or soon after, the divergence of the angiosperms. The absence of an rps3i2 homolog from relatives in either algal lineages or Marchantia also seems to be consistent with a relatively recent acquisition of the rps3i2 by gymnosperms.

Another possible evolutionary scenario is an ancestral gain of the rps3i2 in the last common progenitor of seed plants followed by its subsequent loss in distinct lineages at the time of the evolutionary appearance of angiosperms. Therefore, since Cycads and angiosperms seem not to share a direct common ancestor (Rai et al. 2003), both rps3i1 and rps3i2 would have been retained exclusively throughout gymnosperm evolution. Lineage-specific selective pressure or dependence on specific host-encoded splicing factors would have contributed to the loss of rps3i2 in angiosperms and other land plant lineages.

Therefore, our current knowledge about the distribution of the rps3i2 in land plants does not allow the inference of how this intron was inherited. Thus, the origin of the second rps3 intron present in the gymnosperm mtDNA remains enigmatic.

However, the lack of similarity of Cycas rps3i2 to any known sequences registered so far in databases does not provide direct support for horizontal transfer and points to a vertical descent from an ancestral mitochondrial genome containing it during the evolution of land plants. Interestingly, a mitochondrial ancestry of the rps3i2 is also suggested by the fact that its dV is closely related to the corresponding domain of a group of other introns, some of which are located in different genes or even located in the same mitochondrial gene at a different insertion site in mitochondria of bryophytes such as Marchantia (Oda et al. 1992) and Takakia (Pruchner et al. 2001), as well as in angiosperms (Fig. 4A). This result, in addition to the similarity to the Chara ORF760 with functions of maturase and reverse transcriptase (Turmel et al. 2003), suggests a common evolutionary origin of all those group II introns and also suggests that they have characteristics of elements mobile between different mitochondrial genes (Ohyama et al. 1993; Zanlungo et al. 1995).

Limiting the number of speculations and considering the best-known mechanisms cited to explain intron gain or loss in plant mitochondria (Lehmann and Schmidt 2003), we believe that the simplest explanation is that rps3i2 has been acquired in gymnosperms via intragenomic transposition or, alternatively, through reverse transcriptase–mediated movement into novel mtDNA sites (retrotransposition). The origins of most of the group II introns in Chara mtDNA and several group II introns in Marchantia mtDNA (Ohyama et al. 1993) have also been attributed to intragenomic transposition events (Turmel et al. 2003).

The evolutionary dynamics of the rps3 gene and its structural changes are likely to provide us with a broader evolutionary perspective for new mitochondrial genomic endeavors and diverse molecular innovations that have characterized the rich botanical diversity that dominates our terrestrial ecosystem.