Introduction

Massive gene transfer from the mitochondrion to the nucleus is known to have occurred early in eukaryotic evolution following endosymbiosis (reviewed in Gray et al. 1999), and migration is still ongoing in plants, albeit sporadically (reviewed in Adams and Palmer 2003). This contrasts with many other eukaryotes where alterations to the mitochondrial genetic code have effectively stopped successful gene movement. In flowering plants, the mitochondrial ribosomal protein genes exhibit a greater propensity for migration to the nucleus than the respiratory chain genes, and at most four large subunit ribosomal protein genes and eleven small subunit ones remain in the organelle, although only a subset is typically seen in any given plant species. This is fewer than for the non-vascular plant Marchantia polymorpha which still encodes sixteen in the mitochondrion (Takemura et al. 1992) and is only a small proportion of the 54 or so bacterial-type proteins in the present-day plant mitoribosomes, the rest being nuclear-encoded (cf. Bonen and Calixte 2006). Initial insights into the extent and variability of gene loss from the mitochondrion among flowering plants came from Southern hybridization surveys (reviewed in Adams and Palmer 2003) and these have been extended by the mitochondrial genome sequence data now available (cf. NCBI Organelle Genome Resources website). Information about nuclear-translocated gene copies has for the most part come from individual gene analysis (reviewed in Adams and Palmer 2003; Liu et al. 2009), from surveys of plant nuclear genomes for particular gene sets, such as the ribosomal protein ones (cf. Bonen and Calixte 2006), and from large-scale analyses of genomes representing various eukaryotic lineages (cf. Maier et al. 2013; Kannan et al. 2014). Analysis can be complicated by the presence of non-functional mitochondrial DNA (Numt) sequences in nuclear genomes, although it is worth noting that in plants, successful transfer also requires an RNA intermediate, since virtually all mitochondrial coding sequences undergo C-to-U RNA editing to generate the correct amino acid sequence (reviewed in Takenaka et al. 2013) and the nuclear expression system contains neither such editing machinery nor the machinery for splicing mitochondrial group II introns.

Successful transfer to the nucleus not only requires integration of the mitochondrial gene into a “hospitable” genomic site, but also acquisition of regulatory elements for proper expression and targeting signals so that the protein can be imported back into the mitochondrion. Amino-terminal targeting signals are sometimes acquired from duplicated copies of genes that specify other mitochondrion-destined proteins, sometimes from unknown sources, and in some cases, there is no additional extension. About 25 % of the nuclear-located mitochondrial ribosomal protein genes in Arabidopsis and rice fall in this latter category (Bonen and Calixte 2006). In one interesting case in grasses, a mitochondrial rps14 gene has been inserted within the intron of a gene encoding another mitochondrial protein (sdh4) and is expressed through alternative splicing (Figueroa et al. 1999; Kubo et al. 1999). Additional protein-coding domains are sometimes acquired by the translocated gene, as for the RNA-binding domain fused to rps19 in Arabidopsis (Sánchez et al. 1996). Implicit in evolutionary scenarios of functional mitochondrion-to-nucleus gene transfer is that there will be a period in which active copies are present in both compartments (Brennicke et al. 1993; Adams and Palmer 2003; Bonen 2006), after which the redundant copy degenerates into a pseudogene and is eventually lost. Little is known about factors that influence the length of such periods of co-existence or the eventual outcome (i.e. which copy “wins out”). In some cases, the transition period may be extremely short, based on variation seen among families within Silene vulgaris for mitochondrial rpl5 and rps13 gene status (Sloan et al. 2012). To our knowledge, there are only four documented cases of active copies of genes encoding fundamental mitochondrial functions that are simultaneously present in both the mitochondrion and the nucleus: atp9 in Neurospora crassa (van den Boogaart et al. 1982), cox2 in certain legumes (Nugent and Palmer 1991), rpl5 in wheat (Sandoval et al. 2004) and sdh4 in Populus (Choi et al. 2006).

Although unsuccessful mitochondrion-to-nucleus gene transfers may be difficult to detect because the final outcome is as though the event had never happened, it was possible in the case of the mitochondrial rps19 gene in grasses to infer that the mitochondrial copy “won out” in the rice lineage (Fallahi et al. 2005). In present-day rice, this mitochondrial gene is located immediately downstream of, and co-transcribed with, an intron-containing rpl2 gene in an ancestral bacterial-type linkage (Kubo et al. 1996), and there is no functional rps19 gene copy in the nucleus. In maize, the mitochondrial rps19 gene has been lost, and in wheat mitochondria it is a 5’-truncated pseudogene (Fallahi et al. 2005). There is strong support for the rps19 gene transfer having occurred in the ancestor of these grasses, because the nuclear-located rps19 genes in lineages that diverged before and after the rice lineage, namely maize/sugarcane and wheat/barley, respectively, share the same acquired amino-terminal targeting sequence derived from the hsp70 presequence as well as a number of amino acids that are absent from the “native” mitochondrial rps19 genes (Fallahi et al. 2005). Such features are most simply explained by a single transfer to the nucleus in the common ancestor of these grasses, which diverged about 60 million years ago (Kellogg and Bennetzen 2004). For these reasons, we were interested in examining the status of rps19 and rpl2 in another grass, brome (Bromus inermis), which shared a common ancestor with wheat about 20 million years ago (Gaut 2002; Kellogg and Bennetzen 2004), with the aim of gaining insight into the potential length of the period of co-existence and the parameters that might influence the eventual success of the translocated copy.

Materials and methods

Mitochondrial DNA and RNA were isolated from 6- to 9-day etiolated seedlings of brome grass (Bromus inermis), barley (Hordeum vulgare cv. OAC Kippen) and oats (Avena sativa cv. 0A974-1) using previously described procedures (Subramanian and Bonen 2006). Brome seeds were purchased from Ritchie Feed and Seed Inc (Ottawa Canada) and other seeds were kindly provided by Dr. R. Pandeya (Agriculture and Agri-Food Canada, Ottawa). Total DNA was isolated from brome seedlings using standard protocols with the modifications described by Hazle and Bonen (2007), except that the final CTAB step was omitted. Total RNA was isolated using Trizol® (Invitrogen) according to the manufacturer’s specifications.

The brome mitochondrial and nuclear DNA regions of interest were obtained primarily by PCR using primers based on conserved regions from other grasses. Oligomer sequences are given in Supplementary Table S1. The brome mitochondrial DNA region preceding ψrpl2-rps19 was obtained by inverse PCR using DNA restricted with HindIII and circularized with DNA ligase (Invitrogen) prior to PCR amplification. For RNA editing analysis, brome mitochondrial RNA from 6-day etiolated seedlings was treated twice with DNase I (Amersham) prior to cDNA synthesis with MMLV reverse transcriptase (Invitrogen) at 37 °C for 2 h and subsequent PCR amplification. Reactions in which reverse transcriptase was omitted were also performed as controls. To examine expression of the nuclear-located rps19, rpl2 and atpβ genes, total RNA without DNase I treatment was used in RT-PCR experiments with the primers given in Supplementary Table S1. PCR and RT-PCR amplification products were gel purified using UltraClean 15 (MoBio Laboratories Inc.) and, after corroboration by nested PCR, they were sequenced either directly or after cloning into pGEM-T Easy plasmid vectors (Promega). Sequencing was performed by StemCore Laboratories at the Ottawa Health Research Institute, Ottawa, Canada. Sequences have been deposited in NCBI GenBank with accession numbers KT022083-KT022085.

For comparative sequence analysis, rps19 homologues from other grasses were retrieved from NCBI databases (nr, EST and high-throughput genomic) and EMBL-EBI Ensembl (plants.ensembl.org), the latter for wheat and barley draft nuclear genomes. Accession numbers are given in Supplementary Table S2. Sequence alignments were carried out using MUSCLE (Edgar 2004) and prediction of protein targeting to the mitochondrion was assessed using TargetP (Emanuelsson et al. 2007) and Predotar (Small et al. 2004). It should be noted that the wheat nuclear genome has two additional copies of rps19 (on chromosome 1) that were omitted from our analysis because of their close sequence similarity to the paralogous copies on chromosomes 3 and 5. Similarly, maize has closely related rps19 copies on chromosomes 1 and 5 (Supplementary Table S2). It is also worth noting that the rpl2 intron/exon junction was mis-annotated in the bamboo Ferrocalamus rimosivaginus databank entry (JN120789) and for Bambusa oldhamii (EU365401) no rpl2 annotation was given, although inspection shows that a complete rpl2 gene is present in the database entry.

Results

An expressed rps19 gene is located downstream of rpl2 pseudogene in the brome mitochondrial genome

We identified an intact rps19 gene in the brome mitochondrial genome using PCR with primers based on the rice counterpart (Kubo et al. 1996). It is located three nucleotides downstream of an rpl2 pseudogene (Fig. 1a) and it has a downstream linkage with nad4L as in rice. The brome rpl2 homologous region is missing 298 bp within exon 1 compared to rice, as well as the extreme 3’ end of exon 1 and most of the intron. Based on our Southern analysis, there are no additional full-length copies of rps19 or rpl2 elsewhere in the brome mitochondrial genome, although there are short pseudogene segments (Supplementary fig. S1 and data not shown).

Fig. 1

Organization of brome mitochondrial rpl2-rps19 genomic region and its expression. a Schematic comparing brome and rice rps19 (black) and rpl2 exons (grey). Triangles represent regions missing from brome rpl2 pseudogene. Black arrows show positions of oligomers (#1–4) used in PCR and RT-PCR, and grey arrows for ones (#5–6) used in inverse PCR to obtain upstream sequence. b RT-PCR products with brome mitochondrial RNA from etiolated seedlings and oligomer 4 for cDNA synthesis. Lane 1 primers 4+2, lane 2 primers 4+2, no RT, lane 3 primers 4+1, lane 4 primers 4+3. M denotes size markers. c Schematic of brome mitochondrial rps19 C-to-U editing sites (circles) and chromatograms of direct sequencing of RT-PCR products from lane 4 in panel b. Arrows show editing sites. Inset shows editing status of codon 55 for RT-PCR products from lane 1 (upper chromatogram) and lane 3 (lower chromatogram)

The ψrpl2-rps19 region of the brome mitochondrial genome is actively transcribed as determined by RT-PCR analysis (Fig. 1b), and direct sequencing of the products revealed that the rps19 mRNA undergoes C-to-U editing at five positions, two of which are non-synonymous and result in conserved amino acids (Fig. 1c). Synonymous sites are known to often be only partially edited in the RNA population (reviewed in Takenaka et al. 2013) and this can be seen for codon 42. The extent of rps19 editing was also less complete in longer transcripts (Fig. 1c, inset) reflecting their immature state. The same five edits, plus an additional silent one, had been observed in rice rps19 (Kubo et al. 1996) and overall, the brome and rice mitochondrial rps19 genes differ by only one non-synonymous substitution and a 9-bp indel preceding the stop codon (Fig. 2). These RNA level nucleotide changes also confirm that this ψrpl2-rps19 DNA region is from the mitochondrion not Numt sequences from the nucleus in brome. Based on our Northern analysis, transcripts from the brome ψrpl2-rps19 region are present at only low steady-state levels in etiolated seedlings (data not shown), and it is worth noting that certain other mitochondrial ribosomal protein genes also exhibit very low levels during this developmental stage in wheat (Li-Pook-Than et al. 2004).
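The comparison of genomic and cDNA sequences used to identify editing sites can be illustrated with a minimal Python sketch. It assumes two in-frame coding sequences of equal length and the standard genetic code; the function name and the toy sequences are hypothetical and are not brome data. It reports only fully edited sites, so partial editing seen as mixed peaks in chromatograms is not captured.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_editing_sites(genomic, cdna):
    """List C-to-U edits (seen as C in genomic DNA, T in cDNA).

    Both sequences must be in-frame coding sequences of equal length.
    Each site is classified as synonymous or not by translating the
    genomic codon and the corresponding cDNA codon.
    """
    genomic, cdna = genomic.upper(), cdna.upper()
    if len(genomic) != len(cdna) or len(genomic) % 3 != 0:
        raise ValueError("sequences must be equal-length and in frame")
    sites = []
    for i, (g, c) in enumerate(zip(genomic, cdna)):
        if g == "C" and c == "T":
            start = i - i % 3  # first base of the codon containing this site
            before = CODON_TABLE[genomic[start:start + 3]]
            after = CODON_TABLE[cdna[start:start + 3]]
            sites.append({
                "position": i + 1,      # 1-based nucleotide position
                "codon": i // 3 + 1,    # 1-based codon number
                "before": before,
                "after": after,
                "synonymous": before == after,
            })
    return sites


# Toy example (invented sequences): one non-synonymous and one silent edit.
toy_genomic = "ATGTCACTCTTC"
toy_cdna = "ATGTTACTTTTC"
for site in find_editing_sites(toy_genomic, toy_cdna):
    print(site)
```

In the toy example, the edit in codon 2 changes serine to leucine, whereas the edit in codon 3 leaves leucine unchanged.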

Fig. 2

Organization and expression of brome nuclear gene for mitochondrial S19 protein. a Schematic of brome nuclear rps19 gene with core (black) and hsp70-type presequence exons (hatched). In gel at right, lane 1 PCR product, primers 7+9, lane 2 RT-PCR product, primers 7+8, lane 3 RT-PCR product with ATP-β primers. Primer sequences are given in Supplementary Table S1. M denotes size markers. Etiolated seedling RNA was used as template. b Amino acid alignment of mitochondrial S19 ribosomal protein homologues for the mitochondrion-located genes of brome and rice, as well as the nuclear-located genes in brome, barley, wheat and maize. Accession numbers are given in Supplementary Table S2. Identities within the S19 core and HSP70-type presequence are shaded in grey and black, respectively. Asterisks denote positions shared among the nuclear gene copies, but not the mitochondrial ones. Positions of introns are shown by arrows and open rectangles indicate motifs that distinguish the wheat chromosome 3 vs. chromosome 5 paralogous copies. The maize chromosome 1 copy is shown. The two changes generated by mitochondrial editing in brome and rice are shown in bold italic letters

Within the brome ψrpl2 coding region, there is an edit at the same position as the single site in rice, but not one within the rpl2-rps19 spacer (Kubo et al. 1996). Our RT-PCR experiments (Fig. 1b, lane 3) also showed no evidence for splicing of the truncated rpl2 intron in brome. This is not surprising since group II introns require intricate RNA folding for splicing competence (Bonen 2008). The region upstream of ψrpl2 in brome is virtually identical to that of bamboo (Ma et al. 2012), consistent with an ancestral-type promoter driving transcription of both ψrpl2 and rps19. There are other examples of pseudogenes being retained when located very close to functional genes (cf. rps14, Ong and Palmer 2006) and such sequences may serve a role in mRNA stability or translation. We had previously presented evidence for rpl2 gene transfer having occurred in the common ancestor of wheat and barley (Subramanian and Bonen 2006), and our data for brome, which confirm the presence of an rpl2 gene in its nucleus (Supplementary fig. S2), push the date of transfer back to earlier than 20 million years ago.

Brome also possesses a functional gene for the mitochondrial S19 ribosomal protein in the nucleus

To determine whether the situation in brome resembles that of rice, where the mitochondrial rps19 copy has persisted during evolution and the nuclear copy has been lost, we examined brome nuclear DNA using PCR primers designed from wheat genomic (Brenchley et al. 2012) and wheat/barley EST data. Unexpectedly, we found that a functional rps19 gene is present in the nucleus (Fig. 2). It resembles counterparts in other grasses and has a conserved hsp70-type presequence containing two introns which are located at the same positions as in the hsp70 gene (cf. Fig. 3b). At the protein level, nine amino acids within the S19 core region are shared by the other nuclear-located copies but absent from mitochondrial ones (Fig. 2b, asterisks), so all evidence points to a single gene transfer in the common ancestor of grasses, rather than a recent independent event in the brome lineage. The brome nuclear-located rps19 gene shares 80 and 76 % identities with the mitochondrion-located copy at the amino acid and nucleotide levels, respectively. The brome nuclear-located rps19 gene is both transcribed and correctly spliced as determined by RT-PCR sequence analysis (Fig. 2a, Supplementary fig. S3A). The additional heterogeneous signals for rps19 compared to that for atpβ (Fig. 2a, lane 2 vs. lane 3) may in part reflect mis-priming related to the necessity of using an hsp70-type oligomer in PCR. Direct sequencing of the PCR and RT-PCR products within the rps19 core revealed several polymorphic positions (Supplementary fig. S3B), which is not surprising since brome is polyploid (Bromus inermis AAAABBBB; cf. Armstrong 1979).
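Percent identity values of the kind quoted above depend on how alignment gaps are counted. The following minimal Python sketch shows one common convention, in which columns containing a gap in either sequence are excluded from the comparison; this convention, the function name and the toy sequences are assumptions for illustration and may differ from the calculation actually used.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Columns with a gap in either sequence are skipped, so the
    denominator is the number of columns where both have a residue.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned protein fragments (invented, not S19 sequences):
# 7 ungapped columns, 5 of them identical.
print(round(percent_identity("MSRK-LWT", "MARKGLFT"), 1))
```

The same function applies to aligned nucleotide sequences, since it only compares characters column by column.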

Fig. 3

Organization and expression of oat nuclear gene for mitochondrial S19 protein. a Schematic of oat nuclear gene with rps19 core (black), hsp70-type presequence exon 2 (hatched) and novel exon1/intron 1 (light grey block and thick grey line). In gel at right, lane 1 PCR product with primers 11+12, lane 2 RT-PCR product with primers 10+12. M denotes size markers. b Amino acid alignment of presequence regions of nuclear-located rps19 from oat [this study], Festuca [EST data], Lolium [incomplete EST data], as well as brome and maize rps19 and wheat hsp70 gene. Underlining shows where the core S19 begins. Identical amino acid positions are shaded in grey. Intron positions are shown by arrows and novel presequences in oat, Festuca and Lolium are in italics

When the brome nuclear-located rps19 gene was compared with its counterparts in wheat, it showed more similarity to the chromosome 3 copy than to the chromosome 5 one. There are several signature amino acid residues that distinguish the wheat paralogues (Fig. 2b, open rectangles) and more numerous sites at the nucleotide level (Supplementary fig. S4). The chromosome 3 form more closely resembles the “native” mitochondrial rps19 gene, as do those of the other grasses in Fig. 2b, except for barley which is more similar to the wheat chromosome 5 copy (Supplementary fig. S4). Since the barley genome contains only one such rps19 gene (based on the nuclear draft genome and many EST entries), it appears that the duplication event pre-dated the wheat–barley lineage split, with a subsequent loss of the chromosome 3-type rps19 gene in the barley lineage.
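Signature positions that separate two groups of sequences, such as paralogous copies or nuclear versus mitochondrial gene copies, can be located by scanning alignment columns. The sketch below is a hypothetical illustration in Python: it assumes pre-aligned sequences of equal length and reports columns where every member of the first group carries the same residue and no member of the second group carries it. The function name and the toy sequences are invented.

```python
def signature_columns(group_a, group_b, gap="-"):
    """Return 1-based alignment columns that are diagnostic for group_a.

    A column qualifies when all sequences in group_a share one
    (non-gap) residue and that residue occurs in no group_b sequence.
    """
    sequences = list(group_a) + list(group_b)
    if not group_a or not group_b:
        raise ValueError("both groups need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        residues_a = {s[i].upper() for s in group_a}
        residues_b = {s[i].upper() for s in group_b}
        uniform = len(residues_a) == 1 and gap not in residues_a
        if uniform and residues_a.isdisjoint(residues_b):
            columns.append(i + 1)
    return columns


# Toy alignment (invented): columns 2 and 4 are diagnostic for group A.
group_a = ["MKRLT", "MKRLS"]
group_b = ["MQRIT", "MQRVT"]
print(signature_columns(group_a, group_b))
```

Swapping the two arguments gives the columns diagnostic for the second group, which need not be the same set.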

Rearrangement within the hsp70 presequence region of the rps19 gene in the oat lineage

Because the mitochondrion-located rps19 gene in oats is known to be a pseudogene (Fallahi et al. 2005), we anticipated that a functional copy would be present in the nucleus. This is indeed the case (Fig. 3) and it shares the expected features with nuclear copies in other grasses, except within the extreme amino-terminus where a subsequent rearrangement has replaced the first exon and intron with sequences unrelated to any databank entries. Close relatives of oats also have this amino-terminal sequence, as deduced from EST data for Lolium and Festuca (Fig. 3b, italics), so we conclude that this exon shuffling event occurred in their ancestor, that is 15–20 million years ago (Catalán et al. 2004; Kellogg and Bennetzen 2004). Both of the potential initiation codons (Fig. 3b) are predicted to be able to target the S19 protein to the mitochondrion based on algorithms such as TargetP (Emanuelsson et al. 2007) and Predotar (Small et al. 2004). Our RT-PCR sequencing data confirm that the oat nuclear rps19 copy is expressed and properly spliced (Fig. 3a). Figure 3b also illustrates the similarity between the N-terminal sequences of S19 in grasses and those of the hsp70 gene.

Discussion

Our comparative analysis strongly suggests that the nuclear-located rps19 gene in present-day brome is derived from a gene transfer event which occurred in the common ancestor of Poaceae grasses, and that the mitochondrion-located rps19 gene in brome represents the “native” endosymbiotic-origin form. Both are actively transcribed and undergo the expected RNA processing events of splicing and editing, respectively. This period of co-existence is much longer than for the three other documented cases in plants which are in an “intermediate stage” of gene transfer. The Populus sdh4 mitochondrial and nuclear genes share such high similarity (95 % amino acid identity) that the authors concluded that it must reflect a very recent transfer (Choi et al. 2006). For the cox2 gene in legumes, transfer was estimated to have occurred during the evolution of the legume subfamily Papilionoideae (Adams et al. 1999) which would be about 25–30 million years ago (Stefanovic et al. 2009). In the case of rpl5 in wheat (Sandoval et al. 2004), our examination of EST data for relatives of wheat supports the view that transfer occurred in the common ancestor of wheat/barley/oats (Supplementary fig. S5), lineages which have a divergence time of approximately 25–30 million years (Kellogg and Bennetzen 2004; Chalupska et al. 2008).

Another documented case of functional gene copies being present in both compartments is atp9 in filamentous fungi, including Neurospora crassa (van den Boogaart et al. 1982) and Aspergillus nidulans (Brown et al. 1984), which diverged from a common ancestor over 200 million years ago (Taylor and Berbee 2006). These genes appear to have evolved specialized roles and are expressed either during vegetative growth or in germinating spores (Bittner-Eddy et al. 1994; Déquard-Chablat et al. 2011). Interestingly, in the Podospora lineage, duplicated copies of the nuclear-located atp9 gene perform those two roles and the mitochondrial atp9 gene copy has been lost (Déquard-Chablat et al. 2011). This illustrates that, even after very long periods of co-existence, events such as gene duplication can lead to redundancy of other paralogues (in this case the mitochondrial one) and its subsequent loss. The lineage-specific presence of multiple intronic maturase gene copies in the mitochondrion and nucleus of plants such as Selaginella (lycophyte) and Physcomitrella (moss) may also reflect a rather similar dynamic phenomenon (Guo and Mower 2013).

Figure 4 presents a scenario for the evolutionary history of the mitochondrial rps19 gene in grasses. After migration of a copy to the nucleus more than 60 million years ago, the gene acquired an hsp70-type targeting sequence through exon shuffling, as well as amino acid-altering differences from the mitochondrial copy that are shared among the grasses (Fig. 2b, asterisks) and so were presumably acquired early during adaptation to the new environment and then subject to evolutionary constraint. In certain lineages, the mitochondrial rps19 copy became a pseudogene (e.g. wheat, oats, Lolium) and in others it was lost (e.g. maize, barley). In contrast, in the rice lineage, the mitochondrial gene was retained and the nuclear copy lost. In the oat clade, a lineage-specific rearrangement event conferred a new distal amino-terminal targeting sequence in their ancestor, that is, about 15 million years ago (Kellogg and Bennetzen 2004). The duplication event leading to multiple copies in present-day wheat likely occurred after divergence of the wheat and oat lineages because EST databases did not reveal any evidence for multiple rps19 copies in the nucleus for members of the oat clade (oat/Festuca/Lolium). The maize nuclear genome has multiple rps19 gene copies; however, their close similarity (cf. only 5 amino acid substitutions in the S19 core compared to 15 between the wheat chromosome 3 and 5 paralogues) suggests either a recent independent duplication event or gene conversion that would confound tracing their evolutionary history. Interestingly, a search of the barley draft nuclear genome revealed a block of mitochondrial DNA containing rps19 and its flanking sequences (>850 bp, Supplementary fig. S6) and we confirmed this by PCR sequencing. Because it is an unedited form and lacks nuclear-type expression sequences, we conclude that this reflects a non-functional DNA-mediated transfer (Numt) to the nucleus and, although not yet pseudogenized, it has five unique non-synonymous substitutions (Supplementary fig. S6). The present-day barley mitochondrial genome lacks an rps19 gene (based on our Southern hybridization experiments using probes from rps19 PCR products as well as oligomers), so its loss must have been quite recent. Therefore, the barley lineage appears to have been in a transition stage (Fig. 4, white bars on tree) for almost as long as the brome lineage (Fig. 4, thick black line on tree).

Fig. 4

Scenario showing evolutionary history of the mitochondrial rps19 gene in grasses. Curved arrow indicates mitochondrion-to-nucleus rps19 gene transfer in ancestor of grasses, and period of co-existence of functional copies in both the mitochondrion and nucleus is shown by bold black line (brome lineage) or white open line (barley lineage) on phylogenetic tree. Table beside the tree shows the status of rps19 genes in the mitochondrion and nucleus for these grasses. Designations are as in Figs. 2 and 3, except that introns are shown as triangles. The rps19 gene copies most similar to the mitochondrial gene are shown in black, and the wheat chr5 copies (as well as the barley one) are shown as grey. Numbers indicate the lengths of rps19 homologous sequences. Ψ indicates pseudogene, while dash indicates absence. Data for rps19 status in the mitochondrion are from Fallahi et al. 2005, databank information (see Suppl. Table S1) and this study. Divergence times for the grasses shown on the tree are maize/wheat ~60 Mya, rice/wheat ~50 Mya, wheat/oat ~25–30 Mya, brome/wheat ~20 Mya, barley/wheat ~15 Mya and oat/Lolium ~15 Mya (Gaut 2002; Catalán et al. 2004; Kellogg and Bennetzen 2004; Chalupska et al. 2008)

It will be of interest to learn whether the brome mitochondrial ribosomes contain both forms of the S19 protein or whether there is variation under certain environmental or developmental conditions. In a broader context, the presence of multiple gene copies, regardless of whether they are located in the same or different genetic compartments, and regardless of whether they are generated by intragenomic DNA duplication or intracellular horizontal transfer, provides the opportunity for specialization or acquisition of new cellular functions. In this regard, it is perhaps worth noting that recent chloroplast-to-mitochondrion gene transfer events have resulted in ten intact ribosomal protein genes, including rps19, being present in the Vitis vinifera mitochondrial genome (Goremykin et al. 2009). This raises the possibility of their recruitment for a role in mitochondrial translation, analogous to certain chloroplast-origin tRNAs which have replaced endogenous tRNAs (reviewed in Huot et al. 2014), or for the creation of chimeric genes through homologous recombination. Moreover, since ribosomal proteins are well known for performing additional extra-ribosomal functions (Wool 1996), and considering the complexities of RNA processing events in plant mitochondria (reviewed in Hammani and Giegé 2014), it might not be surprising if certain ribosomal proteins undertook additional “moonlighting” roles.