Introduction

Plastid genome sequencing of gymnosperm species from Pinaceae and Gnetales reveals the lack of ndh genes. Several genes have been lost from the prokaryote-derived plastidial genome and functionally integrated into the eukaryotic host genome through serial events of gene transfer during plant evolution, but others have been definitively lost. It is thus interesting from an evolutionary perspective to assess whether the ndh genes in plants lacking these genes were transferred to the nuclear or mitochondrial genomes during the course of evolution. We therefore attempted to trace the ndh genes in the nuclear and organellar genomes of Norway spruce using an in silico approach.

Plastid DNAs are the remnant genomes of free-living cyanobacterial organisms that became endosymbionts in algae and plants and persist as contemporary chloroplasts. The evolution of the plastid genome during the course of endosymbiosis, which occurred at least 1.2 billion years ago (Yoon et al. 2004), was accompanied by a dramatic reduction in gene content, and plastid genomes are estimated to encode only a vestigial sample of the genes present in their bacterial ancestors (about 5–10 %; Martin et al. 2002). In fact, plastid genomes include approximately 120 different genes in photosynthetic plants (Palmer 1991), which is only a handful of the thousands of genes present in modern cyanobacterial organisms (e.g., about 2600 putative genes in Synechococcus elongatus; Sugita et al. 2007).

During the course of plastid evolution most genes were definitively lost from the endosymbiont cyanobacterial genome; however, many genes lost from this genome were relocated early and functionally integrated into the host nuclear genome through serial events of endosymbiotic gene transfer (Deusch et al. 2008). Moreover, the permeability of the plastid genome to whole or partial (e.g., introns) gene turnover, including both gains and losses, is a molecular feature characterizing land plant evolution and has often occurred independently in unrelated lineages and at contrasting evolutionary times (Martin et al. 1998; Millen et al. 2001).

One suite of genes for which recurrent losses (both physical and functional) from plastid genomes of land plants have been documented is the ndh family. The plastid ndh genes encode the 11 subunits of the thylakoid Ndh complex (Maier et al. 1995; Sazanov et al. 1998; Casano et al. 2000), which is analogous to NADH-dehydrogenase (nicotinamide adenine dinucleotide dehydrogenase, EC 1.6.5.3) or complex I (electron transport chain) of the mitochondrial respiratory chain that catalyzes the transfer of electrons from NADH to plastoquinone. The Ndh complex seems to be primarily involved in the transfer of electrons from stromal reductants to a plastoquinone pool, a process commonly known as chloro-respiration (Peltier and Cournac 2002). In addition, involvement of this complex in the reduction of photo-oxidative stress under high light intensity, in the regulation of photosynthesis by modulating the activity of cyclic electron flow around photosystem I, and/or in the regulation of leaf senescence has also been suggested (Martin and Sabater 2010).

The 11 plastid ndh genes (ndhA to ndhK) are known to be widespread in the autotrophic land plant lineages (Neyland and Urbatsch 1996) and remain highly conserved over large evolutionary distances, indicating the presence of strong selection pressure for their retention. To some degree, expected losses or pseudogenization of ndh genes from the plastid DNA of saprophytic and non-photosynthetic, parasitic bryophytes (Aneura mirabilis, Wickett et al. 2008) and seed plants (Cuscuta sp., Haberhausen and Zetsche 1994; Braukmann et al. 2013; Epifagus virginiana, Wolfe et al. 1992; Neottia nidus-avis, Logacheva et al. 2011; Rhizanthella gardneri, Delannoy et al. 2011) have been documented, concomitant to a loss of function of additional genes involved in photosynthetic processes.

Surprisingly, however, partial or complete losses of plastid ndh genes in several photosynthetic seed plants have been reported in distantly related lineages showing contrasting ecological and life habit features, including trees (Pinus sp. and other conifers, Wakasugi et al. 1994; Braukmann et al. 2009), shrubs (Ephedra sp., Braukmann et al. 2013), climbers (Gnetum parvifolium, Wu et al. 2007), long-lived perennials (Welwitschia mirabilis, McCoy et al. 2008), epiphytes [Erycina pusilla (Pan et al. 2012), Oncidium sp. (Wu et al. 2010), Phalaenopsis aphrodite (Chang et al. 2006)], terrestrial short-lived herbs (Erodium sp., Blazier et al. 2011) and aquatic species (Najas flexilis, Peredo et al. 2012). Recently, it has been reported that deletion and truncation of ndh gene fragments are common to orchid genomes from four orchid subfamilies (Lin et al. 2015).

Several authors have postulated that the ancestral, functional ndh gene copies may have been transferred to the nuclear genome (Chang et al. 2006). Sequence migration from the plastid genome to the nucleus may have occurred directly, or through a first integration into the mitochondrial genome followed by an eventual incorporation into the nuclear genome, as reported for some plastid DNA fragments (Notsu et al. 2002).

Recent findings based on transcriptomic data suggest that the mitochondrial genomes of about two dozen seed plant species contain some portions of ndh genes (Lin et al. 2015), indicating that non-plastidial ndh genes exist and may be transcribed. In addition, no transcripts that could represent plastid ndh genes transferred to the nucleus were detected across several angiosperm and gymnosperm species (Ruhlman et al. 2015). However, results of whole-cell transcriptomes should be interpreted with caution, as ndh gene expression may be tissue-specific and ndh genes may be epigenetically silenced to such an extent that the experimental setup may not be sensitive enough to detect them.

Chloroplast genome sequencing shows that many gymnosperm species from Gnetales and Pinaceae have lost the plastid ndh genes (Wakasugi et al. 1994; Braukmann et al. 2009). In this paper we use Picea abies (L.) Karst. (Norway spruce) as a model to assess the evolutionary fate of ndh genes in plants lacking functional plastid copies (Braukmann et al. 2009). We searched the P. abies genome for ndh genes in the nuclear and mitochondrial genomes based on conserved domains of the ndh peptides. Our main aim was to determine the genomic location(s), structure, integrity and functionality of ndh genes in Norway spruce. The eventual finding of functional ndh genes in the nuclear (or mitochondrial) genome would reinforce previous ideas about their key role in plant cellular metabolism, and may provide insights into the evolutionary dynamics of their organellar transfer and into how much their gene structure and evolution have been modulated in a nuclear environment.

Materials and methods

We considered ndh genes from the plastidial genomes of four gymnosperms (Cryptomeria japonica, Cycas revoluta, Ginkgo biloba, Podocarpus totara) and a single angiosperm species (Arabidopsis thaliana) to trace putative homologs in the nuclear and organellar Norway spruce genomes, which were recently sequenced (Nystedt et al. 2013) and are available through ConGenIE (http://congenie.org/). NDH protein sequences used for the homology search in ConGenIE were downloaded from GenBank (AP009377.1, NC_020319.1, NC_016986.1, NC_020361.1, and AP000423.1). Due to the highly conserved nature of the ndh gene sequences among plants (Martin and Sabater 2010), we obtained similar hits when all five species were considered. However, we chose the gymnosperm P. totara to refine sequence alignments as it is one of the phylogenetically closest species to P. abies (Chaw et al. 2000). Since protein sequences and domains are evolutionarily more conserved than nucleotide sequences, we used the tBLASTn facility from the Basic Local Alignment Search Tool (BLAST; http://www.ncbi.nlm.nih.gov/BLAST) to detect regions of local similarity to ndh genes in the spruce genomes. All the results obtained from the P. abies genomes were subsequently screened for ndh-specific conserved domains using conserved domain search [CD-search (Marchler-Bauer and Bryant 2004)]. Only those sequence fragments from Norway spruce containing the ndh-specific conserved domains were considered for the analysis in this work, to ensure that each fragment is part of the specific ndh gene, as it has been reported that several domain architectures may be shared across multiple proteins (Marchler-Bauer and Bryant 2004).
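The two-step screening described above (a tBLASTn search followed by retention of only those hits that carry an ndh-specific conserved domain) can be illustrated with a minimal Python sketch. It assumes the tBLASTn results were exported in the standard 12-column tabular format; the e-value cutoff, the scaffold names, and the set of domain-confirmed regions are hypothetical placeholders and are not values reported in this study.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_hits(tabular_text):
    """Parse BLAST tabular text into a list of dicts with typed fields."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        for key in ("length", "sstart", "send"):
            hit[key] = int(hit[key])
        hit["evalue"] = float(hit["evalue"])
        hits.append(hit)
    return hits


def keep_domain_hits(hits, domain_confirmed, max_evalue=1e-5):
    """Keep hits under the e-value cutoff whose subject region was confirmed.

    domain_confirmed is a set of (sseqid, sstart, send) tuples standing in
    for regions in which a domain search found an ndh-specific domain.
    The cutoff of 1e-5 is an illustrative assumption.
    """
    return [
        h for h in hits
        if h["evalue"] <= max_evalue
        and (h["sseqid"], h["sstart"], h["send"]) in domain_confirmed
    ]
```

In such a sketch a hit is discarded either because its similarity is too weak or because the domain check did not confirm it, which mirrors the rationale that shared domain architectures alone do not identify a fragment as part of an ndh gene.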

We also extracted the flanking sequences (3000 bp upstream and downstream from the ndh domain fragment) of the ndh-specific hits and analyzed them for the presence of transposable elements (TEs) using RepeatMasker Open-3.0 (developed by A. F. A. Smit, R. Hubley, and P. Green; http://www.repeatmasker.org/). The default value of maximum divergence allowed by RepeatMasker was 35 %. Upstream regions were also screened for the presence of promoters, TATA boxes and enhancers using the TSSP/Prediction of Plant Promoters (TSSP: Transcription Start Sites in Plants, SoftBerry: http://www.softberry.com, Shahmuradov et al. 2003) web interface.
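The extraction of 3000 bp flanking windows can be sketched as follows. This is an illustrative Python function, not the procedure actually used; it assumes 1-based inclusive coordinates on the forward strand of a scaffold and the tBLASTn convention that a hit on the minus strand is reported with its start coordinate larger than its end coordinate. The function name and the handling of scaffold edges are assumptions.

```python
def flanking_windows(hit_start, hit_end, scaffold_length, flank=3000):
    """Return (upstream, downstream) windows around a hit.

    Each window is a (start, end) tuple in 1-based inclusive forward-strand
    coordinates, clipped to the scaffold, or None if no sequence is left on
    that side. For a minus-strand hit (hit_start > hit_end) the upstream
    window lies at higher coordinates than the hit.
    """
    minus_strand = hit_start > hit_end
    low, high = min(hit_start, hit_end), max(hit_start, hit_end)

    left = (max(1, low - flank), low - 1)
    right = (high + 1, min(scaffold_length, high + flank))

    # A window is empty when the hit touches the scaffold boundary.
    left = left if left[0] <= left[1] else None
    right = right if right[0] <= right[1] else None

    return (right, left) if minus_strand else (left, right)
```

For example, a forward-strand hit at positions 5000 to 5300 on a 10,000 bp scaffold yields an upstream window of 2000 to 4999 and a downstream window of 5301 to 8300. Hits near scaffold ends yield shorter windows, which matters when interpreting the absence of regulatory elements or TEs on short scaffolds.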

Results

Homology with ndh genes

Fragments of many ndh genes, except ndhE and ndhG, were detected in the nuclear, mitochondrial and plastidial genomes of Norway spruce (Table 1). Sequence features of the ndh gene fragments identified are included in the Online resource 1. Overall, a total of 55 short sequences showing significant amino acid and conserved domain similarity to the respective ndh genes were detected using CD-search, of which 47 hits were in the nuclear, three in the mitochondrial, and five in the plastidial genomes. Only three ndh genes were located both in the nuclear and either the plastidial (ndhD, ndhH) or mitochondrial genomes (ndhC).

Table 1 In silico detection of partial ndh genes in the Norway spruce genomes

Forty-three out of the 55 sequences showing homology to ndh genes showed premature stop codons within the ORFs. One of the fragments of ndhB was found in duplicate and one of the fragments of ndhI was found in triplicate in the spruce nuclear genome, while all the other ndh gene fragments detected were singletons. The two ndh operons that carry functional ndh genes in plant plastidial DNA were not preserved in the Norway spruce genomes and their collinearity could not be confirmed. The first plastid operon, ndhH-D (ndhH, ndhA, ndhI, ndhG, ndhE, and ndhD), was physically disrupted by the complete absence of the ndhG and ndhE genes, the transfer of ndhA and ndhI, and the partial transfer of the ndhH and ndhD genes to the nuclear genome. Similarly, the second operon (ndhC, ndhK, and ndhJ) was disarrayed due to the transfer of the ndhC gene to the mitochondrial and nuclear genomes, and the transfer of ndhK and ndhJ to the nuclear genome.

Regulatory elements

Regulatory transcriptional elements, such as promoters, TATA boxes and enhancers, were detected in 30 upstream regions of ndh fragments found in the nuclear (27 fragments) and mitochondrial genomes (three fragments), while 25 fragments found in the plastidial and nuclear genomes did not show the presence of any canonical regulatory elements (Online resource 1.pdf).

In the nuclear genome, only two upstream regions showed the joint presence of promoter, TATA box and enhancer elements (ndhI). However, both promoters and TATA boxes were found in 24 upstream regions of eight ndh genes (ndhA–ndhD and ndhH to ndhK). Lastly, a promoter alone was found in a single upstream region belonging to ndhI. In the mitochondrial genome, two upstream regions showed the presence of promoters and TATA boxes (ndhC), and only a promoter element was detected in one upstream region (ndhF).

In the nuclear genome, the average distance of the promoter and TATA box from the core ndh domain was −1335 bp (ranging from −70 to −2818 bp) and −1343 bp (ranging from −106 to −2843 bp), respectively. Shorter average distances for promoter and TATA box were detected in the gene fragments from the mitochondrial genome, −871 bp (ranging from −267 to −1366 bp) and −1208 bp (ranging from −1015 to −1401 bp), respectively. The nuclear enhancer element found in the upstream region of two ndhI domain sequences was located at an average distance of −2128 bp.

Transposable elements

We found fragments of TEs in the upstream regions of 27 ndh domain sequences in the nuclear genome, where the average position of TE elements from the ndh domain was −1572 bp (ranging from −51 to −2967 bp) and the average size of the TE was 514 bp (ranging from 32 to 2315 bp). No TEs were detected in the upstream regions of ndhC domain sequences in the nuclear genome.

TE fragments were also detected in the downstream regions of 20 ndh domain sequences in the nuclear genome, where the average position of TE from the ndh domain was +1047 bp (ranging from +46 to +2917 bp) and the average size of the TE was 432 bp (ranging from 57 to 1696 bp). No TEs were detected in the downstream regions of ndhH domain sequences in the nuclear genome. Eleven fragments (ndhA, ndhB, ndhD, ndhI, ndhJ) in the nuclear genome were detected with TEs both in the upstream as well as the downstream flanking regions, out of which eight fragments (ndhB, ndhI, ndhJ) also possessed a promoter and a TATA box. The plastidial ndh gene fragments did not show any TEs in their upstream and downstream regions. However, the mitochondrial ndhC gene fragment contained LTR/Gypsy sequences (316 bp) in its downstream part, which was at the distance of +369 bp from the ndh domain. The different TEs types identified in the vicinity of the ndh gene fragments belonged to LTR, LTR/Copia, LTR/Gypsy, and LINE superfamilies and DNA-TE (class II DNA-TE, e.g., CACTA, MuDR, hAT, etc.).

Stop codons

Stop codons were detected in 35 nuclear fragments (74 %) and in all the fragments from the mitochondrial (three fragments) and plastidial (five fragments) genomes showing homology with ndh genes.
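As an illustration of how in-frame stop codons can be flagged in a candidate ORF fragment, the following minimal Python sketch scans a nucleotide sequence codon by codon. It assumes the reading frame is already known from the alignment, treats TAA, TAG and TGA as stop codons, and ignores a stop in the final codon position; these choices and the function name are assumptions for illustration, not a description of the software used here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(orf, frame=0):
    """Return 0-based nucleotide offsets of in-frame stops before the last codon.

    orf is a DNA (or RNA) string read in the given frame (0, 1 or 2).
    Incomplete trailing codons are ignored.
    """
    seq = orf.upper().replace("U", "T")
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    # The final complete codon is excluded so that a terminal stop
    # is not counted as premature.
    return [i for i, codon in codons[:-1] if codon in STOP_CODONS]
```

A fragment such as ATGAAATAGGGCTAA would be reported with one premature stop at offset 6, whereas ATGAAAGGCTAA would be reported with none. Because the fragments analyzed here are partial, a stop in the last codon of a fragment is not necessarily the true terminus of the gene, so this exclusion is a simplification.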

Discussion

A recent study (Ruhlman et al. 2015) reports the loss of ndh gene expression from angiosperm and gymnosperm species, including Pinaceae, to which P. abies belongs. That approach, however, searched solely for the presence of expressed, bona fide functional ndh genes and interacting proteins, revealing the absence of functional nuclear copies of plastid ndh sequences.

Our results, based on the search for the conserved domains specific to ndh genes in the genomes of P. abies (including the nuclear and organellar transcriptomes), also did not show the presence of any functional ndh gene. Surprisingly, however, fragments of ndh genes were detected in the nuclear genome, suggesting past gene transfers from the plastid to the nucleus (and mitochondrion). Whereas apparent ndh sequence translocation from the plastid to the mitochondrion has been postulated in distantly related vascular plant lineages (Lin et al. 2015), no previous findings of plastid-nucleus transfer of homologous ndh gene fragments have been reported. Interestingly, most of the nuclear-encoded ndh fragments were located in genome environments showing regulatory elements such as TATA boxes and enhancers in their upstream regions, although the ORF fragments contained stop codons.

This evidence supports the hypothesis that, at least in Picea, ndh translocations from the plastid to the nuclear genome have occurred, and that at some time during evolution there might have been a functional machinery, or attempts to form one, to accommodate them within a nuclear-encoded environment.

The fate of ndh genes in Picea: a complex evolutionary tale

In plants, the process of gene translocation between plastidial, mitochondrial and nuclear genomes has been known for decades and has been shown to rest on a very strong evolutionary basis (Martin et al. 1998; Martin 2003). The losses of ndh genes from the plastidial genomes of plants have traditionally been viewed as unusual genomic events that have taken place in distantly related lineages and at diverse evolutionary times (Blazier et al. 2011). These rare genomic changes (RGCs) provide additional markers with enormous potential for molecular phylogenetics (Rokas and Holland 2000). It has been suggested that the ndh genes are lost as a suite and that, from a phylogenetic perspective, the entire suite should be considered as a single evolutionary loss and not as 11 independent losses (Bungard 2004; Krause 2008). Our results, on the contrary, suggest that this may be a simplistic view and that in Picea the fate of ndh genes involved a complex and dynamic scenario that in all likelihood cannot be subsumed under a single evolutionary event. First, remnants of the ndhD and ndhH genes are still recognizable in the plastid genome (Table 1), and mirror those found in other species, where different patterns of pseudogenization have been reported within plastid genomes (Blazier et al. 2011; Braukmann et al. 2013). Second, the presence of three gene fragments in two Picea genomes (ndhD and ndhH in both plastid and nuclear genomes, and ndhC in mitochondrial and nuclear genomes; Table 1) argues against the hypothesis of a simple evolutionary loss of all plastidial genes. Altogether, this evidence suggests that the loss of sequence and functionality, and the intergenomic translocation events, of each ndh-coding polypeptide may have followed an independent pathway and evolutionary pace. This cautions against the use of ndh gene losses in the plastidial genome as markers with strong phylogenetic signal unless the sequences of the nuclear and mitochondrial genomes from the same species are known.

The facts that in the Picea nuclear genome (1) the ndh gene fragments are located in separate genomic locations, (2) sequence duplications (even triplications for ndhI) are present for several ndh fragments, (3) regulatory transcriptional elements are present in the upstream regions of 55 % of the ndh fragments, and (4) premature stop codons are found in 78 % of the sequences showing homology to ndh genes, altogether suggest a more complex scenario of intergenomic behaviour than those reported for other plastidial genes translocated to the nucleus (Bock and Timmis 2008).
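The percentages quoted above can be recomputed from the fragment counts given in the Results. The short Python sketch below uses only counts stated in the text (55 fragments in total, 30 with upstream regulatory elements, 43 with premature stop codons, and 35 of the 47 nuclear fragments with stop codons); the variable names are illustrative. It suggests that the 55 % and 78 % figures correspond to proportions over all 55 fragments, whereas the 74 % figure refers to the nuclear fragments alone.

```python
# Fragment counts as reported in the Results section.
fragments = {"nuclear": 47, "mitochondrial": 3, "plastidial": 5}
with_regulatory = {"nuclear": 27, "mitochondrial": 3, "plastidial": 0}
with_stop_total = 43      # fragments with premature stop codons, all genomes
nuclear_with_stop = 35    # nuclear fragments with stop codons

total = sum(fragments.values())


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


regulatory_pct = percent(sum(with_regulatory.values()), total)
stop_pct = percent(with_stop_total, total)
nuclear_stop_pct = percent(nuclear_with_stop, fragments["nuclear"])

print(total, regulatory_pct, stop_pct, nuclear_stop_pct)
```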

In fact, reports of other plastidial genes transferred to the nucleus suggest a putative conservation of their function in the new genomic environment and a lack of pseudogenization of their sequences. Our observations suggest that all the nuclear-encoded ndh fragments are pseudogenes, although this requires further research. Pseudogenes are found in prokaryotes and eukaryotes, and are defined as genomic segments that have high sequence similarity to the corresponding functional genes but have lost their coding ability owing to premature stop codons or frameshifts. It was earlier proposed that pseudogenes are non-functional, but with the advent of next-generation sequencing technologies it has been revealed that pseudogenes are important not only from an evolutionary point of view, but may also be involved in regulating their ancestral, functional genes (Tutar 2012; Li et al. 2013). However, none of the fragments found in the plastid showed regulatory elements, although they contained stop codons, whereas the mitochondrial and nuclear fragments showed both regulatory elements and stop codons.

Functional plastid ndh genes are dispensable in photosynthetic plants

The in silico screening of the nuclear and organellar genomes of Norway spruce supports the view that sequences homologous to ndh genes from seed plants are present in this species. However, the facts that (1) the fragments of the ndh genes that have been detected may code for only a very small portion of the amino acid residues forming the ndh polypeptides, (2) the fragments recovered have a low percentage of amino acid identity with the functional ndh genes, and (3) premature stop codons are present in the corresponding ORF fragments located in the nuclear genome, strongly suggest that they are non-functional ndh genes and are probably pseudogenes. This evidence, together with the absence of conserved homologous domains specific to ndh genes in the P. abies transcriptome (data not shown), supports the hypothesis that the Ndh complex is not functionally operating in the thylakoids of P. abies. This agrees with the results of Ruhlman et al. (2015) about its limited biological significance in extant plants, in contrast with earlier views (Martin and Sabater 2010), and rejects the former hypothesis that the loss of plastid ndh genes in non-parasitic and non-saprophytic plants is linked both to their transfer to the nucleus and to the maintenance of their functionality in this new genomic location (Chang et al. 2006).

Our work suggests that in P. abies, the translocation of ndh genes from the plastid to the nuclear genome occurred during the course of evolution. We speculate that there might have been a functional ndh machinery within the nuclear-encoded environment, or attempts to form it, at some time during evolution. At present, ndh genes in the nuclear genome show signatures of pseudogenization and are likely to be nonfunctional, but a definitive conclusion warrants further investigation (Podlaha and Zhang 2010).