Introduction

Plant mitochondrial genomes are generally known for their highly conserved coding sequences but rapidly changing gene orders and coexisting genomic arrangements (Knoop 2004; Mackenzie et al. 1994; Mackenzie and McIntosh 1999; Ogihara et al. 2005; Sugiyama et al. 2005). Plant mitochondrial DNA (mtDNA) may even vary in structure between isolates of angiosperm species, for example, among the ecotypes of the model plant Arabidopsis thaliana (Ullrich et al. 1997) or among isolates of common beans or soybeans (Arrieta-Montiel et al. 2001; Moeykens et al. 1995). However, such an ongoing, highly frequent recombination of mtDNA may be an evolutionary gain that arose only after the rise of vascular plants (tracheophytes). Groth-Malonek and colleagues have recently reported that an ancient gene cluster, the nad5-nad4-nad2 gene arrangement, found in the alga Chara, is universally conserved among liverworts, mosses, and hornworts (Groth-Malonek et al. 2007a). While the nad5-nad4 intergenic region showed size increases to varying degrees in the three bryophyte divisions (e.g., ∼500 base pairs [bp] in mosses and more than 3,000 bp in hornworts), the nad4-nad2 spacer was strikingly conserved with its tiny size of only 26 bp across all bryophyte clades.

We have investigated another gene cluster that was found conserved when the completely sequenced chondriomes of the alga Chara vulgaris (Turmel et al. 2003), the liverwort Marchantia polymorpha (Oda et al. 1992b), and the moss Physcomitrella patens (Terasawa et al. 2006) were compared: the trnA-trnT-nad7 region. Besides the significantly larger size of the intergenic region between trnA and nad7 in Marchantia (1866 bp, compared to 528 bp in Physcomitrella and 124 bp in Chara), the trnT gene is present in inverted orientation in the liverwort. Moreover, the functional nad7 gene resides in the Marchantia nuclear genome (Kobayashi et al. 1997), while six stop codons render the mitochondrial copy a transcribed pseudogene (Takemura et al. 1995). Groth-Malonek and colleagues have recently shown that nad7 is retained as a pseudogene in all liverwort lineages but is an apparently intact and expressed gene in Haplomitrium (Haplomitriopsida), which is a representative of the sister lineage (Haplomitriopsida/Treubiopsida) to all other liverworts (Forrest et al. 2006; Groth-Malonek et al. 2007b).

The conservation of nad7 as a pseudogene and the enigmatic upstream inversion of trnT in Marchantia relative to the alga and the moss prompted us to investigate the molecular evolution of this region in related taxa. Whereas the trnA-trnT-nad7 region is conserved in this order and direction of transcription in diverse mosses, we find that it has experienced major changes among the liverworts, which include insertions of pseudogene fragments and noncoding sequence copies, subsequent inversion of trnT among marchantiid liverworts, and, finally, independent losses of trnT in the two major liverwort clades. These events of molecular evolution can be conveniently mapped onto the consensus phylogeny of liverworts as the sister clade to all other land plants (embryophytes) which is now clearly emerging after a series of recent molecular phylogenetic studies (Crandall-Stotler et al. 2005; Davis 2004; Forrest et al. 2006; Forrest and Crandall-Stotler 2004, 2005; Heinrichs et al. 2005; Qiu et al. 2006). Our new findings document significant genomic plasticity in the mtDNA of marchantiid liverworts and, as such, stand in contrast to the hitherto observed strong conservation of mitochondrial coding sequences (Beckert et al. 1999; Dombrovska and Qiu 2004; Forrest et al. 2006; Forrest and Crandall-Stotler 2005; Pruchner et al. 2002), the apparent and exclusive absence of RNA editing in this early land plant clade (Steinhauser et al. 1999), and, most importantly, the absence of active, ongoing genomic mtDNA recombination in Marchantia (Oda et al. 1992a, b; Oldenburg and Bendich 1998).

Materials and Methods

Total nucleic acids were extracted using the CTAB (cetyl-trimethyl-ammonium bromide) method or a plant DNA extraction kit (Macherey-Nagel, Düren, Germany). To PCR amplify the trnA-nad7 region, the primers trnAfor (5′-tcggttcaavtccgatcgtctcca-3′) and nad7back (5′-accatgagcagcwggrtgttgagg-3′) were used. PCRs usually contained 2.5 μl 10× PCR buffer, a 250 μM concentration of each dNTP, a 1 μM concentration of each primer, 1 U of DNA polymerase, 1 μl of DNA, and double-distilled water added up to 25 μl. Either Genaxxon Taq DNA polymerase S (Biberach, Germany) or the Triple Master PCR System (Eppendorf, Hamburg, Germany) and the respective buffers supplied by the manufacturers were used. Typical amplifications were performed with 95°C for 3 min as the denaturation step, followed by 35 cycles at 95°C for 30 s, 50°C for 30 s, 72°C for 3 min, and a final elongation step at 72°C for 7 min. Gel-purified PCR fragments were cloned into pGEM-Teasy (Promega, Mannheim, Germany) and sequenced commercially by Macrogen, Inc. (Seoul, Korea). Nucleotide sequences were edited and aligned manually using MEGA 4.0 (Kumar et al. 2004). Phylogenetic trees were calculated either with the neighbor-joining algorithm (Tamura three-parameter, pairwise deletion, and bootstrap with 10,000 replicates) with MEGA or by Bayesian phylogenetic analyses using MrBayes v3.1.2win (Ronquist and Huelsenbeck 2003) with the GTR + G + I model, partitions unlinked, for 10 million generations, with every 100th tree sampled (burn-in = 1000). Burn-in was determined as stationarity in the log likelihood plots based on the summarizing parameters of the MrBayes output. Folding of tRNAs was done with RNAfold (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi) and subsequent manual editing. The BLAST search against the complete mitochondrial genome of Marchantia polymorpha (NC_001660 [Oda et al. 1992]) was conducted using the Local BLAST tool implemented in BioEdit v7.0.5.3 (Hall 1999).
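Both primers contain IUPAC ambiguity codes (v in trnAfor; w and r in nad7back), so each is effectively a small pool of oligonucleotides. The following is a minimal sketch, assuming the standard IUPAC nucleotide code, of how these pools can be enumerated. The helper name expand_degenerate is hypothetical and is not part of any software used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lowercase, as the primers are written).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "w": "at", "s": "cg",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}

def expand_degenerate(primer):
    """Return every unambiguous oligonucleotide encoded by a degenerate primer.

    Hypothetical helper for illustration only.
    """
    choices = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*choices)]

TRNA_FOR = "tcggttcaavtccgatcgtctcca"   # primer trnAfor, as given in the text
NAD7_BACK = "accatgagcagcwggrtgttgagg"  # primer nad7back, as given in the text

for name, primer in (("trnAfor", TRNA_FOR), ("nad7back", NAD7_BACK)):
    variants = expand_degenerate(primer)
    print(f"{name}: {len(primer)} nt, {len(variants)} sequence variants")
    for variant in variants:
        print("   ", variant)
```

Listing the variants explicitly can help when checking a primer pool against a newly available target sequence.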

Results

We first wished to check whether the conservation of the gene order trnA-trnT-nad7, in this direction of transcription, in the alga Chara vulgaris and the model moss Physcomitrella patens was just a coincidental exception (Fig. 1). To this end we designed primers in trnA and nad7 (Fig. 1) and used them in PCR amplifications over a taxonomically wide sampling of moss DNAs (Table 1). The PCR products that were successfully retrieved showed only moderate variation in size (not shown), mostly close to the size expected for Physcomitrella (534 bp), with only exceptional size reductions in Mnium, Pohlia, Takakia, and Leskea (Table 1). PCR amplification products were cloned and their identity was verified by sequencing. In all cases, trnT was found to be present in the same direction of transcription as in Chara and Physcomitrella, initially suggesting that this gene arrangement is conserved among early land plants, as observed before for the mitochondrial nad5-nad4-nad2 gene continuity (Groth-Malonek et al. 2007a).

Fig. 1

Conservation of the mitochondrial trnA-trnT-nad7 region in the alga Chara vulgaris, the moss Physcomitrella patens, and the liverwort Marchantia polymorpha (with trnT in inverted orientation) allowed for the design of oligonucleotide primers (arrowheads) bordering a PCR amplicon (bottom) to probe conservation of the gene arrangements in other bryophyte taxa via PCR amplification. The indicated sequence numbering follows the complete mtDNA database entries for Chara vulgaris (NC_005255), Physcomitrella patens (AB251495), and Marchantia polymorpha (NC_001660), respectively. Drawing is not to scale

Table 1 List of sequence accessions for the taxa under investigation

Subsequently, we amplified the trnA-nad7 region for 11 jungermanniid and 15 marchantiid liverworts. In contrast to mosses, the analysis of PCR products immediately suggested significantly greater size variation in liverworts, ranging from ∼700 bp in some marchantiid taxa, to ∼900 bp for most jungermanniid taxa, up to nearly 2,000 bp in other marchantiids, a size fitting the expectation for Marchantia (Fig. 2). The nature of the amplification products was again verified through cloning and sequencing, which revealed intergenic distances between trnA and nad7 ranging from 536 bp in Porella up to 1,868 bp in Lunularia (Table 1). Among the jungermanniid liverworts the trnA-nad7 intergenic region reached up to 903 bp in Noteroclada (Table 1), but none of the trnA-nad7 intergenic regions in 11 jungermanniid species contained a trnT gene (Fig. 3).

Fig. 2

Exemplary PCR amplification assays of the trnA-nad7 spacer region with DNA from Sphaerocarpos (Sph), Asterella (Ast), Oxymitra (Oxy), Riccia (Ric), and Trichocolea (Tri). The identities of the respective major amplicon products (sizes indicated) were subsequently verified through cloning and sequencing. Occasional, minor accompanying bands were gel-excised, cloned, and sequenced as well, but turned out to be products of nonspecific mispriming in all cases

Fig. 3

The order of trnA, trnT, and nad7 in this direction of transcription in Chara and Physcomitrella turned out to be conserved in all mosses now investigated (not shown) but only in Blasia among the liverworts (Marchantiopsida 1). Sequence numbering starts with the first nucleotide following trnA for the selected exemplar taxa as indicated except for Marchantia, which is as in Fig. 1. The major part of the trnA-nad7 intergenic region including trnT is inverted in orientation in one group of marchantiid liverworts (Marchantiopsida 2). The gene for trnT is absent between trnA and nad7 in all Jungermanniopsida investigated, in a subclade of marchantiid liverworts (Marchantiopsida 3), and in Apotreubia (Haplomitriopsida). Two larger sequence stretches (light gray) are pseudogene fragment copies of nad5 and rps7. Several other regions (dark-gray boxes labeled REPm) are composed of sequence fragments with significant similarity to other noncoding regions repeated numerous times elsewhere in the Marchantia mtDNA, indicating recombination events (e.g., in introns cobi372, cox1i511, and nad4Li100 and in intergenic regions atp9-trnC, rps11-rps1, trnS-trnL, nad3-trnV, and cob-nad9, respectively). REPeq indicates a sequence motif repeated upstream of trnE and of trnQ, respectively. A large sequence insertion in Apotreubia (white box) without significant similarity to any sequence currently in the database and the inverted sequence in Marchantiopsida are flanked by imperfect direct or inverted sequence repeats (arrowhead), respectively, as indicated. The double-arrowhead indicates a 36-bp imperfect palindrome sequence downstream of trnT in Marchantiopsida (AAAGCRAGTGTTTTTTTMKAAAAAARCACTYGCTTT). Drawing is not to scale
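The palindromic character of the 36-bp motif given in the legend of Fig. 3 can be checked by comparing it with its own reverse complement. The sketch below assumes the standard IUPAC ambiguity codes (R, Y, M, K) and treats two positions as compatible when their base sets overlap. The function names are hypothetical and this is an illustration, not the procedure used to identify the motif.

```python
# IUPAC nucleotide codes mapped to the set of bases each one represents.
IUPAC_SETS = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "M": {"A", "C"}, "K": {"G", "T"},
    "S": {"C", "G"}, "W": {"A", "T"}, "N": {"A", "C", "G", "T"},
}

# Complement of each IUPAC code (e.g., R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W", "N": "N",
}

def reverse_complement(seq):
    """Reverse complement of a sequence written with IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def palindrome_report(seq):
    """Return (length, identical positions, IUPAC-compatible positions)
    for a sequence compared with its own reverse complement."""
    rc = reverse_complement(seq)
    identical = sum(a == b for a, b in zip(seq, rc))
    compatible = sum(bool(IUPAC_SETS[a] & IUPAC_SETS[b]) for a, b in zip(seq, rc))
    return len(seq), identical, compatible

MOTIF = "AAAGCRAGTGTTTTTTTMKAAAAAARCACTYGCTTT"  # consensus motif from the Fig. 3 legend

length, identical, compatible = palindrome_report(MOTIF)
print(f"length={length}, identical={identical}, compatible={compatible}")
```

Positions that are compatible but not identical are those where the consensus uses an ambiguity code on only one side of the symmetry axis.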

Among the marchantiid species significantly different results were obtained. In Blasia, the presumed sister genus to a clade of all other marchantiid taxa (Forrest et al. 2006), a trnT gene was identified in the same orientation as trnA and nad7, i.e., as in the mosses and in the alga Chara (Fig. 3). The trnA-trnT spacer had a length of 926 bp, the trnT-nad7 spacer of 850 bp (Table 1), thus revealing a significant size increase of the intergenic regions compared to Chara and the mosses.

In a further four of the marchantiid taxa investigated here (Bucegia, Lunularia, Riella, and Sphaerocarpos), trnT was identified between trnA and nad7, but in all these cases its coding sequence was inverted compared to Blasia (Table 1), thus reflecting the Marchantia situation (Fig. 3; Marchantiopsida 2). As in all jungermanniid taxa, trnT and surrounding sequences are lacking altogether from the other 10 marchantiid species in our taxon sampling (Table 1), evidently due to a major deletion in the trnA-nad7 intergenic region (Fig. 3; Marchantiopsida 3). Finally, we strove to include a representative taxon from the Haplomitriopsida as well, given that this class is now well supported as an ancient lineage of liverworts, sister to the dichotomy of the marchantiid and jungermanniid clades. We were able to obtain a PCR product for the trnA-nad7 region from Apotreubia nana as a representative taxon for this group. Upon cloning and sequencing we found that the large intergenic spacer in Apotreubia carries a unique sequence insertion without similarity to any other sequences in the database but that trnT is absent between trnA and nad7 (Fig. 3).

Detailed sequence analyses revealed that several stretches in the liverwort trnA-nad7 spacers share significant similarities with other regions of the Marchantia mtDNA. Most notable is a highly conserved sequence stretch of ∼100 bp copied from the central coding region of the rps7 gene encoding protein 7 of the small ribosomal subunit (with an internal deletion of about 90 bp) located 15 bp upstream of nad7 in all liverworts (Fig. 3). Interestingly, another intergenic region in the Marchantia mtDNA (rps1-nad4L) carries a corresponding rps7 pseudogene fragment completely including this rps7 homology, but without the internal deletion observed upstream of nad7. A yet larger pseudogene fragment of ∼200 bp derived from the 5′ part of the mitochondrial nad5 gene is inserted upstream of trnT in Blasia and took part in the major sequence inversion including trnT observed in some marchantiid species (Fig. 3; Marchantiopsida 2). Other regions carry sequence elements of 20–70 bp repeated elsewhere in noncoding regions (introns and spacers) of the Marchantia mtDNA, suggesting ancient recombination events on evolutionary timescales. One such sequence element repeated upstream of tRNA genes trnE and trnQ in the Marchantia mtDNA (REPeq in Fig. 3) also participated in the major sequence inversion in Marchantiopsida 2.

The large sequence inversion in Marchantiopsida 2 is precisely bordered by a perfect 21-bp inverted repeat motif in Blasia. Other stretches consisting of sequence elements that are repeated multiple times in noncoding regions of the Marchantia mtDNA (boxes labeled REPm in Fig. 3) flank the large sequence inversion and may have contributed to this recombination event, possibly by providing the homologous stretches required as substrates for recombination.

Interestingly, the large sequence insertion found in Apotreubia is flanked by a 22-bp motif (Fig. 3), in this case present as an imperfect direct sequence repeat. Similarly extended sequence repeats were not identified flanking the large deletion in Marchantiopsida 3, which removed most of the inverted region and left only a short run of guanine nucleotides (Fig. 3).

The trnA-nad7 regions of all marchantiid liverworts except Blasia can easily be aligned, leaving a large gap of more than 1200 bp in the taxa without trnT. Likewise, most of the spacer sequence in the jungermanniid liverworts can be aligned with the marchantiid species except for a unique region of about 130 bp that is not found in the latter.

A reasonably well-resolved liverwort phylogeny on the basis of molecular data is available (e.g., Forrest et al. 2006), which can now be used to trace the series of molecular rearrangements in the evolution of the trnA-nad7 region in this early land plant clade. We have here used the available sequence data for the mitochondrial nad5 gene and the chloroplast rbcL and rps4 genes (with some gaps filled in the course of this study) to reconstruct the phylogeny for a taxonomically congruent data set (Fig. 4). Most likely, the ancient trnA-trnT-nad7 arrangement existed when liverworts emerged but has only survived in Blasia, the sister lineage to a clade of all other marchantiid liverworts. The loss of trnT does not characterize a monophyletic group but has likely occurred three times independently: in the lineage of jungermanniid liverworts, in a monophyletic clade of derived marchantiids, and in the Apotreubia lineage (Fig. 4). The five taxa of marchantiid liverworts with an inverted trnT represent a basal, paraphyletic grade.
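The parsimony argument can be illustrated with a small Fitch-style count of state changes for the presence or absence of trnT between trnA and nad7. This is only a sketch under simplifying assumptions: each group is collapsed into a single tip, the paraphyletic grade with inverted trnT is represented by one tip (Marchantiopsida_2), and the topology follows the relationships described in the text rather than the full tree of Fig. 4. The tip labels and function name are hypothetical.

```python
def fitch(tree, states):
    """Bottom-up pass of Fitch's small-parsimony algorithm.

    tree   : a tip name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping tip name -> character state
    Returns (possible states at this node, minimum number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_changes + right_changes
    # No shared state: one change is required on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1

# Collapsed topology assumed from the text: Haplomitriopsida (Apotreubia) sister to
# Jungermanniopsida + Marchantiopsida, with Blasia sister to all other marchantiids.
TREE = ("Chara",
        ("mosses",
         ("Apotreubia",
          ("Jungermanniopsida",
           ("Blasia",
            ("Marchantiopsida_2", "Marchantiopsida_3"))))))

# trnT between trnA and nad7 (orientation ignored for this character).
TRNT = {
    "Chara": "present", "mosses": "present", "Apotreubia": "absent",
    "Jungermanniopsida": "absent", "Blasia": "present",
    "Marchantiopsida_2": "present", "Marchantiopsida_3": "absent",
}

root_states, changes = fitch(TREE, TRNT)
print("possible root states:", root_states)
print("minimum number of changes:", changes)
```

Fitch parsimony counts changes without distinguishing losses from gains. Interpreting all of them as losses additionally assumes that trnT, once deleted from this spacer, is not regained at the same position.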

Fig. 4

A Bayesian phylogenetic tree of the liverworts under investigation based on fused nad5, rbcL, and rps4 gene data sets (GTR + G + I substitution model, two parallel runs for 10 million generations each, every 100th tree sampled, 1,000 trees discarded as burn-in). Posterior probabilities are indicated in the phylogram as percentage node supports where they are at least 90%. Branch lengths are means of the branch length posterior probability distribution of all sampled trees. The filled circle indicates inversion of a major part of the trnA-nad7 spacer including trnT in the marchantiid liverworts. Open triangles indicate independent secondary losses of trnT and flanking sequences in Apotreubia, in the jungermanniid, and in the derived marchantiid liverworts, respectively

To explore the potential utility of the now investigated trnA-nad7 intergenic region as a candidate phylogenetically informative locus, we constructed phylogenetic trees based on the spacer sequence data set. In an approach using the complete alignment encompassing all taxa, the five basal marchantiids with inverted trnT were, somewhat expectedly, artificially retrieved as monophyletic due to the large number of shared characters in the inverted sequence region (not shown). On the other hand, well-resolved phylogenetic trees were retrieved for the taxonomic subsets with the same arrangements of the intergenic region, i.e., the jungermanniid taxa (Fig. 5A), and the marchantiid taxa with the large inverted trnT region (Fig. 5B). Phylogenetic resolution, however, was significantly lower for the derived marchantiid taxa (Fig. 5C), featuring a much smaller intergenic region lacking the (inverted) trnT sequence stretch, in line with the extreme primary sequence conservation that had been observed for marchantiids before. Compared to the now reasonably well-supported liverwort phylogenies (or see Fig. 4), it can be stated that despite the small taxon sampling, the trnA-nad7 region may be a welcome addition in phylogenetic analyses for the remaining unanswered questions among jungermanniid or basal marchantiid taxa.

Fig. 5

Phylogenetic trees based on alignments of the trnA-nad7 regions of jungermanniid liverworts (A), of basal marchantiid liverworts with inverted trnT (B), and derived marchantiid liverworts without trnT in the intergenic region (C). Blasia was included in the alignment to derive the phylogeny under B with an artificially inverted nad5-trnT fragment. Shown are neighbor-joining trees using Tamura three-parameter corrected distances (pairwise deletion) with bootstrap support from 1,000 replicates as a conservative measure of node reliability
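For orientation, the Tamura three-parameter distance underlying these trees corrects the observed proportions of transitions (P) and transversions (Q) for G+C content (θ): d = -h ln(1 - P/h - Q) - (1/2)(1 - h) ln(1 - 2Q), with h = 2θ(1 - θ). The sketch below computes this distance for one pair of aligned sequences with pairwise deletion of gapped or ambiguous sites. It is an illustration only: the function name is hypothetical, θ is taken here as the mean G+C content of the two sequences over the compared sites (an assumption), and it does not reproduce the internals of MEGA.

```python
import math

PURINES = {"A", "G"}

def t92_distance(seq1, seq2):
    """Tamura three-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped
    (pairwise deletion). Hypothetical helper for illustration only.
    """
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    # Transitions stay within purines or within pyrimidines; transversions cross.
    p = sum(a != b and (a in PURINES) == (b in PURINES) for a, b in pairs) / n
    q = sum(a != b and (a in PURINES) != (b in PURINES) for a, b in pairs) / n
    theta = sum((a in "GC") + (b in "GC") for a, b in pairs) / (2 * n)
    h = 2 * theta * (1 - theta)  # undefined (h = 0) if G+C content is 0 or 1
    return -h * math.log(1 - p / h - q) - 0.5 * (1 - h) * math.log(1 - 2 * q)

# Toy sequences (not real data): one transition, and one gapped site that is skipped.
print(t92_distance("ACGTACGTAC-A", "ACGTACGTATGA"))
```

With θ = 0.5 the formula reduces to the Kimura two-parameter distance, which provides a simple sanity check.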

The secondary structure of trnT(GGU), which was found to be subject to genomic recombination, remains largely unaffected in the majority of taxa (Fig. 6). An extra 3-bp hairpin arm between the anticodon and the pseudouridine arm is conserved in all species. Only minor nucleotide exchanges in nonpaired regions were observed at positions 15, 16, 26, and 32 in some mosses and Chara, whereas base-paired positions 47 and 66 are alternatively part of G-C or G-U base pairs, respectively (Fig. 6). Somewhat more conspicuous are the U-to-C transition at position 4 in Chara vulgaris (shared by Takakia lepidozioides and all liverworts), which leads to a C-A mismatch in the acceptor stem, and the U-to-G transversion in the acceptor stem of Sphaerocarpos; both potentially impair the ability of the tRNA to decode ACY threonine codons.
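The base-pairing statements above can be made explicit with a short sketch that classifies stem pairs (Watson-Crick, G-U wobble, or mismatch) and lists the codons an anticodon can read. It assumes a simplified version of the classical wobble rules for unmodified bases at the first anticodon position. The function names are hypothetical and base modifications are ignored.

```python
# Watson-Crick partners in the RNA alphabet.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble rule (assumption): third-codon-position bases that the
# 5' base of the anticodon (position 34) can pair with.
WOBBLE_34 = {"G": "CU", "U": "AG", "C": "G", "A": "U"}

def pair_type(base1, base2):
    """Classify a stem base pair as 'Watson-Crick', 'wobble', or 'mismatch'."""
    if WATSON_CRICK[base1] == base2:
        return "Watson-Crick"
    if {base1, base2} == {"G", "U"}:
        return "wobble"
    return "mismatch"

def codons_read(anticodon):
    """Codons (5'->3') that an anticodon written 5'->3' can decode."""
    first, middle, last = anticodon.upper()
    codon_pos1 = WATSON_CRICK[last]    # pairs with anticodon position 36
    codon_pos2 = WATSON_CRICK[middle]  # pairs with anticodon position 35
    return sorted(codon_pos1 + codon_pos2 + third for third in WOBBLE_34[first])

print(codons_read("GGU"))   # codons read by the trnT(GGU) anticodon
print(pair_type("G", "C"), pair_type("G", "U"), pair_type("C", "A"))
```

Under these rules the GGU anticodon reads both ACC and ACU, i.e., the ACY codons mentioned above, while a C-A combination in a stem is scored as a mismatch.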

Fig. 6

Cloverleaf secondary structure of trnT(GGU) obtained using the RNAfold WWW service (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi), with nucleotide positions showing exchanges in certain taxa highlighted in boldface. 2-Sphaerocarpos—G; 4-Chara, Takakia, and all liverworts—C; 5-Chara—A; 15-Dawsonia and Pogonatum—A; 16-Chara—G; 26-Takakia—G; 32-Chara—T; 33-Blasia—C; 42-Dawsonia—G; 47-Marchantia, Sphaerocarpos, Riella, Lunularia, and Blasia—C; 55-Sphaerocarpos and Riella—G; 66-Dawsonia, Pogonatum, Takakia, and liverworts—C

Discussion

Liverworts are now supported unequivocally as a monophyletic clade by molecular data and, as such, are also reasonably well supported as the sister group to all other land plants (Qiu et al. 1998, 2006). Marchantia polymorpha is widely considered as the prototype liverwort. However, it should be noted that as a complex-thalloid species, it represents only one of several major clades of liverworts. With the present study it becomes clear that Marchantia may even represent more of an exception than a rule in mtDNA organization, given that trnT and most of the trnA-nad7 intergenic spacer are lacking in most marchantiid and all jungermanniid liverworts as well as in Apotreubia representing the Haplomitriopsida.

This is the first report of significant divergence in mtDNA organization within the liverwort clade. Here we have traced the recombination events in the trnA-nad7 region, which now allows us to plot parsimoniously a series of events in its molecular evolution onto a phylogeny of this early land plant clade. Notably, in contrast to the frequently recombining mtDNAs of flowering plants, the mtDNA of Marchantia polymorpha had been not only mapped as but also physically identified as a single circular molecule (Oda et al. 1992a). No subgenomic circles typical for angiosperm mtDNAs have been identified; however, concatemers of the circular genome and linear forms, possibly representing replication intermediates, seem to be present in significant amounts in Marchantia mtDNA preparations (Oldenburg and Bendich 1998, 2001). The now identified series of recombination events in the marchantiid liverwort lineage clearly documents ancient recombinational processes that were active on evolutionary timescales. The extensive rearrangements in the intergenic region studied here starkly contrast with the striking conservation, even across all three bryophyte divisions, of two other intergenic regions, nad5-nad4 and nad4-nad2 (Groth-Malonek et al. 2007a).

When rearrangements of organellar genomes occur as rare, one-time events in evolutionary history, they are of particular use for phylogenetic studies by defining monophyletic clades. A chloroplast DNA inversion, which clearly sets lycopodiophytes, but not the whisk fern Psilotum, apart from other tracheophytes (Raubeson and Jansen 1992b), is a typical example of an early phylogenetic insight now well documented through molecular studies confirming the identification of euphyllophytes as a monophyletic group (Pryer et al. 2001). Along similar lines, a 22-kb inversion of cpDNA in Asteraceae except Barnadesioideae and the absence of an inverted repeat sequence copy as a synapomorphy of conifers are other examples (Jansen and Palmer 1987; Raubeson and Jansen 1992a). Similarly, the loss of rpoA in arthrodontous mosses (Goffinet et al. 2005) or the large inversion of 71 kb in the plastome in funariid mosses except the Gigaspermaceae described more recently (Goffinet et al. 2007) are significant phylogenetically informative events. A similar case in point is the degeneration of nad7 into a pseudogene in all liverworts except the Haplomitriopsida (Groth-Malonek et al. 2007b), which independently confirms a placement of the latter as sister to all other liverworts. In a similar manner, the new data presented here now independently confirm the unequivocal placement of Blasia among, and basal to all other, marchantiid liverworts. Rare molecular apomorphies like these identified on genomic scales are welcome support for phylogenies including such isolated basal lineages, which may be subject to artificial long-branch attraction in sequence-based molecular phylogenies.

Moreover, this study has shown that the loss of trnT from the trnA-nad7 region has occurred three times independently: once in the jungermanniid lineage, once (after a major sequence inversion) in a lineage of derived marchantiid liverworts, and once in Apotreubia. These events are in full accord with current insights on liverwort phylogeny. The now investigated mtDNA region holds promise as a phylogenetically informative locus, largely for the jungermanniid liverworts (due to faster sequence evolution) and possibly also for the basal marchantiid taxa (mainly due to the longer intergenic region). In contrast, it cannot offer phylogenetic resolution for the derived marchantiids, given the large deletion within the intergenic region and the high degree of sequence conservation.