Introduction

Among more than 36 coding and noncoding regions of the plastid genome that have been sequenced to infer relationships among the divergent lineages of land plants, only four regions (trnL-F spacer, trnS-Q spacer, rbcL, and matK) were used in Taxaceae and Cephalotaxaceae, two ancient lineages of seed plants (Cheng et al. 2000; Wang and Shu 2000; Wang et al. 2003; Hao et al. 2008). Although the number of available markers is gradually increasing, the trnL-F region, including the trnL UAA intron (here abbreviated “trnL intron”) and trnL-F intergenic spacer, still represents one of the most frequently used molecular markers, not only in gymnosperms but also in other plants. The trnL intron interrupts the anticodon loop of the tRNALeu, which is encoded in the large single-copy region of the plastid genome. In land plants, the trnL intron is located between the tandemly arranged tRNA genes trnL UAA and trnF GAA, which is generally referred to as the trnL-F region. Since its introduction into molecular systematics by Taberlet et al. (1991), the trnL intron and trnL-F spacer have been considered appropriate for investigations at various taxonomic levels. High-level studies cover inference of relationships among cyanobacteria, algae, and land plants, thus evaluating the evolutionary history of the trnL intron, relationships of basal angiosperms (Borsch et al. 2003), as well as the molecular evolution of the trnL-F region in land plants (Quandt et al. 2004). In Taxaceae and Cephalotaxaceae, however, the intron has not been used to clarify species-level relationships, and the trnL-F spacer was used in only one study with sparse taxon sampling (Wang et al. 2003).

One of the virtues of the trnL intron, and of group I introns in general, is its mosaic structure of conserved elements (internal guide sequence, P, Q, R, and S) and common secondary structure elements, which are essential for correct splicing (Cech 1990), and less constrained regions of variable size interspersed between P, Q, R, and S (e.g., Cech et al. 1994). Group I introns undergo splicing via a two-step transesterification reaction, which requires, among other things, two highly conserved elements: a U preceding the 5′ splice site and a G preceding the 3′ splice site (Cech 1990). In contrast to the tRNALeu group I intron of cyanobacteria, which readily undergoes autoexcision (Zaug et al. 1993), self-splicing activity has so far not been reported for plastid trnL introns in land plants, nor for plastid introns in general, with the exception of Chlorokybus (Simon et al. 2003). It has therefore been argued that plastid group I introns depend on splicing factors that interact with the RNA and facilitate secondary and perhaps tertiary structure formation (e.g., Simon et al. 2003). The lack of self-splicing activity may have relaxed constraints on intron sequence divergence, resulting in higher sequence divergence rates in plastid introns than in cyanobacteria (Besendahl et al. 2000).

In spite of their extensive use in taxonomy and DNA barcoding of land plants (Stech et al. 2003; Taberlet et al. 2007), little is known about the molecular evolution of the trnL intron and trnL-F spacer in Taxaceae and Cephalotaxaceae. The current sequence data for Taxaceae and Cephalotaxaceae are too limited to characterize the region in detail or to analyze mutation patterns in combination with secondary structure information, and no comparative analysis of the sequences has been performed so far. Therefore, we present an overview of the complex molecular evolution of the trnL intron and the trnL-F spacer based on newly amplified and GenBank sequences from all taxa of Taxaceae and Cephalotaxaceae, including analyses of lengths and GC contents, sequence similarities, and evolutionary relationships of the trnL-F region of all species.

Materials and Methods

The samples, voucher information, and GenBank accession numbers obtained in this study are listed in a supplementary table, available online. Genomic DNA was extracted using a Universal Genomic DNA Extraction Kit (Takara, Dalian, China).

A 50 μl PCR reaction mix consisted of 5 μl of 10× reaction buffer, 4 μl each 2.5 mM dNTP stock, 2.5 μl of 10 μM forward and reverse primers (synthesized by Takara, Dalian, China), 0.5 μl bovine serum albumin (10 mg/ml), and 1.5 U Ex Taq polymerase (Takara, Dalian, China). Approximately 50 ng genomic DNA was used as a template for the reaction. The reaction mixture was placed in a Takara PCR Thermal Cycler Dice (Takara, Japan). The primers used for amplification of trnL UAA -trnL UAA -trnF GAA (B49317: 5′-CGAAATCGGTAGACGCTACG and A50272: 5′-ATTTGAACTGGTGACACGAG) and the cycling (38 cycles) conditions were described previously (Taberlet et al. 1991). DNAs were purified using Agarose Gel DNA Purification Kit (Takara).

All PCR products were subcloned into the TA cloning vector pMD19-T (Takara). The plasmids were purified for sequencing. The ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, CA) was used for the sequencing reaction with RV-M and M13-47 primers. The sequences were detected using an ABI Prism 377 Genetic Analyzer (Applied Biosystems). Two PCR products from one sample were sequenced twice in both directions.

We obtained 27 new sequences from Taxaceae, 14 sequences from Cephalotaxaceae, and two from Podocarpaceae (Supplementary Table S1). A data matrix for sequences of each chloroplast region was constructed using Clustal W (Table 1). Length and G + C content were calculated from the aligned sequences. Sequence divergences were calculated using the maximum composite likelihood (MCL) model and Mega4 (Tamura et al. 2007). Chloroplast tRNALeu(UAA) intron and adjacent IGS sequences were compared to the sequences of tRNALeu(UGU) and tRNAPhe (GAA) of Pinus thunbergii, the chloroplast of which has been sequenced (Wakasugi et al. 1994), to determine the boundaries of the gene, intron, and spacer. The Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.html; Benson 1999) was used to detect repeats. To predict the secondary structure of the region, especially of tandem repeat sequences found in the trnL intron and trnL-trnF IGS, we used MFold (version 3.2; http://www.bioinfo.rpi.edu/applications/mfold; Zuker 2003), which estimates the secondary structure of DNA by thermodynamics. To detect promoter elements of the tRNA genes, we used Neural Network Promoter Prediction (http://www.fruitfly.org/seq_tools/promoter.html) to examine the IGS sequences for “–35” and “–10” promoter element homologies. The data matrices of Borsch et al. (2003) and Quandt et al. (2004), which comprise trnL intron and spacer sequences of numerous land plants, were also compared to those of the Taxaceae and Cephalotaxaceae.
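As a rough illustration of how length and G + C content can be derived from aligned sequences, the following Python sketch skips gap characters and counts only unambiguous bases. The function names and the toy alignment are hypothetical and are not part of the original workflow, which started from Clustal W alignments.

```python
def ungapped_length(aligned_seq):
    """Number of nucleotides in an aligned sequence, ignoring gap characters."""
    return sum(1 for base in aligned_seq if base not in "-.")


def gc_content(aligned_seq):
    """Percentage of G + C among unambiguous bases (A, C, G, T/U)."""
    seq = aligned_seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T") + seq.count("U")
    return 100.0 * gc / total if total else 0.0


# Hypothetical toy alignment, not data from this study.
alignment = {
    "taxon_A": "ATGC--GCTA",
    "taxon_B": "ATGCAAGCTA",
}
for name, seq in alignment.items():
    print(name, ungapped_length(seq), round(gc_content(seq), 2))
```

Per-genus means and standard deviations, as reported in the Results, would then be taken over the values of all sequences in a genus.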

Table 1 Data matrix for sequences of each chloroplast region

The best-fit evolutionary model and the gamma shape parameter of among-site rate variation were inferred with ModelTest 3.8 (Posada 2006); the latter was used to calculate the transition/transversion ratio (R) with Mega4. Distances were estimated using the pairwise-deletion option, and standard errors were calculated by the bootstrap method with 1,000 replicates. The presence of selection in the trnL-F regions was tested using Tajima’s neutrality test statistic D, with the null hypothesis H0 of neutral mutation (no selection) and the alternative hypothesis H1 of selection (Tajima 1989; Tamura et al. 2007). The D statistic and the probability of rejecting the null hypothesis were obtained, with significance levels indicated as *P (P < 0.05) and **P (P < 0.01). A sliding window analysis of nucleotide variation across the different genera was performed with the DnaSP program (Rozas et al. 2003), with pairwise removal of alignment gaps. The window size was 20 bp, and the window midpoint was moved in steps of 5 bp along the sequence of the trnL intron, 3′ exon, and trnL-F spacer.
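The sliding window analysis can be sketched in a few lines of Python. This is a simplified stand-in for DnaSP, not its actual implementation: nucleotide diversity is taken here as the mean proportion of differing sites over all sequence pairs, with gapped sites skipped pair by pair. The function names are hypothetical; the window of 20 and the step of 5 follow the settings given in the text.

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Proportion of differing sites between two aligned sequences,
    skipping sites where either sequence has a gap (pairwise deletion)."""
    compared = differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        differences += x != y
    return differences / compared if compared else 0.0


def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differences (pi) over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def sliding_pi(seqs, window=20, step=5):
    """Return a list of (window midpoint, pi) along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Hypothetical toy alignment with a single variable site at position 2.
toy = ["A" * 30, "A" * 30, "AAC" + "A" * 27]
for midpoint, pi in sliding_pi(toy):
    print(midpoint, round(pi, 4))
```

Plotting pi against the midpoint gives a profile of the kind shown for each genus in Fig. 3.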

We designated Podocarpus as a functional outgroup for the phylogenetic analysis, based on results of phylogenetic analyses of rbcL, trnS-Q, and matK sequences (Cheng et al. 2000; Wang and Shu 2000; Wang et al. 2003; Hao et al. 2008). We used PAUP 4.0b10 (Swofford 2002), Mega4, GARLI (Zwickl 2006), and MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003) for phylogenetic analyses. The data matrix of the trnL intron, trnL 3′ exon, and trnL-trnF spacer was analyzed by parsimony, neighbor-joining (NJ), maximum likelihood (ML), and Bayesian inference. Parsimony analyses were run using heuristic searching, with 100 random taxon-addition replicates, holding 100 trees at each step, tree-bisection-reconnection branch swapping, the MulTrees, Collapse, and Steepest Descent options, and no upper limit for trees held in memory. NJ used MCL distances and pairwise deletion of gaps. ML searches relied on the HKY + I + G model, which ModelTest selected as the best-fitting model for unpartitioned data. Bayesian probabilities were obtained under the mixed partitioned model (Table 1), with four Markov chain Monte Carlo chains run for 8 million generations, using random trees as starting point, and sampling every 1,000th generation. The trees sampled before the saturation of maximum likelihood estimates were discarded as burn-in. Nonparametric bootstrap support for parsimony, ML, and NJ was obtained by resampling the data 1,000 times with the same search options and model.

Results and Discussion

Length and GC Content

Noncoding regions are usually less subject to natural selection than protein-coding regions and so may be more useful for studying plant evolution. The length of the trnL intron was 447 ± 2.26 bp for Taxus, 450 bp for Pseudotaxus, 447 bp for Austrotaxus, 477 bp for Amentotaxus, 461 ± 2.07 bp for Torreya, 472 ± 0.58 bp for Cephalotaxus, and 442 ± 9.35 bp for Podocarpaceae (Fig. 1). The trnL intron of Taxus is shorter than those of Cephalotaxus and Amentotaxus (one-way ANOVA: F = 602.29, P < 0.0001; Tukey HSD-test for pairwise comparison, P < 0.01). Length mutations in Cephalotaxus and Amentotaxus mainly result in an increasing size of stem–loop regions P8 (110 bp and 103 bp, respectively) and P9 (189 bp and 200 bp, respectively; Fig. 2), while the length of [PQRS] remains constant (49 bp) in ingroup genera. The trnL-F spacer (309 ± 25.76 bp, 335 bp, 370 bp, 404 ± 0.45 bp, 370 bp, 223 ± 0.36 bp, and 361 ± 17.83 bp for the above seven groups, respectively) is shorter than the trnL intron, and the spacer of Cephalotaxus is much shorter than those of the other genera (one-way ANOVA: F = 231.56, P < 0.0001; Tukey HSD-test for pairwise comparison, P < 0.01). It is noteworthy within Taxus that the more ancestral T. floridana has a shorter spacer (223 bp, Supplementary Table S1) than T. globosa, and the latter’s spacer (302 bp) is shorter than those of more recent Taxus species (e.g., T. yunnanensis has a spacer of 355 bp). The observed extensions are mainly repetitive elements (Table 2). The length of the trnL 3′ exon was 50 bp in Taxus, Pseudotaxus, Austrotaxus, Amentotaxus, Torreya, and Cephalotaxus and was 45 bp in Podocarpaceae.

Fig. 1

Length variation (A) and percentage variation of GC content (B) of the trnL-F region in Taxaceae and Cephalotaxaceae. Genus codes: Ta, Taxus; Ps, Pseudotaxus; Au, Austrotaxus; To, Torreya; Am, Amentotaxus; Ce, Cephalotaxus; Po, Podocarpus

Fig. 2

Sequence alignment of the cp tRNA Leu (UAA) gene and intron. Open boxes indicate the trnL 5′ and 3′ coding regions and conserved sequence elements (P, Q, R, S) in the intron. Nine stem-and-loop structures are indicated by gray lines below the sequences and labeled P1–P9. Sequences shown are those of Amentotaxus formosana, Torreya grandis, Cephalotaxus griffithii, Austrotaxus spicata, Taxus brevifolia, and Pseudotaxus chienii

Table 2 Tandem repeats found in the trnL-trnF sequences of Taxaceae and Cephalotaxaceae

The G + C contents of the trnL intron, 3′ exon, and spacer showed different trends among the seven genera (Fig. 1). In the trnL intron, Taxus had a significantly lower G + C content than other genera (31.71 ± 0.23% vs. 35.12% of Torreya and 35.19% of Cephalotaxus, one-way ANOVA: F = 492.7, P < 0.0001). In all lineages and within the intron, the GC contents of P8 and P9 are considerably lower than that in [PQRS] (Fig. 1). The GC content of [PQRS] is higher than that of the exon in Torreya, Amentotaxus, Cephalotaxus, Pseudotaxus, and Podocarpus. Therefore, the overall GC content of the intron does not accurately characterize the region. In the trnL 3′ exon, Taxus (44.37 ± 0.81%) was comparable to the other groups, except Podocarpaceae (31.73 ± 1.07%, one-way ANOVA: F = 884.13, P < 0.0001). In the trnL-F spacer, Taxus (27.22 ± 1.86%) was comparable to Pseudotaxus, Austrotaxus, Amentotaxus, and Torreya but had a much lower GC content than Cephalotaxus (33.61 ± 0.59%) and Podocarpaceae (33.79 ± 1.30%, one-way ANOVA: F = 101.9, P < 0.0001). The GC content of the L-F spacer is significantly lower than the intron and the 3′ exon in all ingroup genera (one-way ANOVA: F = 154.72, P < 0.0001; Tukey HSD test for pairwise comparison, P < 0.01).

Nucleotide Diversity, Transition/Transversion Ratio, and Neutrality Test

In the outgroup Podocarpus, the nucleotide diversity of the 5′ portion of the trnL intron (corresponding to P8) was higher than those of the 3′ portion of the trnL intron, exon, and LF spacer (data not shown), and the total nucleotide diversity (π, expressed as the average number of nucleotide difference per site) is significantly higher than the ingroup genera (π 0.04023 vs. π 0.01229, 0.01215, 0.00748, and 0.00542 of Taxus, Amentotaxus, Cephalotaxus, and Torreya, respectively). In contrast, in Taxus, the nucleotide diversity of the trnL-F spacer was highest, followed by the 3′ portion of the trnL intron (corresponding to P9, Fig. 3). In Cephalotaxus, the highest nucleotide diversity was within the 3′ portion of the trnL intron (corresponding to P9, Fig. 3), and in Torreya and Amentotaxus, the nucleotide diversity of the trnL intron was as high as that of the trnL-F spacer (Fig. 3, data not shown). The results suggest that different portions of the trnL-F region had different evolutionary patterns and might not share the same evolutionary history.

Fig. 3

Total nucleotide diversity (π) across the trnL-trnF regions of five genera. A Taxus (including T. floridana); B Taxus (excluding T. floridana); C Torreya; D Cephalotaxus

To gain deeper insight into the molecular evolution of the trnL-F regions, the average number of nucleotide differences per site (p NT) and the transition/transversion bias of different parts were calculated. The results of p NT (Fig. 4) are in accordance with the variation of nucleotide diversity (π, Fig. 3), e.g., the nucleotide variation observed within ingroup lineages is highest for Taxus (p NT = 0.03748 ± 0.00627 substitutions per site), followed by Amentotaxus (0.01226 ± 0.00264), Cephalotaxus (0.00753 ± 0.00162), and Torreya (0.00542 ± 0.00147), in agreement with the temporal differentiation frame of Taxaceae and Cephalotaxaceae (see “Phylogenetic Tree” section below). In contrast, the outgroup Podocarpus shows much more significant variation in the trnL intron (0.06579 ± 0.01577), especially P8 (0.43103 ± 7.07629), and the L-F spacer (0.03993 ± 0.01097). Within the different lineages, except Cephalotaxus, the ti/tv ratio was always higher in the conserved parts of the trnL intron than in the AT-rich P8 (Fig. 4), i.e., there is a transition bias in [PQRS] of the intron. Compared with Torreya and Cephalotaxus, of low nucleotide variation, Taxus and Amentotaxus with higher p NT have lower ti/tv ratios (0.496 and 0.384, respectively). On the other hand, the transition bias is evident in the L-F spacer of Cephalotaxus, Amentotaxus, and Podocarpus but not of Taxus and Torreya. Finally, the transition bias is only really definite in [PQRS] and might fade away if the region is treated as a single evolutionary unit, especially with the addition of stem–loop regions of intron, 3′ exon, and L-F spacer.
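To make the transition/transversion comparison concrete, the sketch below simply counts the two substitution classes between a pair of aligned sequences. It is only an illustration: the R values reported here were estimated with Mega4 under a substitution model, whereas this code gives raw counts. The function names and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify(x, y):
    """Return 'ti' for a transition, 'tv' for a transversion, else None.

    Identical bases, gaps, and ambiguity codes are ignored."""
    if x == y or x not in BASES or y not in BASES:
        return None
    same_class = (x in PURINES) == (y in PURINES)
    return "ti" if same_class else "tv"


def ti_tv_counts(a, b):
    """Count transitions and transversions between two aligned sequences."""
    counts = {"ti": 0, "tv": 0}
    for x, y in zip(a.upper(), b.upper()):
        kind = classify(x, y)
        if kind:
            counts[kind] += 1
    return counts


# Hypothetical sequences: A/G and C/T differences are transitions,
# G/C and T/A differences are transversions.
counts = ti_tv_counts("AGCTAGCT", "GGTTACCA")
print(counts, counts["ti"] / counts["tv"])
```

A ratio above 0.5 indicates more transitions than expected if all substitution types were equally likely, because each base has one possible transition and two possible transversions.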

Fig. 4

Comparisons of the trnL-F regions of the five genera. Average number of nucleotide (pNT) differences per site (A) and transition/transversion bias (B) for (left to right) complete region, trnL intron (including [PQRS] and P8), 3′ exon, and trnL-F spacer. Genus codes as in Fig. 1. Neutrality test statistics (C) are arranged from bottom to top. *P < 0.05; **P < 0.01

As previously reported, GC-rich regions tend toward a transitional bias (Morton et al. 1995), which is reflected by the present data, as [PQRS] obviously has a higher GC content compared with the stem–loop regions [P6] and P8 (Fig. 1, data not shown). In the case of the trnL intron, the transition bias of the core structure might not be linked directly to the GC content but rather to structure stabilization, as the higher GC content of the core structure is attributed to the formation of the secondary structure. For example, if a transition in a helical element of the core structure occurs, there is a 50% chance that the helix will be maintained, but a transversion would always result in a mismatch. Furthermore, the 50% chance of a mismatch transition would require a second transition to match again, as indicated by the number of compensating base pair changes observed. Similarly, in the case of the L-F spacer, the transition bias might be linked directly to the structure stabilization, i.e., the stem–loop structure related to the putative promoter sequences and tandem repeats (Fig. 5 and Supplementary Fig. S2).
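The 50% figure can be checked by enumerating every single-base substitution in a Watson-Crick pair. The sketch below assumes that G-U wobble pairs count as maintaining an RNA helix, which is the usual reading of this argument but is not stated explicitly in the text; the function names are hypothetical.

```python
# Pairs assumed to maintain a helix: Watson-Crick pairs plus G-U wobble pairs.
PAIRS_OK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}
WATSON_CRICK = [("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")]
PURINES = {"A", "G"}


def is_transition(old, new):
    """Purine to purine or pyrimidine to pyrimidine."""
    return (old in PURINES) == (new in PURINES)


def helix_survival():
    """Fraction of single substitutions that keep a base pair intact,
    reported separately for transitions and transversions."""
    tally = {"transition": [0, 0], "transversion": [0, 0]}  # [maintained, total]
    for five, three in WATSON_CRICK:
        for strand in (0, 1):
            old = (five, three)[strand]
            for new in "ACGU":
                if new == old:
                    continue
                mutated = (new, three) if strand == 0 else (five, new)
                kind = "transition" if is_transition(old, new) else "transversion"
                tally[kind][1] += 1
                tally[kind][0] += mutated in PAIRS_OK
    return {kind: kept / total for kind, (kept, total) in tally.items()}


print(helix_survival())
```

Under this assumption, half of the transitions produce a wobble pair (for example, G-C to G-U) and none of the transversions produce an allowed pair, which matches the reasoning given above.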

Fig. 5

RNA secondary structures of the trnL-F spacer of T. globosa (1), A. formosana (2), and T. chinensis (3). The regions containing the tandem repeat area are indicated by arrows

Statistical tests of evolutionary neutrality (Tajima’s D) were performed using sequence alignment under a standard neutral model to differentiate selection from plant evolutionary history. Tajima’s test is most powerful for testing alternative hypotheses involving selective sweeps and population subdivision (Simonsen et al. 1995). Evidence suggestive of selection was present in the trnL intron and the L-F spacer of Taxus and in the L-F spacer of Cephalotaxus (Fig. 4). Tajima’s D of the Taxus intron deviates from zero significantly whether or not T. floridana is included, suggesting that selection, possibly both positive and purifying, is involved in differentiation and diversification of the Taxus trnL intron. Tajima’s D of the Taxus L-F spacer deviates from zero significantly with T. floridana included but is not significant when T. floridana is not included, suggesting that selection is mainly involved in differentiation of the Taxus spacer between the more ancestral T. floridana and other species. The length mutational events (from 223 bp to longer) of the L-F spacer might be adaptive changes during the speciation of T. floridana’s more derived relatives. The involvement of selection in the evolution of the L-F spacer was found in the angiosperm Trema dielsiana (Chiang et al. 2004). Likewise, Tajima’s D is significantly negative (−2.124) for the trnL-F spacer of Cephalotaxus. Both the nucleotide variation (p NT) and nucleotide diversity (π) are lowest among different trnL-F regions, and the transition bias (R = 2.108) is most significant, suggesting that purifying selection might play a role in the evolution of the conserved L-F spacer of Cephalotaxus. These results motivate further studies of trnL-F regions in determining inter-genus/species differences in evolutionary pattern.
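For readers unfamiliar with the statistic, Tajima's D compares the mean number of pairwise differences with the number of segregating sites, scaled by its expected standard deviation (Tajima 1989). The following Python sketch computes D from an alignment; it is a minimal illustration with a hypothetical function name and toy data, uses complete deletion of gapped columns for simplicity, and does not compute the P values that Mega4 reports.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned sequences of equal length.

    Returns None when D is undefined (fewer than four sequences
    or no segregating sites)."""
    n = len(seqs)
    # Keep only gap-free alignment columns (complete deletion).
    columns = [col for col in zip(*seqs) if "-" not in col]
    seg_sites = sum(1 for col in columns if len(set(col)) > 1)
    if n < 4 or seg_sites == 0:
        return None

    # Mean number of pairwise differences, k.
    pairs = list(combinations(range(n), 2))
    k = sum(col[i] != col[j] for col in columns for i, j in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Hypothetical toy data: three singleton mutations give a negative D.
print(tajimas_d(["AAAAAA", "CAAAAA", "ACAAAA", "AACAAA"]))
```

An excess of rare variants, as in the toy data, pushes D below zero, whereas an excess of intermediate-frequency variants pushes it above zero.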

Motif, Promoter, and Tandem Repeats

Homologous motifs were not found in the spacer regions of the three families. Specific homologies of Taxaceae and Cephalotaxaceae with other seed plant sequences were also not detected. In contrast, the trnL intron sequences contained several highly conserved motifs (Fig. 2), especially the P, Q, and R sequence elements and the P1, P2, and P3 elements, which are essential for correct splicing (Cech 1990) and thus might be subject to purifying selection. In stem–loop region P1, the internal guide sequence always corresponds to GAG. The stem–loop region P2 is highly conserved in size (24 bp), as no single-base indel events in the terminal loop were scored. The P5, P6, P8, and P9 structural elements were variable and might be under less functional constraint (Supplementary Fig. S1). Comparison of Taxaceae and Cephalotaxaceae sequences with those of other land plants also revealed the expected conserved tRNALeu group I intron structure. Stem–loop regions P6, P8, and P9 of the trnL intron and the trnL-F spacer (except the promoter elements, see below) might undergo neutral evolution with respect to their escape from functional constraints.

In the trnL-F spacer, two conserved elements with a high similarity to bacterial sigma70-type promoters were situated toward the trnF gene. (The consensus sequences of the –35 and –10 motifs, with spacing between 15–20 bp, are shown in Supplementary Fig. S2.) Both putative promoter elements are present in almost all land plant lineages, from bryophytes to ferns, gymnosperms, and angiosperms (Borsch et al. 2003; Quandt et al. 2004), with occasional losses in mosses, liverworts, and Gnetales (Won and Renner 2006). Secondary structure calculations revealed that –10 promoter elements of Taxus, Austrotaxus, Pseudotaxus, Torreya, and Podocarpaceae are involved in forming the stem of the stable stem–loop structures (Supplementary Fig. S2), while those of Cephalotaxus and Amentotaxus are not. In contrast, only the –35 motif of Pseudotaxus and Podocarpaceae is involved in forming the stem of the stable stem–loop structure, but that of the other genera is not. The trnL-F spacer can be regarded as the starting point for the transcription of the tRNAPhe; however, functionality of both promoters needs to be proven by experimental data.
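A simple consensus scan gives an idea of what a search for paired –35 and –10 elements involves. The sketch below is not the Neural Network Promoter Prediction method used in this study; it swaps in a plain mismatch count against the canonical bacterial sigma70 hexamers (TTGACA and TATAAT) with the 15–20 bp spacing mentioned above. The function names, the mismatch threshold, and the test sequence are hypothetical.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, box35="TTGACA", box10="TATAAT",
                   max_mismatch=1, spacing=range(15, 21)):
    """Return (start of -35 box, start of -10 box, spacer length) tuples.

    A hit needs both hexamers within max_mismatch of their consensus and
    a spacer of 15-20 bp between the end of the -35 box and the -10 box."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], box35) > max_mismatch:
            continue
        for gap in spacing:
            j = i + 6 + gap
            if j + 6 <= len(seq) and mismatches(seq[j:j + 6], box10) <= max_mismatch:
                hits.append((i, j, gap))
    return hits


# Hypothetical sequence with perfect boxes separated by a 17-bp spacer.
toy = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GG"
print(find_promoters(toy))
```

Hits from such a scan are only candidates; as noted above, promoter function has to be confirmed experimentally.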

The Tandem Repeats Finder program detected 9, 3, 2, 1, 1, 0, and 4 putative repeats in Taxus, Amentotaxus, Torreya, Pseudotaxus, Cephalotaxus, Austrotaxus, and Podocarpaceae, respectively (Table 2), using the following search options: alignment parameters 2, 3, 5 and minimum alignment score 40. The detected repeat sequences were 2–20 bp in length, had 2.0–24.5 copies, and their match points were between 0.66 and 1.00. Seven repeats were found in the trnL intron (one in Cephalotaxus, one in Amentotaxus, two in Taxus, and three in Podocarpaceae), and all other repeats were located in the trnL-F spacer. These repetitive elements are probably due to slipped-strand mispairing (Levinson and Gutman 1987), which has the potential to expand, contract, and diversify among lineages in different ways (Benson and Dong 1999; Zhu et al. 2000). The tendency of the repeat motif and the flanking sequences to form a stem is evident from the secondary structure model (Fig. 5). For example, the mutations that occurred between Taxus and Amentotaxus led to considerable modifications in the secondary structure, particularly in the area where the tandem repeat is located. A characteristic 10-bp repeat (copy no. 2.2, the bottom arrow in Fig. 5) in T. globosa forms half of the AT-rich stem that differentiates T. globosa from the other Taxus species. A long 17-bp repeat (copy no. 3.4) that forms a stem–loop structure is unique to T. canadensis. Taxus yunnanensis is closely related to T. wallichiana in the phylogenetic tree (Fig. 6), but a long 20-bp repeat (copy no. 3.2, Table 2) that forms a unique short stem–loop structure is present in the former species and absent in the latter. Some tandem repeats found in Taxus coincided with phylogenetic groups. For example, T. cuspidata, T. cuspidata var. nana, and T. hunnewelliana form a group in the phylogenetic tree (Fig. 6), and they all possess a 15-bp repeat of copy no. 2.5. Two North American species, T. floridana and T. globosa, possess a 12-bp repeat (copy no. 3.5) in the trnL intron. The occurrence of the tandem repeats may confer increased stability on the pre-RNA transcript by restricting the number of possible folding isomers and by positively affecting RNA processing.
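The notion of a fractional copy number (for example, 2.2 copies of a 10-bp unit) can be illustrated with a naive exact-match search. This sketch is not the Tandem Repeats Finder algorithm, which scores approximate repeats statistically; it only finds perfect repeats, and the function names and example sequence are hypothetical.

```python
def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif."""
    size = len(unit)
    return not any(unit == unit[:d] * (size // d)
                   for d in range(1, size) if size % d == 0)


def find_tandem_repeats(seq, min_period=2, max_period=20, min_copies=2.0):
    """Return (start, period, copies, unit) for perfect tandem repeats."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for period in range(min_period, max_period + 1):
        i = 0
        while i + period < n:
            # Extend while each base equals the base one period upstream.
            run = 0
            while i + period + run < n and seq[i + run] == seq[i + period + run]:
                run += 1
            copies = (run + period) / period
            unit = seq[i:i + period]
            if copies >= min_copies and is_primitive(unit):
                hits.append((i, period, round(copies, 1), unit))
                i += run + period  # continue after this repeat
            else:
                i += 1
    return hits


# Hypothetical sequence: two full copies of a 10-bp unit plus 2 extra bases.
unit = "ACGTTGCATC"
toy = "GG" + unit * 2 + unit[:2] + "TG"
print(find_tandem_repeats(toy))
```

Here the copy number is the length of the repeated stretch divided by the unit length, so 22 matching bases of a 10-bp unit give 2.2 copies.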

Fig. 6

Bayesian 50% majority rule consensus tree (8,000 trees sampled; burn-in = 2,000 trees), inferred from trnL intron + trnL 3′ exon + trnL-F spacer DNA alignment under the mixed partitioned model. Bayesian posterior probabilities (%) are given above branches, before slash (/). ML bootstrap proportions (%) calculated under the HKY + I + G model are given above branches, after slash (—, clade not included in the tree). MP (CI = 0.7416, excluding uninformative characters; RI = 0.9559) and NJ bootstrap proportions are shown below branches (MP/NJ). Branch lengths (shown on the right; scale bar, expected number of substitutions per site) are proportional to the mean of the posterior probabilities of the branch lengths of the sampled trees

Phylogenetic Tree

Gene genealogies retain information about organismal and gene histories (Schaal and Olsen 2000). The molecular clock was rejected for the trnL intron dataset, the trnL-F spacer, and the combined dataset, because constrained and unconstrained analyses differed significantly (Table 1). On the phylogenetic tree (Fig. 6) inferred from the partitioned Bayesian analysis, ML analysis, and MP analysis, the clade formed by Austrotaxus and Pseudotaxus is basal to the large clade formed by Taxus and Torreya–Amentotaxus–Cephalotaxus. Based on fossil-calibrated molecular clocks, the split between Taxaceae and Cephalotaxaceae had already occurred by at least 200 mya (Won and Renner 2006). On the trnL-F tree, Torreya and Amentotaxus are closer to each other than to Cephalotaxus. The monophyly of each genus is well supported by all four methods. Taxus floridana is basal to the other Taxus species. Apart from T. floridana, T. globosa is the first-branching species in Taxus. Within Taxus, two groups were resolved. One consisted of two hybrids and their parental species T. cuspidata, which was also recovered in the ITS tree of Li et al. (2001), and the other consisted of T. yunnanensis and T. wallichiana, which was not included in the Li et al. MP analysis. Taxus baccata grouped with T. contorta, a grouping that was also recovered with other nuclear and chloroplast DNA markers (Hao et al. 2008). Within Torreya, Torreya californica was the first-branching species, and To. yunnanensis grouped with To. fargesii and To. nucifera. Within Cephalotaxus, the group formed by Cephalotaxus koreana, C. harringtonia cv. fastigiata, and C. wilsoniana was well supported by all four methods, which agrees well with the previous AFLP study (Zhang et al. 2000). Four closely related species, C. mannii, C. hainanensis, C. griffithii, and C. oliveri, fell into two groups; C. griffithii and C. oliveri were first-branching species, whereas C. mannii and C. hainanensis formed a trichotomy with a group consisting of other species.

It is known that the cp trnF gene is cotranscribed with the trnL gene. Bonnard et al. (1984) and Koch et al. (2005) provided evidence for this cotranscription. Nevertheless, the structural aspects of the trnL-F spacer in Taxaceae, Cephalotaxaceae, and Podocarpaceae, especially the presence of sigma70-type promoter elements, tandem repeats, and stem–loop structures, favor an interpretation of independent transcription. Both promoter elements are conserved across most land plants and are able to form stable stem–loop structures that might play a role in recognition and/or processing of the intercistronic regions (Hong et al. 1995).

The length variation in trnL-F reported here (Fig. 1) is within the range observed in seed plants. It is possible that tandem repeats, causing nonhomologous recombination and/or unequal crossing-over, are responsible for the great trnL-F length variation in Taxaceae, Cephalotaxaceae, and Podocarpaceae. In the three families studied, we found 20 short repeats of 2–20 bp in all species except A. spicata (Table 2). Many repeats were not limited to a single species. Tandem repeats are therefore one of the causes of the observed significant trnL-F length variation in the three families. Similar tandem repeats have not been reported previously in gymnosperms, whereas in an orchid lineage, three different tandem repeat loci were found, which evolved independently at the same position in the trnL intron (Cozzolino et al. 2003).

In summary, the molecular evolution of the trnL-F region is very complex, as the variable stem–loop regions of the trnL intron, the variable trnL-F spacer, and the more conserved parts such as [PQRS] and the promoter elements seem to evolve under different constraints. Taking into account the long divergence time of Taxaceae and Cephalotaxaceae, independent evolution of the highly variable parts, such as the stem–loop region P8 and the regions between the conserved blocks of the trnL-F spacer, can be assumed. The evolutionary rate could be underestimated if the more variable regions, such as P8, are completely ignored. The molecular evolutionary uniqueness of the trnL-F region of Taxaceae and Cephalotaxaceae should be kept in mind when it is used as a phylogenetic marker at various taxonomic levels.