Introduction

The sandalwood order Santalales, with 20 families and approximately 160 genera and 2200 species (Der and Nickrent 2008; Nickrent et al. 2010, 2019; Su et al. 2015), harbors the richest diversity of parasites within the plant kingdom, including lifestyles from autotrophy to hemiparasitism and holoparasitism (Nickrent 1997, 2002; Der and Nickrent 2008; Su et al. 2015). Arceuthobium (the dwarf mistletoes), which is found in both the Old and New Worlds (Frank et al. 1996), is a genus belonging to the Santalales family Viscaceae. According to the most recent taxonomic revision (Frank et al. 1996; Nickrent et al. 2010), the genus comprises approximately 42 species of stem parasites that are capable of photosynthesis (Hull and Leonard 1964a, b; Leonard and Hull 1965; Mathiasen et al. 2008). Arceuthobium is the most host-specialized taxon in Santalales, and the species of the genus exclusively parasitize members of Pinaceae and Cupressaceae (Frank et al. 1996). The life cycle of dwarf mistletoes is also particularly unusual among the Santalales hemiparasites. Compared with close relatives, the leaves of dwarf mistletoes are extremely reduced and their bodies are only several centimeters in height (Frank et al. 1996). In addition, Arceuthobium species do not produce shoots immediately after seed germination, but instead develop an extensive haustorial system that thoroughly permeates the branches of their coniferous hosts (endophytic system); after 2–6 years, aerial shoots arise from the endophytic system (Parmeter et al. 1959; Parmeter and Scharpf 1963; Tong and Ren 1980; Frank et al. 1996; Mathiasen et al. 2008). The extensive endophytic system of dwarf mistletoes enables them to absorb nutrition and water from host plants with significantly higher efficiency than other Santalales hemiparasites (Baranyay et al. 1971; Tocher et al. 1984; Alosi and Calvin 1985; Kirkpatrick 1989; Singh and Carew 1989). This makes Arceuthobium species distinctive among mistletoes.
The leafless and endophytic habit indicates a greater reliance on host plants for nutrients. It is estimated that dwarf mistletoes absorb as much as 80% of their carbon requirement from their hosts (Frank et al. 1996; Parks and Flanagan 2001; Mathiasen et al. 2008). Severe infection by dwarf mistletoes always leads to significant growth declines and premature mortality in their coniferous hosts (Mathiasen et al. 2008). Therefore, epidemics of dwarf mistletoe infection always significantly damage coniferous forests worldwide (Frank et al. 1996).

In green plants, chloroplasts are organelles that conduct photosynthesis and the biosynthesis of starch, fatty acids, pigments, and amino acids (Palmer 1985; Neuhaus and Emes 2000; Daniel et al. 2016). In most angiosperms, plastid genomes (plastomes) are maternally inherited and highly conserved in size, structure, gene content, and organization (Daniel et al. 2016). The structure of a typical angiosperm plastome is circular and quadripartite and consists of a large single copy region (LSC), a small single copy region (SSC), and a pair of inverted repeats (IRs) (Wicke et al. 2011a, b). Nevertheless, the lifestyle transition from autotrophy to heterotrophy in angiosperms always leads to massive modification of plastomes, involving size reduction, structural rearrangements, and loss or pseudogenization of plastid genes, among other changes (Neuhaus and Emes 2000; Wicke and Naumann 2018). As a result, the plastome features of parasitic plants largely differ from those of their autotrophic relatives (Krause 2008; Wicke et al. 2013, 2016; Petersen et al. 2015; Frailey et al. 2018; Schneider et al. 2018; Shin and Lee 2018; Wicke and Naumann 2018; Guo et al. 2020, 2021). To date, the complete plastomes of several Santalales parasites have been sequenced, providing valuable data to disentangle the evolutionary trajectory of plastome reduction associated with parasitism (Petersen et al. 2015; Su and Hu 2016; Li et al. 2017; Yang et al. 2017; Liu et al. 2018; Shin and Lee 2018; Zhu et al. 2018; Guo and Ruan 2019a, b; Jiang et al. 2019; Guo et al. 2019, 2020, 2021; Chen et al. 2020a, b).

Characterization of the complete plastomes of dwarf mistletoes will deepen our understanding of their biology, and because of their significant threats to numerous conifer species worldwide, genomic resources will be conducive to further studies on dwarf mistletoes involving phylogeny, population genetics, and interactions between dwarf mistletoes and their host plants. To date, the complete plastome of only one species, Arceuthobium sichuanense (HS Kiu) Hawksworth & Wiens, has been sequenced (Chen et al. 2020a, b). This represents merely a small fraction of the species diversity in the genus. Thus, the extent to which the plastomes of dwarf mistletoes are degraded and whether their leafless and endophytic habit is correlated with a higher level of plastome reduction than other Santalales hemiparasites remains undetermined.

Here, the complete plastomes of A. chinense and A. pini were sequenced and assembled using the genome skimming approach (Straub et al. 2012). The study was based on a comparative and phylogenetic framework, and the main objectives were as follows: (1) to characterize the genome size, structure, and gene content of the plastomes, and (2) to elucidate whether the leafless and endophytic habit of dwarf mistletoes leads to a higher level of plastome reduction than in other Santalales hemiparasites.

Materials and methods

Plastome sequencing, assembly, and annotation

Plant tissues of A. chinense and A. pini collected from the wild were dried with silica gel, and voucher specimens (Table 1) were deposited in the herbarium of the Kunming Institute of Botany, Chinese Academy of Sciences (KUN), Kunming, China. Total genomic DNA was extracted from ~50 mg of silica gel-dried leaves using cetyltrimethylammonium bromide (CTAB) following the protocol of Doyle and Doyle (1987). Purified DNA was fragmented with a Covaris S2 to an average length of ~350 bp, followed by ligation of adaptors for library amplification according to the manufacturer’s guidelines (Illumina, San Diego, CA, USA). Paired-end shotgun sequencing (2 × 150 bp) was performed on the Illumina HiSeq 2500 platform at Personal Biotechnology (Shanghai, China) to generate approximately 4.5 G of raw data for each sample.

Table 1 Voucher information of the two Arceuthobium species observed in this study and the summary of shotgun sequencing and plastome assembly

The FASTX-Toolkit v.0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit) was used to remove adaptors and reads with ambiguous bases from the raw Illumina data. The clean reads were de novo assembled using NOVOPlasty v.2.7.0 (Dierckxsens et al. 2017), with the k-mer size set at 31. The large subunit of the RuBisCO gene (rbcL) of A. azoricum (HM849787) was used as the seed for the iterative extension of contigs to recover the complete plastome of each species. The newly generated plastomes were annotated using the Dual Organellar GenoMe Annotator (DOGMA; Wyman et al. 2004). The annotation of protein-coding genes was confirmed using a BLAST search against the NCBI protein database. Protein-coding genes with one or more frameshift mutations or premature stop codons were annotated as pseudogenes. Genes putatively annotated as transfer RNA (tRNA) were further verified by tRNAscan-SE v.1.21 (Schattner et al. 2005) with default parameters.
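The pseudogene criterion stated above (one or more frameshift mutations or premature stop codons) can be illustrated with a minimal Python sketch. The function name `classify_cds` is hypothetical, and treating a coding length that is not a multiple of three as a frameshift is a simplifying assumption. This is not the annotation procedure used in the study.

```python
# Stop codons shared by the standard and bacterial/plastid genetic codes.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def classify_cds(seq):
    """Label a coding sequence 'intact' or 'pseudogene' (hypothetical helper)."""
    # Normalize case and drop alignment gaps before inspecting codons.
    seq = seq.upper().replace("-", "")
    # Simplified frameshift proxy (assumption): length not divisible by three.
    if len(seq) % 3 != 0:
        return "pseudogene"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Any stop codon before the final codon counts as premature.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "pseudogene"
    return "intact"
```

In practice, a sequence flagged this way would still need manual inspection, because RNA editing or annotation errors can mimic either signal.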

Comparative and phylogenetic analyses

Previously published plastomes of Santalales (Petersen et al. 2015; Su and Hu 2016; Li et al. 2017; Yang et al. 2017; Liu et al. 2018; Shin and Lee 2018; Zhu et al. 2018; Guo and Ruan 2019a, b; Jiang et al. 2019; Guo et al. 2019, 2020, 2021; Chen et al. 2020a, b) were downloaded from the NCBI GenBank for comparative and phylogenetic analyses (Table S1). The structure and gene content of the Santalales plastomes were compared using Geneious v10.2 (Kearse et al. 2012). Any putative gene deletions detected in the newly generated dwarf mistletoe plastomes were further verified by extracting intact sequences of the corresponding genes from the plastome of Erythropalum scandens Blume (Erythropalaceae, Santalales), the autotrophic relative of dwarf mistletoes, and then performing local BLAST searches against the Illumina reads of each sample. Following this, all Arceuthobium plastomes were pairwise-aligned using mVISTA (Mayor et al. 2000) to investigate sequence divergence.

Both standard maximum likelihood (ML) and Bayesian inference (BI) approaches were used to infer the phylogenetic relationships between Arceuthobium and related taxa. Based on the interfamilial relationships of Santalales recovered in previous studies (Der and Nickrent 2008; Nickrent et al. 2010, 2019; Su et al. 2015; Chen et al. 2020a, b; Guo et al. 2020, 2021), E. scandens was selected as the outgroup to root the phylogenetic trees. Forty-six plastid protein-coding genes commonly shared by the taxa were used in the phylogenetic reconstruction. These genes were separately aligned using MAFFT v.7.450 (Katoh and Standley 2013) and then integrated into a data set using Geneious v10.2 (Kearse et al. 2012).
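The step of integrating separately aligned genes into one data set amounts to concatenating the per-gene alignments taxon by taxon. The sketch below illustrates the idea in plain Python. The function name `concatenate_alignments` and the dictionary layout are assumptions for illustration; the study performed this step in Geneious. Gap padding for a missing gene is included only for generality, since the 46 genes used here were shared by all taxa.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    Returns a dict mapping taxon -> concatenated sequence.
    """
    # Collect every taxon that appears in at least one gene alignment.
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    supermatrix = {t: "" for t in taxa}
    # Process genes in a fixed (alphabetical) order for reproducibility.
    for gene in sorted(gene_alignments):
        aln = gene_alignments[gene]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for t in taxa:
            # Pad taxa lacking this gene with gaps so columns stay in register.
            supermatrix[t] += aln.get(t, "-" * width)
    return supermatrix
```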

The ML analyses were performed using RAxML-HPC BlackBox v8.1.24 (Stamatakis et al. 2008) under the GTRGAMMAI substitution model. The phylogenetic tree was inferred by conducting ten independent ML searches with 1000 replicates of standard bootstrapping (BS). The BI analyses were performed using MRBAYES v.3.1.2 (Ronquist and Huelsenbeck 2003). Markov chain Monte Carlo simulations were initiated with a random tree and run for one million generations, with trees sampled every 100 generations. Trees from the first 25% of generations were discarded as “burn-in”. The posterior probability (PP) values were computed based on the remaining trees.
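As a worked illustration of the sampling scheme described above, the following Python sketch computes how many trees are sampled, discarded, and retained per run. The function name `retained_samples` is hypothetical, and the count ignores the starting tree at generation 0. The number of runs and chains is not stated, so the figures apply to a single run.

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (sampled, discarded, retained) tree counts for one MCMC run."""
    # One tree is written every `sample_every` generations.
    sampled = generations // sample_every
    # The earliest fraction of samples is discarded as burn-in.
    discarded = int(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded

# Settings reported in the text: one million generations, sampling every
# 100 generations, first 25% discarded.
print(retained_samples(1_000_000, 100, 0.25))  # (10000, 2500, 7500)
```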

Results

Plastome features of newly sequenced Arceuthobium species

Illumina paired-end sequencing yielded over 19 million clean reads for each species; the mean depth of the plastome sequencing was 822× for A. chinense and 447× for A. pini (Table 1). The assembled plastomes of A. chinense and A. pini were 116,594 bp and 115,862 bp in size, respectively. They possessed a typical quadripartite structure (Fig. 1), consisting of a pair of IRs (21,333 and 21,303 bp, respectively), an LSC (66,071 and 65,229 bp, respectively), and an SSC (7857 and 8027 bp, respectively). Compared with the other Santalales plastomes (Table 2), the Arceuthobium species had the lowest GC content (33.7–34.9%), which was unevenly distributed among the LSC, SSC, and IRs. The highest GC content was in the IR regions (42.1–42.3%), followed by the LSC (29.7–30.1%). The lowest GC content was observed in the SSC (21.5–26.4%). The plastome-wide comparative analysis using the mVISTA program detected 2510 sequence variations among the 120,242 alignment sites, corresponding to a divergence of 2.087% among A. chinense, A. pini, and A. sichuanense (Fig. 2).
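The reported figures can be cross-checked with simple arithmetic: a quadripartite plastome has length LSC + SSC + 2 × IR, and the divergence proportion is the number of variable sites divided by the number of aligned sites. The short Python sketch below reproduces both calculations from the values given in the text. The function names are hypothetical and serve only as an illustration.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

def percent_divergence(variable_sites, aligned_sites):
    """Variable sites as a percentage of aligned sites, to three decimals."""
    return round(100 * variable_sites / aligned_sites, 3)

# Region sizes (bp) reported for the two newly sequenced plastomes.
print(plastome_length(66071, 7857, 21333))   # A. chinense: 116594
print(plastome_length(65229, 8027, 21303))   # A. pini: 115862
# Variable sites and alignment length reported for the mVISTA comparison.
print(percent_divergence(2510, 120242))      # 2.087
```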

Fig. 1

Plastome map of Arceuthobium chinense and A. pini

Table 2 Comparison of size and GC content (GC%) of complete plastomes, LSC, IR, and SSC regions among Santalales plastomes

Fig. 2

Alignment of three Arceuthobium plastomes using mVISTA, showing the percentages of sequence identity (y-axis)

The plastomes of A. chinense and A. pini contained 86 and 88 intact genes, respectively (Table 3). In contrast to the non-parasitic plant E. scandens, all 11 NAD(P)H-dehydrogenase (NDH) complex genes (ndhA, B, C, D, E, F, G, H, I, J, and K), four RNA polymerase genes (rpoA, rpoB, rpoC1, and rpoC2), a ribosomal protein-coding gene (rpl33), the infA gene, as well as six tRNA genes (trnC-GCA, trnG-UCC, trnH-GUG, trnI-GAU, trnR-ACG, and trnV-UAC) were deleted from the plastomes. Two photosynthesis-related genes (petL and psbZ) were identified as pseudogenes in both species because of the occurrence of premature stop codons. Moreover, an additional loss of cemA and trnK-UUU was detected in A. chinense. As a result of gene loss and pseudogenization, 59 and 60 protein-encoding genes, respectively, and four ribosomal RNA genes, as well as 23 and 24 tRNAs, respectively, were retained in the A. chinense and A. pini plastomes.

Table 3 Comparison of plastome gene content between Santalales hemiparasites and the autotrophic relative Erythropalum scandens

Comparison of plastome structure and gene content

The junctions of IR/LSC and IR/SSC were highly variable in the plastomes of Santalales hemiparasites due to the expansion/contraction of IRs (Fig. 3). Although the examined Santalales hemiparasites had divergent gene content in their plastomes (Fig. 4), the loss or pseudogenization of plastid ndh genes was shared by all of them. Of these taxa, the plastomes of Amphorogynaceae, Santalaceae, Schoepfiaceae, and Ximeniaceae encoded the highest number of intact plastid genes, with a total of 101 to 102, including 66–68 protein-coding genes, 29–30 tRNA genes, and four rRNA genes. Comparatively, the lowest number of intact genes was observed in Arceuthobium, which possessed 54–60 protein-coding genes, 23–24 tRNA genes, and four rRNA genes. In addition, pseudogenization or loss of infA was found in Amphorogynaceae, Cervantesiaceae, Loranthaceae, Opiliaceae, Santalaceae, and Viscaceae. The deletion of the trnV-UAC gene was shared by species of Loranthaceae, Schoepfiaceae, and Viscaceae. Further loss of trnG-UCC was detected in Loranthaceae and Viscaceae. The losses of rpl32, rps15, rps16, trnG-UCC, and trnK-UUU occurred in Loranthaceae. The pseudogenization of some essential photosynthetic genes (psbZ, petL, and ccsA) was identified in Viscaceae, whereas the deletion of RNA polymerase genes (rpoA, rpoB, rpoC1, and rpoC2) only occurred in Arceuthobium.

Fig. 3

Boundaries of inverted repeats (IRA and IRB), large single copy (LSC), and small single copy (SSC) regions are compared at genus-level to show the dynamics of IR expansion/contraction among Santalales plastomes

Fig. 4

Comparison of gene content among Santalales plastomes. 1: Erythropalum scandens (Erythropalaceae); 2: Malania oleifera; 3: Ximenia americana (Ximeniaceae); 4: Schoepfia fragrans; 5: S. jasminodora (Schoepfiaceae); 6: Macrosolen sp.; 7: M. tricolor; 8: Loranthus tanakae; 9: Dendrophthoe pentandra; 10: Tolypanthus maclurei; 11: Helixanthera parasitica; 12: Taxillus delavayi; 13: T. thibetensis; 14: T. chinensis; 15: T. vestitus; 16: T. sutchuenensis; 17: T. nigrans (Loranthaceae); 18: Champereia manillana (Opiliaceae); 19: Pyrularia edulis; 20: P. sinensis (Cervantesiaceae); 21: Osyris alba; 22: O. wightiana; 23: Santalum album; 24: S. boninense (Santalaceae); 25: Dendrotrophe varians; 26: Phacellaria compressa; 27: P. glomerata (Amphorogynaceae); 28: Arceuthobium chinense; 29: A. pini; 30: A. sichuanense; 31: Viscum album; 32: V. coloratum; 33: V. ovalifolium; 34: V. minimum; 35: V. crassulae; 36: V. liquidambaricolum; 37: V. yunnanense (Viscaceae). Red squares: intact genes; yellow squares: pseudogenes; blue squares: deleted genes

Phylogenetic analyses

The ML and BI analyses produced identical tree topologies (Fig. 5). Ximeniaceae (root hemiparasitism) was resolved as an early diverging branch (BS = 100%, PP = 1.00), and the remaining Santalales hemiparasites formed two well-supported clades (BS = 100%, PP = 1.00). Within the first clade, the sister relationship between Schoepfiaceae (root hemiparasitism) and Loranthaceae (stem hemiparasitism) received high branch support (BS = 100%, PP = 1.00). Within the second clade, the successive divergence of the families Opiliaceae (root hemiparasitism), Cervantesiaceae (root hemiparasitism), Santalaceae (root hemiparasitism), Amphorogynaceae (stem hemiparasitism), and Viscaceae (stem hemiparasitism) was recovered with high statistical support (BS = 100%, PP = 1.00).

Fig. 5

Phylogeny of Santalales hemiparasites reconstructed by maximum-likelihood (ML) and Bayesian inference (BI) analyses of 46 protein-encoding genes. The numbers at each node are the maximum-likelihood bootstrap percentage (BS) and posterior probability (PP) values. Green lineages: autotrophy outgroup; blue lineages: root hemiparasites; red lineages: stem hemiparasites

Discussion

Plastome reduction in Santalales hemiparasites

The lifestyle transition from autotrophy to heterotrophy always leads to prevalent gene losses from the plastomes of parasitic plants (Neuhaus and Emes 2000; Wicke et al. 2013, 2016; Wicke and Naumann 2018). A great diversity of plastome sizes, GC content, and numbers of functional (intact) genes was observed in Santalales hemiparasites (Table 2), suggesting that their plastomes have undergone significant modifications associated with the evolution of parasitism (Wicke and Naumann 2018). Compared with the autotrophic relative, E. scandens, the plastid GC content was lower not only in Arceuthobium but also in other Santalales hemiparasites (Table 2). Here, empirical evidence is provided to support the idea that loss/pseudogenization of plastid genes from parasitic plants is accompanied by a reduction in the GC content of their plastomes (Wicke and Naumann 2018). Notably, diverse IR shifts were observed in the plastomes of Santalales hemiparasites (Fig. 3), which supports the hypothesis that decreasing GC content may trigger significant IR expansion/contraction in the plastomes of parasitic species (Wicke et al. 2013, 2016).

A typical angiosperm plastome contains 113 unique genes, including 79 protein-coding genes, 30 tRNA genes, and four ribosomal RNA genes (Wicke et al. 2011a, b). In the examined Santalales taxa, only the autotrophic E. scandens plastome contained relatively intact gene content. Because of gene loss and pseudogenization, only 86 and 88 functional genes were retained in the plastomes of A. chinense and A. pini, respectively. Similar levels of gene loss and pseudogenization were also observed in the congeneric species, A. sichuanense (Fig. 4; Table 3). This decrease in functional genes suggests that the relatively small plastome size of Arceuthobium species compared with that of E. scandens can be partially attributed to heterotrophy-associated gene losses. Previous studies have indicated that the losses of some plastid genes resulted in varying degrees of plastome downsizing in Santalales hemiparasites (Petersen et al. 2015; Shin and Lee 2018; Chen et al. 2020a, b; Guo et al. 2020, 2021).

The loss/pseudogenization of plastid ndh genes is commonly observed in parasitic plants, which is regarded as an early response of plastomes in the evolution of heterotrophic lifestyles (Wicke and Naumann 2018). The observation that all 11 ndh (A to K) genes were either deleted or pseudogenized in the examined Santalales hemiparasites (Fig. 4) further confirms the assumption that the NDH pathway is not indispensable in parasitic plants that retain photosynthetic capacity (Maier et al. 2007; Wicke and Naumann 2018). Remarkably, the loss/pseudogenization of these plastid-encoded ndh genes has also been observed in a variety of photoautotrophic plants, including gymnosperms (e.g., Wakasugi 1994; McCoy et al. 2008; Wu et al. 2009; Ni et al. 2017), monocots (e.g., Peredo et al. 2013; Chang et al. 2006; Lin et al. 2015, 2017), early diverging eudicots (e.g., Sun et al. 2016, 2017), and core eudicots (e.g., Sanderson et al. 2015; Blazier et al. 2011; Morais et al. 2021). Given the independent loss of this pathway in many plant lineages, it is proposed that the plastid ndh genes may have been selected against in photoautotrophic angiosperms (Frailey et al. 2018). In addition, Lin et al. (2017) proposed that the loss of the plastid NDH pathway in photoautotrophic plants may increase the possibility of evolving a heterotrophic life history. Therefore, the reduction in the NDH pathway in the plastomes of the Santalales hemiparasites is more likely to be a trigger than an outcome of evolution to a parasitic lifestyle. On the other hand, the degradation of the NDH pathway always causes severe phenotypic and physiological effects in plants experiencing light, water, or heat stress (Horváth et al. 2000; Rumeau et al. 2007). Given that infections by Arceuthobium species pose major threats to coniferous forests worldwide (Tong and Ren 1980; You 1985; You and Tong 1987; Frank et al. 1996; Mathiasen et al. 2008), the eco-physiological consequences of the loss of plastid ndh genes in dwarf mistletoes (especially under stress conditions) need to be further investigated.

The infA gene is another commonly reduced gene in the plastomes of Santalales hemiparasites. This degradation occurs in all Santalales hemiparasites, except for Ximeniaceae and Schoepfiaceae. Although the loss or pseudogenization of infA is generally observed in the plastomes of many holoparasitic plants (Wicke et al. 2011a, b, 2013, 2016; Wicke and Naumann 2018), this mutation is quite rare in hemiparasites and has so far been identified only in Santalales (Petersen et al. 2015; Li et al. 2017; Yang et al. 2017; Liu et al. 2018; Shin and Lee 2018; Zhu et al. 2018; Jiang et al. 2019; Chen et al. 2020a, b; Guo et al. 2020, 2021). Nevertheless, pseudogenization or deletion of the infA gene has been identified in a wide range of photoautotrophic angiosperms (Millen et al. 2001; Ahmed et al. 2012; Wicke and Naumann 2018). Therefore, the degradation of this gene in Santalales hemiparasites may not be associated with the evolution of the parasitic lifestyle. Although infA is an essential gene for the initiation of translation in organelles (Pel and Grivell 1994; Yu and Spremulli 1998), earlier studies suggest that it is one of the most mobile plastid genes in angiosperms and is often transferred to and maintained in the nucleus (Millen et al. 2001; Ahmed et al. 2012). Similarly, the plastid infA gene of the above-mentioned Santalales hemiparasites may have been transferred to the nucleus.

In addition to the loss/pseudogenization of ndh loci and infA, the deletion of rps15, rps16, and rpl32 was commonly found in Loranthaceae species. Similarly, losses of these plastid ribosomal protein-encoding genes have been observed in a wide spectrum of autotrophic angiosperms (e.g., Chumley 2006; Saski et al. 2005; Jansen et al. 2007; Wicke et al. 2011a, b; Sabir et al. 2014; Park et al. 2015; Schwarz et al. 2015; Morais et al. 2021), and a line of evidence suggests that their functions are replaced by nuclear-encoded ribosomal genes (Park et al. 2015). Given their essential roles in plastid translation (Fleischmann et al. 2011; Park et al. 2015), the plastid rps15, rps16, and rpl32 genes of Loranthaceae plastomes may have been functionally transferred to the nucleus, and the deletion of these genes is unlikely to have resulted from the evolution of parasitism.

Arceuthobium and Viscum (Viscaceae) are hemiparasites that retain their photosynthetic capacity. Therefore, the loss/pseudogenization of several photosynthesis-associated genes, such as psbZ, petL, cemA, and ccsA (Fig. 4), in their plastomes was unexpected. Arceuthobium and Viscum represent the only two Santalales genera to date in which critical photosynthetic genes have been partially lost from the plastomes, although the deletion or pseudogenization of such genes is commonly observed in holoparasitic angiosperms (Funk et al. 2007; McNeal et al. 2007; Wicke et al. 2013, 2016; Wicke and Naumann 2018; Chen et al. 2020a, b; Liu et al. 2020). Wicke and Naumann (2018) proposed that the reduction in these essential photosynthesis genes most likely occurred around the transition from hemiparasitism to holoparasitism. Such plastome mutations in Arceuthobium and Viscum imply that (1) the reduction in these genes may have been initiated in hemiparasites that still rely on photosynthesis, and that (2) the degeneration of photosynthetic capacity is a gradual process that may have been initiated at the hemiparasitic stage but is not likely completed until a holoparasitic lifestyle is achieved. Moreover, Arceuthobium and those Viscum species whose essential photosynthesis genes are partially deleted share the morphological similarity that their leaves are extremely degraded. As a result, they rely heavily on host plants for their carbon requirement (Frank et al. 1996; Parks and Flanagan 2001; Mathiasen et al. 2008), which may reduce the pressure on photosynthesis (Petersen et al. 2015; Wicke and Naumann 2018). Therefore, the loss/pseudogenization of such essential photosynthesis genes in Arceuthobium and some Viscum species is likely to be related to the evolution of the leafless habit.

In addition to the above-mentioned protein-encoding genes, the losses of plastid tRNA genes (e.g., trnV-UAC, trnG-UCC, and trnK-UUU) were commonly observed in Santalales hemiparasites (Fig. 4). Previous studies have shown that some of the deleted tRNAs, such as trnV-UAC, are crucial for plastid translation and cell viability (Rogalski et al. 2008; Alkatib et al. 2012). Therefore, the reduction of essential plastid tRNAs is a rare mutation in photosynthetic angiosperms. In addition to Santalales hemiparasites, it has so far only been observed in the Cactaceae subfamily Cactoideae (Morais et al. 2021). Nevertheless, Wicke and Naumann (2018) speculated that the import of tRNAs from the cytosol can be more easily achieved due to their relatively smaller size. Accordingly, it is expected that there should be a specific mechanism in photosynthetic plants that supplies the plastids with tRNAs from the cytosol (Morais et al. 2021). In view of the lack of empirical studies that determine the import of essential tRNAs into plastids in the literature (Rogalski et al. 2008; Alkatib et al. 2012; Morais et al. 2021), further investigations are needed to verify whether such tRNA import mechanisms exist in Santalales hemiparasites.

The phylogenetic relationships within Santalales reconstructed in this study based on the plastome data (Fig. 5) are highly consistent with those revealed by previous studies (Der and Nickrent 2008; Nickrent et al. 2010, 2019; Su et al. 2015; Chen et al. 2020a, b; Guo et al. 2020, 2021). The distribution of root and stem hemiparasites on the tree topologies suggested that stem hemiparasitism evolved at least twice from root parasitism in the sandalwood order. In addition to the family-specific loss or pseudogenization of plastid genes, the data also revealed that closely related taxa in the phylogenetic trees tended to possess high similarity in plastome size, structure, and gene content. This further supports the assumption that plastome degeneration in Santalales hemiparasites evolved in a lineage-specific manner (Chen et al. 2020a, b; Guo et al. 2020, 2021).

Does the endophytic habit lead to a higher level of plastome reduction in dwarf mistletoes?

Petersen et al. (2015) proposed that the variation in nutritional dependence on the host plant may influence the reductive evolution of plastomes in hemiparasites. Given that the endophytic habit indicates a greater reliance on host plants for nutrients and carbon requirement (Tocher et al. 1984; Kirkpatrick 1989; Singh and Carew 1989), it is expected to lead to a higher level of plastome reduction in dwarf mistletoes than other Santalales hemiparasites. A comparative analysis of plastome features between dwarf mistletoes and other Santalales hemiparasites provides good support for this assumption.

Overall, the Arceuthobium plastomes were distinctive among the examined Santalales hemiparasites in possessing the smallest size, the lowest GC content, and the fewest functional (intact) genes (Tables 2 and 3; Fig. 4). Compared with other Santalales hemiparasites, the deletion of all four RNA polymerase genes (rpoA, rpoB, rpoC1, and rpoC2) was a lineage-specific plastome mutation in Arceuthobium. The plastid rpo genes, which encode the plastid-encoded RNA polymerase that transcribes many plastid photosynthesis genes, are considered essential to all hemiparasites that retain their photosynthetic capacity (Wicke et al. 2013, 2016). In addition, the plastid rpl33 gene, which encodes the ribosomal protein L33 (Rpl33), has been lost in all Arceuthobium species. Although rpl33 is not essential for cell survival (Rogalski et al. 2008), the gene is rarely lost in angiosperms. To date, it is known to have been deleted only from the plastomes of a few angiosperm lineages, such as legume species (Fabaceae; Guo et al. 2007; Tangphatsornruang et al. 2010), Cactaceae subfamily Cactoideae (Morais et al. 2021), and the mycoheterotrophic orchid genus Rhizanthella (Wicke et al. 2011a, b). Remarkably, previous studies have revealed that the rpl33 gene is indispensable for plants under stress conditions (Rogalski et al. 2008), because its function is particularly important for the formation of the photosynthetic apparatus at the young seedling stage and in young developing leaves (Fleischmann et al. 2011; Ehrnthaler et al. 2014). It is interesting to note that Arceuthobium species do not produce shoots immediately after seed germination but instead develop an extensive endophytic system inside the branches of their host plants (Parmeter et al. 1959; Parmeter and Scharpf 1963; Tong and Ren 1980; Gilbert and Punter 1990, 1991). The endophytic system enables Arceuthobium species to absorb nutrition and water from host plants, making their survival at the young seedling stage completely independent of photosynthesis (Baranyay et al. 1971; Tocher et al. 1984; Alosi and Calvin 1985; Kirkpatrick 1989; Singh and Carew 1989). From this perspective, the endophytic habit of Arceuthobium species may largely relax the selection pressure to retain the rpo and rpl33 genes, permitting their deletion from the plastomes. Collectively, it is reasonable to infer that the unique gene losses observed in Arceuthobium plastomes are likely correlated with the evolution of the endophytic habit, which may have caused a higher level of plastome degradation in this genus.

Taxonomic implications

Arceuthobium is fairly unique among Santalales hemiparasites and possesses a remarkably host-specialized life history (Frank et al. 1996). On the other hand, Arceuthobium exhibits a high degree of morphological similarity across species, and species identification in the genus largely depends on the identity of their host plants (Frank et al. 1996; Qiu and Gilbert 2003). Therefore, it is difficult to reliably determine the identity of many herbarium specimens belonging to the genus whose host information is absent. Recently, DNA barcodes have been widely used to discriminate species (Hebert et al. 2003; Kress et al. 2005; Hollingsworth 2011; Hollingsworth et al. 2009, 2011, 2016). With the advent of next-generation sequencing technology, complete plastome sequences are increasingly used as extended DNA barcodes for species identification and discrimination (Coissac et al. 2016; Hollingsworth et al. 2016), especially in taxonomically perplexing plant taxa (e.g., Ruhsam et al. 2015; Firetti et al. 2017; Fu et al. 2019; Ji et al. 2019, 2020; Ślipiko et al. 2020). In this study, a high level of sequence divergence was detected among the plastomes of A. chinense, A. pini, and A. sichuanense, although these species have a fair degree of overlap in their morphological features and distribution ranges (Qiu and Gilbert 2003). This suggests that plastome sequencing may provide an effective solution for credibly identifying Arceuthobium specimens.

Conclusions

In this study, the plastomes of the dwarf mistletoes Arceuthobium chinense and A. pini were sequenced and assembled de novo. The newly generated plastomes were characterized by significant reductions in size and GC content, accompanied by the loss of several essential housekeeping genes (rpoA, rpoB, rpoC1, and rpoC2) and pseudogenization of some core photosynthetic genes (psbZ and petL). The results suggest that both the leafless and endophytic habits of dwarf mistletoes may significantly relax the selection pressure on photosynthesis, as well as on plastid transcription and translation, thus causing the loss/pseudogenization of such essential plastid-encoded genes. This implies that the higher level of plastome degradation in Arceuthobium species is likely correlated with the evolution of the endophytic habit and the highly reduced vegetative body. These findings provide new insights into the plastome reductive evolution associated with parasitism in Santalales and deepen our understanding of the biology of dwarf mistletoes.

Author contribution statement

XG and YJ conceived and designed the research framework. GZ and LF collected the samples; CL and XG collected and analyzed the data. XG and GZ wrote the original draft manuscript. YJ revised and edited the final manuscript. All authors have read and agreed to the published version of the manuscript.