Background

Brown algae (Phaeophyceae) are photosynthetic organisms that independently evolved complex multicellularity within the heterokont lineage (Cock and Collén 2015; Terauchi et al. 2017). They evolved in a lineage distinct from the groups containing a primary plastid (i.e., green algae, land plants, rhodophytes, and glaucophytes) (Rodriguez-Ezpeleta et al. 2005; Ševčíková et al. 2015; Dorrell et al. 2017). The chloroplasts of brown algae were derived by secondary endosymbiosis, in which a red alga was taken up by a non-photosynthetic eukaryote (Keeling 2004, 2010). Phylogenetic studies based on multimarker datasets have provided comprehensive evolutionary trees of the Phaeophyceae (Silberfeld et al. 2010). Several monophyletic early-diverging lineages, such as Discosporangiales and Ishigeales, are resolved in the phylogeny, but most other brown algae form two super-clades representing two subclasses, Dictyotophycidae and Fucophycidae (Guiry and Guiry 2017). Dictyotophycidae includes four orders, Syringodermatales, Sphacelariales, Dictyotales, and Onslowiales (SSDO), and Fucophycidae is a crown group consisting of 13 orders (Charrier et al. 2012; Silberfeld et al. 2014).

There are currently more than 2000 brown algal species, which display great diversity in morphology, physiology, and sexually dimorphic traits and serve many ecological roles in marine environments (Charrier et al. 2012; Luthringer et al. 2014). Thus far, the plastid (chloroplast) genomes (ptDNA or cpDNA) of nine brown algae have been completely sequenced: Fucus vesiculosus, Sargassum horneri, Sargassum thunbergii, and Coccophora langsdorfii (order Fucales); Saccharina japonica, Costaria costata, and Undaria pinnatifida (order Laminariales); and Ectocarpus siliculosus and Pleurocladia lacustris (order Ectocarpales) (Le Corguillé et al. 2009; Wang et al. 2013; Zhang et al. 2015a, b; Liu and Pang 2016). These data have allowed for a better understanding of the evolution of plastid genomes and the phylogenetic relationships of the Phaeophyceae. However, all nine known plastid genomes of brown algae belong to Fucophycidae, and no complete ptDNA is available for Dictyotophycidae, a group that contains 395 species of brown algae worldwide (Guiry and Guiry 2017).

The sizes of known brown algal plastid genomes are 124.1–125.0 kb in Fucales, 129.9–130.6 kb in Laminariales, and 138.8–140.0 kb in Ectocarpales (Table 1). These ptDNAs are mapped as a canonical quadripartite structure with two large inverted repeats (IRs), which divide the circular molecule into a small single copy region (SSC) and a large single copy region (LSC) (Wang et al. 2013). These ptDNAs contain 6 ribosomal RNA (rRNA) genes, 27–31 transfer RNA (tRNA) genes, 137–139 protein-coding genes, and 2–6 open reading frames (ORFs). The architecture of plastid genomes is highly conserved within the brown algal orders of Fucales (Liu and Pang 2016) and Laminariales (Zhang et al. 2015a, b), while multiple genome rearrangements occurred in the evolution of this eukaryotic lineage (Le Corguillé et al. 2009).

Table 1 General features of the ten plastid genomes of brown algae

Dictyopteris divaricata (Okamura) Okamura (Dictyotales) is a cosmopolitan brown seaweed that usually inhabits littoral and sublittoral rock zones (Guiry and Guiry 2017). The thallus of this alga is often flat with regular dichotomous branches and bifid tips (Abbas and Shameel 2011) and contains structurally unique sesquiterpenes with important biological functions (Ji et al. 2009). However, species-level taxonomy in Dictyotales has long been difficult because of the morphological plasticity of its species (Tronholm et al. 2010). Molecular markers have since been used to resolve intraspecific and interspecific relationships, and several new species have been identified (Lozano-Orozco et al. 2015). Nevertheless, genomic information for Dictyotales remains limited, which restricts our understanding of the taxonomic status and evolution of this group.

Although a physical map of the plastid DNA of Dictyota dichotoma (Hudson) J.V.Lamouroux and an estimate of its size were published more than 30 years ago (Kuhsel and Kowallik 1985), no plastid genome sequence has since been reported for the Dictyotales. To further understand the evolution of plastid genomes in brown algae, the complete plastid genome of D. divaricata was sequenced with next-generation sequencing. This ptDNA sequence represents the first plastid genome from the subclass Dictyotophycidae.

Materials and Methods

Sample Collection and Identification

Mature plants of Dictyopteris divaricata (Okamura) Okamura were collected from the rocky shore of No. 3 bathing beach in Qingdao, Shandong Province, China (36° 03′ N, 120° 22′ E) in July 2016 (Supplementary Fig. S1). Samples were transported to the laboratory in coolers (5–8 °C) within 24 h after collection. Frozen tissue from the original algal samples was used for DNA extraction. Algal tissue was ground to a fine powder in liquid nitrogen, and total DNA was extracted using a Plant Genomic DNA Kit (Tiangen Biotech, Beijing, China) according to the manufacturer’s instructions. The concentration and quality of the isolated DNA were assessed by electrophoresis on a 1.0% agarose gel. Species identification was based on morphological features and on analysis of the plastid-encoded psbA gene. The psbA sequence of the D. divaricata sample (Dd-Qingdao) and sequences from GenBank were aligned using ClustalW in MEGA 7.0 (Kumar et al. 2016). Maximum likelihood (ML) and neighbor-joining (NJ) analyses of the psbA nucleotide dataset were performed in MEGA 7.0 with 1000 bootstrap replicates. The ML trees were based on the Kimura two-parameter model (Kimura 1980) and the NJ trees on the maximum composite likelihood method (Tamura et al. 2004). This analysis confirmed the identification of D. divaricata (Supplementary Fig. S2).
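To make the Kimura two-parameter model more concrete, its pairwise distance can be written as d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The Python sketch below is only an illustration of that formula and is not the MEGA implementation; the function name `k2p_distance` and the toy sequences are assumptions introduced here, and sites with gaps or ambiguous bases are simply skipped.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    p = sum(pair in TRANSITIONS for pair in pairs) / n                 # transitions
    q = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs) / n  # transversions
    # math.log raises ValueError if the sequences are saturated (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites.
print(f"{k2p_distance('AAAAAAAAAA', 'GAAAAAAAAA'):.4f}")
```

For a real psbA alignment, each pair of rows would be passed to the function in turn; how gaps are handled (pairwise versus complete deletion) is a choice this sketch does not take from the source.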

Plastid DNA Extraction, Sequencing, and Assembly

The plastids of D. divaricata were isolated using the Plant Chloroplast Purification Kit (Baiaolaibo, Beijing, China) according to the manufacturer’s instructions, and the plastid DNA was then extracted with the same kit. The Ultra II DNA Library Prep Kit (NEB, USA) was used to construct the library for Illumina sequencing. The plastid DNA was fragmented to ca. 350 bp and sequenced on an Illumina HiSeq platform. The sequencing run produced ca. 783 Mb of raw data with a read length of 150 bp. Poor-quality sequences and sequencing adapters were removed using Trim Galore! v0.3.7 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/), leaving 730 Mb of clean data. De novo assembly was performed with SOAPdenovo v2.04 and GapCloser v1.12 (Luo et al. 2012) using the trimmed sequences. The final plastid assembly comprised 156,285 reads at a mean coverage depth of 178× and resulted in one scaffold of 126,099 bp.
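The reported depth can be sanity-checked with the usual relation depth ≈ (number of reads × mean read length) / genome length. The sketch below is a back-of-the-envelope calculation, not part of the original pipeline; the function names are hypothetical, and it assumes every assembled read contributes its full length. Under that assumption, untrimmed 150 bp reads give an upper bound of roughly 186×, and the reported 178× corresponds to a mean trimmed read length of roughly 144 bp.

```python
def mean_coverage(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Mean depth = total sequenced bases placed on the genome / genome length."""
    return n_reads * mean_read_len / genome_len


def implied_read_length(n_reads: int, depth: float, genome_len: int) -> float:
    """Mean read length that would produce the given depth."""
    return depth * genome_len / n_reads


N_READS = 156_285      # reads in the final plastid assembly (from the text)
GENOME_BP = 126_099    # scaffold length (from the text)

# Upper bound: assumes no read was shortened by trimming (an assumption).
upper_bound = mean_coverage(N_READS, 150, GENOME_BP)
# Mean read length implied by the reported depth of 178.
implied = implied_read_length(N_READS, 178, GENOME_BP)

print(f"upper-bound depth: {upper_bound:.1f}x")
print(f"implied mean read length: {implied:.1f} bp")
```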

Genome Annotation and Comparative Analysis

Protein-coding genes and open reading frames (ORFs) were annotated using Dual Organellar Genome Annotator (DOGMA) (Wyman et al. 2004), NCBI ORF Finder, and BLAST similarity searches of the non-redundant databases at NCBI (Altschul et al. 1997). Ribosomal RNA genes were delimited by direct comparison to sequenced brown algal orthologues using MEGA 7.0. Transfer RNA genes were identified by reconstructing their cloverleaf structures using tRNAscan-SE 1.21 with default parameters (Schattner et al. 2005). The physical map of the circular plastid genome was generated with Organellar GenomeDRAW (OGDraw) (Lohse et al. 2013). The genome sequence has been deposited in GenBank under accession number KY433579. Base composition was determined with MEGA 7.0 (Kumar et al. 2016). Tandem repeats (TRs) were detected with Tandem Repeats Finder using default settings (Benson 1999). Small inverted repeats (SIRs) were identified with Inverted Repeats Finder using the default settings and the additional constraint that repeats had to be > 75% similar (http://tandem.bu.edu/cgi-bin/irdb/irdb.exe). Multiple sequence alignment of the ten brown algal plastid genome sequences (Table 1) was performed using Mauve v2.3.1 (Darling et al. 2004) with the progressiveMauve algorithm (Darling et al. 2010).

Phylogenetic Analysis

Phylogenetic relationships within the brown algae were analyzed based on ptDNA protein-coding gene (PCG) datasets, which were composed of amino acid (aa) sequences of 18 photosystem II PCGs (psb28, psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbN, psbT, psbV, psbX, psbY, and ycf12) that were the most conserved group among PCGs in brown algal ptDNAs (Liu and Pang 2016). The aa sequences of 18 photosystem II PCGs were subjected to concatenated alignments using ClustalX 1.83 with the default settings (Thompson et al. 1997). Vaucheria litorea (Xanthophyceae; Rumpho et al. 2008) was selected as an out-group taxon for analysis of the aa dataset. The evolutionary history was inferred by using the ML method based on the JTT matrix-based model (Jones et al. 1992) and the NJ method (Saitou and Nei 1987) based on the Poisson correction model (Zuckerkandl and Pauling 1965) with 1000 bootstrap replicates, respectively, using MEGA 7.0 software (Kumar et al. 2016). Bayesian Inference (BI) analyses of the aa dataset were performed based on the best scoring alternative model of MtREV + G + I using MrBayes v.3.2 (Huelsenbeck and Ronquist 2001). One million generations were run for tree reconstructions and posterior probabilities using the Markov chain Monte Carlo (MCMC) method. Every 1000th generation was saved and the first 100 generations were discarded as burn-in. Posterior probability values for the majority-rule consensus trees constructed were calculated.
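For readers unfamiliar with the Poisson correction used for the NJ analysis, the distance is d = −ln(1 − p), where p is the proportion of aligned amino acid sites that differ between two sequences. The following Python sketch illustrates that formula only; it is not how MEGA is implemented, the function names and toy peptides are hypothetical, and sites with a gap in either sequence are skipped (an assumption about gap handling).

```python
import math


def p_distance(aln1: str, aln2: str) -> float:
    """Proportion of differing sites between two aligned protein sequences."""
    if len(aln1) != len(aln2):
        raise ValueError("sequences must be aligned to equal length")
    # Ignore columns in which either sequence has a gap.
    sites = [(a, b) for a, b in zip(aln1.upper(), aln2.upper())
             if a != "-" and b != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in sites) / len(sites)


def poisson_distance(aln1: str, aln2: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p); undefined when p = 1."""
    return -math.log(1.0 - p_distance(aln1, aln2))


# Toy example (hypothetical peptides): one difference in five sites, p = 0.2.
print(f"{poisson_distance('MKVLA', 'MKVLG'):.4f}")
```

The correction matters more as p grows, because multiple substitutions at the same site are increasingly likely to be hidden in the raw proportion.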

Results and Discussion

Genome Size and Inverted Repeats

The complete plastid genome of D. divaricata was 126,099 bp in size (Fig. 1), which was smaller than those of Laminariales (129.9–130.6 kb) and Ectocarpales (138.8–140.0 kb) but larger than those of Fucales (124.1–125.0 kb). The size of D. divaricata ptDNA was close to that of D. dichotoma plastid DNA, which was estimated to be 123 kb by electron microscopy and gel electrophoresis (Kuhsel and Kowallik 1985). Like most plastid genomes, the D. divaricata ptDNA mapped as a canonical quadripartite structure, with two large inverted repeats of 6026 bp each dividing the single circular genome into a small single copy region (SSC, 41,399 bp) and a large single copy region (LSC, 72,648 bp) (Table 1). The nucleotide composition of brown algal ptDNAs was conserved and displayed low G + C content. The G + C content of D. divaricata ptDNA was 31.19%, slightly higher than that of the other brown algae, which ranged from 28.94% in Fucus vesiculosus to 31.05% in Saccharina japonica.
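The quadripartite structure implies that the genome length equals LSC + SSC + 2 × IR, and the reported figures satisfy this exactly (72,648 + 41,399 + 2 × 6026 = 126,099). The short Python sketch below checks that identity and shows one common way to compute G + C content; the function names are hypothetical, and the toy sequence is not taken from the D. divaricata genome.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome with two identical inverted repeats."""
    return lsc + ssc + 2 * ir


def gc_percent(seq: str) -> float:
    """G + C percentage over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Region sizes reported for D. divaricata in the text.
total = quadripartite_length(lsc=72_648, ssc=41_399, ir=6_026)
print(total)                         # matches the reported genome size
print(gc_percent("ATGCATATTA"))      # toy sequence, not real data
```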

Fig. 1

The plastid genome map of Dictyopteris divaricata. Annotated genes are colored according to the functional categories. Genes on the inside are transcribed in the clockwise direction, whereas genes on the outside are transcribed in the counterclockwise direction. The ring of bar graphs on the inner circle shows the GC content in dark gray

Variation in plastid genome size is mainly due to expansion and contraction of the inverted repeats (IRs), intron number, gene transfer and loss, and the size of intergenic spacer regions (Baurain et al. 2010; Tanaka et al. 2011; Brembu et al. 2014; Sabir et al. 2014). The brown algal IRs comprised the core rrn5-rnl-trnA-trnI-rns gene cluster and additional genes flanking this ribosomal operon. The size of the IRs in the ten sequenced brown algal ptDNAs ranged from 5370 bp in F. vesiculosus to 8616 bp in E. siliculosus. The structure of the IRs was conserved at the order level in brown algae (Fig. 2). Each D. divaricata IR was 6026 bp in size and contained the rpl21-rrn5-rnl-trnA-trnI-rns gene cluster. The D. divaricata IR was larger than those of Fucales and Laminariales because it included rpl21, but smaller than those of Ectocarpales. Some genes (e.g., psbA, rpl32, trnL, and trnE) and orfs (e.g., orf53 and orf258) were present only in the IRs of Ectocarpales ptDNAs, which accounted for the longer IRs of 8616 bp in E. siliculosus and 8084 bp in P. lacustris.

Fig. 2

Comparison of inverted repeat (IR) boundaries in the ten brown algal ptDNAs. a IRa. b IRb. The “p” in brackets indicates a partial copy of the corresponding gene. Species abbreviations: Fucus vesiculosus (Fv), Sargassum horneri (Sh), Sargassum thunbergii (St), Coccophora langsdorfii (Cl), Saccharina japonica (Sj), Costaria costata (Cc), Undaria pinnatifida (Up), Ectocarpus siliculosus (Es), Pleurocladia lacustris (Pl), and Dictyopteris divaricata (Dd). Annotated genes are colored according to the functional categories

Gene Content

The D. divaricata ptDNA contained 174 genes, including six ribosomal RNA genes (rRNA), 28 transfer RNA genes (tRNA), 138 protein-coding genes (two rpl21 genes in IRs), and two conserved open reading frames (orfs). Only one intron was present in the D. divaricata trnL2 gene encoding tRNA-Leu. This intron was also present in the homologous genes of the sequenced Fucales and Laminariales ptDNAs but was absent in two members of the Ectocarpales (Le Corguillé et al. 2009). All protein-coding genes encoded by D. divaricata ptDNA started with the ATG codon with the exception of psbF with GTG and ycf66 with TTG. A total of 113 protein-coding genes were terminated by a TAA stop codon, 20 with TAG, and seven with TGA.
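The gene tally above is internally consistent (6 rRNA + 28 tRNA + 138 protein-coding genes + 2 orfs = 174). Start and stop codon usage of the kind reported here can be tabulated directly from coding sequences; the Python sketch below shows one way to do so. The function names and the three toy coding sequences are hypothetical and are not taken from the D. divaricata annotation.

```python
from collections import Counter
from typing import Dict, Tuple


def start_stop_codons(cds: str) -> Tuple[str, str]:
    """Return the first and last codon of a coding sequence."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3 and at least 6")
    return cds[:3], cds[-3:]


def tally_codons(cds_by_gene: Dict[str, str]) -> Tuple[Counter, Counter]:
    """Count start and stop codons over a set of coding sequences."""
    starts, stops = Counter(), Counter()
    for seq in cds_by_gene.values():
        start, stop = start_stop_codons(seq)
        starts[start] += 1
        stops[stop] += 1
    return starts, stops


# Consistency check of the gene counts given in the text.
assert 6 + 28 + 138 + 2 == 174

# Toy input (hypothetical sequences, for illustration only).
toy = {"geneA": "ATGAAATAA", "geneB": "GTGCCCTAG", "geneC": "ATGGGGTGA"}
starts, stops = tally_codons(toy)
print(dict(starts), dict(stops))
```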

The ten brown algal plastid genomes shared a core set of 133 protein-coding genes (duplicated genes were only counted once), while the other protein-coding genes only occurred in certain lineages (Table 2). Two protein-coding genes, rbcR and rpl32, were found in Laminariales, Ectocarpales (E. siliculosus), and Fucales (LEF) but were absent in D. divaricata. The rpl32 gene was also lost in the P. lacustris ptDNA. The absence of this gene might be due to gene transfer to the nucleus or gene loss. Two genes, syfB and ycf17, were absent in Fucales but present in D. divaricata as well as Laminariales and Ectocarpales. Another two genes, petL and ycf54, were only absent in Laminariales. Besides two conserved orfs shared by all brown algal ptDNAs, some specific orfs with unknown function were identified only in the Ectocarpales ptDNAs (Table 2).

Table 2 Evolutionary patterns of gain and loss of genes in the ten plastid genomes of brown algae

Intergenic Spacer and Overlapping Regions

Intergenic spacer regions in D. divaricata ptDNA totaled 16,528 bp and constituted 13.11% of the whole genome, which was slightly less than in Fucales (13.68–14.16%) and clearly less than in Laminariales (16.48–16.74%) and Ectocarpales (19.13–19.60%). Five conserved overlapping regions had been noted in the previously sequenced brown algal plastid genomes (Wang et al. 2013; Zhang et al. 2015a, b), whereas eight pairs of overlapping genes were found in the D. divaricata plastid genome: atpD-atpF (1 bp), ycf12-ftrB (6 bp), rpl4-rpl23 (8 bp), rpl29-rps17 (4 bp), rpl1-rpl11 (4 bp), rps1-ycf40 (23 bp), ycf24(sufB)-ycf16(sufC) (4 bp), and psbD-psbC (53 bp) (Table 3). The reduced content of intergenic spacer regions and the increased number of overlapping regions made the D. divaricata ptDNA the most compact brown algal plastid genome reported so far.
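The intergenic percentage follows directly from the reported totals (16,528 / 126,099 ≈ 13.11%), and overlaps between neighboring genes can be found by comparing the end of each gene with the start of the next. The Python sketch below illustrates both steps; the `Gene` type, the function names, and all gene coordinates are hypothetical, chosen only so that the toy overlaps match the sizes given in the text for psbD-psbC (53 bp) and rpl4-rpl23 (8 bp).

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def percent_of_genome(region_bp: int, genome_bp: int) -> float:
    """Share of the genome occupied by a region, in percent."""
    return 100.0 * region_bp / genome_bp


def adjacent_overlaps(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Report (left gene, right gene, shared bp) for overlapping neighbors."""
    ordered = sorted(genes, key=lambda g: g.start)
    found = []
    for left, right in zip(ordered, ordered[1:]):
        shared = left.end - right.start + 1
        if shared > 0:
            found.append((left.name, right.name, shared))
    return found


# Figures from the text.
print(round(percent_of_genome(16_528, 126_099), 2))

# Hypothetical coordinates, not the real annotation.
toy_genes = [
    Gene("psbD", 1000, 2061), Gene("psbC", 2009, 3400),
    Gene("rpl4", 5000, 5650), Gene("rpl23", 5643, 5950),
    Gene("rpl2", 6000, 6800),
]
print(adjacent_overlaps(toy_genes))
```

The sketch only compares immediate neighbors and ignores strand, so nested genes or genes spanning the origin of a circular map would need extra handling.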

Table 3 Evolutionary patterns of gain and loss of overlapping regions in the ten plastid genomes of brown algae

Three overlapping regions, i.e., rpl4-rpl23 (8 bp), ycf24(sufB)-ycf16(sufC) (4 bp), and psbD-psbC (53 bp), were present in all sequenced brown algal ptDNAs with the same overlap size, indicating the conserved nature of brown algal plastid genome structure. The ycf12-ftrB (6 bp) overlapping region was present in Fucales, Laminariales, and D. divaricata but absent in Ectocarpales. The rps1-thiS (4 bp) overlapping region was found only in Fucales. Four new overlapping regions identified in D. divaricata, i.e., atpD-atpF (1 bp), rpl29-rps17 (4 bp), rpl1-rpl11 (4 bp), and rps1-ycf40 (23 bp), were not detected in plastid genomes of the LEF clade, indicating that they arose in the ptDNA of D. divaricata after its divergence from the LEF clade and highlighting the diversity of evolutionary trends in brown algal plastid genomes. The psbD-psbC overlapping region was the most conserved in size and sequence, even among the ochrophytes, which comprise Phaeophyceae, Xanthophyceae, Raphidophyceae, Eustigmatophyceae, Chrysophyceae, Pelagophyceae, and Bacillariophyceae (Ševčíková et al. 2015).

Tandem and Small Inverted Repeats

Numerous tandem repeats (TRs) and small inverted repeats (SIRs) have previously been found in plastid genomes (Cattolico et al. 2008; Liu and Pang 2016). In D. divaricata ptDNA, five TRs and 15 SIRs were identified. Two of the TRs were located in the atpA gene, one in ycf46, one in the psbA-psbK spacer, and one in the rpl11-trnW(cca) spacer. The average size of D. divaricata TR elements was 43.8 ± 11.6 bp, which was similar to that of other brown algae (Liu and Pang 2016). The TRs had a period of 9–24 bp and a copy number of 2–3.5, with low GC content ranging from 5.13 to 25.49%. The D. divaricata SIRs were all located in intergenic spacer regions, although some overlapped partially with adjacent genes. The stem length of the SIRs ranged from 21 to 68 bp and the loop length from 0 to 18 bp.
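An SIR forms a hairpin because one arm of the stem is the reverse complement of the other, with a short loop in between. The Python sketch below searches a sequence for such stem-loop seeds. It is a deliberately simplified stand-in, not the algorithm of Inverted Repeats Finder: it requires perfect complementarity over a fixed stem length, whereas the analysis in this study allowed repeats with > 75% similarity. The function names, the default stem length, and the toy sequence are assumptions; the maximum loop of 18 bp mirrors the range reported above.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only are complemented)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_perfect_hairpins(seq: str, stem: int = 7,
                          max_loop: int = 18) -> List[Tuple[int, int, int]]:
    """Return (start index, stem length, loop length) for perfect stem-loops.

    A hit means seq[start:start+stem] is followed, after `loop` bases,
    by its exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if seq[j:j + stem] == target:
                hits.append((i, stem, loop))
    return hits


# Toy sequence (hypothetical): a 7 bp stem, a 4 bp loop, and the matching arm.
toy = "AA" + "GATTACA" + "TTTT" + "TGTAATC" + "AA"
print(find_perfect_hairpins(toy))
```

Extending each seed outward and tolerating mismatches would be needed to approach the stem lengths of 21 to 68 bp reported here.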

Eight SIRs of D. divaricata were located at the termini of two genes transcribed on opposite coding strands: rbcS/ccsA, orf531/petA, psaI/psbJ, ftsH/psbH, psbN/psbT, psbB/petF, petF/rpl12, and psbC/ycf41. The biological significance of SIRs and their hairpin structures remains an open question. The SIRs of stramenopile ptDNAs are usually located adjacent to genes related to photosynthesis or energy production (Cattolico et al. 2008) and are likely to play an important role in the regulation of transcription, translation, and other biological functions (Lillo et al. 2002). Their conserved placement is thought to reflect functional constraint (Ong et al. 2010; Hovde et al. 2014).

Genome Organization

The quadripartite plastid genomes of brown algae exhibited several rearrangements among species, especially at the order level, as observed in other red algal-derived ptDNAs (Ruck et al. 2014; Starkenburg et al. 2014; Yurchenko et al. 2016). By comparing the genome organization of ten brown algal species from four orders, 22 conserved gene blocks (CGBs) were identified (Fig. 3). Two large CGBs of over 30 kb were noted. The ribosomal gene block (30.5 kb) contained 48 genes from rpl9 to rns. This highly conserved ribosomal gene block was also observed in other red algal-derived plastid genomes (Stoebe and Kowallik 1999; Oudot-Le Secq et al. 2007; Tajima et al. 2016). The other large CGB (33.3 kb) contained 36 genes (rpl27 to atpA), most of which were transcribed on the same strand.

Fig. 3

Synteny comparison of the ten brown algal ptDNAs using Mauve software. Rectangular blocks of the same color indicate collinear regions of sequences

Genome organization of brown algal ptDNAs was highly conserved at the order level, as shown by the nine ptDNAs sampled from Laminariales, Ectocarpales, and Fucales. Ectocarpales displayed a higher number of rearrangements than the other lineages. The architecture of D. divaricata ptDNA was more similar to that of Laminariales than to those of Fucales and Ectocarpales. The plastid genomes of D. divaricata and Laminariales showed a high degree of similarity in gene arrangement, which may provide a clue to the ancestral gene order of brown algae. Relative to the gene order of Laminariales, two blocks in D. divaricata, one of six genes (trnE-psaA-psaB-rps14-petG-psbK) and one of seven genes (rpl19-trnM-ycf47-petM-petN-acsF-ycf42), had been translocated and inverted.
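A translocated and inverted block can be pictured with a signed gene order, in which each gene carries a strand sign: inversion reverses the order of the block and flips every sign, and translocation moves the block to a new position. The Python sketch below illustrates these two operations on the six-gene block named above. The function names, the flanking genes `flankA` to `flankC`, the strand signs, and the insertion point are all hypothetical and do not reproduce the actual D. divaricata or Laminariales gene order.

```python
from typing import List, Tuple

SignedGene = Tuple[str, int]  # (gene name, strand as +1 or -1)


def invert_block(order: List[SignedGene], start: int, end: int) -> List[SignedGene]:
    """Invert order[start:end]: reverse the gene order and flip each strand."""
    block = [(name, -strand) for name, strand in reversed(order[start:end])]
    return order[:start] + block + order[end:]


def translocate_block(order: List[SignedGene], start: int, end: int,
                      new_index: int) -> List[SignedGene]:
    """Cut order[start:end] and reinsert it at new_index of the remainder."""
    block = order[start:end]
    rest = order[:start] + order[end:]
    return rest[:new_index] + block + rest[new_index:]


# Hypothetical reference order; strands set to +1 purely for illustration.
block_genes = ["trnE", "psaA", "psaB", "rps14", "petG", "psbK"]
reference = ([("flankA", 1)] + [(g, 1) for g in block_genes]
             + [("flankB", 1), ("flankC", 1)])

inverted = invert_block(reference, 1, 7)          # flip the six-gene block in place
rearranged = translocate_block(inverted, 1, 7, 2)  # move it past flankB

print([name for name, _ in rearranged])
```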

Phylogenetic Analyses

The phylogenetic analyses of the 18 photosystem II protein dataset based on the ML, NJ, and BI methods generated identical topologies with similar support values (Fig. 4). The ten brown algal species sampled formed four clades representing four orders: Laminariales, Ectocarpales, Fucales, and Dictyotales. D. divaricata (Dictyotales) diverged before a strongly supported clade comprising Laminariales, Ectocarpales, and Fucales (LEF), which was consistent with a prior phylogenomic comparison of ten taxa and 35 mitochondrial protein-coding genes (Liu and Pang 2015). The Dictyotophycidae and Fucophycidae had a sister relationship, as previously noted (Silberfeld et al. 2014). However, plastid genome sequences remain limited, and representative plastid genomes are still lacking for some brown algal lineages (orders), including the basal taxa Discosporangiales and Ishigeales. Broader taxon sampling should therefore provide a more comprehensive and general picture of the diversity of plastid genomes in brown algae.

Fig. 4

Phylogenetic tree constructed from analyses of amino acid (aa) sequences of 18 photosystem II PCGs. The tree was rooted with Vaucheria litorea (Xanthophyceae). The numbers at internal nodes (ML/NJ/BI) indicate maximum likelihood (ML) and neighbor-joining (NJ) bootstrap values and Bayesian Inference (BI) posterior probability values, respectively. Branch lengths are proportional to the number of amino acid substitutions per site, as indicated by the scale bar below the tree

Conclusion

Dictyopteris divaricata was the first species from the subclass Dictyotophycidae to have its plastid genome completely sequenced and annotated. The most important findings of this study were that the ptDNA of D. divaricata was the most compact brown algal plastid genome reported so far and that its architecture was more similar to that of Laminariales than to those of Fucales and Ectocarpales. Detailed comparative analyses of the conservation and variation of genome characteristics on a larger scale provided further insight into the evolutionary history of brown algal ptDNAs. The differences in general features, gene content, and architecture between D. divaricata and LEF ptDNAs revealed the diversity and evolutionary trends of plastid genomes in brown algae.