Introduction

The Phaeophyceae (brown algae) belong within the stramenopiles and are a group of multicellular marine eukaryotes, having chloroplasts that originate from secondary endosymbiosis, in which a red alga was engulfed by a nonphotosynthetic protist (Keeling 2010). After their own independent evolution for more than 200 million years (Silberfeld et al. 2010), the current brown algal group consists of a multitude of taxa including 19 orders, 62 families, 473 genera, and more than 2000 species with various morphological and physiological characteristics (Charrier et al. 2012; Liu and Pang 2015b). Brown algae are one of the major photosynthetic producers of organic carbon on rocky intertidal shores of the world and play an important role both as human food and for the habitats that they form (Loureiro et al. 2015).

Compared to rhodophytes, green algae, and land plants (primary endosymbionts), chloroplasts of autotrophic stramenopiles (secondary endosymbionts) display their own characteristics in morphology, pigment composition, genome organization, and gene content (Cattolico et al. 2008; Dorrell and Smith 2011). The interpretation of information implied in chloroplast DNA (cpDNA) of ecologically diverse species could improve our understanding of stramenopile plastid function, phylogeny, and evolution (Keeling 2004). To date, over 40 plastid genomes have been sequenced among stramenopiles. However, only three of them are from brown algae including Fucus vesiculosus (order Fucales), Saccharina japonica (order Laminariales), and Ectocarpus siliculosus (order Ectocarpales) (Le Corguillé et al. 2009; Wang et al. 2013). The brown algal plastid genomes are 125.0–140.0 kb in size; contain 173–185 genes including 6 ribosomal RNA (rRNA) genes, 28–31 transfer RNA (tRNA) genes, and 139–148 protein-coding genes (PCGs); and appear to be highly rearranged in genome architectures (Table 1). Genome sequence analysis of more brown algal cpDNAs is a promising approach for further understanding the evolutionary history of this eukaryotic lineage.

Table 1 General properties of chloroplast genomes in four species of brown algae

Sargassaceae, in the order Fucales, is the largest family in Phaeophyceae containing 52 genera and 495 species. Sargassum C. Agardh is the most species-rich genus in this family and contains 347 species (including varieties) (Guiry and Guiry 2015). Sargassum species usually form dense underwater forests on rocky coastlines of tropical and temperate regions serving as habitats for myriad vertebrate and invertebrate species (Mattio and Payri 2011). Sargassum horneri is an important perennial, canopy-forming seaweed native to the northwestern Pacific coast growing in the upper sublittoral zone (Hu et al. 2011). During the past several years, its large-scale biomass has been continuously observed drifting in the Yellow Sea and the East China Sea (Komatsu et al. 2008). In its native distribution areas, e.g., the coastal water of Zhejiang Province of China, its biomass has been observed to decrease steadily and even disappear in certain regions (Sun et al. 2008). Thus, this seaweed has already been chosen as one of the main algal species to be used to reconstruct seaweed beds in China. An efficient technique for producing young S. horneri seedlings has been established (Pang et al. 2009).

Previously, we reported the complete mitochondrial genome sequences of S. horneri and four other Sargassum species in subgenus Bactrophycus, whose genome organization is highly similar to that of F. vesiculosus (Liu and Pang 2015a, b; Liu et al. 2015). To gain further insight into the evolutionary biology of this alga, in this study the complete chloroplast genome of S. horneri was sequenced and compared with the three other brown algal cpDNAs in terms of genome features, gene content, and genome organization.

Materials and methods

Sample collection and DNA extraction

Mature plants of S. horneri (Turner) C. Agardh were initially collected from the rocky shore at Xiaohuyu, Nanji Islands, Wenzhou, Zhejiang Province, China (27° 27′ N, 121° 04′ E), in April 2007 (Liu et al. 2015). Plants were transported to the laboratory in cool boxes (5–8 °C) within 24 h after collection. Fresh algal tissue was selected and stored in the ultra-low temperature freezer (−80 °C) for DNA extraction. Algal tissue was ground to fine powder in liquid nitrogen. Total DNA was isolated using a Plant Genomic DNA Kit according to the manufacturer’s instructions (Tiangen Biotech, China). The DNA quality and quantity were assessed by electrophoresis on 1.0 % agarose gel.

PCR amplification and sequencing

The whole chloroplast genome of S. horneri was amplified using the long PCR and primer walking techniques (Cheng et al. 1994). Primer sets were designed according to three known brown algal cpDNAs (F. vesiculosus, S. japonica, and E. siliculosus) and used to amplify the entire S. horneri chloroplast genome in 15 large fragments (Supplementary Table S1). PCR reactions were carried out in 50-μL reaction mixtures containing 32 μL of sterile distilled H2O, 10 μL of 5× PrimeSTAR GXL buffer (5 mM Mg2+ plus, Takara, Japan), 4 μL of dNTP mixture (2.5 mM each), 1 μL of each primer (10 μM), 1 μL of PrimeSTAR GXL DNA polymerase (1.25 units μL−1, Takara, Japan), and 1 μL of DNA template (approximately 50 ng).
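As a quick consistency check on the reaction mixture described above, the following minimal sketch sums the listed volumes and derives final concentrations by simple dilution. It assumes that "each primer" means one forward and one reverse primer per reaction; the variable and function names are illustrative only.

```python
# Volumes in microlitres for one reaction, as listed in the text.
# Assumption: "each primer" means one forward and one reverse primer.
REACTION_UL = {
    "sterile distilled H2O": 32.0,
    "5x PrimeSTAR GXL buffer": 10.0,
    "dNTP mixture (2.5 mM each)": 4.0,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "PrimeSTAR GXL DNA polymerase": 1.0,
    "DNA template": 1.0,
}


def total_volume(components):
    """Sum the component volumes of a reaction mixture."""
    return sum(components.values())


def final_concentration(stock, volume_ul, total_ul):
    """Simple dilution: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


total = total_volume(REACTION_UL)
primer_final_um = final_concentration(10.0, 1.0, total)  # per primer, in uM
dntp_final_mm = final_concentration(2.5, 4.0, total)     # per dNTP, in mM
buffer_final_x = final_concentration(5.0, 10.0, total)   # buffer strength

print(f"total volume: {total} uL")
print(f"primer: {primer_final_um} uM, dNTP: {dntp_final_mm} mM, buffer: {buffer_final_x}x")
```

The listed components add up to the stated 50 μL, and the 5× buffer ends up at 1× working strength.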

PCR amplification was performed on a T-Gradient Thermoblock Thermal Cycler (Whatman Biometra, Germany) with an initial denaturation at 94 °C for 3 min, followed by 30 cycles of denaturation at 94 °C for 20 s, annealing at 50–52 °C for 50 s, extension at 68 °C for 1 min kb−1, and a final extension at 68 °C for 10 min. Long PCR products were purified using a Qiaquick Gel Extraction Kit (Qiagen, Germany). Sequencing reactions were performed using ABI 3730 XL automated sequencers (Applied Biosystems, USA).
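Because the extension step scales with fragment length (1 min kb−1), the programmed run time depends on the amplicon. The sketch below estimates it from the cycling parameters given above; the 8-kb fragment is a hypothetical example, and instrument ramp times are ignored.

```python
# Cycling parameters taken from the text; ramp times are not modelled.
INITIAL_DENATURATION_S = 3 * 60
CYCLES = 30
DENATURATION_S = 20
ANNEALING_S = 50
EXTENSION_S_PER_KB = 60
FINAL_EXTENSION_S = 10 * 60


def extension_seconds(fragment_bp):
    """Extension time per cycle at 1 min per kb of target."""
    return EXTENSION_S_PER_KB * fragment_bp / 1000


def program_minutes(fragment_bp):
    """Total programmed hold time in minutes for one fragment."""
    per_cycle = DENATURATION_S + ANNEALING_S + extension_seconds(fragment_bp)
    total_s = INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S
    return total_s / 60


# Hypothetical 8-kb fragment, not a fragment size reported in the text.
print(program_minutes(8000))
```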

Genome assembly and annotation

The DNA sequences were manually edited and assembled using the BioEdit v7.1.9 software (Hall 1999). The DNA sequence of the complete chloroplast genome of S. horneri was determined by comparison with published sequences for three brown algae (Le Corguillé et al. 2009; Wang et al. 2013). PCGs and putative open reading frames (ORFs) were annotated by NCBI ORF Finder and BLAST similarity searches of the nonredundant databases at NCBI (Altschul et al. 1997). rRNA genes were identified by RNAmmer 1.2 software (Lagesen et al. 2007) and by comparing S. horneri cpDNA with rRNA genes from other brown algal cpDNAs. tRNA genes were searched for by reconstructing their cloverleaf structures using the tRNAscan-SE 1.21 software with default parameters (Schattner et al. 2005). The physical map of the circular chloroplast genome was generated using Organellar Genome DRAW (Lohse et al. 2013). The genome sequence has been deposited in GenBank with the accession number KP881334.

Genome analysis

To date, three complete chloroplast genome sequences of brown algae have been reported, but their functional genes, especially tRNA and rRNA genes, were annotated with different methods or software. This led to differences in reported gene length and number, which reduces the accuracy of direct comparison. To solve this problem, we reanalyzed the three reported brown algal cpDNA sequences with the same method, and some of the results differed from those previously reported. A total of 173 genes in the S. horneri cpDNA were sorted by function and divided into basic functional groups. Base composition and pairwise comparisons were determined with the MEGA 5.2 software (Tamura et al. 2011). The identity percentages of gene sequences were evaluated using the BioEdit v7.1.9 software (Hall 1999). Small inverted repeats (SIRs) were identified with Inverted Repeats Finder using the default settings, with the additional constraint that repeats had to be >75 % similar (http://tandem.bu.edu/cgi-bin/irdb/irdb.exe). Tandem repeats (TRs) were found with Tandem Repeats Finder using default settings (Benson 1999).
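The two quantities compared throughout this study, base composition and sequence identity, can be illustrated with a minimal sketch. This is not the MEGA or BioEdit implementation; in particular, counting a gap opposite a base as a mismatch is an assumption made here, and the function names are illustrative.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def percent_identity(aligned_a, aligned_b):
    """Percentage of identical columns in a pairwise alignment.

    Assumption: columns where both sequences have a gap are skipped,
    and a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Toy sequences for illustration only.
print(gc_content("ATGC"))
print(percent_identity("ATGC", "ATGA"))
```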

Results and discussion

Genome features

The chloroplast genome of S. horneri has a size of 124,068 bp, which is smaller than those of the three other brown algal cpDNAs sequenced to date (Le Corguillé et al. 2009; Wang et al. 2013). The S. horneri cpDNA is mapped as a canonical quadripartite structure with two 5436 bp inverted repeat regions (IRs), which divide the circular molecule into a small single-copy region (SSC 39,885 bp) and a large single-copy region (LSC 73,311 bp) (Fig. 1). The S. horneri IRs contain five gene loci, including three rRNA and two tRNA genes, the same as in F. vesiculosus and S. japonica but different from E. siliculosus, in which the IRs are much longer (8616 bp) and contain 11 gene loci.
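The quadripartite structure implies that the genome size equals LSC + SSC + 2 × IR. The short sketch below checks the reported figures against that relation; the constant and function names are illustrative.

```python
# Region sizes in bp, as reported in the text.
GENOME_BP = 124_068
IR_BP = 5_436
SSC_BP = 39_885
LSC_BP = 73_311


def quadripartite_total(lsc, ssc, ir):
    """Genome size implied by one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_total(LSC_BP, SSC_BP, IR_BP)
ir_fraction = 100.0 * 2 * IR_BP / GENOME_BP  # share of the genome inside the IRs

print(total == GENOME_BP)
print(f"IRs cover {ir_fraction:.2f} % of the genome")
```

The four regions sum exactly to the reported 124,068 bp.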

Fig. 1 The chloroplast genome map of Sargassum horneri. Annotated genes are colored according to the functional categories. Genes on the outside are transcribed in the clockwise direction, whereas genes on the inside are transcribed in the counterclockwise direction

The overall G + C content of S. horneri cpDNA is 30.61 %, within the range observed in brown algae, from 28.94 % for F. vesiculosus to 31.05 % for S. japonica (Table 1). The coding sequence constitutes 86.32 % of the S. horneri cp genome. The total spacer size in S. horneri cpDNA is 16,967 bp, with an average length of 101.0 bp, which is smaller than that in the three other brown algal cpDNAs. This decrease in spacer size makes the S. horneri cpDNA more compact than the other three brown algal cpDNAs. The spacer G + C content is only 18.05 % in S. horneri cpDNA.
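The compactness figures are mutually consistent, as the following sketch shows. It assumes that the coding fraction is defined as the genome length minus the total spacer length, which reproduces the reported percentage; the spacer count is inferred from the reported total and average and is not stated in the text.

```python
GENOME_BP = 124_068          # reported genome size
SPACER_TOTAL_BP = 16_967     # reported total spacer size
SPACER_MEAN_BP = 101.0       # reported average spacer length

# Assumption: coding sequence = genome minus intergenic spacers.
coding_bp = GENOME_BP - SPACER_TOTAL_BP
coding_pct = 100.0 * coding_bp / GENOME_BP

# Inferred, not stated in the text: number of spacers behind the average.
implied_spacers = round(SPACER_TOTAL_BP / SPACER_MEAN_BP)

print(f"coding: {coding_bp} bp ({coding_pct:.2f} %)")
print(f"implied number of spacers: {implied_spacers}")
```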

Five pairs of genes are found overlapping by 4 to 53 bp in the S. horneri cpDNA; i.e., ftrB and ycf12 overlapped by 6 bp, sufC and sufB by 4 bp, rps1 and thiS by 4 bp, psbC and psbD by 53 bp, and rpl23 and rpl4 by 8 bp. The overlaps of psbC-psbD (53 bp) and rpl23-rpl4 (8 bp) are highly conserved among the four brown algal cpDNAs. These two overlapping regions are also observed in diatom plastid genomes (e.g., Oudot-Le Secq et al. 2007; Tanaka et al. 2011; Galachyants et al. 2012).

Small inverted and tandem repeats

Brown algal chloroplast genomes harbor multiple small inverted repeats (SIRs) and tandem repeats (TRs). The S. horneri cpDNA contains 16 SIRs and 9 TRs (Table 2), representing 1.02 and 0.32 % of the S. horneri genome, respectively, values similar to those of the three other brown algal cpDNAs. The total repeat number found in brown algal cpDNAs (18–25) is similar to that in diatoms (Oudot-Le Secq et al. 2007; Galachyants et al. 2012), more than that in Pelagophyceae (Ong et al. 2010), and less than that in Raphidophyceae (Cattolico et al. 2008).

Table 2 Properties of small inverted and tandem repeats in chloroplast genomes of brown algae

The SIRs of S. horneri cpDNA are composed of a stem structure, which ranges from 23 to 55 bp in size (average 35.1 ± 9.5 bp), and a small loop domain averaging only 2.9 ± 2.3 bp. Approximately half of the SIRs in brown algal cpDNAs are located between two genes transcribed on opposite coding strands, which is similar to bacterial genomes and other reported stramenopile members (e.g., Lillo et al. 2002; Cattolico et al. 2008). Although some SIRs in the four brown algae occupy the same positions in their cpDNAs, they share no sequence homology. TRs have a period of 18.8 ± 11.2 bp with a copy number ranging from 1.9 to 5.1 in the S. horneri cpDNA. The average size of a TR element is 43.8 ± 26.9 bp.
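The stem-loop geometry of a small inverted repeat can be made concrete with a short sketch: the right arm, read as its reverse complement, should match the left arm. This is an illustration only, not the Inverted Repeats Finder algorithm; the toy sequences are hypothetical and far shorter than the 23 to 55 bp stems reported above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def arm_similarity(seq, stem_len, loop_len):
    """Percent of stem positions that can pair in a left-arm/loop/right-arm layout."""
    seq = seq.upper()
    if len(seq) != 2 * stem_len + loop_len:
        raise ValueError("sequence length must equal 2 * stem_len + loop_len")
    left = seq[:stem_len]
    right = seq[stem_len + loop_len:]
    partner = reverse_complement(right)  # what the left arm must equal to pair fully
    matches = sum(x == y for x, y in zip(left, partner))
    return 100.0 * matches / stem_len


# Hypothetical 7-bp stem with a 2-bp loop.
perfect = "GATTACA" + "TT" + "TGTAATC"
one_mismatch = "GATTACA" + "TT" + "TGTAATG"
print(arm_similarity(perfect, 7, 2))
print(arm_similarity(one_mismatch, 7, 2) > 75.0)
```

The second call shows how a similarity threshold, such as the >75 % constraint used in this study, still admits imperfect stems.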

Genes in the S. horneri cpDNA

The gene set of S. horneri cpDNA consists of 173 genes, including 6 rRNA genes located in the IRs, 28 tRNA genes sufficient for messenger RNA translation, and 139 PCGs. The PCG set includes 52 genes responsible for transcription and translation, 12 photosystem I-associated genes, 18 photosystem II-associated genes, 19 for electron transport and ATP synthesis, 6 for carbon assimilation, 5 for light harvesting and chl biosynthesis, 3 for signal transduction, 3 for protein import, 2 for Fe-S assembly, 2 chaperones-associated genes, 2 proteolysis-associated genes, and 15 conserved hypothetical genes (Table 3).
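The functional categories listed above can be tallied to confirm that they account for all 139 PCGs and, together with the RNA genes, all 173 genes. The dictionary keys below simply restate the categories from the text.

```python
# Gene counts per category, as listed in the text.
RNA_GENES = {"rRNA": 6, "tRNA": 28}
PCG_GROUPS = {
    "transcription and translation": 52,
    "photosystem I": 12,
    "photosystem II": 18,
    "electron transport and ATP synthesis": 19,
    "carbon assimilation": 6,
    "light harvesting and chl biosynthesis": 5,
    "signal transduction": 3,
    "protein import": 3,
    "Fe-S assembly": 2,
    "chaperones": 2,
    "proteolysis": 2,
    "conserved hypothetical genes": 15,
}

pcg_total = sum(PCG_GROUPS.values())
gene_total = pcg_total + sum(RNA_GENES.values())

print(pcg_total, gene_total)
```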

Table 3 Genes identified in the chloroplast genome of Sargassum horneri

The brown algal plastid genomes show high similarity in terms of gene content and composition. In total, 167 chloroplast genes (6 rRNAs, 28 tRNAs, and 133 PCGs) are shared by these four brown algae (Fig. 2). Two genes (petL and ycf54) found in S. horneri and F. vesiculosus cpDNAs were present in E. siliculosus, but absent in S. japonica. Four PCGs (psb28, sufB, sufC, and thiG) were only found in Fucales. Two new tRNA genes (trnM-4 and trnF-2) predicted in the S. japonica cpDNA were not detected in three other brown algal cpDNAs. The differences in gene content might be due to gene loss or function transfer to the nucleus.
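The presence and absence pattern described above maps naturally onto set operations. The sketch below encodes only the variable genes that are named in this paragraph on top of the 167 shared genes; the other species may carry further genes not named here, so only the S. horneri total is checked against the reported figure.

```python
SHARED_CORE = 167  # genes shared by all four species, as stated in the text

# Variable genes named in the text, per species (Sh, Fv, Sj, Es).
variable = {
    "Sh": {"petL", "ycf54", "psb28", "sufB", "sufC", "thiG"},
    "Fv": {"petL", "ycf54", "psb28", "sufB", "sufC", "thiG"},
    "Sj": {"trnM-4", "trnF-2"},
    "Es": {"petL", "ycf54"},
}

# Genes found only in the two Fucales species.
fucales_only = (variable["Sh"] & variable["Fv"]) - variable["Sj"] - variable["Es"]

# Genes present in Sh, Fv and Es but missing from Sj.
absent_in_sj = (variable["Sh"] & variable["Fv"] & variable["Es"]) - variable["Sj"]

# Genes predicted only in Sj.
sj_only = variable["Sj"] - variable["Sh"] - variable["Fv"] - variable["Es"]

sh_total = SHARED_CORE + len(variable["Sh"])

print(sorted(fucales_only))
print(sorted(absent_in_sj))
print(sorted(sj_only))
print(sh_total)
```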

Fig. 2 Venn diagram of gene content of the plastid genomes of Sargassum horneri and Fucus vesiculosus (Sh and Fv, blue) in order Fucales, Saccharina japonica (Sj, green) in order Laminariales, and Ectocarpus siliculosus (Es, yellow) in order Ectocarpales

In S. horneri cpDNA, two introns were identified, located in the trnL2 and trnW genes, with sizes of 209 and 90 bp, respectively. The intron in the trnL2 gene is also present in F. vesiculosus (219 bp) and S. japonica (234 bp), but has been lost in E. siliculosus; the intron in trnW was detected only in the S. horneri cpDNA and might have been formed by repeat mutation. In addition, a specific intron (31 bp) was found only in the S. japonica trnF2 gene (tRNA-Phe), located between ycf35 and ycf24.

All PCGs encoded by S. horneri cpDNA start with the ATG codon, with the exception of psbF, which starts with a GTG codon. Unusual start codons have also been identified in the three other brown algal cpDNAs: a GTG codon at the beginning of psbF and rpl3 in F. vesiculosus; of psbF and rps8 in S. japonica; and of rps8, rpl3, and rbcR in E. siliculosus, as well as a TTG codon at the beginning of Escp99 in E. siliculosus. All three stop codons are employed: approximately 79.86 % of PCGs (111 of 139 genes) terminate with the TAA stop codon, compared with 17.99 % with TAG and 2.16 % with TGA.
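The stop codon percentages follow directly from gene counts. In the sketch below, only the TAA count (111 of 139) is stated in the text; the TAG and TGA counts (25 and 3) are inferred here from the reported percentages and should be read as an assumption.

```python
# TAA count is stated in the text; TAG and TGA counts are inferred
# from the reported percentages (assumption).
STOP_COUNTS = {"TAA": 111, "TAG": 25, "TGA": 3}


def usage_percentages(counts):
    """Percentage of genes ending in each stop codon, rounded to 2 decimals."""
    total = sum(counts.values())
    return {codon: round(100.0 * n / total, 2) for codon, n in counts.items()}


usage = usage_percentages(STOP_COUNTS)
print(sum(STOP_COUNTS.values()))
print(usage)
```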

Gene order and identity

The gene order of S. horneri cpDNA is identical to that of F. vesiculosus, indicating that the genome organization of cpDNAs is conservative at the level of the order Fucales. The comparison by genome-scale alignment shows that chloroplast genomes of S. horneri and F. vesiculosus display the same gene synteny and have an overall nucleotide sequence identity of 82.1 %. Considering the genome organizations of S. japonica and E. siliculosus, multiple genome rearrangements occurred during the evolution of the Phaeophyceae (Le Corguillé et al. 2009).

Although the gene order of brown algal cpDNAs differs at the order level (Wang et al. 2013), their gene lengths and identities are highly similar (Supplementary Tables S2 and S3). Gene identities in rRNA and tRNA genes of the four brown algae are higher than those in PCGs (Fig. 3a). Among the PCGs, photosystem II-associated genes are the most conserved based on both nucleotide and amino acid sequences (Fig. 3b). It is worth noting that the size of the ilvH gene varies widely among brown algal cpDNAs: it is 489 bp in S. horneri, much shorter than the 651 bp in F. vesiculosus, 603 bp in S. japonica, and 585 bp in E. siliculosus, owing to a premature termination codon in the S. horneri ilvH gene.
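How a premature termination codon shortens a gene can be shown with a minimal sketch that scans a coding sequence in frame until the first stop codon. The two toy sequences are hypothetical and are not the ilvH sequences; they differ by a single C to T change that turns CAA into TAA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length(cds):
    """Length in bp from the first base through the first in-frame stop codon.

    Returns None if no in-frame stop codon is found.
    """
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i + 3
    return None


# Hypothetical toy sequences for illustration only.
full_length = "ATGAAACAAGGGTAG"   # ATG AAA CAA GGG TAG
truncated = "ATGAAATAAGGGTAG"     # ATG AAA TAA ... (premature stop)

print(orf_length(full_length))
print(orf_length(truncated))
```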

Fig. 3 The comparison of average identity percentages (a) of tRNA, rRNA, and protein-coding gene (PCG) sequences and (b) of the specific functional PCG nucleotide sequences from S. horneri (Sh), F. vesiculosus (Fv), S. japonica (Sj), and E. siliculosus (Es). TT transcription and translation; PSI photosystem I; PSII photosystem II; EA electron transport and ATP synthesis; CA carbon assimilation; LC light harvesting and chl biosynthesis; ST signal transduction; PI protein import; FCP Fe-S assembly, chaperones, and proteolysis; and CHG conserved hypothetical genes.

Conclusion

Sargassum horneri is the fourth brown algal species to have its complete cpDNA sequenced. The new data obtained will provide important information for us to understand plastid evolution as well as phylogeny in brown algae, especially the Sargassaceae. However, limited plastid genomic information severely restricts more detailed investigation on the evolution of the brown algae. Additional sequencing of unsampled taxonomic groups is necessary so that patterns of genome organization in cpDNAs could be further investigated for a better understanding of phylogenomic relationships in brown algae.