Introduction

Saccharina (Laminariales, Phaeophyceae), one of the most economically important algae, is used as food and as a raw material for the biochemical industry because algin, mannitol, and iodine can be extracted from it (Jensen 1993; Porse and Rudolph 2017). It also plays a key role in maintaining offshore ecological balance (An et al. 2010). Additionally, Saccharina has been considered a potential model organism for genetic studies of seaweeds with respect to heterogenesis, parthenogenesis, and apogamy in its life history (Waaland et al. 2004). Because of its economic, ecological, and genetic value, Saccharina is cultured on a large scale in the Western Pacific region, especially in China (Liu et al. 2012). So far, more than 20 elite varieties have been bred in China, and its mariculture industry ranks first in the world (Liu et al. 2010b). However, several problems in Saccharina breeding, including genetic confusion, inbreeding depression, and the limited number of improved varieties, have not yet been resolved. Systematic application of molecular markers to genetic studies will improve breeding work.

Recently, several types of molecular markers for Saccharina, such as random amplified polymorphic DNA (RAPD) (Billot et al. 1999a; Yotsukura et al. 2001; Wang et al. 2004; Bi et al. 2011), inter-simple sequence repeat (ISSR) (Wang et al. 2005), and amplified fragment length polymorphism (AFLP) (Li et al. 2007b; Yang et al. 2009; Liu et al. 2010a; Shan et al. 2011), have been used in germplasm identification, genetic diversity and population genetics studies, genetic mapping, and QTL analysis. Simple sequence repeats (SSRs), also known as microsatellites, are widely used in many species, including tobacco, rice, maize, wheat, poplar, pine, apple, and peach (Gupta et al. 1996; Rungis et al. 2004; Chen et al. 2005). Compared with other molecular markers, SSR markers show Mendelian inheritance and co-dominance and are simpler, more stable, abundant, highly reproducible, and polymorphic (Varshney et al. 2002; Chen et al. 2005). For Saccharina and Laminaria, SSR studies have made considerable progress in recent years. Ten polymorphic genomic SSRs (gSSRs) were isolated from small-insert genomic libraries of Laminaria digitata (Billot et al. 1999b), 18 gSSR markers were developed from S. japonica using the FIASCO method (Shi et al. 2007), 27 polymorphic trinucleotide gSSR markers were developed from S. japonica using paired-end Illumina sequencing data (Zhang et al. 2014b), and 23 gSSRs were obtained from an SSR-enriched genomic library of S. japonica (Zhang et al. 2015). In addition, nine polymorphic expressed sequence tag (EST)-SSR markers were generated from S. japonica EST sequences (Liu et al. 2010b), 23 polymorphic EST-SSR markers were developed from L. digitata and S. japonica ESTs (Wang et al. 2011), and 13 EST-SSR markers for wild S. japonica populations were developed using L. digitata ESTs (Liu et al. 2012). Even so, SSR work on this important seaweed lags far behind that on higher plants, and more SSR markers need to be developed to meet the needs of molecular genetics.

In contrast with gSSRs, which are developed by enrichment and sequencing of genomic libraries (Edwards et al. 1996), EST-SSRs are more efficient and less expensive to develop (Bouck and Vision 2007). Moreover, they are associated with functional genes, which makes them more useful for constructing physical maps and for marker-assisted selection. However, only 984 ESTs from Saccharina and 4102 ESTs from Laminaria are currently available at the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov), which has hampered the development of EST-SSR markers. Transcriptome sequencing is a useful method for producing EST datasets for expression pattern construction, molecular marker development, and novel gene identification (Li et al. 2012). Fortunately, the transcriptome of S. japonica has been sequenced de novo by our team using next generation sequencing (NGS) technologies (Liang et al. 2014), yielding a large number of ESTs. Here, we developed a batch of EST-SSR markers from these transcriptome data and tested their practicability. In addition, the genetic diversity and population structure of ten Saccharina strains, which represent almost all of the most widely used commercial varieties in China, were evaluated using the polymorphic EST-SSR markers developed in this study. This work will strengthen genetic research and promote molecular marker-assisted selection in Saccharina.

Materials and methods

EST-SSR identification and primer design

The ESTs used in this study were generated from the transcriptome of S. japonica, which was sequenced by our team on an Illumina HiSeq 2000 (Liang et al. 2014) as part of the 1KP Project (http://www.onekp.com/). Microsatellites with 2–6 bp repeat units were searched for in these ESTs using the microsatellite search module (MISA) of PrimerPro (http://webdocs.cs.ualberta.ca/~yifeng/primerpro/), as described in Thiel et al. (2003). The parameters were set to detect dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide motifs with a minimum of ten, six, five, four, and three repeats, respectively (Liu et al. 2012). Mononucleotide repeats (A, T, G, and C) were not considered in this work.
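The search criteria above can be illustrated with a minimal Python sketch. It is not the MISA implementation used in this study: the function names `find_ssrs` and `is_primitive` and the example sequence are hypothetical, and only perfect repeats are reported (compound or interrupted SSRs are not handled).

```python
import re

# Minimum number of repeats per motif length, as stated in the text
# (di-: 10, tri-: 6, tetra-: 5, penta-: 4, hexa-: 3).
MIN_REPEATS = {2: 10, 3: 6, 4: 5, 5: 4, 6: 3}


def is_primitive(unit):
    """Return True if the unit is not itself a repeat of a shorter unit."""
    n = len(unit)
    return not any(
        n % d == 0 and unit[:d] * (n // d) == unit for d in range(1, n)
    )


def find_ssrs(seq):
    """Return (motif, repeat_count, start) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One k-mer followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip homopolymers and units such as ATAT, which belong
            # to a shorter motif class.
            if is_primitive(unit):
                hits.append((unit, len(m.group(0)) // k, m.start()))
    return sorted(hits, key=lambda h: h[2])


# Hypothetical example sequence, not taken from the S. japonica data.
example = "GGC" + "AT" * 10 + "CCGTA" + "AGC" * 6 + "TT"
ssrs = find_ssrs(example)
```

In this toy sequence the (AT)10 and (AGC)6 tracts meet the thresholds, whereas a tract such as (ACGT)4 would fall below the five-repeat minimum for tetranucleotides and be ignored.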

SSR primers were designed in the conserved flanking regions with Primer Premier software (version 5.0). The criteria for primer design were as follows: product length of 150–500 bp, primer size of 18–24 bp, primer melting temperature of 45–65 °C, a difference in annealing temperature between the forward and reverse primers of no more than 1 °C, and no secondary structure. Oligonucleotides were synthesized by Invitrogen, Shanghai, China.
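As a rough illustration, the numeric design criteria can be written as a simple filter. This is a sketch under stated assumptions, not the procedure implemented in Primer Premier: the function name `passes_design_criteria` and the example primers are hypothetical, melting temperatures are supplied by the caller rather than computed, the 1 °C limit is applied to the difference between the two supplied temperatures, and secondary structure is not evaluated.

```python
def passes_design_criteria(fwd, rev, product_len, tm_fwd, tm_rev):
    """Check a primer pair against the numeric criteria listed in the text.

    tm_fwd and tm_rev are temperatures in deg C supplied by the caller
    (assumption); secondary structure is not checked here.
    """
    return (
        150 <= product_len <= 500                            # product length, bp
        and all(18 <= len(p) <= 24 for p in (fwd, rev))      # primer size, nt
        and all(45 <= tm <= 65 for tm in (tm_fwd, tm_rev))   # Tm window
        and abs(tm_fwd - tm_rev) <= 1                        # forward/reverse gap
    )


# Hypothetical primer pair, for illustration only.
ok = passes_design_criteria(
    "ACGTTGCAAGCTTGGACCTA", "TGGCAATCCGTAGGCTTAAC", 230, 58.0, 58.6
)
```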

Practicability test of the designed EST-SSR primers

All designed primer pairs were initially tested for amplification using DNA templates from four varieties: “Ailunwan” (S. japonica × Saccharina latissima), “Zaohoucheng” (S. japonica), “Dongfang No. 2” (S. japonica × Saccharina longissima), and “Rongfu” (S. japonica × S. latissima). All four Saccharina varieties are widely cultured in China. Total DNA was isolated from sporophytes using an improved CTAB method (Guillemaut and Drouard 1992). PCR amplifications were carried out in a 20 μL reaction mixture containing 1× buffer, 250 μM dNTPs, 200 μM of each primer, 30–50 ng template DNA, 1.5 mM MgCl2, and 0.5 U Taq DNA polymerase (Zhang et al. 2015). The thermal program with gradient temperature control was as follows: 5 min at 94 °C; 35 cycles of 50 s at 94 °C, 45 s at an annealing temperature taken from a 45–65 °C gradient, and 45 s at 72 °C; and a final extension of 10 min at 72 °C.

Amplified products were separated by electrophoresis in 3% agarose gels and visualized with GoldView (5 μL GoldView added to 100 mL of cooled agarose gel solution). Primer pairs that yielded clear bands were then used to optimize the annealing temperature and the number of PCR cycles.

Genetic diversity analysis

A total of 128 individuals from ten strains (“Ailunwan,” “Fujian,” “Rongfu,” “Pingbancai,” “Sanhai,” “Gaojia,” “Zaohoucheng,” “Dongfang No. 2,” “Dongfang No. 3,” and “Pengza No. 2”), which cover almost all of the widely used commercial Saccharina varieties in China, were used to evaluate genetic diversity and population structure with the SSR primer pairs developed in this work. SSR PCR amplifications were performed using the reaction conditions described above. The amplified products were separated by 6% denaturing polyacrylamide gel electrophoresis and visualized by silver staining (Bassam et al. 1991). The molecular sizes of the amplified fragments were estimated using a 20-bp DNA ladder (Takara).

The SSR products amplified by each primer pair were scored manually as binary data, with “1” for presence and “0” for absence, based on the SSR pattern and following a previously reported method (Sun et al. 2006). The percentage of polymorphic loci (P) among strains, Nei’s genetic diversity (H), Shannon’s information index (I), effective number of alleles (Ne), genetic identity, and genetic distance were calculated from the EST-SSR data using POPGENE version 1.31 (Yeh et al. 1999). Cluster analysis was performed on the similarity matrix with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm in NTSYS-pc (version 2.1). Polymorphism information content (PIC) was calculated according to the following formula: \( \mathrm{PIC}=1-\sum_{i=1}^{n}P_i^2-\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}2P_i^2P_j^2 \), where n is the number of alleles at one locus and \( P_i \) and \( P_j \) are the frequencies of the ith and jth alleles at that locus (Botstein et al. 1980).
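The PIC formula above translates directly into a few lines of Python. This is a minimal sketch that assumes the allele frequencies at a locus are already available as a list summing to 1; the function name `pic` and the example frequencies are illustrative and are not data from this study.

```python
def pic(freqs):
    """Polymorphism information content for one locus (Botstein et al. 1980).

    freqs: allele frequencies at the locus (assumed to sum to 1).
    """
    n = len(freqs)
    # First sum: sum of squared allele frequencies.
    sum_sq = sum(p ** 2 for p in freqs)
    # Double sum over all allele pairs i < j of 2 * Pi^2 * Pj^2.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return 1 - sum_sq - cross


# Illustrative frequencies: two equally frequent alleles.
pic_two_alleles = pic([0.5, 0.5])  # 1 - 0.5 - 0.125 = 0.375
```

For a locus with two equally frequent alleles the first sum is 0.5 and the double sum is 2 × 0.25 × 0.25 = 0.125, which gives a PIC of 0.375. A monomorphic locus gives 0.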

To further investigate the genetic relationships among Chinese Saccharina strains, the possible population structure was analyzed with the STRUCTURE program (version 2.2) (Pritchard et al. 2000) under the admixture model. To estimate the number of subpopulations (K), six independent runs were performed at each value of K from 2 to 10. Both the length of the burn-in period and the number of iterations were set to 100,000. The criterion used to identify the most probable value of K was ΔK, an ad hoc quantity related to the second-order rate of change in the log probability of the data with respect to the number of clusters inferred by STRUCTURE (Evanno et al. 2005).
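The ΔK criterion can be sketched as follows. This is a minimal illustration, not the procedure used to produce the results reported here. It takes the absolute second difference of the run-averaged ln P(D) values and divides it by the sample standard deviation of ln P(D) at that K, which is one common way of evaluating the statistic. The function name `delta_k` and all ln P(D) values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta K from {K: [ln P(D) of each replicate run]}.

    Assumes consecutive integer K values and at least two runs per K
    with non-identical ln P(D) (otherwise the standard deviation is 0).
    """
    ks = sorted(lnp_by_k)
    avg = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    # Both neighbours are needed, so the smallest and largest K drop out.
    for k in ks[1:-1]:
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical ln P(D) values for three runs at each K (not study data).
runs = {
    2: [-1000, -1002, -998],
    3: [-900, -901, -899],
    4: [-890, -892, -888],
    5: [-885, -880, -890],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Because ΔK needs the log probabilities at K − 1 and K + 1, it can only be evaluated for the interior values of the tested range.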

Results

Characterization of searched microsatellites

A set of 98,627 ESTs with a total length of 44,362,190 bp was obtained from the transcriptome of S. japonica (Table 1). Using the MISA program, 9688 SSRs were identified in 8039 ESTs, corresponding to a frequency of 9.82% (9688 SSRs relative to 98,627 ESTs), or on average one microsatellite per 4.6 kb of Saccharina transcriptome sequence. Most SSR-containing ESTs carried only one SSR, whereas 1237 ESTs contained two or more. The identified microsatellites comprised five repeat types: dinucleotide (285, 2.94%), trinucleotide (5589, 57.69%), tetranucleotide (953, 9.84%), pentanucleotide (934, 9.64%), and hexanucleotide (1927, 19.89%). Trinucleotide repeats were thus the predominant type. The distribution of EST-SSR repeat types is shown in Table 2.
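The summary statistics in this paragraph follow from the reported counts, as the short Python check below shows. The variable names are illustrative. The share of ESTs that carry at least one SSR is computed only for comparison; it is derived here from the reported counts and is not a figure stated in the text.

```python
# Counts reported in the text.
n_ests = 98_627
total_bp = 44_362_190
n_ssrs = 9_688
n_ssr_ests = 8_039
type_counts = {
    "dinucleotide": 285,
    "trinucleotide": 5_589,
    "tetranucleotide": 953,
    "pentanucleotide": 934,
    "hexanucleotide": 1_927,
}

# SSR frequency expressed as SSRs relative to ESTs (the 9.82% in the text).
ssr_frequency_pct = 100 * n_ssrs / n_ests
# Share of ESTs carrying at least one SSR (derived, for comparison).
ests_with_ssr_pct = 100 * n_ssr_ests / n_ests
# Average spacing between SSRs in kb (the "one per 4.6 kb" in the text).
kb_per_ssr = total_bp / n_ssrs / 1000
# Percentage of each repeat type among all SSRs.
type_pct = {name: round(100 * c / n_ssrs, 2) for name, c in type_counts.items()}
```

The five repeat-type counts add up to 9688, and the recomputed percentages agree with the values quoted above.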

Table 1 Searching results of EST-SSRs from the transcriptome of S. japonica
Table 2 Distribution of EST-SSR repeat types found in S. japonica

For a more detailed analysis, the SSRs were assigned to three classes following Weber (1990): 575 microsatellites (5.93%) were compound, 5921 (61.12%) were perfect, and 3192 (32.95%) were imperfect.

Evaluation of EST-SSR markers

Some microsatellites were located too close to the end of the flanking region, or the base composition of their flanking sequences was unsuitable for primer design (Wang et al. 2011). Therefore, not all of the identified microsatellites were suitable for primer design. In total, 1120 PCR primer pairs were designed and synthesized according to the primer design criteria used in this work.

Among the 1120 PCR primer pairs, 631 (56.34%) amplified well at the target region in the four Saccharina templates. Primer pairs that showed no amplification or multiple bands were excluded. Of the 1120 primer pairs screened, 146 (13.04%) revealed polymorphism among the four templates. Of these 146 primer pairs, 83 (56.85%) targeted dinucleotide repeats, 45 (30.82%) trinucleotide repeats, 13 (8.91%) tetranucleotide repeats, and 5 (3.42%) pentanucleotide repeats (Table 3), indicating that primers for dinucleotide repeats had the highest development efficiency. The most abundant repeat motif was (AT/TA)n (19.18%), followed by (AG/GA)n (9.59%) and (CT/TC)n (8.91%).

Table 3 Primer pairs of polymorphic EST-SSR characterized by repeat-unit type

Genetic diversity analysis

A total of 128 individuals belonging to ten Saccharina strains were analyzed with 52 representative polymorphic EST-SSR markers obtained in this work. In total, 139 alleles were amplified, and the average number of alleles per locus was 2.67 (range 2 to 5). The PIC ranged from 0.064 (HDS593) to 0.657 (HDS964), with an average of 0.327. Detailed information on the 52 EST-SSR markers is presented in Table 4.

Table 4 Characteristics of 52 representative microsatellite loci of S. japonica

Genetic diversity was then assessed using these 52 EST-SSR primer pairs. The percentage of polymorphic loci (P) per strain ranged from 28.85% (“Zaohoucheng”) to 73.08% (“Ailunwan,” “Dongfang No. 2,” and “Dongfang No. 3”). Nei’s genetic diversity (H) ranged from 0.1207 (“Zaohoucheng”) to 0.3376 (“Dongfang No. 3”), and Shannon’s information index (I) ranged from 0.1819 (“Zaohoucheng”) to 0.5267 (“Dongfang No. 3”) (Table 5). Thus, the genetic diversity of “Zaohoucheng” was the lowest and that of “Dongfang No. 3” was the highest. According to the genetic identity and genetic distance estimated from the EST-SSR markers (Table 6), the genetic identity coefficients ranged from 0.5555 to 0.9214. Among the ten strains, “Dongfang No. 3” and “Pingbancai” had the lowest similarity, while “Gaojia” and “Pengza No. 2” had the highest.
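The link between genetic identity and genetic distance can be illustrated with a short sketch. It assumes the standard Nei (1972) definitions, I = Jxy / √(Jx·Jy) and D = −ln I; whether POPGENE applied exactly this form or a bias-corrected variant to produce Table 6 is not stated here. The function name `nei_identity_distance` and the example allele frequencies are hypothetical.

```python
from math import log, sqrt


def nei_identity_distance(pop_x, pop_y):
    """Nei's (1972) genetic identity I and distance D = -ln(I).

    pop_x, pop_y: lists of per-locus allele-frequency lists, with loci
    and alleles in the same order in both populations.
    """
    n_loci = len(pop_x)
    jx = sum(sum(p * p for p in locus) for locus in pop_x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in pop_y) / n_loci
    jxy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(pop_x, pop_y)
    ) / n_loci
    identity = jxy / sqrt(jx * jy)
    return identity, -log(identity)


# Hypothetical one-locus example: a fixed population vs. a polymorphic one.
identity, distance = nei_identity_distance([[1.0, 0.0]], [[0.5, 0.5]])

# Distances implied by the extreme identity values reported in the text,
# under the D = -ln(I) assumption.
d_min_identity = -log(0.5555)
d_max_identity = -log(0.9214)
```

Under this assumption, the reported identity range of 0.5555 to 0.9214 corresponds to genetic distances of roughly 0.59 and 0.08, respectively.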

Table 5 Genetic diversity statistics of ten Saccharina populations based on 52 EST-SSR markers
Table 6 Nei’s genetic identity (above diagonal) and genetic distance (below diagonal) among ten Saccharina populations

All the amplified alleles were used for UPGMA cluster analysis of the ten Saccharina strains. Different samples from the same strain clustered together, and the ten strains were divided into two main groups (Fig. 1). The first group contained nine strains, which were divided into two subgroups, while the second group contained only “Dongfang No. 3.” Within the clade of nine strains, “Dongfang No. 2” was the basal strain, with “Fujian” and “Zaohoucheng” diverging next. The remaining six strains fell into two clusters: “Gaojia,” “Pengza No. 2,” and “Ailunwan” formed one clade, while “Sanhai,” “Rongfu,” and “Pingbancai” formed another.
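For readers unfamiliar with UPGMA, the clustering step can be sketched in plain Python. This is an illustrative implementation and not NTSYS-pc. It works on pairwise distances rather than on the similarity matrix used in this study, and the function name `upgma`, the labels A to D, and the distances are hypothetical.

```python
def upgma(distances):
    """Cluster labels by UPGMA.

    distances: {(a, b): d} with one entry per unordered pair of labels.
    Returns the merges in order as (left, right, height); merged
    clusters are represented as nested tuples of labels.
    """
    d = {frozenset(pair): v for pair, v in distances.items()}
    size = {label: 1 for pair in distances for label in pair}
    merges = []
    while len(size) > 1:
        # Join the two clusters separated by the smallest distance.
        closest = min(d, key=d.get)
        a, b = sorted(closest, key=str)
        merged = (a, b)
        height = d.pop(closest) / 2
        # Distance from the new cluster to every other cluster is the
        # size-weighted mean of the two old distances.
        for c in list(size):
            if c == a or c == b:
                continue
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            d[frozenset((merged, c))] = (
                size[a] * dac + size[b] * dbc
            ) / (size[a] + size[b])
        size[merged] = size.pop(a) + size.pop(b)
        merges.append((a, b, height))
    return merges


# Hypothetical distances among four items (not the study's strains).
toy = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree = upgma(toy)
```

In the toy example A and B join first, C joins that pair next, and D joins last, which mirrors the way a single divergent strain ends up alone in its own main group.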

Fig. 1 The UPGMA dendrogram of ten Saccharina populations based on 52 representative polymorphic SSR markers developed in this study

Population structure was determined using the STRUCTURE program in order to further investigate the relationships among Chinese Saccharina strains. Likelihood values were estimated for each K (from 2 to 10) in six independent runs, and the maximum log likelihood was attained at K = 3 (Fig. 2). At this level, all samples from the ten strains could be classified into three groups with the greatest probability. Clustering bar plots for K = 3 are shown in Fig. 3. The strains derived from S. japonica, “Zaohoucheng” and “Fujian,” showed a similar pattern and clustered into one group (red); “Dongfang No. 2” and “Dongfang No. 3,” which originated from hybrids of S. japonica and S. longissima, tended to cluster into a second group (green); and “Ailunwan,” “Pingbancai,” “Sanhai,” “Rongfu,” “Pengza No. 2,” and “Gaojia” tended to cluster into a third group (blue).

Fig. 2 Assessment of population structure by ΔLn P(D). Log probability of the data [Ln P(D)] as a function of K (number of groups) from the STRUCTURE runs. The STRUCTURE simulation showed a modest peak at K = 3, suggesting that all individuals are most probably assigned to three groups

Fig. 3 Estimated population structure of 100 individuals for ten Saccharina populations using the program STRUCTURE (K = 3). Each group is represented by a different color: red, “Zaohoucheng” and “Fujian”; green, “Dongfang No. 2” and “Dongfang No. 3”; blue, “Ailunwan,” “Pingbancai,” “Sanhai,” “Rongfu,” “Pengza No. 2,” and “Gaojia”

Discussion

Because of the intrinsic advantages of EST-SSRs, the development of SSR markers from ESTs has attracted increasing attention. The major disadvantage of gSSR and EST-SSR markers is their strong species specificity: markers need to be developed de novo for each species (Squirrell et al. 2003). Primer design therefore requires genome or transcriptome sequence information in advance. Traditionally, ESTs were obtained by constructing cDNA libraries and sequencing a number of clones. With the advent of high-throughput next generation sequencing platforms such as Roche/454, ABI/, and Illumina/HiSeq 2000, large-scale transcriptome sequences can be generated economically, which enables researchers to develop batches of EST-SSR markers more easily. EST-SSRs have been identified using NGS technology in many species, such as Cyprinus carpio (Ji et al. 2012), alfalfa (Liu et al. 2013), Carassius auratus (Zheng et al. 2014), Pelteobagrus fulvidraco (Zhang et al. 2014a), Neolitsea sericea (Chen et al. 2015), and Verasper variegatus (Ge et al. 2017). gSSR markers for Saccharina based on paired-end Illumina sequencing data have also been reported (Zhang et al. 2014b). Here, we identified 9688 EST-SSRs and developed 146 novel EST-SSR markers from 98,627 S. japonica ESTs produced by NGS technology. Previously, nine EST-SSR markers were generated from S. japonica EST sequences (Liu et al. 2010b) and 23 were developed from L. digitata and S. japonica ESTs (Wang et al. 2011). Compared with these earlier Saccharina EST-SSR reports, the obvious advantage of the NGS approach used in this work is its ability to produce large numbers of ESTs from which numerous gene-associated microsatellite markers can be isolated and developed at lower cost and with less effort (Zalapa et al. 2012).

Usually, most SSRs are present in genomic DNA, while only a small proportion is present in the transcriptome (Sun et al. 2006). Some studies have shown that 7–10% of EST sequences contain SSRs in land plants (Chen et al. 2005). In this work, 9688 EST-SSRs, corresponding to a frequency of 9.82%, were identified using the MISA program. This frequency is consistent with that of land plants but higher than that of other seaweeds for which EST-SSR markers have been reported, such as Ulva prolifera (2.91%) (Zhang et al. 2014c) and Porphyra haitanensis (6.02%) (Xie et al. 2009). In earlier reports on S. japonica, SSRs were found in 3–6% of EST sequences (Wang et al. 2011; Liu et al. 2012). In general, NGS methods can provide comprehensive information about the genomic organization of repeat sequences in a species (Zheng et al. 2014). However, other explanations should also be considered, such as the SSR search criteria and the size of the dataset, both of which usually affect the observed frequency of EST-SSRs (Varshney et al. 2005; Liang et al. 2009). Although the criteria for screening EST-SSRs vary among species, trinucleotide repeats are the most common type, not only in many land plants but also in some seaweeds (Varshney et al. 2002; Kantety et al. 2002; Sun et al. 2006; Xie et al. 2009; Wang et al. 2011). Our results are in accordance with this general pattern. In addition, if the four nucleotides (A, T, G, C) occur in random combinations, all SSR motifs can be represented by 4 different dinucleotide motifs (AC, AG, AT, CG), 10 different trinucleotide motifs, 33 different tetranucleotide motifs, 102 different pentanucleotide motifs, and 350 different hexanucleotide motifs (Rota et al. 2005). In this work, all types of trinucleotide and tetranucleotide motifs and most types of dinucleotide, pentanucleotide, and hexanucleotide motifs occurred in S. japonica EST-SSRs (Table 2). These results indicate that a wide range of EST-SSR motif types is present in S. japonica. Whether this distribution reflects the actual distribution of EST-SSRs in S. japonica still requires more evidence and further research.
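The motif counts quoted above (4, 10, 33, 102, and 350) can be reproduced by brute-force enumeration. The sketch below assumes that two motifs belong to the same class when one is a cyclic shift of the other or of its reverse complement, and that units which are themselves repeats of a shorter unit are excluded. The function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(motif):
    """Smallest string among all cyclic shifts of the motif and of its
    reverse complement; serves as the class representative."""
    revcomp = motif.translate(COMPLEMENT)[::-1]
    shifts = [s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s))]
    return min(shifts)


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n)
    )


def count_motif_classes(k):
    """Number of distinct SSR motif classes for unit length k."""
    classes = set()
    for letters in product("ACGT", repeat=k):
        motif = "".join(letters)
        if is_primitive(motif):
            classes.add(canonical(motif))
    return len(classes)


motif_classes = {k: count_motif_classes(k) for k in range(2, 7)}
```

For dinucleotides, for example, AC, CA, GT, and TG all collapse into the single class AC, which is why only four classes (AC, AG, AT, CG) remain.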

The key step in an effective breeding or conservation program is to accurately evaluate the available genetic resources (Xie et al. 2009), and SSR analysis is a well-established tool for evaluating genetic diversity. The values of P, H, Ne, He, and I, all of which are parameters of genetic diversity, vary with abundance (Nei 1972). In the present study, “Zaohoucheng,” which is an inbred line of S. japonica (Tian and Yuan 1989), exhibited the lowest genetic diversity (P = 28.85%, H = 0.1207), while “Dongfang No. 3” (P = 73.08%, H = 0.3376) showed the highest. “Dongfang No. 3” is a direct first filial generation hybrid between female gametophytes of S. japonica and male gametophytes of S. longissima (Li et al. 2008). Our EST-SSR survey indicated that continuous selfing and/or inbreeding had reduced genetic diversity and might lead to inbreeding depression. Although high genetic diversity can enhance adaptability to the environment, self-crossing and selection stabilize the traits of a variety (Zhao et al. 2013). Both aspects therefore need to be weighed, and evaluating genetic diversity is indispensable in Saccharina breeding.

In this work, the genetic identity and clustering of ten Saccharina strains that are widely cultured in China were analyzed using all the amplified EST-SSR alleles. Genetic identity and clustering order reflect the relationships among strains. Moreover, it has been reported that lower genetic identity between parents results, to a certain extent, in more pronounced heterosis (Melchinger et al. 1992). In other words, stronger heterosis could be achieved in breeding if parental combinations with lower genetic identity are used. Our results can therefore help in selecting parents for Saccharina breeding. In addition, population structure analysis placed the ten strains into three categories: one consisted of strains bred from S. japonica, one consisted of hybrids derived from S. japonica and S. longissima, and one consisted of strains derived mostly from hybrids between S. japonica and S. latissima, such as “Ailunwan,” “Rongfu,” and “Sanhai” (Zhang et al. 2011, 2016). In general, accessions with the same background have similar Q values and fall within one cluster (Hou et al. 2011). We infer that the Saccharina strains currently cultured in China fall into three groups according to genetic background. This is consistent with the traditional understanding of the parental origin of these strains (Tian and Yuan 1989; Li et al. 2007a, 2008; Zhang et al. 2011, 2016). The genetic sources of the Saccharina strains were thus revealed using the EST-SSR markers developed in this work.

Taken together, these results show that the EST-SSR analytical system based on the polymorphic markers obtained in this study can be applied successfully to the genetic analysis of Chinese Saccharina strains. The EST-SSR markers developed here will facilitate many fields of genetic analysis and promote molecular marker-assisted breeding in Saccharina.