A rapid and cost-effective approach for the development of polymorphic microsatellites in non-model species using paired-end RAD sequencing

Xue, Dong-Xiu; Li, Yu-Long; Liu, Jin-Xian

doi:10.1007/s00438-017-1337-x


Methods Paper
Published: 20 June 2017

Volume 292, pages 1165–1174, (2017)

Molecular Genetics and Genomics


Abstract

As one of the most informative and versatile DNA-based markers, microsatellites have been widely used in population and conservation genetic studies. However, the development of microsatellites has traditionally been laborious, time-consuming, and expensive. In the present study, a rapid and cost-effective “RAD-seq-Assembly-Microsatellite” approach was developed to identify abundant microsatellite markers in non-model species, using the roughskin sculpin Trachidermus fasciatus as a representative. Overlapping paired-end Illumina reads generated by restriction-site-associated DNA sequencing (RAD-seq) were clustered based on the similarity of reads containing the restriction enzyme recognition site and then assembled into contigs, which were used for microsatellite discovery and primer design. A total of 121,750 RAD contigs were generated with a mean length of 522 bp, and 19,782 contigs contained microsatellite motifs. A total of 156,150 primer pairs were successfully designed based on 16,497 contigs containing priming sites. Experimental validation of 52 randomly selected microsatellite loci demonstrated that 45 (86.54%) loci were successfully amplified and polymorphic in two geographically isolated populations of T. fasciatus. Compared with traditional approaches based on DNA cloning and other approaches based on next-generation sequencing, our newly developed approach could yield thousands of microsatellite loci with a much higher amplification success rate and lower costs, especially for non-model species with limited genomic information. The “RAD-seq-Assembly-Microsatellite” approach holds great promise for microsatellite development in future ecological and evolutionary studies of non-model species.


Introduction

Microsatellites, also known as simple sequence repeats (SSRs), are tandem repeats of one to six nucleotides in DNA sequences (Oliveira et al. 2006). Given their extensive distribution in the genome, high polymorphism, codominant inheritance and high amplification success rate, microsatellites have been one of the most powerful and valuable molecular tools in many research areas, such as population genetics, conservation genetics, genome mapping, parentage analysis, and quantitative trait loci identification (Chang et al. 2009; Montanari et al. 2016; Xue et al. 2014). Despite advances in the acquisition of single nucleotide polymorphism (SNP) data, microsatellites remain useful and more easily accessible for many studies, such as those involving long-term monitoring of genetic diversity in stock management, and breeding or pedigree estimation (Hodel et al. 2016; Minegishi et al. 2015; Stabile et al. 2016; Weinman et al. 2015; Zalapa et al. 2012). A major limitation to the use of microsatellites is that traditional methods for microsatellite development, such as construction of an enriched library followed by cloning and Sanger sequencing, are labor-intensive, time-consuming, and expensive (Glenn and Schable 2005; Zane et al. 2002). Furthermore, microsatellites have to be developed de novo for every species under study, as cross-amplification from congeneric species is not generally feasible (Schoebel et al. 2013). Therefore, rapid and cost-effective methods for microsatellite development are urgently needed for population management and conservation of non-model species.

Next-generation sequencing (NGS) can produce large amounts of sequence data, from which numerous genome-wide and gene-based microsatellite loci can be isolated and developed (Zalapa et al. 2012). So far, studies using NGS to develop microsatellite loci have largely relied on the Roche 454 and Illumina sequencing platforms (Hodel et al. 2016; Minegishi et al. 2015; Schoebel et al. 2013; Zalapa et al. 2012). Since read length is an important factor affecting the ability to discover microsatellites and design primers, the 454 sequencing platform was used extensively for microsatellite development (Hodel et al. 2016; Mastretta-Yanes et al. 2015). However, 454 is less cost-effective than Illumina on a per-megabase basis (Glenn 2011; Zalapa et al. 2012). Moreover, Roche discontinued the 454 instrument in 2016. Currently, microsatellite discovery projects largely focus on Illumina platforms. However, the short read lengths obtained with Illumina platforms limit their utility for microsatellite development, because most reads do not have enough flanking sequence for primer design. To improve the efficiency of microsatellite development using Illumina reads, sequence assembly should be performed to create longer DNA sequences or contigs before microsatellite discovery.

One cost-efficient and practical strategy to develop microsatellite markers using NGS technologies is the sequencing of a reduced representation genomic library (Bonatelli et al. 2015). Restriction site-associated DNA sequencing (RAD-seq) is a useful approach to create reduced representation genomic libraries and provide sequence data adjacent to restriction enzyme recognition sites (Davey et al. 2011; Hohenlohe et al. 2013). RAD-seq incorporates a random shearing step in library preparation, which can be modified to generate overlapping paired reads. The reads of a single RAD locus generated by traditional RAD-seq technology can first be clustered using the similarity of the first reads containing the restriction enzyme recognition site, and the overlapping paired-end reads then allow local assembly of contigs containing both the forward and reverse reads of each pair (Hohenlohe et al. 2013), which could improve the accuracy and quality of the assembled contigs and therefore the success rate of microsatellite development. The roughskin sculpin Trachidermus fasciatus Heckel (Scorpaeniformes: Cottidae) is a small, benthic, carnivorous, and catadromous fish species native to the Northwestern Pacific (Onikura et al. 2002; Wang 1999). In the past decades, it has experienced severe population declines in China, probably due to habitat degradation, water pollution and dam construction (Wang and Cheng 2010). However, only a few molecular genetic resources are publicly available for the roughskin sculpin (Xu et al. 2008; Zeng et al. 2012), and the use of microsatellite markers in conservation genetic studies and marker-assisted selection has been limited (Li et al. 2016b).

In the present study, a “RAD-seq-Assembly-Microsatellite” approach was developed and applied to the roughskin sculpin as a representative of non-model species for which limited genetic data are available. To improve the success rate of microsatellite development in a simple, fast, and economical way, the approach combines the advantages of traditional RAD-seq technology with fast and efficient bioinformatic tools for read assembly, microsatellite isolation and primer design. The essence of the approach is to generate sufficiently long, high-quality contiguous sequences to overcome the technical limitations introduced by short read lengths and to isolate abundant microsatellite loci. Briefly, genomic DNA of a roughskin sculpin individual was sequenced using the overlapping paired-end RAD-seq protocol, and the generated reads were sorted according to RAD loci and locally assembled to obtain longer contiguous sequences. The assembled contigs were then searched for microsatellite sequences, and primer pairs were designed. Finally, 52 microsatellite loci were randomly selected and validated in two roughskin sculpin populations by PCR amplification and genotyping. The newly developed rapid and cost-effective approach would be of particular advantage for the isolation and characterization of sufficient microsatellite loci for ecological and evolutionary studies of non-model species.

Materials and methods

Sampling and genomic DNA extraction

A total of 48 individuals were collected from two geographic sites in China: 24 individuals from Dandong, Liaoning Province (39°46′N, 124°20′E) in May 2014, and 24 individuals from Fuyang, Zhejiang Province (30°03′N, 119°58′E) in January 2014. Muscle tissues were preserved in 95% ethanol. Genomic DNA was extracted using the standard phenol–chloroform extraction protocol, and checked using 1% agarose gel electrophoresis and a Nanodrop 2000c spectrophotometer.

Library preparation and RAD tag sequencing

Approximately 1 μg of genomic DNA extracted from a single individual from Fuyang was digested with the restriction enzyme EcoRI. The digested products were ligated to a modified Illumina P1 adapter containing an individual-specific 6 bp index sequence for sample tracking. The DNA sample was then randomly sheared to an average size of 500 bp, and fragments with insert sizes spanning 200–600 bp were isolated using a MinElute Gel Extraction kit (Qiagen). Single “A” base overhangs were added to the 3′ ends of the blunt-ended DNA fragments, and a modified P2 adapter containing a 3′ dT overhang was then ligated onto the ends of the DNA fragments with 3′ dA overhangs. Finally, the library was enriched by high-fidelity PCR amplification, yielding RAD tags that contain both adapters, for paired-end (2 × 125 bp) sequencing on an Illumina HiSeq 2500 platform at Novogene in Tianjin.
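As a rough illustration of why an EcoRI digest yields a reduced representation of the genome, the following Python sketch locates EcoRI recognition sites (GAATTC) in a sequence and counts the RAD tags that could arise from them. It assumes the usual RAD-seq situation in which each cut site can produce one tag in each flanking direction, so the count is an upper bound. The function names and the toy sequence are hypothetical and were not part of the study.

```python
import re

# EcoRI recognition sequence; the enzyme cuts between G and AATTC.
ECORI_SITE = "GAATTC"


def find_ecori_sites(sequence: str) -> list[int]:
    """Return the 0-based start position of every EcoRI site in the sequence."""
    return [m.start() for m in re.finditer(ECORI_SITE, sequence.upper())]


def expected_rad_tags(sequence: str) -> int:
    """Upper bound on RAD tags: one tag on each side of every cut site."""
    return 2 * len(find_ecori_sites(sequence))


# Toy sequence (hypothetical) containing two EcoRI sites.
demo = "ttgaattcAAGGCCTTGAATTCgg"
print(find_ecori_sites(demo), expected_rad_tags(demo))
```

On real data the number of recovered loci will be lower than this bound, because of repeats, sequencing depth and the filtering steps described in the next section.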

RAD data assembly and assessment

Illumina raw reads were quality-filtered, and PCR duplicates were removed by “clone_filter” in STACKS (version 1.32) (Catchen et al. 2013). The first reads containing restriction enzyme recognition sites were passed to STACKS to identify RAD loci. The minimum stack depth was set to 10, and the number of mismatches allowed between stacks was set to 3 to retain true alleles while separating them from paralogues. The deleveraging and removal algorithms were turned on to filter out highly repetitive loci. Finally, the second reads corresponding to each RAD locus were collected into separate files using a modified version of “sort_read_pairs.pl” in STACKS. The reads for each locus were locally assembled by CAP3, a DNA sequence assembly program based on overlap-layout-consensus methods (Huang and Madan 1999). The assembly was performed by a custom multi-threaded Perl script, CP3_Opti.pl (available at https://github.com/lyl8086/RAD_SSR), according to an optimized assembly approach. Firstly, the second reads for each RAD locus identified by the first reads were locally assembled into contigs. Secondly, the assembled contigs of the second reads were merged with the corresponding consensus sequences of the first reads for each RAD locus. Thirdly, a final assembly was performed on each RAD locus to generate the final assembled RAD reference. In general, the overlapping paired reads generated by RAD-seq are staggered over a local genomic region, so these reads can be locally assembled into high-quality contigs of up to 1 kb, depending on the size-selection strategy used in library preparation. The longer contigs thus provide sufficient sequence for downstream microsatellite discovery and primer design.
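The step of collecting the second reads of each RAD locus before local assembly can be illustrated with a minimal Python sketch. It is not the modified “sort_read_pairs.pl” script used in this study: the function names, the read-to-locus dictionary and the toy reads are assumptions made purely for illustration.

```python
from collections import defaultdict


def group_second_reads(read_to_locus, second_reads):
    """Bin second reads by the RAD locus assigned to their first-read mate.

    read_to_locus: dict mapping read ID -> locus ID (from first-read clustering).
    second_reads:  iterable of (read ID, sequence) tuples.
    Reads whose mate was not assigned to any locus are skipped.
    """
    by_locus = defaultdict(list)
    for read_id, seq in second_reads:
        locus = read_to_locus.get(read_id)
        if locus is not None:
            by_locus[locus].append((read_id, seq))
    return dict(by_locus)


def to_fasta(records):
    """Format (read ID, sequence) tuples as FASTA text for one locus."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq in records)


# Hypothetical toy input: three reads assigned to two loci, one unassigned.
read_to_locus = {"read_a": "locus_1", "read_b": "locus_1", "read_c": "locus_2"}
second_reads = [("read_a", "ACGTAC"), ("read_b", "GTACGG"),
                ("read_c", "TTGACC"), ("read_d", "CCCCCC")]
groups = group_second_reads(read_to_locus, second_reads)
print(to_fasta(groups["locus_1"]))
```

Each per-locus FASTA text could then be written to its own file and handed to an assembler such as CAP3, which keeps every assembly small and independent of the others.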

To check the quality of the assembled RAD reference, the paired reads used for assembly were mapped back to the reference with BWA 0.7.12 (Li and Durbin 2009). BWA “mem” (Li 2013) was used to generate the SAM file, with default parameters except for a minimum seed length of 32. The SAM file was processed with SAMTOOLS 1.3.1 (Li et al. 2009) to check the overall coverage, the number of mapped reads and the depth. To further improve the quality of the assembled RAD reference, only contigs with properly mapped read pairs (paired reads mapped in the right orientation with the proper insert size given by the aligner) and a minimum mapping quality of 20 were retained. Soft- or hard-clipped reads, secondary alignments, and reads carrying the SAM tags “XA” or “SA” were also removed. The resulting high-quality contigs were then used for downstream microsatellite discovery and primer design.
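The read-level filters described above can be expressed as a short Python sketch operating on single SAM records. This is an illustration under stated assumptions, not the authors' pipeline: it interprets the criteria per alignment, uses the standard SAM flag bits (0x2 properly paired, 0x4 unmapped, 0x100 secondary alignment), and the function name and the default threshold argument are hypothetical.

```python
def keep_alignment(sam_line: str, min_mapq: int = 20) -> bool:
    """Return True if a SAM record passes the filters described in the text."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    cigar = fields[5]
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    tags = {field.split(":", 1)[0] for field in fields[11:]}

    if flag & 0x4:                      # read is unmapped
        return False
    if not flag & 0x2:                  # pair is not properly aligned
        return False
    if flag & 0x100:                    # secondary alignment
        return False
    if mapq < min_mapq:                 # low mapping quality
        return False
    if "S" in cigar or "H" in cigar:    # soft- or hard-clipped read
        return False
    if tags & {"XA", "SA"}:             # alternative or chimeric hits reported
        return False
    return True


# Hypothetical record: properly paired, MAPQ 60, no clipping, no XA/SA tag.
example = "r1\t99\tctg1\t1\t60\t125M\t=\t200\t324\t*\t*\tNM:i:0"
print(keep_alignment(example))
```

A contig-level decision, as used for the final RAD reference, would then be made from the set of reads that survive this filter for each contig.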

Microsatellites searching and primer design

QDD 3.1.2 (Meglécz et al. 2010) was chosen for microsatellite discovery and primer design. The program was run on a local Galaxy (Afgan et al. 2016) platform with default parameters. A microsatellite was defined as a pure or compound tandem repeat of a di- to hexa-nucleotide motif with at least five uninterrupted repeats. To improve the success rate, primers were selected based on the following five criteria: (1) select one primer pair for each locus; (2) select pure microsatellites with a repeat number greater than 5; (3) select primers that were only in design category A; (4) remove primers with an alignment score greater than 10; and (5) select primers that were away from the target microsatellite (>10 bp).
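The definition of a pure microsatellite given above (a di- to hexa-nucleotide motif with at least five uninterrupted repeats) can be made concrete with a small regular-expression sketch in Python. This is not how QDD works internally; it is a simplified stand-in that ignores compound repeats, and the function names and the toy sequence are hypothetical.

```python
import re


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return all(motif != motif[:k] * (n // k)
               for k in range(1, n) if n % k == 0)


def find_pure_ssrs(sequence: str, min_repeats: int = 5):
    """Return (start, motif, repeat count) for each pure di- to hexa-nucleotide SSR."""
    # Lazy {2,6}? tries the shortest motif first; the back-reference then
    # greedily extends the run to at least min_repeats copies in total.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if is_primitive(motif):  # skips homopolymer runs matched as "AA", etc.
            hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy contig (hypothetical): (AT)6, (AGC)5 and a homopolymer run of 12 A's.
demo = "GGC" + "AT" * 6 + "CCGT" + "AGC" * 5 + "TT" + "A" * 12
print(find_pure_ssrs(demo))
```

The homopolymer run is deliberately rejected, because mononucleotide repeats fall outside the di- to hexa-nucleotide definition used here.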

Microsatellite genotyping and polymorphism survey

A total of 52 primer pairs were randomly selected for laboratory verification. Initial testing for PCR amplification used two individuals from Fuyang. An M13 tail (5′-GGAAACAGCTATGACCATG-3′) was added to the 5′ end of each forward primer. PCR amplifications were performed in a total volume of 10 μL containing 10 ng genomic DNA, 1× PCRmix (Dongsheng Biotech Co., China) and 0.2 μM of each primer, using the following cycling conditions: (1) an initial activation step of 5 min at 95 °C; (2) 35 cycles of denaturation at 95 °C for 20 s, annealing at 52 °C for 30 s and extension at 72 °C for 30 s; and (3) a final extension of 5 min at 72 °C. The PCR products were electrophoresed on a 1.5% agarose gel, and only primers that produced specific products were further evaluated using an initial set of eight individuals. These PCR amplifications were carried out in a final volume of 10 μL containing 10 ng genomic DNA, 1× PCRmix (Dongsheng Biotech Co., China), 0.02 μM forward primer, 0.2 μM reverse primer, and 0.2 μM of M13-tail primer fluorescently labeled with FAM, HEX or TAMRA. The PCR amplification program was the same as described above. Fluorescently labeled PCR fragments were electrophoresed on an ABI 3730xl automated sequencer (Applied Biosystems) with the GS-500 size standard. Allele calling was performed using GeneMarker (SoftGenetics, State College, USA). The final scoring was checked manually to minimize genotyping errors. Finally, the polymorphism of the microsatellite loci that passed the above two steps was assessed in 48 individuals from Fuyang and Dandong.

Genetic diversity indices for each locus and population, including observed heterozygosity (H_O), expected heterozygosity (H_E) and polymorphism information content (PIC), were calculated using the Excel Microsatellite Toolkit (Park 2001). The number of alleles (Na), allelic richness (A_R) and inbreeding coefficient (F_IS) were calculated using FSTAT 2.9.3 (Goudet 2001). Deviations from Hardy–Weinberg equilibrium and genotypic linkage equilibrium were tested with Genepop 4.5.1 (Rousset 2008). Significance was estimated by the Markov chain Monte Carlo (MCMC) method (10,000 dememorization steps, 1000 batches of 10,000 iterations). Micro-checker 2.2.3 (van Oosterhout et al. 2004) was used to test for the presence of null alleles. A standard Bonferroni correction was applied to the significance levels of all the above tests.
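For readers unfamiliar with these indices, the following Python sketch computes H_O, H_E and PIC for a single locus from diploid genotypes, together with a Bonferroni-adjusted significance threshold. It is a minimal sketch, not the code of the Excel Microsatellite Toolkit: the use of the sample-size-corrected (unbiased) gene diversity for H_E and of the standard PIC formula, 1 − Σp_i² − ΣΣ 2p_i²p_j², are assumptions, and all function names and genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from a list of diploid genotypes (allele, allele)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Unbiased gene diversity: 2n/(2n-1) * (1 - sum of squared frequencies)."""
    n = len(genotypes)
    freqs = allele_frequencies(genotypes).values()
    return (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs))


def pic(genotypes):
    """Polymorphism information content of one locus."""
    freqs = list(allele_frequencies(genotypes).values())
    homozygosity = sum(p * p for p in freqs)
    cross_terms = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross_terms


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a standard Bonferroni correction."""
    return alpha / n_tests


# Hypothetical genotypes (allele sizes in bp) for four individuals at one locus.
genotypes = [(100, 100), (100, 102), (102, 102), (100, 102)]
print(observed_heterozygosity(genotypes),
      expected_heterozygosity(genotypes),
      pic(genotypes))
```

With many loci tested simultaneously, `bonferroni_alpha(0.05, n_tests)` gives the stricter per-test threshold against which each P value would be compared.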

Results

RAD sequencing, filtering and assembly

A total of 33.5 million raw paired reads were obtained, and 25.1 million clean paired reads were retained after quality filtering and removal of PCR duplicates. A total of 137,409 loci identified by STACKS were exported into separate FASTA files for local assembly. CAP3 assembled a total of 127,864 contigs with a mean length of 517 bp and an N50 of 543 bp. About 20.8 million reads could be mapped back to the assembled contigs, and 94% of these were properly paired. After retaining contigs with properly paired reads and a minimum mapping quality of 20, and removing clipped and other possibly spurious reads, a total of 121,750 contigs were retained as the final assembled RAD reference for microsatellite discovery (Online Resource 1). The final assembled RAD reference had a mean length of 522 bp and a GC content of 41.59%.
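The assembly summary statistics reported here (mean contig length and N50) can be reproduced from a list of contig lengths with a few lines of Python. The sketch below assumes the usual definition of N50, namely the length L such that contigs of length L or longer contain at least half of the total assembled bases. The function names and the toy lengths are hypothetical.

```python
def mean_length(lengths):
    """Arithmetic mean of the contig lengths."""
    return sum(lengths) / len(lengths)


def n50(lengths):
    """Length L such that contigs >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths in bp (hypothetical, not the study's data).
contig_lengths = [200, 300, 400, 500, 600]
print(mean_length(contig_lengths), n50(contig_lengths))
```

An N50 close to the mean, as reported for this assembly, indicates a narrow contig length distribution, which is expected when contig length is set mainly by the size selection of the library.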

Microsatellite isolation and characterization

A total of 19,782 contigs possessing microsatellite motifs were identified. For the 16,497 contigs that contained priming sites for microsatellite loci, the types of microsatellites in the target region were variable. The number of pure microsatellites was 12,127, while the number of compound microsatellites was 3242. Finally, a total of 156,150 primer pairs were successfully designed (Online Resource 2). Using one primer pair for each locus, pure microsatellites, design category A, PCR primer alignment score ≤10, and minimum primer-to-target distance >10 bp, a total of 1854 primer pairs were retained (Table 1). These 1854 microsatellite motifs included 1536 di- (82.85%), 262 tri- (14.13%), 49 tetra- (2.64%), 5 penta- (0.27%) and 2 hexa- (0.11%) nucleotide repeats, with repeat numbers ranging from 5 to 41.

Table 1 Summary of the number of primer pairs designed from the 16,497 assembled RAD contigs for Trachidermus fasciatus


Of the 52 randomly selected primer pairs, 48 produced clear and specific amplification products of the expected size when screened by 1.5% agarose gel electrophoresis in two individuals, and were subsequently evaluated by capillary electrophoresis genotyping in eight individuals. Finally, a set of 45 microsatellite loci was used to evaluate polymorphism in 48 individuals from the two populations, and a total of 618 alleles were detected (Table 2). The number of alleles per locus ranged from 3 to 29, and the expected (H_E) and observed (H_O) heterozygosity ranged from 0.3510 to 0.9800 and from 0.2080 to 1.0000, respectively. The polymorphism information content (PIC) ranged from 0.3070 to 0.9590. No linkage disequilibrium was detected between microsatellite loci. Significant deviation from Hardy–Weinberg equilibrium was observed at four loci (tfa 28, tfa 32, tfa 36 and tfa 57), of which two (tfa 32 and tfa 57) were significant in both tested populations (Table 2). Analyses using Micro-checker indicated the presence of null alleles at the same four loci.

Table 2 Characterization of 45 microsatellite loci validated in populations of Trachidermus fasciatus from Fuyang and Dandong


Discussion

The present study, with the roughskin sculpin as a representative of non-model species, developed a rapid and cost-effective microsatellite identification approach (RAD-seq-Assembly-Microsatellite) using paired-end RAD-seq. This approach can create longer sequences by assembling the overlapping paired-end RAD reads for microsatellite discovery, which overcomes the issues of short read lengths generated by the Illumina platform and improves microsatellite detection rates. The approach could efficiently generate a large set of polymorphic microsatellite markers for a wide range of applications from population genetics, behavioral ecology, to marker-based breeding programs, especially for non-model species.

Next-generation sequencing (NGS) technologies have enhanced our ability to obtain hundreds of microsatellites rapidly and at low cost, and have dramatically accelerated the discovery of genomic information even in non-model species (Cai et al. 2013; Davey et al. 2011; Hu et al. 2016). Compared with traditional methods for microsatellite marker development (Zane et al. 2002), our approach, which is based on RAD-seq and de novo local assembly, is more efficient in terms of money and time. At current market prices, RAD library construction costs approximately $75, and a 2 × 125 bp paired-end sequencing run on the Illumina HiSeq 2500 platform producing 1 Gb of sequence costs approximately $70. The Illumina HiSeq 4000 and X Ten platforms have much lower per-base sequencing costs and higher throughput than the HiSeq 2500 platform (http://www.illumina.com). The total cost of library construction and sequencing using our approach ($360 for 4 Gb of data) is therefore lower than that of traditional approaches (at least $800 for 100 sequences) (Zalapa et al. 2012). In general, traditional approaches require 2–4 weeks for DNA extraction, library construction, cloning, Sanger sequencing and primer design, whereas our approach requires only 1–2 weeks for DNA extraction, RAD library construction, Illumina sequencing and primer design. Moreover, our approach can identify thousands of microsatellites and batch-design primers for them simultaneously, while traditional approaches can usually find repeat units only in a relatively small pool of sequences (Zane et al. 2002).
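The cost comparison can be retraced with simple arithmetic. The Python sketch below uses only the approximate prices quoted in the text; the constant and function names are hypothetical. The itemised figures give $355 for 4 Gb, in line with the rounded total of about $360 quoted above.

```python
# Approximate prices quoted in the text (US dollars).
LIBRARY_COST_USD = 75          # one RAD library
SEQUENCING_COST_PER_GB = 70    # 2 x 125 bp paired-end data on a HiSeq 2500


def rad_ssr_cost(gigabases: float) -> float:
    """Library construction plus sequencing cost for a given data volume."""
    return LIBRARY_COST_USD + SEQUENCING_COST_PER_GB * gigabases


print(rad_ssr_cost(4))
```

Because the library cost is fixed, the total scales almost linearly with the amount of sequence data, so the data volume can be tuned to the number of loci needed.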

Several studies have used next-generation sequencing technologies to develop microsatellite markers (Bonatelli et al. 2015; Castoe et al. 2010; Hung et al. 2016; Li et al. 2016a, b; Minegishi et al. 2015). The two major NGS platforms used for the discovery of microsatellites are Roche 454 and Illumina sequencing (Zalapa et al. 2012). The main advantage of 454 sequencing for microsatellite discovery is its read length of 350–600 bp, which allows the discovery and development of microsatellites even directly from the raw reads. Castoe et al. (2010) identified 14,612 microsatellite loci in 11.3% of 128,773 Roche 454 shotgun reads, of which 4564 had flanking sequences suitable for primer design. Bonatelli et al. (2015) validated 22 (30.56%) polymorphic microsatellites out of 64 loci using double digest restriction site–associated DNA sequencing (ddRAD-seq) on a Roche 454 platform. Seventy-four projects using 454 sequencing reviewed in Hodel et al. (2016) yielded 8–91 polymorphic loci, with an average of 16 polymorphic loci and 4400 potential loci derived from an average of 139,418 reads. The main advantage of using the Illumina platform for microsatellite discovery is its much higher throughput and lower cost compared with the 454 platform (Zalapa et al. 2012). Cai et al. (2013) assembled Illumina shotgun reads into a draft genome of Anisogramma anomala, and successfully amplified 214 (90.7%) microsatellite loci with specific products. Hung et al. (2016) applied Illumina shotgun sequencing to Apodemus semotus and mapped the resulting sequence reads against the genome of Mus musculus, then successfully amplified 44 (74.57%) of 59 microsatellite loci. Hu et al. (2016) used Illumina RNA-Seq transcriptome data and found that 20 (31.75%) loci were successfully amplified and polymorphic. For the microsatellite development studies using Illumina sequencing reviewed in Hodel et al. (2016), the average number of polymorphic microsatellite markers reported was 15, and the average number of potential loci per study was 15,539.

However, the Illumina platform generates relatively short reads (100–300 bp), so assembly is usually required to obtain longer contiguous sequences, which can provide sufficient flanking sequence for the design of primers to amplify the target microsatellite and reduce the redundancy of closely linked microsatellites (Zalapa et al. 2012). Yang et al. (2016) identified only 650 microsatellite loci from 4.5 million RAD raw reads, and only 285 (43.84%) primer pairs were successfully designed. In contrast, in the present study, 22,835 microsatellites were discovered in the 121,750 assembled contigs, and 156,150 primer pairs were designed for 16,497 (77.24%) loci containing microsatellites. Therefore, careful consideration should be given to the quality of the assembly. RAD-seq is a family of genomic approaches that provide sequence data adjacent to restriction enzyme recognition sites (Davey et al. 2011; Hohenlohe et al. 2013). Furthermore, the overlapping paired-end reads produced by traditional RAD-seq technology allow local assembly of contigs containing both the forward and reverse reads of each pair. These RAD contigs are anchored at one end by the restriction enzyme recognition site and contain several hundred base pairs of continuous genomic sequence (Hohenlohe et al. 2013). This assembly method for RAD data holds several advantages compared with whole-genome sequencing. Firstly, reads of a single RAD locus can be clustered before assembly using the similarity of the first reads containing the restriction enzyme recognition site, which reduces complexity and computational cost to a level that is affordable even on a desktop computer. Secondly, local assembly of the contigs can improve the quality of de novo assembly, and therefore the success rate of microsatellite development. Thirdly, the size selection in library preparation is flexible, which makes the length of the contigs easy to customize.
In our study, PCR amplifications were successful for 48 (92.31%) of the 52 randomly selected loci, a higher rate than those of the other approaches described above. This indicates that the primer pairs in the database and the assembled RAD contigs were of high quality and that most of the primer pairs would amplify their targets. Compared with other approaches using next-generation sequencing, our assembly-based approach showed great advantages in developing thousands of microsatellites rapidly and accurately, especially for non-model species with limited genomic information.

In conclusion, the present study has contributed a detailed approach to rapidly and cost-effectively develop genome-wide microsatellite markers in non-model species with high success rate. A total of 45 polymorphic loci were validated, which could serve as a proof-of-concept showing that the “RAD-seq-Assembly-Microsatellite” approach was successfully applied to a non-model species. The “RAD-seq-Assembly-Microsatellite” approach developed in the present study holds great promise for microsatellite development in future ecological and evolutionary studies of non-model species.

References

Afgan E, Baker D, van den Beek M, Blankenberg D, Bouvier D, Čech M, Chilton J, Clements D, Coraor N, Eberhard C, Grüning B, Guerler A, Hillman-Jackson J, Von Kuster G, Rasche E, Soranzo N, Turaga N, Taylor J, Nekrutenko A, Goecks J (2016) The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic Acids Res 44:W3–W10
Bonatelli IA, Carstens BC, Moraes EM (2015) Using next generation RAD sequencing to isolate multispecies microsatellites for Pilosocereus (Cactaceae). PLoS One 10:e0142602
Cai G, Leadbetter CW, Muehlbauer MF, Molnar TJ, Hillman BI (2013) Genome-wide microsatellite identification in the fungus Anisogramma anomala using Illumina sequencing and genome assembly. PLoS One 8:e82408
Castoe TA, Poole AW, Gu W, Jason de Koning AP, Daza JM, Smith EN, Pollock DD (2010) Rapid identification of thousands of copperhead snake (Agkistrodon contortrix) microsatellite loci from modest amounts of 454 shotgun genome sequence. Mol Ecol Resour 10:341–347
Catchen J, Hohenlohe PA, Bassham S, Amores A, Cresko WA (2013) Stacks: an analysis tool set for population genomics. Mol Ecol 22:3124–3140
Chang Y, Feng Z, Yu J, Ding J (2009) Genetic variability analysis in five populations of the sea cucumber Stichopus (Apostichopus) japonicus from China, Russia, South Korea and Japan as revealed by microsatellite markers. Mar Ecol 30:455–461
Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML (2011) Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat Rev Genet 12:499–510
Glenn TC (2011) Field guide to next-generation DNA sequencers. Mol Ecol Resour 11:759–769
Glenn TC, Schable NA (2005) Isolating microsatellite DNA loci. Methods Enzymol 395:202–222
Hodel RG, Segovia-Salcedo MC, Landis JB, Crowl AA, Sun M, Liu X, Gitzendanner MA, Douglas NA, Germain-Aubrey CC, Chen S, Soltis DE, Soltis PS (2016) The report of my death was an exaggeration: a review for researchers using microsatellites in the 21st century. Appl Plant Sci 4:1600025
Hohenlohe PA, Day MD, Amish SJ, Miller MR, Kamps-Hughes N, Boyer MC, Muhlfeld CC, Allendorf FW, Johnson EA, Luikart G (2013) Genomic patterns of introgression in rainbow and westslope cutthroat trout illuminated by overlapping paired-end RAD sequencing. Mol Ecol 22:3002–3013
Hu Z, Zhang T, Gao X-X, Wang Y, Zhang Q, Zhou H-J, Zhao G-F, Wang M-L, Woeste KE, Zhao P (2016) De novo assembly and characterization of the leaf, bud, and fruit transcriptome from the vulnerable tree Juglans mandshurica for the development of 20 new microsatellite markers using Illumina sequencing. Mol Genet Genom 291:849–862
Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9:868–877
Hung C-M, Yu A-Y, Lai Y-T, Shaner P-JL (2016) Developing informative microsatellite makers for non-model species using reference mapping against a model species’ genome. Sci Rep 6:23087
Li H (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 (Prepr)
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
Li Q-Y, Zhang J, Yao J-T, Wang X-L, Duan D-L (2016a) Development of Saccharina japonica genomic SSR markers using next-generation sequencing. J Appl Phycol 28:1387–1390
Li Y-L, Xue D-X, Gao T-X, Liu J-X (2016b) Genetic diversity and population structure of the roughskin sculpin (Trachidermus fasciatus Heckel) inferred from microsatellite analyses: implications for its conservation and management. Conserv Genet 17:921–930
Mastretta-Yanes A, Arrigo N, Alvarez N, Jorgensen TH, Piñero D, Emerson BC (2015) Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference. Mol Ecol Resour 15:28–41
Meglécz E, Costedoat C, Dubut V, Gilles A, Malausa T, Pech N, Martin JF (2010) QDD: a user-friendly program to select microsatellite markers and design primers from large sequencing projects. Bioinformatics 26:403–404
Minegishi Y, Ikeda M, Kijima A (2015) Novel microsatellite marker development from the unassembled genome sequence data of the marbled flounder Pseudopleuronectes yokohamae. Mar Genom 24:357–361
Montanari S, Perchepied L, Renault D, Frijters L, Velasco R, Horner M, Gardiner SE, Chagné D, Bus VGM, Durel C-E, Malnoy M (2016) A QTL detected in an interspecific pear population confers stable fire blight resistance across different environments and genetic backgrounds. Mol Breed 36:1–16
Oliveira EJ, Pádua JG, Zucchi MI, Vencovsky R, Vieira MLC (2006) Origin, evolution and genome distribution of microsatellites. Genet Mol Biol 29:294–307
Onikura N, Takeshita N, Matsui S, Kimura S (2002) Spawning grounds and nests of Trachidermus fasciatus (Cottidae) in the Kashima and Shiota estuaries system facing Ariake Bay, Japan. Ichthyol Res 49:198–201
Park S (2001) Trypanotolerance in West African cattle and the population genetic effects of selection. Dissertation, University of Dublin
Rousset F (2008) Genepop’007: a complete re-implementation of the genepop software for Windows and Linux. Mol Ecol Resour 8:103–106
Schoebel CN, Brodbeck S, Buehler D, Cornejo C, Gajurel J, Hartikainen H, Keller D, Leys M, Ríčanová S, Segelbacher G, Werth S, Csencsics D (2013) Lessons learned from microsatellite development for nonmodel organisms using 454 pyrosequencing. J Evol Biol 26:600–611
Stabile J, Lipus D, Maceda L, Maltz M, Roy N, Wirgin I (2016) Microsatellite DNA analysis of spatial and temporal population structuring of Phragmites australis along the Hudson River Estuary. Biol Invasions 18:2517–2519
van Oosterhout C, Hutchinson WF, Wills DP, Shipley P (2004) Micro-Checker: software for identifying and correcting genotyping errors in microsatellite data. Mol Ecol Notes 4:535–538
Wang J-Q (1999) Advances in studies on the ecology and reproductive biology of Trachidermus fasciatus Heckel. Acta Hydrobiol Sin 23:729–734 (in Chinese)
Wang J-Q, Cheng G (2010) The historical variance and causes of geographical distribution of a roughskin sculpin (Trachidermus fasciatus Heckel) in Chinese territory. Acta Ecol Sin 30:6845–6853 (in Chinese)
Weinman LR, Solomon JW, Rubenstein DR (2015) A comparison of single nucleotide polymorphism and microsatellite markers for analysis of parentage and kinship in a cooperatively breeding bird. Mol Ecol Resour 15:502–511
Xu J-R, Han X-L, Li N, Yu J-F, Xu P, Bao Z-M (2008) Analysis of genetic diversity in roughskin sculpin Trachidermus fasciatus by AFLP markers. J Dalian Fish Univ 23:437–441 (in Chinese)
Xue D-X, Zhang T, Liu J-X (2014) Microsatellite evidence for high frequency of multiple paternity in the marine gastropod Rapana venosa. PLoS One 9:e86508
Yang X-Y, Long Z-C, Gichira AW, Guo Y-H, Wang Q-F, Chen J-M (2016) Development of microsatellite markers in the tetraploid fern Ceratopteris thalictroides (Parkeriaceae) using RAD tag sequencing. Genet Mol Res. doi:10.4238/gmr.15017550
Zalapa JE, Cuevas H, Zhu H, Steffan S, Senalik D, Zeldin E, McCown B, Harbut R, Simon P (2012) Using next-generation sequencing approaches to isolate simple sequence repeat (SSR) loci in the plant sciences. Am J Bot 99:193–208
Zane L, Bargelloni L, Patarnello T (2002) Strategies for microsatellite isolation: a review. Mol Ecol 11:1–16
Zeng Z, Liu Z-Z, Pan L-D, Tang W-Q, Wang Q, Geng Y-H (2012) Analysis of genetic diversity in wild populations of Trachidermus fasciatus by RAPD and the transformation of two SCAR markers. Zool Res 33:203–210

Acknowledgements

We are thankful to the anonymous referee for critical and helpful comments on the manuscript.

Author information

Dong-Xiu Xue and Yu-Long Li contributed equally to this work.

Authors and Affiliations

CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, 7 Nanhai Road, Qingdao, 266071, Shandong, China
Dong-Xiu Xue, Yu-Long Li & Jin-Xian Liu
Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, Shandong, China
Dong-Xiu Xue, Yu-Long Li & Jin-Xian Liu

Authors

Dong-Xiu Xue
Yu-Long Li
Jin-Xian Liu

Corresponding author

Correspondence to Jin-Xian Liu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

All applicable international, national, and/or institutional guidelines for the care and use of animals were followed.

Funding

This study was funded by the National Natural Science Foundation of China (NSFC) (Grant Number 31502169), the AoShan Talents Program supported by the Qingdao National Laboratory for Marine Science and Technology (Grant Number 2015ASTP-ES05), the NSFC-Shandong Joint Fund for Marine Ecology and Environmental Sciences (Grant Number U1606404), and the AoShan Science and Technology Innovation Plan (Grant Number 2015ASKJ02-05).

Additional information

Communicated by S. Hohmann.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (ZIP 18189 kb)

Supplementary material 2 (XLSX 37261 kb)

About this article

Cite this article

Xue, DX., Li, YL. & Liu, JX. A rapid and cost-effective approach for the development of polymorphic microsatellites in non-model species using paired-end RAD sequencing. Mol Genet Genomics 292, 1165–1174 (2017). https://doi.org/10.1007/s00438-017-1337-x

Received: 24 March 2017
Accepted: 14 June 2017
Published: 20 June 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s00438-017-1337-x
