Abstract
Pareuchiloglanis sinensis (Siluriformes: Sisoridae) is an endemic highland fish species that occurs only in some rivers of south-west China. In this study, we describe the isolation and characterization of polymorphic microsatellite loci for this species by next-generation sequencing. A total of 9471 simple-sequence repeats (SSRs) were identified from RNA-seq data. One hundred and twenty primer pairs were chosen randomly and validated across 48 P. sinensis individuals collected from the Dadu river (a branch of the Yangtze river), and 28 polymorphic microsatellite loci were detected. The number of alleles ranged from 2 to 14, with an average of seven alleles per locus. Twenty loci exhibited high polymorphism, with polymorphism information content (PIC) greater than 0.5. The observed and expected heterozygosity ranged from 0.104 to 0.958 and from 0.157 to 0.844, with averages of 0.583 and 0.613, respectively. The microsatellite markers characterized in the current study serve as a useful tool for conservation genetic studies and population evaluation of P. sinensis.
Introduction
Pareuchiloglanis sinensis (Siluriformes: Sisoridae) is an endemic highland fish species that has been detected only in some rivers of south-west China, i.e. the Jinsha river, the Dadu river and the Bailong river (figure 1) (Chu et al. 1999). In our study, 48 individuals were collected from the Dadu river, a branch of the Yangtze river. As of March 2014, a total of 26 dams had been constructed, and some were under construction or planned for the river, posing a new threat to freshwater ecosystems and fish diversity in the Dadu river (https://www.wilsoncenter.org/publication/interactive-mapping-chinas-dam-rush). To facilitate a better understanding of the genetic diversity and population structure of P. sinensis for resource conservation, we isolated and characterized 28 polymorphic microsatellites of P. sinensis, because microsatellites are the markers of choice for a variety of population genetic studies. Compared with traditional methods of simple-sequence repeat (SSR) marker development, next-generation sequencing is more cost efficient (Zheng et al. 2013; Liu et al. 2017). RNA-seq data were generated by Ma et al. (2016). In this study, we used unigenes assembled from these RNA-seq data to develop polymorphic SSRs, detecting candidate loci with the Perl script MISA (http://pgrc.ipk-gatersleben.de/misa/misa.html).
Materials and methods
Sample collection
In this study, methods involving fish were conducted in accordance with the Laboratory Animal Management Principles of China. Forty-eight individuals of P. sinensis were collected from the Dadu river.
RNA-seq data
RNA-seq data were generated by Ma et al. (2016). FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc) was used to assess read quality. Adapter sequences and low-quality bases (Phred score <20) were trimmed with Cutadapt (Martin 2011). The cleaned reads were assembled using Trinity (Haas et al. 2013) with default parameters, and contigs longer than 200 bp were retained for further analysis. The CD-HIT-EST program (Li and Godzik 2006), with an identity threshold of 95%, was used to remove low-coverage artifacts and redundant sequences. The resulting unigenes were used for microsatellite marker detection.
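The length cut-off applied to the assembled contigs can be illustrated with a minimal Python sketch. It is not part of the original pipeline; the function names `read_fasta` and `filter_by_length` are hypothetical, and the sketch assumes that "longer than 200 bp" means strictly more than 200 bases.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_LENGTH = 200  # assumption: "longer than 200 bp" is read as strictly > 200


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Tuple[str, str]],
                     min_length: int = MIN_LENGTH) -> List[Tuple[str, str]]:
    """Keep only records whose sequence is longer than min_length bases."""
    return [(h, s) for h, s in records if len(s) > min_length]
```

A file handle opened on the assembly can be passed directly to `read_fasta`, since it iterates over lines.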
EST-SSR detection and primer development
Microsatellites within the unigene assembly were detected using the Perl script MISA (http://pgrc.ipk-gatersleben.de/misa/misa.html). Only SSR loci with di- to hexanucleotide motifs were considered, with minimum repeat numbers of 6, 5, 5, 5 and 5, respectively. Mononucleotide repeats were excluded from the EST-SSR search because their polymorphism is often difficult to interpret (Lopez et al. 2015).
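The search thresholds above can be illustrated with a short regular-expression sketch in Python. This is not MISA and does not reproduce its output format or its handling of compound SSRs; the function names are hypothetical, and only perfect repeats of A, C, G and T are considered.

```python
import re
from typing import List, Tuple

# Minimum repeat number per motif length (di- to hexanucleotide),
# taken from the thresholds stated in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip homopolymers ("AA") and motifs such as "ACAC" that
            # belong to a shorter motif class.
            if not is_primitive(motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

The primitive-motif check is what excludes mononucleotide runs here, since a run of A would otherwise match as a repeated "AA" or "AAA" unit.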
The EST-SSR primers were designed using Primer 3.0 (Untergasser et al. 2012) under the following criteria: (i) primer length ranged from 18 to 25 bases (optimum: 20 bases); (ii) PCR product size ranged from 100 to 300 bp; (iii) melting temperature was between \(58{^{\circ }}\hbox {C}\) and \(63{^{\circ }}\hbox {C}\) (optimum: \(60{^{\circ }}\hbox {C}\)) and (iv) GC content was 40–60% (optimum: 50%).
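The four design criteria can be written as a simple acceptance check. The Python sketch below is illustrative only: `PrimerPair`, `primer_ok` and `pair_ok` are hypothetical names, and melting temperatures are assumed to be supplied by the primer design tool rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class PrimerPair:
    forward: str
    reverse: str
    tm_forward: float  # degrees C, assumed to come from the design tool
    tm_reverse: float  # degrees C, assumed to come from the design tool
    product_size: int  # expected amplicon length in bp


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def primer_ok(seq: str, tm: float) -> bool:
    """Criteria (i), (iii) and (iv): length, melting temperature, GC content."""
    return (18 <= len(seq) <= 25
            and 58.0 <= tm <= 63.0
            and 40.0 <= gc_percent(seq) <= 60.0)


def pair_ok(pair: PrimerPair) -> bool:
    """Both primers pass, and the product size satisfies criterion (ii)."""
    return (primer_ok(pair.forward, pair.tm_forward)
            and primer_ok(pair.reverse, pair.tm_reverse)
            and 100 <= pair.product_size <= 300)
```

Such a check is useful mainly as a sanity filter on exported primer tables; the optimum values (20 bases, 60 °C, 50% GC) guide the design tool's scoring and are not enforced here.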
DNA extraction, PCR conditions and amplification of SSRs
We dissected a small piece of white muscle tissue or fin from the right side of the body of each specimen. All of the tissue samples were preserved in 95% ethanol. Total genomic DNA was extracted from the muscle tissue or fin by performing a standard salt extraction.
Polymerase chain reaction (PCR) amplification was carried out in a \(30\ \mu \hbox {L}\) reaction mixture containing \(\sim \) \(100\ \hbox {ng}\) of template DNA, \(1\ \mu \hbox {L}\) of each primer (10 pmol), \(3\ \mu \hbox {L}\) of \(10 \times \) reaction buffer, \(1.5\ \mu \hbox {L}\) of dNTPs (2.5 mM each) and 2.0 U of Taq DNA polymerase.
The PCR conditions for SSR included an initial denaturation step at \(94{^{\circ }}\hbox {C}\) for 5 min, followed by 30 cycles of denaturation at \(94{^{\circ }}\hbox {C}\) for 30 s, annealing at \(60{^{\circ }}\hbox {C}\) for 40 s and extension at \(72{^{\circ }}\hbox {C}\) for 30 s, followed by a final extension at \(72{^{\circ }}\hbox {C}\) for 10 min and storage at \(4{^{\circ }}\hbox {C}\).
Amplification products were separated using 20% polyacrylamide gel. Some loci failed to amplify in all samples even after the PCR conditions were adjusted; these loci were excluded from further testing. In addition, only loci that showed polymorphism were considered for genotyping analyses. Fluorescently labelled primers were then synthesized for these loci to confirm the fragment lengths visualized on the polyacrylamide gel.
Genotyping
Forward primers (table 1 in electronic supplementary material at http://www.ias.ac.in/jgenet/) were labelled with the FAM or HEX dye on the \(5^{\prime }\)-end. The PCR reaction conditions were the same as described above. The amplified products were detected on an ABI 3130xl Genetic Analyzer, and scored using GeneMapper software (Applied Biosystems, Foster City, USA).
Microsatellite data analysis
Genetic parameters of the polymorphic microsatellite loci, namely polymorphism information content (PIC), the number of alleles (\(N_{\mathrm{A}}\)), observed heterozygosity (\(H_{\mathrm{O}}\)) and expected heterozygosity (\(H_{\mathrm{E}}\)), were calculated using POPGENE 1.32 (Quardokus 2000). Possible deviations from the Hardy–Weinberg equilibrium (HWE) were tested by Fisher’s exact test with Bonferroni correction.
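For readers unfamiliar with these parameters, the Python sketch below computes them for a single locus from diploid genotypes using the textbook definitions: observed heterozygosity is the fraction of heterozygous individuals, expected heterozygosity is 1 minus the sum of squared allele frequencies, and PIC subtracts from that the sum over allele pairs of 2p_i^2 p_j^2. It is an illustration and not the POPGENE implementation; software packages may apply small-sample corrections, so their values need not match exactly. The function name `locus_stats` is hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Hashable, List, Tuple

Genotype = Tuple[Hashable, Hashable]  # two allele labels, e.g. fragment sizes


def locus_stats(genotypes: List[Genotype]) -> Dict[str, float]:
    """Return N_A, H_O, H_E and PIC for one locus (no small-sample correction)."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Allele frequencies over the 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    h_exp = 1.0 - sum(p * p for p in freqs)
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    pic = h_exp - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"NA": float(len(counts)), "HO": h_obs, "HE": h_exp, "PIC": pic}
```

For a locus with two equally frequent alleles, this gives an expected heterozygosity of 0.5 and a PIC of 0.375, which shows why PIC is always somewhat lower than expected heterozygosity.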
Results and discussion
In this study, 47,989 unigenes generated from RNA-seq data were used to detect potential microsatellite loci. A total of 9471 SSRs were identified in 7832 sequences, of which 1354 sequences contained more than one SSR (table 1). There were 70 motifs obtained, of which the most frequent was AC/GT (428, 54.65%), followed by AG/CT (406, 16.43%), ATC/ATG (138, 5.02%), AGG/CTT (123, 3.99%), AAG/CTT (101, 4.35%) and GTA/CAT (88, 3.79%) (table 1 in electronic supplementary material). Dinucleotide repeats were the most frequent (72.53%), followed by trinucleotide (22.56%) and tetranucleotide (4.56%) repeats. SSRs with nine tandem repeats were the most common (1980, 20.90%), followed by those with eight tandem repeats (1333, 14.07%) (figure 2).
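Motif classes such as AC/GT pool a motif with its reverse complement, and SSR tools commonly also pool cyclic shifts of the same unit (for example AC and CA). The Python sketch below shows one common way to assign a motif to such a class by taking the alphabetically smallest representative. It is an assumed convention for illustration and not necessarily the grouping used to produce the counts above; the function names are hypothetical.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Reverse complement of a DNA motif written in A, C, G, T."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def rotations(motif: str) -> List[str]:
    """All cyclic shifts of a motif, e.g. AAG -> AAG, AGA, GAA."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif: str) -> str:
    """Smallest string among the cyclic shifts of the motif and of its
    reverse complement; equivalent motifs map to the same label."""
    motif = motif.upper()
    return min(rotations(motif) + rotations(reverse_complement(motif)))
```

Under this convention GT, TG and CA all map to AC, so repeats read from either strand or from a shifted start position are counted in the same class.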
To test the applicability and polymorphism of the SSR markers, 120 primer pairs were chosen randomly and validated across 48 P. sinensis individuals collected from the Dadu river (dot in figure 1). Of the 120 primer pairs, only 86 (71.67%) amplified successfully, and 28 of these microsatellite loci showed polymorphism (table 2). Fluorescently labelled primers were then synthesized for these loci. The results showed that the number of alleles (\(N_{\mathrm{A}}\)) for each locus ranged from 2 to 14 and the mean number of alleles per locus was 7. The observed heterozygosity (\(H_{\mathrm{O}}\)) and expected heterozygosity (\(H_{\mathrm{E}}\)) ranged from 0.104 to 0.958 and from 0.157 to 0.844, with averages of 0.583 and 0.613, respectively (table 2). Twenty loci exhibited high polymorphism (\(\hbox {PIC} {>} 0.5\)). Across all samples, 14 of the 28 loci showed significant departures from the HWE (table 2).
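The Bonferroni correction used for the HWE tests divides the significance level by the number of tests. The Python sketch below illustrates this for a set of per-locus p-values. The significance level of 0.05 is an assumption, since the value used is not stated in the text, and the function names and locus labels are hypothetical.

```python
from typing import Dict, List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_loci(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Loci whose p-value falls below alpha divided by the number of loci tested."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(locus for locus, p in p_values.items() if p < threshold)
```

With 28 loci and an assumed level of 0.05, the per-locus threshold would be 0.05/28, or roughly 0.0018.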
P. sinensis is an endemic species with a narrow distribution that faces threats from human disturbance and habitat destruction. Thus, it is crucial that the current resources of P. sinensis be protected. The microsatellite markers developed in our study serve as a useful tool for conservation genetic studies and population evaluation of P. sinensis.
References
Chu X., Zheng B. and Dai D. 1999 Fauna Sinica, class Teleostei, Siluriformes (in Chinese). Scientific Press, Beijing.
Haas B. J., Papanicolaou A., Yassour M., Grabherr M., Blood P. D., Bowden J. et al. 2013 De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512.
Li W. and Godzik A. 2006 Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659.
Liu H. G., Yang Z., Tang H. Y., Gong Y. and Wan L. 2017 Microsatellite development and characterization for Saurogobio dabryi Bleeker, 1871 in a Yangtze river-connected lake, China. J. Genet. 96, e1–e4.
Lopez L., Barreiro R., Fischer M. and Koch M. A. 2015 Mining microsatellite markers from public expressed sequence tags databases for the study of threatened plants. BMC Genomics 16, 781.
Ma X., Dai W., Kang J., Yang L. and He S. 2016 Comprehensive transcriptome analysis of six catfish species from an altitude gradient reveals adaptive evolution in Tibetan fishes. G3-Genes Genomes Genet. 6, 141–148.
Martin M. 2011 Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12.
Quardokus E. 2000 PopGene. Science 288, 458–458.
Untergasser A., Cutcutache I., Koressaar T., Ye J., Faircloth B. C., Remm M. et al. 2012 Primer3 – new capabilities and interfaces. Nucleic Acids Res. 40, e115.
Zheng X., Pan C., Diao Y., You Y., Yang C. and Hu Z. 2013 Development of microsatellite markers by transcriptome sequencing in two species of Amorphophallus (Araceae). BMC Genomics 14, 490.
Acknowledgements
This work was supported by the Key Fund and NSFC-Yunnan mutual funds of the National Natural Science Foundation of China (grant nos. 31130049 and U1036603).
Additional information
Corresponding editor: Indrajit Nanda
Cite this article
Chen, W., He, S. Isolation and characterization of microsatellite markers in a highland fish, Pareuchiloglanis sinensis (Siluriformes: Sisoridae) by next-generation sequencing. J Genet 97 (Suppl 1), 111–116 (2018). https://doi.org/10.1007/s12041-018-0997-6
DOI: https://doi.org/10.1007/s12041-018-0997-6