Introduction

The Pacific white shrimp Litopenaeus vannamei (L. vannamei) is cultured worldwide, with a total production of 4 million tons and a total output value of 24 billion US dollars in 2016 (http://www.fao.org/fishery/statistics/global-aquaculture-production/query/en). It is considered the aquaculture species with the highest single-species output value. Advances in selective breeding have accelerated the development of L. vannamei aquaculture. Growth is among the most important economic traits in shrimp selective breeding, and several breeding programs have been carried out since the 1990s to select fast-growing L. vannamei broodstock (Andriantahina et al. 2012; Argue et al. 2002; Castillo-Juarez et al. 2015; Moss et al. 2008; T et al. 2005). The heritability of growth traits has been reported as 0.24–0.515 (Andriantahina et al. 2012; Castillo-Juarez et al. 2007; Li et al. 2015), which indicates that genetic factors play an important role in growth. Marker-assisted selection (MAS) is a useful approach for improving selection efficiency and accelerating genetic progress. Understanding the genetic architecture of growth traits and identifying the underlying quantitative trait loci (QTLs) could facilitate the application of MAS in shrimp (Liu and Cordes 2004).

In the past decade, genetic mapping has been conducted to identify growth-related QTLs or markers in L. vannamei. Compared with other aquaculture species, such as Atlantic salmon (Gutierrez et al. 2012, 2015), tilapia (Liu et al. 2014), and catfish (Geng et al. 2016; Li et al. 2018), very little is known about growth-associated markers or genes in L. vannamei. Previous studies identified some QTLs related to growth traits by linkage mapping (Andriantahina et al. 2013; Yu et al. 2015). However, no growth-related genes were identified, owing to the low resolution of the linkage maps or the limited genome sequence information available for L. vannamei.

Association mapping is regarded as an efficient approach for dissecting complex traits (Mackay 2001; Tan et al. 2018). It relies heavily on linkage disequilibrium (LD) between the markers and the genetic variants affecting the trait (Wen et al. 2014). Among association mapping approaches, genome-wide association (GWA) mapping is the most widely used in plants and livestock (Hayes and Goddard 2010; Hirschhorn and Daly 2005; Huang et al. 2010). In the past few years, genome-wide association studies (GWAS) have also been conducted to identify growth-related QTLs in several aquaculture species (Li et al. 2018; Tsai et al. 2015; Gonzalez-Pena et al. 2016). For instance, markers significantly associated with growth were identified in Atlantic salmon (Gutierrez et al. 2015). In common carp, growth-related genes such as KISS2, IGF1, SMTLB, NPFFR1, and CPE were identified (Peng et al. 2016). In catfish, a single significant QTL on linkage group 5 was associated with growth traits (Li et al. 2018). Furthermore, nine genes located in the QTL regions, including actn2, Adam12, and dock1, were identified as candidate genes that might function in bone development in this species (Geng et al. 2017). To date, GWAS has not been applied to growth traits in the penaeid shrimp L. vannamei, partly because of the lack of a reference genome and of a high-throughput genotyping platform for shrimp. Recent advances in genomics have made low-cost genotyping of genome-wide SNPs feasible in non-model species (Davey et al. 2011; Yu et al. 2015). GWAS can therefore now be used to improve mapping efficiency and identify genes of interest in L. vannamei.

In this study, we present the results of a GWAS for growth traits in L. vannamei. Body weight and body length were analyzed separately to identify growth-associated genomic regions. Within the QTL regions, two candidate growth-related genes were identified and analyzed.

Materials and Methods

Samples and Phenotype Data

The samples used for GWAS were derived from a full-sib family comprising 205 progeny, the same mapping family described previously (Yu et al. 2015). Briefly, the full-sib family was produced at the breeding center of the Guangxi Institute of Fisheries (Nanning, Guangxi Province, China), and all shrimp were cultured under the same conditions. The growth-related phenotypes, body weight and body length, were measured for each individual after 3 months of cultivation. To validate the identified growth-related markers, two additional independent populations were used, both derived from a commercial breeding program developed by Hainan Grand Suntop Marine breeding company (Wenchang, Hainan Province, China). The shrimp families were constructed in 2015, and individuals from the same family were kept in one tank for seeding and culturing. After the offspring reached about 3 cm in length, 13 families, each represented by 50 offspring, were labeled with visible implant elastomer (VIE) tags and pooled to form the first validation population. The same strategy was used to construct a second family pool from families different from those in the first. After almost 3 months of cultivation, body weight and body length were measured in both populations, and the sex of each individual was recorded. In total, 143 and 200 individuals were randomly collected and measured from the two populations, respectively.

DNA Extraction, SNP Genotyping, and GWAS Analysis

Genomic DNA of the mapping family and the two validation populations was extracted using the TIANGEN Plant DNA extraction kit (TIANGEN, Beijing, China). The concentration of the extracted DNA was determined using a NanoDrop 1000 Spectrophotometer (NanoDrop, Wilmington, DE, USA), and DNA integrity was evaluated by gel electrophoresis. The whole-genome SNP data of the family used for the genome-wide association analysis have been described previously (Yu et al. 2015). Briefly, the individuals of the family were genotyped by the SLAF-seq approach (Sun et al. 2013). Because the GWAS used only biallelic markers, markers with three or four alleles were removed. The remaining markers were further filtered by minor allele frequency (> 0.05) and SNP call rate (> 90%). The position of each SNP was obtained from the previously published linkage map (Yu et al. 2015).
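The marker filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the genotype encoding (two-character allele strings, None for a missing call), the marker names, and the helper names snp_stats and filter_snps are all hypothetical.

```python
from collections import Counter

MAF_MIN = 0.05        # minor allele frequency threshold given in the text
CALL_RATE_MIN = 0.90  # call-rate threshold given in the text


def snp_stats(genotypes):
    """Return (number of alleles, minor allele frequency, call rate) for one SNP.

    genotypes: list of two-character strings such as "AG", or None if missing
    (an assumed encoding for this sketch).
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    allele_counts = Counter(allele for g in called for allele in g)
    total = sum(allele_counts.values())
    if len(allele_counts) < 2:
        maf = 0.0  # monomorphic or entirely uncalled marker
    else:
        # Frequency of the second most common allele; for a biallelic
        # marker this is the minor allele frequency.
        maf = sorted(allele_counts.values())[-2] / total
    return len(allele_counts), maf, call_rate


def filter_snps(snp_table):
    """Keep biallelic markers that pass the MAF and call-rate thresholds."""
    kept = {}
    for marker, genotypes in snp_table.items():
        n_alleles, maf, call_rate = snp_stats(genotypes)
        if n_alleles == 2 and maf > MAF_MIN and call_rate > CALL_RATE_MIN:
            kept[marker] = genotypes
    return kept


# Hypothetical toy data: 20 individuals per marker.
demo = {
    "snp_ok": ["AG"] * 10 + ["AA"] * 10,          # MAF 0.25, fully called
    "snp_triallelic": ["AG"] * 10 + ["AC"] * 10,  # three alleles
    "snp_rare": ["AA"] * 19 + ["AG"],             # MAF 0.025
    "snp_missing": ["AG"] * 16 + [None] * 4,      # call rate 0.80
}
print(sorted(filter_snps(demo)))
```

In this toy table only the first marker survives; the other three each fail exactly one of the filters named in the text.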

To identify the SNPs associated with body weight and body length, GWAS was performed using the GenABEL package in the R environment (Aulchenko et al. 2007b). A genomic kinship matrix was calculated from the SNP data using the gkin function. Heritability was estimated with a linear mixed model implemented in the polygenic function. The genome-wide rapid association using the mixed model and regression-Gamma (GRAMMAR-Gamma) method was used to identify associated SNPs (Aulchenko et al. 2007a). The -log10 (P value) of each SNP across the genome was calculated to illustrate the GWAS results. The P value threshold for genome-wide significance was calculated by Bonferroni correction based on the estimated number of independent markers. Considering the number of markers and the size of the genotyped population, the suggestive threshold for association was arbitrarily set at P = 0.01.
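The two thresholds can be made concrete with a small Python sketch. The family-wise alpha of 0.05 and the count of 3125 independent markers are illustrative assumptions, chosen only because together they reproduce the genome-wide threshold of 1.6 × 10^−5 reported in the Results; the text does not state either value. The function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_independent):
    """Per-marker P value threshold after Bonferroni correction."""
    return alpha / n_independent


def classify(p_value, genome_wide, suggestive=0.01):
    """Label a marker by the most stringent threshold it passes."""
    if p_value < genome_wide:
        return "genome-wide"
    if p_value < suggestive:
        return "suggestive"
    return "not significant"


# Assumed inputs (not given in the text): alpha = 0.05, 3125 independent markers.
threshold = bonferroni_threshold(0.05, 3125)
print(f"genome-wide threshold: {threshold:.1e}")
print(f"-log10(threshold): {-math.log10(threshold):.2f}")

# Hypothetical P values for three markers.
for p in (1e-6, 0.005, 0.2):
    print(p, classify(p, threshold))
```

On a Manhattan plot the two thresholds appear as horizontal lines at -log10(P) of about 4.8 (genome-wide, under these assumptions) and 2 (suggestive).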

Annotation of Significant Markers

The flanking sequences of the significant SNPs were aligned to the assembled L. vannamei genome (Zhang et al. 2019) using BLASTN. The 40 kb upstream and 40 kb downstream of the top BLAST hit were examined for candidate genes according to their locations and functions. Genes within these regions were predicted using FGENESH (Solovyev et al. 2006) and annotated by BLAST searches against the non-redundant protein database of the National Center for Biotechnology Information (NCBI).

Candidate Gene Association Analysis

Based on the annotation of the significant SNP markers, three genes (E3 ubiquitin-protein ligase HECW1, protein kinase C, and ras-related protein Rap-2a) near Marker33701, Marker1286, and Marker1121 were selected for candidate gene association analysis. Primers were designed to amplify the gene regions from genomic DNA. A pool of DNA from the two validation populations was used as the template for PCR amplification. The PCR products were Sanger sequenced on an ABI Prism 3730xl, and SNPs in the amplified regions of the three genes were detected. Regions containing SNPs were then amplified and sequenced in individual DNA samples to obtain the genotype of each shrimp. Because body weight and body length were significantly correlated, only body weight was analyzed, and the association tests were performed using the R/SNPassoc package.
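The codominant and dominant models reported in the Results differ only in how the three genotypes of a SNP are grouped before body weight is compared. The Python sketch below illustrates that grouping on hypothetical data; it is not the R/SNPassoc implementation, it computes group means rather than a significance test, and the function names, genotypes, and weights are assumptions made for the example.

```python
from statistics import mean


def code_genotype(genotype, minor_allele, model):
    """Map a genotype string such as "AG" to a class label under a model."""
    n_minor = genotype.count(minor_allele)
    if model == "codominant":
        return n_minor           # three classes: 0, 1, or 2 minor alleles
    if model == "dominant":
        return int(n_minor > 0)  # carriers of the minor allele vs non-carriers
    raise ValueError(f"unknown model: {model}")


def group_means(genotypes, weights, minor_allele, model):
    """Mean body weight of each genotype class under the chosen model."""
    groups = {}
    for genotype, weight in zip(genotypes, weights):
        label = code_genotype(genotype, minor_allele, model)
        groups.setdefault(label, []).append(weight)
    return {label: mean(values) for label, values in sorted(groups.items())}


# Hypothetical genotypes and body weights (g) for six shrimp.
genotypes = ["AA", "AA", "AG", "AG", "GG", "GG"]
weights = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0]

print(group_means(genotypes, weights, "G", "codominant"))
print(group_means(genotypes, weights, "G", "dominant"))
```

When only two genotypes are observed at a SNP, as is later reported for the Rap-2a marker, the two groupings coincide.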

Expression Analysis of the Candidate Genes

To investigate the function of the candidate growth-related genes, their expression patterns were analyzed. Healthy shrimp with an average weight of 11 g were randomly selected. Hemocytes were collected from the shrimp with anticoagulant and isolated by centrifugation at 800×g for 10 min at 4 °C. Tissues including epidermis, gill, hemocyte, hepatopancreas, heart, intestine, muscle, nerve, lymphoid organ, and stomach were collected and preserved in liquid nitrogen for total RNA extraction. Total RNA was extracted from each tissue using RNAiso Plus reagent (TaKaRa, Japan). For each sample, 1 μg of RNA was used for cDNA synthesis in a total volume of 25 μl using the PrimeScript RT Reagent Kit (TaKaRa, Japan), with incubation at 37 °C for 15 min followed by 85 °C for 5 s.

Quantitative real-time PCR (qPCR) was performed to analyze the expression patterns of the candidate genes. The housekeeping gene 18S rRNA was used as the reference gene. Each reaction contained 5 μl SuperReal PreMix Plus (SYBR Green) (Tiangen, China), 0.3 mM primers, and 1 μl cDNA template, and was brought to 10 μl with RNase-free water. The qPCR program was as follows: 95 °C for 20 min, followed by 40 cycles of 95 °C for 15 s, the annealing temperature for 15 s, and 72 °C for 20 s. A melt curve step was added to confirm the specificity of the PCR products. The data were analyzed by the 2^−ΔΔCT method (Livak and Schmittgen 2001).
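The 2^−ΔΔCT calculation can be written out as a short Python sketch. The CT values and the choice of a calibrator tissue are hypothetical; the text does not state which tissue served as the calibrator, and the function name is an assumption made for the example.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^-ddCT method (Livak and Schmittgen 2001).

    dCT  = CT(target gene) - CT(reference gene), here 18S rRNA
    ddCT = dCT(sample tissue) - dCT(calibrator tissue)
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2 ** (-delta_delta)


# Hypothetical CT values: the target gene crosses the threshold two cycles
# earlier in the sample tissue than in the calibrator tissue, while the
# reference gene is unchanged.
fold = relative_expression(ct_target=24.0, ct_reference=12.0,
                           ct_target_calibrator=26.0,
                           ct_reference_calibrator=12.0)
print(fold)  # a two-cycle difference corresponds to a four-fold change
```

The method assumes roughly equal amplification efficiencies for the target and reference genes, so that each cycle of difference corresponds to a two-fold change.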

Results

Phenotypic Data of the Growth Trait

The phenotypic data of the two validation populations showed a relatively high level of variation compared with the mapping family (Table 1). The average body weight and body length of the mapping family were 20.32 ± 3.72 g and 117.13 ± 7.36 mm, respectively. Those of validation population 1 were 8.94 ± 2.80 g and 90.35 ± 9.69 mm, and those of validation population 2 were 5.56 ± 2.16 g and 76.99 ± 9.95 mm, respectively.

Table 1 The phenotype information of three populations for growth trait

GWAS Analysis of the Body Length in the Mapping Family

A total of 3960 SNPs were obtained after filtering and quality control. These SNPs were distributed across 44 linkage groups, with an average of one SNP per 1.04 cM. The heritability of body length estimated from the SNP information was 0.25. After Bonferroni correction, the genome-wide significance threshold was 1.6 × 10^−5. No marker reached this threshold, but the GWAS identified 52 suggestively associated markers (P < 0.01) (Fig. 1). The most significant marker was Marker19909, located at 52.69 cM on LG7, followed by Marker4108 and Marker1309, both located at 38.62 cM on LG33 (Table 2). Interestingly, of the top 10 significant markers, eight were located on LG7 and the other two on LG33. These data suggest that LG7 and LG33 might contain QTLs for body length.

Fig. 1 Genome-wide association analysis results for body length of L. vannamei. The red line indicates the suggestive threshold for markers associated with body length. The blue line indicates the genome-wide significance threshold

Table 2 The markers associated with body length of L. vannamei at P < 0.01

GWAS Analysis on Body Weight in the Mapping Family

As for body length, only suggestively associated markers were identified for body weight. A total of 51 suggestively associated markers were detected (P < 0.01) (Fig. 2). The most significant markers were Marker4108 and Marker1309, located at 38.62 cM on LG33 (Table 3). The heritability of body weight estimated from the SNP information was 0.42. As for body length, LG33 was prominent: it contained eight of the top 20 significant markers, which illustrates the importance of this region for growth traits. LG38, LG27, and LG8 were the other three linkage groups containing significant markers.

Fig. 2 Genome-wide association analysis results for body weight of L. vannamei. The red line indicates the suggestive threshold for markers associated with body weight. The blue line indicates the genome-wide significance threshold

Table 3 The markers associated with body weight of L. vannamei at P < 0.01

Annotation on the Associated Genomic Regions

Based on the genome-wide association results, we analyzed the genomic regions associated with body length and body weight. The top 10 significant markers were selected for each trait. Because two markers were shared by the two traits, a total of 18 growth-related markers were annotated. Among them, 11 markers were located next to genes, and seven of the genes were annotated as known genes (Table 4). For body length, the genomic regions around the top 10 markers contained four genes, one of which was annotated as Beta-2 adrenergic receptor; the functions of the other three were unknown. For body weight, the associated genomic regions contained eight genes, six of which were annotated against the non-redundant protein database. Several growth-related genes, such as protein kinase C delta (PKC-delta), ras-related protein Rap-2a (Rap-2a), and E3 ubiquitin-protein ligase HECW2, were located in these regions.

Table 4 Annotation of top associated markers with body length or body weight in L. vannamei

Association Analyses of Candidate Genes

Based on the P values and the annotated gene functions of the identified markers, we selected three genes, PKC-delta, Rap-2a, and E3 ubiquitin-protein ligase HECW2, for further validation by candidate gene association analysis. Sequencing of the genomic DNA of the three genes identified five, six, and three SNPs in the amplified regions of PKC-delta, Rap-2a, and E3 ubiquitin-protein ligase HECW2, respectively. After these SNPs were genotyped in the two validation populations, one SNP located in the 3′UTR of PKC-delta was significantly associated with body weight under both the dominant and codominant models (P < 0.05). One SNP in an intron of Rap-2a was also significantly associated with body weight (Table 5). Interestingly, only two genotypes were observed for the Rap-2a marker; one homozygous genotype was absent from both populations.

Table 5 Association analysis result of SNPs in the genes PKC and Rap-2a

Gene Expression Pattern of Candidate Genes

The expression patterns of PKC-delta and Rap-2a in different tissues were analyzed by qRT-PCR using the primers listed in Table 6. PKC-delta was highly expressed in muscle, heart, and stomach, and its expression was low in the other tissues (Fig. 3a). Rap-2a was highly expressed in the lymphoid organ; its expression was also high in the digestive organs, including hepatopancreas, intestine, and stomach (Fig. 3b).

Table 6 The primers used for expression analysis of the candidate genes by real-time PCR

Fig. 3 Relative expression pattern of protein kinase C delta (a) and ras-related protein Rap-2a (b). Epi, epidermis; Gi, gill; Hc, hemocyte; Hp, hepatopancreas; Ht, heart; In, intestine; Ms, muscle; Ner, nerve; Oka, lymphoid organ; St, stomach

Discussion

Growth traits, including body weight and body length, are among the economic traits of greatest concern in the selective breeding of L. vannamei (Benzie 2009; Caballero-Zamora et al. 2015). In this study, we identified several genomic regions associated with body weight and body length by genome-wide association analysis. Based on the assembled genome sequence, candidate growth-related genes were identified.

Genome-wide association study has become the major approach for mapping genes associated with traits of interest. It has been widely used to identify causative genes for disease in humans and to explore SNPs related to economic traits in crops and livestock (Bouwman et al. 2018; Hirschhorn and Daly 2005; Huang and Han 2014; Zhang et al. 2012). In aquaculture species, several GWAS have been performed in catfish, tilapia, common carp, Atlantic salmon, rainbow trout, and Pacific oyster (Correa et al. 2015; Geng et al. 2016; Gutierrez et al. 2015; Li et al. 2017, 2018; Sodeland et al. 2013; Song et al. 2018; Wang et al. 2017; Zheng et al. 2016; Zhou et al. 2017). Genomic regions or candidate genes associated with disease resistance, fillet quality, sexual maturation, shell color, and growth rate were identified. In our study, by combining GWAS with a candidate gene association approach, we identified several genomic regions on LG7, LG27, LG33, and LG38 associated with growth traits in L. vannamei. Two genes, PKC-delta and Rap-2a, were identified as candidate growth-related genes.

The Bonferroni correction is generally applied to control the false-positive (type I error) rate. However, the Bonferroni-corrected P value is known to be too strict for GWAS (Hong and Park 2012; Spencer et al. 2009). In our study, the genome-wide significance threshold after Bonferroni correction was about 1.6 × 10^−5. Given the small sample size and the large number of SNPs, this corrected threshold is too stringent for identifying candidate markers. Similar results have been reported in other aquaculture species (Gutierrez et al. 2015). Growth is a polygenic trait controlled by many genes of small effect (Gutierrez et al. 2015; Meng et al. 2017). In shrimp, body weight and body length might likewise be controlled by many genes with moderate to low effects, so the P values of the detected markers did not reach the adjusted threshold. However, the top associated markers clustered in genomic regions on LG7, LG33, and LG38, which suggests that genes influencing growth might be located in these regions.

To identify trait-related markers, candidate markers should be validated in other populations (Yue 2014). With the help of the reference genome, annotations of the significant markers were obtained. Combining the GWAS results with the annotation information provides evidence for identifying candidate genes. In this study, three candidate genes were selected for association analysis, and two were confirmed. PKC-delta is a metabolism-related gene that has been shown to play an important role in apoptosis (Zhao et al. 2012) and to have a regulatory function during molting (Chen et al. 2017). Shrimp growth is discontinuous and accompanied by frequent molting, so PKC-delta may influence growth by regulating molting (Shyamal et al. 2018). Rap-2a is a member of the Ras-related protein family. It is part of several signaling cascades and may regulate cytoskeletal rearrangements, cell migration, and cell spreading (Taira et al. 2004). Interestingly, a Ras gene was associated with head size in catfish and is located in a QTL hub of the catfish genome that plays an important role in growth regulation (Geng et al. 2016). PKC-delta was highly expressed in muscle; because muscle makes up the major portion of the shrimp body, the gene may play a crucial role in growth. Rap-2a was highly expressed in the intestine, the major tissue for energy absorption, which is related to development and growth. The expression results provide further evidence that the two genes are growth related.

Marker-assisted selection (MAS) has been widely used in crop and animal breeding (Goddard and Hayes 2009). Its advantage is that it improves selection accuracy and speeds up genetic gain through early selection (Yue 2014). The QTLs and SNPs associated with growth traits identified in this study will be useful for future MAS applications. Furthermore, gene-assisted selection (GAS) is more accurate than MAS (Dekkers 2004), and the two growth-related genes identified here will also accelerate GAS studies in L. vannamei. However, because of the limitations of the mapping population and the marker density, the causative SNPs or mutations influencing growth have not yet been detected. Further fine mapping and functional analysis of these genes are therefore necessary.

Conclusion

In this study, SNPs and genomic regions associated with growth traits were identified by genome-wide association mapping. Based on the assembled reference genome, two growth-related genes, PKC-delta and Rap-2a, were identified. These results illustrate the power of GWA mapping for locating QTLs or genes in L. vannamei. The associated markers and genes will be useful for marker-assisted selection, and the identified genes will expand our knowledge of the mechanisms of growth regulation in L. vannamei.