Introduction

Sex determination and development in aquatic organisms are complicated biological processes. Sex chromosomes were the first sex determination and development-associated systems identified by biologists, and aquatic organisms exhibit diverse sex chromosome determination mechanisms, including the XY and ZW systems (Cioffi et al. 2012; Mathers et al. 2015; Salvadori et al. 2018). Species that share the same sex chromosome system can nevertheless differ in their sex determination and development mechanisms. For example, Macrobrachium rosenbergii and Triportheus both have a ZW sex chromosome system, but the underlying sex determination mechanisms are distinct (Aflalo et al. 2006; Diniz et al. 2008); in M. rosenbergii in particular, single gland cells (Levy et al. 2016, 2019) and single genes (Tomer et al. 2012) determine the direction of sex development. M. rosenbergii, a species in which different sex types differ in size (Kuris et al. 1987), has drawn considerable attention in sex determination and development research. Previously, expression of the insulin-like androgenic gland (IAG) gene in the androgenic gland was found to be a key factor in the sex determination and sex development control mechanism of M. rosenbergii (Ventura et al. 2009, 2012; Aflalo et al. 2012); however, it is not yet clear how sex determination and development are regulated by different transcription factors (TFs) and functional genes, and whether both participate in gonadal sex reversal remains unknown.

In most organisms, TFs and differentially expressed genes (DEGs) in the gonads perform key functions in decoding DNA sequences and controlling cell differentiation, developmental patterning (Lee and Young 2013), and specific pathways, such as gonadal differentiation and gonadal sex determination (Morohashi et al. 2004). The best-understood TF pathway involved in sex determination and development is the demethylase KDM6B pathway, which regulates temperature-dependent sex determination in turtles (Ge et al. 2018). KDM6B is a histone demethylase that can directly promote the expression of Dmrt1, a TF involved in male sex determination in turtles. Dmrt1 contains a DNA-binding motif (Matson and Zarkower 2012) and is the master gene in the genetic sex determination of certain non-mammalian species, such as chickens and frogs (Smith et al. 2009; Yoshimoto et al. 2008). Crustaceans may adopt the Drosophila sex determination and development pathway, which is governed by the Sxl-Tra/Tra-2-Dsx/Fru genetic cascade; this pathway has also been reported in some shrimp species, such as Penaeus monodon and Penaeus chinensis (Leelatanawit et al. 2008; Li et al. 2012). M. rosenbergii, a non-mammalian species, has been shown to express similar TFs and functional genes in embryonic tissues and androgenic glands (Amterat Abu Abayed et al. 2019); however, the upstream regulatory factor IAG is also associated with these TFs. Currently, these two genes are known to be involved in the sex determination of M. rosenbergii; however, the regulatory mechanism underlying their activity is unclear.

The sex determination mechanisms of model organisms and amphibians have been studied systematically. In shrimps, however, the large number of chromosomes makes it difficult to distinguish between autosomes and sex chromosomes, and sex identification by karyotype analysis is impossible (Lee et al. 2004). In particular, the role of TFs and DEGs in sex determination and development in M. rosenbergii is poorly understood. The testes, ovaries, and androgenic glands are important gonadal tissues of M. rosenbergii, with important roles in producing sperm and eggs and in regulating sex development; however, whether the DEGs and differentially expressed TFs in these three gonadal tissues have a role in their development and maturation remains unclear. Therefore, further research on this topic would broaden our understanding of the sex development system, improve and expand the existing sex development regulation network, and provide new ideas for the study of sex development regulation in M. rosenbergii.

Materials and methods

Sample collection and RNA extraction

Six adult M. rosenbergii shrimps (three males and three females) were raised in two automated circulating water system tanks, with the water temperature maintained at 25–28 °C and the dissolved oxygen content above 5 mg/L, in the Laboratory of Aquatic Biology of Zhejiang Academy of Agricultural Sciences, Hangzhou, China. The sexes were separated based on the presence of genital papillae on the fifth walking legs and appendices masculinae on the second pleopods, as previously reported (Ventura et al. 2009). All shrimps were more than 8 months old and sexually mature, with clear gonad tissues, as determined by dissection. Each shrimp was placed on ice for 1 min to induce anesthesia. After anesthetization, ovaries (Ov), all at the pre-ovulation stage, were dissected from the three females, and testes (Tt) and androgenic glands (Ag) were dissected from the three males; the tissues were collected separately. Each fresh sample was washed in phosphate-buffered saline (PBS; Gibco, Fisher Scientific, Waltham, MA, USA) and immediately frozen in liquid nitrogen.

Total RNA was extracted from each sample using an RNeasy Plus Micro Kit (QIAGEN LLC., Germany). RNA concentration and purity were measured using a spectrophotometer to determine the OD260/OD280 value for each sample (NanoVue; GE Healthcare, Piscataway, NJ, USA).

Mixed sample sequencing and unique transcript identification

The three male RNA samples and the three female RNA samples were pooled into one male sample and one female sample, respectively, each containing 5 µg of total RNA, for Iso-Seq library construction. The Iso-Seq library was prepared according to the Isoform Sequencing protocol (Iso-Seq) using the Clontech SMARTer PCR cDNA Synthesis Kit and the BluePippin Size Selection System protocol, as described by Pacific Biosciences (PN 100-092-800-03). Sequencing was performed using the PacBio Sequel platform (Pacific Biosciences Co., CA, USA). The raw reads were processed using the SMRTlink software (version 7.0; Pacific Biosciences Co.) and classified as full-length or non-full-length reads. Full-length non-chimeric (FLNC) read sequences were corrected against the Illumina RNA-Seq data using LoRDEC, and redundancies were removed using CD-HIT to obtain the final transcripts. The obtained transcripts were subjected to clustering analysis to form a reference library of all the unique genes.

Sample sequencing and de novo assembly

Three gonadal tissues were sequenced with three replicates each, giving nine samples in total. A total of 3 μg of RNA per sample was used as the initial material for RNA sample preparation. Total RNA was used to generate sequencing libraries with varied index labels using a NEBNext Ultra™ Directional RNA Library Prep Kit for Illumina (New England BioLabs, Ipswich, MA, USA), according to the manufacturer’s instructions. The clustering of index-coded samples was performed on a cBot cluster generation system using TruSeq PE Cluster Kit v3-cBot-HS (Illumina, San Diego, CA, USA), according to the manufacturer’s instructions. After cluster generation, the libraries were sequenced on an Illumina HiSeq 4000 platform (Illumina), and 150 bp paired-end reads were generated. All data generated in this study were deposited in the Gene Expression Omnibus (GEO) with the accession code GSE171823. Additional nucleotide errors in the consensus reads were corrected using Illumina RNA-Seq data with LoRDEC software. The expression levels of all unigenes were estimated from the sequencing data by calculating the fragments per kilobase of transcript per million mapped fragments (FPKM) value.
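The FPKM normalization mentioned above can be illustrated with a minimal Python sketch. It uses the standard FPKM definition; the function name and the example counts are hypothetical and are not taken from the data of this study, and the actual quantification software used is not restated here.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp unigene
# in a library with 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)
print(example_fpkm)
```

Dividing by both the transcript length and the library size makes values comparable across unigenes of different lengths and across samples of different sequencing depth.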

Annotation of unigenes and differential expression level analysis

The function of each unigene was annotated against the NR, NT, Pfam, KOG/COG, Swiss-Prot, KO, and Gene Ontology (GO) databases using BLAST, Diamond BLASTX, and Hmmscan, with the e-value threshold set to 1e−10. Differential expression analysis between each pair of groups was performed using the DESeq R package (1.10.1). The resulting p values were adjusted using the Benjamini–Hochberg approach to control the false discovery rate. Genes with an adjusted p value < 0.05, identified using DESeq, were categorized as differentially expressed genes (DEGs).
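The Benjamini–Hochberg adjustment described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the general procedure, not the DESeq code itself; the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# Genes passing the adjusted p value < 0.05 cutoff used for DEGs.
deg_flags = [p < 0.05 for p in adj]
print(adj, deg_flags)
```

Each raw p value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with rank.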

Quantitative reverse transcription PCR (qRT-PCR) was performed using a Power SYBR Green RT-PCR Reagent Kit (Applied Biosystems, Carlsbad, CA, USA) in triplicate for each sample and reference gene. β-Actin was used as the reference gene. The expression of all chosen mRNAs was quantified using the comparative CT ratio between each gene and the β-Actin gene (primer data are shown in Table 1).
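As an illustration of a comparative CT calculation, the sketch below computes expression relative to the reference gene as 2^(−ΔCT) from triplicate CT values. The exact formula is an assumption, since the text states only that a comparative CT ratio was used; the function name and CT values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Relative expression of a target gene versus a reference gene.

    Assumes the 2^(-delta CT) form of the comparative CT method,
    with delta CT = mean CT(target) - mean CT(reference).
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate CT values for a target gene and beta-Actin.
target_ct = [22.0, 22.0, 22.0]
actin_ct = [20.0, 20.0, 20.0]
rel = relative_expression(target_ct, actin_ct)
print(rel)
```

A target gene that crosses the threshold two cycles later than the reference is estimated at one quarter of the reference level, assuming 100% amplification efficiency.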

Table 1 Primers used in the present study

GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of DEGs

GO enrichment analysis of DEGs was performed using the GOseq R package, in which gene length bias was corrected. GO terms with corrected p values < 0.05 were considered significantly enriched in DEGs.

KEGG (http://www.kegg.jp/) enrichment of DEGs was tested using the hypergeometric test in the KOBAS software (v.2.0) (Chen et al. 2011), with p values adjusted for multiple comparisons and reported as q values. KEGG terms with q < 0.05 were considered significantly enriched.
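The hypergeometric enrichment test can be sketched as follows. This is a generic one-sided implementation for illustration, not the KOBAS code; the function name, argument names, and gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, in_pathway, n_degs, degs_in_pathway):
    """One-sided hypergeometric p value, P(X >= degs_in_pathway).

    background      : number of annotated background genes (N)
    in_pathway      : background genes annotated to the pathway (K)
    n_degs          : number of DEGs tested (n)
    degs_in_pathway : DEGs annotated to the pathway (k)
    """
    upper = min(in_pathway, n_degs)
    total = comb(background, n_degs)
    favorable = sum(
        comb(in_pathway, i) * comb(background - in_pathway, n_degs - i)
        for i in range(degs_in_pathway, upper + 1)
    )
    return favorable / total


# Hypothetical toy numbers: 10 background genes, 3 in the pathway,
# 4 DEGs, of which 2 fall in the pathway.
p_value = hypergeom_enrichment_p(10, 3, 4, 2)
print(p_value)
```

The p values from all tested pathways would then be corrected for multiple comparisons before applying the q < 0.05 cutoff.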

TFs identification

Based on the AnimalTFDB 2.0 TF database (Zhang et al. 2015), all TFs were predicted and identified by using the Hmmscan software (v.3.1b2) (Eddy 2011) to calculate the E-value of each DEG, with a highly significant E-value threshold of 10 × 10^−5. Tissue-specific TFs were screened from the differentially expressed TFs using FPKM values, retaining only TFs that were highly expressed in a single tissue.
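The tissue-specific screening step can be illustrated with a short Python sketch. The threefold cutoff is an assumption taken from the Discussion, where testis-specific TFs are described as expressed more than three times higher than in the other two tissues; the function name and FPKM values are hypothetical.

```python
def tissue_specific(fpkm_by_tissue, min_fold=3.0):
    """Return the tissue whose FPKM exceeds min_fold times every other tissue.

    Returns None when no tissue satisfies the criterion.
    min_fold=3.0 is an assumed cutoff, not a documented parameter.
    """
    for tissue, value in fpkm_by_tissue.items():
        others = [v for t, v in fpkm_by_tissue.items() if t != tissue]
        if all(value > min_fold * v for v in others):
            return tissue
    return None


# Hypothetical mean FPKM values for two TFs in testis (Tt),
# androgenic gland (Ag), and ovary (Ov).
tf_a = {"Tt": 30.0, "Ag": 5.0, "Ov": 2.0}
tf_b = {"Tt": 10.0, "Ag": 5.0, "Ov": 2.0}
print(tissue_specific(tf_a), tissue_specific(tf_b))
```

Under this criterion the first TF would be called testis-specific, whereas the second would not be assigned to any tissue because its testis expression is only twice that in the androgenic gland.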

Results

Summary of RNA-seq raw data

Two libraries (one for a mixed sample of three males and one for a mixed sample of three females) were constructed and sequenced from six adult M. rosenbergii. Based on the mixed sample sequencing, 44.42 gigabases (Gb) of subreads longer than 50 bp were generated in total, with 20.79 Gb and 23.63 Gb for males and females, respectively (Table 2). In total, 37,751,802 high-quality subreads with an average length of 1,177 bp were obtained by sequencing. After clustering to eliminate redundancy among the FLNC read sequences, 13,299 and 35,247 transcripts were obtained for males and females, respectively. A total of 24,007 genes were identified by removing redundant and similar sequences through sequence alignment and clustering.

Table 2 Statistical analysis of sequencing for M. rosenbergii

Analysis of annotation and expression by sequencing

Nine samples from three males and three females were prepared and sequenced. In total, 21,071 unigenes were clustered with the third-generation sequencing results and were annotated using seven databases. A total of 11,602 unigenes were annotated in the Nr protein database with an e-value < 1e−5. The fewest unigenes were annotated in the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/RefSeq) database of non-redundant nucleotide and protein sequences, with only 3,127 unigenes annotated.

The FPKM values of all the unigenes were calculated for subsequent analysis. The results revealed that the unigenes were expressed at different levels in the three tissues (Fig. 1). The expression levels of unigenes did not differ significantly between the testes and the other tissues, but significant differences were observed in unigene expression between the ovaries and androgenic glands (p < 0.05, t-test). These results indicate that the average expression level of unigenes increased in the order androgenic glands, testes, and ovaries, with the clearest difference between the androgenic glands and the ovaries.

Fig. 1

The level of differential expression of unigenes in each sample of M. rosenbergii. Tt, Ag, and Ov represent the testes, androgenic glands, and ovaries, respectively

Analysis of DEGs

In total, 9,199 unique DEGs were detected and identified by pairwise comparison of the three M. rosenbergii tissues. Based on the DEG analysis results, the greatest level of differential unigene expression was observed between the androgenic glands and ovaries, with 5,391 unigenes downregulated and 2,113 upregulated in the androgenic gland compared to those in the ovary (Fig. 2). The lowest number of DEGs was found in the comparison between androgenic glands and testes: 2,246 unigenes were downregulated and 802 were upregulated in the androgenic glands compared to those in the testes.

Fig. 2

The level of differential expression of unigenes in different tissues of M. rosenbergii. A, B, and C The numbers of DEGs from the pairwise comparisons of the three tissues. D The DEG profiles in the three comparisons. E The clustering of all the DEGs in the present study

Cluster analysis resolved the unigenes into three categories. The branch at the base of the heatmap contained unigenes whose expression decreased from the ovary to the androgenic gland and testis, whereas the branch at the top of the heatmap showed the opposite trend (Fig. 2).

Validation of unigene expression using qPCR

We selected four unigenes (Piwi, IAG, Argonaute 3 (Ago3), and ankyrin) for qRT-PCR validation of the FPKM values obtained by RNA sequencing. Based on the CT value of each gene in each sample, the qPCR results were highly consistent with the FPKM values (Fig. 3), and a correlation analysis yielded a p value of less than 0.001, confirming that the FPKM values accurately represented the expression levels in the testes, androgenic glands, and ovaries of M. rosenbergii.
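A correlation analysis of this kind can be sketched in Python as a Pearson correlation between log10-transformed FPKM values and qPCR relative expression levels. The choice of Pearson's coefficient is an assumption, as the type of correlation is not stated, and the function name and paired values are hypothetical.

```python
from math import log10, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical paired measurements for one gene across samples.
fpkm_values = [10.0, 100.0, 1000.0, 10000.0]
qpcr_values = [1.0, 10.0, 100.0, 1000.0]
log_fpkm = [log10(v) for v in fpkm_values]
log_qpcr = [log10(v) for v in qpcr_values]
r = pearson_r(log_fpkm, log_qpcr)
print(r)
```

A coefficient close to 1 indicates that the two measurements rank and scale the samples in the same way; a significance test on the coefficient would supply the reported p value.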

Fig. 3

Verification of the expression patterns of unigenes in the three tissues of M. rosenbergii by qRT-PCR. The y-axis shows the log10-transformed FPKM value and qPCR relative expression level

Enrichment of DEGs

To further investigate the pattern of divergent expression of sex maintenance- and sex determination-related DEGs and their corresponding functions, GO analyses of DEGs were conducted in the three comparison groups. We found that 1,347 DEGs in Ag vs Tt were significantly enriched in GO terms (Fig. 4). Among these DEGs, the most enriched GO term was response to stimulus in the biological process category, which included 57 upregulated and 155 downregulated unigenes. The second-most enriched GO term was DNA binding in the molecular function category, which included 22 upregulated and 160 downregulated unigenes. In total, 2,719 DEGs from Ag vs Ov were significantly enriched in GO terms, and the most enriched GO term was extracellular region. Additionally, 1,590 DEGs from Tt vs Ov were significantly enriched in GO terms, and the most enriched GO term was again extracellular region. Interestingly, no DEGs from the testes or ovaries were enriched in biological process terms.

Fig. 4

GO enrichment analysis results for DEGs. A Statistics for DEGs from androgenic gland vs testis; B statistics for DEGs from androgenic gland vs ovary; C statistics for DEGs from testis vs ovary

In the KEGG enrichment analysis, 1,281 DEGs, 3,634 DEGs, and 1,641 DEGs from Ag vs Tt, Ag vs Ov, and Tt vs Ov, respectively, were significantly enriched in KEGG terms. The top 20 KEGG terms in each group of DEGs are shown in Fig. 5. The KEGG terms cell cycle, DNA replication, oocyte meiosis, and basal TFs were significantly enriched in Ag tissue as well as in the other two tissue types (Fig. 5). Most DEGs from Tt vs Ov were significantly enriched in terms related to biosynthesis, such as insect hormone biosynthesis.

Fig. 5

KEGG enrichment analysis results for DEGs. A Statistics for the DEGs from Ag vs Tt; B statistics for the DEGs from Ag vs Ov; C statistics for the DEGs from Tt vs Ov

TF identification and analysis

In total, 284 TFs were identified in the three tissue transcriptomes. After differential expression analysis, 272 differentially expressed TFs were detected across the three comparison groups (Fig. 6). A total of 135 differentially expressed TFs were identified in the comparison of Ag vs Tt, 207 in the comparison of Ag vs Ov, and 89 in the comparison of Tt vs Ov. Based on the TF database classification, the largest family was zf-C2H2, with 110 members, followed by the ZBTB family, with 57 members.

Fig. 6

Differentially expressed TFs analysis and KEGG enrichment analysis results. A Cluster result of differentially expressed TFs analysis; B statistics for the differentially expressed TFs from Ag vs Tt using KEGG enrichment analysis; C statistics for the differentially expressed TFs from Ag vs Ov using KEGG enrichment analysis; D statistics for the differentially expressed TFs from Tt vs Ov using KEGG enrichment analysis

To further elucidate the functions of the differentially expressed TFs, KEGG enrichment analysis was conducted (Fig. 6). The results indicated that the MAPK signaling, longevity regulating, and thyroid hormone synthesis pathways were significantly enriched in the DEGs of the three groups. The TF MAX in the MAPK signaling pathway was upregulated specifically in the M. rosenbergii testes and was not detected in other tissues. Similarly, the TF TRβ1 was upregulated specifically in the ovaries of M. rosenbergii. In addition, the expression of the sperm cell-specific TF Piwi was significantly upregulated in the testes.

Discussion

In this study, we conducted transcriptomic analysis and TF identification in adult gonads of male and female M. rosenbergii to explore the mechanism of sex determination and gonad maintenance. Previous studies on M. rosenbergii have focused on sex-related gene expression or gonad-specific genes in male and female pathways (Jiang et al. 2019; Yang et al. 2020). TFs play critical roles in tissue differentiation and development (Al-Kurdi 2017); however, there is limited information available on their function in sex determination and gonad maintenance in M. rosenbergii.

In previous reports, an increasing number of aquatic animal transcriptomes have been established using mixed sample sequencing (Suwansa-Ard et al. 2015; Jung et al. 2016). In M. rosenbergii, transcriptome sequencing has also been applied to screen candidate genes involved in the development, differentiation, and gonadal axis regulation of the gonads (Pasookhush et al. 2019). There have also been some new reports on the discovery of candidate genes for gonadal development and differentiation and associated genetic markers in M. rosenbergii (Jin et al. 2013; Patnaik et al. 2016); however, there is limited information about the role of TFs in gonadal development and maintenance in this species. In this study, we identified 272 differentially expressed TFs through transcriptome sequencing analysis of three main gonadal tissues (testes, androgenic glands, and ovaries) in M. rosenbergii. Among them, 48 TFs were specifically upregulated in the testes, with expression levels more than three times higher than those in the other two tissues, whereas 67 TFs were specifically upregulated in the ovaries and only one TF was specifically upregulated in the androgenic glands. This suggests that ovarian development and differentiation are more complicated than the formation of the testes and androgenic glands, particularly with respect to intercellular and intracellular communication in mature follicles, which requires TF participation (Bernabé et al. 2020). The single androgenic gland-specific TF, which has a DNA-binding function, was upregulated compared to its expression in the testes and ovaries. This indicates that the androgenic gland-specific genes upregulated in M. rosenbergii, such as IAG, may have transcriptional patterns similar to those in zebrafish, in which transcription and translation are activated by regulatory-specific DNA binding in the early embryonic stages (Joseph et al. 2017).

TFs play an important role in gonadal development and maintenance, as confirmed in previous studies on gonadal organs in humans and other animals (Bothun and Woods 2019; Tsoi et al. 2019; Rai et al. 2020). In this study, we found that the TFs MAX and TRβ1, which are specifically expressed in the testis and ovary of M. rosenbergii and are associated with the DEGs TAB1 and ATPase, may maintain the testicular and ovarian cell cycle through the MAPK signaling pathway and the thyroid hormone signaling pathway, controlling the maturation of primary spermatocytes and primary oocytes, respectively (Fig. 7).

Fig. 7

Key DEFs and DEGs that play a role in gonad maintenance and sex determination in M. rosenbergii. A The pathway by which key DEFs and DEGs play a role in gonad maintenance (the numbers correspond to the upregulated fold change in the corresponding gonad); B the pathway by which key DEFs and DEGs play a role in sex determination (numbers in red correspond to the upregulated fold change in Tt vs Ov, and numbers in green to the downregulated fold change in Tt vs Ov)

We observed that MAX, a TF that is upregulated in the testis, directly regulates DNA replication, thereby controlling the proliferation of primary spermatocytes (Hurlin and Huang 2006). TAB1, the primary DEG upregulated in the testes, can indirectly inhibit the expression of TCF through the Wnt signaling pathway to control the proliferation of primary spermatocytes and maintain the function of sperm development (Shibuya et al. 1996). The MAPK signaling pathway is an important biological pathway for cell maintenance (Bonni 1999) and plays an important role in the maintenance of gonads and the process of apoptosis (Casarini and Crépieux 2019). In a previous study, MAX was shown to play an important role in the sex organs of male rats (Izawa 1993) and was also shown to be an important gene in gonadal cell transcription (Hurlin and Huang 2006), which is a key process for the maintenance of gonadal cells and the prevention of tumor formation. Although the TAB1 gene has been reported to maintain peripheral T cells and sex pheromone response (Ohkusu-Tsukada et al. 2004; Folli et al. 2008), it has not been reported to participate in the testicular cell cycle, and the results of this study support that TAB1 is a key gene that controls the division and replication of primary spermatocytes in M. rosenbergii.

The thyroid hormone signaling pathway plays an important role in female reproduction by altering the estrous cycle and anti-oxidative status of the ovary (Wei et al. 2018). TRβ1 and ATPase, the latter belonging to the Na+/K+-ATPase family, are key functional genes involved in the thyroid hormone signaling pathway (Bhargava et al. 2007). We found that both are specifically upregulated in the ovaries of M. rosenbergii, and TRβ1 indirectly regulates glycolysis and maintenance of egg cell pluripotency through PI3K and Wnt/β-catenin. In addition, TRβ1 plays a direct role in regulating DNA replication to control the cell cycle. Some studies have reported that during the development and maintenance of primary oocytes, granulosa cells supply oocytes through glycolysis (Fontana et al. 2020), indicating that the upregulation of TRβ1 expression is involved in the maturation of primary oocytes in M. rosenbergii. Wnt/β-catenin is an important factor that regulates and maintains the pluripotency of embryonic cells (Fan et al. 2020) and acts as a biologically active molecule, similar to TFs and steroid hormones produced by the ovary in the germ line (Edson et al. 2009). TFs can also act as key biologically active molecules that activate the TF Esrrb to control the morphogenesis of embryonic internal organs and maintain pluripotency (Fan et al. 2020). In addition, the upregulation of ATPase expression promoted by TRβ1 can help meet the energy requirements of folliculogenesis in the ovary of M. rosenbergii (Dumesic et al. 2015).

Owing to the high demand for single-sex culture of M. rosenbergii, sex determination and development have always been a research hotspot in this field. In this study, two members of the Argonaute family, Ago3 and Piwi, showed different expression patterns in the ovary and testis: Ago3 was significantly upregulated in the ovaries, whereas Piwi was significantly upregulated in the testes. Ago3 and Piwi are key factors in the sex development of Bombyx mori. They can interact with specific genes expressed on the ZW chromosomes and regulate sex determination in the silkworm through ping-pong partners and dosage compensation (Katsuma et al. 2018). The expression patterns in both mature female and male glands of M. rosenbergii are consistent with those of Bombyx mori (Fig. 7), indicating that sex development in M. rosenbergii may also be achieved through the ping-pong mechanism. Fem, which is involved in this sex development mechanism, was not detected among the DEGs but tended to be upregulated in the ovary in our study, which is consistent with the time-specific expression of Fem observed in Litopenaeus vannamei, in which Fem is only expressed before oocyte maturation (Galindo-Torres et al. 2019). Collectively, these findings suggest that Fem expression in female M. rosenbergii may be coordinated with Piwi gene expression to inhibit Masculinizer (Masc) expression, which in turn causes the normal splicing of the sex-determining gene doublesex (Dsx) (Suzuki et al. 2008) and leads to the development of female individuals. In male individuals, Fem expression is absent and Masc can cause alternative splicing of the Dsx gene.

Conclusions

Collectively, our results show that 9,199 DEGs and 272 differentially expressed TFs were identified by pairwise comparison; the TFs MAX and TRβ1 are involved in the MAPK and thyroid hormone signaling pathways and play an important role in the maintenance of the testis and ovary of M. rosenbergii, respectively. Sex development in M. rosenbergii might involve a ping-pong mechanism similar to that in Bombyx mori. The key TFs in this mechanism, Piwi and Ago3, were significantly differentially expressed between the testes and ovaries of M. rosenbergii.