Introduction

The sex-determination mechanism of plants is complex and regulated by multiple factors such as sex-determination genes, sex chromosomes, phytohormones, environmental factors, and epigenetic regulation. Most theorizing about sex determination in plants has focused on dioecious species, but monoecious species also matter: the sex of monoecious or hermaphroditic species is determined at the level of modules, tissues, or cells. The study of flower sex differentiation in monoecious plants links monoecy and dioecy and can thus inform plant evolution (Pannell 2017). Schisandra chinensis, which belongs to the Schisandraceae, is a monoecious plant with male and female unisexual flowers (Cao et al. 2015). The fruit of S. chinensis is the traditional Chinese herb named “Wuweizi” or “Beiwuweizi”, used as an astringent tonic to astringe the lungs and the kidneys, replenish energy, promote the production of body fluids, tonify the kidney, and induce sedation (Yang et al. 2022). The fruit of S. chinensis is also consumed as a fruit or health food of high medical and economic value. The fruit yield of S. chinensis is affected by the ratio of female to male flowers, so studying the development of male and female flowers and sex-related genes is important for the subsequent regulation of this ratio. In addition, the Schisandraceae include both monoecious and dioecious plants, so studying the flower differentiation of S. chinensis is also important for understanding the evolution of the family.

With the rapid development of next-generation sequencing technology, various sequencing methods are available for the study of plant sex determination. Whole-genome sequencing (WGS) of non-model organisms is now widely accessible and has allowed a range of questions in molecular ecology, including plant sex determination, to be investigated with greater power (Vekemans et al. 2021). Zhou et al. applied multiomics analysis, including WGS, RNA-seq, and assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq), to investigate the sex determination of areca palm (Areca catechu) and revealed that a region on pseudochromosome 15 harbored sex-related genes, including CYP703, LOG, GPAT, AMS, and BiP (Zhou et al. 2022). However, WGS-based methods are costly and not suitable for every species, especially those with large genomes, such as plants from the Schisandraceae, which have a genome size of about 8 Gb (Zhang et al. 2013). In contrast, RNA-seq is a powerful and cost-effective approach for the study of sex determination and has been widely used to identify DEGs associated with sex differentiation in plants (Harkess and Leebens-Mack 2016). RNA-seq can provide a molecular basis for revealing differences between male and female flowers at the transcriptional level and facilitate genetic research on sex determination in non-model plants, such as the monoecious plants Jatropha curcas, Vernicia fordii, and Castanea henryi (Zhao et al. 2020; Alsubaie et al. 2023). In Jatropha curcas, DEGs such as KNAT, MYC2, SRS5, SVP, TFL1, and TS2 may participate in sex determination (Chen et al. 2016). In Vernicia fordii, 310 and 298 DEGs showed high expression levels in male and female flowers, respectively (Mao et al. 2017). In Castanea henryi, WRKY47, ERF021, MYB4, AGL11/15, DEF, and SEP1 were critical regulators of sex determination (Wu et al. 2023). Therefore, we consider RNA-seq a feasible tool to explore the sex determination mechanism.

Liu et al. used RNA-seq on flower buds about to bloom to identify the sex-determining genes of S. chinensis. Their results showed that phytohormones, including auxin and jasmonate (JA), and sucrose may contribute to the development of S. chinensis flowers, and that AG may be critical for the sex determination of S. chinensis (Liu et al. 2022). Members of the MADS-box and YABBY gene families have been reported to be strongly associated with sex differentiation and determination in plants. MADS-box genes such as AP3, PI, and STK are candidate genes in the sex determination of Areca catechu, Populus tremula, and Zanthoxylum armatum (Zhou et al. 2022; Leite Montalvão et al. 2022; Hui et al. 2022). YABBY family genes such as CRABS CLAW (CRC) and INNER NO OUTER (INO) are important for female organ development. CRC controls carpel determinacy in cucurbits (Zhang et al. 2022a). INO could regulate the development of the integument and be essential for female fertility in pomegranate (Chen et al. 2017). MADS-box and YABBY transcription factors thus play essential roles in plant sex determination, acting as sex switches (Zhang et al. 2022a, 2022b). Besides the MADS-box and YABBY genes, pathways including flavonoid biosynthesis, DNA replication, and sugar metabolism have been revealed to participate in male or female organ development (Feng et al. 2020; Pawełkowicz et al. 2019; Wang et al. 2019b). To further identify genes that may participate in the flower development and sex determination of S. chinensis, we performed RNA-seq and DEG analysis between female and male flowers at three developmental stages. Because extensive research supports the importance of MADS-box and YABBY transcription factors in plant sex determination (Arora et al. 2007; Romanova et al. 2021), we also conducted two gene family analyses based on the transcriptome data to identify possible MADS-box and YABBY genes. The results of this study lay a foundation for further exploring the sexual differentiation mechanism of S. chinensis and provide valuable information for the study of the evolutionary patterns of flowers in the Schisandraceae and of plant sexual differentiation.

Materials and methods

Plant materials and sample collection

Three monoecious S. chinensis plants were cultivated in the Beijing Medical Botanical Garden (116.27°E, 40.03°N). Male and female S. chinensis flowers were collected on April 14, 18, and 24, 2022. The flowers showed different morphological patterns at the three developmental stages (Fig. 1). At the first stage (April 14), the male and female flowers of S. chinensis could not be distinguished by direct observation. Immature stamens and pistils were observed in male and female flower buds under the anatomical microscope and scanning electron microscope (Fig. 1g–n). At the second stage, S. chinensis flowers were fully developed but not fully open (Fig. 1b, e), while at the third stage they were fully open (Fig. 1c, f). Samples were grouped into F1 vs. M1, F2 vs. M2, and F3 vs. M3, which represented the comparisons between female and male flower buds at the first, second, and third developmental stages, respectively. Each group contained three biological replicates, and each biological replicate contained at least five flower buds. The samples for RNA-seq were frozen in liquid nitrogen immediately after collection and stored in a −80 °C freezer. The samples for observation were stored in FAA fixative solution (70% ethanol: glacial acetic acid: 38% formaldehyde = 18:1:1).

Fig. 1

Male and female flowers of S. chinensis at the different developmental stages. a–c Male buds at stages 1–3. d–f Female buds at stages 1–3. g–h The male buds at stage 1 under the microscope. i The male bud at stage 1 under the scanning electron microscope. j The tissue section of a male bud at stage 1. k–l The female buds at stage 1 under the microscope. m The female bud at stage 1 under the scanning electron microscope. n The tissue section of a female bud at stage 1. Scale bars: g, h, j, k, l, n 50 mm; i, m 200 μm

RNA extraction, library construction, and Illumina sequencing

Total RNA was extracted from the samples using TRIzol reagent (Invitrogen, Carlsbad, CA, USA). The integrity and quality of the RNA were determined with the Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA). One microgram of total RNA was used for the transcriptome library construction. mRNA was isolated with oligo(dT) magnetic beads, which bind the poly(A) tail through A-T base pairing, and was then fragmented in fragmentation buffer. cDNA was synthesized using the SuperScript double-stranded cDNA Synthesis Kit (Invitrogen, Carlsbad, CA, USA) and random hexamer primers (Illumina, San Diego, CA, USA). The synthesized cDNA was then end-repaired, phosphorylated, and A-tailed according to Illumina’s library construction protocol. Libraries were size-selected for cDNA target fragments of 300 bp on 2% low range ultra agarose, followed by PCR amplification using Phusion DNA polymerase (NEB) for 15 PCR cycles. After quality inspection, the paired-end RNA-seq library was sequenced on the Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, USA). High-throughput sequencing was performed by Shanghai Majorbio Bio-pharm Biotechnology Co., Ltd. (Shanghai, China).

Quality control, de novo transcriptome assembly, function annotation

Fastp v0.19.5 with default parameters was used for quality control (Chen et al. 2018). The de novo assembly of transcripts was performed with Trinity v2.8.5 (Grabherr et al. 2011). CD-HIT v4.5.7 was used to optimize the assembled transcripts (Fu et al. 2012). The quality of the transcripts was assessed with BUSCO v3.0.2 and TransRate v1.0.3 (Simão et al. 2015; Smith-Unna et al. 2016). All assembled transcripts were compared with public databases including NCBI protein non-redundant (NR), Swiss-Prot, Pfam, Clusters of Orthologous Groups of proteins (COG), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) using BLAST+ v2.9.0 and DIAMOND v0.9.24, with an E-value threshold of < 1e−5 (Camacho et al. 2009; Buchfink et al. 2015).

Differential expression analysis

Gene expression levels were calculated and normalized as transcripts per million (TPM) using RSEM v1.3.1 (Li and Dewey 2011). The DESeq2 R package v1.24.0 was used to identify differentially expressed genes between male and female flowers at the three stages (Love et al. 2014). We compared three groups: F1 vs. M1, F2 vs. M2, and F3 vs. M3. Genes with |log2(fold change)| > 1 and a false discovery rate (FDR) < 0.05 in a comparison were considered significant DEGs. DEGs were then subjected to enrichment analysis of GO functions and KEGG pathways.
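As an illustration only, the TPM normalization and the DEG thresholds described above can be sketched in a few lines of Python. This is not the RSEM or DESeq2 implementation: RSEM uses effective transcript lengths and DESeq2 fits a statistical model, whereas the sketch takes plain lengths and precomputed log2 fold changes and FDR values. The function names and the toy values are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to TPM using plain transcript lengths (simplified)."""
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def is_deg(log2_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply the thresholds stated in the text: |log2FC| > 1 and FDR < 0.05."""
    return abs(log2_fc) > lfc_cutoff and fdr < fdr_cutoff


# Toy results (made-up numbers): gene -> (log2 fold change, FDR)
toy_results = {
    "gene_a": (2.3, 0.001),
    "gene_b": (-1.5, 0.04),
    "gene_c": (0.8, 0.001),   # fold change too small
    "gene_d": (3.0, 0.20),    # not significant
}
toy_degs = sorted(g for g, (lfc, fdr) in toy_results.items() if is_deg(lfc, fdr))
toy_tpm = tpm([10, 20, 30], [1000, 2000, 3000])
```

Because TPM values are scaled to sum to one million within each sample, they are comparable across genes within that sample.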

Identification and expression pattern analysis of MADS-box and YABBY gene family members

BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and HMMER v3.0 (Potter et al. 2018) were used to identify possible members of the MADS-box and YABBY gene families based on the RNA-seq data. MADS-box sequences of Arabidopsis were downloaded from The Arabidopsis Information Resource 10 database (TAIR10) (www.arabidopsis.org/) (Berardini et al. 2015). Arabidopsis and rice YABBY sequences were downloaded from the Plant Transcription Factor database (http://planttfdb.gao-lab.org/). The MADS-box HMMER models, SRF-TF (PF00319) and K-box (PF01486), and the YABBY HMMER model (PF04690) were obtained from the Pfam protein families database (http://pfam.xfam.org/) (Mistry et al. 2021). The filter threshold for both BLAST and HMMER was set to 1e−5. The intersection of the BLAST and HMMER results was taken as the set of candidate S. chinensis MADS-box and YABBY genes. All predicted MADS-box and YABBY sequences were verified using SMART (http://smart.embl-heidelberg.de) and CDD (https://www.ncbi.nlm.nih.gov/cdd) (Letunic et al. 2021; Lu et al. 2020).
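The candidate selection described above amounts to thresholding two hit lists and intersecting them. A minimal Python sketch is given below; it assumes the hits have already been parsed into (sequence ID, E-value) pairs and that the threshold is applied as a strict "less than". The function name, the sequence IDs, and the E-values are hypothetical.

```python
def passing_ids(hits, max_evalue=1e-5):
    """Return the IDs of hits whose E-value is below the threshold."""
    return {seq_id for seq_id, evalue in hits if evalue < max_evalue}


# Made-up parsed hits: (unigene ID, E-value)
toy_blast_hits = [("uni1", 1e-30), ("uni2", 1e-3), ("uni3", 1e-12)]
toy_hmmer_hits = [("uni1", 1e-20), ("uni3", 1e-2), ("uni4", 1e-40)]

# A unigene is kept only if it passes the threshold in both searches.
toy_candidates = passing_ids(toy_blast_hits) & passing_ids(toy_hmmer_hits)
```

Requiring support from both searches trades some sensitivity for fewer false positives, which is why the candidates are still checked against SMART and CDD afterwards.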

For the MADS-box gene family, besides Arabidopsis thaliana and Oryza sativa MADS-box genes, we also downloaded MADS-box sequences homologous to those of Schisandra chinensis to construct the phylogenetic tree, from Asarum europaeum, Amborella trichopoda, Schisandra sphenanthera, Kadsura japonica, Illicium anisatum, Illicium floridanum, Illicium henryi, Illicium parviflorum, and Magnolia stellata. MADS-box genes in grapevine (VvMADS) have been well studied (Díaz-Riquelme et al. 2009). Therefore, we also used VvMADS genes in the phylogenetic analysis. For the YABBY gene family, reported YABBY sequences of Arabidopsis thaliana and Oryza sativa were used to build the phylogenetic tree. All protein sequences were aligned using MUSCLE, and trimAl was used to trim the alignments (Edgar 2004; Capella-Gutiérrez et al. 2009). IQ-TREE was used to construct the phylogenetic trees with 5000 bootstrap replicates (Nguyen et al. 2015). ModelFinder was used to search for the optimal substitution models for the phylogenetic analysis (Kalyaanamoorthy et al. 2017). The optimal models for the MADS-box and YABBY genes were LG + G4 and VT + G4, respectively. LG and VT are two amino acid replacement matrices (Le and Gascuel 2008; Vingron 2000). G denotes the Gamma model of rate heterogeneity across sites proposed by Yang (1994). LG + G4 denotes the LG matrix combined with a discrete Gamma model with four rate categories, and VT + G4 denotes the VT matrix combined with the same discrete Gamma model. The phylogenetic trees of MADS-box genes and YABBY genes were constructed in the TBtools toolkit and annotated with iTOL (Chen et al. 2020; Letunic and Bork 2021). The sources of all reference sequences are listed in Table S1.

Quantitative real-time PCR analysis

qPCR primers for the validation genes were designed with the Primer Premier 5 software (Singh et al. 1998). We selected the TUBA gene as the reference gene according to a previous study (Liu et al. 2022). Twenty genes were selected according to their functions to validate their expression patterns. AFO, YABBY2, INO, YABBY5-1, and YABBY5-2 belong to the YABBY gene family; AeAP3-2 and STK belong to the MADS-box gene family; and AIL-1 and ANT belong to the AP2 gene family. TT2, TT3, TT4, TT6-1, TT6-2, TT12, and MYB1 participate in flavonoid biosynthesis. GATA18 is involved in the development of female flowers, and GAMYB, CEP1, and MST4 participate in pollen development. Detailed information on the forward and reverse primers is listed in Table S2. All qPCR experiments were completed using three biological replicates with three technical replicates each. We used 1 μg of sample RNA as the template for reverse transcription, which was performed using the FastKing RT Kit with gDNase (Tiangen, Beijing, China) according to the instructions. Real-time qPCR was then performed with 1 μg cDNA and the Talent qPCR Premix Kit (SYBR Green) (Tiangen, Beijing, China) and analyzed on a Bio-Rad CFX96 Real-Time System. The procedure was initiated at 95 °C for 3 min, followed by 40 cycles of 95 °C for 5 s, 55 °C for 10 s, and 72 °C for 15 s. The relative expression levels were calculated using the 2^−ΔΔCt method (Livak and Schmittgen 2001).
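For readers unfamiliar with the 2^−ΔΔCt calculation, the following Python sketch shows the arithmetic on a single target gene. It assumes one reference gene and one calibrator sample, and the Ct values are invented for illustration; the function name is hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCt method (Livak and Schmittgen 2001)."""
    # dCt normalizes the target gene to the reference gene within each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # ddCt compares the sample of interest with the calibrator sample.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Made-up Ct values: the target amplifies 3 cycles earlier (relative to the
# reference gene) in the sample than in the calibrator, i.e. 2^3 = 8-fold.
toy_fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

The method assumes amplification efficiencies close to 100% for both the target and the reference gene, so that one cycle corresponds to a twofold difference.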

Protein–protein interaction prediction

To elucidate the candidate gene network related to the flower development and sex differentiation of S. chinensis, a protein–protein interaction (PPI) prediction was performed. The STRING database (https://cn.string-db.org/) was used for the PPI prediction (Szklarczyk et al. 2023). The coding sequences of 466 DEGs were translated into protein sequences and then uploaded to this database. Arabidopsis thaliana was selected as the reference species. The minimum confidence score was set to 0.4 (medium). The prediction result was exported to Cytoscape v3.9.1 to draw the PPI network (Shannon et al. 2003).
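The confidence filtering and the degree scores used to rank nodes in such a network can be sketched as follows. This is a simplified stand-in for what STRING and Cytoscape do, assuming the interactions are available as (protein A, protein B, combined score) tuples with scores on a 0 to 1 scale. The protein labels and scores are placeholders, not values from this study.

```python
from collections import Counter


def degree_scores(edges, min_score=0.4):
    """Count, for each protein, its interactions at or above the score cutoff."""
    degree = Counter()
    for protein_a, protein_b, score in edges:
        if score >= min_score and protein_a != protein_b:
            degree[protein_a] += 1
            degree[protein_b] += 1
    return degree


# Placeholder interactions: (protein, protein, combined score)
toy_edges = [
    ("P1", "P2", 0.62),
    ("P1", "P3", 0.55),
    ("P2", "P4", 0.45),
    ("P5", "P6", 0.99),
    ("P1", "P5", 0.20),  # below the medium-confidence cutoff, ignored
]
toy_degrees = degree_scores(toy_edges)
toy_hubs = [p for p, d in toy_degrees.items() if d == max(toy_degrees.values())]
```

A node with a high degree has many predicted partners, which is the sense in which hub genes are highlighted in a PPI network.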

Results

Results of transcriptome assembly and annotation

Eighteen cDNA libraries from male and female flowers of S. chinensis at three developmental stages were sequenced. After filtering adaptors and low-quality sequences, 125.51 Gb of clean data were obtained from the 18 samples. Each sample contained more than 6.03 Gb of clean data. The Q20 ratio of each sample was above 97.21%, and the Q30 ratio was above 92.46%. The GC content of the 18 samples ranged from 46.37 to 47.16% (Table S3). A total of 149,495 transcripts and 96,694 unigenes were obtained from the de novo assembly (Table S4). The N50 values of transcripts and unigenes were 1,473 bp and 1,352 bp, respectively. The BUSCO scores of transcripts and unigenes were 93.0% and 86.0%, respectively. The TransRate scores of transcripts and unigenes were 0.29% and 0.23%, respectively. These results indicated a reliable assembly.
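For reference, N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal Python sketch with made-up lengths (the function name is hypothetical):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half is the N50.
        if running >= half_total:
            return length


# Total = 1500, half = 750; 500 + 400 = 900 >= 750, so N50 = 400.
toy_n50 = n50([100, 200, 300, 400, 500])
```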

Clustering and heatmap analysis, combined with PCA, showed that female and male samples could be well separated (Fig. 2a, b). Samples at the first stage were separated from those at the second and third stages, indicating that samples at the second and third stages showed similar expression patterns. All expressed unigenes were compared with six major databases (GO, KEGG, eggNOG, NR, Swiss-Prot, and Pfam) (Fig. 2c). The largest number of unigenes was annotated in the NCBI NR database (37,782 unigenes, 39.07%), and the smallest number in the KEGG database (12,536 unigenes, 12.96%). Based on the species distribution analysis of the NR database, the top hit species was Tetracentron sinense (10.24%), which belongs to the Trochodendrales.

Fig. 2

Results of transcriptome assembly and annotation. a Clustering and heatmap analysis. The sample distances are represented by variations from dark blue (low similarity) to dark red (high similarity). b PCA for all 18 samples based on transcriptome-wide expression profiles. c The profile of annotation results based on six major databases (GO, KEGG, eggNOG, NR, Swiss-Prot, and Pfam)

DEGs analysis, GO, and KEGG enrichment analysis

To identify candidate genes that may be involved in the flower development and sex differentiation of S. chinensis, we compared the transcriptome profiles between F1 and M1, F2 and M2, and F3 and M3. The total numbers of DEGs in F1 vs. M1, F2 vs. M2, and F3 vs. M3 were 936, 7179, and 6890, respectively (Fig. 3a, b). A total of 466 DEGs were shared among the three comparisons.

Fig. 3

Results of DEG, GO, and KEGG enrichment analyses. a DEGs in S. chinensis flowers at three developmental stages. b Venn diagram showing overlaps between the sets of sex-biased genes (|log2(fold change)| > 1, p.adj < 0.05). c–e GO enrichment analysis of DEGs in c F1 vs. M1, d F2 vs. M2, and e F3 vs. M3. f–h KEGG enrichment analysis of DEGs in f F1 vs. M1, g F2 vs. M2, and h F3 vs. M3

GO enrichment analysis revealed 118, 198, and 190 significantly enriched GO terms (FDR < 0.05) at stages 1 to 3, respectively (Fig. 3c, d, e). At the first stage, cell fate commitment (GO:0045165) was the most significantly enriched term. At the second stage, nucleosome assembly (GO:0006334) was the most significantly enriched term. At the third stage, mitotic cell cycle phase transition (GO:0044772) was the most significantly enriched term.

KEGG enrichment analysis revealed that 4, 6, and 7 KEGG pathways were significantly enriched (FDR < 0.05) at stages 1 to 3, respectively (Fig. 3f, g, h). At the first stage, flavonoid biosynthesis (ko00941), fatty acid elongation (ko00062), DNA replication (ko03030), and phenylpropanoid biosynthesis (ko00940) were significantly enriched. At the second stage, ribosome (ko03010), DNA replication (ko03030), starch and sucrose metabolism (ko00500), lysine biosynthesis (ko00300), glycerolipid metabolism (ko00561), and flavonoid biosynthesis (ko00941) were significantly enriched. At the third stage, ribosome (ko03010), lysine biosynthesis (ko00300), glycerolipid metabolism (ko00561), phenylpropanoid biosynthesis (ko00940), DNA replication (ko03030), base excision repair (ko03410), and isoquinoline alkaloid biosynthesis (ko00950) were significantly enriched.

Possible MADS-box gene family members in S. chinensis flowers

Using the method described above, we obtained 51 MADS-box genes from S. chinensis and divided these genes into several clades according to previous studies (Fig. 4a, Table 1, S5) (Duan et al. 2022; Schilling et al. 2020; Ye et al. 2022). Most common MADS clades were found in S. chinensis, such as AP1 (class A), AP3/PI (class B), AG (class C), STK (class D), and SEP (class E), which act as floral organ identity genes (Ali et al. 2019). However, clades such as FLC, FUL, ANR, and SVP were not identified in this study. In contrast, the TM8 and OsMADS32 genes, which are not found in Arabidopsis, were found in S. chinensis (Wang et al. 2020a).

Fig. 4

Results of two gene families analyses. MADS-box gene family analysis: a Maximum likelihood tree of MIKC MADS-box genes. 42 MIKC genes were divided into 13 clades. b Heatmap of MIKC genes in S. chinensis female and male flowers at three developmental stages. YABBY gene family analysis: c Maximum likelihood tree of YABBY genes identified in S. chinensis flowers. d Heatmap of YABBY genes in S. chinensis female and male flowers at three developmental stages

Table 1 Information of 51 MADS-box genes in S. chinensis

Then, we analyzed differentially expressed MADS-box genes in male and female flowers (Fig. 4b). All B-class genes (AP3/PI) had higher expression in male flowers, similar to the results of previous studies (Ren et al. 2021; Wang et al. 2019c). One AG (class C) gene, two STK (class D) genes, and two Bsister genes were upregulated in female flowers. Most SEP (class E) genes showed an extremely low expression level, while we found one SEP gene and one AGL6 gene showed a higher expression level in the female flowers. MIKC* is another important clade of the MADS-box gene family (Liu et al. 2013). Our results suggested that MIKC* genes were mainly upregulated in male flowers.

Possible YABBY gene family members in S. chinensis flowers

We identified 10 YABBY genes from the transcriptome data (Table S6). Based on the classification of YABBY genes in Arabidopsis and rice, all YABBY clades were represented in S. chinensis (Fig. 4c, Table 2), although no YABBY3 gene was found.

Table 2 Information of 10 YABBY genes in S. chinensis

Expression analysis showed that most YABBY genes were upregulated in female flowers (Fig. 4d). DL and FIL showed high expression at the early flowering stage, and their expression decreased gradually as flowering progressed. YABBY2 genes had high expression in male and female flowers. The INO gene showed a female-specific expression pattern. Like DL and FIL, two YABBY5 genes had high expression at the first stage, while one YABBY5 gene showed high expression at all stages. Five YABBY genes were upregulated in female flowers at all stages.

Sex-biased genes of S. chinensis

The identification of sex-biased genes, which exhibit significantly higher expression in flowers of one sex than in the other, can help explore the sex determination mechanism (Harkess et al. 2015; Li et al. 2020). Four hundred and sixty-six DEGs were identified between female and male flowers at all three developmental stages, including 390 female-biased genes, 66 male-biased genes, and 10 unbiased genes (Fig. 5a). Among these genes, 52 transcription factors (TFs) were identified, including 42 female-biased TFs, eight male-biased TFs, and two unbiased TFs. The gene families with three or more members were MYB (7), bHLH (7), NAC (6), YABBY (5), LBD (4), GRF (4), AP2/ERF (4), and MADS (3). Notably, most members of the NAC gene family showed a male-biased expression pattern (Fig. 5b). The top five enriched GO terms were centromere complex assembly (GO:0034508), meiotic chromosome separation (GO:0051307), chromosome, centromeric region (GO:0000775), protein-DNA complex (GO:0097522), and nucleosome (GO:0000786) (Fig. 5c). These genes were significantly enriched in two pathways, DNA replication (ko03030) and flavonoid biosynthesis (ko00941) (Fig. 5d).
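The text does not spell out the rule used to assign the three categories, so the sketch below is only one plausible reading, stated here as an assumption: a gene is called female-biased if it is higher in female flowers at all three stages, male-biased if it is higher in male flowers at all three stages, and unbiased if the direction changes between stages. The function name, gene labels, and log2 fold changes (female vs. male) are hypothetical.

```python
from collections import Counter


def bias_class(log2_fcs_female_vs_male):
    """Classify a shared DEG by the direction of its bias across stages (assumed rule)."""
    if all(value > 0 for value in log2_fcs_female_vs_male):
        return "female-biased"
    if all(value < 0 for value in log2_fcs_female_vs_male):
        return "male-biased"
    return "unbiased"


# Made-up log2 fold changes at stages 1, 2, and 3
toy_genes = {
    "gene_a": (2.1, 3.4, 2.8),
    "gene_b": (-1.6, -2.2, -4.0),
    "gene_c": (1.3, -1.8, -2.5),  # direction flips between stages
}
toy_classes = {gene: bias_class(values) for gene, values in toy_genes.items()}
toy_counts = Counter(toy_classes.values())
```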

Fig. 5

Sex-biased genes of S. chinensis. a Different expression types of 466 DEGs, b TFs in 466 DEGs, c GO enrichment analysis of 466 DEGs, d KEGG enrichment analysis of 466 DEGs

Validation of DEGs using qRT-PCR analysis

To validate the gene expression revealed by the RNA-seq analysis, 20 DEGs were selected for validation experiments: ScAFO, ScYABBY2, ScINO, ScYABBY5-1, ScYABBY5-2, ScAeAP3-2, ScSTK, ScAIL-1, ScANT, ScGATA18, ScTT2, ScTT3, ScTT4, ScTT6-1, ScTT6-2, ScTT12, ScMYB1, ScGAMYB, ScCEP1, and ScMST4. These genes were reported to participate in the development of male and female flowers. The expression patterns of these genes in male and female flowers were similar between the RNA-seq and qPCR methods, with a coefficient of determination (R2) of 0.807 (Fig. S1), suggesting that the RNA-seq results were reliable. ScGAMYB, ScCEP1, and ScMST4 were downregulated in female flowers, while ScAFO, ScYABBY2, ScINO, ScYABBY5-1, ScYABBY5-2, ScAeAP3-2, ScSTK, ScAIL-1, ScANT, ScTT2, ScTT3, ScTT4, ScTT6-1, ScTT6-2, ScTT12, ScMYB1, and ScGATA18 were upregulated in female flowers (Fig. 6). ScYABBY2, ScINO, ScSTK, ScAeAP3-2, ScANT, ScMYB1, and ScGATA18 showed a female-specific expression pattern, while ScCEP1 and ScMST4 were only expressed in male flowers.
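How an agreement statistic of this kind is obtained can be illustrated with a short Python sketch. It assumes that R2 is the squared Pearson correlation between RNA-seq and qPCR log2 fold changes, which the text does not state explicitly; the function name and the paired values are invented.

```python
def pearson_r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy ** 2 / (s_xx * s_yy)


# Made-up log2 fold changes for four genes measured by both methods
toy_rnaseq = [1.0, 2.0, 3.0, 4.0]
toy_qpcr = [1.2, 1.9, 3.2, 3.9]
toy_r2 = pearson_r_squared(toy_rnaseq, toy_qpcr)
```

A value close to 1 means that the two methods rank and scale the fold changes similarly, which is the basis for treating the RNA-seq estimates as reliable.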

Fig. 6

qRT-PCR validation of DEGs. Male-biased genes included GAMYB, CEP1, and MST4. Female-biased genes included AFO, YABBY2, INO, YABBY5-1, YABBY5-2, AeAP3-2, STK, AIL-1, ANT, GATA18, TT2, TT3, TT4, TT6-1, TT6-2, TT12, MYB1. The y-axis indicates the expression level (2^−ΔΔCt). The error bars indicate the standard error. *: p < 0.05; **: p < 0.01; ****: p < 0.001

Protein–protein interaction prediction

A total of 76 DEGs with medium confidence (0.400) were filtered to construct the PPI network. These genes could be divided into four groups according to their functions and clusters, including genes related to the biosynthesis of flavonoids (TT2, TT4, TT5, TT12, etc.), genes related to stomata development (FAMA, STOMAGEN, TMM, SPCH, etc.), genes related to flower development (SUP, NTT, INO, BEL1, ANT, LFY, HEC3, etc.), genes related to DNA replication and cell cycle (MCM3, MCM5, HMGB6, etc.) (Fig. 7). Genes with darker colors showed higher degree scores, suggesting that these genes were more likely to participate in the sex differentiation of S. chinensis. YABBY genes including INO and AFO, MADS-box genes STK and TT16, flower development-related genes LFY, ANT, BEL1, SUP, and DNA replication-related genes including MCM3, MCM5, PCNA2, PRL, HTA6, and HMGB6 may play a key role in the flower development and sex differentiation of S. chinensis.

Fig. 7

Protein–protein interaction prediction. Seventy-six DEGs with medium confidence (0.400) were shown. The label with deep color represented genes with high degree scores, and the edge with thick lines suggested that these two genes had high combined scores

Discussion

MADS-box genes regulate the flower development of S. chinensis

MADS-box genes are strongly related to the formation of flowers. The C-class gene AG was reported to be a candidate gene involved in the sex differentiation of S. chinensis (Liu et al. 2022). To further explore MADS-box genes potentially involved in the flower development and sex differentiation of S. chinensis, we performed a gene family analysis based on the RNA-seq data.

In this study, we found that the expression patterns of B-, C-, and D-class genes in S. chinensis were similar to those in Arabidopsis. B-class genes including AP3 and PI control the identity of petals and stamens (Whipple et al. 2004). Five B-class genes in S. chinensis (ScMADS6, 8, 9, 10, 43) were upregulated in male flowers, indicating that B-class genes may regulate male organ development. The C-class gene AG regulates the identity of stamens and carpels (Yamaguchi et al. 2006). In S. chinensis, the AG gene (ScMADS30) was expressed in both male and female flowers. Consistent with the findings of Liu et al., its expression level was higher in female flowers than in male flowers (Liu et al. 2022). Therefore, we speculated that the AG gene ScMADS30 could be a candidate gene in the sex differentiation of S. chinensis.

D-class genes including STK and SHP1/2 are responsible for the ovule identity (Rodríguez-Cazorla et al. 2018). Two STK genes ScMADS16 and ScMADS17 had high expression in female flowers. Besides D-class genes, Bsister MADS-box genes also act as the key regulator in female organ development (Mizzotti et al. 2012). We identified two Bsister genes ScMADS38 and ScMADS39 in S. chinensis. They were homologs to AeAP3-2, the Bsister genes identified in Asarum europaeum (Kramer and Irish 2000). It was reported that AeAP3-2 could be involved in carpel and ovule development. We found that ScMADS38 and ScMADS39 were only expressed in female flowers. This result indicated that ScMADS38 and ScMADS39 were essential for female flower development and may participate in the sex differentiation of S. chinensis. AG, STK, and AeAP3-2 mutually regulated the development of female organs (Pinyopich et al. 2003). AG and STK genes are responsible for the specification of ovule identity (Pelayo et al. 2021). STK and AeAP3-2 genes are essential for female gametophyte development (Mizzotti et al. 2012). ScMADS16, ScMADS17, and ScMADS38 were significantly upregulated in female flowers. Overall, we inferred that C, D-class genes, and Bsister MADS-box genes may be crucial for the female flower development of S. chinensis.

E-class genes including SEP1, SEP2, SEP3, SEP4, and AGL6 are involved in the development of sepals, petals, stamens, carpels, and ovules (Morel et al. 2019). The expression analysis showed that most of the E-class genes identified had very low expression in flowers, and none of these genes were significantly upregulated. Four E-class genes, ScMADS20 (AGL6), ScMADS37 (SEP1), ScMADS45 (SEP2), and ScMADS46 (SEP3), had high expression in flowers during the flowering process. In Arabidopsis thaliana, SEP genes control the specification of stamen identity together with the B-class genes (AP3/PI) and the C-class gene (AG). ScMADS37 showed no obvious biased expression in flowers. ScMADS45 and ScMADS46 were downregulated in female flowers. Considering that the B-class genes were upregulated in male flowers and the C-class gene was upregulated in female flowers, we speculated that SEP2 and SEP3 may regulate the formation of stamens together with B-class genes. Besides SEP genes, we also identified an AGL6 gene, ScMADS20, which exhibited higher expression in female flowers. In petunia (Petunia hybrida), the AGL6 gene had higher expression in carpels than in stamens and may participate in ovule development (Rijpkema et al. 2009). We therefore suggest that the AGL6 gene may participate in pistil development in S. chinensis.

Besides ABCDE-class genes, we also identified five MIKC*-type genes (ScMADS12, 13, 18, 19, 35). MIKC* genes form another important clade of MIKC genes. In Arabidopsis thaliana, MIKC* genes include AGL30, AGL65, AGL66, AGL103, and AGL104. MIKC* genes are enriched in mature pollen, and their mutation can lead to reduced male fertility. In rice, MIKC* genes were specifically expressed in late pollen development (Adamczyk et al. 2007). The four MIKC* genes identified in S. chinensis other than ScMADS12 had high expression in male flowers and almost no expression in early flower development, suggesting that MIKC* genes may influence the development and maturation of pollen. ScMADS12 had higher expression than the other MIKC* genes in flowers and was significantly upregulated in female flowers at the late flowering stage. This unexpected result suggests that the role of ScMADS12 needs further investigation.

Based on our results, we tend to agree that the B-, C-, and D-class genes may play similar roles in the flower development of S. chinensis as in other plants. B-class genes participate in the formation and development of stamens, and C-class and D-class genes are responsible for the development of pistils. In addition, the Bsister genes may play essential roles in the regulation of ovules. E-class genes including SEP-like genes and AGL6-like genes may regulate flower development together with B-, C-, and D-class genes. MIKC* genes are most likely to participate in the development of pollen. Combined with the expression patterns, we speculated that ScMADS16, ScMADS17, and ScMADS38 may participate in the sex differentiation of S. chinensis.

YABBY genes are essential for the female flower development in S. chinensis

The YABBY family is plant-specific and characterized by a C2-C2 zinc finger domain at the N-terminus and a YABBY domain at the C-terminus (Bowman 2000). YABBY family genes regulate the development of lateral organs and abaxial-adaxial polarity (Siegfried et al. 1999). The YABBY gene family can be divided into five clades: CRC/DL, YABBY1/YABBY3, YABBY2, INO, and YABBY5 (Buttar et al. 2020). Accumulated evidence indicates that YABBY genes may be associated with plant sex determination. YABBY1 (FIL/AFO), YABBY2, and YABBY5 may play an important role in the sex determination of spinach (Li et al. 2020). INO may be related to the sex differentiation and determination of female organs in Tapiscia sinensis (Xin et al. 2019). Considering the vital role YABBY genes play in the sex determination of flowers, we also performed a gene family analysis to identify possible YABBY genes in S. chinensis based on the transcriptome data.

In our study, we identified ten YABBY genes. Compared with the YABBY genes in Arabidopsis, the CRC gene and the YABBY3 gene could not be found. However, we identified a DL gene, the rice homolog of CRC. Analysis of the expression patterns showed that most YABBY genes in S. chinensis had higher expression levels in female flowers, and five YABBY genes, namely ScFIL-2, ScYABBY2-2, ScINO, ScYABBY5-1, and ScYABBY5-2, were significantly upregulated in female flowers during all three developmental stages. DL is essential for the specification of carpels in rice (Yamaguchi et al. 2004), and mutation of DL can lead to the formation of ectopic stamens (Nagasawa et al. 2003). ScDL had higher expression in female flowers than in male flowers, indicating that ScDL may be associated with carpel development in S. chinensis.

The YABBY1 gene, also called AFO or FIL, controls the determination of floral meristem identity and the development of the floral meristem (Sawa et al. 1999). The cucumber homolog of YABBY1 had a high expression level at all ovary developmental stages (Yin et al. 2022). YABBY1 may be a candidate gene for gynoecium development in Silene latifolia (Bačovský et al. 2022). Three YABBY1 genes, ScFIL-1, ScFIL-2, and ScFIL-3, were identified in this study. ScFIL-1 and ScFIL-2 were highly expressed only in female flowers at the early flowering stage, implying that these genes may be involved in early female flower development. ScFIL-3 had a high expression level in male flowers at the late flowering stage. Overall, we suggest that the YABBY1 genes show a female bias. The YABBY2 gene is expressed in the abaxial tissue of the carpel in Arabidopsis and Amborella trichopoda (Siegfried et al. 1999; Yamada et al. 2004). Two YABBY2 genes, ScYABBY2-1 and ScYABBY2-2, had higher expression in male and female flowers, respectively, indicating that YABBY2 genes could be required for the flower development of S. chinensis.

INO demonstrated a female-limited expression pattern in S. chinensis flowers. In Arabidopsis thaliana, INO is crucial for the formation of the outer integument of ovules (Skinner et al. 2023). In pomegranates (Punica granatum), the expression level of INO is higher in bisexual flowers than in male flowers, suggesting that INO may potentially regulate the differentiation and development of ovules (Chen et al. 2017). In Tapiscia sinensis, INO acts as an essential regulatory factor in pistil development (Xin et al. 2019). The INO gene is crucial to female flower development, and we infer that ScINO may regulate the gynoecium development of S. chinensis.

YABBY5 is involved in the formation of spikelets in rice (Dubos et al. 2010). In spinach (Spinacia oleracea), YABBY5 acted as a pivotal gene in the development of female flowers (Li et al. 2020). The three ScYABBY5 genes showed higher expression in female flowers than in male flowers and were not expressed in male flowers at the last two stages. This implies that ScYABBY5 genes are important for the development of both male and female flowers at the early stage and are more closely associated with female flower development in S. chinensis. Combining the results of the RNA-seq analysis and RT-qPCR, we speculate that ScFIL-2, ScYABBY2-2, ScINO, ScYABBY5-1, and ScYABBY5-2 could serve as key regulators of gynoecium development and as candidate genes in the sex differentiation of S. chinensis.

Multiple transcription factors contribute to the flower development and may be related to the sex differentiation of S. chinensis

In addition to MADS-box and YABBY transcription factors, we also identified genes belonging to other gene families that may be related to the sex differentiation of S. chinensis. Among the 466 DEGs significantly expressed during the flowering stages, transcription factors accounted for ~11%. Among these transcription factors, the MYB gene family had the most members. Because the MYB family is large, functionally diverse, and represented in all eukaryotes, its role in the flower development of S. chinensis cannot be ignored (Dubos et al. 2010). MYB transcription factors appeared to be highly expressed in both male and female flowers. GAMYB and DIV had higher expression in male flowers, while RAD had higher expression in female flowers. GAMYB, a component of the GA signaling pathway, contributes to anther development. GAMYB regulates early anther development and is essential for the formation of exine and Ubisch bodies in anthers (Alonso-Peral et al. 2010; Aya et al. 2009). GAMYB can bind specific DNA motifs of GA-upregulated anther genes and regulate their expression. The DIV and RAD genes are responsible for the dorsoventral asymmetry of flowers in Antirrhinum majus (Galego and Almeida 2002; Sengupta and Hileman 2022). In snapdragons, RAD and DIV are expressed in ovaries and developing fruit. In S. chinensis, however, DIV was downregulated in female flowers, indicating that DIV might influence the development of stamens. In hexaploid persimmon, ectopic overexpression of RAD could result in gynoecium hypergrowth. In Plukenetia volubilis, two RAD-like genes, RAD1 and RAD2, had higher expression levels in female flowers than in male flowers (Fu et al. 2018; Masuda et al. 2022). Similarly, one RAD-like gene, ScRAD1, appears to be expressed exclusively in female flowers. These results suggest that ScRAD1 could be involved in gynoecium development and might serve as a candidate gene in the sex differentiation of S. chinensis.

Basic helix-loop-helix (bHLH) proteins, another large family of transcription factors, also play a role in the flower development of plants. Liu et al. found that bHLH transcription factors including SPT, HEC, DYT1, and AMS could be involved in the sex differentiation and determination of S. chinensis (Liu et al. 2022). This study identified a HEC-like gene, ScHEC3, that was upregulated in female flowers. In Arabidopsis, the HEC3 gene contributes to gynoecium development. In Plukenetia volubilis, the HEC-like genes HEC1 and HEC2 have higher expression in female flowers (Gremski et al. 2007; Pérez-Mesa et al. 2020). ScHEC3 was expressed only in female flowers, suggesting that ScHEC3 may be crucial to gynoecium development and possibly associated with the sex differentiation of S. chinensis.

Besides HEC3, we also identified many transcription factors related to gynoecium development. The SUP and NTT genes, both from the C2H2 zinc finger gene family, are essential for the development of female flowers. In Arabidopsis thaliana, SUP controls the determinacy of the floral meristem and is required for normal ovule development (Sakai et al. 1995, 2000). SUP could regulate the expression of B-class genes including AP3 and PI, and a mutation of SUP might lead to the development of extra stamens (Sakai et al. 2000). Accumulated research shows that SUP may be involved in plant sex differentiation (Bačovský et al. 2022; Fu et al. 2018; Khadka et al. 2019). NTT, specifically expressed in the transmitting tract, is required for normal differentiation of the ovary transmitting tract cells and for pollen tube growth (Crawford et al. 2007). ANT, an AP2 family gene, is essential to gynoecium development (Klucher et al. 1996). ANT could regulate the autoinduction and expression of INO (Baker et al. 1997). GRF4 and GRF6, members of the growth-regulating factor (GRF) family, act as regulators involved in plant hormone signaling and metabolism (Zhang et al. 2022c). In rice, GRF genes were initially identified as gibberellin-induced (Lu et al. 2021). GRF genes are involved in gynoecium development in persimmon (Yang et al. 2019). WOX1 and WOX9, members of the WUSCHEL-related homeobox (WOX) family, are crucial for ovule development. WOX1 may regulate the outgrowth of ovules, and WOX9 is required for the correct patterning and curvature of the ovule in Arabidopsis. WOX9 could regulate the expression of INO and is involved in the transcriptional activation of a subset of cytokinin response factors (Niu et al. 2018; Petrella et al. 2022). CUC3, a NAC family gene, participates in ovule development in Arabidopsis (Gonçalves et al. 2015). GATA18 is essential to the gynoecium development of Silene latifolia and can modulate cytokinin homeostasis in organ boundaries by regulating CKX3 expression (Bačovský et al. 2022). These transcription factors showed female-specific expression patterns, indicating that they might play a key role in the gynoecium development of S. chinensis.

Auxin and cytokinin participate in the female flower development of S. chinensis

Accumulated evidence indicates that many genes involved in phytohormone biosynthesis and signaling pathways act as sex determination factors in many plants. Even though the plant hormone signal transduction pathway (ko04075) was not significantly enriched, several phytohormone-related genes were still identified among the 466 DEGs, including eight auxin-related genes and two cytokinin-related genes.

Auxin, essential for the initiation and development of flowers (Yamaguchi et al. 2013), has been observed to play a vital role in the sex determination of many plants, such as Carica papaya, Jatropha curcas, and Spinacia oleracea (Li et al. 2020; Xu et al. 2016; Zhou et al. 2019). Auxin-related genes, including auxin efflux carrier components 1 and 6 (PIN1 and PIN6), auxin-induced protein 6B (AX6B), auxin transporter-like protein 2 (LAX2), YUCCA2, auxin-responsive protein SAUR32, auxin-responsive protein IAA7 (IAA7), and VAN3-binding protein (VAB), had higher expression in female flowers. Liu et al. pointed out that auxin may contribute to the development of male flowers, while our study suggests that auxin also participates in the female flower development of S. chinensis (Liu et al. 2022).

Cytokinin is another common phytohormone. Exogenous application of cytokinin and its analogs can promote the formation of female flowers in many species, such as Castanea henryi, Sapium sebiferum, and Plukenetia volubilis (Luo et al. 2020; Wu et al. 2022; Ni et al. 2018). Cytokinin-degrading cytokinin oxidase/dehydrogenase (CKX) is essential for the female cone development of Dacrydium pectinatum (Wang et al. 2022). CKX3 can regulate the activity of the reproductive meristems of Arabidopsis thaliana (Bartrina et al. 2011). CYP735A2 is a cytochrome P450 monooxygenase (P450) that catalyzes the biosynthesis of trans-zeatin. CKX3 and CYP735A2 were upregulated in female flowers. Considering both function and expression pattern, we infer that auxin and cytokinin may contribute to female flower development in S. chinensis.
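The observation that hormone-related DEGs were present although ko04075 was not significantly enriched can be illustrated with a hypergeometric over-representation test, which is a common basis for KEGG enrichment analysis. The sketch below is only an illustration under stated assumptions: the background size and pathway size are hypothetical placeholders, not values from this study, and the ten hormone-related DEGs are treated as if all were annotated to the pathway, which the text does not state.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_hits):
    """Upper-tail hypergeometric probability P(X >= n_hits).

    n_degs genes are drawn without replacement from n_background genes,
    of which n_pathway are annotated to the pathway of interest.
    """
    upper = min(n_pathway, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 466 DEGs and 10 hormone-related DEGs come from the text.
    # The background (20000 genes) and pathway size (300 genes) are
    # hypothetical values chosen only to make the example runnable.
    p = hypergeom_enrichment_p(
        n_background=20000, n_pathway=300, n_degs=466, n_hits=10
    )
    print(f"over-representation p-value: {p:.3f}")
```

With a large pathway, a handful of DEGs can be close to the number expected by chance, which is why individual hormone-related genes may still be worth examining even when the pathway as a whole is not enriched.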

MCM proteins may participate in the ovule development of S. chinensis

DNA unwinding is the process in which double-stranded DNA (dsDNA) is separated into two single-stranded DNAs (ssDNA), and it is catalyzed by DNA helicase. In eukaryotes, the DNA helicase consists of six minichromosome maintenance (MCM) proteins, MCM2, MCM3, MCM4, MCM5, MCM6, and MCM7, which are highly conserved and form a hexamer (Zhang et al. 2020). Accumulated evidence indicates that MCM proteins may participate in the development and sex determination of flowers. Heterozygous MCM7 mutants showed ovule abortion in Arabidopsis thaliana (Herridge et al. 2014). The MCM genes MCM2 and MCM6 were reported to play a significant role in sex determination in Cucumis sativus (Wang et al. 2019b). MCM2 and MCM6 also participated in the female flower development of Phellodendron amurense (He et al. 2023). Four DNA helicase genes, MCM3, MCM5, MCM6, and MCM7, showed higher expression in female flowers and were expressed in male flowers only at the early flowering stage. We suggest that MCM genes might be involved in the development of ovules in S. chinensis.

Flavonoid biosynthesis pathway is essential for the male and female gametophytes of S. chinensis

KEGG analysis of the 466 DEGs showed that eight DEGs were significantly enriched in the flavonoid biosynthesis pathway (ko00941), suggesting that flavonoid metabolism might be a key difference between female and male flowers in S. chinensis. Besides these enriched genes, we also identified three other genes related to flavonoid biosynthesis, MYB1, MYB123 (TT2), and DTX41 (TT12). Among these genes, DFRA (TT3), CHS (TT4), FL3H (TT6), and TT12 had high expression in both male and female flowers (TPM > 10). Accumulated evidence shows that flavonoids play a part in pollen development and are crucial for male fertility. In Arabidopsis, the DFRA-like gene DRL1 is essential for male fertility, and mutation of DRL1 can lead to impaired pollen formation (Tang et al. 2009). TT4, which encodes a key enzyme in the flavonoid biosynthesis pathway, plays a key role in the male fertility of rice (Wang et al. 2020b). The expression of TT4 was strongly inhibited in cytoplasmic male sterile (CMS) Raphanus sativus (Yang et al. 2008).

In addition to their potential roles in male fertility, flavonoids are also related to the development of female flowers. In Prunus mume, genes involved in flavonoid biosynthesis exhibited lower expression in abnormal pistils (Shi et al. 2012). The content of flavonoids was higher in Broussonetia papyrifera female flowers (Jiao et al. 2022). In our study, genes related to flavonoid biosynthesis were significantly upregulated in female flowers, suggesting that flavonoids may be more closely associated with the development of female flowers (Xia et al. 2021). Flavonoids are essential to the development of seeds and the control of seed size. Seed formation, a key process in plant reproduction, starts with megagametophyte development in the ovule (Ehlers et al. 2016). The endothelium differentiates from the inner layer of the integument in the ovule. The formation of the endothelium is one of the basic features of ovule development in tomatoes (Chaban et al. 2019). Proanthocyanidins (PAs), a type of flavonoid in Arabidopsis, accumulate in the endothelium (Xu et al. 2014). PAs are essential to ovule and seed development and important for female fertility. The TT12 gene acts as a flavonoid/H+-antiporter that controls the vacuolar sequestration of flavonoids in the seed coat endothelium (Debeaujon et al. 2001). The TT16 gene encodes the ARABIDOPSIS BSISTER (ABS) MADS-box domain protein and, together with STK, is required for endothelium formation. TT16 is expressed mainly in ovules and is necessary for proanthocyanidin accumulation in the endothelium of the seed coat (Nesi et al. 2002). MYB1 acts as a transcription activator involved in the regulation of anthocyanin biosynthesis, activating genes such as TT3 (Liu et al. 2017). TT2 might activate the promoters of genes encoding the dedicated enzymes for anthocyanin biosynthesis (Wang et al. 2019a).

Considering the function and expression pattern of the genes related to flavonoid biosynthesis, we speculate that flavonoids might be related to the development of female gametophytes in S. chinensis. However, owing to the limitations of our study, which specific type of flavonoid plays a crucial role in S. chinensis female gametophytes needs further exploration.
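The two expression criteria used in this section, high expression in both sexes (TPM > 10) and upregulation in female flowers, can be made concrete with a small sketch. It is a simplified illustration rather than the pipeline used in this study: the log2 fold-change cutoff of 1, the pseudocount, and the replicate TPM values are all assumptions introduced here, and a real DEG call would also require a statistical test.

```python
import math


def mean(values):
    return sum(values) / len(values)


def classify_expression(tpm_female, tpm_male, min_tpm=10.0,
                        min_log2fc=1.0, pseudocount=1.0):
    """Return (high_in_both, bias, log2fc) for one gene.

    high_in_both: mean TPM exceeds min_tpm in both sexes (TPM > 10 in the text).
    bias: 'female', 'male', or 'unbiased', based on an assumed |log2FC| cutoff.
    """
    female = mean(tpm_female)
    male = mean(tpm_male)
    # The pseudocount avoids division by zero for genes silent in one sex.
    log2fc = math.log2((female + pseudocount) / (male + pseudocount))
    high_in_both = female > min_tpm and male > min_tpm
    if log2fc >= min_log2fc:
        bias = "female"
    elif log2fc <= -min_log2fc:
        bias = "male"
    else:
        bias = "unbiased"
    return high_in_both, bias, log2fc


if __name__ == "__main__":
    # Hypothetical replicate TPM values, not measurements from this study.
    example = {"geneA": ([40.0, 44.0, 42.0], [12.0, 11.0, 13.0])}
    for gene, (female, male) in example.items():
        print(gene, classify_expression(female, male))
```

A gene such as the hypothetical geneA would fall into the category described above: expressed above the TPM threshold in both sexes, yet female-biased.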

Protein interaction network reveals potential genes involved in the sex differentiation of S. chinensis

In this study, 76 DEGs with medium confidence were selected to construct the PPI network. These genes are associated with flower development, DNA replication, flavonoid biosynthesis, and stomata development and may participate in the sex differentiation of S. chinensis. Genes related to flower development, including LFY, PIN1, and CKX3, are crucial for the initiation of flowers and the early development of the gynoecium (Zúñiga-Mayo et al. 2019). Two MADS-box genes, STK and TT16, are essential for ovule and endothelium formation in Arabidopsis thaliana (Ehlers et al. 2016). Three YABBY genes, AFO, INO, and YABBY5, are essential to flower development. INO may interact with BEL1, SUP, HEC3, LFY, ANT, STIP, and CUC3 to participate in the development of female organs (Villanueva et al. 1999). Flavonoids are essential to the normal development of male and female gametophytes. The flavonoid biosynthesis-related genes, including DFR, F3H, MYB1, TT2, TT4, TT5, TT12, and TT16, might participate in the normal gametophyte development of S. chinensis. Free nuclear mitosis of megagametophytes is vital for normal ovule development. The abnormal expression of genes related to DNA replication, such as MCM2–7, could lead to cell cycle arrest and ovule abortion (Zhang et al. 2020). DNA replication–related genes including MCM3, MCM5, PCNA2, PRL, HTA6, and HMGB6 may play a role in the ovule development of S. chinensis. Stomata are essential for photosynthesis in Arabidopsis. Besides leaves, floral organs are also photosynthetically active (Brazel et al. 2023). Fruit stomata are important for seed development (Lugassi et al. 2020). Nevertheless, research on the relationship between stomata and flower sex differentiation is limited, so the role stomata may play in the sex differentiation of S. chinensis needs further investigation.
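The filtering step behind such a network can be sketched as follows. This is a minimal illustration, assuming that interactions carry a STRING-style combined score between 0 and 1 and that medium confidence corresponds to a score of at least 0.4; the text does not name the database or the cutoff. The gene pairs reuse names from this section, but the scores are placeholders, not results.

```python
from collections import defaultdict

MEDIUM_CONFIDENCE = 0.4  # assumed STRING-style cutoff, not stated in the text


def build_network(edges, min_score=MEDIUM_CONFIDENCE):
    """Keep interactions scoring at least min_score; return an adjacency map."""
    network = defaultdict(set)
    for gene_a, gene_b, score in edges:
        if gene_a != gene_b and score >= min_score:
            network[gene_a].add(gene_b)
            network[gene_b].add(gene_a)
    return dict(network)


def rank_by_degree(network):
    """Order genes by number of partners (ties broken alphabetically)."""
    return sorted(network, key=lambda gene: (-len(network[gene]), gene))


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    edges = [
        ("INO", "ANT", 0.7),
        ("INO", "SUP", 0.5),
        ("INO", "HEC3", 0.3),  # below the cutoff, so it is dropped
        ("STK", "TT16", 0.9),
        ("MCM3", "MCM5", 0.95),
    ]
    network = build_network(edges)
    for gene in rank_by_degree(network):
        print(gene, sorted(network[gene]))
```

Ranking genes by their number of retained partners is one simple way to highlight candidate hubs; other centrality measures could be substituted.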

Conclusion

Our study identified 466 DEGs that may be involved in the flower development and sex differentiation of S. chinensis. The KEGG enrichment analysis showed that the flavonoid biosynthesis and DNA replication pathways were strongly related to the development of S. chinensis female flowers. MADS-box and YABBY gene family analyses provided insight into potential genes that influence flower development. The development of the stamen could be regulated by B, C, and E-class genes, while C, D, and E-class genes along with Bsister genes could be crucial to the development of the gynoecium. The majority of YABBY genes exhibited increased expression in female flowers, suggesting that YABBY genes may be vital to the development of female flowers. In addition, auxin and cytokinin were thought to be required for the formation of ovule primordia. Combined with the quantitative real-time PCR validation results, AFO, INO, YABBY2, YABBY5-1, YABBY5-2, AeAP3-2, STK, AIL-1, ANT, GATA18, TT2, TT3, TT4, TT6-1, TT6-2, TT12, MYB1, GAMYB, CEP1, and MST4 were preliminarily considered to be key genes for flower development and candidates for the sex differentiation of S. chinensis. In summary, we have proposed a potential model for sex differentiation in S. chinensis (Fig. 8), which could contribute to understanding the sex differentiation mechanism.

Fig. 8

A proposed model of sex differentiation in S. chinensis. Genes in blue are male-specific. Genes in green are female-specific. Genes in orange had high expression in both male and female flowers and were upregulated in female flowers. Genes in grey were partially upregulated in male and female flowers