Introduction

Transcription factors (TFs) play vital roles in numerous biological activities by binding to specific DNA elements of their target genes. Nuclear factor Y (NF-Y) is a transcription factor composed of three subunits: NF-YA (HAP2 or CBF-B), NF-YB (HAP3 or CBF-A), and NF-YC (HAP5 or CBF-C) (Bucher and Trifonov 1988). NF-Y is widely found in fungi, animals, and plants, where it specifically binds to the CCAAT-box cis-element in eukaryotic promoters (Nardini et al. 2013). The NF-YB and NF-YC subunits form dimers in the cytoplasm and then bind the NF-YA protein to form a trimer in the nucleus (Tom et al. 2013).

In many mammals and yeasts, a single gene encodes each of the three NF-Y subunits (Li et al. 1992); in mice and humans, for example, each subunit is encoded by only one gene. In plants, however, each NF-Y subunit is encoded by multiple genes (Zhao et al. 2016). For example, in Arabidopsis thaliana, the NF-YA, NF-YB, and NF-YC subunit families have 10, 13, and 13 members, respectively (Siefers et al. 2009). In rice (Oryza sativa), NF-YA is encoded by 10 genes, NF-YB by 11 genes, and NF-YC by 7 genes (Thirumurugan et al. 2008). Likewise, in soybean (Glycine max), 21, 32, and 15 members of the three subfamilies have been identified, respectively (Quach et al. 2015). The NF-Y genes of citrus (Pereira et al. 2018), walnut (Quan et al. 2018), tomato (Li et al. 2016), foxtail millet (Feng et al. 2015), sorghum (Malviya et al. 2016), and maize (Zhang et al. 2016) have also been studied, and these families are likewise expanded in plants. This expansion of the NF-Y gene family might contribute to the formation of a well-established TF network that regulates plant growth and development (Calvenzani et al. 2012; Petroni et al. 2012).

Recent studies have revealed that NF-Y genes play a key role in plant responses to various abiotic and biotic stresses. Overexpression of AtNF-YA5 and AtNF-YB1 generally improves plant growth under drought stress (Li et al. 2008), whereas nfya5 knockout plants show the opposite response and are more sensitive to drought stress. In castor bean, cold stress significantly up-regulated the expression of RcNF-YC6 (Wang et al. 2018). In soybean, the expression of GmNF-YA3 was induced by various stress treatments (Ni et al. 2013). These studies indicate that the NF-Y gene family participates in various regulatory processes in response to abiotic stress in plants, with functional differences among members.

Many NF-Y genes have been reported to be involved in growth and developmental processes, such as embryogenesis (Fornari et al. 2013), seed germination (Liu et al. 2016), photoperiod or age-dependent flowering (Kumimoto et al. 2010; Wei et al. 2017), and fruit maturation (Li et al. 2016). NF-Y genes are also involved in plant physiological processes, including photomorphogenesis (Myers et al. 2016), regulation of nodulation during nitrogen fixation (Quach et al. 2015), and stress responses of the endoplasmic reticulum (ER) (Liu and Howell 2010).

Numerous studies have reported that NF-Y genes are involved in the regulation of flowering time and flower differentiation, implying that NF-Y genes play a particularly important role in flower development (Su et al. 2018; Wei et al. 2017). Poplar is an economically important tree that is highly valued for bioenergy and timber production (Ji et al. 2013), and extensive research has focused on its traditional breeding. Owing to its rapid growth and ease of transformation, P. trichocarpa is considered an ideal tree species for basic and applied research (Yan et al. 2012). Taking advantage of functional genomics studies of plant NF-Y genes and the current genomic data for P. trichocarpa, we analyzed the NF-Y gene family of P. trichocarpa. In addition, the urban environmental problems caused by female poplar catkins in spring (such as traffic accidents, fire risk, and the spread of bacteria and viruses) have attracted much attention from researchers. Meanwhile, the pollen in male floral buds is allergenic and poses a potential health hazard to allergic people (An et al. 2011). Therefore, clarifying flower development in poplar is of great importance. In this study, we identified 46 NF-Y genes in the P. trichocarpa genome and performed a bioinformatics analysis covering conserved regions, phylogenetic relationships, gene structure, and chromosomal localization. Moreover, we analyzed the expression levels of NF-Y genes in five tissues and at seven flower developmental stages using transcriptome data and validated these expression levels by qRT-PCR, which is particularly important for identifying candidate genes involved in regulating the growth and flowering of P. tomentosa. Taken together, our results contribute to a more complete understanding of the function of NF-Y genes in poplar.

Materials and methods

Identification of PtNF-Y gene family members

Using the classification criteria for NF-Y TFs in A. thaliana (Petroni et al. 2012), we obtained 36 AtNF-Y amino acid (AA) sequences from The Arabidopsis Information Resource (Swarbreck et al. 2008). These sequences comprised 10 NF-YA, 13 NF-YB, and 13 NF-YC subunits. To identify the P. trichocarpa NF-Y family members, the AA sequences of the Arabidopsis NF-Y members were used as queries to search Phytozome v12.1 (Goodstein et al. 2012). We selected all sequences with an E-value < 10^-10 for subsequent analysis. In addition, the SMART tool (http://smart.embl-heidelberg.de/) (Zhang et al. 2019) was used to identify NF-Y protein domains. The candidate NF-Y members of P. trichocarpa obtained from this combined screening were used in subsequent analyses.
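The E-value screening step can be illustrated with a short sketch. It assumes that the search hits are available as a tab-separated table in the common BLAST tabular layout (subject ID in column 2, E-value in column 11); the export format, the function name, and the example identifiers are hypothetical and are not taken from the study.

```python
E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def filter_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Return the set of subject IDs whose E-value is below the cutoff.

    Assumes a BLAST-style tabular layout: column 2 holds the subject ID
    and column 11 holds the E-value (an assumption about the export).
    """
    kept = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        subject_id = fields[1]
        e_value = float(fields[10])
        if e_value < cutoff:
            kept.add(subject_id)
    return kept


# Hypothetical example rows; identifiers and scores are placeholders.
example_rows = [
    "AtNF-YA1\tgene_001\t78.2\t120\t26\t0\t1\t120\t5\t124\t3e-45\t160",
    "AtNF-YA1\tgene_002\t35.0\t60\t39\t0\t1\t60\t10\t69\t2e-04\t40.1",
]
candidates = filter_hits(example_rows)
```

Candidates retained by such a filter would still need the domain check described above before being accepted as family members.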

Chromosomal distribution and structure analysis of PtNF-Y

The PtNF-Y genes were mapped to the chromosomes using MapInspect software (Zhang et al. 2019). The gene structures of PtNF-Y members were analyzed using the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) (Hu et al. 2015).

NF-Y protein alignment and phylogenetic prediction

We obtained the genomic sequences, coding sequences (CDS), and protein sequences of PtNF-Y members from Phytozome v12.1. Multiple sequence alignment of NF-Y proteins was performed using Clustal X2.1 (Larkin et al. 2007). Based on the alignments, genetic distance matrices were calculated, and an unrooted phylogenetic tree was generated in MEGA7 using the neighbor-joining (NJ) method from the NF-Y protein sequences of P. trichocarpa and Arabidopsis. Branch support was estimated with 1000 bootstrap replicates (Kumar et al. 2016).
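As an illustration of how a pairwise distance matrix can be derived from aligned protein sequences before neighbor-joining, the sketch below computes the simple p-distance (proportion of differing sites, ignoring gapped positions). The distance model actually used in MEGA7 is not stated in the text, so p-distance is an assumption chosen only for clarity, and the toy sequences are invented.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues over sites where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # pairwise deletion of gapped sites
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(aligned):
    """Return {(name_i, name_j): p-distance} for every ordered pair of sequences."""
    names = list(aligned)
    return {(m, n): p_distance(aligned[m], aligned[n]) for m in names for n in names}


# Invented toy alignment, not real NF-Y sequences.
toy_alignment = {"seq1": "MKRA-LV", "seq2": "MKRACLV", "seq3": "MRRA-IV"}
toy_matrix = distance_matrix(toy_alignment)
```

A matrix of this kind is the input that a neighbor-joining implementation clusters into a tree; bootstrap support is then obtained by resampling alignment columns and repeating the procedure.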

Plant material

Adult P. tomentosa trees were grown under natural conditions in the Beijing Forestry University greenhouse (Beijing, China). Root, stem, and leaf tissues were obtained from 6-month-old cultured plantlets. Dormant and germinating leaf buds were collected during the flower transition period in 2017. Male and female floral buds of P. tomentosa were collected from June 2017 to February 2018 at Beijing Forestry University. These flower buds covered seven stages of flower development, from flower induction to flower initiation and organ differentiation. All collected samples were immediately frozen in liquid nitrogen and then stored at −80 °C until use.

Transcriptome sequencing and de novo assembly

RNA-seq libraries were generated using the Illumina kit according to the manufacturer’s protocol (Illumina, San Diego, CA, USA). Equal amounts of RNA from each sample were pooled and sequenced separately on the Illumina HiSeq™ 2000 platform using paired-end technology. The assembly method used was described previously (Chen et al. 2018). The expression levels of PtoNF-Ys were calculated as fragments per kilobase of transcript per million mapped reads (FPKM) (Cole et al. 2010) to determine differences in expression among NF-Y members in various organs and tissues, and heat maps were constructed using TBtools software. Hierarchical cluster analysis was conducted to visualize the expression patterns of NF-Y genes in different P. tomentosa organs and tissues.
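For reference, the FPKM normalization mentioned above can be written as a one-line calculation. The sketch below uses the standard definition (fragments mapped to a transcript, scaled by transcript length in kilobases and by total mapped fragments in millions); the function name and the example numbers are illustrative assumptions, not values from this study.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (fragments -> millions of fragments)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Illustrative numbers only: 500 fragments on a 2000-bp transcript
# in a library of 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)
```

Because the value is scaled by both transcript length and library size, genes of different lengths and samples of different sequencing depths can be compared on a common scale.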

RNA isolation and qRT-PCR analysis

Total RNA was isolated from each tissue at various developmental stages of P. tomentosa using the Trizol Total RNA Extraction Kit (Promega, Madison, WI, USA). Genomic DNA was removed with RQ1 DNase (Promega), and the extracts were then incubated at 65 °C for 10 min to inactivate the DNase. The RNA was assessed by 1.0% agarose gel electrophoresis and with a NanoDrop 2000 spectrophotometer (IMPLEN, CA, USA). Total RNA was reverse-transcribed into first-strand cDNA using the Reverse Transcription System (Promega). The resulting cDNA was diluted 1:10 with ddH2O and used as the template for qRT-PCR amplification.

For qRT-PCR, we first designed primers based on the P. trichocarpa NF-Y CDS to amplify the corresponding sequences. The amplified DNA fragments were then cloned into the pMD-18T vector (TaKaRa, Otsu, Japan), sequenced, and compared with the P. trichocarpa CDS (Fig. S1). Next, qRT-PCR primers were designed from the P. tomentosa CDS using Primer 5.0 software. Finally, the qRT-PCR primers were verified by PCR (Fig. S1). PtoACTIN (GenBank accession: AY261523.1) (An et al. 2011; Zhang et al. 2008; Zheng et al. 2009) was selected as the internal control gene for normalization. All primers are listed in Table S1. Real-time quantitative PCR was performed using SYBR® Premix Ex Taq™ (TaKaRa) on the ABI PRISM 7500 Fast Real-time PCR System (Applied Biosystems, Foster City, CA, USA). The PCR program was as follows: 95 °C for 30 s; 40 cycles of 95 °C for 5 s and 60 °C for 20 s; and a final elongation step of 72 °C for 7 min. The plates were read at 0.2 °C intervals for 1 s from 70 to 95 °C to generate melting curves and thus verify the specificity of the amplified product. Relative expression values were calculated using the 2^−ΔΔCt method (Livak and Schmittgen 2001). All analyses were performed with three technical and three biological replicates.
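The 2^−ΔΔCt calculation can be sketched as follows. The sketch assumes roughly equal (about 100%) amplification efficiency for the target and reference genes, which is the usual premise of the method; the function name, argument names, and Ct values are hypothetical and chosen only to show the arithmetic.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target gene by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# (relative to the reference gene) in the sample than in the calibrator.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_reference_sample=18.0,
    ct_target_calibrator=26.0, ct_reference_calibrator=18.0,
)
```

In practice the Ct values entered here would be means over the technical replicates, with the calculation repeated for each biological replicate.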

Results

Isolation of the NF-Y family in P. trichocarpa

To obtain information on the NF-Y genes of P. trichocarpa, Arabidopsis NF-Y protein sequences were used as queries against the most recent P. trichocarpa genome assembly in Phytozome v12.1, and the identified sequences were characterized with the ExPASy server (Table 1). A total of 46 NF-Y genes were identified in the P. trichocarpa genome: 11 PtNF-YA, 21 PtNF-YB, and 14 PtNF-YC. The 46 predicted genes were named PtNF-YA1 to 11, PtNF-YB1 to 21, and PtNF-YC1 to 14, respectively. The bioinformatics data of all gene members were analyzed, including genomic DNA sequence length, transcript length, CDS length, number of AAs, theoretical molecular mass (MM), and theoretical isoelectric point (pI). The identified PtNF-Y genes encode peptides ranging from 143 to 348 AAs, with pI values ranging from 4.85 to 9.46 and molecular masses ranging from 12.81 to 41.42 kDa, as estimated using the ExPASy server.

Table 1 Nuclear factor Y (NF-Y) identified in P. trichocarpa

Genomic structure and locations of PtNF-Y gene family members

Gene-structure analysis can provide insight into the evolution of gene families. Therefore, we determined the numbers and positions of exons and introns of the PtNF-Y genes using genomic DNA sequences in the Phytozome v12.1 database (Fig. 1). All PtNF-YA genes contained introns. PtNF-YA genes had four or five exons, except PtNF-YA6, which had three introns. Among the PtNF-YB genes, 11 had no introns and the others had 1–5 introns. Of the 14 PtNF-YC genes, 10 were intronless, whereas PtNF-YC13 had six introns. In addition, we found that most homologous pairs in the PtNF-Y subfamilies, such as NF-YA6/A7, NF-YA4/A5, NF-YB7/B12, NF-YB6/B11, and NF-YC1/C2, had different gene structures. These results indicate that the intron/exon distribution in PtNF-Y genes is highly variable.

Fig. 1

Gene structure of the PtNF-Y gene family. Yellow boxes and black lines represent exons and introns, respectively. Blue boxes indicate the 5′ and 3′ non-coding regions. The lengths of the boxes and lines represent the sizes of the exons and introns

We used the current P. trichocarpa genome database in Phytozome v12.1 to analyze the positions of PtNF-Y genes on P. trichocarpa chromosomes. The 46 PtNF-Y genes were found to be distributed among 16 of the 19 chromosome scaffolds of P. trichocarpa (Table 1, Fig. 2).

Fig. 2

Positions of NF-Y gene family members on P. trichocarpa chromosomes. Scaffold numbers are indicated at the top of each scaffold. Chromosome size is indicated by the vertical scale. Gene members are distributed on every chromosome except chromosomes 2 and 4

Multiple alignments analyses of the PtNF-Y genes

For each PtNF-Y subunit, multiple alignments performed using ClustalX 2.0 revealed conserved DNA-binding domains in the PtNF-YA, PtNF-YB, and PtNF-YC proteins (Fig. 3). The results show that PtNF-Y proteins contain evolutionarily conserved domains and more variable N- or C-terminal transcriptional regulation regions. The conserved core of the PtNF-YA proteins contains 53 AAs and includes two highly conserved subdomains: the NF-YB/C interaction domain (α1) and the DNA contact domain (α2). A relatively conserved linker of 21 AAs separates the α1 and α2 subdomains (Fig. 3a). The conserved core region of PtNF-YB is 86 AAs in length, similar to the lengths reported in other species. It contains a central domain resembling the histone fold motif (HFM) of the core histone H2B, which plays a key role in DNA binding and protein–protein interactions (Fig. 3b). PtNF-YC subunits were also found to contain a core histone-like sequence with a central domain about 79 AAs in length (Fig. 3c).

Fig. 3

Multiple alignments of P. trichocarpa NF-Y family members. Multiple alignment of (a) PtNF-YA proteins, (b) PtNF-YB proteins, and (c) PtNF-YC proteins. Amino acids critical for distinguishing between LEC1 and non-LEC1 types are indicated by a red box

Phylogenetic relationships of PtNF-Y gene family

To explore the evolutionary relationships among different PtNF-Y family members, an unrooted phylogenetic tree of the 46 PtNF-Y proteins was constructed (Fig. 4a). All PtNF-Y members formed three major clusters: I (PtNF-YA), II (PtNF-YB), and III (PtNF-YC). Closely related members showed the most similar physical and chemical properties, such as MM and pI (Table 1). The tree also indicated that the PtNF-YB proteins are more closely related to PtNF-YC proteins than to PtNF-YA proteins, implying that PtNF-YB and PtNF-YC genes may share a more recent common ancestor with each other than with PtNF-YA.

Fig. 4

Phylogenetic analysis of P. trichocarpa NF-Y proteins. (a) Combined phylogenetic tree for 11 PtNF-YA, 21 PtNF-YB, and 14 PtNF-YC proteins constructed by the neighbor-joining method in MEGA 7.0 software with 1000 bootstrap replicates. (b) The genes encoding different subunits formed three separate branches. Different colors indicate different subfamilies. Blue, red, and green represent NF-YA, NF-YB, and NF-YC, respectively. The prefixes ‘At’ and ‘Pt’ indicate NF-Y proteins from Arabidopsis thaliana and Populus trichocarpa, respectively. AtNF-YB6 and AtNF-YB9, which are marked by red triangles, represent AtL1L and AtLEC1, respectively. Numbers on branches show bootstrap values > 50%

To elucidate the phylogenetic relationships among PtNF-Y proteins and assist with functional predictions, we constructed a comprehensive phylogenetic tree using the full-length protein sequences of all the NF-Y proteins of Arabidopsis and P. trichocarpa (Fig. 4b). All of the proteins except AtNF-YC10 were grouped into three subfamilies (A, B, and C), and the tree indicated close relationships among the PtNF-Ys within each subfamily. Furthermore, PtNF-YB3 and PtNF-YB5 clustered in a separate subgroup together with AtNF-YB9 (AtLEC1) and AtNF-YB6 (AtL1L) (Fig. 4b). Multiple alignment analysis likewise revealed that PtNF-YB3 and PtNF-YB5 differed from the other PtNF-YB members by an amino acid change from lysine (K) to aspartic acid (D) (Fig. 3, Fig. S2). Therefore, based on this phylogenetic tree, we predict that the structures and functions of some PtNF-Y members may be similar to those of their Arabidopsis counterparts.

Expression of the PtoNF-Y gene family in different tissues determined by transcriptome and qRT-PCR analyses

To identify the potential functions of PtoNF-Ys in the development of P. tomentosa, we compared the relative expression levels of the PtoNF-Y genes in root, stem, leaf, leaf bud, and male and female flower buds at seven different developmental stages. The results indicated that 34 of the 46 PtoNF-Ys were expressed in all tissues, whereas the other 12 PtoNF-Y gene members were not detected (Fig. 5). The transcription levels of the 34 expressed PtoNF-Y genes varied among tissues. We observed four main clusters of expression among the vegetative organs (see Fig. 5, clusters I–IV). The six genes in cluster I (A3, A8, B4, B14, B21, and C6) had relatively high expression in all of these tissues except the germinating leaf bud.

Fig. 5

Heat map of predicted PtoNF-Y expression in different tissues and organs of P. tomentosa. The abbreviations R, S, L, DLB, and GLB represent roots, stems, leaves, dormant leaf buds, and germinated leaf buds, respectively. Expression from high to low is represented by square colors from red to green, with black indicating medium expression. FPKM values were used to create the heat map. The scale represents the relative expression intensity based on FPKM values. In this study, de novo assembly was done using pooled samples

Among the reproductive organs, we analyzed the expression patterns in male and female floral buds at seven different developmental stages by RNA-seq and qRT-PCR (Fig. 6; the highly expressed genes are shown in Fig. S3). About 90% of the genes showed similar patterns in the RNA-seq and qRT-PCR data. Some genes showed similar expression patterns across the male and female floral bud stages. For example, in both male and female flower buds, PtoNF-YB4 and PtoNF-YB21 peaked in the early and late stages, respectively, whereas PtoNF-YB15, PtoNF-YC2, and PtoNF-YC7 peaked in the middle stages. In contrast, other genes peaked at different times in male and female floral buds. For example, in the later stages, the expression pattern of PtoNF-YC5 in male flower buds was opposite to that in female flower buds, and PtoNF-YC14 peaked in the late stage of male flower buds but in the early stage of female flower buds. These observations suggest that some genes may have evolved different functional roles in the two floral bud types.

Fig. 6

Expression profiles of the PtoNF-Y genes in male and female floral buds determined by qRT-PCR. (a) qRT-PCR validation of PtoNF-Y gene expression in male floral buds of P. tomentosa. (b) qRT-PCR validation of PtoNF-Y gene expression in female floral buds. The blue and orange lines represent values obtained by qRT-PCR and RNA-seq, respectively. MFB1–MFB7 represent the seven stages (from June 2017 to February 2018) of male and female floral bud development. The data represent mean ± SD; the error bars indicate the standard deviation

Discussion

In this study, based on the currently available P. trichocarpa genome sequence and the updated classification criteria for NF-Y TFs in A. thaliana (Petroni et al. 2012), 46 PtNF-Y genes were identified in P. trichocarpa (11 PtNF-YA, 21 PtNF-YB, and 14 PtNF-YC) (Table 1). Previous studies have reported 36 NF-Y members in Arabidopsis (Siefers et al. 2009), 28 in rice (Thirumurugan et al. 2008), and 33 in walnut (Quan et al. 2018). The higher number of NF-Y members in P. trichocarpa may be due to the expansion of this gene family in poplars. In contrast, soybean and tomato contain 68 and 59 NF-Y members, respectively, so fewer NF-Y genes were identified in P. trichocarpa than in these species. These differences in gene numbers may be related to differences in classification criteria: the NF-Y classification criteria used for maize and tomato include several NC2 and Dpb3/4 members (Li et al. 2016; Zhang et al. 2016).

Gene-structure alterations might lead to changes in gene or protein function. In this study, compared with the exon–intron structures of PtNF-YB and PtNF-YC, most PtNF-YA members showed a more variable and complicated exon–intron organization pattern. This result is consistent with the NF-Y gene structures previously reported in Arabidopsis, Brassica napus, and Solanum lycopersicum. Moreover, exon–intron structure analysis provides insight into the evolutionary relationships among genes or organisms. We analyzed the exon–intron structure of the 46 PtNF-Y genes (Fig. 1), of which 26 lacked introns. Introns are nevertheless considered to be essential components of eukaryotic gene structure. Previous studies have reported that introns have many functions, such as enabling alternative splicing and regulating the evolutionary rate of genes (Lee et al. 2003; Roy and Gilbert 2006). Their deletion or alteration might lead to structural diversity and complexity, in turn affecting the evolution of the gene family.

In general, the conserved domain is located at the C-terminus of mammalian proteins; the phenomenon is more apparent in plant NF-YAs. Previous studies have shown that three histidine (H) and three arginine (R) residues are essential for DNA binding (Xing et al. 1993). The results of the current study show that these six sites are also highly conserved among all 11 PtNF-YA proteins (Fig. 3a). This finding is consistent with the residues previously observed in Arabidopsis, Oryza sativa, and Brachypodium (Cao et al. 2011; Siefers et al. 2009; Thirumurugan et al. 2008). Previous studies reported that the aspartate at position 55 (D55) is a critical protein interaction site in the AtNF-YB subfamily and that Arabidopsis NF-YB subunits can be divided into two classes, LEC1 and non-LEC1, based on this residue (Lee et al. 2003). LEC1 has been shown to play a key role in Arabidopsis embryogenesis and in the development of castor seeds (Wang et al. 2018). In our study, PtNF-YB3 and PtNF-YB5 showed a change from lysine (K) to aspartic acid (D) at this site (Fig. 3, Fig. S2). In addition, phylogenetic analysis showed that PtNF-YB3 and PtNF-YB5 clustered in a separate subgroup with AtNF-YB9 (AtLEC1) and AtNF-YB6 (AtL1L) (Fig. 4b). Therefore, we speculate that these two members are orthologs of AtLEC1 and AtL1L and propose that PtNF-YB3 and PtNF-YB5 may have functions similar to those of AtLEC1 and AtL1L.

Previous studies have reported that NF-Y genes play important roles in regulating plant growth and development (Li et al. 2016; Potkar et al. 2013; Wang et al. 2018; Wei et al. 2017). In this study, expression of PtoNF-Y genes (except for 12 NF-Y genes that had no expression data) was observed in five tissues and organs, suggesting that PtoNF-Y genes are involved in regulating the growth and development of P. tomentosa (Figs. 5, 6, S3). It has been reported that AtNF-YB2 and 3 and AtNF-YC3, 4, and 9 are critical regulators of flowering time (Kumimoto et al. 2010; Wenkel et al. 2006). Here, we detected expression of their orthologs, PtoNF-YB4, PtoNF-YB14, and PtoNF-YB5-7, during the early stages of floral bud development, but the specific regulatory mechanism of this expression remains unclear. Therefore, determining how these genes regulate floral bud development and affect the growth and development of P. tomentosa is an interesting question for future research.

Comprehensive bioinformatics analysis of the NF-Y family in P. trichocarpa identified 46 putative PtNF-Y genes, and the expression of PtoNF-Y genes in various P. tomentosa tissues and organs was evaluated using transcriptome data. The results suggest that the expression levels of the 34 expressed members (of 46) vary among tissues and organs and among family members. The results of this study provide a basis for further studies on the functions of the PtNF-Y gene family. However, the specific functions of these genes in P. tomentosa must be verified experimentally, and their regulatory mechanisms remain to be determined in future studies.