Introduction

Plants routinely experience diverse abiotic and biotic stresses because of their sessile nature, and must therefore adapt by developing a range of defense mechanisms [1, 2]. The heat shock response is an important, conserved defense mechanism for adapting to environmental stresses. Heat shock proteins (HSPs) play essential roles in defense against environmental stresses by folding, unfolding, and degrading proteins [3, 4]. Several studies have shown that heat shock transcription factors (HSFs), the key regulators of HSPs, can increase tolerance to heat, high salinity, and oxidative stress in plants [5, 6]. Although HSFs vary in size and sequence, their promoter recognition pattern and basic structure are conserved in plants [7]. Based on their conserved regions, plant HSFs can be divided into three classes: HSFA, HSFB, and HSFC [8]. Group A HSFs are the major members and function as transcription activators, whereas most HSFs belonging to Groups B and C do not have a self-activation function [9, 10].

Analyses of different HSF mutants have revealed pronounced functional differences that cannot be compensated for by other HSF types, yet the precise function of individual HSFs in plants is still unclear. Several studies have shown that HSFs participate in signal transduction pathways and regulate the expression of genes responsive to a variety of abiotic and biotic stresses [6, 11, 12]. HSFA1a has been reported to be the primary regulator of heat stress resistance: it can trigger the heat stress response and form a protein complex with HSFA2 and HSFB1 to adjust physiological metabolism during heat stress [8]. Overexpression of HSFA2 improved tolerance to heat, salt/osmotic, and oxidative stress in Arabidopsis [13, 14]. HSFA3 was shown to be functionally similar to HSFA1a and HSFA2 and, because its expression was up-regulated by drought and heat stress, it was regarded as part of drought stress signaling [15]. The expression of HSFA6a and HSFA6b was also significantly up-regulated by salt and cold stress [16]. HSFs also confer resistance to heavy metal stress; for example, rice and yeast strains overexpressing TaHSFA4a showed better tolerance to cadmium stress [17]. HSFA9 regulated the expression of HSPs during seed development, a unique function that illustrates the functional diversification of HSFs [18, 19]. While HSFAs seemingly account for the majority of HSF functions, most HSFB members do not have an activator function and are usually regarded as repressors of gene expression [9, 10, 20]. A larger proportion of HSFC genes is found in monocots than in eudicots; however, the exact role of HSFCs in monocots is still unclear [8].

HSFs are pervasive in eukaryotes and function as transcription factors. Plant HSFs comprise a larger family of proteins than vertebrate HSFs and those found in Drosophila [21]. With the development of sequencing technology, HSF families have been identified in several plants, with 25 members in Zea mays, more than 56 members in Triticum aestivum, 25 members in Oryza sativa, 30 members in Populus trichocarpa, 19 members in Ricinus communis, and 21 members in Arabidopsis thaliana [8, 22, 23, 24, 25, 26, 27]. Cassava is a major source of dietary carbohydrate, industrial starch, and bioethanol owing to its high starch production [28, 29]. After harvest, its tuberous roots undergo rapid postharvest physiological deterioration (PPD), which restricts its use as a raw material in the food industry [30, 31]. Physiological and biochemical analyses showed that the production of reactive oxygen species (ROS) is the first event in the development of PPD. Lowering ROS accumulation, either by manipulating ROS-scavenging-related genes or by exogenous application of chemicals, delays PPD development [11, 30, 31, 32]. Cassava shows excellent drought resistance during growth [33], and HSFs can increase tolerance to drought and salinity stress in plants. However, the mechanism underlying resistance to abiotic stress in cassava is unclear.

In this study, HSFs present in cassava were identified and analyzed with regard to their phylogenetic relationships, gene structures, and protein motifs. The expression profiles of the identified HSFs in different tissues were analyzed, as were their responses to simulated drought and abscisic acid (ABA). Changes in HSF expression during PPD were also investigated. Our results should prove useful for the functional analysis of HSFs in cassava. Our findings also expand knowledge of simulated drought tolerance and the PPD process, and offer novel implications for extending the shelf life and improving the quality of cassava tuberous roots.

Materials and methods

Plant materials and treatments

Arg7 (Manihot esculenta cv. Arg7) is an elite cassava cultivar in Argentina adapted to moderate drought stress. Arg7 plants were cultured in a growth chamber (35 °C/20 °C day/night, 16/8-h light/dark cycle, 70% relative humidity, 200 μmol m−2 s−1 photosynthetic photon flux density). After cultivation for 90 days, cassava seedlings were treated with 100 μM ABA or 20% PEG6000, respectively. Each sample was pooled from five plants, with three replicates, and two replicates of these samples were then chosen for transcriptomic analysis. For analyzing PPD, the tuberous roots of 10-month-old cassava (Manihot esculenta cv. sc124) were cut into 5-mm-thick slices and placed on wet filter paper in Petri dishes. The samples were incubated in the dark at 28 °C for 0, 6, 12, or 48 h. RNA was extracted from the tuberous root slices at each time point and used for transcriptomic analysis (three replicates of each sample).

Identification and phylogenetic analyses

The HSF protein sequences of Arabidopsis and rice were obtained from the TAIR (http://www.arabidopsis.org/) and RGAP (http://rice.plantbiology.msu.edu/) databases, respectively. The whole genome sequence of cassava was acquired from a publicly available database (https://phytozome.jgi.doe.gov/pz/portal.html). To identify HSFs in the cassava protein sequence library, hidden Markov models (HMMs) were constructed from known HSF sequences using HMMER (http://www.hmmer.org/) [34]. PtHSFs, RcHSFs, AtHSFs, and OsHSFs were used to confirm the candidate cassava HSFs by BLAST analysis. The conserved domains of cassava HSFs were validated using PFAM (http://pfam.sanger.ac.uk/) and the Conserved Domains Database (http://www.ncbi.nlm.nih.gov/cdd/). An evolutionary tree of cassava, poplar, castor bean, Arabidopsis, and rice HSFs was constructed using MEGA 5.0 and Clustal X2.0 [35].
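The HMM screening step can be illustrated with a short sketch. It assumes the profile search was run with HMMER's `hmmsearch` and its per-target table written with `--tblout`; the E-value cutoff of 1e-5, the profile name `hsf_profile`, and the gene identifiers in the example are hypothetical placeholders, not values reported in this study.

```python
from typing import Iterable, List, Tuple


def parse_tblout(lines: Iterable[str], max_evalue: float = 1e-5) -> List[Tuple[str, float]]:
    """Return (target name, full-sequence E-value) pairs that pass the cutoff.

    Assumes the whitespace-delimited per-target table written by
    `hmmsearch --tblout`: field 0 is the target name and field 4 is the
    full-sequence E-value. The cutoff is an illustrative assumption.
    """
    hits = []
    for line in lines:
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, evalue))
    # Most significant candidates first.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical table rows (made-up identifiers and scores).
example = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "geneA.1 - hsf_profile - 1.2e-30 105.3 0.1 2.0e-30 104.6 0.1 1.0 1 0 0 1 1 1 1 -",
    "geneB.1 - hsf_profile - 0.3 10.1 0.0 0.5 9.4 0.0 1.0 1 0 0 1 1 1 1 -",
]
candidates = parse_tblout(example)
print(candidates)
```

Candidates retained at this stage would still need the BLAST and conserved-domain confirmation described above before being accepted as HSFs.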

Protein properties and sequence analyses

The molecular weights and isoelectric points of MeHSFs were predicted using the ExPASy proteomics server (http://expasy.org/). MEME (http://meme.nbcr.net/meme/cgi-bin/meme.cgi) and InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/) were employed to identify the conserved motifs of MeHSFs. MeHSF gene structures and chromosomal locations were analyzed using the Gene Structure Display Server (GSDS) and the Phytozome cassava database, respectively [36].
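As a rough illustration of how a molecular weight is derived from a protein sequence, the sketch below sums commonly tabulated average residue masses and adds one water molecule for the free termini. It is an assumption about the general approach, not a reproduction of the ExPASy server's implementation, and the function name is hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water),
# as commonly tabulated for the 20 standard amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N-terminal H and C-terminal OH


def molecular_weight_kda(sequence: str) -> float:
    """Estimate the average molecular weight of a protein in kDa.

    Raises KeyError for non-standard residue codes rather than guessing.
    """
    residues = sequence.upper()
    mass_da = sum(AVG_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS
    return mass_da / 1000.0


# A dipeptide as a toy input; real MeHSF sequences would be read from FASTA.
print(round(molecular_weight_kda("GA"), 5))
```

Isoelectric point prediction requires an iterative charge calculation over pKa values and is not covered by this sketch.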

Transcriptomic analysis

RNA extraction, library preparation, and sequencing were performed by Majorbio BioTech Co., Ltd. (Shanghai, China) on the Illumina GAII platform (Illumina, San Diego, CA, USA). To ensure reliability, the FASTX-toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) and FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) were used to remove adapter sequences and low-quality sequences, respectively. The clean reads were then aligned to the cassava reference genome version 4.1 (https://phytozome.jgi.doe.gov/pz/portal.html) using TopHat 2.0 (http://tophat.cbcb.umd.edu/) [37]. Based on the alignment files, the transcriptomic data were assembled using Cufflinks [38]. Genes were scored as not expressed if no corresponding RNA-seq reads aligned to the genome. Expression levels were calculated and normalized as fragments per kilobase of transcript per million mapped reads (FPKM), and FPKM values were used to create heat maps with MeV 4.9 software (CCCB, Boston, MA, USA). DEGseq was used to identify genes differentially expressed in response to treatment (log2 fold change > 1 or < −1). The expression profiles of cassava housekeeping genes and the phylogeny of the raw RNA-seq data are provided in the supplement (Tables S1–S2, Fig. S1). Expression data for MeHSFs in various tissues were downloaded from a public database (shiny.danforthcenter.org/cassava_atlas/) [39]. The generated sequence data were deposited in NCBI's SRA database under accession SRP182603.
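The normalization and fold-change cutoff described above can be sketched as follows. The FPKM formula is the standard definition; the pseudocount of 1 used to avoid division by zero is an assumption of this sketch, and the statistical testing performed by DEGseq is not reproduced here. Function names are illustrative.

```python
import math


def fpkm(fragments: float, gene_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control expression.

    The pseudocount (an assumption of this sketch) keeps the ratio defined
    when a gene has zero expression in one condition.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def classify(log2fc: float, threshold: float = 1.0) -> str:
    """Apply the cutoff used in the text: > 1 is induced, < -1 is suppressed."""
    if log2fc > threshold:
        return "up"
    if log2fc < -threshold:
        return "down"
    return "unchanged"


# Toy numbers: 500 fragments on a 2,000-bp gene in a library of 10 million.
expression = fpkm(500, 2000, 10_000_000)
change = log2_fold_change(treated=7.0, control=1.0)
print(expression, change, classify(change))
```

The comparisons are strict, so a gene with a log2 fold change of exactly 1 or −1 is reported as unchanged.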

Results

Identification and evolutionary analysis of HSFs in Manihot esculenta

The AtHSF and OsHSF sequences were used as queries for HMM and BLAST searches, and 32 HSF proteins were identified from the cassava genome and designated MeHSF1–MeHSF32. Their lengths ranged from 217 to 576 amino acid residues, and their pIs and relative molecular masses ranged from 4.63 to 9.06 and from 24.8 to 64.2 kDa, respectively (Table S3).

The evolutionary relationship of MeHSFs with AtHSFs, OsHSFs, PtHSFs, and RcHSFs was investigated by phylogenetic analysis (Fig. 1). The MeHSF family could be classified into Groups A, B, and C, with Group A subdivided into nine subgroups (A1–A9). Group A1 included MeHSF1 and -30; Group A2 included MeHSF9 and -23; Group A3 included MeHSF25 and -29; Group A4 included MeHSF5, -6, -7, and -28; Group A5 included MeHSF3; Group A6 included MeHSF12 and -26; Group A7 included MeHSF14 and -15; Group A8 included MeHSF8; Group A9 included MeHSF2 and -4; Group B included MeHSF11, -13, -16, -18, -19, -20, -21, -22, -24, -27, -31, and -32; and Group C included MeHSF10 and -17. Based on orthologous genes, MeHSFs were more closely related to RcHSFs and PtHSFs than to OsHSFs and AtHSFs.
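The group assignments listed above can be tallied with a few lines of code to confirm that every one of the 32 MeHSFs is assigned exactly once. The dictionary simply transcribes the membership given in the text; the variable names are illustrative.

```python
# Group membership transcribed from the text (numbers are MeHSF indices).
GROUPS = {
    "A1": [1, 30], "A2": [9, 23], "A3": [25, 29], "A4": [5, 6, 7, 28],
    "A5": [3], "A6": [12, 26], "A7": [14, 15], "A8": [8], "A9": [2, 4],
    "B": [11, 13, 16, 18, 19, 20, 21, 22, 24, 27, 31, 32],
    "C": [10, 17],
}

# Members per top-level class: the first letter of the group name.
class_totals = {}
for group, members in GROUPS.items():
    class_totals[group[0]] = class_totals.get(group[0], 0) + len(members)

all_members = sorted(m for members in GROUPS.values() for m in members)

print(class_totals)
print(len(all_members), all_members == list(range(1, 33)))
```

Under this transcription, Group A accounts for 18 members, Group B for 12, and Group C for 2.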

Fig. 1 Phylogenetic analyses of HSFs from cassava, Arabidopsis thaliana, rice, poplar and castor bean

Conserved motif and gene structure analyses of MeHSFs in Manihot esculenta

In total, 10 conserved MeHSF motifs were identified using MEME. The conserved motifs were annotated using the InterPro database, which revealed essential features of the HSF family (Fig. 2). All MeHSFs contained motifs 3 and 4. In groups A1–A9, all MeHSFs contained at least six motifs, except those in group A3, whereas only four and at most six motifs were present in groups B and C, respectively. All groups except A3 presented a similar or identical motif composition. These results suggest that all MeHSFs contain the essential features of the HSF family and that motif characteristics are similar in the different groups, further supporting the inferred evolutionary relationships.

Fig. 2 The motif analyses of HSFs in cassava on the basis of their evolutionary relationship

The exon–intron organization of the MeHSFs was analyzed using GSDS (Fig. 3). Interestingly, all MeHSFs except MeHSF9, MeHSF25, and MeHSF29 contained two exons, and those in the same group generally exhibited similar exon–intron organizations.

Fig. 3 The exon–intron organization analyses of cassava HSFs according to the phylogenetic relationship

Chromosomal distribution of MeHSFs in M. esculenta

To analyze the distribution of MeHSFs, the chromosomal locations of MeHSF1–MeHSF32 were determined. MeHSFs mapped to chr1, 2, 3, 5, 8, 9, 12, 13, 14, 15, 16, 17, and 18 (Fig. 4). Groups A5 and A8 each contained only one member, located on chr15. MeHSFs in groups A2 and A3 were all located on chr3, 8, 9, and 16. Group A4 contained four members located on chr1, 2, and 17; however, MeHSF28 from this group could not be located exactly on a chromosome. When the cassava v7.1 genome was used to analyze MeHSF28, it could be assigned only to chr16. Group B, a large subfamily, contained 12 members located on chr1, 2, 5, 8, 9, 12, 13, 14, 16, 17, and 18. Group C members were located on chr3 and 16. Thus, most MeHSFs were located on chr1, 2, 3, 9, 15, and 16, while four chromosomes each carried only one gene.

Fig. 4 Chromosome distribution analyses of HSFs in cassava

Expression profiles of MeHSFs in different cassava tissues

To analyze the expression profiles of MeHSFs in different cassava tissues, expression data for 11 cassava tissue/organ types were downloaded from a public database [39]. The 11 tissues were leaf, midvein, petiole, stem, lateral bud, shoot apical meristem (SAM), storage root (SR), fibrous root (FR), root apical meristem (RAM), organized embryogenic structure (OES), and friable embryogenic calli (FEC). As shown in Fig. 5, expression was detected for all MeHSFs except MeHSF19 on the basis of the transcriptomic data (Table S4). Approximately 50% of MeHSFs showed low transcript abundance, consistent with their repressed state under normal conditions. Five MeHSFs belonging to Group A (MeHSF1, -3, -8, -28, and -30) were highly expressed in all 11 analyzed tissues, three MeHSFs belonging to Groups B and C (MeHSF24, -18, and -10) were highly expressed in nine tissues, and the other MeHSFs were highly expressed mainly in lateral buds, OES, and FEC.

Fig. 5 Expression data of MeHSFs in various tissues/organs. L leaf, M midvein, P petiole, S stem, LB lateral bud, SAM shoot apical meristem, FR fibrous root, SR storage root, RAM root apical meristem, OES organized embryogenic structure, FEC friable embryogenic calli

Expression profiles of MeHSFs in response to PEG and ABA treatment

PEG treatment was performed to analyze the expression profiles of MeHSFs in response to simulated drought stress. Expression was detected for all MeHSFs except MeHSF12 on the basis of the transcriptome data. As shown in Fig. 6, MeHSF5, -7, -8, -9, -10, -18, and -29 were induced under PEG treatment, whereas MeHSF15, -17, -20, -22, and -23 were suppressed. ABA plays an important role in the signal transduction pathways that respond to drought stress, and plant HSFs also regulate the expression of genes responsive to various abiotic stresses [8]. The expression profiles of MeHSFs upon ABA treatment were therefore also studied. As with the PEG treatment, one gene lacked expression data, in this case MeHSF21. MeHSF8, -9, -10, -13, -18, and -29 were induced after ABA treatment, whereas MeHSF22 and MeHSF23 were down-regulated. Five MeHSFs (MeHSF8, -9, -10, -18, and -29) were up-regulated by both PEG and ABA, which suggests that these MeHSFs might be involved in the ABA-mediated osmotic response.
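The overlap between the PEG- and ABA-responsive genes can be made explicit with a small set comparison. The sets below transcribe the gene numbers reported in this section; the variable names are illustrative.

```python
# MeHSF numbers reported as induced or suppressed under each treatment.
peg_up = {5, 7, 8, 9, 10, 18, 29}
aba_up = {8, 9, 10, 13, 18, 29}
peg_down = {15, 17, 20, 22, 23}
aba_down = {22, 23}

shared_up = sorted(peg_up & aba_up)        # induced by both treatments
peg_only_up = sorted(peg_up - aba_up)      # induced by PEG only
aba_only_up = sorted(aba_up - peg_up)      # induced by ABA only
shared_down = sorted(peg_down & aba_down)  # suppressed by both treatments

for label, genes in [
    ("up in both", shared_up),
    ("up in PEG only", peg_only_up),
    ("up in ABA only", aba_only_up),
    ("down in both", shared_down),
]:
    print(label, ["MeHSF%d" % n for n in genes])
```

The comparison shows that the two induced sets overlap for MeHSF8, -9, -10, -18, and -29, while MeHSF5 and -7 respond only to PEG and MeHSF13 only to ABA.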

Fig. 6 Expression profiles of MeHSFs in Arg7 after treatment with ABA and PEG. Log2-based FPKM fold change was used to create the heat map

Expression profiles of MeHSFs during PPD

PPD is one of the major factors restricting the use of cassava as a raw material in the food industry. HSFs are strongly associated with oxidative stress, and physiological and biochemical analyses show that ROS production is the first step in PPD development. The expression profiles of MeHSFs during PPD were analyzed in the cultivar sc124. Expression was detected for all MeHSFs except MeHSF12 and MeHSF20 on the basis of the transcriptome data. Most MeHSFs were induced during PPD and only MeHSF23 and MeHSF26 were suppressed, which suggests that these MeHSFs participate in this process (Fig. 7).

Fig. 7 Expression profiles of MeHSFs in storage roots of sc124 at 6 h, 12 h, and 48 h compared with 0 h after harvest. Log2-based FPKM fold change was used to create the heat map

Discussion

Cassava is an important crop in tropical and sub-tropical regions and provides staple food for over 750 million people around the world [33]. HSFs function under many different conditions, such as oxidative stress, high temperature, and drought stress [5, 6]. It is therefore necessary to systematically analyze the potential roles of MeHSFs in cassava. In this research, 32 HSFs were identified from the cassava genome and classified into three groups (A, B, and C) according to their evolutionary relationships; MeHSFA contained nine subclasses (A1–A9). This classification agreed with those in rice and Arabidopsis [25, 40]. The cassava genome (742 Mb) is larger than those of Arabidopsis (125 Mb), castor bean (320 Mb), poplar (410 Mb), and rice (430 Mb) [26, 27, 41, 42, 43]. The number of HSFs in cassava (32 MeHSFs) was broadly similar to that in rice (25 OsHSFs), castor bean (19 RcHSFs), poplar (30 PtHSFs), and Arabidopsis (21 AtHSFs). MeHSFs showed a closer relationship with RcHSFs and PtHSFs. In all species, HSFs are classified into different subclasses, which might be correlated with different growth conditions [6]. Although their sequences and sizes are considerably diverse, the fundamental structure of HSFs is conserved in eukaryotes. Our results showed that almost all MeHSFs contained conserved motifs and that the different groups had similar motif characteristics. Exon/intron organization analysis showed that all MeHSFs except MeHSF9, MeHSF25, and MeHSF29 contained only one intron, indicating prominent conservation across family members. This phenomenon also exists in other species [6]. The wide distribution of MeHSFs across chromosomes in cassava was similar to that in poplar, rice, and Arabidopsis [25, 40, 44]. Taken together, these results suggest that the classification of MeHSFs is reliable and that the HSF protein family is well conserved among different species.

Tissue-specific expression may be related to gene function. In Arabidopsis and rice, HSFs are differentially expressed in a tissue-specific manner [40]. To explore the possible roles of MeHSFs in different tissues, their expression profiles in 11 tissues were studied [2]. Most MeHSFs showed low transcript abundance, consistent with their repressed state under normal conditions. Five MeHSFs (MeHSF1, -3, -8, -28, and -30) were highly expressed in all 11 analyzed tissues and might be involved in broad heat shock functions across tissues [8]. The analyzed tissues can be divided into three classes: tissues exposed to air (SAM, lateral bud, stem, leaf, petiole, and midvein); tissues that grow underground (SR, FR, and RAM); and embryogenic tissues (OES and FEC). The expression profiles of MeHSFs in the different tissue classes were similar, suggesting that these MeHSFs might have similar functions. More MeHSFs were highly expressed in the lateral bud, OES, and FEC, which might be related to the lower degree of differentiation of these tissues. None of the MeHSFs was expressed in only one specific tissue, similar to observations in rice; by contrast, AtHSFA9 was specifically expressed in seeds [18]. Collectively, the expression profiles revealed that different MeHSFs have different expression patterns across tissues. Several MeHSFs showed constitutively high expression in all cassava tissues, indicating a crucial function in cassava development.

Extensive experimental data have demonstrated that HSFs can increase tolerance to abiotic stresses in plants, including Arabidopsis and rice [8, 45]. Our results showed that several MeHSFs, including MeHSF5, -7, -8, -9, -10, -18, and -29, were up-regulated after PEG treatment, and that five of them (MeHSF8, -9, -10, -18, and -29) were also up-regulated after ABA treatment. Thus, MeHSFs might contribute to drought resistance through the ABA signaling pathway. AtHsfA9 is considered to be related to the ABA signaling network [46]. MeHSF8, a homolog of AtHsfA9, was highly expressed after ABA treatment, which suggests that MeHSF8 might improve tolerance to drought stress through the ABA signaling pathway. MeHSF9 and MeHSF8 are homologs of AtHSFA2 and AtHSFA8, respectively [25], and AtHSFA2 and AtHSFA8 improved tolerance to salt/osmotic stress in Arabidopsis. Similarly, MeHSF9 and MeHSF27 are homologs of OsHSF17 and OsHSF29, respectively, which improved tolerance to salt/osmotic stress in rice [40]. The expression patterns of these MeHSFs were consistent with those reported in Arabidopsis and rice under similar abiotic stress treatments, providing clues to the function of MeHSFs under abiotic stress. HSFs are associated with oxidative stress, and lower ROS accumulation leads to a delayed PPD process [30, 31, 32, 47, 48]. In this research, we observed that MeHSFs were induced during PPD in the analyzed cultivar. OsHsfC2a and OsHsfA5, which seem to be the major players in ROS sensing and accumulation [48, 49], are homologs of MeHSF10/17 and MeHSF3, respectively, and these MeHSFs showed high expression during PPD. They may therefore be regarded as candidate genes for genetic improvement of cassava toward resistance to PPD.

Conclusion

In this research, 32 MeHSFs were identified from cassava, and their classification, protein motifs, and gene structures were analyzed in detail. The identified MeHSFs were distributed across 13 chromosomes. Tissue expression analysis showed that none of the MeHSFs was expressed in only one specific tissue. Transcriptomic analysis suggested that MeHSFs are involved in the response to simulated drought and ABA treatments. MeHSFs were also associated with PPD and may operate mainly through ROS-regulated gene networks. In conclusion, our results offer essential basic knowledge for future functional analysis of MeHSFs in cassava.

Supplementary materials

Table S1 The expression profiles of cassava housekeeping genes (PPD)
Table S2 The expression profiles of cassava housekeeping genes (ABA, PEG)
Table S3 The list of MeHSF members identified
Table S4 Expression data of MeHSF genes in various tissues/organs