Introduction

A large proportion of the world’s population depends on rice to meet its future food requirements. Unlike animals, plants cannot move away from unfavourable conditions and are therefore continuously exposed to diverse environmental stresses. Abiotic stresses (AbS) are predominant among these and negatively affect plant growth and development, thereby reducing productivity. Drought, cold, salinity, flooding and submergence are the major stressors responsible for substantial yield loss. At the molecular and physiological level, these stresses affect the expression and signalling of several genes, which are known as abiotic stress responsive (AbSR) genes (Hirayama and Shinozaki 2010; Sharoni et al. 2011; Hadiarto and Tran 2011; Lata et al. 2015). Among the many AbSR genes, transcription factors (TFs) are crucial targets for understanding the molecular cross-talk of AbS responses and for conducting overexpression studies aimed at AbS tolerance in plants, because they can act as key regulators driving the expression of several target genes (Nakashima et al. 2009; Urano et al. 2010; Yang et al. 2010; Golldack et al. 2011; Mickelbart et al. 2015). TFs make up around 7% of the plant genome (Udvardi et al. 2007) and are categorized into 58 TF families (Jin et al. 2014). Of the five families considered here, NAC, AP2-EREBP, WRKY and bHLH are among the largest TF families, whereas ZF-HD is not (http://grassius.org/grasstfdb.php) (Singh et al. 2002; Gong et al. 2004; Xiong et al. 2005).

Various investigations and the available literature reveal the regulatory role of the NAC, ZF-HD, AP2-EREBP, WRKY and bHLH TF families in signal transduction and in the modulation of several physiological and molecular processes, including somatic embryogenesis (Toledo-Ortiz et al. 2003; Albrecht et al. 2005; Sonnenfeld et al. 2005; Mittler et al. 2006; Xu et al. 2008), plant development (Shigyo and Ito 2004), pollen development and function (Guan et al. 2014), internode elongation (Hattori et al. 2009), biosynthesis of secondary metabolites (Ma et al. 2009; Suttipanta et al. 2011), seed development (Johnson et al. 2002; Luo et al. 2005), seed dormancy (Rushton et al. 2010; Ding et al. 2014), biomass (Yu et al. 2013), flowering period and plant height (Cai et al. 2014), leaf senescence (Miao et al. 2004; Guo and Gan 2006), anther and ovule development (Zhao et al. 2008) and hormone signaling (Rashotte and Goertzen 2010; Feller et al. 2011; Hu et al. 2013). Significantly, these TF families are activated in response to diverse biotic stresses (Shinozaki et al. 2003; Cao et al. 2006; Singh et al. 2010; Xia et al. 2010; Muthamilarasan and Prasad 2013), including fungal invasion (Zheng et al. 2006; Marchive et al. 2007), virus attack (Huh et al. 2012) and bacterial infection (Du and Chen 2000; Deslandes et al. 2002; Kim et al. 2008), in disease resistance (Gutterson and Reuber 2004; Oh et al. 2005; Nakashima et al. 2007), as well as in response to AbS (Shinozaki et al. 2003; Cao et al. 2006; Singh et al. 2010; Xia et al. 2010; Tang et al. 2013) such as drought and heat (Rizhsky et al. 2002; Kiribuchi et al. 2004, 2005; Sakuma et al. 2006; Wu et al. 2009; Zheng et al. 2009; Qiu and Yu 2009; Ren et al. 2010), salinity (Mukhopadhyay et al. 2004; Gutha and Reddy 2008; Qiu and Yu 2009; Zheng et al. 2009), cold (Pnueli et al. 2002; Wang et al. 2003; Mukhopadhyay et al. 2004; Kume et al. 2005; Qiu and Yu 2009), desiccation, submergence, heavy metals and wounding (Mukhopadhyay et al. 2004; Kiribuchi et al. 2004, 2005), osmotic stress (Gutha and Reddy 2008), low temperatures (Zheng et al. 2009; Sharoni et al. 2011) and phosphate starvation (Yi et al. 2005).

Therefore, the significant roles of the NAC, ZF-HD, AP2-EREBP, WRKY and bHLH TF families in several biological, molecular and physiological processes have been studied in assorted crop plants (Ledent and Vervoort 2001; Toledo-Ortiz et al. 2003; Ulker and Somssich 2004; Zheng et al. 2009; Rushton et al. 2010; Sharoni et al. 2011; Figueiredo et al. 2012; Chen et al. 2016). However, only a few reports are available for the C3 model plant O. sativa. For that reason, it is important to expedite functional genomic approaches in this member of the grass family, especially with respect to C3 photosynthesis and AbS tolerance.

To address these issues, the present study used in silico approaches to identify potential AbSR genes from the rice NAC, ZF-HD, AP2-EREBP, WRKY and bHLH TF families. To our knowledge, this is the first wide-ranging investigation of these TFs in O. sativa. The current study provides better insight into the functional aspects of these AbSR TFs and identifies key genes for further validation of their functional roles in AbS responses.

Materials and methods

Database search for the identification of transcription factor family in O. sativa (L.)

Rice transcription factors (TFs) of the NAC, ZF-HD, AP2-EREBP, WRKY and bHLH families and their protein sequences were retrieved from the GRASSIUS Grass Regulatory Information Server (http://grassius.org/grasstfdb.html; Yilmaz et al. 2009).

Mining and meta-analysis of rice transcription factors

TF family members and their RAP-DB IDs/gene locus IDs were collected. These were submitted to the Rice Oligonucleotide Array Database (ROAD) meta-analysis search tool (Cao et al. 2012) to analyze tissue-specific expression profiles in rice. The gene IDs of the AbSR TF genes were then used to retrieve the corresponding genomic, transcript and coding sequences, together with their chromosomal localization and protein sequences, from RiceSRTFDB (Priya and Jain 2013).

Gene structure prediction

The genomic and coding sequences of the potential AbSR TFs (17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH) were analyzed with the GSDS web server v2.0 (Hu et al. 2015) to identify the positions of exons and introns.

Physicochemical properties of identified proteins and phylogenetic analysis

The physicochemical properties, including amino acid length, molecular weight, isoelectric point (pI), the number of positively and negatively charged residues, the instability index and the aliphatic index, were predicted using the ProtParam tool of ExPASy (http://web.expasy.org/protparam/; Gasteiger et al. 2005). The 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH protein sequences were imported into Phylogeny.fr (http://www.phylogeny.fr/; Dereeper et al. 2008) to construct a phylogenetic tree by the maximum likelihood method.
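Some of the sequence-derived properties listed above can be reproduced locally. The following is a minimal sketch, not the ProtParam implementation itself: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY, the Ikai formula for the aliphatic index, and the ProtParam convention that Arg + Lys are counted as positive and Asp + Glu as negative residues. The function name `protein_properties` and the toy peptide are hypothetical and are not taken from the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def protein_properties(sequence):
    """Return a few sequence-derived properties for one protein.

    Only the 20 standard residues are supported; any other symbol
    raises a KeyError so that ambiguous input is not silently ignored.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    n = len(seq)

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / n

    # GRAVY: mean hydropathy over all residues.
    gravy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / n
    # Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    # where X is the mole percent of the residue.
    aliphatic_index = (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )
    return {
        "length": n,
        "positive": seq.count("R") + seq.count("K"),
        "negative": seq.count("D") + seq.count("E"),
        "gravy": gravy,
        "aliphatic_index": aliphatic_index,
    }


# Toy peptide for illustration only; it is not one of the rice TFs.
props = protein_properties("AVILKRDE")
print(props)
```

Molecular weight, pI and the instability index are left out of this sketch because they depend on longer parameter tables; for those values the ProtParam server cited above remains the reference.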

Analysis of subcellular localization in TF families

The subcellular localization of the O. sativa TF family proteins was predicted using CELLO2GO (http://cello.life.nctu.edu.tw/cello2go/; Yu et al. 2014), Wolf Psort 2 (http://wolfpsort.org/; Horton et al. 2007), Bacello (http://gpcr.biocomp.unibo.it/bacello/; Pierleoni et al. 2006), ESLPred2 (http://www.imtech.res.in/raghava/eslpred2/; Garg and Raghava 2008) and SubLoc (http://www.bioinfo.tsinghua.edu.cn/SubLoc/; Hua and Sun 2001).
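When several predictors are run on the same protein, their calls have to be reconciled somehow. The text does not state how this was done, so the sketch below is only one plausible approach: a simple majority vote with an explicit flag for ties. The function name `consensus_localization` and the example calls are hypothetical and are not results from the study.

```python
from collections import Counter


def consensus_localization(predictions):
    """Combine per-tool localization calls by majority vote.

    `predictions` maps a tool name to its predicted compartment.
    Returns (label, support), where support is the fraction of tools
    agreeing with the winning label. Ties are reported as "ambiguous".
    """
    if not predictions:
        raise ValueError("no predictions supplied")
    votes = Counter(predictions.values())
    ranked = votes.most_common()
    top_label, top_votes = ranked[0]
    # A tie exists if the runner-up has as many votes as the winner.
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    label = "ambiguous" if tied else top_label
    return label, top_votes / len(predictions)


# Hypothetical calls for a single protein (illustration only).
example_calls = {
    "CELLO2GO": "nuclear",
    "Wolf Psort": "nuclear",
    "Bacello": "nuclear",
    "ESLPred2": "cytoplasmic",
    "SubLoc": "nuclear",
}
label, support = consensus_localization(example_calls)
print(label, support)
```

Reporting the support fraction alongside the label keeps disagreements between tools visible instead of hiding them behind a single call.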

Gene ontology analysis

The identified potential TF family members (Table 1) were loaded into CELLO2GO (http://cello.life.nctu.edu.tw/cello2go/; Yu et al. 2014) to obtain Gene Ontology (GO) annotations against eukaryotes. TF genes were characterized by biological process, molecular function and cellular component according to the CELLO2GO GO annotation.

Table 1 Potential abiotic stress responsive TF genes and their details

Results

Identification of potent transcription factors (TF) in rice

A total of 144 NAC, 15 ZF-HD, 164 AP2-EREBP, 103 WRKY and 135 bHLH TFs of rice, with their gene IDs, were analyzed computationally for tissue-specific (TS) gene expression. Of these, 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH genes were identified as potential abiotic stress responsive (AbSR) TF genes (Table 1), showing TS expression across all 22 tissues examined (Fig. 1a–e). High expression in the heat map profiles was evident in all five TF families, which helps to delineate their tissue-specific functions.

Fig. 1

Differential expression patterns of rice TF family genes. Heat maps representing expression profiles of a NAC, b ZF-HD, c AP2-EREBP, d WRKY and e bHLH potent AbSR TF genes with respect to specific tissues. Yellow indicates up-regulation, blue indicates down-regulation and black indicates unchanged expression of AbSR TF genes. The colored scale bar on the right denotes the relative expression value, where 5.0 and 13.0 represent down-regulation and up-regulation, respectively (color figure online)

Structure of AbSR TF genes

The positions of exons and introns within the 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH genes were predicted. Gene structure determination showed the numbers and arrangements of exons and introns (Fig. 2a–e and Table 1). Of the 52 AbSR TF genes, 11 (21.15%) contained a single intron, 7 (13.46%) had two introns, 6 (11.54%) had three introns and 8 (15.38%) had four introns. Three AbSR TF genes (5.77%) contained six introns, 4 genes (7.69%) contained five or eight introns, and one gene (1.92%) had 16 introns. The remaining 12 AbSR TF genes (23.08%) were intronless, which made this the largest single class (Fig. 2a–e and Table 1).
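The percentages in this paragraph follow directly from the gene counts. The short sketch below recomputes them, assuming that the class described as having five and 8 introns comprises four genes in total, which is the only reading under which the classes sum to the 52 genes analyzed. The variable names are illustrative.

```python
# Number of AbSR TF genes per intron-count class, as reported in the text.
# "5 or 8" is kept as one combined class because the text does not give
# the split between five-intron and eight-intron genes.
intron_classes = {
    "0": 12,
    "1": 11,
    "2": 7,
    "3": 6,
    "4": 8,
    "6": 3,
    "5 or 8": 4,
    "16": 1,
}

total_genes = sum(intron_classes.values())

# Share of each class as a percentage of all genes, rounded to two decimals.
intron_percent = {
    label: round(100.0 * count / total_genes, 2)
    for label, count in intron_classes.items()
}

for label, count in intron_classes.items():
    print(f"{label:>6} introns: {count:2d} genes ({intron_percent[label]:.2f}%)")
print("total genes:", total_genes)
```

The total matches the sum of the selected family members (17 + 3 + 13 + 11 + 8), which is a useful consistency check on the counts.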

Fig. 2

Gene organization of major TF family genes. a NAC, b ZF-HD, c AP2-EREBP, d WRKY, and e bHLH TF key players. Blue lines indicate the UTR regions, orange boxes indicate the exons and black lines show the introns (color figure online)

Gene ontology annotation

The characteristic features of the TF family genes were analyzed from their protein sequences using CELLO2GO, which showed the putative involvement of these proteins in different molecular functions and biological processes (Table 2). A large number of proteins were predicted to be involved in stress response, embryo development, anatomical structure development, metabolic and biosynthetic processes, and signal transduction (Fig. 3a–e). The molecular functions of these proteins corresponded to transcriptional regulator activity, protein binding and DNA binding activity (Fig. 4a–e).

Table 2 Gene ontology of key AbSR TF genes
Fig. 3

Classification of AbSR TF genes based on their biological process. GO biological process-based categorization of a NAC, b ZF-HD, c AP2-EREBP, d WRKY, and e bHLH family genes responsive to AbS

Fig. 4

Classification of TF genes according to their molecular function. GO molecular function-based categorization of a NAC, b ZF-HD, c AP2-EREBP, d WRKY, and e bHLH family genes responsive to AbS

Protein properties of potentially expressed TFs

The protein properties of the well-expressed members of all five TF families were analyzed. The shortest and longest proteins, together with the molecular weight, pI range, instability index, aliphatic index and grand average of hydropathicity (GRAVY) of the potential TF proteins, were determined (Table 3).

Table 3 Protein properties of key TF genes

Phylogenetic analysis of TFs

To study the phylogenetic organization of the 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH family members, their protein sequences were used to generate an unrooted phylogenetic tree (Fig. 5). The unrooted tree divided the potentially expressed TF family genes into five major groups (groups I–V) based on the conserved NAC, ZF-HD, AP2-EREBP, WRKY and bHLH domains and the homology of the TF family gene sequences. Twenty proteins belong to group I (9 NAC, 1 AP2-EREBP, 5 WRKY, 5 bHLH), 8 to group II (2 NAC, 1 WRKY, 2 AP2-EREBP, 3 ZF-HD), 5 to group III (1 NAC, 4 AP2-EREBP), 4 to group IV (4 NAC) and 15 to group V (1 NAC, 6 AP2-EREBP, 5 WRKY, 3 bHLH) (Fig. 5).
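The group composition reported here can be cross-checked against the family sizes. The sketch below is a simple tally using only the counts stated in the text; the variable names are illustrative. It confirms that the five groups account for all 52 proteins and that each family's members are fully assigned.

```python
from collections import Counter

# Members of each phylogenetic group by TF family, as reported in the text.
groups = {
    "I": {"NAC": 9, "AP2-EREBP": 1, "WRKY": 5, "bHLH": 5},
    "II": {"NAC": 2, "WRKY": 1, "AP2-EREBP": 2, "ZF-HD": 3},
    "III": {"NAC": 1, "AP2-EREBP": 4},
    "IV": {"NAC": 4},
    "V": {"NAC": 1, "AP2-EREBP": 6, "WRKY": 5, "bHLH": 3},
}

# Size of each group (row totals).
group_sizes = {name: sum(members.values()) for name, members in groups.items()}

# Number of proteins per family summed over all groups (column totals).
tally = Counter()
for members in groups.values():
    tally.update(members)
family_totals = dict(tally)

print("group sizes:", group_sizes)
print("family totals:", family_totals)
print("proteins in tree:", sum(group_sizes.values()))
```

Group IV is the only group made up of a single family (NAC), while ZF-HD proteins occur only in group II.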

Fig. 5

Unrooted maximum likelihood tree constructed with NAC, ZF-HD, AP2-EREBP, WRKY, and bHLH proteins of rice. Groups are differentiated with different colors (color figure online)

Subcellular localization of rice TFs

The four programs were classed into two groups, high-resolution and low-resolution predictors, according to their resolution. The prediction principles and performance of the four programs are described in the literature (Horton et al. 2007; Garg and Raghava 2008; Pierleoni et al. 2006; Hua and Sun 2001). The prediction results for the key TF genes of the five TF families are summarized in Table 4. They indicated that these key players and their products are localized in the nucleus.

Table 4 Subcellular localization of potential TF genes

Discussion

NAC, ZF-HD, AP2-EREBP, WRKY and bHLH TFs have been reported to play vital roles in regulating various plant processes and physiological responses, such as plant development, normal growth, responses to environmental stimuli and plant defense (Ledent and Vervoort 2001; Toledo-Ortiz et al. 2003; Ulker and Somssich 2004; Zheng et al. 2009; Rushton et al. 2010; Sharoni et al. 2011; Figueiredo et al. 2012; Chen et al. 2016). These TFs are among the best-studied proteins, and their mode of action, cross-regulation in signaling, autoregulation and evolution have been reported (Singh et al. 2002; Shimono et al. 2007; Puranik et al. 2012; Bakshi and Oelmüller 2014; Chen et al. 2016). They play a crucial role in conferring tolerance to various AbS, including cold, salinity and drought (Hu et al. 2006; Nakashima et al. 2007; Hu et al. 2008; Zheng et al. 2009) and low temperature (Zheng et al. 2009) for NAC; cold, drought, salinity, metal and submergence (Kreps et al. 2002; Mukhopadhyay et al. 2004; Vij and Tyagi 2006; Kilian et al. 2007; Figueiredo et al. 2012; Giri et al. 2013) for ZF-HD; salinity and drought (Sharoni et al. 2010; Hsieh et al. 2013), low temperature (Sharoni et al. 2010), submergence and flooding (Sharoni et al. 2010) and osmotic stress (Hsieh et al. 2013) for AP2-EREBP; salinity, drought and heat (Li et al. 2009, 2011), cold (Zou et al. 2010), H2O2 (Song et al. 2009), ozone oxidative stress and UV radiation (Jiang and Deyholos 2009), sugar starvation (Song et al. 2010) and phosphate deprivation (Chen et al. 2009) for WRKY; and cold and drought (Shinozaki et al. 2003) for bHLH.

The expression of NAC, ZF-HD, AP2-EREBP, WRKY and bHLH TFs in response to diverse AbS indicates their putative involvement in the regulation of signaling mechanisms associated with transcriptional reprogramming during physiological stress. These TFs have been identified in silico in various crop plants, and their expression profiles in response to many AbS have been well studied. To the best of our knowledge, no such study has so far been reported in rice, a C3 model crop with potential tolerance to diverse AbS.

In the current study, 144 NAC, 15 ZF-HD, 164 AP2-EREBP, 103 WRKY and 135 bHLH TF family members from O. sativa were identified (Yilmaz et al. 2009). According to the heat map data, 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH well-expressed potential abiotic stress responsive (AbSR) TF genes were retrieved and subjected to ROAD analysis. These AbSR TF genes were chosen for expression profiling under AbS. Publicly available microarray hybridization data in ROAD showed tissue-specific expression patterns in the 27 tissues for all the identified (17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY, 8 bHLH) potential AbSR TF genes. The high-level expression of these TF genes in multiple tissues, and their expression during individual AbS, delineates their multiple roles in diverse biological processes and molecular cross-talk. These data could be further exploited to select key genes showing distinct expression patterns in order to explain their functional roles. They highlight the genome-wide analysis of genes expressed in different tissues. We outlined information on co-regulation among TF genes under abiotic stress conditions (Cao et al. 2012; Muthuramalingam et al. 2017). The involvement of each identified AbSR TF gene in the molecular cross-talk of the plant stress response was predicted. Further, these heat map data could pave the way for over-expression studies of key genes in different plant tissues in order to increase the AbSR protein content in rice.

Phylogenetic analysis of the potentially expressed TF genes in rice showed that these genes group into shared subclades/subgroups. In addition, a joint evolutionary tree was constructed from the 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH expressed rice proteins, which are involved in diverse aspects of plant growth and development and in multiple abiotic stress tolerance mechanisms. The potent TF genes with the same functions tended to cluster into one subgroup, which provides a vital resource for exploring the functions of the TF genes. This implies that these potent TF genes may be involved in responses to individual and combined AbS, a hypothesis that is also supported by the computational tissue-specific expression and gene ontology annotation.

The proteins encoded by these AbSR TF genes vary widely in amino acid length, isoelectric point, molecular weight, aliphatic index, instability index and GRAVY value. Additionally, the subcellular localization of these proteins at independent organelles can be attributed to the presence of putative novel variants, which require further validation.

With the emerging advancement of ultra-high-throughput omics tools and approaches, including molecular physiology and computational strategies, the pivotal role of NAC, ZF-HD, AP2-EREBP, WRKY and bHLH TFs in gene regulation and signal transduction mechanisms has been well studied in all major crops and tree species. However, no such reports on these TFs are available for O. sativa, which is considered a model system for scrutinizing C3 photosynthesis and AbS tolerance mechanisms. Considering the significance of this crop and of these TFs, the current study used a wide range of computational approaches to categorize and characterize these TF gene family members. The identified TF genes were used for gene ontology annotation, gene structure prediction, and prediction of physicochemical properties and subcellular localization. In addition, in silico expression profiling of the 17 NAC, 3 ZF-HD, 13 AP2-EREBP, 11 WRKY and 8 bHLH genes was used to understand their tissue-specific differential expression. The results provide an important indication of their controlling and regulatory functions under AbS conditions.