Introduction

Rice (Oryza sativa L.) is one of the most important staple food crops in Southeast Asia. Globally, India ranks first in area under rice cultivation and second in rice production after China. India produces 21.5% of rice globally (Viraktamath et al. 2011). Enhancing grain nutrient quality is one of the most important objectives of many rice breeding programmes. Rice seeds consist of over 90% starch and protein by dry weight (Juliano 1971). Several advances have been made in understanding the molecular and genetic basis of grain quality in rice. However, the molecular basis and response of signalling pathways to environmental factors, which can have strong and adverse effects on grain quality, are not fully resolved. Seed-storage protein genes and the genes underlying starch biosynthesis have been cloned, and their functions in grain quality have been described. The regulation of these genes at the transcriptional and post-transcriptional levels during development and in response to environmental conditions is not yet understood (Krishnan et al. 2009).

Scientists have been attempting to develop and improve grain quality through classical breeding as well as biotechnology. However, caution must be taken, as downregulation or upregulation of the expression of a single gene might result in unpredictable and complex biochemical and physiological changes. It has been reported that the modification of a single starch synthesis gene led to changes in all the traits related to cooking quality (Tian et al. 2009), and a reduction of one or several storage proteins could be compensated by increasing other proteins (Kawakatsu et al. 2010).

NAC is a domain name derived from the first letters of three different genes: NAM (no apical meristem), ATAF (Arabidopsis transcription activation factor) and CUC (cup-shaped cotyledon). Several NAM, ATAF1/2 and CUC2 (NAC) transcription factors (TFs) are involved in different cellular processes in plant species, for example, hormone signal pathways and development (Greve et al. 2003; Peng et al. 2009). NAC proteins may function as homodimers and heterodimers in plants (Xie et al. 2000; Hegedus et al. 2003; Ernst et al. 2004). OsNAC5 is also reported to form homodimers and heterodimers with different OsNACs (Jeong et al. 2009; Takasaki et al. 2010). Genes induced by pathogen attack and wounding were found in the ATAF subfamily (Ooka et al. 2003).

To date, various NAC TFs such as AtNAC072 (RD29), AtNAC019, AtNAC055 (Fujita et al. 2004; Tran et al. 2004) and ANAC102 (Christianson et al. 2009) from Arabidopsis, BnNAC from Brassica napus (Hegedus et al. 2003), and SNAC1 (Hu et al. 2006), SNAC2/OsNAC6 (Nakashima et al. 2007; Hu et al. 2008), OsNAC5 (Sperotto et al. 2009; Zheng et al. 2009; Takasaki et al. 2010; Song et al. 2011) and OsNAC10 (Jeong et al. 2010) from rice have been shown to be involved in responses to various environmental stresses.

Recently, the stress-responsive gene OsNAC5 was reported in rice (Takasaki et al. 2010; Song et al. 2011). OsNAC5-overexpressing transgenic plants had increased tolerance to drought, high salinity and low temperature. These studies, however, were limited to stress tolerance and yield improvement (Jeong et al. 2013). A few reports indicated that its function could be related to Fe, Zn and amino acid remobilization from green tissues to seed (Lim et al. 2007; Sperotto et al. 2009, 2010). To investigate the role of OsNAC5 in regulating seed-storage protein content, an expression analysis of rice NAC TFs was conducted in relation to grain protein content (GPC) in aerobic environments. More importantly, the results suggest that expression of this gene could significantly enhance GPC in rice.

Methods and materials

Estimation of total protein and amylose content (%)

A near-infrared reflectance spectroscopy (NIR) system (FOSS, Denmark) was used for estimation of crude protein content and amylose content in recombinant inbred lines (RILs). The NIR system was standardized using biochemical data from genotypes of the \(\hbox {F}_{3}\) and \(\hbox {F}_{4}\) generations through a mathematical equation suitable for this mapping population. To measure protein content, rice was placed in a dry room with a constant humidity of 12% for a week to balance the moisture content. Samples were scanned using a Fourier transform (FT)-NIR instrument (Bruker MATRIX-1, Germany). Protein content was predicted from spectrum models developed with 3000 samples at the University of Agricultural Sciences (UAS), Bengaluru.

Gene expression studies for GPC

For in silico analysis, putative uncharacterized genes and known genes for seed storage protein accumulation were downloaded from the BGI database and searched for possible orthologues against the Ensembl database. To identify the most suitable candidate genes for expression studies, about 1246 putative genes of uncharacterized function expressed in the root, leaf and panicle during the grain filling stage of panicle development were downloaded and searched for their possible orthologues. Among the grain filling genes investigated, a total of 49 genes qualified as possible orthologues involved in the seed storage mechanism either directly or indirectly. From these, the gene LOC_Os011g0184900 was selected for expression analysis. Selection was further based on protein sequence identity of more than 90% between query and subject.

Table 1 GPC in selected five RILs of cross between BPT5204 and HPR14.
Table 2 Different stages of grain filling and time of sample collection.
Table 3 List of gene-specific primers used for gene expression studies.
Fig. 1 Relative quantification in real-time quantitative PCR for NAC-like TF between parents and five selected RILs spanning 4–14% GPC in rice.

mRNA was purified from leaves and panicles of the parents and five RILs, with GPC spanning from 4 to 14% of the population (table 1), grown under aerobic conditions. Samples were collected at three different stages (table 2) as described by Jain et al. (2007).

Total RNA was extracted from leaves and panicles with the RNeasy Plant Mini kit (Qiagen), and mRNA was reverse transcribed into cDNA with SuperScript III (Invitrogen). Gene-specific primers (table 3) were designed for each candidate gene based on the cDNA sequence available in the Ensembl database. The real-time polymerase chain reaction (RT-PCR) amplicons, 50–170 bp long, were designed to span an intron, providing an internal control for the detection of contaminating genomic DNA. Quantitative RT-PCR was performed on three biological replicates with a RealPlex (Eppendorf) using SYBR Green PCR Master Mix (Qiagen) as shown in figure 1. Melt curves (figure 2) were examined for problems such as genomic DNA contamination, primer-dimers and multiple products. Suitable targets were then used in a template dilution series to optimize reaction efficiency. Cycle threshold (Ct) values for each RIL were normalized to ubiquitin (reference gene).
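The use of a template dilution series to assess reaction efficiency can be illustrated with a minimal sketch. It assumes the commonly used standard-curve relation, in which Ct is regressed on log10 of the relative template amount and efficiency is taken as 10^(-1/slope) - 1; the function name and the Ct values are hypothetical and are not data from this study.

```python
import math


def pcr_efficiency(dilutions, ct_values):
    """Estimate amplification efficiency from a template dilution series.

    dilutions: relative template amounts (e.g. 1, 0.1, 0.01)
    ct_values: Ct measured at each dilution
    Returns (slope, efficiency); efficiency 1.0 means doubling per cycle.
    """
    xs = [math.log10(d) for d in dilutions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    # Least-squares slope of Ct versus log10(template amount)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency


# Hypothetical tenfold dilution series (illustrative values only)
slope, eff = pcr_efficiency([1, 0.1, 0.01, 0.001], [18.0, 21.4, 24.8, 28.2])
print(f"slope = {slope:.2f}, efficiency = {eff:.1%}")
```

A slope close to -3.32 corresponds to an efficiency close to 100% under this relation.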

In silico analysis of candidate gene

Protein and gene sequences of rice were used as queries to retrieve 36 homologous sequences of different plant genera using BLASTN and BLASTP (McGinnis and Madden 2004) of NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi). For the search, we used an E-value threshold of 0.001, query coverage \({\ge }70\%\) and sequence identity \(\ge 65\%\). Redundant sequences were removed using CD-HIT (Li and Godzik 2006).
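The search thresholds stated above can be expressed as a simple filter. The following is a minimal sketch only: the `BlastHit` record, its field names and the example hits are hypothetical, and treating the E-value cut-off as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    subject_id: str
    evalue: float
    query_coverage: float  # percent of the query covered by the alignment
    identity: float        # percent sequence identity


def passes_thresholds(hit, max_evalue=0.001, min_coverage=70.0, min_identity=65.0):
    """Apply the E-value, query coverage and identity thresholds to one hit."""
    return (
        hit.evalue <= max_evalue
        and hit.query_coverage >= min_coverage
        and hit.identity >= min_identity
    )


# Hypothetical hits (illustrative values only)
hits = [
    BlastHit("hit_A", 1e-50, 95.0, 90.0),  # passes all three thresholds
    BlastHit("hit_B", 1e-5, 60.0, 80.0),   # fails on query coverage
    BlastHit("hit_C", 0.01, 85.0, 70.0),   # fails on E-value
    BlastHit("hit_D", 1e-20, 75.0, 60.0),  # fails on identity
]
kept = [h.subject_id for h in hits if passes_thresholds(h)]
print(kept)
```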

Fig. 2 Single melt curve for NAC-like TF.

The nonredundant homologous sequences were then aligned using the Kimura 2 parameter in MUSCLE, a multiple alignment web server (Edgar 2004). A phylogenetic tree was constructed using the maximum likelihood (ML) method in PHYLIP (Felsenstein 1989) with 100 bootstrap replicates. Further, the protein sequence of the NAC-like TF was used for multiple sequence alignment and tree building with other NAC family proteins. The resulting tree was visualized and edited using a cladogram program.

Standardization of RT-PCR condition

Primer design

All TF families have a conserved region, which is common to all the members within the family. Nonconserved regions specific to each member of the family were scanned for primer design. The primer pair specific to each TF was designed using Primer3Plus software (http://www.bioinformatics.nl/cgibin/primer3plus/primer3plus.cgi), and the primers were synthesized at Sigma-Aldrich, USA. The following parameters were used for designing the primer pairs of the TFs: melting temperature (\(T_{\mathrm{m}}\)) of \(60\pm 2^{\circ }\hbox {C}\), primer length of 20–24 nucleotides, guanine–cytosine (GC) content of 45–55% and PCR amplicon length of 100–200 bp.
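The design parameters listed above can be checked programmatically. The sketch below is illustrative only: the function names and the example primer sequence are hypothetical, and the melting temperature is passed in as a value reported by the design software rather than computed, because the Tm formula used is not stated here.

```python
def gc_content(seq):
    """Return the GC content of a primer sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def primer_ok(seq, tm_celsius):
    """Check primer length (20-24 nt), GC content (45-55%) and Tm (60 +/- 2 C)."""
    return (
        20 <= len(seq) <= 24
        and 45.0 <= gc_content(seq) <= 55.0
        and 58.0 <= tm_celsius <= 62.0
    )


def amplicon_ok(length_bp):
    """Check that the predicted amplicon length is within 100-200 bp."""
    return 100 <= length_bp <= 200


# Hypothetical primer and Tm (not one of the primers used in this study)
example_primer = "ATGCGTACCTGAGCTTGACA"
print(len(example_primer), gc_content(example_primer))
print(primer_ok(example_primer, tm_celsius=60.3), amplicon_ok(150))
```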

Primer concentration

In real-time quantitative PCR, primer concentration is one of the key factors for any gene expression assay. Primer concentrations of 150, 200, 250, 300, 350 and 400 nmol were tested to optimize the amplification. Primers at a concentration of 200 nmol gave a single melting curve, a low Ct value, a high fluorescence value and no primer-dimers. Hence, this concentration was used for all experimental studies.

Annealing temperature

It is essential to determine the optimal annealing temperature (\(T_{\mathrm{a}}\)) for each primer pair before its actual use. The selection of \(T_{\mathrm{a}}\) is based on the length and positions of the primers. The common view is that the \(T_{\mathrm{a}}\) should be \(5^{\circ }\hbox {C}\) below the melting temperature (\(T_{\mathrm{m}}\)) of the primers. The optimal annealing temperatures for OsNAC and ubiquitin were found to be 59–60\(^{\circ }\hbox {C}\) and 61–63\(^{\circ }\hbox {C}\), respectively.

Reaction mixture

The RT-PCR master mix was freshly prepared to avoid handling errors. The reaction mixture of \(10\,\mu \hbox {L}\) contained 1.0 ng cDNA, 200 nmol of each gene-specific primer and \(5\,\mu \hbox {L}\) of \(2\times \hbox {SYBR}\) green reagents (Qiagen, USA). Individual components of the reaction mixture were standardized for a \(10\,\mu \hbox {L}\) reaction volume.

Setting baseline and threshold levels

The point of measurement (baseline and threshold levels) should be accurately determined to reflect the quantity of a particular target within the reaction. Eppendorf RealPlex detection software was used to set the default baseline from 3 to 15 cycles. The highly abundant gene ubiquitin and a few other TF genes started to amplify as early as the 20th PCR cycle, which is still beyond the baseline window. Therefore, the default baseline between cycles 3 and 15 was retained. Once the baseline has been set correctly, the software automatically sets the threshold at 10 standard deviations above the mean baseline fluorescence. It was ensured that the threshold line was placed in the exponential phase to increase the precision and quality of the experimental data, and the same default baseline and threshold settings were used for all PCR reactions.
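The threshold rule described above (mean baseline fluorescence plus 10 standard deviations, with the baseline taken over cycles 3 to 15) can be sketched as follows. This is not the instrument software's implementation: the function names and the fluorescence trace are hypothetical, the use of the sample standard deviation is an assumption, and instrument software typically interpolates a fractional Ct rather than reporting a whole cycle.

```python
from statistics import mean, stdev


def threshold_from_baseline(fluorescence, first_cycle=3, last_cycle=15, n_sd=10.0):
    """Return mean + n_sd * SD of the baseline window.

    fluorescence[0] is the reading at cycle 1.
    """
    baseline = fluorescence[first_cycle - 1:last_cycle]
    return mean(baseline) + n_sd * stdev(baseline)


def first_cycle_above(fluorescence, threshold):
    """Return the first whole cycle whose reading exceeds the threshold."""
    for cycle, value in enumerate(fluorescence, start=1):
        if value > threshold:
            return cycle
    return None


# Hypothetical trace: a flat, slightly noisy baseline for cycles 1-19,
# followed by growth from cycle 20 onwards (illustrative values only)
trace = [100 + (i % 3) for i in range(19)] + [100 * 1.8 ** k for k in range(1, 12)]
threshold = threshold_from_baseline(trace)
crossing_cycle = first_cycle_above(trace, threshold)
print(round(threshold, 2), crossing_cycle)
```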

RT-PCR condition and analysis of data

An Eppendorf RealPlex instrument (Eppendorf, India) was used for all RT-PCR amplifications. The wells defined as ‘unknown’ were used to calculate the relative quantifications. pyQPCR software (https://www.projet-plume.org/en/relier/pyqpcr) was used with an improved \(\Delta { Ct}\) method that allows reliable quantifications and error estimates to be obtained. The confidence level is modifiable, and confidence intervals can be calculated using either a Gaussian distribution or a t-test. The programme plots results as histograms that are easy to understand. Two of the assumptions for a t-test are that both groups of \(\Delta { Ct}\) have Gaussian distributions and that they have equal variances (Hollander and Wolfe 1973; Yuan and Neal Stewart 2005). To generate a baseline-subtracted plot of the logarithmic increase in fluorescence signal (\(\Delta { Rn}\)) versus cycle number, baseline data were collected between cycles 3 and 15. All amplifications were analysed with the threshold automatically set by the instrument. To compare data from different PCR runs or cDNA samples, Ct values for all the TF genes were normalized to the Ct value of the ubiquitin gene, which was the most stably expressed reference gene. Different methods are available for estimating PCR efficiency; the classical method uses Ct values obtained from a series of template dilutions.
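Normalization of target Ct values to a reference gene can be illustrated with the widely used comparative Ct calculation. The sketch below is not the improved \(\Delta { Ct}\) method implemented in pyQPCR, whose details are not given here. It assumes a single amplification efficiency shared by the target and reference genes, and the function names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Normalize a target Ct to the reference gene (e.g. ubiquitin) Ct."""
    return ct_target - ct_reference


def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_calibrator, ct_ref_calibrator,
                efficiency=1.0):
    """Relative expression of a sample versus a calibrator.

    With efficiency = 1.0 this reduces to 2 ** (-ddCt).
    """
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_calibrator, ct_ref_calibrator))
    return (1.0 + efficiency) ** (-ddct)


# Hypothetical Ct values (illustrative only):
# sample dCt = 24 - 18 = 6, calibrator dCt = 26 - 18 = 8, ddCt = -2
fc = fold_change(24.0, 18.0, 26.0, 18.0)
print(fc)
```

A target that crosses the threshold two cycles earlier than in the calibrator, after normalization, corresponds to a fourfold higher relative expression when the efficiency is 100%.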

Results

Alignment and phylogenetic analysis

The initial approach of the BLAST method for gene prediction was to identify homologues to known genes, since single nucleotide comparisons only provide an indication whether genes may or may not be involved in the seed storage protein mechanism. Therefore, phylogenetic analysis was performed to predict associated homology. In the nucleotide BLASTN study, a NAC-like TF was found to have cDNA sequence similar to the mRNA of the NAC5 gene.

Multiple sequence alignment of the NAC protein sequence of O. sativa with different NAC proteins of other crop plants suggested that residues towards the N-terminal region appeared to be more conserved than the C-terminal region. A total of 10 clusters were generated using the neighbour-joining (NJ) method.

In eight cases, different NAC genes from other species formed a single cluster indicating that they are possible orthologues of each other. The NAC-like TF from O. sativa indica used in this study was found to be a close relative of OsNAC5 in the above phylogenetic studies and an orthologue to Glycine max NAC8, NAC2 and Triticum aestivum NAC6, whereas it was paralogous to OsNAC6, as they formed a single cluster (figure 3).

Expression analysis of OsNAC-like TF at flowering initiation and different stages of grain filling in BPT5204 and HPR14

Expression in leaves: Expression of OsNAC-like TF at flowering initiation in comparison with the baseline was 0.88-fold in BPT5204 and 1.14-fold in HPR14, whereas expression was 2.16-, 2.28- and 2.16-fold at the \(\hbox {S}_{1}, \hbox {S}_{2}\) and \(\hbox {S}_{3}\) stages of grain filling in BPT5204, respectively. HPR14 showed a high transcript abundance of 3.95-, 4.9- and 4.29-fold at the \(\hbox {S}_{1}, \hbox {S}_{2}\) and \(\hbox {S}_{3}\) stages of grain filling, respectively, summarized in table 4.

Expression in panicle: Low transcript abundance was observed in the panicle of BPT5204 at all stages. The OsNAC-like TF showed 0.75-, 0.3-, 0.34- and 0.32-fold expression at the \(\hbox {S}_{0}, \hbox {S}_{1}, \hbox {S}_{2}\) and \(\hbox {S}_{3}\) stages during grain filling in BPT5204, respectively, whereas moderate transcript abundance was observed in the panicle of HPR14, with 1.0-, 2.30-, 2.84- and 0.88-fold expression at the \(\hbox {S}_{0}, \hbox {S}_{1}, \hbox {S}_{2}\) and \(\hbox {S}_{3}\) stages, respectively, summarized in table 4.

Table 4 Transcript abundance of OsNAC5-like TF at different stages of grain filling in parents.
Fig. 3 Phylogenetic relationship among rice NAC genes, OsNAC5 and NAC-like TF. GLYMA, G. max; At, Arabidopsis; MLOC, Hordeum vulgare; GRMZM, Zea mays; Ta, Triticum aestivum; MTR, Medicago truncatula; SB, Sorghum bicolor; BNA, B. napus.

Fig. 4 Determination of relative expression of NAC-like TF at the \(\hbox {S}_{0}, \hbox {S}_{1}, \hbox {S}_{2}\) and \(\hbox {S}_{3}\) stages in (a) leaf and (b) panicle tissues of the BPT5204 and HPR14 genotypes using RT-PCR.

Relatively higher transcript stage of OsNAC-like TF

OsNAC-like TF expressed relatively higher transcript at the \(\hbox {S}_{2}\) stage; HPR14 accumulated a 3.9-fold increase in leaves and a 1.84-fold increase in the panicle, whereas BPT5204 had a 1.28-fold increase in leaves and a 0.34-fold increase in the panicle (figure 4).

Expression analysis of OsNAC-like TF at S2 stage during grain filling in five selected RILs

Expression in leaves: The expression level of the NAC-like TF in leaves was found to increase approximately linearly with GPC across the RILs. Here, RIL1 (4–5% GPC), RIL2 (6–8% GPC), RIL3 (8–10% GPC), RIL4 (10–12% GPC) and RIL5 (13–14% GPC) showed NAC-like TF expression of 1.9-, 1.9-, 2.5-, 3.04- and 4.51-fold in leaves, respectively (table 5).

Expression in panicle: A nearly linear relationship was observed in the expression level of NAC-like TF in panicles of RILs with increasing GPC (figure 5). Here, RIL1 (4–5% GPC), RIL2 (6–8% GPC), RIL3 (8–10% GPC), RIL4 (10–12% GPC) and RIL5 (13–14% GPC) showed 0.47-, 0.62-, 1.2-, 1.5- and 3.2-fold expression in the panicle, respectively, as summarized in table 5.
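The strength of the relationship between GPC and expression can be summarized with a Pearson correlation coefficient. The sketch below uses the fold-expression values quoted above for the five RILs; representing each RIL by the midpoint of its GPC range is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Midpoints of the GPC ranges of RIL1 to RIL5 (assumption: range midpoints)
gpc_midpoints = [4.5, 7.0, 9.0, 11.0, 13.5]
# Fold expression at the S2 stage as reported in the text
leaf_fold = [1.9, 1.9, 2.5, 3.04, 4.51]
panicle_fold = [0.47, 0.62, 1.2, 1.5, 3.2]

r_leaf = pearson_r(gpc_midpoints, leaf_fold)
r_panicle = pearson_r(gpc_midpoints, panicle_fold)
print(f"leaf r = {r_leaf:.2f}, panicle r = {r_panicle:.2f}")
```

With only five RILs, such a coefficient describes the trend in these lines and should not be read as a population-level estimate.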

Discussion

Regulation of gene expression is vital for a variety of essential processes in plants, including growth, development, differentiation, metabolic regulation and adaptation to biotic and abiotic stresses. Initiation of transcription is the first step in the expression of any downstream gene; it plays a central role in the regulation of the expression of downstream genes. Transcription appears to be controlled by numerous TFs that mediate the effects of intracellular and extracellular signals (Verma and Agarwal 2010). Therefore, the functional analysis of TF genes is essential for understanding their role in grain filling.

Alignment and phylogenetic analysis

The first systematic analysis of Arabidopsis and rice NAC proteins classified them into 18 subgroups (Ooka et al. 2003). However, another phylogenetic analysis of rice NAC proteins suggested that the NAC family can be divided into five groups, and each subfamily was largely diversified (Fang et al. 2008). A further report described eight subfamilies (Shen et al. 2009). The main reason for the discrepancies in the reported phylogenetic trees may lie in the fact that all the previous NAC protein classifications were based on the conserved N-terminal NAC domains, either from sub-domain A to D or from A to E, and did not take the highly divergent C-terminal sequences into consideration (Ooka et al. 2003; Pinheiro et al. 2009; Shen et al. 2009).

To gain a better understanding of the phylogeny of the NAC gene family, we performed phylogenetic analysis with the inclusion of the highly diverse C-terminal sequence. Moreover, the different algorithms used in phylogenetic analyses may lead to inconsistent interpretations. In the previous analyses, different algorithms, including NJ (Ooka et al. 2003; Fang et al. 2008), ML (Shen et al. 2009) and the Bayesian method (Pinheiro et al. 2009), were implemented, which may make the results less comparable.

In this study, the NAC-like TF showed 90% sequence similarity with the cDNA of the OsNAC5 gene present in the NCBI database. Further, multiple sequence alignment of the derived protein sequence of the NAC-like TF with other NAC protein sequences of different crop plants revealed that sequences are more conserved at the N-terminal region, forming a conserved sub-domain of the NAC family, whereas the C-terminal region was found to be highly diverse among different NAC genes. In most cases, different NAC genes from different species formed a single cluster, indicating that they are possible orthologues of each other. Of the 10 clusters observed using the NJ method during phylogenetic analysis, in one cluster the NAC-like TF was found to be a close relative of OsNAC5 and an orthologue of G. max NAC8 and NAC2 and T. aestivum NAC6, whereas it was a paralogue of OsNAC6.

Fig. 5 Determination of relative expression of NAC-like TF transcript at the \(\hbox {S}_{2}\) stage in (a) leaf and (b) panicle tissues of BPT5204, HPR14 and low to high GPC RILs using RT-PCR. y-axis, expression in fold; x-axis, genotypes. Ctrl, control (ubiquitin gene); TG, target gene.

Table 5 Transcript abundance of OsNAC5-like TF at \(\hbox {S}_{2}\) stage in parents and five selected RILs.

Based on the available information on various TFs and their annotated functions in the model system O. sativa, important genes were selected for their expression analysis on the basis of their direct and indirect roles in regulating grain filling. TF genes showing detectable expression levels belonged to OsNAC families. The NAC genes constitute one of the largest families of plant-specific TFs and are present in a wide range of land plants. Genes in the NAC family have been shown to regulate a wide range of developmental processes including seed development, embryo development, shoot apical meristems, fibre development, leaf senescence and cell division (Souer et al. 1996; Aida et al. 1997; Sablowski and Meyerowitz 1998; Xie et al. 2000; Uauy et al. 2006).

In this study, we found the rice NAC-like TF gene to be upregulated at the \(\hbox {S}_{2}\) stage in the leaves and panicles of the parent HPR14 and also in the five RILs. A significant increase in transcript abundance was observed. A wheat NAC gene, NAM-B1, was reported to be involved in nutrient remobilization from leaves to developing grains (Uauy et al. 2006; Zhang et al. 2008). In rice, OsNAC10, the closest protein to NAM-B1 (a NAC TF that promotes accelerated senescence and increases nutrient remobilization from leaves to developing grain), was also reported to be located near the QTL qCP7. GPC is highly influenced by the amount of N remobilization from leaves to grain and dry matter accumulation in the seeds. The protein to starch ratio in the grain is one of the results of events occurring at both the sink (developing grains) and the source (leaves).

Several studies suggest that source regulation plays a significant role in grain protein accumulation (Barneix and Guitman 1993; Martre et al. 2003). However, efficient remobilization of N takes place during short duration grain filling but may result in reduced total kernel weight under favourable conditions (Kade et al. 2005). OsNAC5 was demonstrated to be a senescence-associated gene that is upregulated during grain maturation in rice flag leaves (Sperotto et al. 2009). Lim et al. (2007) reported that OsNAC5 is regulated by ABA, a hormone with a known central role in senescence processes. A comparison of diverse cultivars showed a positive correlation of OsNAC5 expression in flag leaves before and during anthesis with final Fe, Zn and protein concentrations in mature grains (Sperotto et al. 2009, 2010).

In our results, there was a consistent increase in the relative abundance of transcript in leaves with increasing GPC across RILs. The result is consistent with reports by Lim et al. (2007) and Sperotto et al. (2009, 2010). This suggests that the putative OsNAC5 TF acts as a candidate gene for nutrient reservoir activity to enhance GPC during grain filling in rice.