Introduction

About three-fourths of the biosphere is at temperatures below 5 °C, which poses a major barrier to sustaining life under such harsh conditions (Casanueva et al. 2010; Rodrigues and Tiedje 2008). Nevertheless, many life forms have adapted well to such environments in the course of evolution. Microorganisms, the most ancient colonizers of this planet, are well-known inhabitants of extreme ecological niches owing to their enormous metabolic diversity. Most of the known psychrophilic microorganisms belong to various species of archaea, bacteria, yeast, fungi and algae (Thieringer et al. 1998; D’Amico et al. 2006; Piette et al. 2011). To survive at low temperatures, psychrophiles up- or down-regulate the expression of a considerable number of genes coding for proteins involved in cold acclimatization and survival. Cold shock proteins (CSPs) constitute a highly conserved family of structurally related DNA-binding proteins that are produced in response to a temperature downshift (Moon et al. 2009). CSPs are distributed across prokaryotes and eukaryotes and help cells cope with the cold shock caused by an abrupt downshift in temperature. In prokaryotes, CSPs are found in a diverse group of bacteria such as Bacillus sp., Streptococcus sp., Thermotoga sp., Listeria sp., Arthrobacter sp., Escherichia coli and Pseudomonas sp. (Goldstein et al. 1990; Graumann et al. 1997; Fang et al. 2012; Lee et al. 2013; Hoffmann et al. 2013; Lee et al. 2014; Bisht et al. 2014). Among prokaryotes, CSPs were first identified in E. coli, and further discoveries showed that it codes for eight more similar proteins (CspB–CspI). In E. coli, CspA is the major CSP and accumulates up to 10% of total protein upon exposure to low temperature (Goldstein et al. 1990). Upon a temperature downshift, cell growth is transiently arrested and the synthesis of most proteins shuts down, except for that of cold-inducible proteins (Polissi et al. 2003). After this transient acclimatization, the cells adapt to the lower temperature and resume growth, but at a lower rate. Bulk protein synthesis then restarts, with a decline in the expression of the cold-inducible proteins (Phadtare 2004).

In this study, we cloned and characterized the cspA gene from Pseudomonas koreensis P2, a cold-adaptive bacterium isolated from the Himalayan region of Arunachal Pradesh, India. The expression of the cspA gene at different temperatures was also estimated using quantitative real-time PCR. In addition, we applied a composite structure-prediction approach to model the CspA protein and carried out further analyses to assess its role in survival under adverse environmental conditions and in adaptation to low temperature.

Materials and methods

Bacterial culture

The bacterial strain Pseudomonas koreensis P2 (NAIMCC-B-01747) was previously isolated by our group from soils collected from Sela Lake, Arunachal Pradesh, India. It was found to grow at temperatures ranging from 4 to 35 °C, with optimum growth at 15 °C. Pure cultures were maintained in glycerol at −20 °C and used for further studies.

Amplification of cspA gene

The cspA genes of eight different Pseudomonas spp. available in the NCBI databases were downloaded and aligned (Supplementary figure 1) using ClustalW (Thompson et al. 2002). The primers cspA F (ATGTCTAATCGCCAAACC) and cspA R (TTACTCTGGGCGAACTTG) were then designed from the alignment to amplify the full-length cspA gene. Genomic DNA was extracted from P. koreensis P2 using a standard protocol (Ausubel et al. 2003). The cspA gene was amplified from the chromosomal DNA of P. koreensis P2 with the primer pair cspA F and cspA R, as described by Rai et al. (2015). The resultant amplicon was purified using a PCR purification kit procured from Nucleo-pore (ThermoFisher Scientific, USA).
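The claim that the primer pair targets the full-length gene can be sanity-checked from the primer sequences alone. The following is a minimal sketch, not part of the original protocol: the function names are illustrative, and it only tests whether the forward primer begins with a start codon and whether the reverse primer, read on the sense strand, ends with a stop codon.

```python
# Primer sequences as given in the text (5' to 3').
FORWARD_PRIMER = "ATGTCTAATCGCCAAACC"
REVERSE_PRIMER = "TTACTCTGGGCGAACTTG"

START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def spans_full_orf(forward, reverse):
    """Illustrative check: forward primer starts at ATG and the reverse
    primer covers a stop codon when read on the sense strand."""
    sense_end = reverse_complement(reverse)
    return forward.startswith(START_CODON) and sense_end[-3:] in STOP_CODONS


if __name__ == "__main__":
    print(reverse_complement(REVERSE_PRIMER))
    print(spans_full_orf(FORWARD_PRIMER, REVERSE_PRIMER))
```

Under these assumptions, the reverse primer corresponds to the 3' end of the coding strand including a TAA stop codon, which is consistent with amplification of the complete open reading frame.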

Cloning and sequencing

The purified amplicon was cloned into the pJET1.2 vector, following the manufacturer’s protocol with minor modifications, and transformed into chemically competent E. coli DH5α cells. The cloning procedure employed positive selection: clones grown on Luria–Bertani plates supplemented with ampicillin (LB Amp+) were selected, and the cloned gene was amplified using gene-specific primers to confirm positive clones. Clones were then sequenced on an automatic ABI-3130 XL Sequencer (Applied Biosystems) using the pJET1.2 forward and reverse primers. The curated sequence of cspA, after removal of vector sequence with VecScreen (NCBI), was translated into its amino acid sequence using the ExPASy server (Artimo 2012).

Analysis of expression of cspA gene at different temperatures

To measure the expression levels of the cspA gene under different temperature regimes, exponentially growing P. koreensis P2 was inoculated (at 5% w/v) in nutrient broth and incubated in a shaking incubator at 5 °C, 15 °C and 30 °C for 12 h at 150 rpm. The low-temperature conditions, 5 °C and 15 °C, served as treatments and were named T1 and T2, respectively, whereas the culture incubated at 30 °C served as the control. All experimental runs were performed in triplicate. After 12 h of incubation, cultures from each temperature treatment were pelleted for total RNA isolation. Total RNA was isolated using the GeneJET™ RNA purification kit (Fermentas). The quality of the total RNA was determined by electrophoresis on a 1.2% formaldehyde agarose gel, visualized in a Biorad™ Chemidoc XRS gel documentation system, and its quantity was determined using a Nanodrop. Prior to qPCR, cDNA was synthesized from the total RNA of all temperature treatments using the iScript™ cDNA synthesis kit (Biorad). The cDNA synthesized from the treatments (T1 and T2) and the control served as template for the qRT-PCR reaction mix. qRT-PCR was performed in triplicate in a reaction volume of 20 µl consisting of 10 µl 2 × SYBR Green master mix, 0.5 µl QN ROX Reference Dye, 1 µl each of primer F and primer R (10 pmol), 1 µl cDNA and 6.5 µl RNase-free water. The cspA-specific primers described by Ivancic et al. (2013) were used for quantification of transcripts. The 16S rRNA gene served as an endogenous control. The following thermal cycling conditions were set: initial denaturation at 95 °C for 2 min, followed by 40 cycles of three-step amplification with denaturation at 95 °C for 5 s, annealing at 55 °C for 30 s and extension at 65 °C for 1 min. Gene quantification was performed as per the MIQE guidelines (Bustin et al. 2009).
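The reaction set-up and the normalization against 16S rRNA can be illustrated with a short calculation. This is a sketch under stated assumptions: the text does not name the quantification formula, so the widely used 2^(−ΔΔCt) approach (which assumes roughly 100% amplification efficiency for both genes) is used here as a stand-in, and all Ct values in the example are hypothetical.

```python
# Reaction components from the text, in microlitres.
REACTION_MIX_UL = {
    "2x SYBR Green master mix": 10.0,
    "ROX reference dye": 0.5,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "cDNA template": 1.0,
    "RNase-free water": 6.5,
}


def total_volume(mix):
    """Sum the component volumes of a reaction mix (microlitres)."""
    return sum(mix.values())


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^(-ddCt) method (assumed here, not stated in
    the text). The reference gene plays the role of 16S rRNA."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # The listed components should add up to the stated 20 µl.
    print(total_volume(REACTION_MIX_UL))
    # Hypothetical Ct values, for illustration only.
    print(relative_expression(20.0, 10.0, 22.0, 10.0))
```

With this convention, a value above 1 indicates higher cspA transcript abundance in the treatment than in the 30 °C control, and a value near 1 indicates no change.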

Prediction of primary and secondary structure

The physico-chemical characteristics were studied by computing the extinction coefficient, theoretical isoelectric point (pI), molecular weight, total number of positive and negative residues (Gill et al. 1989), aliphatic index (Atsushi 1980), grand average of hydropathy (GRAVY) (Kyte and Russell 1982) and instability index (Guruprasad et al. 1990) using ExPASy’s ProtParam server (Gasteiger 2005). The amino acid composition of CspA from P. koreensis P2 was compared with that of CspA from cold-adaptive P. syringae NCPPB 3739 and from mesophilic P. aeruginosa DSM 22644 and P. stutzeri DSM 50227. For this purpose, the amino acid sequences were retrieved from the UniProt database. Principal component analysis (PCA) was carried out to identify important amino acids. Based on the PCA results, a biplot was drawn to group the strains and to identify the important amino acids for each group. In addition, hierarchical clustering was performed and a dendrogram was drawn to cross-check the grouping of strains. All statistical analyses were carried out using R version 3.4.4 (2018-03-15). The secondary structure of CspA was predicted using the PDBSum server (Laskowski 2001), which provides 3D protein structure information on motifs, domains, helices, beta sheets and strands, angles, etc. Subcellular localization of the protein was determined using CELLO V.2.5 (Yu et al. 2006) and PSORTb (Nancy et al. 2010).
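Two of the indices listed above have simple closed forms and can be reproduced without the server. The sketch below is an illustration rather than the ProtParam implementation: GRAVY is taken as the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)) with X in mole percent. The example peptide is hypothetical and is not the CspA sequence.

```python
# Kyte-Doolittle hydropathy values per residue (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[res] for res in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(sequence)

    def mole_percent(res):
        return 100.0 * sequence.count(res) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    example_peptide = "MKTAYIAKQR"  # hypothetical, for illustration only
    print(round(gravy(example_peptide), 3))
    print(round(aliphatic_index(example_peptide), 2))
```

A negative GRAVY value indicates an overall hydrophilic protein, and a higher aliphatic index indicates a larger relative volume occupied by aliphatic side chains.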

Phylogenetic analysis of CspA protein

A total of 35 amino acid sequences of CspA proteins from different bacteria were retrieved from NCBI. Phylogenetic analysis of the retrieved protein sequences, together with the CspA protein of Pseudomonas koreensis P2, was based on a multiple alignment generated with ClustalW (Thompson et al. 2002). A phylogenetic tree was constructed by the maximum likelihood method using MEGA 6.06 (Tamura 2013) with 1000 bootstrap replicates. The evolutionary distances were computed using the JTT matrix (Jones et al. 1992). All positions containing missing data and gaps were removed.

Sequence analysis and molecular modeling

A similarity search of the CspA sequence was performed with BLASTp (Altschul et al. 1990) at NCBI (http://www.ncbi.nlm.nih.gov), with PDB (Berman et al. 2000) as the reference database, to identify suitable templates for modeling CspA. The search revealed that no single template with a low E-value and acceptable identity achieved 100% query coverage. Hence, a combination of multiple templates was used to improve the query coverage. We used the I-TASSER server (http://zhang.bioinformatics.ku.edu/I-TASSER) to develop a high-quality 3D model (Roy et al. 2010). In addition, SWISS-MODEL (Schwede 2003) and Phyre2 (Kelley et al. 2015) were used to check the reliability of the structure.

Assessment of predicted model

The stereo-chemical quality of the predicted model was checked by analyzing the overall structure and the residue-by-residue geometry of the protein. The quality of the refined, energy-minimized CspA model was assessed with PROCHECK (Laskowski et al. 2001) and by QMEAN Z-score estimation using the QMEAN server (Benkert et al. 2009). VERIFY 3-D (Eisenberg et al. 1997) and the ERRAT server (Colovos and Yeates 1997) were used for structural validation.

Structural analysis of CspA model

Binding pockets of the CspA protein were predicted using the CASTp server (Binkowski et al. 2003). The RNA- and DNA-binding specificities of the CspA protein were assessed using the BindUP server (Paz et al. 2016). BindUP predicts nucleic acid binding proteins (NABPs) on the basis of electrostatic patches on the protein surface using the NAbind algorithm (Stawiski et al. 2003; Shazman and Mandel-Gutfreund 2008).

Results and discussion

Cloning and sequencing of cspA gene

PCR amplification yielded an amplicon of approximately 200 bp (Fig. 1a, b). Cloning, sequencing and vector screening revealed that the cspA gene of P. koreensis P2 is 213 bp long. A search of the sequence against the NCBI database confirmed its identity as a cspA gene. The complete sequence of the cspA gene has been submitted to the DNA Data Bank of Japan (DDBJ) under accession no. LC214053.

Fig. 1

a cspA gene amplified using the designed primers; M: marker; Lane 1: cspA amplicon. b Cloned cspA gene amplified with the pJET sequencing primers; M: marker; Lane 1: cloned cspA amplicon

Real time quantification of cspA gene expression at different temperature variables

The role of the cspA gene in cold adaptation was examined further by qRT-PCR-based gene quantification. The cDNA of the cspA gene was about 150 bp in size and was used for copy number estimation by qRT-PCR. Compared with the control (30 °C), the relative quantity of cspA did not change significantly at 5 °C, whereas at 15 °C it increased to 2.57 ± 0.23 (Fig. 2). The results suggest two possibilities: first, the cspA gene may contribute minimally to cold adaptation at very low temperatures such as 5 °C and play a more pronounced role at moderately low temperatures; second, at very low temperatures, additional cold-inducible genes may be responsible for the cold adaptation of P. koreensis P2. It is pertinent to mention that the optimum growth temperature of the strain is also 15 °C, which suggests that at 5 °C the machinery required for the transcription of cspA is probably inadequately formed, leading to lower transcription. Similar induction of cspA mRNA synthesis has been demonstrated in various cold-adaptive isolates of P. fluorescens, E. coli, Arthrobacter protophormiae, Caulobacter spp. etc. (Ray 1994; Panicker et al. 2002; Mazzon et al. 2012; Bisht et al. 2014). It has been reported that cspA mRNA levels are inversely related to temperature (Ivancic et al. 2013; Song et al. 2012). Goldenberg et al. (1996) suggested that the cspA mRNA is rapidly degraded at 37 °C (t1/2 = 10 s), while at low temperatures it becomes much more stable (t1/2 = 20 min).
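The practical meaning of the two half-lives quoted above can be made concrete with a short calculation. This sketch assumes simple first-order (exponential) decay of the transcript, which the text does not state explicitly; the function and variable names are illustrative.

```python
def fraction_remaining(elapsed_s, half_life_s):
    """Fraction of transcript left after elapsed_s seconds, assuming
    first-order decay with the given half-life (an assumption here)."""
    return 0.5 ** (elapsed_s / half_life_s)


# Half-lives as quoted in the text, converted to seconds.
HALF_LIFE_37C_S = 10.0
HALF_LIFE_COLD_S = 20.0 * 60.0

# Ratio of the two half-lives: how much longer the mRNA persists in the cold.
fold_stabilization = HALF_LIFE_COLD_S / HALF_LIFE_37C_S

if __name__ == "__main__":
    print(fold_stabilization)
    # Fraction of cspA mRNA left one minute after synthesis stops.
    print(fraction_remaining(60.0, HALF_LIFE_37C_S))
    print(fraction_remaining(60.0, HALF_LIFE_COLD_S))
```

Under this model, one minute is six half-lives at 37 °C but only a twentieth of a half-life at low temperature, so transcript accumulation in the cold can arise from stabilization alone, independent of any change in transcription rate.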

Fig. 2

Graphical representation of relative gene expression at different temperatures

Sequence analysis and secondary structure

The molecular weight and pI of the CspA protein were calculated as 7.69 kDa and 6.55, respectively. The molecular formula of the CspA protein was found to be C342H524N94O105S2. CspA from P. koreensis P2 comprises 70 amino acid residues. Bi-plot analyses and UPGMA clustering based on Euclidean distance revealed that, in terms of amino acid composition, both cold-adaptive CspA proteins (P. koreensis and P. syringae) clustered together, while those of P. aeruginosa and P. stutzeri were placed in a different cluster (Fig. 3). This indicates that the amino acid composition of CspA differs between cold-adaptive and mesophilic species of Pseudomonas. Lys (L), Pro (P) and Thr (T) were the predominant amino acids explaining the variation in CspA of the cold-adaptive species, while Ser (S), Ile (I), Arg (R), His (H) and Tyr (Y) were important in CspA of P. aeruginosa and P. stutzeri.
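The reported molecular weight, molecular formula, protein length and gene length can be cross-checked against one another. The following is a minimal sketch using standard average atomic masses; the helper names are illustrative, and the gene-length check assumes that the 213 bp figure includes the stop codon.

```python
import re

# Standard average atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def formula_mass(formula):
    """Molar mass (Da) of a formula such as 'C342H524N94O105S2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


def orf_length_bp(n_residues):
    """Open reading frame length: one codon per residue plus a stop codon
    (assumes the stop codon is counted in the gene length)."""
    return 3 * (n_residues + 1)


if __name__ == "__main__":
    print(round(formula_mass("C342H524N94O105S2") / 1000.0, 2))  # in kDa
    print(orf_length_bp(70))
```

The formula gives a mass of roughly 7.70 kDa, in line with the reported 7.69 kDa, and 70 residues plus a stop codon correspond to the 213 bp gene length.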

Fig. 3

Bi-plot analyses and clustering of CspA from Pseudomonas spp. based on their amino acid composition

Metpally and Reddy (2009) reported that neutral and small amino acid groups were greatly favored in psychrophiles. The tiny/small amino acids, viz. Ala (A), Gly (G), Ser (S), Asn (N), Asp (D), Pro (P) and Thr (T), constituted 42.85% and 41.50% of CspA in P. koreensis and P. syringae, respectively, compared with 40.3% and 39% in P. aeruginosa and P. stutzeri. Although the difference in the proportion of tiny/small amino acids between cold-adaptive and mesophilic species was not large, it is interesting to note that, of the three amino acids explaining the variation in CspA of P. koreensis and P. syringae, two (P and T) were tiny/small, whereas of the five amino acids explaining the variation in CspA of the mesophilic Pseudomonas, only one (S) was tiny/small. CspA of P. koreensis P2 was predicted to be localized in the cytoplasm. The secondary structure predicted by the PDBSum server showed 1 sheet, 3 β-hairpins, 1 Ψ loop, 4 β-bulges, 5 strands, 2 helices and 3 β-turns (Supplementary figure 2). Multiple sequence alignment of CspA proteins (Supplementary figure 3) showed the presence of two motifs, viz. RNP1 and RNP2, while 16 of 70 residues were found to be 100% conserved. Similar structure and composition of the RNA-binding domains (RNP1 and RNP2) were reported in E. coli and Bacillus subtilis (Schindelin et al. 1993, 1994). Schröder et al. (1995) showed through mutational analysis that RNP1 and RNP2 were essential for ssDNA-binding activity. RNA molecules typically form stable secondary structures at low temperature, which may cause premature transcription termination. Hence, RNA chaperones are essential to the transcriptional process at low temperature. The presence of these motifs in CspA of P. koreensis P2 indicates conservation of the nucleic acid binding capacity and, possibly, of the functional role as an RNA chaperone observed in other bacteria.
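The tiny/small residue percentages compared above follow from a simple count over the sequence. This is an illustrative sketch: the residue set is the one listed in the text, the function name is hypothetical, and the test peptide is not the CspA sequence.

```python
# Tiny/small residues as listed in the text: Ala, Gly, Ser, Asn, Asp, Pro, Thr.
SMALL_RESIDUES = frozenset("AGSNDPT")


def small_residue_percent(sequence):
    """Percentage of residues in the sequence that are tiny/small."""
    n_small = sum(1 for res in sequence if res in SMALL_RESIDUES)
    return 100.0 * n_small / len(sequence)


if __name__ == "__main__":
    example_peptide = "MKTAYIAKQR"  # hypothetical, for illustration only
    print(small_residue_percent(example_peptide))
    # If the P. koreensis figure refers to all 70 residues, 30 small
    # residues would give a value matching the reported ~42.85%.
    print(100.0 * 30 / 70)
```

Because the proteins are only about 70 residues long, a difference of one or two percentage points between species corresponds to a change of one or two residues, which is worth keeping in mind when interpreting the comparison.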

Phylogeny of CspA protein

CspA has been reported in a large number of bacteria (Phadtare et al. 1999; Kortmann and Narberhaus 2012). Phylogenetic analysis showed that the CspA protein of P. koreensis P2 is closely related to CspA of P. fluorescens WH6 (100%), P. agarici (98.4%) and P. rhizosphaerae (98.4%), forming a monophyletic clade (Fig. 4). P. fluorescens belongs to the P. fluorescens subgroup, which is phylogenetically distant from the P. koreensis subgroup (Gomila et al. 2015). Other close neighbours such as P. rhizosphaerae, P. japonica and P. cremoricolorata belong to the P. putida subgroup, which is also distant from the P. koreensis group, while P. agarici belongs to an unclassified group of Pseudomonas (Anzai et al. 2000). Phylogenetic analysis revealed that the cold shock proteins (CspA) are highly diverse in pseudomonads, as the similarity ranged from 35 to 100%. The evolutionary relationship among different CspA proteins indicates probable gene transfer among different subgroups of Pseudomonas. The AT content of the cspA gene was 51.24%, which is higher than the genome average (39.55%, unpublished). Jensen et al. (2004) reported that high-AT genomic regions in Pseudomonas were more prone to gene transfer. Moreover, it is well accepted that laterally transferred genes are generally AT rich (Moszer et al. 1999; Lawrence and Ochman 1997; Médigue et al. 1991). It is quite possible that interspecies gene transfer among the pseudomonads gave rise to the cspA of P. koreensis, which is also supported by the phylogenetic analyses, as the closest cspA (that of P. fluorescens WH6) is from a different subgroup of Pseudomonas. Such gene transfer might have enabled cold-adaptive P. koreensis P2 to survive under the cold climatic conditions of the eastern Himalayas. Earlier, Bisht et al. (2014) reported possible gene transfer between pseudomonads and Bacillus cereus in the cold Himalayan environment.
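The AT-content comparison used in this argument is a straightforward base count. A minimal sketch is given below; the function name is illustrative, and the input shown is only the forward primer from the Methods, not the full cspA sequence.

```python
def at_content_percent(sequence):
    """Percentage of A and T bases in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    at_bases = sum(1 for base in seq if base in "AT")
    return 100.0 * at_bases / len(seq)


if __name__ == "__main__":
    # Short stand-in input; the full gene sequence would be used in practice.
    fragment = "ATGTCTAATCGCCAAACC"
    print(round(at_content_percent(fragment), 2))
```

Applying the same function to a gene and to the whole genome gives the two values being compared; a gene whose AT content is well above the genome average is one of the signals commonly taken to suggest lateral acquisition.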

Fig. 4

Phylogenetic relationships of CspA protein with related proteins from bacteria. A consensus tree following 1000 bootstrap replications is shown

Prediction and evaluation of 3D structure of cold shock protein

Because no 3D structure of CspA from Pseudomonas koreensis P2 was available in the PDB or in any other structural database, the structure was modeled by comparative modeling and by a composite approach combining threading and ab initio modeling.

For comparative modeling, Modweb, Swiss Model and Phyre2 were used, and the best models obtained from these methods were evaluated. Five models of CspA were predicted and retrieved from I-TASSER using threading and ab initio modeling. Model1 predicted by I-TASSER was found to be the best among the predicted models. It had an acceptable C-score of 1.11, which lay within the range for reliable models, i.e., −0.5 to +2.0. The C-score is a “confidence score” for estimating the quality of a computed model and, in I-TASSER, can range from −5 to 2. The 3D structure of CspA of P. koreensis P2 with the highest C-score is depicted in Fig. 5. The refined 3D model was deposited in the Protein Model Database (PMDB) with PMDB id: PM0081019.

Fig. 5

Three-dimensional structure of CspA protein and its RNA-binding sites RNP1 (blue) and RNP2 (ocean green) (color figure online)

Stereo-chemical evaluation of the backbone Ψ and Φ dihedral angles of CspA revealed that 87.5, 10.7 and 1.8% of residues fell within the most favored, additionally allowed and disallowed regions, respectively (Supplementary figure 4). The QMEAN Z-score of the model was −0.90, indicating a reliable structure, and fell within the range of scores characteristically found for proteins of similar size (Supplementary figure 5). Verify-3D checks the compatibility of the model with its own amino acid sequence. The Verify-3D scores of the model lay within the range 0.19–0.71 (Supplementary figure 6). About 97% of the residues had an average 3D–1D score ≥ 0.2, confirming a good-quality model. The ERRAT results indicated that the overall quality factor of the CspA model was 95.161%, supporting the robustness of the developed model (Supplementary figure 7). Overall, the qualitative evaluation showed that the generated model was reliable and of good quality.

Structural analysis of CspA model

The 3D structure of a protein and its surface topography provide vital information for understanding protein function. Structural details of the surface regions of a protein enable detailed studies of the relationship between protein structure and function. In CspA from P. koreensis P2, seven binding pockets were identified, with volumes ranging from 14.7 to 41.9 (Supplementary figure 8). During analysis of NABPs in the CspA model, the three largest positive patches were detected, viz. Patch 1: MET1 SER2 LYS4 MET5 GLN38 SER52 PHE53 THR54 GLU56 GLY65 ASN66 THR68 LEU70; Patch 2: PHE31 HIS33 PHE34 GLU56 GLY58 LYS60 PRO62 ALA63 ALA64; Patch 3: HIS33 SER35 GLY41 LYS43 (Supplementary figures 9–10), which predicted CspA to be a nucleic acid (NA) binding protein. The results suggest a role for the CspA protein of P. koreensis P2 as a chaperone.

In the present study, a 213 bp long cspA gene was identified in cold-adaptive Pseudomonas koreensis isolated from the Eastern Himalayas. Expression analyses of the gene indicated its role in survival at moderately low temperature. In silico analysis revealed that tiny/small and neutral amino acids, which are characteristic of psychrophilic proteomes, were predominant in the CspA protein. This protein contained two highly conserved motifs, viz. RNP1 and RNP2, involved in nucleic acid binding. Phylogenetic analyses revealed that the CspA protein of P. koreensis P2 was closer to CspA of distant subgroups of Pseudomonas, such as the P. fluorescens and P. putida subgroups, indicating a possible inter-specific gene transfer. Cold shock proteins are of significant biotechnological importance, as they can be useful for engineering crop plants to tolerate abiotic stresses, particularly cold and drought.