Introduction

The fungus Candida albicans is a commensal member of the human microbiota, present on the skin and the surfaces of the gastrointestinal tract. However, it is also an opportunistic pathogen that causes a range of superficial infections, including vaginal and oral thrush, in healthy individuals (Perlroth et al. 2007). In immunocompromised humans, the increased virulence of C. albicans causes severe bloodstream infections that disseminate to internal organs (Perlroth et al. 2007; Greig et al. 2015). Females are more prone to candidiasis: around 70% of the global female population will encounter at least one episode of vaginal candidiasis in their lifetime (Fidel Jr et al. 1999). In the USA, 35% of mortality is attributed to systemic candidiasis acquired through bloodstream infections (Edmond et al. 1999). A key aspect of C. albicans pathogenicity is its dynamic morphology, which can switch between yeast, pseudohyphal, and hyphal forms (Sudbery et al. 2004). As a commensal, C. albicans exists in the yeast form (single, oval budding cells) on mucosal surfaces and is tolerated by the host immune system (Moyes et al. 2010; Gow et al. 2012). As a pathogen, however, its morphology transforms to pseudohyphal and hyphal forms (elongated cells attached end-to-end) (Sudbery et al. 2004). Pseudohyphal cells are highly branched, ellipsoidal in form, and have constrictions at cell junctions, while hyphal cells are less branched, have parallel sides, and lack constrictions at the septa (Carlisle et al. 2009). These forms preferentially invade epithelial cells either by active penetration or by host-mediated endocytosis (Dalle et al. 2010) and are responsible for tissue invasion and damage (Greig et al. 2015). The reversible transition among all three forms is reported to be induced by several factors including serum (Taschdjian et al. 1960), body temperature (Ernst 2000), neutral pH (Buffo et al. 1985), 5% carbon dioxide (the partial pressure of CO2 in the bloodstream) (Mardon et al. 1969), N-acetyl-glucosamine (GlcNAc) (Simonneti et al. 1974), and certain hormones (Carlisle et al. 2009).

Molecular investigations have shown that regulation of the expression of hypha-specific genes (HSGs) plays an important role in hyphal morphogenesis (Lane et al. 2001a, b; Cao et al. 2006). Recently, the importance of phosphorylation and dephosphorylation events has also been highlighted in the post-translational regulation of hyphal formation, polarized growth, suppression of cell separation, and the cell cycle (Sudbery 2011). Lenardon et al. (2010) showed that phosphorylation of chitin synthase 3 (Chs3) at Ser139 by the kinase Pkc1 is required for optimal localization and function of Chs3 during hyphal morphogenesis. Similarly, Bishop et al. (2010) demonstrated that phosphorylation of a vesicle-associated protein, Sec2 (at Ser584), by cyclin-dependent kinase (Cdc28-Ccn1/Hgc1) is required for hyphal development and polarized growth. Previously, Cdc28/Hgc1 had been shown to phosphorylate a GTPase-activating protein, Rga2, thus inhibiting its localization to hyphal tips, which eventually leads to localized Cdc42 activation for hyphal extension (Zheng et al. 2007). The transcription factor Efg1 is required for the induction of HSGs (Lo et al. 1997), and phosphorylation of Efg1 (at Thr179) by Cdc28-Hgc1 results in downregulation of target genes (Wang 2016). The role of phosphorylation mediated by cyclin-dependent kinases (CDKs) in regulating hyphal development has been repeatedly underlined (Wang 2016), and in addition to CDKs, other kinases such as mitogen-activated protein kinases (MAPKs) are also suggested to be involved in the signal transduction pathways leading to expression of HSGs (Leberer et al. 2001; Moyes et al. 2010). These reports suggest the existence of a phosphoregulatory network linked to the virulence characteristics of C. albicans, which is not fully understood to date.

Phosphorylation is a ubiquitous and reversible modification of proteins that is important for the regulation of cellular processes (Seet et al. 2006). Protein kinases phosphorylate their peptide substrates by recognizing motifs that contain a few key residues surrounding the target amino acid, which can be serine, threonine, or tyrosine (Beltrao et al. 2009). Phosphoproteomics refers to the large-scale analysis of protein phosphorylation using high-throughput approaches (Johnson and Hunter 2004), and profiling of phosphoproteins advances knowledge of the intricate phosphorylation-based signaling networks that regulate cellular activities at the post-translational level (Humphrey et al. 2015). Next-generation mass spectrometry (MS)-based proteomics enables global analysis of phosphoproteins in various cell types, tissues, and organelles (Grimsrud et al. 2012; Giansanti et al. 2015). Among fungi, the phosphoproteomes of Schizosaccharomyces pombe (Wilson-Grady et al. 2008), Saccharomyces cerevisiae (Amoutzias et al. 2012), Fusarium graminearum (Rampitsch et al. 2012), Cryptococcus neoformans (Selvan et al. 2014), Aspergillus nidulans (Ramsubramaniam et al. 2014), Alternaria brassicicola (Davanture et al. 2014), Botrytis cinerea (Davanture et al. 2014), and Neurospora crassa (Xiong et al. 2014) have been characterized to date. In the case of C. albicans, Willger et al. (2015) analyzed the phosphoproteome under the hypha-inducing condition of GlcNAc; however, no comprehensive analysis has been performed under other important hypha-inducing conditions, namely serum and elevated temperature. In view of this, a comparative phosphoproteomic study of C. albicans was performed, with the aim of analyzing the phosphoprotein composition of the hypha and its changes under different hypha-inducing conditions, including elevated temperature (40 °C), serum (fetal calf serum), and GlcNAc. Further, to unravel the complexity of the phosphoproteome, the dynamics of protein phosphorylation were studied.

Among the several signaling pathways reported to regulate hyphal morphogenesis in C. albicans, the CDK and MAPK cascades have been demonstrated to play key roles in the phospho-activation of downstream transcription factors, which leads to the expression of HSGs (Leberer et al. 2001; Berman and Sudbery 2002; Kumamoto and Vinces 2005; Moyes et al. 2010). However, the involvement of other protein kinases (PKs) in the phosphoregulatory network linked to the virulence characteristics of C. albicans has not been reported. To facilitate the identification and characterization of PKs involved in the phospho-activation of genes responsible for infection-related functions, a comprehensive functional classification and global analysis of PKs have also been performed. The entire PK repertoire of C. albicans was identified using computational approaches, and the major kinases interacting with key target proteins were identified using interaction network-based functional annotation. Altogether, the present study provides a base for further functional studies of how protein kinase-target protein interactions effect the phosphorylation of target proteins, and for delineating the downstream signaling networks linked to the virulence characteristics of C. albicans.

Methods

Cell culture and growth conditions

C. albicans wild-type strain SC5314 (ATCC® MYA-2876™) was used in the present study. A single colony of the strain was grown overnight in 50 ml of yeast extract-peptone-dextrose (YPD) medium at 30 °C, from which a 1% inoculum was added to fresh YPD medium. The culture was incubated at 30 °C until it reached an OD600 of 0.4–0.6. Cells were pelleted by centrifugation for 10 min at 4000 rpm and washed in sterile double-distilled water. To induce the hyphal form, these cells were added to specific media (for elevated temperature conditions: 2% glucose in 0.67% Yeast Nitrogen Base (YNB) without amino acids, grown at 40 °C for 3 h; for serum: 10% fetal calf serum (FCS) in YPD medium, grown at 37 °C for 2 h; for GlcNAc: 2% GlcNAc in 0.67% YNB without amino acids and ammonium sulfate, grown at 37 °C for 2.5–3 h). For the untreated control, cells were grown in YPD medium until they reached an OD600 of 0.4–0.6. For the temperature control, cells were grown in 0.67% YNB (without amino acids and ammonium sulfate) supplemented with 2% glucose and incubated at 30 °C for 3 h. Cells were harvested at the hyphal growth stage, and the cell density in the different treated samples was approximately 5.5–9.5 × 10^8 cells/ml. Cells were maintained with constant shaking at 200 rpm at the temperature and for the duration stated above, and harvested by centrifugation for 10 min at 4000 rpm. The pellets were washed twice with sterile double-distilled water and pelleted again by centrifugation at 4000 rpm for 10 min. The pellets were stored at −80 °C until further use.

Wild-type cells treated with each of the filament-inducing conditions were harvested as a biological sample and then used for protein extraction. Three such biological samples for each condition were pooled for a single set of protein extraction, followed by lysate preparation, enrichment of phosphoproteins, and LC-MS analysis. The above procedure was repeated twice for MS/MS identification. We then selected a list of proteins according to the best score and the highest number of matched peptides from both result files and performed the differential analysis between control and treatment.

Preparation of lysate, extraction, precipitation, and digestion of total protein

The frozen cell pellets were ground to a fine powder in liquid nitrogen using a mortar and pestle. The ground cells were homogenized with ice-cold extraction buffer [50 mM Tris HCl (pH 7.5), 1 mM ethylene diamine tetra-acetic acid (EDTA) (pH 8.0), 100 mM sodium chloride, 1 mM dithiothreitol, 0.5 mM phenylmethane sulfonyl fluoride (PMSF), PhosSTOP (1 minitablet per 10 ml of extraction buffer; Roche, Basel, Switzerland)] and clarified by centrifugation at 13,000 rpm for 15 min, and the supernatant was collected. Proteins from three biological samples were pooled together, and the protein concentration in the supernatant was determined using a Quick Start™ Bradford Protein Assay kit (Bio Rad, Hercules, CA, USA) with bovine serum albumin as a standard. One milligram of protein was precipitated overnight with ice-cold acetone (1:4 volume) and washed with ice-cold acetone, and the pellet was collected by centrifugation at 13,000 rpm for 10 min at 4 °C. The pellets were gently dried, dissolved in 40 mM ammonium bicarbonate (pH 8.0), reduced with dithiothreitol (10 mM), alkylated with iodoacetamide (25 mM), and finally trypsin digested (Kanshin et al. 2013). After overnight digestion with trypsin at 37 °C, the digest was acidified with trifluoroacetic acid (TFA), and the peptides were desalted and lyophilized. The proteins extracted from each biological replicate were digested, and the resulting peptides were injected into the nanoLC for further offline MS/MS analysis.

Enrichment of phosphopeptides and mass spectrometric analysis

Phosphopeptide enrichment was performed separately for each sample using the Pierce™ TiO2 Phosphopeptide Enrichment and Clean-up Kit (Thermo Fisher Scientific Inc., Waltham, MA, USA), and the phosphopeptides were purified and desalted using a graphite column (Villén and Gygi 2008). The eluates were lyophilized, dissolved in 0.01% trifluoroacetic acid (TFA), and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). LC-MS/MS analysis was performed using Analyst® TF 1.6 and MultiQuant™ Software (ABSciex, Framingham MA, USA) in Information-Dependent Acquisition (IDA) mode on an Eksigent nano-LC (ultra 2D) coupled with a Triple ToF 5600 (ABSciex, Framingham MA, USA). Peptides were loaded on a ChromXP nanoLC Trap column (350 μm ID × 0.5 mm, ChromXP C18 3 μm 120 Å) and eluted on a reversed-phase ChromXP nanoLC column (75 μm ID × 15 cm, ChromXP C18 3 μm 120 Å). A 10 μL sample, premixed in loading buffer (mobile phase A: 95% water, 5% ACN, and 0.1% formic acid), was loaded on the trap column at a flow rate of 10 μL/min and washed for 40 min to remove excess salt. The peptides were then resolved on the analytical column with the same mobile phase A and elution buffer mobile phase B (95% ACN, 5% water, and 0.1% formic acid) at a flow rate of 250 nL/min. The gradient started at 5% buffer B, was held for 1 min, then increased linearly to 30% buffer B at 80 min and to a maximum of 90% buffer B over another 20 min. The gradient was held at 90% buffer B for 5 min before re-equilibration at 5% B for 15 min. Positive ion mode over m/z 350–1200 Da with a 0.25 s TOF MS accumulation time was selected for the MS/MS product ion scan. Subsequent MS and MS/MS scans were performed using the IDA advanced “rolling collision energy” setting.
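
The gradient program described above can be written as a piecewise-linear function of time. The following is a minimal sketch, not the instrument method itself: the breakpoints are read from the text, the function name percent_b is hypothetical, and the return to 5% buffer B after the 90% hold is assumed to be an instantaneous step.

```python
# Breakpoints (time in min, % buffer B) read from the gradient description.
# Assumption: the drop back to 5% B at 105 min is an instantaneous step.
GRADIENT = [(0.0, 5.0), (1.0, 5.0), (80.0, 30.0), (100.0, 90.0), (105.0, 90.0)]
REEQUILIBRATION_END = 120.0  # 15 min at 5% B after the 5 min hold at 90% B


def percent_b(t_min: float) -> float:
    """Return the % buffer B at time t_min using linear interpolation."""
    if t_min < 0 or t_min > REEQUILIBRATION_END:
        raise ValueError("time outside the 0-120 min program")
    if t_min > GRADIENT[-1][0]:
        return 5.0  # re-equilibration phase
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover 0-105 min")
```

Under these assumptions the whole run lasts 120 min, and the midpoint of the second ramp (90 min) corresponds to 60% buffer B.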

Database search, data filtering, and phosphorylation site localization

Raw data were searched using ProteinPilot™ Software v.4.5 (Applied Biosystems/MDS Sciex, Foster City, CA) in high-resolution mode against a target-decoy (reversed) version of the C. albicans proteome sequence database (http://candidagenome.org/), allowing tryptic peptides with up to two miscleavages, carbamidomethyl cysteine as a fixed modification, and phosphorylation at serine, threonine, and tyrosine as internal modifications. C. albicans proteins were identified using the Paragon algorithm against the Candida Genome Database. A 1% false discovery rate (FDR) at the global protein level was applied for protein identification. Further, prediction of phosphorylation sites was performed with NetPhos 3.1 (http://www.cbs.dtu.dk/services/NetPhos/).
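
The idea behind a target-decoy FDR threshold can be illustrated with a generic sketch. This is not the calculation performed internally by ProteinPilot; it assumes a simple list of (score, is_decoy) pairs and estimates the FDR at each score cutoff as the ratio of decoy to target hits. The function name is hypothetical.

```python
def score_threshold_at_fdr(hits, max_fdr=0.01):
    """Return the lowest score cutoff whose decoy/target ratio is <= max_fdr.

    hits: iterable of (score, is_decoy) pairs (illustrative input format).
    Returns None when no cutoff satisfies the requested FDR.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            best = score
    return best
```

Every identification scoring at or above the returned cutoff would be retained; relaxing max_fdr moves the cutoff to a lower score.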

Proteomics data availability

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (Deutsch et al. 2017) via the PRIDE (Vizcaíno et al. 2016) partner repository with the dataset identifier PXD008616.

Identification and classification of C. albicans protein kinases

Protein sequence data of C. albicans SC5314 was retrieved from Candida Genome Database (http://candidagenome.org/) and searched for protein kinases using the Hidden Markov models (HMMs) of the “typical” protein kinase clan [Pkinase (PF00069) and Pkinase_Tyr (PF07714)] by HMMER v3.0 with an E-value cutoff of < 1.0 (http://hmmer.org/). The proteins were aligned with PFAM kinase domain models to confirm the presence of kinase domains, and the results were verified using ScanProsite (http://prosite.expasy.org/) and SMART database analyses (http://smart.embl-heidelberg.de/). Classification of identified PKs was performed using InterProScan v5 (Jones et al. 2014) and further confirmed by phylogenetic approach. The amino acid sequences of PKs were imported into MEGA v7.0.21 (Kumar et al. 2016), and multiple sequence alignment was performed using ClustalW with default parameters. The alignment file was then used to generate the phylogenetic tree by neighbor-joining method with 1000 bootstrap iterations.
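
As an illustration of the E-value filtering step, the sketch below parses text in the whitespace-delimited layout that hmmsearch writes with its --tblout option (target name in the first column, full-sequence E-value in the fifth) and keeps targets below the cutoff for either kinase model. The sample lines, ORF names, and scores are invented for demonstration and are not results from this study.

```python
# Invented example in hmmsearch --tblout layout (only the leading columns shown).
SAMPLE = """\
# target name  accession  query name   accession  E-value  score  bias
orfA           -          Pkinase      PF00069    1.2e-70  235.1  0.0
orfA           -          Pkinase_Tyr  PF07714    3.4e-30  102.7  0.0
orfB           -          Pkinase      PF00069    2.5      8.1    0.1
orfC           -          Pkinase_Tyr  PF07714    0.4      11.3   0.0
"""


def kinase_candidates(tblout_text, max_evalue=1.0):
    """Return {target: lowest full-sequence E-value} for hits below the cutoff."""
    hits = {}
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue < max_evalue:
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits


candidates = kinase_candidates(SAMPLE)
```

A protein matching both Pkinase and Pkinase_Tyr is counted once, with its best E-value retained; candidates found this way would still need the domain confirmation described above.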

Functional annotation and construction of interaction network

GO annotation of identified proteins as well as protein kinases was performed using Blast2GO v4.0.7 (Gotz et al. 2008). Precisely, the protein sequences were imported into Blast2GO interface and BLASTP was performed against yeast database with the taxonomy filter, ascomycetes (taxa: 4890, Ascomycota). Further, mapping and annotation were performed, and InterPro results were merged with annotation data and GO-Slim was executed with the subset “Candida albicans.” The annotations were plotted as a bar diagram using BGI WEGO tool (http://wego.genomics.org.cn). Further, GO enrichment was performed based on Benjamini and Hochberg false discovery correction value (Q-value) at 0.05 for the proteins using BiNGO, and the networks for biological process, molecular function, and cellular component were visualized in Cytoscape v2.6 (Shannon et al. 2003). Further, physicochemical characteristics of identified proteins including molecular weight, isoelectric point (pI), and instability index were predicted using ProtParam tool of ExPASy (http://web.expasy.org/protparam/). Protein interaction network of identified phosphoproteins as well as protein kinases was generated using STRING database (http://string-db.org/) with a confidence score of at least 0.4 (Liñeiro et al. 2016) and the network was imported into Cytoscape v2.6 (Shannon et al. 2003).
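
For reference, the Benjamini and Hochberg correction used for the Q-value threshold can be sketched as follows. This is a generic implementation of the procedure, not BiNGO's code, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are monotonically non-decreasing with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

A GO term would then be reported as enriched when its q-value falls at or below the chosen threshold (0.05 in the text above).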

Construction of physical and comparative maps

The chromosomal positions of corresponding genes were retrieved from the gff file of C. albicans SC5314 genome data (http://candidagenome.org/) using in-house perl script and physical map was constructed with MapChart v2.3 (Voorrips 2002). Further, the gene sequences were BLASTN searched against the genomes of Candida dubliniensis CD36 and Candida glabrata CBS138 available at Candida Genome Database using default parameters. Gene pairs showing more than 80% homology were considered as orthologs, and reciprocal BLASTN was performed to confirm the potential orthologs, and the comparative map was visualized using Circos v0.69-4 (Krzywinski et al. 2009).
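
The reciprocal-search logic for confirming orthologs can be sketched as below. The sketch assumes that the "more than 80%" criterion is applied to the percent identity of each BLASTN hit and that the best hit per query is chosen by bit score; both are assumptions, and the gene names in the usage example are hypothetical.

```python
def best_hits(hits, min_identity=80.0):
    """Map each query to its highest-bit-score subject above the identity cutoff.

    hits: iterable of (query, subject, percent_identity, bitscore) tuples.
    """
    best = {}
    for query, subject, identity, bitscore in hits:
        if identity <= min_identity:
            continue  # "more than 80%" is read here as a strict inequality
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(forward, reverse, min_identity=80.0):
    """Return sorted (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward, min_identity)
    rev = best_hits(reverse, min_identity)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical hits: C. albicans genes (Ca*) against C. dubliniensis (Cd*).
forward = [("CaA", "CdA", 92.0, 500), ("CaA", "CdB", 85.0, 300),
           ("CaB", "CdB", 81.0, 200), ("CaC", "CdC", 75.0, 400)]
reverse = [("CdA", "CaA", 92.0, 500), ("CdB", "CaA", 85.0, 300),
           ("CdC", "CaC", 75.0, 400)]
orthologs = reciprocal_best_hits(forward, reverse)
```

In this toy input only the CaA/CdA pair survives: CaB's best hit does not point back to it, and the CaC/CdC pair falls below the identity cutoff.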

Results

To understand the protein phosphorylation events during the yeast-to-hyphal transition in C. albicans (wild-type strain SC5314), we selected three factors as inducers, namely higher temperature (40 °C), serum (fetal calf serum), and GlcNAc. Because different inducing factors probably follow different pathways to induce the hyphal state, the experiments with different inducers could not be performed at the same growth stage or time. The morphology of the cells was observed microscopically for hypha formation, which showed that all three treatment conditions induced the formation of hyphae at different time points (Fig. 1). Total protein was extracted from C. albicans hyphae grown under the above three conditions and from their respective controls, and then trypsin digested. For each sample, proteins from three biological samples were pooled together, and phosphopeptides were enriched from the trypsin-digested protein using a titanium dioxide (TiO2) microcolumn. The eluted phosphopeptides were desalted, purified, dried, and then resuspended in 1% formic acid-ACN buffer for LC-MS loading. Two technical replicates of each sample were used for the phosphopeptide enrichment and LC-MS. The Protein ANalysis THrough Evolutionary Relationships (PANTHER) was employed for protein identification through ProteinPilot™ Software v.4.5 (Applied Biosystems/MDS Sciex, Foster City, CA), which detects the cleaved peptides against every single protein. The number of peptides detected per protein ranged from 0 to 110 at a 99%–100% fit confidence threshold. Further analysis was performed with the proteins of best identity scores considering all the biological and technical replicates, and was restricted to proteins having at least two peptides detected. After the data analysis, we identified a total of 2394, 766, and 1217 phosphorylated peptides corresponding to 341, 186, and 246 proteins phosphorylated under elevated temperature-, serum-, and GlcNAc-treated conditions, respectively (< 1% false discovery rate). Similarly, 313 and 331 phosphoproteins were identified from 2010 and 2096 phosphopeptides in the temperature control and untreated control samples, respectively. Overall comparative phosphoproteome analysis revealed that 56 phosphoproteins were present in all five conditions, whereas 30, 27, 9, 16, and 28 phosphoproteins were unique to elevated temperature, GlcNAc, serum, temperature control, and overall control, respectively (Fig. 2a).

Fig. 1

Microscopic examination of hyphal morphology in Candida albicans wild-type strain (SC5314) grown in YPD medium alone (control), at elevated temperature (40 °C), and treated with serum (fetal calf serum) or N-acetyl-glucosamine (GlcNAc). The induction of hypha formation was observed at different time points for the three conditions (3 h after temperature elevation, and 2 h and 2.5 h after treatment with serum and GlcNAc, respectively). Representative photographs were taken at the time the cells were harvested at the hyphal stage

Fig. 2

a Venn diagram showing the comparative analysis of phosphoproteins identified by LC-MS/MS under different hypha-inducing as well as control conditions. Frequency distribution of phosphoproteins and phosphorylation sites under different hypha-inducing conditions namely b temperature, c serum, and d GlcNAc. Venn diagram demonstrating comparative statistics of phosphoproteins, distribution of unique phosphorylation sites sorted by the modified amino acid namely serine (pS), threonine (pT), and tyrosine (pY), and frequency distribution of phosphoproteins according to the number of identified phosphorylation sites are shown for each condition

Hyphal phosphoproteome during temperature induction

A total of 245 and 214 phosphoproteins were identified during elevated temperature (40 °C) and control sample (30 °C), respectively. Among these, 60 and 29 were uniquely phosphorylated at 40 °C and 30 °C, respectively, whereas 185 proteins were common to both the conditions (Fig. 2b; Supplementary Table S1). The phosphoproteins unique to higher temperature condition were classified according to the number of phosphorylation sites (P-sites), which showed a higher percentage of the proteins (35%) containing one P-site, followed by phosphoproteins with two P-sites (23%). However, a significant percent of proteins (18%) were found to possess more than four P-sites (Fig. 2b). Further analysis of phosphorylation sites in these 60 proteins revealed the presence of 175 unique P-sites, of which 139 (79.4%) were found in serine, 32 (18.3%) in threonine, and 4 (2.3%) in tyrosine residues (Fig. 2b). In addition, the physicochemical properties of these proteins were examined, which showed that the length and molecular weight of identified phosphoproteins ranged from 87 to 2127 amino acids, and 10.2 to 235.4 kDa, respectively, with Q5AAQ5, a putative glutamate synthase as the largest identified phosphoprotein, whereas A0A0A6PGY8, a predicted protein of mitochondrial intermembrane space is the smallest. Similarly, the isoelectric point of these phosphoproteins also varied significantly from 11.37 (Q5A6A1) to 3.9 (Q9HFQ5) (Supplementary Table S1).
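
The counts reported above follow from simple set comparisons and proportions, which can be sketched as below. The protein identifiers in the usage example are synthetic placeholders sized to match the reported totals; they are not the actual accessions, and the function names are hypothetical.

```python
def compare_conditions(treated, control):
    """Split two collections of protein IDs into unique and shared sets."""
    treated, control = set(treated), set(control)
    return {
        "treated_only": treated - control,
        "control_only": control - treated,
        "shared": treated & control,
    }


def residue_percentages(counts):
    """Convert per-residue P-site counts to percentages (one decimal place)."""
    total = sum(counts.values())
    return {res: round(100 * n / total, 1) for res, n in counts.items()}


# Synthetic IDs sized to mirror the reported 40 °C versus 30 °C comparison.
shared = {f"P{i}" for i in range(185)}
at_40c = shared | {f"T{i}" for i in range(60)}
at_30c = shared | {f"C{i}" for i in range(29)}
split = compare_conditions(at_40c, at_30c)

# P-site counts by residue in the 60 temperature-specific phosphoproteins.
shares = residue_percentages({"pS": 139, "pT": 32, "pY": 4})
```

With these inputs the 245 and 214 totals decompose into 60 and 29 condition-specific proteins plus 185 shared ones, and the 175 P-sites give 79.4% serine, 18.3% threonine, and 2.3% tyrosine, matching the figures in the text.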

Functional annotation of the 60 phosphoproteins unique to elevated temperature treatment showed their involvement in diverse biological processes and molecular functions (Fig. 3). Among the biological processes, these proteins were found to be involved predominantly in metabolic functions rather than cellular processes. Their participation in the metabolism of carboxylic acids, cellular ketones, organic acids, and other small molecules was predicted (Supplementary Fig. S1). The cellular processes in which the proteins participate were transmembrane transport and energy-coupled proton transport against the electrochemical gradient. In terms of molecular function, the phosphoproteins were found to possess binding as well as catalytic activities. Diverse catalytic activities were identified, including ligase, isomerase, oxidoreductase, and transferase activities, and in terms of binding, these proteins were predicted to bind both nucleic acids and proteins. Cellular component analysis predicted that these proteins are predominantly localized in the cytoplasm and mostly reside in protein complexes such as the pyruvate dehydrogenase complex and the fatty acid synthase complex (Supplementary Fig. S1).

Fig. 3

Gene ontology (GO) annotation and the interaction of phosphoproteins uniquely present in different hypha-inducing conditions namely, temperature, serum, and GlcNAc. Y-axis refers to the percentage of phosphoproteins identified within each condition relative to the total number of identified phosphoproteins

Phosphoproteome of serum-induced hyphae

In case of serum-induced hypha samples, 106 and 212 phosphoproteins were identified in serum-treated and control conditions, respectively. A total of 20 phosphoproteins were found unique to the serum-induced hyphal samples, and 86 were common in both serum-treated as well as control conditions (Fig. 2c). Classification of these 20 phosphoproteins based on the number of phosphorylation sites showed that 45% contained one P-site, and 35% possessed more than four P-sites. Around 10% of the phosphoproteins contained two P-sites, whereas 5% each accounted for phosphoproteins with three and four P-sites. Distribution of unique phosphorylation sites sorted by the modified amino acids among the 20 phosphoproteins revealed that predominant phosphorylation occurs in serine residues (74%), followed by threonine (23.4%) and tyrosine residues (2.6%) (Fig. 2c). Among these proteins, A0A0A3BRH2 encoding for a putative bifunctional carbamoylphosphate synthetase-aspartate transcarbamylase was the largest protein (2216 amino acids). Similarly, the smallest protein was A0A0A4CY57 (105 amino acids), which encodes for a macrophage/pseudohyphal-induced ribosomal protein (Supplementary Table S1).

Functional annotation of 20 phosphoproteins unique to serum-induced hypha showed the involvement of these proteins in diverse metabolic as well as cellular processes (Fig. 3). In case of molecular function, these proteins were predicted to be involved in molecular transducer activity, binding, and catalytic activity. The catalytic activity of these proteins was highlighted by the terms kinase, carbamoyltransferase, dehydrogenase, oxidase, and hydrolase activities (Supplementary Fig. S1). Cellular component analysis showed the localization of these proteins in several macromolecular complexes, especially in tricarboxylic acid cycle enzyme complex. Otherwise, the proteins are localized in intracellular regions of the cell (Supplementary Fig. S1).

Hyphal phosphoproteome induced by GlcNAc

In the GlcNAc-induced hypha and control samples, a total of 154 and 212 phosphoproteins were identified, respectively. Comparative analysis of the two datasets revealed the presence of 53 phosphoproteins unique to GlcNAc-induced hypha (Fig. 2d). These phosphoproteins were then categorized based on the number of phosphorylation sites, which showed that 32.7% contained more than four P-sites, while 25% had one P-site. Phosphoproteins with two, three, and four P-sites corresponded to 21.2, 11.5, and 9.6%, respectively (Fig. 2d). Phosphorylation in these proteins occurred predominantly at serine residues (69.8%), followed by threonine (29.8%) and tyrosine residues (0.4%). Further, the length and molecular weight of these phosphoproteins ranged from 57 to 1097 amino acids and from 6.6 to 125.4 kDa, respectively (Supplementary Table S1). The largest and smallest identified proteins were A0A0A6I1U4 (cytosolic leucyl tRNA synthetase) and C4YMQ1 (cytosolic small ribosomal subunit), respectively.

Gene ontology annotation of the 53 phosphoproteins unique to GlcNAc-induced hypha revealed their roles in several metabolic processes involving macromolecules, proteins, amino acids, and other bioactive compounds (Fig. 3; Supplementary Fig. S1). Interestingly, a subset of these proteins was predicted to be involved in the regulation of protein modification processes, which includes regulation of amino acid phosphorylation and dephosphorylation. A majority of these proteins constitute ribosomal subunits, and in addition, these proteins possess the molecular functions of kinase regulator activity, phosphatase regulator and inhibitor activities, and transcription regulator activity. Cellular component analysis showed the presence of these proteins in the ribonucleoprotein complex, the cAMP-dependent protein kinase complex, and the pyruvate dehydrogenase complex. Some proteins were also shown to be localized in intracellular non-membrane-bounded organelles (Supplementary Fig. S1).

To identify the proteins co-phosphorylated under all three hypha-inducing conditions, comparative analysis was performed among the 60, 20, and 53 phosphoproteins identified during higher temperature, serum, and GlcNAc treatments, respectively (Supplementary Fig. S2). The results showed that no protein was phosphorylated in common among all three conditions; however, nine proteins were common between GlcNAc and serum, and four between GlcNAc and temperature. On the other hand, no phosphorylated protein was shared between the serum- and temperature-treated conditions (Supplementary Fig. S2).

Network-based functional annotation of interacting phosphoproteins and protein kinases

Kinases phosphorylate key proteins, thereby regulating cellular and molecular processes at the post-translational level. In view of this, an attempt was made to study the total repertoire of PKs present in C. albicans. In silico mining of PKs using a Hidden Markov model-based approach identified 102 proteins belonging to the PK superfamily (Supplementary Table S2). Classification of these proteins based on domain architecture revealed that 78 (76.5%) were serine/threonine protein kinases, 9 (8.8%) were AGC-kinases, 7 (6.9%) were phosphatidylinositol 3- and 4-kinases, 4 (3.9%) were MAP kinases, 2 were choline/ethanolamine kinases, and 1 each was a tyrosine kinase and a histidine kinase. In accordance with this, phylogenetic analysis classified the PKs into six groups, where AGC-kinases constituted class I, serine/threonine protein kinases formed class II, MAPKs fell in class III, tyrosine and histidine kinases formed class IV, choline/ethanolamine kinases were in class V, and phosphatidylinositol 3- and 4-kinases constituted class VI (Fig. 4). Among these proteins, C6_01320W_PI3K05 was the largest (3821 amino acids), whereas C5_02440C_TK01 was the smallest (296 amino acids).

Fig. 4

Classification and phylogenetic relationships of Candida albicans protein kinase subfamilies. The phylogenetic tree was constructed using MEGA7 software, by NJ method with 1000 bootstrap iterations. The different subfamilies are highlighted with different colors. The gene IDs are suffixed with the notations AGC, ST, MAPK, HK, TK, CK, and PI3K denoting AGC-kinases, serine/threonine protein kinases, mitogen-activated protein kinases, histidine kinase, tyrosine kinase, choline/ethanolamine kinases, and phosphatidylinositol 3- and 4-kinases, respectively

Phosphoproteins identified in response to treatment with higher temperature, serum, and GlcNAc were combined and redundant proteins were removed. The non-redundant set of phosphoproteins along with PK data was used to construct a functional annotation-based interaction network (Fig. 5). Interaction of the phosphoproteins and protein kinases was identified by searching the STRING database, and an interaction network was constructed using Cytoscape. The interaction network revealed eight important functional categories with a total of 86 phosphoproteins and 19 kinases of C. albicans. Interestingly, the kinases were shown to interact with 32 phosphoproteins predominantly during filamentous growth (Fig. 5). The data suggests that among the kinases present in the interaction network, serine/threonine (SR) protein kinases constitute the majority (85%), and the other three were AGC-kinases (15%) (Fig. 5). This underlines the crucial role of kinases, particularly, the serine/threonine protein kinases in phosphorylating the functional proteins during hyphal formation in C. albicans. Further, 32 phosphoproteins were predicted to have a role in nucleic acid binding and seven in amino acid metabolism. Four proteins were found to function in regulating translation, and three each in ribonucleoprotein complex assembly, enzyme regulator, and transporter activities. In addition, one protein was predicted to function in ribosome biogenesis (Fig. 5).

Fig. 5

Network-based functional annotation of interacting phosphoproteins and protein kinases. Interactions between the phosphoproteins and protein kinases were identified by searching the STRING database, and the interaction network was constructed using Cytoscape. The interaction network revealed eight important functional categories with a total of 86 phosphoproteins and 19 protein kinases of C. albicans. The kinases (represented in pink text on sky blue background boxes) were shown to interact with 32 phosphoproteins (represented in black text on sky blue background boxes) predominantly during filamentous growth

Physical and comparative mapping of identified phosphoproteins and protein kinases

Genes encoding the phosphoproteins and PKs identified in the present study were physically mapped onto the haploid set of chromosomes (A) of C. albicans (Fig. 6a). The physical map showed an uneven distribution of genes across the chromosomes of C. albicans, with a maximum of 56 genes (21.5%) on chromosome 1, followed by 54 genes (20.8%) on chromosome 2. Chromosome 6 carried the fewest genes (17; 6.5%), and the average distribution of all 264 genes on the C. albicans genome amounted to 6.8 genes per Mb. Comparative mapping of these genes between the genomes of C. albicans, C. dubliniensis, and C. glabrata showed that the largest number of orthologous gene-pairs occurs between C. albicans and C. dubliniensis (217 genes; 82%), whereas only 36 genes (13.6%) showed homology with C. glabrata (Fig. 6b; Supplementary Tables S3 and S4). Interestingly, the synteny map between C. albicans and C. dubliniensis showed that genes on a given C. albicans chromosome were orthologous to genes on the same chromosome of C. dubliniensis (Supplementary Table S3). A maximum of 50 genes from C. albicans chromosome 2 (23%) showed synteny with C. dubliniensis chromosome 2, whereas a minimum of 14 genes (6.5%) on C. albicans chromosome 7 showed orthology with C. dubliniensis chromosome 7. In the C. albicans-C. glabrata comparison, genes from chromosome 1 of C. albicans (12 genes) predominated among those showing homology with C. glabrata chromosomes (Supplementary Table S4). In the case of C. albicans-C. dubliniensis orthology, genes distributed over all the chromosomes of C. albicans showed a corresponding orthologous gene-pair with C. dubliniensis; however, in the C. albicans-C. glabrata comparison, the genes from chromosome 6 did not show any orthologous relationship with the C. glabrata genome. Similarly, only one gene from chromosome 5 and two genes from chromosome 7 showed orthology with C. glabrata.
Interestingly, only two protein kinases, namely C1_10220C_AGC03 and C3_04470W_AGC08, showed a syntenic relationship with the C. glabrata genome, whereas 67 kinases showed an orthologous relationship with C. dubliniensis. The higher degree of gene-based orthology between C. albicans and C. dubliniensis than between C. albicans and C. glabrata could be attributed to the recent divergence of C. albicans and C. dubliniensis from a common ancestor (Diezmann et al. 2004).
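The genome-wide orthology percentages quoted above follow from the reported counts by simple arithmetic. The following minimal sketch reproduces them; the function and variable names are hypothetical, and only the counts themselves come from the text.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


TOTAL_MAPPED_GENES = 264  # phosphoprotein and PK genes mapped in this study

# Number of mapped genes with an orthologous gene-pair in each species.
ortholog_counts = {"C. dubliniensis": 217, "C. glabrata": 36}

ortholog_percent = {
    species: round(percent(n, TOTAL_MAPPED_GENES), 1)
    for species, n in ortholog_counts.items()
}
```

Rounded to one decimal place, this gives 82.2% for C. dubliniensis and 13.6% for C. glabrata.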

Fig. 6

Physical and comparative maps of identified phosphoproteins and protein kinases. a Distribution of genes encoding the phosphoproteins and protein kinases identified in the present study on the haploid set of chromosomes (A genome) of Candida albicans. Vertical bars represent the chromosomes, and the values on the left of each bar denote the physical position (Mb) of each gene (IDs given on the right). The font color of the IDs corresponds to the dataset, where black denotes protein kinases, and red, green, and pink denote the phosphoproteins identified in treatment with temperature, serum, and GlcNAc, respectively. b Comparative mapping of identified phosphoproteins and protein kinases between the genomes of C. albicans (Ca), C. dubliniensis (Cd), and C. glabrata (Cg). The blocks represent chromosomes (haploid set) of the respective genomes and the lines indicate the position of orthologous gene-pairs

Discussion

The human fungal pathogen C. albicans switches from yeast to hyphal growth when exposed to hypha-inducing conditions such as elevated temperature and treatment with serum or GlcNAc, and this morphological switching is crucial for penetrating host tissues and evading host phagocytic destruction (Gow et al. 2012). Although several genetic determinants of filamentous growth of C. albicans have been identified and characterized (Leberer et al. 2001; Berman and Sudbery 2002; Kumamoto and Vinces 2005; Carlisle et al. 2009; Moyes et al. 2010), an understanding of the phosphoproteome dynamics during hyphal morphogenesis is still lacking. The role of a gene is ultimately determined by its protein product, and the function of the protein is in turn regulated by several post-translational modifications. One among them is protein phosphorylation, which has been shown to play roles in the yeast-hypha growth switch and in determining pathogenicity (Sinha et al. 2007; Caballero-Lima and Sudbery 2014). However, global analysis of the phosphoproteome in C. albicans hyphae induced by external stimuli has not been well reported. Previously, the phosphoproteome of cells grown with GlcNAc was studied, which identified several proteins that were phosphorylated during hyphal induction (Willger et al. 2015). Considering that the dynamics of phosphorylation/dephosphorylation events during hyphal morphogenesis may not be uniform under different hypha-inducing conditions, the present study was performed to explore the phosphoproteome under three hypha-inducing conditions, namely higher temperature, serum, and GlcNAc. Specifically, the cells were exposed to these conditions, followed by protein extraction, phosphoprotein enrichment, and LC-MS/MS analysis.

A total of 245, 106, and 154 phosphoproteins were identified in elevated temperature-, serum-, and GlcNAc-treated samples, respectively. In the control samples for temperature and serum/GlcNAc, 214 and 212 phosphoproteins were identified, respectively. Comparative analysis of the phosphoproteome data revealed several unique as well as common phosphoproteins. A total of 30, 27, and 9 phosphoproteins were unique to temperature, GlcNAc, and serum treatments, respectively, whereas 16 and 28 phosphoproteins were unique to the control samples for temperature and serum/GlcNAc treatments, respectively. In addition, 56 phosphoproteins were common to all the treated and control samples, suggesting that phosphorylation of these proteins could serve essential cellular functions in the survival of C. albicans. Since the present study focused on the phosphoproteins unique to the hypha-inducing conditions in comparison to the respective control, the corresponding datasets were compared using computational methods.

In the case of temperature treatment, the number of phosphoproteins increased from the control (214) to the treated sample (245), with 185 common to both conditions and 60 and 29 unique to the treated and control samples, respectively. This increase in the number of phosphoproteins at elevated temperature indicates the phosphorylation of additional proteins compared to control conditions. It also suggests that the 29 proteins identified as phosphorylated only in the temperature control might be dephosphorylated at higher temperature. However, the opposite trend was observed for serum and GlcNAc treatments, since the proteins specifically phosphorylated during these treatments (20 and 53, respectively) were fewer than those unique to the control. The phosphoproteome of serum-induced hyphae comprised 20 proteins unique to the serum-treated sample, 86 common to both serum-treated and control conditions, and 126 proteins unique to the control sample. In the case of the hyphal phosphoproteome induced by GlcNAc, 53 and 111 proteins were unique to GlcNAc treatment and control, respectively, with 101 proteins common to both conditions. This relatively lower number of phosphoproteins in treated samples than in control conditions suggests the occurrence of dephosphorylation events during hyphal morphogenesis induced with serum and GlcNAc. Further, comparative analysis of these three datasets showed that no protein was commonly phosphorylated under all three conditions, which suggests the presence of phosphoregulatory machinery distinct to hypha induction by different treatments.
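Each treated-versus-control comparison described above is a three-way partition of two protein sets. The following minimal sketch illustrates the partition with set operations and checks that the reported counts are internally consistent; the function names are hypothetical, and the sketch is not the computational pipeline used in the study.

```python
def compare(treated: set, control: set) -> dict:
    """Partition two phosphoprotein sets into treated-only, common, and control-only."""
    return {
        "treated_only": treated - control,
        "common": treated & control,
        "control_only": control - treated,
    }


# Reported counts per condition:
# (treated total, control total, common, treated-only, control-only)
REPORTED = {
    "temperature": (245, 214, 185, 60, 29),
    "serum": (106, 212, 86, 20, 126),
    "GlcNAc": (154, 212, 101, 53, 111),
}


def counts_consistent(treated, control, common, treated_only, control_only) -> bool:
    """Check that the common and unique counts add up to each sample total."""
    return (common + treated_only == treated
            and common + control_only == control)


all_consistent = all(counts_consistent(*row) for row in REPORTED.values())
```

For example, compare({"A", "B", "C"}, {"B", "C", "D"}) places A in the treated-only group, B and C in the common group, and D in the control-only group. For all three conditions, the common and unique counts sum to the reported sample totals.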

Among the proteins that showed hypha-specific phosphorylation in the present study, a few are well known for their roles in hyphal morphogenesis in C. albicans. These include the heat shock protein Ssb1, the pyruvate kinase Cdc19, the eukaryotic translation initiation factor eIF4G (TIF4631), and the 40S ribosomal protein S4 (Rps4A). Ssb1 was found to be upregulated after serum treatment (Pitarch et al. 2001). Overexpression analysis also demonstrated hyperfilamentous colony morphology in C. albicans cultured in hypha-inducing serum media (Pitarch et al. 2001). That study, together with our observation that Ssb1 is phosphorylated when cells are treated with serum, suggests a role for Ssb1 in hyphal development. Further studies on Ssb1 phosphorylation are needed to confirm this role, as the essentiality of the Ssb1 gene limits its knockout (Skrzypek et al. 2017). Another protein, the glycolytic enzyme pyruvate kinase Cdc19, is also associated with filamentation phenotypes, as its null mutant has defective filamentous growth (Uhl et al. 2003). Cdc19 phosphorylation has been well studied and linked to its activity in response to the nutrient environment (Xu et al. 2012). In our study, the presence of the phosphorylated form of Cdc19 under hypha-inducing conditions suggests that this covalent modification of Cdc19 may have a role in hyphal morphogenesis. Similarly, it has been shown that the eIF4G protein level increases during hyphal transition (Nobile et al. 2012) and that its overexpression causes hyperfilamentation (Lee et al. 2005). Furthermore, an Rps4A mutant was found to be defective in filamentous growth and sensitive to osmotic stress (Lu et al. 2015). These reports, together with our observations, suggest that phosphorylation of eIF4G and Rps4A under hypha-inducing conditions may be associated with hyphal morphogenesis.

The 53 phosphoproteins unique to the GlcNAc treatment were compared with the phosphoproteins identified by Willger et al. (2015), which showed that 24 were common to both datasets; the remaining 29 phosphoproteins identified in the present study have not been previously reported. Analysis of phosphorylation sites showed that proteins with one P-site were predominant in elevated temperature- and serum-treated samples, whereas proteins with more than four P-sites were most frequent in the GlcNAc-treated sample. With respect to the distribution of unique P-sites by modified amino acid, P-sites were predominantly present on serine, followed by threonine and tyrosine. The extent of tyrosine phosphorylation is low in all the conditions and in the comparative dataset, which could plausibly be explained by the presence of only one tyrosine kinase in C. albicans (Zhao et al. 2014). A similar trend of a higher percentage of P-sites on serine, followed by threonine and least on tyrosine, was reported in several organisms including C. neoformans (Selvan et al. 2014), Tetrahymena thermophila (Tian et al. 2014), B. cinerea (Liñeiro et al. 2016), humans (Song et al. 2012), and C. albicans (Willger et al. 2015). In addition to serine, threonine, and tyrosine, phosphorylation can occur on aspartate, lysine, arginine, and histidine residues; however, detection of these sites is comparatively difficult owing to the lability of these phosphorylated residues under acidic conditions (Hohenester et al. 2010; Willger et al. 2015). Altogether, the differences in the phosphoproteins as well as in their P-site number and frequency distribution exemplify the dynamic nature of condition-specific phosphorylation events, which may influence the pathogenicity of C. albicans. In this context, the present findings could serve as preliminary data for further functional analyses towards gaining an in-depth understanding of the physiological and biochemical characteristics of hyphal morphogenesis and fungal pathogenicity.

Kinases phosphorylate key proteins, thereby regulating cellular and molecular processes at the post-translational level. In the context of C. albicans hyphal growth, CDK and MAPK cascades are known to be involved in the phospho-activation of downstream transcription factors, which leads to the expression of HSGs (Leberer et al. 2001; Berman and Sudbery 2002; Kumamoto and Vinces 2005; Moyes et al. 2010). However, several other kinases may also play a role in regulating hyphal formation by tuning the protein machinery. In view of this, an attempt was made to identify the total repertoire of protein kinases present in C. albicans and to investigate their interaction with the identified phosphoproteins. In silico analysis of publicly available C. albicans proteome data showed the presence of 103 proteins belonging to the protein kinase superfamily. Phylogenetic analysis categorized the protein kinases into six classes (I to VI), and differentiation of these proteins based on domain architecture conformed well with the phylogenetic classification. Interaction network-based functional annotation of these protein kinases and the identified phosphoproteins revealed interesting observations. Firstly, the network demonstrated the interaction of kinases with phosphoproteins predominantly during filamentous growth. Secondly, several phosphoproteins were found to possess nucleic acid binding ability, and the members of this group may be involved in the regulation of gene expression during hyphal morphogenesis. A few proteins in the interaction network were annotated with roles in ribosome biogenesis, ribonucleoprotein complex assembly, and translation control, and these might regulate the translation of transcripts to synthesize functional proteins. Overall, the interaction network will enable the selection of candidate protein kinase-phosphoprotein interacting partners for further functional characterization to delineate the molecular phenomena of hyphal morphogenesis.

The genes encoding the identified phosphoproteins and protein kinases were mapped onto the haploid set of the C. albicans genome, and a physical map was constructed, which showed an uneven distribution of these genes across the chromosomes. Further, the physical map also revealed clustering of protein kinases; for example, serine/threonine kinases form a distinct cluster on chromosome 2A. This may be due to tandem duplication that occurred in the genome during the course of divergence; however, further analysis is required to confirm the occurrence of duplication events in the C. albicans genome leading to divergence of the protein kinase gene family. Comparative mapping of identified phosphoproteins and protein kinases between the genomes of C. albicans, C. dubliniensis, and C. glabrata showed a higher percentage of gene-based orthology between C. albicans and C. dubliniensis (82%) than between C. albicans and C. glabrata (13.6%). The lower degree of synteny in the latter could be attributed to the fact that C. albicans and C. dubliniensis both diverged from Candida tropicalis 20 million years ago (Diezmann et al. 2004). Further, the ratios of non-synonymous (Ka) to synonymous (Ks) substitution rates (Ka/Ks) were computed for the orthologous gene-pairs, which showed that these genes underwent purifying (negative) selection during the course of evolution (Ka/Ks < 1) (data not shown). Similar to C. albicans, the pathogenicity of C. dubliniensis has also been highlighted recently, and this species shares diagnostic characteristics with C. albicans, including hypha formation and morphological switching (Gutiérrez et al. 2002). Therefore, the orthologous genes identified in C. dubliniensis could serve as potential targets for further study in this species.
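The interpretation of the Ka/Ks ratio follows a standard convention: a ratio below 1 indicates purifying selection, a ratio of about 1 indicates neutral evolution, and a ratio above 1 indicates positive selection. The following minimal sketch encodes that convention; the function name and the example values are hypothetical, since the Ka and Ks estimates for the orthologous gene-pairs are not shown in the study.

```python
import math


def selection_regime(ka: float, ks: float, rel_tol: float = 1e-9) -> str:
    """Classify a gene-pair by its Ka/Ks ratio.

    ka: non-synonymous substitution rate
    ks: synonymous substitution rate (must be positive)
    """
    if ks <= 0:
        raise ValueError("Ks must be positive for Ka/Ks to be defined")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, rel_tol=rel_tol):
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical example values, not data from the study.
example = selection_regime(ka=0.05, ks=0.5)  # ratio 0.1, below 1
```

Under this convention, a gene-pair with Ka/Ks < 1, as reported for the orthologous pairs here, is classified as being under purifying selection.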

Taken together, our data provide insights into the phosphoregulatory network linked to the virulence characteristics of C. albicans. We analyzed the dynamic changes in the phosphoproteome after hyphal induction with elevated temperature, serum, and GlcNAc treatments and identified a number of phosphoproteins unique to higher temperature-, serum-, and GlcNAc-treated conditions. This comprehensive phosphoproteomics study of C. albicans morphogenesis will serve as a solid base for further functional studies of protein kinase-target protein interactions that bring about phosphorylation of target proteins, and for delineating the downstream signaling networks linked to the virulence characteristics of C. albicans.