Introduction

Industrial development needed to sustain a continually expanding human population has been accompanied by the release of a variety of xenobiotic compounds into the environment (Ogawa et al. 2003; Silbergeld and Patrick 2005; Symons and Bruce 2006; Cohen 2007). Due to their potential toxicity to wildlife and humans, several pollutants have now been banned worldwide from production and use (Donlon et al. 1995; Bleeker et al. 2002; Bojes and Pope 2007). Increased public awareness and concern have also prompted researchers to address ways to detoxify or remove these organic compounds from the natural environment. Remedial strategies require reliable methods to identify and monitor contamination, as well as effective procedures to attenuate or eliminate the pollutant. Conventional physico-chemical remediation strategies employed at contaminated sites, such as landfilling, recycling, pyrolysis, excavation and incineration, are inefficient and costly, and may also lead to the formation of toxic intermediates (Dua et al. 2002). Therefore, during the past two decades, bioremediation, a safer, more eco-friendly and more economical alternative, has become a viable and promising means of restoring contaminated sites (Shannon and Unterman 1993; Spain 1994; Maymo-Gatell and Schraa 1997; Dua et al. 2002).

Almost all living beings are gifted with some minimal basal level of detoxification abilities such as mineralization, transformation and/or immobilization of pollutants. However, microorganisms particularly bacteria have been well studied and used for carrying out the detoxification activities (Watanabe and Baker 2000; Rieger et al. 2002; Zhong and Zhou 2002). Bacteria harbor enormous metabolic diversity allowing them to utilize the complex chemicals as energy sources (Rieger et al. 2002; Diaz 2004; Nojiri and Tsuda 2005). Further, their ability to undergo rapid genetic evolution also enhances their chance to acquire new metabolic potential for degradation of the recently introduced xenobiotic chemicals. The history of biodegradation could broadly be classified into three main eras, i.e. pre-genomic, genomic and post-genomic era. Initial studies largely focused on the isolation and characterization of bacterial strains or a bacterial consortium with metabolizing/transforming properties using genomic tools. Initially, the genome sequencing facility was restricted to the culturable isolates. However, the limitations of culture-dependent methods have resulted in increased use of culture independent methods to determine the ecological fate of microorganisms under the natural environments (Ellis et al. 2003; Ono et al. 2007). Molecular studies mainly involved 16S rRNA (ribosomal ribonucleic acid) based phylogenetic characterization of the microbial communities/pure cultures associated with the biodegradation process followed by elucidation of the degradation pathway (Alexander 1984; Haggblom 1990; Stolz and Knackmuss 1993; Spain 1994; Timmis and Pieper 1999; Arai et al. 2000; Rieger et al. 2002; Solyanikova and Golovleva 2004; Symons and Bruce 2006). However, the knowledge of phylogeny was not sufficient to predict the underlying physiological/degradation properties of the both related/un-related organisms. 
Further, a few studies also focused on understanding the finer details of the biodegradation process, such as transcriptional regulation, kinetic behavior and structure–function relationships of the enzymes involved (Diaz and Prieto 2000; Tropel and van der Meer 2004; de Melo Plese et al. 2005; Svedruzić et al. 2005). With the advent of bioremediation technologies, the importance of assessing the impact of the bioremediation process on the indigenous microbial community structure, and of monitoring the survival and activity of the augmented microorganism under natural conditions, has been realized (Wenderoth et al. 2003; Katsivela et al. 2005; Paul et al. 2006). An ideal remediation technology should not have any adverse effect on the total indigenous microbial community structure of the site under intervention (Iwamoto and Nasu 2001; Mills et al. 2003), and genome-based approaches have made a great contribution towards meeting these objectives. The majority of these methods are based on sequencing/fingerprinting analysis of the phylogenetically relevant 16S rRNA gene amplified from the total community DNA (Hur and Chun 2004; McBurney et al. 2006). However, amplification of a well-characterized partial functional gene using primers specific to the augmented strain(s), or of a highly conserved region of a functional gene(s), could also be used to monitor the survival and activity of the degrading exogenous/indigenous microorganism(s). These techniques mainly include DNA hybridization, quantitative polymerase chain reaction (PCR), real-time PCR and fluorescent in situ hybridization (FISH). Among these, FISH with targeted oligonucleotide probes has emerged as an invaluable molecular tool for assessing the environmental survival of a degradative strain during the bioremediation process, because it can monitor the target microorganism within an environmental sample without the need for culturing or DNA isolation (Ficker et al. 1999; Amann et al. 2001; Aulenta et al. 2004; Caracciolo et al. 2005). Although several 16S rRNA/functional gene amplification-dependent environmental studies have been carried out to determine the bacterial diversity of environmental niches, all of them suffer from the drawback of being expensive and laborious (González et al. 2000). Therefore, they are of limited value for comparing complex communities that undergo dynamic spatial and temporal changes. This limitation of 16S rRNA gene library sequencing has resulted in continued attempts to develop high-throughput fingerprinting methods for quick and reliable determination of community structure (Breen et al. 1995; Busse et al. 1996; Yang et al. 2001; Collins et al. 2006).

Some of the most common genome-based fingerprinting methods implemented for characterization of microbial community structure are: (i) serial analysis of ribosomal sequence tags (SARST); (ii) oligonucleotide fingerprinting of rRNA gene (OFRG); (iii) rep PCR-genomic fingerprinting; (iv) amplified rDNA restriction analysis (ARDRA); (v) terminal restriction fragment length polymorphism (T-RFLP); (vi) denaturing/thermal gradient gel electrophoresis (D/TGGE); (vii) single strand conformation polymorphism (SSCP); and (viii) automated ribosomal intergenic spacer analysis (ARISA) (Marsh 1999; Płaza et al. 2001; Kitts 2001; Anderson and Cairney 2004; Li et al. 2006). Genome-based technologies have also been exploited to generate genome-dependent expression data, which have enabled scientists and bioremediation practitioners to better understand how microbes respond to their environment and to develop specific, tailored methods to enhance in situ contaminant transformation. Such methods are rooted in quantitative data on the metabolism (and growth) of subsurface microorganisms rather than in assumptions about a relationship between growth and contaminant transformation activity (Rodriguez-Valera 2004; Morris 2006; Deutschbauer et al. 2006). The application of various genome-based techniques to diversity studies has already been reviewed in detail (Płaza et al. 2001; Collins et al. 2006; Li et al. 2006).

The environmental importance of genomics was predicted on the paradigm that it may be possible to tailor the metabolic efficiency of a microorganism for a particular environmental application, thereby, maximizing its benefits. Genome sequencing has made the information about a large number of catabolic genes as well as regulatory genetic elements readily available (Heidelberg et al. 2002; Rabus 2005) and opened new opportunities for developing of organisms with improved metabolic capabilities. Although, these studies have provided insights that are of great significance for the development of biodegradation processes, our knowledge of the microbial degradation pathway is still far from complete. The percentage of biodegradation studies that have proved their efficacy on the bench scale under ideal laboratory conditions and could successfully be extrapolated to the actual field conditions is still quite low. There are a few reports of application of the biodegradation technologies to the contaminated sites suggesting scope for further improvement/understanding of the processes. This could mainly be due to challenges of substrate, environmental variability, biases of culturing methods, limited biodegradative potential and viability of naturally occurring microorganisms (Seshadri et al. 2005; Thompson et al. 2005; Viñas et al. 2005).

For bioremediation to be successful under actual field conditions, additional issues regulating the degradation process need to be addressed. Ideally, bioremediation strategies should be designed based on knowledge of the microorganisms present in the polluted environments, their metabolic abilities, and their response to changes in environmental conditions. Beyond the high-throughput methods for DNA sequencing (genotype) mentioned above, it has now become possible to analyze the expression of genes under various environmental conditions (transcriptomics), identify the proteins that are expressed (proteomics) and characterize low molecular weight secondary metabolite expression (metabolomics) (Fig. 1). Biodegradation network models have been created that provide the basis both for predicting the abilities of existing or not-yet-synthesized chemicals to undergo biodegradation and for quantifying the evolutionary rate for their elimination in the future (Pazos et al. 2003; Diaz 2004). Although this information is far from complete, the application of genome-based ‘omic’ techniques to environmental samples should help to develop models that predict microbial activity under various biodegradation conditions. The present article, therefore, addresses one of the functional approaches, i.e. proteomics, and its applicability in biodegradation. Examples in which proteomics has been successfully implemented are also discussed.

Fig. 1 The hierarchy of “Omics approaches”

Proteomics

The proteome is the entire set of proteins encoded by the genome and proteomics is the discipline which studies the global set of proteins, their expression, function and structure. Knowledge of proteins is thus crucial in understanding the mechanism of any biological process. Although advances in genome sequencing have allowed the identification of a number of open reading frames (ORFs), but this information is far from complete. On an average, about 40% of the gene sequences detected in the genomic databases code for proteins of hypothetical or unknown function. Further, the number of genes present in a genome is less than the array of proteins found in the cell (Anderson 2006). Besides compositional complexity and concentration range, protein dynamics, i.e. the protein expression changes over time, add to the complexity of a proteome (Weston and Hood 2004). Bacterial genomes code for about 600–6,000 genes but only a part of the genome, usually 50–80% are expressed under specific life circumstances depending on the environmental stimuli that reach the cell (Razin et al. 1998; Anderson 2006). The low complexity makes bacteria a reasonable model system to address crucial and elementary issues of life processes by using proteomics approaches (Hecker et al. 2008). However, proteins act as aggregates in cellular machineries, they are targeted to their final destinations inside or outside the cell and they can be reversibly or even irreversibly modified, damaged, repaired and in hopeless cases even degraded (Hecker et al. 2008; Loh and Caoa 2008). This indicates that there is much work to be done to further increase our understanding of microbial metabolism, its regulation and capabilities.

Proteomics addresses three categories of biological interest: protein expression, protein structure and protein function. Proteomic technologies generally involve three steps, i.e. separation/fractionation, quantitation and identification of the proteins in a particular biological sample(s) (Table 1, Fig. 2). Sample fractionation can be performed according to the sample size/location, whereas separation can be performed at either the protein level or the peptide level. The latter involves the use of separation methods such as one- or two-dimensional polyacrylamide gel electrophoresis (1DE or 2DE) or one- or two-dimensional chromatographic separation.

Table 1 Technologies used in proteomics

Fig. 2 Steps involved in proteome analysis of bacterial isolates and total microbial community of soil perturbed with chemicals

The most popular method for protein separation and comparison is two-dimensional gel electrophoresis (2DE). Despite many advantages of 2DE over the other separation techniques viz., relatively inexpensive equipment and low sample (in μg) requirements, application of total cellular lysate directly on the gel, fractionation with high resolution and sensitivity with minimal loss of hydrophobic protein species etc., it still remains an entity of debate concerning its value in proteomics research. Detractors argue it is cumbersome, time-consuming, and lacking in automation, whereas to others it remains an efficient and well established way to separate complex protein mixtures. Also, the undisputed need for replicate studies (at least three replicate studies per sample) for the assessment of reproducibility of protein pattern and relative quantity adds to the time necessary for the assessment and interpretation of data (Kalia and Gupta 2005). Multidimensional liquid chromatography such as two-dimensional liquid-phase fractionation (PF2D) was recently used as a new method for the comparative analysis of intact proteins or protein complexes (Yan et al. 2003). This method is believed to be more reproducible than 2DE since nonporous reverse-phase HPLC provides the reproducible results for separating and isolating protein mixture in well-established mobile liquid phase (Wall et al. 1999), and it can be used to elucidate various biodegradation pathways.

The principal approaches toward global protein identification fall into two main categories: (i) the “gel based approach” (2DE–MS/MS), which involves the use of 2DE alone or 2DE in combination with mass spectrometry (MS) (Yonekura-Sakakibara et al. 2008), and (ii) the “gel free approach”, which involves the use of multidimensional protein identification technology (MudPIT) (Wolters et al. 2001). MS has occupied a central position in the methodologies developed for proteome analysis. It directly gives information on the mass of a particular peptide and on its structural modifications, such as glycosylation, phosphorylation and other post-translational modifications (Kalia and Gupta 2005). It also provides the required throughput, certainty of identification and applicability, thus serving as the method of choice to connect genome and proteome (Aebersold and Mann 2003). The sensitivity of protein identification through MS has increased by several orders of magnitude, and the time required for the process has decreased from many hours to a few minutes. Currently, two types of ionization source, i.e. matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI), and four different types of mass analyzer, i.e. ion trap (IT), time of flight (TOF), quadrupole (q) and Fourier-transform ion-cyclotron resonance (FT-MS), are available (Aebersold and Mann 2003; Kalia and Gupta 2005). Technological advancements have led to the development of more accurate and user-friendly MS techniques such as tandem MS (MS/MS), triple quadrupole MS, ESI-qTOF, MALDI-qTOF, ESI-qIT-MS, surface-enhanced laser desorption and ionization-time of flight (SELDI–TOF) and liquid chromatography–mass spectrometry (LC–MS/MS) (Steen and Mann 2001; Aebersold and Mann 2003; Hopfgartner et al. 2004). The principles and instrumentation platforms of mass spectrometry applied to proteomics have already been reviewed in detail and are beyond the scope of this review (Mann et al. 2001; Cristoni and Bernardi 2003; Nesvizhskii and Aebersold 2004; Yates 2004). The selection of the appropriate protein identification method depends on whether or not the genome of the target bacterium has been sequenced. If the genome of the target bacterium is not available, the tools for protein identification are limited to de novo sequencing with tandem mass spectrometry (MS/MS) analysis or Edman sequencing. However, both methods are expensive and time-consuming in comparison to other mass spectrometry technologies (Kim et al. 2007).
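
As a rough illustration of how MS links a peptide sequence to an observed signal, the sketch below computes the monoisotopic mass of an unmodified peptide and the m/z of its protonated ions. The function names are hypothetical and introduced only for illustration; the residue masses are the standard monoisotopic values, and modifications such as phosphorylation or glycosylation are deliberately ignored.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one H2O accounts for the free N- and C-termini
PROTON = 1.00728   # added once per positive charge


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of a neutral, unmodified peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mz(sequence: str, charge: int = 1) -> float:
    """m/z of the peptide carrying `charge` protons (illustrative only)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge
```

For the commonly used test sequence PEPTIDE, `peptide_mass("PEPTIDE")` gives about 799.36 Da. Note that leucine and isoleucine have identical masses, which is one reason why mass alone does not always determine a sequence.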

More recently, gel-free approaches such as the LC-based proteomics method (shotgun proteomics) have been developed as an alternative to gel-based proteomics. The LC-based proteomics method separates the trypsinized peptide mixtures of an entire proteome using ion-exchange and reverse-phase columns, and the separated peptides are directly subjected to MS/MS analysis (Washburn et al. 2001). Shotgun proteomics is useful for the global analysis of biodegrading bacteria and is now routinely used to accurately identify 500–1,000 proteins, which include both abundant proteins, such as metabolic enzymes, and low-copy proteins, such as regulatory proteins. LC-based proteomics therefore has a wider dynamic range than 2DE. It is also a particularly useful technology for the identification of membrane proteins (Aebersold and Mann 2003). A disadvantage of LC-based proteomics is that it is difficult to quantitatively analyze tryptic peptides due to the ionization suppression effect (Kim et al. 2007). LC–MS based technology allows rapid identification of even site-specific post-translational modifications (PTMs) in proteins of interest and systematic quantification of the proteome at a semi-quantitative level. Common examples include the use of MudPIT (Wolters et al. 2001), which involves early digestion of the protein mixture and multi-dimensional chromatographic separation of the peptides (e.g. strong cation exchange–reverse phase chromatography; SCX–RP). However, chemical tagging of proteins or peptides with an isotope-coded affinity tag (ICAT) in the LC–MS proteomics system facilitates a more accurate quantitative analysis (Gygi et al. 1999). MudPIT can detect and identify low-abundance proteins such as transcription factors, protein kinases and integral membrane proteins that are not often observed in gel-based platforms (Gygi et al. 1999; McDonald and Yates 2003; Washburn et al. 2003). 
Recently, the Isobaric Tagging for Relative and Absolute Quantitation (iTRAQ™) reagent technique was used to label tyrosine residues, which resulted in the successful quantitation of 292 heat shock proteins in Bacillus subtilis (Wolf et al. 2006). MudPIT has improved proteome research because it can directly interface with mass spectrometry systems and reduce the analysis time (Aebersold and Mann 2003). Further, the combination of MudPIT with ICAT reduces the complexity of the peptide sample that must be analyzed and allows for better representation of low-abundance proteins and quantitative comparisons (Somiari et al. 2005). In addition, the use of highly sensitive fluorescent stains with a wide dynamic range has enabled precise quantification of proteins; this technology is being effectively employed in Differential Gel Electrophoresis (DIGE) (Wu et al. 2004; Marouga et al. 2005). This method involves tagging the proteins of two sample populations with different fluorescent dyes so that only one gel is needed to separate the proteins and quantify differences between the two sample populations. Of the two strategies, the gel-based one offers the advantage of visualization of proteins and, to some extent, their modifications, and therefore preserves the protein context. By contrast, early digestion strategies generate peptides upstream in the analysis workflow, because peptides are more easily amenable to separation and analysis and behave more uniformly than proteins. While chemical tagging methods such as ICAT, cICAT and iTRAQ are in vitro labeling techniques required for the subsequent enrichment, purification, and MS and/or MS/MS analysis of proteins, the metabolic labeling method stable isotope labeling by amino acids in cell culture (SILAC) is an in vivo labeling technique (Ong et al. 2002). Thus, SILAC can be used to bypass the chemical reactions required in the chemical tagging methods, resulting in improved reproducibility and higher confidence. 
However, the application of SILAC to bacteria requires a specific amino acid auxotrophic system for efficient isotopic amino acid incorporation (Kim et al. 2007).
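
The shotgun workflow described above starts from trypsinized peptides, so a small in silico digestion may help make the idea concrete. The sketch below is an assumption-laden illustration, not the procedure of any cited study: it applies the widely used simplified rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P), and it ignores missed cleavages. The function name and the example sequence are hypothetical.

```python
def tryptic_digest(protein: str, min_length: int = 1) -> list[str]:
    """Split a protein sequence into tryptic peptides (simplified rule).

    Assumption: cleavage occurs after K or R, except when followed by P.
    Missed cleavages and modified residues are not modelled.
    """
    protein = protein.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        next_is_proline = (not is_last) and protein[i + 1] == "P"
        if residue in "KR" and not next_is_proline:
            peptides.append(protein[start:i + 1])
            start = i + 1
    # Keep whatever remains after the final cleavage site.
    if start < len(protein):
        peptides.append(protein[start:])
    # Very short peptides are often uninformative, so allow filtering.
    return [p for p in peptides if len(p) >= min_length]


# Hypothetical sequence chosen only to exercise the rule.
print(tryptic_digest("MKAAGRPLLKEW"))
```

In this made-up sequence the arginine is followed by proline and is therefore not treated as a cleavage site, which is why the middle peptide spans it. Real search engines also allow for missed cleavages, which this sketch leaves out.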

In microorganisms many proteins take part in a multicomponent protein aggregation under different physiological responses that affects the entire complex protein interaction network. The detection of these aggregated proteins is usually based upon affinity tag/pull down/MS–MS approaches at a proteome level (Lee et al. 2004; Liu et al. 2005; Trakselis et al. 2005). Studying these interactive proteins and super-molecular complexes is one prime task of second-generation proteomics, i.e. functional proteomics which involves microarray-based assays. Protein microarray technology is a complementary method for the study of protein in vitro. To date, among the various super-molecular complexes the analysis of protein–protein, enzyme–substrate, protein–DNA and protein–oligosaccharide interactions have been carried out (LaBaer and Ramachandran 2005). For example, high-density protein microarray-based approach has been used to study protein interactions using a yeast proteome chip containing 5,800 different recombinant yeast proteins (Zhu et al. 2001). This provides enough information about the network of proteins, but lacks in evidence for sites of interaction that can further lead towards the PTMs. In biodegradation data obtained from the protein, protein-domain and peptide microarrays would certainly contribute to understanding the complex protein–protein interactions during the chemotactic movement of bacteria towards the environmental pollutants under oxygen proliferation and oxygen starvation conditions.

Proteomics in biodegradation

The past few years have witnessed a substantial increase in the use of proteomics approaches for environmental biotechnological studies; still, the application of proteomics to the field of biodegradation is in its infancy (Cao et al. 2009; Schneider and Reidel 2010). Despite a growing knowledge of the range of microbial diversity, most of the microorganisms seen in natural environments are uncultivated, and their functional roles and interactions are unknown. In addition, the abilities of some microorganisms to tolerate radiation and toxic chemicals, to use many different electron donors and acceptors, or to survive at extremes of environmental conditions are all of interest. Proteomics has three main applications in biodegradation: (i) study of the response of microbial isolates to various environmental contaminants; (ii) identification of key proteins/enzymes involved in the metabolism of toxic compounds; and (iii) community structure analysis, especially during in situ bioremediation. These are discussed in detail in the following sections.

Environmental stress responses

Exposure of microorganisms to extreme temperatures, chemicals, metals and other stresses has been shown to elicit different physiological responses, such as changes in membrane properties, decreased ATP synthesis and denaturation of vital proteins (Suutari and Laakso 1994; Sikkema et al. 1995; Weber and Karbe 1995). Changes in membrane properties are mainly caused by changes in the fatty acid composition of membrane lipids in response to membrane-active substances or changing environmental conditions (Cronan and Roughan 1987; Okuyama et al. 1990; Heipieper et al. 1992; Magnuson et al. 1993; Weber et al. 1993). Proteomics approaches have been used to gain insights into the responses of microorganisms to temperature, metals and other stresses (Rosen et al. 2001; Mergeay et al. 2003; Marrero et al. 2004). This is chiefly because of their capability to obtain system-wide information for non-model organisms (e.g. given the cost of procuring DNA microarrays for these species) and to obtain protein identifications without a genomic sequence. These studies have shown up-regulation of known stress-response proteins along with proteins involved in other detoxification or adaptation processes, such as transporter proteins, lipid biosynthesis pathways and osmoprotectants. Santos et al. (2004) showed that proteins up-regulated in phenol-grown P. putida KT2440 included those involved in (i) oxidative stress response; (ii) general stress response; (iii) energetic metabolism; (iv) fatty acid biosynthesis; (v) inhibition of cell division; (vi) cell envelope biosynthesis; (vii) transcription regulation; and (viii) transport of small molecules. Cti, a protein responsible for cis–trans isomerization, has been well characterized in different Pseudomonas species such as P. oleovorans Gpo12, P. putida DOT-T1E and P. putida S12 (Junker and Ramos 1999; Pedrotta and Witholt 1999). 
It has been found that the isomerase possesses an N-terminal hydrophobic signal sequence, which is cleaved after targeting the enzyme to the periplasmic space. The general function of Cti is mainly dependent on the induction/activation of other stress response mechanisms (Heipieper et al. 1996; Neumann et al. 2003). The heat-shock response, especially the role of chaperones under stress conditions, has been studied extensively in several Gram-positive bacteria (e.g. B. subtilis) and Gram-negative bacteria (e.g. E. coli and A. tumefaciens). During growth of Burkholderia xenovorans LB400 on chlorobiphenyls, various molecular chaperones were expressed that were also induced during heat shock, strongly suggesting that exposure to chlorobiphenyls constitutes a stress condition for strain LB400. Further, during growth of strain LB400 on biphenyl, oxidative stress was evidenced by the induction of alkyl hydroperoxide reductase AhpC, which was also induced during exposure to H2O2. 4-Chlorobiphenyl and biphenyl also induced catechol 1,2-dioxygenase, as well as polypeptides involved in energy production, amino acid metabolism and transport (Agullo et al. 2007). Martinez et al. (2007) further reported the molecular response of strain LB400 to 4-chlorobenzoic acid as revealed by 2DE analysis. The enzymes BenD and CatA of the benzoate and catechol catabolic pathways were induced. The induction of the molecular chaperones DnaK and HtpG by 4-chlorobenzoic acid indicated that exposure to this compound constitutes a stressful condition for this bacterium. Further, growth on 4-chlorobenzoic acid also induced several Krebs cycle enzymes, suggesting a high energy demand for aromatic compound extrusion or a cellular energy requirement to combat its uncoupling effect (Martinez et al. 2007). 2,4-Dinitrophenol and benzoate have been reported to induce stress proteins in E. coli as well (Gage and Neidhardt 1993). DnaK and GroEL are induced by 2,4-dichlorophenoxyacetic acid (2,4-D) in Burkholderia sp. YK-2 (Cho et al. 2000) and by 4-chlorobiphenyl and biphenyl in B. xenovorans LB400 (Agullo et al. 2007). Lacerda et al. (2007) investigated the response of a natural community in a continuous-flow wastewater treatment bioreactor to an inhibitory level of cadmium by 2-D PAGE combined with MALDI–TOF/TOF–MS and de novo sequencing. The authors observed a significant shift in the community proteome after cadmium shock, as indicated by the differential expression of more than 100 proteins including ATPases, oxidoreductases and transport proteins. Several DNA repair proteins were observed to be rapidly up-regulated in response to cadmium shock (Hartwig 1994; Zhao and Poh 2008). Over-expression of these proteins points to their important role in the response to cadmium shock. In aerobic organisms the presence of toxic organic compounds leads to oxidative stress due to the formation of reactive oxygen species (ROS) (Engelmann and Hecker 1996; Tamburro et al. 2004). In order to withstand such oxidative stress, bacteria induce/overexpress catalase, which degrades hydrogen peroxide and thereby may form an important part of the antioxidant defense mechanism. Catalase is overexpressed in the presence of n-butanol in Enterobacter sp. VKGH12 and under ethanol stress in Bacillus subtilis (Engelmann and Hecker 1996). Another major group of enzymes for adaptation to high ROS concentrations seems to be the glutathione-S-transferases (Favaloro et al. 2000).

Other stress-related adaptive responses include changes in cell morphology and cell surface hydrophobicity, wherein different morphological changes have been observed in different bacteria depending on the stress factor. For example, a considerable difference in the adaptive reactions was observed between cells of strain VKGH12 grown in toxic concentrations of n-butanol and those exposed to the same concentrations as a pure toxin in the presence of another growth substrate. Cells gradually adapted to the growth substrate n-butanol with a decrease in size, which increased their surface-to-volume ratio at increasing concentrations of the alcohol (Veeranagouda et al. 2006). The cells probably increase their relative surface and membrane area to allow better uptake and transformation of the substrate n-butanol. By contrast, cells grown on glucose and exposed to toxic compounds such as n-butanol or aromatics increase in size, thus decreasing their surface-to-volume ratio. This relative reduction of the cell surface was earlier discussed as an adaptive response against the membrane-active action of these toxic compounds in a study with Pseudomonas putida DOT-T1E (Neumann et al. 2005).
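
The link between cell size and surface-to-volume ratio can be made explicit with a simple geometric idealization. The model is an illustrative assumption only (the cited studies do not state a cell-shape model): a rod-shaped cell is treated as a cylinder of radius r and length h capped by two hemispheres, with surface area S and volume V.

```latex
S = 2\pi r h + 4\pi r^{2}, \qquad
V = \pi r^{2} h + \tfrac{4}{3}\pi r^{3}, \qquad
\frac{S}{V} = \frac{2h + 4r}{r\left(h + \tfrac{4}{3}r\right)}
```

If every dimension is scaled by a factor k, then S scales with k squared and V with k cubed, so S/V scales with 1/k. Under this assumption a cell that shrinks (k < 1) gains relative surface area and a cell that enlarges (k > 1) loses it, which matches the direction of the two responses described above. For a spherical cell (h = 0) the expression reduces to S/V = 3/r.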

In addition to cell morphology the cell surface hydrophobicity may also get changed. There are reports that the alkane and PAHs utilizing bacteria show an increase in cell surface hydrophobicity during growth on these hydrophobic compounds (Al-Tahhan et al. 2000). For example, P. aeruginosa grown on n-hexadecane represented an overall reduction in expressed lipopolysaccharides that results in increased cell surface hydrophobicity (Al-Tahhan et al. 2000). In Rhodococcus sp. strain Q15 during growth at low temperature (5°C) on hexadecane or diesel fuel, an increase in cell surface hydrophobicity occurs as revealed by electron microscopic examination when compared to glucose-acetate-grown cells (Whyte et al. 1999). Membrane proteins, thus, are of great interest for the biodegradation of aromatic pollutants such as PAHs, NACs and organophosphates, where many alterations in the organism affect cell-surface proteins and receptors (Blackstock and Weir 1999; Wang and Yuan 2005). All these studies clearly demonstrate the vital role played by the proteomics in understanding the behavior of the microorganisms under stress conditions.

Identification of catabolic protein(s)

One of the essential goals of biodegradation research is to elucidate the degradation pathway(s) of toxic chemicals in different bacteria and to understand the underlying molecular mechanisms. The metabolic capabilities of microorganisms, including dehalogenation, methanogenesis, denitrification and sulphate reduction, are also important for applications in environmental biotechnology. A broad range of protein products control these key degradation reactions. To date, little is known about the molecular regulation of these pathways, mainly because genome sequences are not available for most of these organisms. However, recent advances in environmental proteomics, together with advanced bioinformatics tools, have made it possible to identify proteins even from un-sequenced organisms. Proteomics thus allows the examination of global changes in the composition or abundance of these products, as well as the identification of key proteins involved in the response of microorganisms to a pollutant (Hughes et al. 2004). Traditionally, the focus of biodegradation studies under the rubric of microbial ecology was on comparative qualitative assessments and functional characterization of a few phylogenetically conserved molecules. Recent methodological developments, however, have enhanced strategies for whole-cell protein assessments, based on the rationale that a complete proteome map will facilitate the discovery of unique polypeptides that are important contributors to the degradation pathways. 2DE has been used extensively for this purpose, making it possible to monitor and record global changes in the composition and abundance of proteins and to identify key proteins involved in the response of an organism in a given physiological state. Several reports have shown sets of proteins being up- and down-regulated in response to the presence of specific chemicals (Kim et al. 2004; Kuhner et al. 2005). Kim et al. (2004) identified PAH-induced proteins in Mycobacterium vanbaalenii PYR-1 using 2DE. Similarly, Pessione et al. (2003) used this method to characterize membrane proteins in Acinetobacter radioresistens S13 that are specifically induced during growth on aromatic substrates. 2DE has also been used to gain insight into the global mechanisms underlying phenol toxicity and tolerance in Pseudomonas putida KT2440 (Santos et al. 2004). Studies on the identification of proteins synthesized during biodegradation of toxic aromatic pollutants are listed in Table 2.
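The notion of proteins being "up- or down-regulated" in a 2DE comparison can be illustrated with a minimal sketch. It assumes that spot intensities have already been quantified and normalized, and it labels each spot from its log2 fold change. The function name, the two-fold threshold and the pseudocount are illustrative assumptions, not values taken from the studies cited above; real analyses also rely on replicate gels and statistical testing.

```python
import math


def classify_spots(control, treated, threshold=2.0, pseudocount=1.0):
    """Label each protein spot as 'up', 'down' or 'unchanged'.

    control, treated: dicts mapping a spot ID to a normalized intensity.
    threshold: fold change regarded as differential (assumed value).
    pseudocount: small constant that avoids division by zero (assumed value).
    """
    cutoff = math.log2(threshold)
    result = {}
    # Only spots detected under both conditions are compared.
    for spot in control.keys() & treated.keys():
        ratio = (treated[spot] + pseudocount) / (control[spot] + pseudocount)
        log2_fc = math.log2(ratio)
        if log2_fc >= cutoff:
            label = "up"
        elif log2_fc <= -cutoff:
            label = "down"
        else:
            label = "unchanged"
        result[spot] = (log2_fc, label)
    return result


if __name__ == "__main__":
    # Hypothetical intensities for three spots (glucose vs. pollutant-grown cells).
    control = {"spot_1": 100.0, "spot_2": 100.0, "spot_3": 100.0}
    treated = {"spot_1": 400.0, "spot_2": 20.0, "spot_3": 110.0}
    for spot, (log2_fc, label) in sorted(classify_spots(control, treated).items()):
        print(f"{spot}: log2FC={log2_fc:+.2f} -> {label}")
```

Working on the log2 scale treats induction and repression symmetrically: a two-fold increase and a two-fold decrease lie equally far from zero.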

Table 2 Identification of proteins synthesized during biodegradation of toxic aromatic pollutants using proteomic approaches

DIGE analysis revealed clearly distinct total protein profiles when the denitrifying bacterium strain EbN1 was grown anaerobically on toluene or ethylbenzene, indicating substrate-dependent regulation (Kuhner et al. 2005). Further, MS-based techniques have revolutionized environmental proteomics by making it possible to examine global changes in the composition or abundance of proteins extensively, as well as to identify key proteins involved in the response of microorganisms in a given physiological state (Donnes and Hoglund 2004). A number of reports have described sets of proteins that were up- or down-regulated in response to the presence of specific pollutants (Kim et al. 2002; Krivobok et al. 2003). For example, a variety of differentially expressed signature proteins were analyzed using SELDI–TOF–MS in blue mussels (Mytilus edulis) exposed to PAHs and heavy metals (Knigge et al. 2004). The global regulatory role of σ54 during gentisate induction in Pseudomonas alcaligenes NCIMB 9867 was investigated using MALDI–TOF (Zhao et al. 2005). Moreover, 2DE, DIGE, MALDI–MS/MS, Nano-LC and IT MS/MS were used to compare the proteome profiles of phthalate-grown cells of Rhodococcus sp. strain TFB with those of cells cultured in the presence of tetralin or naphthalene. The results suggested that different metabolic pathways are involved in the degradation of mono- and polyaromatic compounds. Using mass spectrometry, a novel type of highly negatively charged lipooligosaccharide from Pseudomonas stutzeri OX1, possessing two 4,6-O-(1-carboxy)-ethylidene residues in the outer core region, was identified (Leone et al. 2004). MS is also a potentially attractive means of monitoring the survival and efficacy of bioaugmentation agents, and it has therefore recently been applied for bioremediation purposes. For example, peptide mass fingerprinting (PMF) was used to identify and characterize Sphingomonas wittichii strain RW1 by targeting dioxin dioxygenase, its characteristic phenotypic biomarker. The proteome of RW1 cells grown on various media in the presence and absence of dibenzofuran was also analyzed using MALDI–TOF (Halden et al. 2005).
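The principle behind peptide mass fingerprinting can be sketched in a few lines: a candidate protein sequence is digested in silico with trypsin, the singly protonated monoisotopic mass of each peptide is calculated, and the resulting list is compared with the observed peak list. The sketch below is a simplified illustration under stated assumptions and is not the procedure used by Halden et al. (2005). The function names, the 50 ppm tolerance and the example sequence are hypothetical, and missed cleavages and chemical modifications are ignored.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added to the residue sum of a free peptide
PROTON = 1.007276   # charge carrier for the [M+H]+ ion


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mh(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_fraction(observed_mz, sequence, tol_ppm=50.0):
    """Fraction of theoretical tryptic peptides matched by an observed peak."""
    theoretical = [peptide_mh(p) for p in tryptic_digest(sequence)]
    if not theoretical:
        return 0.0
    matched = sum(
        any(abs(obs - theo) / theo * 1e6 <= tol_ppm for obs in observed_mz)
        for theo in theoretical
    )
    return matched / len(theoretical)


if __name__ == "__main__":
    candidate = "MKTAYIAKQRPGLVEK"  # hypothetical sequence, not a real biomarker
    peaks = [peptide_mh(p) for p in tryptic_digest(candidate)]
    print(tryptic_digest(candidate))
    print(f"matched fraction: {match_fraction(peaks, candidate):.2f}")
```

In practice the observed peak list would be scored against every entry of a protein database, and the best-scoring entry (for example, a dioxygenase subunit) would be reported as the identification.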

Community structure analysis

Monitoring the ecological consequences of any technological intervention that directly or indirectly affects the environment (such as an in situ bioremediation process) is of utmost significance and probably constitutes the most important aspect of assessing the ecological sustainability of the process. The study of such consequences encompasses several unrelated phenomena; for biodegradation technology development, however, a detailed analysis of the impact of the biodegradation process on the indigenous microbial community structure is most important. The ideal remediation technology should not adversely affect the indigenous microbial community structure of the site under intervention. Microbial communities also play key roles in biogeochemical cycles. Our knowledge of the structure and activities of these communities is limited because analyses of microbial physiology and genetics have been largely confined to organisms from the few lineages for which cultivation conditions have been determined (Delong and Pace 2001). Recent advances in metaproteomics have enabled community analyses that evaluate in situ microbial activity in a particular environment and thereby explain cellular mechanisms of un-sequenced mixed cultures under variable environmental conditions. Several proteomics studies of microbial community structure in different niches have identified key proteins (Maron et al. 2007; Wilmes and Bond 2009; VerBerkmoes et al. 2009).

Application of proteomics to environmental problems is still a burgeoning research area, but it has yielded insights into phosphorus removal in wastewater treatment plants (Wilmes and Bond 2006; Wilmes et al. 2008a, b), protein expression in mixed chloroethene-dechlorinating cultures (Morris et al. 2007), uranium bioremediation (Wilkins et al. 2009) and biofilms associated with acid mine drainage (AMD) systems (Jeans et al. 2008; Goltsman et al. 2009). Using genomic and mass spectrometry-based proteomic methods, Ram et al. (2005) evaluated gene expression, identified key activities, and examined the partitioning of metabolic functions in a natural AMD microbial biofilm community. They detected 2,033 proteins from the five most abundant species in the biofilm, including 48% of the predicted proteins from the dominant biofilm organism, Leptospirillum group II. In further analyses of the same biofilms, Lo et al. (2007) were able to differentiate between peptides of discrete AMD populations and found strong evidence for interpopulation recombination. This approach, referred to as "strain-resolving proteogenomics," depends strongly on a database containing strain-specific genome information. The study was expanded by Denef et al. (2009), whose extensive semi-quantitative analysis of 27 distinct AMD biofilm protein profiles revealed that specific environmental conditions select for particular recombinant types, leading to a fine-scale tuning of microbial populations. More recently, Goltsman et al. (2009) employed both metagenomics and semi-quantitative community proteomics to analyze a Richmond mine biofilm and identified 64.6 and 44.9% of the predicted proteins of Leptospirillum groups II and III, respectively; this study demonstrates the potential of a simultaneous genome and proteome approach.

Wilmes and Bond (2004) studied the molecular mechanisms of enhanced biological phosphorus removal (EBPR) by a comparative metaproteome analysis of two laboratory wastewater sludge microbial communities, with and without EBPR performance, using 2-D PAGE combined with MALDI-TOF-MS. Major differences in protein expression profiles between the two reactors were detected; later, more than 2,300 proteins were identified by 2-D LC-MS/MS analyses of activated sludge (Schulze et al. 2005; Wilmes et al. 2008a, b), aided by reference metagenomic data from studies of EBPR sludge (Martín et al. 2006). The data indicated that the uncultured polyphosphate-accumulating bacterium "Candidatus Accumulibacter phosphatis" dominates the microbial community of the EBPR reactor, and they enabled an extensive analysis of metabolic pathways central to EBPR, e.g. denitrification, fatty acid cycling, and the glyoxylate bypass.

Another study, by Chuang et al. (2010), used shotgun mass spectrometry-based proteomic methods to show the presence and activity of vinyl chloride (VC)-oxidizing bacteria (ethenotrophs) in ethene-enriched groundwater microcosms from a VC-contaminated site. Similarly, Schulze et al. (2005) used mass spectrometry (MS)-based proteomics [high mass-accuracy tandem MS (MS–MS)] to analyze the protein complement of water containing high levels of dissolved organic matter from four different environments (a peat bog lake, soil from an unmanaged deciduous forest, soil from a managed evergreen spruce forest and acidic soil from beneath a spruce). Their aim was to determine the phylogenetic groups from which individual proteins originated and to elucidate the potential catalytic function of these proteins in the sampled ecosystems. The overall soil protein composition was investigated, and the majority of proteins were of bacterial origin. In forest leachate, the number of proteins that originated from plants, fungi and vertebrates was approximately twice the number in the sampled lake water. Environmental proteomics can also detect the expression of proteins in a microbial community and provides critical insights into important cellular activities with temporal and spatial environmental resolution (Uchiyama et al. 1999; Cho et al. 2000; Sharma et al. 2006; Zhao et al. 2007).
Such information not only yields useful biomarker proteins for environmental bioremediation, but also helps to build up a proteomic databank that enables a better understanding of the catalytic roles of microorganisms. It can therefore be concluded that proteomic investigations of microbial communities in their native environments provide the most realistic information about their function, but also pose the greatest experimental and bioinformatic challenges. These findings provided evidence of exchange of genes during adaptation to specific ecological niches. More recently, Benndorf et al. (2007) published a metaproteome analysis of protein extracts from contaminated soil and groundwater employing either 1-D or 2-D PAGE combined with LC and MS/MS. Proteome analyses of soils suffer mainly from numerous inorganic and organic contaminants, which hamper protein separation and identification; thus, only 59 proteins could be identified, although the authors presented a multi-step purification protocol combining NaOH treatment and phenol extraction. A similar approach was employed to investigate the metaproteome of an anaerobic benzene-degrading community inhabiting aquifer sediments (Benndorf et al. 2009). Despite these significant advances, the proteomic analysis of microorganisms in soil has been hampered by the lack of effective methods for extracting proteins directly from soil in a manner that is compatible with proteomic techniques.

Potential applications during bioremediation

As indicated in earlier sections, biodegradation has found application in bioremediation. Studies have shown the successful use of microorganisms to remove or detoxify toxic chemicals such as alkyl benzenes, polychlorinated biphenyls and nitroaromatic compounds, either by exploiting the metabolic potential of endogenous microorganisms or by bioaugmentation (Labana et al. 2005; Vieth et al. 2005; Paul et al. 2006). Furthermore, the post-genomic era has promoted genome-level research on bacterial biodegradation, enabling environmental microbiologists to gain a better understanding of the bioremediation process. Progress in the study of biodegrading bacteria has also given impetus to the development of suitable bioremediation strategies.

The practicability of bioremediation of sites contaminated with the aforementioned compounds rests essentially on pollutant characterization and microbial characterization. In particular, microbial characterization includes parameters for monitoring the specific catabolic diversity, population sizes and catabolic activities in a contaminated site. In situ bioremediation is generally carried out by a microbial consortium and is influenced by environmental factors such as pH, temperature, water content, geological characteristics, nutrient availability, external electron availability, and the bioavailability of pollutants (Mergaert et al. 1992; Ferrari 1996). However, understanding pollutant degradation by bacterial consortia requires broader and more complicated analyses of these environmental factors. Extensive and intensive characterization of the catabolic genes, proteins and metabolites in microorganisms associated with pollutant degradation is therefore necessary in order to devise the most effective strategy for bioremediation. Co-metabolism and gene expression often affect the rate of bioremediation. Because degradability depends on gene expression, the presence of biodegradation genes does not guarantee biodegradability in a given environment. The catabolic pathways for most pollutants are very complicated and differ among bacterial strains and under different environmental conditions (Winkler et al. 1995; Giacomazzi and Cochet 2004). A substrate can occasionally be transformed into several intermediates by the action of different enzymes expressed under specific conditions. The complexity of catabolic enzymes thus makes it very difficult to study the biodegradation pathways. This suggests that proteome analysis, which directly reveals which enzymes are actually expressed, is a faster and more effective way to study biodegradation pathways than the conventional methods typically used.
Therefore, before they can be used in bioremediation, most bacteria whose genome sequences have been completely or partially determined must be further studied using proteome analysis in order to determine whether the catabolic genes are expressed in the bacteria.

Challenges in environmental proteomics

Proteomics is now well established as a valuable tool for characterizing the functional molecules of various signaling pathways in biomedical research, but only a few laboratories currently apply it to environmental concerns. There is no doubt that the technology is expensive and requires highly specialized facilities and skilled staff to perform the analyses. Clearly, proteomics technology needs to become available for environmental cleanup at a more reasonable cost. Nevertheless, progress is being made in implementing proteomics studies in environmental biotechnology laboratories. Even though many of its applications remain beyond the reach of current practice, the technology is already offering tangible benefits by helping us to understand the principles of protein expression and by identifying new target candidate proteins that can be exploited by other technologies. For community-related studies, the major challenge lies in the extraction of cellular proteins from soil and other high-solids matrices, especially because of the presence of high concentrations of interfering compounds with properties similar to those of proteins. New technologies for protein extraction, purification and separation are therefore needed. Improved bioinformatics tools are also needed to aid the identification of proteins from un-sequenced microorganisms and especially from un-sequenced microbial communities. Even so, advances in the field represent major steps towards a systems view of organisms and metaorganisms. Another challenge is to integrate proteomics with other "-omics" technologies, particularly metabolomics, since low-molecular-weight primary and secondary metabolites play key roles in biodegradation.

Technology based interpretation for future implications

Analyzing low-molecular-weight metabolites is essential in the study of any biodegradation process. Detailed information about an individual organism, its proteins and end products, and its metabolites can only be obtained via a combined transcriptomics, proteomics, and metabolomics approach. This is because cellular mRNA levels do not display a range as dynamic as that of proteins inside a cell. The integrated approach would make it possible to identify new functional genes involved in the catabolism of xenobiotics and thereby to develop a better understanding of the cellular processes of environmentally relevant microorganisms that have not yet been studied. Figure 3 conceptualizes the use of 'omic' technologies towards meaningful bioremediation.

Fig. 3

Conceptualization of the role of post-genomic technologies, using a systems biology approach, towards meaningful bioremediation. DNA extracted directly from contaminated environmental sites or from pure cultures is analyzed on DNA microarrays (transcriptomics), although cDNA microarray analysis has certain limitations, including in parallel gene trait mapping. Proteins extracted from environmental sites and/or from pure cultures feed into proteomics, interactomics and metabolomics technologies based on two-dimensional gel electrophoresis, protein microarray and mass spectrometry platforms. Post-translational modifications (PTMs) can be detected by functional proteomics and metabolomics approaches, down to the end products, the metabolites. The joint approach of transcriptomics, proteomics and metabolomics would allow the mineralization pathways to be explored in detail.

Conclusions

The elucidation of new catabolic pathway(s) has shown that a variety of environmental pollutants are susceptible to microbial degradation. However, our understanding of microbially mediated biodegradation has been limited by the complexity of microbial physiology. To overcome the challenges in the field of biodegradation, new techniques such as genetic engineering, transcriptomics, proteomics and metabolomics offer remarkable promise as tools for studying and understanding in detail the mechanisms that regulate mineralization processes. Proteomics plays an essential role in determining the physiological changes of microorganisms under specific environmental influences, and metabolomics combined with gene expression analysis allows the final gene product to be addressed in a comprehensive manner. Owing to its sensitivity, functional proteomics could help predict the metabolism of contaminant(s) by degrading organism(s). Application of functional metabolomics might even allow cell-free biodegradation in the future. Continued technological advances should ultimately lead to comprehensive approaches in which gene expression, proteins and metabolites can be analyzed together to elucidate the functioning of the entire organism.