1 Extremophiles and Extreme Environments

Extreme habitats have existed across the globe since the genesis of the Earth, and they harbor a rich microbial diversity. Ancient evidence of life, such as microfossils, stromatolites, microfibrous sedimentary rocks, and the sedimentary carbon pool, suggests that microorganisms have inhabited the Earth since the Archaean, the time before 2.5 billion years ago (Stanley 2005). Such ancient microbial life had developed robust metabolic functions similar to those of many present-day extremophiles thriving in extreme environments. Extremophiles hold secret survival “kits” that let them withstand single or multiple extreme conditions. Since the discovery of extremophiles, microbiologists have explored their cellular properties, such as new gene pools, robust biomolecules, and metabolic uniqueness, through culturing methods. However, despite technological advances in the investigation of extremophiles and their habitats, we have decoded only very limited information from the extreme biosphere (Rampelotto 2013).

Extreme territories support all three domains of life. However, the largest membership is represented by the Archaea, followed by the Bacteria and the eukaryotes. The ability of these extremophilic microorganisms to proliferate under extreme conditions is of immense importance for understanding microbial physiology and evolution. Extremophiles are best characterized according to their growth profiles, using marginal data, under defined culture conditions including salt concentration, temperature, pH, and hydrostatic pressure (Mesbah and Wiegel 2008). Representative examples of extremophiles thriving under different extremes are thermophiles (45–60 °C), hyperthermophiles (60–120 °C), psychrophiles (below 0 °C), acidophiles (pH below 4.0), alkaliphiles (pH above 9.0), piezophiles or barophiles (pressure >0.5 MPa), halophiles (NaCl concentration >1 M), and xerophiles (water activity <0.85) (Horikoshi et al. 2010).
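
The growth thresholds listed above can be read as a simple lookup from a measured growth profile to one or more extremophile classes. The following Python sketch illustrates that reading only; the `GrowthProfile` class and `classify` function are hypothetical names introduced here, and assigning exactly 60 °C to the hyperthermophile range is an assumption, since the two temperature ranges in the text share that boundary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GrowthProfile:
    """Growth conditions for one isolate; None means 'not measured'."""
    temperature_c: Optional[float] = None
    ph: Optional[float] = None
    pressure_mpa: Optional[float] = None
    nacl_molar: Optional[float] = None
    water_activity: Optional[float] = None


def classify(profile: GrowthProfile) -> List[str]:
    """Return every extremophile class whose threshold the profile meets.

    Thresholds follow the ranges quoted in the text (Horikoshi et al. 2010).
    An organism can carry several labels (a polyextremophile).
    """
    labels: List[str] = []

    t = profile.temperature_c
    if t is not None:
        if t < 0:
            labels.append("psychrophile")
        elif 45 <= t < 60:
            labels.append("thermophile")
        elif 60 <= t <= 120:  # assumption: 60 °C counts as hyperthermophile
            labels.append("hyperthermophile")

    if profile.ph is not None:
        if profile.ph < 4.0:
            labels.append("acidophile")
        elif profile.ph > 9.0:
            labels.append("alkaliphile")

    if profile.pressure_mpa is not None and profile.pressure_mpa > 0.5:
        labels.append("piezophile")
    if profile.nacl_molar is not None and profile.nacl_molar > 1.0:
        labels.append("halophile")
    if profile.water_activity is not None and profile.water_activity < 0.85:
        labels.append("xerophile")

    return labels


if __name__ == "__main__":
    # Hypothetical isolate from an acidic hot spring.
    print(classify(GrowthProfile(temperature_c=85, ph=3.0)))
```

Returning a list rather than a single label reflects the point made earlier that extremophiles may tolerate single or multiple extreme conditions.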

Microorganisms from extreme habitats are poorly culturable because of persistent abiotic stresses and the differences between natural environments and laboratory conditions. Additionally, culturability is a complex physiological process that depends on various phenomena (Barer and Harwood 1999). Because of these limitations, the majority of such environments have remained unexplored. In recent years, however, the development and application of molecular techniques such as PCR, cloning, and next-generation high-throughput sequencing have proved valuable in assessing the distribution and diversity of microorganisms in extreme habitats. Hence, given the limitations of the available culturing methods, extremophiles are now studied by a cultivation-independent approach referred to as metagenomics, which helps to reveal the potential of various extremophilic microorganisms. Furthermore, the whole community can be deciphered through metagenomics, whereas traditional microbiology relies on the cultivation of a few clones or colonies. Metagenomics thus profiles the microbial diversity of an extreme environment at the level of the entire community, and it is also broadly described as environmental genomics, ecogenomics, or community genomics (Hugenholtz et al. 1998).

2 Metagenomics and Microbial Diversity

The development and discovery of various molecular biology techniques after the 1980s extended genomics toward its associated “omics” technologies. Metagenomics emerged at the end of the last century; it removed the need for cultivation when mining microbial information and revolutionized microbial ecology. Metagenomics is the study of the collective genomes isolated directly from an environmental sample for the comprehensive analysis of the microbial diversity and ecology of a specific environment. Metagenomics studies provide a means of analyzing previously unknown organisms, and at the same time, one can examine the diversity of organisms present in a specific environment as well as the complex interactions between its members (Handelsman 2004). Metagenomics studies are conducted by two different approaches. Function-based analysis involves extraction of total DNA from the environmental sample, followed by cloning into a suitable host and detection of expressed phenotypes in the host cells. Sequence-based analysis is mainly concerned with decoding the extracted DNA and/or RNA using various sequencing platforms, followed by assessment of taxonomic diversity. Both approaches can be used to decipher hidden microbial gene pools and to profile the microorganisms (Fig. 5.1). Early environmental gene sequencing relied on cloning of the 16S rRNA gene to analyze the microbial taxonomic profile. Such work revealed that the vast majority of microbial biodiversity had been missed by cultivation-based methods (Hugenholtz et al. 1998). Gradually, shotgun Sanger sequencing and massively parallel pyrosequencing were applied to obtain largely unbiased samples of all genes from all members of the sampled communities (Eisen 2007). The first metagenomics studies conducted with high-throughput sequencing, using massively parallel 454 pyrosequencing, transformed the study of the microbial universe (Poinar et al. 2006; Edwards et al. 2006). Nowadays, various next-generation sequencing (NGS) platforms are used for metagenomics studies, and the existing sequencing technologies are continuously improved, often by their original developers.

Fig. 5.1 Standard metagenomics pipeline for environmental microbiomes research

3 Environmental Metagenomics

According to the meeting report of the Earth Microbiome Project, about one nonillion (10^30) microbial cells are present on the Earth. The theoretical estimate of the average quantity of DNA in a microbial cell is 10 million base pairs. So far, global environmental DNA sequencing efforts have investigated hardly 1% of the total environmental DNA (Gilbert et al. 2010). Even this figure may overstate what has actually been analyzed, so the vast majority of information on microbial life on Earth remains unknown and/or under-sampled. Hence, the study of extreme environmental metagenomics is still at an early stage.
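
To make the scale of these estimates concrete, the arithmetic can be worked through directly. The short Python sketch below uses only the figures quoted in the paragraph above (10^30 cells, 10 million base pairs per cell, and 1% as an upper bound on what has been sequenced); the constant and function names are hypothetical and chosen for illustration.

```python
# Figures quoted in the text (Gilbert et al. 2010); integers keep the math exact.
CELLS_ON_EARTH = 10**30       # estimated microbial cells on Earth
BP_PER_CELL = 10**7           # theoretical average DNA per cell (10 million bp)
PERCENT_SEQUENCED_MAX = 1     # "hardly 1%" is treated here as an upper bound


def total_environmental_bp(cells: int, bp_per_cell: int) -> int:
    """Total base pairs of microbial DNA implied by the two estimates."""
    return cells * bp_per_cell


def sequenced_bp_upper_bound(total_bp: int, percent: int) -> int:
    """Upper bound on base pairs investigated so far."""
    return total_bp * percent // 100


if __name__ == "__main__":
    total = total_environmental_bp(CELLS_ON_EARTH, BP_PER_CELL)
    sequenced = sequenced_bp_upper_bound(total, PERCENT_SEQUENCED_MAX)
    print(f"total environmental DNA: {total:.0e} bp")
    print(f"sequenced (at most):     {sequenced:.0e} bp")
    print(f"still unexplored:        {total - sequenced:.2e} bp")
```

Under these assumptions the total is 10^37 base pairs, of which at most 10^35 have been examined, which is the basis for describing the field as being at an early stage.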

Early environmental metagenomics projects are considered a key trigger for the field of extreme microbiomes. The environmental genomic study of the Sargasso Sea (Venter et al. 2004) was a major breakthrough in environmental metagenomics and stimulated interest among scientific communities in exploring the microbial diversity of extreme habitats using the metagenomics approach. The scientific literature on “extreme environmental metagenomics” available in the public domain has grown quickly over the last decade, indicating the development of the field (Fig. 5.2). The rapid escalation of metagenomics projects in various online databases likewise indicates the quick growth of the field. Currently, more than 20% of the metagenomes submitted to public databases are derived from extreme biospheres, including marine/ocean, abyssal plain, desert, hydrothermal vent, permafrost, glacier, salt marsh, thermal hot spring, geyser, soda lake, hypersaline lake, submarine volcano, black smoker, and acid mine drainage environments (Table 5.1). Undoubtedly, this is due to recent advances in high-throughput sequencing technologies and more sophisticated bioinformatics analysis pipelines, which have made metagenomics studies much easier and faster.

Fig. 5.2 Literature available in the public domain on extreme environmental metagenomics, accessed on 8 February 2017 on Google Scholar (n = 15,700). Data presented in the graph for the last 10 years run clockwise from the year 2007 (green series, 1%) to 2016 (yellow series, 23%)

Table 5.1 Metagenomes of extreme environments available in public domains (accessed on 8 February 2017)

The recent identification of new gene pools and species of extremophiles from extreme habitats has accelerated the exploration of microbial species for their industrial and biomedical potential. At the beginning of the metagenomics era, a large-insert vector, the bacterial artificial chromosome (BAC), was used to construct metagenomic libraries of environmental DNA (Rondon et al. 2000), and function-based and sequence-based analyses were performed on each clone. The modern metagenomics approach makes it possible to understand the physiology of extremophiles, their role in their habitats, and their adaptation to environmental pressures. The microbiome of an extreme biosphere may therefore help to establish the structure of the microbial community network, which is very useful for decoding microbial functionality, interaction, and community dynamics (Cowan et al. 2015). However, various experimental challenges, from sampling to sequencing, should be addressed before conducting environmental metagenomics projects.

4 Challenges in Environmental DNA Extraction

The key challenges in conducting metagenomics studies include sampling and transporting an adequate, intact environmental sample and extracting high-quality nucleic acid from it. The stresses at an extreme site are the main hurdles to the extraction and purification of high-quality nucleic acids, so sample processing is a prerequisite for an environmental metagenomics project (Thomas et al. 2012). Isolation of poor-quality DNA may hamper subsequent steps such as cloning and sequencing, so specific methods and protocols are needed to extract high-quality, high molecular weight (HMW) community genomic DNA from the environmental sample. Various direct extraction methods, including freezing-thawing, bead beating, and ultrasonication, along with indirect extraction methods such as PEG-NaCl-based extraction and enzymatic lysis with hot detergent treatment, are used to extract environmental DNA that is suitable for the functional and structural profiling of microorganisms (Delmont et al. 2011; Narayan et al. 2016). Commercial kits based on direct and indirect extraction have also been developed by many manufacturers to extract nucleic acid of good quality and quantity from soil and water samples. However, the success of DNA extraction depends on the microbial population and the physiological status of the cells.

Heterogeneous microbial communities exist in the environment, and microbial cells vary substantially in the structure of their cell walls and cell membranes. This cellular dissimilarity among microbial species rules out a single universal cell lysis method for nucleic acid extraction. Harsh treatment can be used, but it damages the DNA, whereas mild treatment leads to low recovery. A combination of chemical, physical/mechanical, and biological cell lysis is therefore best suited to the extraction of high-quality environmental DNA (Bag et al. 2016). Environmental stresses further complicate the extraction procedure: the adaptive and protective mechanisms that allow microbial cells to survive under extreme conditions may also make them very resistant to lysis, so that too little nucleic acid is recovered and the genomic content of rare species is missed. Hence, before starting an extreme environmental metagenomics project, one should thoroughly study the geological, biological, and cellular features of the site in order to extract high-quality HMW community genomic DNA (Table 5.2).

Table 5.2 Challenges in DNA extraction from the various extreme biosphere and methodologies required for the success of extraction (Note: CW cell wall and CM cell membrane)

5 Sequencing Platforms

Metagenomics analysis by DNA sequencing is performed either through gene-targeted metagenomics (i.e., 16S rRNA or 18S rRNA) for taxonomic assessment or through whole genome shotgun sequencing for structural and functional analysis (Ghelani et al. 2015; Dudhagara et al. 2015; Patel et al. 2015). Presently, various NGS platforms are used effectively in metagenomics pipelines, including the AB SOLiD System, Roche 454 GS FLX, Illumina MiSeq, and Ion Torrent (Liu et al. 2012; Mardis 2013). None of these sequencing techniques is feasible for off-grid analysis, and all offer short read lengths, which creates difficulties in the assembly process and consequently affects downstream analysis, including taxonomic and functional profiling. Two more recent sequencing platforms, (1) PacBio sequencing and (2) Oxford nanopore sequencing, overcome these limitations of the NGS techniques discussed above. Both provide longer reads, which are particularly useful for the analysis of a diverse pool of microorganisms.

5.1 PacBio Sequencing

PacBio sequencing is a real-time sequencing technology developed by Pacific Biosciences, California, USA. Single-molecule real-time (SMRT) sequencing is based on a single-stranded circular DNA template called a SMRTbell, which is loaded onto a chip referred to as a SMRT cell (Travers et al. 2010). The key feature of this method is its long read length, up to 10^4 bp, which makes it suitable for microbiome analysis by full-length sequencing of target genes, i.e., 16S rRNA (~1500 bp) and 18S rRNA (~1800 bp) (Schadt et al. 2010). Longer reads are important for improving contiguity in the assembly process. However, the higher error rate, high cost, and lower sequencing depth are major drawbacks of the technique (Rhoads and Au 2015). Recently, long-read circular consensus sequencing (CCS) has made it possible to obtain large contigs and to minimize errors, reaching >99% Q20 accuracy, which makes the technique comparatively affordable for metagenomics analysis pipelines (Frank et al. 2016). The aim of a metagenomics project can thus be achieved by obtaining long contigs with a negligible risk of misassembly. PacBio sequencing is suggested for the analysis of microbial abundance and for taxonomic assessment. Furthermore, fused assemblies of PacBio CCS and Illumina HiSeq contigs improve assembly statistics such as overall contig length and number.
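
The accuracy figure above is easier to interpret with the standard Phred relation between a quality score Q and the per-base error probability p, Q = -10 log10(p). The Python sketch below assumes that "Q20" in the text refers to a Phred score of 20; the function names are hypothetical, and the 1500 bp read length is the approximate full-length 16S rRNA size quoted above.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Per-base error probability for a Phred quality score Q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for a per-base error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def expected_errors(read_length_bp: int, q: float) -> float:
    """Expected number of miscalled bases in a read of uniform quality Q."""
    return read_length_bp * phred_to_error_prob(q)


if __name__ == "__main__":
    q = 20
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:.4f}, per-base accuracy {1 - p:.2%}")
    # Approximate full-length 16S rRNA amplicon (~1500 bp, as quoted in the text).
    print(f"expected errors in a 1500 bp read at Q{q}: {expected_errors(1500, q):.1f}")
```

Under this reading, Q20 corresponds to one expected error per 100 bases, so a full-length 16S rRNA read at uniform Q20 would still carry roughly 15 miscalled bases, which is why consensus over multiple passes matters for taxonomic assignment.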

5.2 Oxford Nanopore Sequencing

Oxford nanopore sequencing is regarded as a fourth-generation sequencing method. It is based on a nanopore embedded in a membrane that is held at a set voltage. When ssDNA or ssRNA passes through the nanopore, the variation in current is detected and decoded into the nucleotide order (Ashkenasy et al. 2005). Oxford Nanopore Technologies has devised the portable MinION sequencer, which is fast and small and produces reads up to 200 kb long with high accuracy. Its ultra-portability allows in-field metagenomics analysis and hence overcomes the difficulties associated with preserving and transporting extreme environmental samples to the laboratory. Environmental samples from glaciers and hot springs degrade easily in transit, and further biases are introduced by taphonomic degradation during storage and subsequent extraction. In situ microbial community analysis using portable nanopore sequencing has improved the agility with which environmental microbiomes can be analyzed (Edwards et al. 2016). However, off-grid metagenomics analysis should be cross-validated before it is applied to the search for microbial life, including extraterrestrial life, in its native habitat.

6 Future Prospects

Environmental microbiomics is expected to bring new insights into the comprehensive determination of microbial composition in the near future. The newest subdisciplines within the metagenomics field, metatranscriptomics and metaproteomics, also promise greater resolution of the structure and function of the microbial community. In the future, single-step DNA extraction, rapid library preparation, and fast real-time in situ DNA and RNA sequencing will advance the microbiomics of the extreme biosphere. The progressive miniaturization of sophisticated tools and techniques will move laboratory-dependent analysis toward in-field study, capturing a more realistic microbial profile. The fusion of sequencing with Raman spectroscopy, as well as the miniaturization of bioinformatics tools, also holds promise for the search for and analysis of microbial life. However, universal standard procedures and protocols should be established to ensure uniformity in environmental metagenomics research, as has been done for the human microbiome.