1 Bioprospecting

The term bioprospecting refers to the systematic search for novel biological products and activities with biotechnological applications in natural habitats [1]. The course of search and discovery in biotechnology starts with the selection of the most appropriate environments and sampling methods from previous information, continues with the retrieval of the biological materials and their correct storage, moves through screening for desired attributes in the form of microbial assemblages, cells, macromolecules, metabolites, or bioactive compounds using an ever-growing toolkit, and culminates with the development of a commercial product or process ([2], Fig. 11.1). This workflow represents a value-adding chain that ends with the addition of products and services that respond to society’s needs [3]. The biotechnological potential of microbial diversity can be further improved by tools such as enzyme engineering, metabolic engineering and directed evolution [4]. Due to their exceptional microbial biodiversity, marine habitats represent fertile grounds for bioprospecting. In this chapter, we will explore the state-of-the-art and emerging approaches that can be used to search for biological products and activities with applications in all biotechnological fields.

Fig. 11.1 Workflow of search and discovery in biotechnology

2 Marine Microbial Habitats and Their Biotechnologically-Relevant Microorganisms

The marine environment covers more than 70 % of the Earth’s surface and contains 97.5 % of the water of our planet. Marine habitats contain a rich variety of distinctive life forms, the majority of them represented by microorganisms. Salinity is the major environmental determinant of microbial community composition, clearly distinguishing marine habitats from terrestrial ones [5]. Moreover, marine sediments constitute the most phylogenetically diverse environments on Earth, in contrast with soil, which bears high species-level diversity but has below-average phylogenetic diversity [5]. Marine microorganisms are progressively recognized as a promising source of biotechnologically valuable products and capabilities. Over the last years, many biomolecules with unique structural features and unique molecular mode of action have been identified in marine environments [6]. However, many marine microbial habitats still remain largely unexplored, understudied, and underexploited in comparison with terrestrial ecosystems and organisms.

Through billions of years of evolution, marine microorganisms have developed unique metabolic and physiological capabilities to thrive in a variety of marine habitats. In fact, oceans include the greatest extremes of temperature, light, and pressure encountered by life [7]. In recent years, marine microorganisms living under extreme conditions have been the focus of bioprospecting efforts as novel sources of biomolecules with biotechnological potential [8, 9]. For example, hydrothermal vents comprise microorganisms with distinct metabolisms based on chemosynthesis. The high diversity and abundance of these communities are comparable to those found in shallow tropical seas, and thus they are recognized as potentially rich sources of biologically active natural products [10]. Piezophilic microorganisms inhabiting deep-sea habitats are also of interest, as they can provide enzymes for high-pressure bioreactors, among other applications [11]. Interestingly, these microorganisms can be either psychrophilic or thermophilic due to the cold temperatures of the deep ocean or to their proximity to hydrothermal vents, respectively.

Other types of marine microorganisms with potential biotechnological capabilities include those living under epiphytic, epibiotic, and symbiotic lifestyles. Competition and defence strategies characteristic of surface-associated microorganisms, such as the production of toxins, signaling molecules, and other secondary metabolites, constitute an unparalleled reservoir from a biotechnological perspective [12, 13]. Bacteria living in symbiotic associations with marine invertebrates often produce complex metabolites as a consequence of coevolution with their host [14]. Sponges and corals are examples of habitats where symbiotic microorganisms with interesting capabilities have been found [15]. In many cases, microorganisms have been found to be the producers of metabolites previously assigned to their hosts [16].

Microorganisms from intertidal zones must be able to tolerate rapid and repeated fluctuations in environmental conditions. These include temperature, light, and salinity, as well as wave action, ultraviolet radiation, and periods of drought [17]. Intertidal microbial communities preferentially grow as biofilms on natural and artificial surfaces. Within these protective microenvironments, they are subjected to intense biological and chemical interactions, leading to the production of various interesting secondary metabolites [18]. For example, in response to intense solar radiation, cyanobacteria and other microorganisms inhabiting intertidal or supratidal zones produce UV-absorbing/screening compounds, which present potential for the development of novel UV blockers for human use [19].

Certain phylogenetic groups constitute per se interesting targets for bioprospection. Actinomycetes (within the phylum Actinobacteria) are widely known for their capacity to produce metabolites, which include antibiotics, antitumor and immunosuppressive agents and enzymes, among others [20]. Novel compounds with biological activities have already been isolated from marine actinomycetes [6]. Culture-independent studies have shown that the majority of actinomycetes from marine environments are not recovered by cultivation-based methods, and that marine actinomycetes are very different phylogenetically from their terrestrial counterparts [21, 22]. Deep-sea sediments and marine flora and fauna provide access to true marine actinomycetes, away from the influence of land wash-offs [6]. Given that 45 % of all microbial bioactive secondary metabolites currently derive from actinomycetes and that only 10 % of these metabolites are estimated to have been discovered so far, it becomes evident that advances in the ability to access this yet unexploited diversity will provide a new source for the discovery of secondary metabolites [23].

3 Methods for Microbial Bioprospecting in Marine Environments

Both culture-dependent and independent methods have uncovered an incredible diversity of microorganisms whose metabolisms largely have yet to be characterized [24, 25]. These methods have been further empowered by genomic-level information, which in turn is supported by sequencing technologies and bioinformatics [26, 27]. In the next sections, traditional, state-of-the-art and emerging approaches used for bioprospecting marine microorganisms with biotechnological potential will be reviewed.

3.1 Culturing Techniques

Microbial bioprospection and biodiscovery is currently severely limited by the lack of laboratory cultures [28]. Although culture-independent approaches have revolutionized environmental microbiology, the development of biotechnological applications from the genetic potential of microbial communities as well as fundamental environmental research must be anchored by the corresponding study of pure cultures. Furthermore, this novel diversity needs to be deeply characterized and adequately preserved in order to guarantee its future availability [29]. Novel cultivation methods, fortunately, continue to emerge as alternatives to overcome culture limitations [30]. These methods rely on advances in basic biological and ecological knowledge in order to best simulate the natural environment, as well as on the development of new technologies for more efficient screenings (Table 11.1). With the aid of sophisticated high-throughput cultivation techniques, the proportion of microorganisms from marine environments represented in culture has increased significantly over the last years [31].

Tab. 11.1 Limitations encountered with the culturing of marine bacteria and solutions implemented through novel culturing techniques

High-throughput dilution-to-extinction culture is one of the most powerful and sensitive approaches for the culture of marine microorganisms such as bacterioplankton. This technique led to the cultivation of the first member of the widespread but until then uncultured marine SAR11 clade [40]. The method consists of diluting bacteria down to 1–10 cells per well in microtiter plates, using low-nutrient filtered seawater. High-throughput screening based on fluorescence microscopy clearly improved the technique over conventional methods, allowing rapid and sensitive detection of growing cells [35]. In later studies, this approach was coupled to long-term incubation at low temperatures to allow the recovery of new microbial variants [39].
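The effect of the dilution level on the chance of obtaining wells founded by a single cell can be illustrated with a small calculation. The sketch below is not part of the cited protocols; it assumes that cells are distributed among wells according to a Poisson distribution and that every inoculated cell is able to grow, which is an idealization (real culturability is usually much lower). The function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well when the mean is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def well_fractions(mean_cells_per_well: float) -> dict:
    """Expected fractions of empty, single-cell and multi-cell wells.

    Assumption: cells are dispensed independently, so the number of
    cells per well follows a Poisson distribution.
    """
    empty = poisson_pmf(0, mean_cells_per_well)
    single = poisson_pmf(1, mean_cells_per_well)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Share of inoculated (non-empty) wells that started from one cell.
        "single_among_inoculated": single / (1.0 - empty),
    }


if __name__ == "__main__":
    for mean in (1, 2, 5, 10):
        f = well_fractions(mean)
        print(
            f"mean={mean:>2}  empty={f['empty']:.3f}  "
            f"single={f['single']:.3f}  multiple={f['multiple']:.3f}"
        )
```

Under these assumptions, lower dilutions (more cells per well) leave fewer empty wells but also fewer wells that start from a single cell, which is the trade-off behind choosing the inoculum size.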

The diffusion chamber [32] is a device in which microbial cells are inoculated in an agar matrix separated from the source environment by membranes, isolating the cells but allowing nutrients and growth factors to pass through. The use of this device greatly improved the proportion of culturable bacteria from marine sediments [32]. Another version of this approach is the microbial trap, which selectively enriches for filamentous bacteria (e.g., actinomycetes) by allowing the filaments to colonize the sterile agar through membranes with 0.2 μm pores [34]. Microdroplet encapsulation in an agarose matrix, combined with growth detection by flow cytometry, led to the recovery of new clades from the marine environment [36, 37]. This approach is similar to the diffusion chamber in the sense that the agarose is porous, and nutrients and signaling molecules can diffuse into the growing colony and waste metabolites can diffuse out. Another advantage of the approach is that the microdroplets are physically separated and, because they are much larger than bacterial cells, they can be manipulated [28, 36].

Currently, second-generation high-throughput automated methods are being developed from these environmental cultivation devices. One example is the development of the isolation chip (Ichip), a culture/isolation device composed of several hundreds of miniature diffusion chambers, each inoculated with a single environmental cell [50]. Another example is the micro-Petri dish, a device supported by porous material and reaching a million growth compartments [49]. An interesting method that couples high-throughput culture to rapid chemical screening was recently developed to identify symbiotic microbes producing secondary metabolites [51]. The screening is chemically based, by means of the use of ultra high-performance liquid chromatography/mass spectrometry. In this approach, a 96 multi-well plate format is utilized in rounds of successive culturing steps [51]. This strategy prevents spending significant resources on isolating, culturing, and analyzing microbes that do not possess the capability to produce the compounds of interest.

Various factors are thought to contribute to the low rate of culture recovery of environmental microbes (Table 11.1). The lack of knowledge of the environmental and nutritional requirements of yet unknown microorganisms is the most obvious. Another is the loss of biological cell-to-cell interactions in the isolation process [44]. For example, most of the strains able to grow on Petri dishes after recovery in diffusion chambers were, indeed, mixed cultures, highlighting the importance of chemical signaling for microbial growth [32]. The co-culturing with helper strains, followed by the identification of an oligopeptide signal, allowed previously uncultured strains to be successfully isolated in the laboratory [30, 48]. These authors have also reported problems in the successful adaptation to laboratory conditions or domestication of the cultured strains, as many of the strains forming microcolonies in diffusion chambers could only undergo a limited number of divisions in Petri dishes [32]. In further experiments, successive rounds of in situ cultivation in the chambers allowed for a larger recovery of isolates [30].

Advances in the understanding of basic microbiological principles can greatly help us to overcome the limitations of culturing environmental microbes. The theory of scouting of dormant cells proposes that any microbial population consists of a mixture of active and dormant cells [41, 52]. Individual cells periodically exit dormancy, although these events are not related to the onset of favorable environmental conditions, but are rather essentially random. Moreover, this happens independently of the nature of the microbial species (sporulating versus nonsporulating, or fast versus slow growers). Indeed, the importance of the slow growers in environmental samples may be lower than previously thought, as many of the microbes regarded as slow growers in culture are, in fact, late awakening events [52]. One practical implication of this theory is that the success in discovering novel species depends on the overall amount of cultivation effort rather than the length of incubation (i.e., the same amount of effort focused either on many short-term or fewer long-term cultivation experiments) [41]. This theory could explain previous findings, such as why, for example, representatives of the phylum Verrucomicrobia that had been considered unculturable for many years, were eventually cultivated using rather conventional techniques. Dilution cultures were used to separate the very fast growers, but long incubation times were not necessary [41, 53].
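The practical implication of the scout theory can be made concrete with a toy model. The sketch below is an illustration only and is not taken from the cited studies: it assumes that the awakening of a dormant cell in a culture is a random event with a constant rate per culture and per day, so that the chance of seeing at least one awakening depends only on the product of the number of cultures and the incubation time. The function name and the rate value are hypothetical.

```python
import math


def p_at_least_one_awakening(rate_per_culture_day: float,
                             n_cultures: int,
                             days: int) -> float:
    """Chance that at least one culture shows growth of the target organism.

    Assumption: awakening is a memoryless (Poisson) process with a constant
    rate, independent of environmental conditions, as the scout theory
    proposes for the exit from dormancy.
    """
    effort = n_cultures * days  # total cultivation effort in culture-days
    return 1.0 - math.exp(-rate_per_culture_day * effort)


if __name__ == "__main__":
    rate = 0.001  # hypothetical awakening rate per culture per day
    many_short = p_at_least_one_awakening(rate, n_cultures=100, days=10)
    few_long = p_at_least_one_awakening(rate, n_cultures=10, days=100)
    print(f"100 cultures x 10 days : {many_short:.3f}")
    print(f"10 cultures x 100 days : {few_long:.3f}")
```

In this simplified model both designs represent 1000 culture-days and give the same probability of success, which mirrors the statement that the total effort matters more than the length of incubation.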

3.2 Culture Independent Gene-Targeted Methods

In spite of the recent advances in microbial culturing, the majority of environmental microorganisms are still unculturable. Out of the more than 100 bacterial divisions that have been proposed to date, only 30 possess a cultivated representative [29]. Moreover, marine microbes are at the top of the list of those unculturable by conventional methods [26]. Since the landmark studies of Woese and Pace [54, 55], which laid the basis for molecular phylogeny, culture-independent methods have revolutionized our understanding of microbial communities [56] and currently stand on their own as a valid alternative for bioprospection. These methods are based on the information provided by biomolecules, mainly deoxyribonucleic acid (DNA), bypassing the need for cultivation by extracting these biomolecules directly from the environmental sample. Although they are not exempt from biases [57], they are still the best way to gain access to the overwhelming biodiversity of environmental microbes.

3.2.1 Culture-Independent Phylogenetic Approaches

Among culture-independent methods, the approach based on the molecular phylogeny of rRNA (ribosomal ribonucleic acid), particularly the small subunit (16S rRNA for archaea and bacteria), continues to be one of the most widely used. This gene has two properties that have positioned it as a building block for a universal molecular phylogenetic framework: its presence in all forms of life and a domain structure with variable evolutionary rates, which enables phylogenetic reconstruction at various levels. Fingerprinting techniques, polymerase chain reaction (PCR) clone libraries, and microscopy-based techniques like fluorescence in situ hybridization (FISH) have been routinely utilized over the last decades to describe and compare the structure and composition of microbial communities [58]. More recently, large-scale sequencing of 16S rRNA gene hypervariable regions has brought new strength to the classical phylogenetic approaches, which suffered a number of limitations associated with low coverage and cloning biases [59]. These approaches can be used as a guide for phylogenetically-driven biodiscovery. For example, analysis of community 16S rRNA by sequencing or fingerprinting can be used to select the most diverse sampling sites, thus maximizing novel taxa recovery in culture (Fig. 11.2, [23]). This approach is based upon the premise that taxonomic diversity is coupled to chemical diversity due to the role that secondary metabolism plays in speciation [23].
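As an illustration of how community 16S rRNA data could guide site selection, the following sketch ranks sampling sites by the Shannon diversity index computed from taxon counts. The chapter does not prescribe a particular diversity measure; the Shannon index is used here only as one common choice, and the site names and counts are invented for the example.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with counts > 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def rank_sites(site_counts):
    """Return site names ordered from most to least diverse."""
    return sorted(
        site_counts,
        key=lambda site: shannon_index(site_counts[site]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical 16S rRNA read counts per taxon at three sampling sites.
    sites = {
        "site_A": [25, 25, 25, 25],
        "site_B": [97, 1, 1, 1],
        "site_C": [50, 30, 15, 5],
    }
    for name in rank_sites(sites):
        print(f"{name}: H' = {shannon_index(sites[name]):.3f}")
```

A real analysis would also need to account for sequencing depth and the clustering of sequences into taxa, which this sketch leaves out.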

Fig. 11.2 Culture-independent gene-targeted approaches relevant for biotechnology. These approaches can aid in the selection of suitable sites for bioprospection and in inferring possible culturing conditions. They are also relevant in environmental biotechnology

3.2.2 Functional Gene-Based Approaches

Approaches based on functional genes, which focus on the potential of the community to perform an activity of interest, can give a complementary view to the phylogenetic approach. Genes coding for key enzymes participating in different environmental processes, such as sulfate reduction [60], denitrification [61], nitrogen fixation [62], ammonia oxidation [63], and hydrocarbon biodegradation [64, 65], among others, have been studied in the marine environment. Targets include not only bacterial but also archaeal populations and subgroups within these, by means of the use of primers with different specificities. Due to its highly focused nature, this approach is very powerful. However, one of its major drawbacks is the relative lack of database sequence information for functional genes, with respect to the 16S rRNA gene. Another shortcoming is the lack of accuracy in taxonomic assignment due to lateral gene transfer [66].

Functional genes have the potential to be used as biomarkers in assays developed for the environmental biotechnology field, including wastewater treatment [67] and environmental remediation (Fig. 11.2, [68]). These molecular biological tools have been applied in the marine environment, for example, for the study of hydrocarbon-degrading bacterial populations [64, 65]. In particular, quantitative polymerase chain reaction (qPCR) is a promising technique due to its quantitative nature, high sensitivity and the possibility of high-throughput analysis [68]. However, this tool is still in its infancy for field applications in marine environments.
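The quantitative nature of qPCR is commonly exploited through a standard curve, in which the threshold cycle (Ct) of known dilutions is regressed against the logarithm of the gene copy number. The sketch below illustrates this general idea; it is not the procedure of any study cited here, the function names are hypothetical, and the standard values are invented for the example.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Estimate the gene copy number of a sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical dilution series of a functional-gene standard.
    standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
    standard_ct = [31.4, 28.0, 24.7, 21.3, 18.1]
    slope, intercept = fit_standard_curve(standard_copies, standard_ct)
    print(f"slope={slope:.2f}  intercept={intercept:.2f}")
    print(f"efficiency={amplification_efficiency(slope):.2%}")
    print(f"sample with Ct 26.0: {copies_from_ct(26.0, slope, intercept):.0f} copies")
```

In practice, the estimate is only meaningful within the range covered by the standards and when the amplification efficiency of samples and standards is comparable.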

3.2.3 Linking Phylogeny and Function: Labeled Isotopic-Based Approaches

One of the long-standing goals of environmental microbiology is the possibility to link the phylogenetic identity of an uncultured microorganism with its function in the environment. The achievement of this goal can have far-reaching consequences for our understanding of microbial communities and ecosystems. Furthermore, it can fuel our efforts in the identification of new biotechnologically sound microbes and activities. Advances in isotope labeling have started to contribute over the last years to this goal [69]. The basic principle of this approach is that the labeling of substrates with stable or radioactive isotopes allows for the differentiation of metabolically active populations that incorporate the substrate. When this approach is coupled to some form of identification tool, the result is experimental evidence of both the functional role and the phylogenetic identity of a previously unknown population. Stable isotope probing (SIP) was one of the first methods to be developed [70]. Microbial populations utilize the offered substrates, which are labeled with a stable isotope, and assimilate the heavier isotope into cell components, which in turn become labeled. The heavier and lighter molecules are then physically separated and analyzed. The most widely used method is DNA-SIP, in which DNA is separated in caesium chloride gradients and further purified and analyzed by cloning and sequencing [70]. The RNA-based SIP approach maintains the sequence-based phylogenetic resolution of DNA-SIP, but focuses directly on the RNA molecule itself rather than its gene, with the advantage of a high copy number and a turnover that is independent of cell replication [71]. Marine environments studied by this method include marine and estuarine sediments [72, 73, 74] and seawater samples [75]. Biotechnological applications of SIP have mainly addressed issues related to environmental biotechnology [76].

SIP depends upon the availability of stable isotopes (¹³C, ¹⁵N, ¹⁸O) and of substituted substrate compounds [69]. However, it has the advantage of generating de novo information about the identity of the populations associated with a certain metabolic process. Interestingly, different incubation times can be used to follow the carbon flow in the different members of the community. The results can then be further tested experimentally, for example, by means of imaging techniques or targeted culture (Fig. 11.2).

Microscopy provides information about spatial arrangement and physical interactions of cells, which is applicable to spatially complex environments such as biofilms, consortia and symbiotic assemblages (Fig. 11.2). The development of fluorescence in situ hybridization (FISH) enabled the detection and identification of single microbial cells in environmental samples by means of rRNA-targeted gene probes [77]. Microscope-based enumeration of cells makes this method an excellent approach for quantitative estimations, which is more accurate than conventional PCR. Furthermore, the technique is suitable for the use of multiple hierarchical probes in the same sample, which reduces the possibility of false positives. This powerful method has been coupled with the microautoradiography technique (FISH-MAR), which offers the possibility to directly observe the incorporation of substrates labeled with a radioactive isotope into single microbial cells [78]. As in SIP, the main limitation of this technique is the availability of radiolabeled substrates, with the additional concern of safety issues. In addition, some environmental samples bearing cells with low ribosome content (e.g., marine oligotrophic environments) can have detection problems with FISH. Horseradish peroxidase (HRP)-labeled oligonucleotide probes and tyramide can be used to enhance the signal intensities of hybridized cells. This approach is sometimes called catalyzed reporter deposition FISH (CARD-FISH, [79]), which can also be coupled to microautoradiography, further increasing its potential [80]. Raman microspectroscopy and nanometer-scale secondary-ion mass spectrometry (nanoSIMS) [81, 82] are other techniques that are currently under development and may potentially be useful in the future for bioprospecting.

3.3 Omics and Meta-Omics Approaches

The large-scale study of genes (genomics), transcripts (transcriptomics), proteins (proteomics), metabolites (metabolomics), lipids (lipidomics), and interactions (interactomics) are globally defined as omics in the study of individual species and are often referred to as meta-omics approaches when microbial communities are analyzed [83]. These are rapidly evolving fields, which are highly dependent on the development and improvement of technologies and analysis tools. Besides their critical role for understanding the structure and function of microbial communities, they represent powerful approaches for the bioprospection of biological products and activities with biotechnological potential in marine environments.

3.3.1 Genomics

Since the publication of the genome of the bacterium Haemophilus influenzae Rd in 1995 [84], the number of sequenced genomes has expanded quickly [85]. By the end of 2012, the Genomes Online Database (GOLD, Table 11.2) listed more than 4000 completed genome projects, 90 % of them belonging to bacteria. Approximately 60 % of these genomes were finished, that is, all segments obtained after the assembly were ordered, all gaps were closed, and any ambiguities or discrepancies were resolved after a series of rigorous quality-control steps [86, 87]. As the finishing step increases the cost and time required to sequence a genome, often the final goal is to obtain a draft genome, represented by a number of contigs or scaffolds [87]. Although there are limitations for the use of draft sequences in some applications [88], draft assemblies are a powerful resource for bioprospecting, as the majority of the genes of an organism are usually represented in its draft genome [86].

Tab. 11.2 Websites of initiatives of genomic and metagenomic data generation, repository and/or analysis tools, useful for marine microbial bioprospecting

Not only does the number of genomes sequenced so far represent a minimal proportion of the microbial diversity present on our planet [89], but some phylogenetic groups of microorganisms (such as members of the Proteobacteria and Firmicutes) are also greatly over-represented, while other groups have no representative sequences [85]. This bias has a negative effect on gene discovery and annotation in both genomic and metagenomic data, as microbial genomes provide scaffolds for the interpretation of sequence information. With the aim of systematically filling in these existing gaps, the US Department of Energy’s Joint Genome Institute created the initiative Genomic Encyclopedia of Bacteria and Archaea (Table 11.2). Similarly, the Microbial Genome Sequencing Project of the Gordon and Betty Moore Foundation’s Marine Microbiology Initiative has sequenced the genomes of hundreds of ecologically relevant microorganisms isolated from diverse marine habitats (Table 11.2).

The phenotypic analysis of isolates can severely underestimate their genetic potential, as genes may not be expressed or may be expressed at very low levels under laboratory conditions. For example, it has been proposed that a unique combination of environmental factors may be required for the expression of biosynthetic genes [6]. Therefore, the mining for genes or gene clusters in microbial genomes could uncover hidden treasures that could be exploited, for example, using heterologous gene expression [90]. Moreover, the use of sequence information has assisted in the determination of the chemical structure of new compounds by a combination of bioinformatics and chemistry [91]. The potential for biodiscovery of this approach has led to the explosion of interest in genome mining as a tool for bioprospection, which has been aided by the development of new bioinformatic tools for the analysis of the growing volume of DNA sequence data [92]. However, as sequencing efforts are rarely followed by the biochemical characterization of the putative gene products, many genes emerging from these studies have unknown functions, and basic local alignment search tool (BLAST)-based protein functional assignments can easily propagate annotation errors [93]. Recently, a database of experimentally characterized proteins was created (CharProtDB, Table 11.2), making it possible to link experimental characterizations of protein functions with computationally accessible protein sequences [94].

The mining of the ever-increasing amount of genomic data from marine microorganisms has led to the bioassay-independent discovery of gene clusters with important biotechnological applications [95]. In recently published work, Wargacki and collaborators [96] used a public database to identify a genome fragment from the Vibrio splendidus strain 12B01, containing genes for alginate degradation, transport, and metabolism. This gene cluster was used to construct a microbial platform that enables bioethanol production from macroalgae via a consolidated process [96]. In silico mining of genomes combined with molecular biology approaches has resulted in the discovery of novel gene clusters with potential use for the development of peptide-based drug candidates [95]. In addition, a search of sequenced bacterial genomes showed that marine cyanobacteria present an extraordinarily efficient strategy for generating many cyclic peptide secondary metabolites [97].

3.3.2 Other Omics Approaches

Currently, only one-third of an annotated bacterial genome corresponds to information that is well known [98]. However, in order to understand how a cell operates and, therefore, to apply its genetic potential successfully, a better knowledge of the remaining two-thirds is required. Molecular approaches that can be used in combination with genome sequencing to study a marine microbial isolate include, for instance, transcriptomics, proteomics, and metabolomics [99]. Concerning transcriptomics, the development first of microarray technology and later of whole transcriptome shotgun sequencing using next-generation sequencing technologies has provided valuable insight into gene function and regulation [100]. Importantly, this information also serves as a basis for genome re-annotation. In addition to its function as information carrier, RNA can present various regulatory functions in bacteria [101]. Although this knowledge is still in its infancy, it could be highly valuable for the biotechnological application of pure cultures.

Technological advances in the field of mass spectrometry (MS) have enabled us to obtain information regarding a significant proportion of the proteome of microbial isolates [102]. Several proteomic studies have been published on strains of cyanobacteria, microorganisms that have interesting biotechnological applications [103]. For example, the analysis of the proteome of Synechocystis sp. PCC 6803 rendered evidence of the mechanisms used by this microorganism for gaining resistance against the biofuel hexane [104]. In a second study, the complex response of this model microorganism to ethanol was analyzed using a quantitative proteomics approach [105]. Ethanol sensitivity of cyanobacteria currently restricts efforts to increase biofuel production levels in metabolically engineered strains for autotrophic ethanol production, and this study provided a list of potential gene targets for engineering ethanol tolerance.

Although highly challenging due to the extremely fast turnover times of the small molecules of the cell, metabolomics is fundamental for the understanding of metabolic reaction networks and their regulation, as well as to link the genotype of an isolate to its phenotype [106]. Interactomics, on the other hand, attempts to resolve the whole set of molecular interactions in cells and fluxomics establishes dynamic changes of molecules within a cell over time [99]. No single approach is sufficient to characterize the complexity of biological systems [99]. In fact, it is only through the integration of multiple layers or dimensions of information (provided by different omics approaches) that a proper understanding of the whole cell operation can be obtained [98]. The integration of the overwhelming amount of information obtained from multiple datasets can only be accomplished with the use of mathematical modeling and computational tools and results in a dynamic map of all cellular functions and regulatory circuits with spatiotemporal resolution [107]. The advances of the field of systems biology are empowering the engineering of industrial microorganisms, allowing the development of more robust strategies and moving the field toward a design-based engineering of biological systems [108].

3.3.3 Single-Cell Analyses

The analysis of single cells is an approach with multiple biotechnological applications and presents both unprecedented challenges and opportunities [83]. Individual cells can be physically separated from each other and/or from the environmental matrix material before further analysis, through a technique called single-cell isolation [109]. In addition, targeted cells can be individually recognized and distinguished from background populations through cell-sorting techniques, although some cell-sorting instrumentation also allows cell isolation [109]. The fundamentals, advantages, and drawbacks of the different devices used for cell sorting and cell isolation were recently reviewed in detail [107, 109] and therefore they will not be covered in this chapter.

One of the applications of single-cell analysis is the study of cell-to-cell variations within an isogenic cell population, delivering functional biological information beyond the statistical average of a microbial population [107]. For example, through the use of total transcript amplification, the heterogeneity of transcript levels among individual cells within a bacterial population can now be studied [110]. A second application of single-cell analysis is to individually study yet-to-be cultured microorganisms using omics approaches. Single-cell genomics (SCG) involves the isolation of single cells from an environmental sample and the purification of their DNA, followed by whole-genome amplification and sequencing [26]. Methodological difficulties of SCG include background contamination during the amplification of single-cell DNA, biases during amplification and sequencing, as well as difficulties in sequence assembly [111, 83]. Different strategies have been tested in order to improve SCG, such as reagent decontamination before amplification [112], artificially inducing polyploidy in single cells [113], improving the efficiency of genome amplification [83], as well as using more efficient assembly algorithms [111]. In spite of its limitations, it is currently possible to obtain a high percentage of de novo genome sequences from yet-to-be cultured microorganisms.

SCG is considered a powerful complement of both cultivation and metagenomics, as it allows us to link the metabolic potential of an uncultured microorganism with its taxonomic identity, as well as to define possible strategies for the isolation of the microorganism [114]. In addition, SCG allows us to study in situ interactions among organisms and is particularly suited for the analysis of symbiotic systems, for example, the biotechnologically-relevant bacterial symbionts of marine sponges [115, 116]. In a recent work, Bayer et al. [117] identified novel enzymes involved in halogenation reactions in marine sponge-associated microbial consortia using a combination of omics approaches, including SCG. In another study, Martinez-Garcia and collaborators [118] sequenced five Verrucomicrobia cells, identifying genes encoding a wide spectrum of glycoside hydrolases, sulfatases, peptidases, carbohydrate lyases, and esterases. In addition, the analysis of partially assembled genomes of only ten cells of Prochlorococcus increased the pan-genome of this genus by 4.6 %, highlighting the potential of this approach for bioprospecting [119]. Information concerning which proteins are being expressed in a particular environmental condition, their abundance, as well as post-translational modifications could be obtained through single-cell proteomics. However, further development of MS and micro/nanofluidic based technologies is needed for the analysis of the proteome from single cells. Although still not widely used in biotechnological applications, the combination of single-cell omics approaches has the potential to significantly contribute to this field [107, 114].

3.3.4 Metagenomic Approaches

Metagenomics, the direct analysis of the genomes contained in a microbial community, nowadays represents a key tool for microbial marine bioprospecting, as it allows access to the genetic potential of a microbial community. Metagenomic analyses typically start with the purification of DNA from an environmental sample, which is called metagenomic DNA [120]. This DNA can be used for the construction of a metagenomic library [121] or alternatively, it can be randomly sequenced using next-generation sequencing technologies [122]. Due to the uneven distribution and high species richness of most microbial communities, metagenomic analyses are very challenging. Despite the difficulties still affecting metagenomics, in recent years this discipline has been fundamental in increasing our understanding of microbial communities and has become an important tool for mining novel biomolecules or activities with biotechnological potential [123, 124]. For instance, the construction and screening of metagenomic libraries have resulted in the identification of many novel biocatalysts, including lipases/esterases, cellulases, chitinases, DNA polymerases, proteases, and antibiotics [123, 125]. Marine sediments, microbial communities from marine invertebrates, and cold marine environments are among the most commonly studied habitats, due to their high biotechnological potential [126, 127, 128].

The cloning of fragments of metagenomic DNA using the appropriate vectors and suitable hosts allows us to store and mine the genetic potential contained in a microbial community [121]. The selection of the vector for library construction (plasmids, cosmids, fosmids, or bacterial artificial chromosomes, BACs) depends mainly on the desired insert length. For example, large-insert libraries are required for recovering large gene clusters [123]. Other factors to consider are the desired cell copy number, the quality of the metagenomic DNA, the genes that are being targeted, the chosen host, as well as the selected screening strategy [121, 123]. In order to reach sufficient coverage of metagenomes of highly diverse microbial communities, such as those from soils or sediments, metagenomic libraries need to contain a large number of clones [129]. To increase hit rates, enrichment cultures have been used prior to metagenomic DNA extraction, although this approach can result in an overall loss of diversity [130]. Another possible strategy is the use of stable-isotope-labeled substrates to enrich the functionally-relevant fraction of the microbial community. In this case, density centrifugation of metagenomic DNA is performed after labeling, before the construction of the metagenomic library [121, 131].
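To give a rough sense of what a large number of clones means in practice, the sketch below applies the classical Clarke and Carbon estimate, N = ln(1 − P) / ln(1 − f), where f is the insert size divided by the size of the target (meta)genome and P is the desired probability that a given sequence is represented in the library. This formula is not drawn from the works cited above; it is used here only as a common first approximation. The insert size, genome size, and number of genomes are hypothetical values, and the calculation assumes that all genomes are equally abundant, which real communities are not.

```python
import math


def clones_needed(insert_bp, target_bp, probability=0.99):
    """Clarke-Carbon estimate of the number of clones required so that any
    given sequence is present in the library with the stated probability."""
    fraction = insert_bp / target_bp  # share of the target held by one clone
    if not 0 < fraction < 1:
        raise ValueError("insert must be smaller than the target")
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical values: 40 kb fosmid inserts, 4 Mb genomes.
single_genome = clones_needed(40_000, 4_000_000)
# Hypothetical community of 1000 equally abundant genomes of that size.
community = clones_needed(40_000, 4_000_000 * 1_000)

print(f"single genome: {single_genome} clones")
print(f"1000-genome community: {community} clones")
```

Because the estimate scales almost linearly with the size of the target, a thousand-fold more complex community needs roughly a thousand-fold more clones, and rare members would need more still.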

Two different strategies can be used for the screening of a metagenomic library: a function-based approach (detection or selection for metabolic activity) or a sequence-based approach (detection of a specific target gene). The first strategy, called functional metagenomics, does not require previous knowledge of sequence information and can, therefore, result in the identification of entirely novel classes of genes [123]. It presents the additional advantage that the identified gene or gene cluster is already being functionally expressed in the host. However, function-based screenings can be problematic due to low-level gene expression, lack of post-translational modifications, the formation of insoluble aggregated folding intermediates, as well as detrimental effects that the products can have on the host cell [132, 133]. Currently, the most commonly used vector and host for constructing metagenomic libraries are fosmids and Escherichia coli, respectively [121]. The use of other hosts and the development of vectors able to replicate in various species are particularly useful for expression-based analyses of metagenomic libraries [129, 134, 135]. In addition, hosts can be engineered to improve gene expression [136].

Different function-driven approaches can be used for the screening of metagenomic libraries. One of them is the detection of the desired phenotype in agar-plate assays, for instance, an enzymatic activity or colony pigmentation [124]. Agar-plate based screenings have the advantage of not requiring expensive devices. However, they are usually labor intensive and they tend to have low hit rates due to the generation of weak signals [129]. In addition, this strategy depends on the availability of assays able to detect the desired metabolic function, of which there are, unfortunately, very few [131]. In order to increase the sensitivity of the assays, the enzymatic activity can be measured in cell lysates [124, 129]. This is usually performed using colony picking robots and microplate readers to shorten the processing time. Another strategy, called heterologous complementation, significantly simplifies functional screening by taking advantage of gene targets for which the desired phenotype is required for the survival of the host, such as genes that confer resistance to metals or antibiotics [129, 137]. On the other hand, the ability of some substrates to induce gene expression through closely located regulatory elements has been used to engineer vectors containing reporter genes [124]. Interestingly, Uchiyama and collaborators [138, 139] have created specific reporter assays based on fluorescent proteins in order to screen for enzyme-encoding genes in metagenomic libraries.

In contrast to functional metagenomics, molecular screenings involve the use of primers or probes that have been designed based on conserved regions of already-known genes or protein families to mine a metagenomic library [123]. The main advantage of this approach is that the target can be identified even if it is not being expressed by the host [128, 140]. Subcloning using the appropriate host and vector can result in the functional expression of the gene or gene cluster of interest, allowing the functional characterization and biotechnological application of the product. A critical drawback of molecular screenings is that they depend on sequence database information, which is currently biased, and as a consequence they usually result in the retrieval of variations of previously known genes. In order to maximize the discovery process in molecular screenings, sequences of genes identified in a metagenomic library by a functional approach or gene fragments retrieved by PCR from the same environment can also be used as a source of de novo, unbiased genetic information for primer design [141]. An interesting approach to mine for gene clusters involved in the biosynthesis of bioactive molecules is to perform a retrobiosynthetic analysis on the structure of these compounds in order to predict the enzymes involved in the biosynthetic pathway [128]. This information is then used to design degenerate primer sets for the retrieval of gene fragments by PCR [128].
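The notion of a degenerate primer can be made concrete with a short sketch. It uses the standard IUPAC nucleotide ambiguity codes to compute how many distinct oligonucleotides a degenerate primer represents and to locate its exact matches on a template. The primer and template sequences are hypothetical, only the forward strand is searched, and mismatches are not tolerated, so this is an illustration of the idea rather than a primer design tool.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(
        IUPAC[base] if len(IUPAC[base]) == 1 else f"[{IUPAC[base]}]"
        for base in primer
    )


def find_sites(primer, template):
    """0-based start positions of exact matches on the forward strand;
    the lookahead allows overlapping sites to be reported."""
    pattern = re.compile(f"(?=({primer_regex(primer)}))")
    return [match.start() for match in pattern.finditer(template)]


# Hypothetical primer targeting the peptide motif Gly-Tyr-Gln.
primer = "GGNTAYCAR"
# Hypothetical template fragment.
template = "ATGGGATATCAAGGCTACCAGTAA"

print(degeneracy(primer))
print(find_sites(primer, template))
```

Higher degeneracy widens the range of gene variants that can be amplified but also lowers the effective concentration of each individual oligonucleotide, which is one reason primer sets are usually designed on the most conserved regions.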

Besides the construction of metagenomic libraries, DNA isolated from environmental samples can be sequenced directly using next-generation sequencing technologies, resulting in the random generation of sequence information from the genomes contained in the microbial community [142]. This approach has been extensively used for the analysis of microbial communities from marine environments, providing unprecedented insights into their genetic potential [120]. Continuous improvements in sequencing technologies are resulting in longer read lengths, larger sequence outputs, as well as lower costs, allowing a deeper analysis of the microbial communities. As individual reads are usually assembled, genome fragments containing whole operons, and even draft genomes from uncultured bacteria, can be obtained [143, 144]. This information is critical in bioprospecting efforts, as it can be used as a basis for the recovery of the genome fragment from the same community, or alternatively, for its synthesis. The term synthetic metagenomics has been proposed to define the discovery approach that involves in silico identification of hypothetical target sequences followed by automated chemical DNA synthesis and heterologous expression [145]. This approach has recently been used to obtain de novo functional methyl halide transferases, enzymes that are useful for biofuel production, using information from the GenBank database [145]. It also allows the exploitation of existing sequence databases, which are currently underexplored and underexploited [146]. Synthetic metagenomics has the additional advantage of allowing codon optimization, which may significantly improve gene expression [145]. Sharma and collaborators [147] have developed a resource called MetaBioME with the goal of facilitating the discovery of novel commercially useful enzymes from metagenome information (Table 11.2).
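Codon optimization, mentioned above as an advantage of synthetic metagenomics, can be illustrated with a deliberately naive sketch: each codon of a candidate gene is replaced by the synonymous codon used most often in a set of host reference genes, so that the encoded protein is unchanged. This is not the procedure of the cited study. The host genes and the candidate gene are hypothetical toy sequences, and practical tools also weigh factors ignored here, such as mRNA secondary structure and restriction sites.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(sequence):
    """Split an upper-case DNA sequence into complete codons."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def translate(sequence):
    return "".join(CODON_TABLE[codon] for codon in codons(sequence))


def preferred_codons(host_genes):
    """Most frequent codon per amino acid in the host reference genes."""
    usage = Counter(c for gene in host_genes for c in codons(gene))
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or usage[codon] > usage[best[aa]]:
            best[aa] = codon
    return best


def optimize(sequence, preference):
    """Re-encode a gene using the preferred synonymous codons."""
    return "".join(preference[CODON_TABLE[codon]] for codon in codons(sequence))


# Hypothetical host reference genes and hypothetical candidate gene.
host_genes = ["ATGCTGGCGAAAGGCCTGTAA", "ATGGCGCTGAAAGGCGAATAA"]
candidate = "ATGTTAGCTAAGGGATAA"

optimized = optimize(candidate, preferred_codons(host_genes))
print(candidate, "->", optimized)
print(translate(candidate) == translate(optimized))
```

Deriving the preference table from host sequences, rather than hard-coding one, keeps the sketch independent of any particular expression host.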

3.3.5 Other Meta-Omics

Next-generation sequencing technologies can also be used to analyze the subset of genes in a microbial assemblage that is being transcribed under a particular environmental condition [148, 149]. For this approach, total RNA is extracted from the environmental sample, rRNA is removed in order to enrich for the mRNA fraction, and copy DNA (cDNA) synthesis is performed before sequencing [148]. When the cDNA yield is insufficient for analysis, an amplification step can be included. Metatranscriptome sequencing represents a powerful tool to analyze microbial communities, albeit with considerable challenges. Not only do environmental cells have low mRNA contents, but mRNA half-lives are also very short, in the range of a few minutes [150]. In addition, mRNA constitutes a very small fraction of the total RNA in bacterial cells and the enrichment of the mRNA fraction in prokaryotes is challenging [151]. Limitations in environmental sample quality and quantity, as well as low mRNA integrity and purity, can also affect metatranscriptomic analyses [149, 151]. Despite these methodological challenges, this approach has been increasingly applied in fundamental research on microbial communities from various marine habitats [152, 153, 154, 155, 156, 157].

Like shotgun sequencing metagenomics, sequence-based metatranscriptomics presents no significant bias towards known sequences, and is considered highly informative concerning ongoing ecologically relevant processes [149, 150]. Other advantages of metatranscriptomics over metagenomics are that only ecologically relevant information is retrieved and fewer resources are required for this analysis [148]. However, sequence reads are still too short for bioprospecting efforts. Metatranscriptomics represents a powerful approach for the discovery of metabolically relevant enzymes that are actively involved in particular biochemical pathways [148, 158]. Another application is the analysis of community-specific variants of functional genes; this information is highly relevant to the environmental biotechnology field and may lead to the discovery of genes or processes with biotechnological potential [148, 149]. For instance, in a recent study, a combination of omics approaches that included metatranscriptomic analysis was able to elucidate which hydrocarbon degradation pathways were actively expressed in the deep sea after an oil spill and to ascribe these pathways to particular taxa [154].

Another approach that can be used for the identification of novel enzymes is to directly analyze the proteins of a microbial community using metaproteomics. This approach consists of the extraction of the proteins from an environmental sample, followed by the separation of the proteins (or peptides) by two-dimensional polyacrylamide gel electrophoresis or liquid chromatography, and lastly MS analysis and the identification of the proteins by in silico spectral matching against sequence databases [159]. The main challenges of metaproteomics are the large complexity of protein species expressed by the members of the microbial community and the large dynamic range of protein levels [159]. However, faster and more sensitive mass spectrometers and advances in omics datasets and data handling facilitate the analysis of increasingly complex environments [160]. Most marine metaproteomic studies performed so far have focused on the analysis of planktonic microorganisms, providing clues concerning key metabolic processes such as those involved in ocean biogeochemical cycles [161, 162, 163, 164]. More recently, Kleiner and collaborators [165] used metaproteomics and metabolomics to investigate metabolic interactions in the association between a gutless marine worm and its bacterial symbionts, revealing highly efficient pathways for the uptake, recycling, and conservation of energy and carbon sources.
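The database-matching step described above can be sketched in a much simplified form. The example digests database proteins in silico with trypsin-like specificity (cleavage after K or R, except before P), computes monoisotopic peptide masses from standard residue masses, and matches an observed neutral mass within a tolerance. Real metaproteomic pipelines match fragment (MS/MS) spectra and control false discovery rates, which this sketch does not attempt. The protein sequences, the observed mass, and the 10 ppm tolerance are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and following != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def match_masses(observed, database, tolerance_ppm=10.0):
    """Return (observed mass, protein id, peptide) for every match in tolerance."""
    hits = []
    for mass in observed:
        for protein_id, sequence in database.items():
            for peptide in tryptic_digest(sequence):
                theoretical = peptide_mass(peptide)
                if abs(mass - theoretical) / theoretical * 1e6 <= tolerance_ppm:
                    hits.append((mass, protein_id, peptide))
    return hits


# Hypothetical sequence database and hypothetical observed neutral mass.
database = {"protein_A": "MAGWKTLLER", "protein_B": "MSDEKAVFPR"}
observed = [630.3721]

for hit in match_masses(observed, database):
    print(hit)
```

In a community sample the database may hold millions of peptides from many organisms, so a single mass is rarely diagnostic on its own, which is one reason fragment spectra are needed in practice.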

Molecular systems biology at the ecosystem level, also known as eco-systems biology, attempts to build models that are able to predict the behavior of a community, through the integration of omics, meta-omics, and single-cell approaches, as well as the use of mathematical models [166, 167, 168]. Although still in its infancy, this discipline has the potential to provide a comprehensive understanding of the functioning of microbial communities, and, therefore, to facilitate their management, which is a long-term goal of environmental biotechnology [168]. For example, improving the mechanistic understanding of biodegradation processes may facilitate the development of knowledge-based bioremediation strategies and the design of biosensors for the detection of pollutants [169].

4 Conclusions

Over recent years, the development of a broad array of methodologies for the analysis of environmental microorganisms has profoundly altered bioprospecting efforts, significantly increasing our access to the genetic potential contained in microbial communities. However, finding properties of interest in the prospected environments is only the first stage in a series of value-adding steps, which ends in the development of products or services with applications in human health, industry, renewable energy, etc. Importantly, strategies that are able to maximize the biotechnological potential of environmental microorganisms, for example, microbial engineering and synthetic biology, have been matching the evolution of bioprospecting tools.

As microorganisms from marine habitats are increasingly being recognized as particularly promising resources for bioprospecting, both academia and the biotechnology industry are increasing their investments in marine biotechnology research and development. This is evidenced by an increase in the number of publications on marine microbial bioprospecting, as well as by the development of new products from marine biodiversity. Furthermore, marine biotechnology has been recognized in many parts of the world as having enormous development potential, and the furthering of this discipline is considered strategic not only for meeting key societal needs but also for economic growth [170].