Introduction

Mangroves are coastal ecosystems occurring in tropical and subtropical regions around the world (Holguin et al. 2001). Brazil is among the most important countries for the preservation of mangroves, harboring 7% of the total area covered by mangrove vegetation on Earth (Ghizelini et al. 2012). This ecosystem is found in zones under tidal influence, located between the sea and the continental area (Taketani et al. 2010), and is commonly supported by sediments characterized by high salinity and high organic matter content, despite the general nutrient deficiency that results from the prevalent anoxic conditions (Maie et al. 2008; Reef et al. 2010). The presence of mangroves along the coastline is essential to the maintenance of sea level and to the protection of the coast from erosion and tsunamis, serving as a buffer system in the intertidal zone (Duarte et al. 2013; Spalding et al. 2014). Moreover, mangroves are often situated in areas of high anthropogenic influence, where they are exposed to pollutants and, notably, to oil spills (Dos Santos et al. 2011).

Microbes are known to be crucial for the maintenance of ecosystem productivity and for the conservation and recovery of impacted areas. In mangroves, microbial cells are involved in photosynthesis, nitrogen fixation, methanogenesis, phosphate solubilization, sulfate reduction and production of substances such as antibiotics and enzymes (Dos Santos et al. 2011; Varon-Lopez et al. 2014). The microbial web in such sediments constitutes the major route for nutrient cycling, acting through the decomposition of lignocellulosic compounds and the production of enzymes with pectinolytic, cellulolytic, amylolytic and proteolytic activities (Thatoi et al. 2013).

In oil-impacted mangroves, survival of the remaining plants might rely even more on the beneficial interactions with the residing microbiome (Gomes et al. 2010). Due to particular characteristics, such as short life cycle, high mutation rate, genetic organization and horizontal gene transfer, microorganisms are capable of rapidly adapting to environmental changes. This ability allows the development of metabolic pathways able to transform pollutants or assist in their removal (Cabral et al. 2016). Despite that, the phylogenetic and functional diversity of microbes in mangroves remains largely unexplored, as <5% of mangrove species have been described (Thatoi et al. 2013). Moreover, the biotechnological exploration of mangroves is still to be made, making this ecosystem a reservoir for the description of novel microorganisms and functions resulting from evolution under a particular combination of environmental drivers (Thompson et al. 2013). Microbial enzymes are of great importance in the development of industrial bioprocesses, where new scaffolds with improved and more versatile activities, as well as economically competitive enzymes, are needed (Adrio and Demain 2014). The combined use of a metagenomics-based approach and a particular and yet to be explored environment such as mangroves offers a perfect scenario for a successful bioprospecting program (Adrio and Demain 2014).

In the above-mentioned context, the specific aims of this work were: (i) to construct and validate a fosmid library based on the cloning of large DNA fragments from an oil-spilled mangrove sediment; (ii) to survey genes related to known hydrolases in the search for sequence novelty of potential biotechnological interest; (iii) to identify the putative microbial groups encoding the hydrolase genes recovered in the fosmid library and to compare the results with those obtained by direct sequencing of DNA from mangrove sediments.

Materials and methods

Statement

The Campinas State University (UNICAMP) and the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) approved and supported the development of this study.

Sampling of mangrove sediments

The mangrove site under study is located on the Brazilian coastline, in the city of Bertioga (São Paulo State) (23°53′49″S, 46°12′28″W). This mangrove was impacted by a spill of 35 million liters of oil in 1983, and the consequences of the contamination are still present. A layer of oil up to 50 cm in depth is visible at the mangrove surface, the native vegetation is still recovering, and concentrations of pollutants such as hydrocarbons and heavy metals are high (Andreote et al. 2012; Cabral et al. 2016).

During an expedition in November 2010, mangrove sediment cores were collected from the 0 to 30 cm top layer, immediately transferred to sterile 50 mL tubes and transported to the laboratory for subsequent DNA extraction. Cores were taken at three points within the mangrove, and each point was sampled in triplicate (totalling nine samples). Aliquots of approximately 50 g from each sample were pooled to compose a single composite sample, which was mixed and further used for metagenomic DNA extraction.

Total DNA extraction for whole shotgun metagenome (WSM) sequencing

For the microbial community analysis directly from the mangrove sediments, total DNA extraction of sediment samples was performed using the PowerSoil® DNA Isolation Kit (Mo Bio Laboratories, Inc., USA), following the manufacturer’s instructions. The DNA was extracted separately for each point of the mangrove. The extracted DNA from the three mangrove sediment samples was submitted to whole shotgun metagenome sequencing using one lane of the Illumina HiSeq 2000 platform, by the facility for Functional Genomics for Agricultural Studies (ESALQ/University of São Paulo, Piracicaba, Brazil).

Total DNA extraction and fosmid library construction

For the metagenomic fosmid library construction, total DNA extraction was performed using the protocol developed by Tsai and Olson (1991), with modifications made by de Vasconcellos et al. (2010). The DNA obtained was further subjected to purification for removal of humic compounds using cesium chloride, as described by Maniatis et al. (1982), except for the ethidium bromide staining step. The resulting DNA was used to construct a metagenomic library with the CopyControl™ HTP Fosmid Library Production Kit containing the pCC2FOS™ Vector (Epicentre®, USA).

Fosmid DNA extraction for sequencing

To assess the phylogenetic and functional diversity recovered within the library, the entire metagenomic library was submitted to fosmid DNA extraction for subsequent sequencing using the Illumina HiSeq 2000 and 454 GS FLX Titanium platforms. Clones were cultivated in LB medium amended with 12.5 μg/mL of chloramphenicol (overnight incubation in a rotational shaker at 37 °C and 200 rpm). After cultivation, 500 μL aliquots of each clone culture were transferred to 50 mL tubes. The tubes were centrifuged to precipitate the cells, the supernatant was discarded and the DNA extraction was conducted using the QIAGEN® Large-Construct Kit (Qiagen Sample & Assay Technologies, Germany), following the manufacturer’s instructions. Fosmid DNA of the whole library was further sequenced using one lane of the Illumina HiSeq 2000 platform, by the facility for Functional Genomics for Agricultural Studies (ESALQ/University of São Paulo, Piracicaba, Brazil; http://genfis40.esalq.usp.br/multi/), and one plate of the 454 GS FLX Titanium platform, by Macrogen (Korea; http://www.macrogen.com). Reads obtained from 454 sequencing were used for assembly into contigs and further phylogenetic analyses.

In silico data analysis

Sequences obtained from Illumina sequencing of the metagenomic fosmid library were initially trimmed for quality [reads shorter than 50 bp or of low quality (Phred score <20) were removed], and the remaining sequences were screened for pCC2FOS vector sequences and genomic DNA from the host strain E. coli EPI-300, which were excluded from downstream analysis. High-quality sequences were directly submitted to MG-RAST (Metagenomics Analysis Server, Argonne National Laboratory; http://metagenomics.anl.gov) (Glass and Meyer 2011) for annotation, under the MG-RAST ID 4561201.3 (fosmid library sequences, Library_Oil Mgv). The fosmid library was also sequenced using 454 GS FLX Titanium technology, and the resulting reads were submitted to MG-RAST for annotation and deposited under the ID 4558576.3. In addition, reads obtained from 454 sequencing were assembled into contigs for further phylogenetic analyses.
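The read-level quality filter described above can be illustrated with a short sketch. It is not the pipeline used in the study: the function names are hypothetical, and two assumptions are made for illustration, namely that "low quality" means a mean Phred score below 20 and that quality strings use the Phred+33 encoding.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filter(seq: str, quality: str,
                  min_len: int = 50, min_q: float = 20.0) -> bool:
    """Keep a read only if it is long enough and its mean quality is high enough."""
    return len(seq) >= min_len and mean_phred(quality) >= min_q


def filter_reads(records):
    """Filter an iterable of (read_id, sequence, quality) tuples.

    Vector and host (E. coli) screening is a separate step and is not shown here.
    """
    return [rec for rec in records if passes_filter(rec[1], rec[2])]
```

A per-base or sliding-window criterion would be an equally plausible reading of the Phred <20 threshold; only the mean-quality variant is shown.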

Raw sequences derived from direct sequencing of environmental DNA (mangrove sediment) were de-multiplexed (total DNA = 86 ± 14; mRNA = 98 ± 5) according to their tags, and low-quality sequences were filtered (score limit of 0.05; maximum 1 ambiguous nucleotide allowed; minimum sequence length of 100 nt) using the CLC Workbench software version 6.5.1 (CLC Bio-Qiagen, Aarhus, Denmark). Resulting sequences were then uploaded to the MG-RAST metagenomic analysis server, annotated and deposited under the following IDs: 4533991.3, 4533992.3 and 4533993.3, for Bertioga/São Paulo State mangrove sediment contaminated with oil (Oil Mgv); 4534574.3, 4534575.3 and 4534576.3, for Bertioga/São Paulo State mangrove sediment under the influence of anthropogenic action (Ant Mgv); 4534058.3, 4534060.3 and 4534815.3, for preserved mangrove located in Cananéia/São Paulo State (Prs Mgv) (three sample sites from the same location for each mangrove). In addition, two public datasets available at MG-RAST were used for comparisons, namely 4485218.3, for Bahia mangrove under the influence of anthropogenic action (Bahia_Ant Mgv) (Thompson et al. 2013), and 4485219.3, for Rio de Janeiro pristine mangrove (Rio_Prs Mgv) (Thompson et al. 2013).

Sequence-driven screening of target enzyme genes

Eight known functions of biotechnological interest were specifically targeted in this study, namely esterases, lipases, proteases, epoxide-hydrolases, cellulases, amylases, xylanases and chitinases. Genes and metabolic routes related to these functions were searched in MG-RAST. The functional annotation performed by MG-RAST was based on the hierarchical classification of sequences (based on BLASTX search), using the subsystems approach of the SEED database (Overbeek et al. 2005) (maximum e-value cutoff of 1e−5, minimum percentage of identity cutoff of 60% and minimum alignment length cutoff of 15 base pairs). In order to search for differences among the relative abundances of hydrolases, analyses using the STAMP (Statistical Analysis of Metagenomic Profiles) software were carried out comparing the datasets from São Paulo state mangroves. Significant differences were identified using Fisher’s exact test (with 0.95 confidence intervals). Storey’s (FDR) method was applied to correct for multiple comparisons, and differences in proportion with a q-value (corrected p-value) <0.05 were retained, as recommended by the software.
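The statistical comparison described above can be sketched in a few lines. This is an illustration only and does not reproduce the STAMP implementation: the function names are hypothetical, the two-sided p-value is computed by summing hypergeometric probabilities no larger than that of the observed table, and the Benjamini-Hochberg procedure is used here as a simpler stand-in for Storey’s FDR method.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be two datasets; columns could be reads assigned to one
    hydrolase class versus reads assigned to all other classes.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so that ties with the observed table are included.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (a stand-in for Storey's q-values)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, `fisher_exact_two_sided(3, 1, 1, 3)` evaluates to 34/70, about 0.486. With real metagenome counts in the millions the exact binomial coefficients become very large, so a dedicated statistics package would be the practical choice.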

Taxonomic classification of sequences related to the target enzyme genes

The sequences identified as being related to the target enzyme genes using MG-RAST were uploaded to the virtual workbench in MG-RAST. The taxonomic affiliation at phylum and family levels was performed using all annotation source databases available in MG-RAST (Annotation source: M5NR; Max. e-value cutoff: 1e−5; Min. % Identity cutoff: 60%) (Glass and Meyer 2011).
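The effect of these annotation cutoffs can be pictured as a simple filter over similarity hits. The sketch below is hypothetical: MG-RAST applies these thresholds internally, and the `Hit` record and its field names are invented for illustration rather than taken from any MG-RAST export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity hit for a read (hypothetical record layout)."""
    read_id: str
    family: str
    evalue: float
    identity: float  # percent identity, 0-100


def keep_hit(hit: Hit, max_evalue: float = 1e-5,
             min_identity: float = 60.0) -> bool:
    """Apply the e-value and identity cutoffs quoted in the text."""
    return hit.evalue <= max_evalue and hit.identity >= min_identity


def families_passing(hits):
    """Count retained hits per taxonomic family."""
    counts = {}
    for hit in hits:
        if keep_hit(hit):
            counts[hit.family] = counts.get(hit.family, 0) + 1
    return counts
```

Tightening either cutoff trades sensitivity for confidence in the assignment, which matters for short reads such as those analysed here.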

Phylogenetic analysis of hydrolase sequences

Phylogenetic affiliation was performed based on the longer sequences obtained from 454 GS FLX Titanium sequencing (Life Science, Roche) of the same fosmid library. Sequencing data were submitted to the online platform MG-RAST V3.5 (Glass and Meyer 2011) for functional and taxonomic classification against the KEGG database (Kyoto Encyclopedia of Genes and Genomes) and SEED-nr, respectively. The sequences classified as groups of hydrolases were downloaded and assembled into contigs with the help of the CAP3 tool (Huang and Madan 1999) contained in the BIOEDIT software (Hall 1999). BLASTx, from the NCBI, was used to perform a similarity search of contigs against the Reference Proteins database and to check Open Reading Frames. The nucleotide sequences were converted into amino acids using the EMBOSS Transeq platform (Rice et al. 2000). The best contigs were aligned with the most similar orthologous sequences obtained from GenBank, using the CLUSTALW software (Thompson et al. 1994). Phylogenetic trees were constructed using MEGA v5.0 software (Tamura et al. 2011) and the neighbor-joining method (Saitou and Nei 1987) as phylogenetic distance algorithm, with bootstrap support from 1000 replicate runs.
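The nucleotide-to-amino-acid conversion step can be illustrated with a minimal sketch. EMBOSS Transeq was the tool used in the study; the code below only shows the underlying operation, assuming the standard genetic code, and its function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (unambiguous bases only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq: str) -> dict:
    """Translate all three forward and three reverse frames of a contig."""
    rev = reverse_complement(seq)
    frames = {f"+{f + 1}": translate(seq, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rev, f) for f in range(3)})
    return frames
```

Translating all six frames is useful for contigs whose coding strand is unknown; the frame with the longest stretch free of stop codons is a common first guess for the open reading frame.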

Results

Construction and sequencing of fosmid library

The metagenomic fosmid library yielded 12,960 clones. After trimming, 141,295,265 reads were obtained, with an average length of 86 ± 23 bases, totaling 12,205,480 Mb. Sequences were further submitted to MG-RAST for annotation, yielding 88,497,264 annotated sequences, with an average length of 87 ± 22 bases, a GC content of 58% and a genome size of 7,729,678 Mb (Table 1).

Table 1 Properties of metagenome datasets based on MG-RAST annotation

Sequence-driven screening of target enzyme genes

The abundances of target enzyme sequences were similar in all mangrove datasets from São Paulo state (fosmid library and direct sequencing). Sequences associated with proteases were the most abundant, ranging from 48.52 to 60.82% of all hydrolase sequences. Esterase-related sequences were the second most abundant group, with percentages ranging from 22.61 to 38.02%, followed by amylases and then the remaining less abundant hydrolases (Fig. 1, Supplementary Material Tables S1, S2). Interestingly, protease sequences were more abundant in São Paulo mangroves (9.62–12.13% higher) in comparison to Bahia and Rio mangroves, whereas esterase sequences were more abundant in Bahia and Rio mangroves (11.08–15.41% higher).
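The percentages reported here are relative abundances, that is, the share of each enzyme class among all hydrolase-related reads in a dataset. A minimal sketch of that calculation follows; the function name and the read counts used in the usage note are hypothetical and are not taken from the study.

```python
def relative_abundance(counts: dict) -> dict:
    """Convert per-class read counts into percentages of the dataset total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no hydrolase reads in this dataset")
    return {name: 100.0 * n / total for name, n in counts.items()}


def abundance_difference(counts_a: dict, counts_b: dict) -> dict:
    """Percentage-point difference (dataset A minus dataset B) per class."""
    rel_a, rel_b = relative_abundance(counts_a), relative_abundance(counts_b)
    classes = sorted(set(rel_a) | set(rel_b))
    return {c: rel_a.get(c, 0.0) - rel_b.get(c, 0.0) for c in classes}
```

With invented counts such as `{"protease": 50, "esterase": 30, "amylase": 20}`, the proteases would account for 50% of the hydrolase reads. Normalizing within each dataset is what makes libraries of different sequencing depth comparable.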

Fig. 1

Target enzymes (relative abundances) from metagenomic Library_Oil Mgv sequencing and direct sequencing of mangrove sites (Oil Mgv, Ant Mgv, Prs Mgv, Bahia_Ant Mgv, Rio_Prs Mgv) annotated with MG-RAST server v3.5 (Meyer et al. 2008)

Comparison between Ant Mgv and Oil Mgv resulted in a similar profile of relative abundances of hydrolases, with particular differences in the case of proteases, which were more abundant in Oil Mgv, and amylases, more abundant in Ant Mgv (Fig. 2a). Analysis between Prs Mgv and Ant Mgv showed higher relative abundance of amylases in Ant Mgv and of esterases in Prs Mgv (Fig. 2b), and comparison between Prs Mgv and Oil Mgv revealed higher relative abundance of xylanases and esterases in pristine mangrove, while oil-impacted mangrove had higher relative abundance of proteases (Fig. 2c). STAMP analysis comparing Oil Mgv and Library_Oil Mgv showed that direct sequencing resulted in higher relative abundance of xylanases, esterases, cellulases and proteases in comparison to the fosmid library (Fig. 2d). These analyses showed an overall higher relative abundance of amylases in Ant Mgv, of proteases in Oil Mgv, of esterases in Prs Mgv and of lipases in Library_Oil Mgv.

Fig. 2

Statistical analysis of the relative abundance of target hydrolases in São Paulo State mangrove datasets, performed using STAMP v2.1.3 software. Ant Mgv versus Oil Mgv (a); Prs Mgv versus Ant Mgv (b); Prs Mgv versus Oil Mgv (c); Oil Mgv versus Library_Oil Mgv (d)

Taxonomic analyses based on hydrolase sequences

Taxonomic analysis of the hydrolase sequences based on metagenomic datasets from the fosmid library and mangrove sites showed that the target enzymes were assigned to a broad diversity of microbial families. However, the vast majority of the hydrolase reads were distributed in only a few taxa, which were different for each enzyme (Fig. 3). Amylase sequences were mostly related to Chloroflexaceae in Library_Oil Mgv, Bahia_Ant Mgv and Rio_Prs Mgv (20.50, 25.11 and 25.74%, respectively). For cellulases, sequences were mainly related to the families Pseudomonadaceae in Oil Mgv (25.0%) and Thermoanaerobacteraceae in Bahia_Ant Mgv and Rio_Prs Mgv datasets (26.67% for both datasets). The epoxide hydrolases were mainly related to the family Bradyrhizobiaceae in Library_Oil Mgv (32.30%) and to Mycobacteriaceae (29.41%) in the Rio_Prs Mgv dataset. Taxonomic assignment of lipase sequences revealed that the most prominent family in Bahia_Ant Mgv was Pseudomonadaceae (30.82%). Chitinase sequences were mostly assigned to members of the family Alteromonadaceae in Library_Oil Mgv (62.07%), to Vibrionaceae (64.81%) in Prs Mgv and to Shewanellaceae (50%) in Rio_Prs Mgv. Finally, xylanase sequences were mainly assigned to the families Bacteroidaceae in Library_Oil Mgv (24.15%) and Flavobacteriaceae in Prs Mgv (27.68%) (Fig. 3, Supplementary material Table S3).

Fig. 3

Taxonomic affiliation at family level of enzyme sequences from Library_Oil Mgv and mangrove sites (Oil Mgv, Ant Mgv, Prs Mgv, Bahia_Ant Mgv, Rio_Prs Mgv) based on metagenomic datasets, performed using workbench settings in MG-RAST server v3.5 (Meyer et al. 2008). Amylase; cellulase; epoxide hydrolase; esterase; lipase; protease; chitinase; xylanase

Phylogenetic analysis of the most abundant hydrolase sequences allowed the recovery of two contigs for proteases: Pro_1, that clustered with one protease sequence from Desulfococcus multivorans (class Deltaproteobacteria), and Pro_2, that clustered with a protease sequence from Herpetosiphon geysericola (Chloroflexia class) (Fig. 4a; Supplementary material Table S4). For esterases, four contigs were obtained for arylsulfatase sequences: Ary_1, that clustered with one esterase from Streptomyces sp. (class Actinobacteria), Ary_2 and Ary_3, that clustered with esterase sequences from Mycobacterium vulneris (class Actinobacteria), and Ary_4, that clustered with arylsulfatases from Rhodopirellula baltica and Rhodopirellula europaea (class Planctomycetes) (Fig. 4b). For amylase sequences, five contigs were obtained: contigs Amy_1 and Amy_2 were more closely related to alpha-amylases from Caulobacter sp. and Brevundimonas sp. (class Alphaproteobacteria), Amy_3 clustered with alpha-amylase sequences from Hyalangium minutum and Stigmatella aurantiaca (class Deltaproteobacteria), Amy_4 clustered with alpha-amylase sequences from Thiocapsa violascens and T. marina (class Gammaproteobacteria) and Amy_5 was shown to be distantly related to known amylase sequences and recovered as a separate branch in the phylogenetic tree, probably representing a new group (Fig. 4c). Two contigs were retrieved for lipase sequences: Lip_1 and Lip_2, which clustered with lipases from Rhodopirellula maiorica (class Planctomycetia) (Fig. 4d). Only one contig assigned as cellulase-related sequence was obtained: Cell_1, that clustered with a cellulase sequence from Actinotalea fermentans (class Actinobacteria) (Fig. 4e). Two contigs were obtained for epoxide hydrolase sequences: Epo_1, that clustered with sequences from Rhodopseudomonas palustris and Bradyrhizobium sp. (class Alphaproteobacteria), and Epo_2, recovered as a separate branch in the phylogenetic tree, representing a putative new group of enzyme (Fig. 4f). Four contigs were obtained for 1,4 beta xylanase sequences: Xyl_1 and Xyl_2, that clustered with Rhodopirellula europaea; Xyl_3, that was more closely related to xylanase sequences from Thermotoga caldifontis and Pseudothermotoga hypogea (family Thermotogaceae), and Xyl_4, which was recovered in a completely separate cluster in the phylogenetic tree (Fig. 4g).

Fig. 4

Phylogenetic analysis based on hydrolase sequences from Library_Oil Mgv and mangrove sites (Oil Mgv, Ant Mgv, Prs Mgv, Bahia_Ant Mgv, Rio_Prs Mgv) based on metagenomic datasets, performed using workbench settings in MG-RAST server v3.5 (Meyer et al. 2008). Protease (a); esterases (b); amylase (c); lipase (d); cellulase (e); epoxide hydrolase (f); xylanase (g)

Most abundant protein families in oil-impacted mangrove datasets (Library_Oil Mgv and Oil Mgv)

The protein classes of proteases, esterases and amylases were the most abundant in the oil-impacted mangrove datasets (Library_Oil Mgv and Oil Mgv) (Table 2). In the case of proteases, 20 protein families were found and the most abundant were: ATP-dependent protease La (EC:3.4.21.53) (14.3 and 15.17%, respectively); Leucyl aminopeptidase (EC:3.4.11.1) (3.58 and 3.32%, respectively); Oligopeptidase A (EC:3.4.24.70) (2.72 and 3.41%, respectively); Carboxyl-terminal protease (EC:3.4.21.102) (2.43 and 2.96%, respectively); and Prolyl oligopeptidase (EC:3.4.21.26) (2.45 and 1.78%, respectively). A total of 20 esterase families were found in Library_Oil Mgv and Oil Mgv, and the most abundant were Arylsulfatase (EC:3.1.6.1) (7.93 and 2.69%, respectively); Holliday junction DNA helicase RuvB (EC:3.1.22.4) (3.23 and 4.17%, respectively) and Ribonuclease R (EC:3.1.13.1) (2.25 and 2.97%, respectively). For alpha-amylases, 17 protein families were found and the most abundant ones were Glycogen debranching enzyme/alpha-amylase (EC:3.2.1.1) (38.99 and 36.81%, respectively); Alpha-amylase (EC:3.2.1.1) (32.51 and 37.5%, respectively); Periplasmic alpha-amylase (EC:3.2.1.1) (5.44 and 2.08%, respectively) and Cytoplasmic alpha-amylase (EC:3.2.1.1) (3.76 and 4.17%, respectively). Glucoamylase enzymes were distributed in 10 families, the predominant ones being glucan 1,4-alpha-glucosidase (EC:3.2.1.3) (79.92 and 83.68%, respectively) and glycoside hydrolase 15-related (EC:3.2.1.3) (12.13 and 6.69%, respectively).

Table 2 Most abundant protein families in oil-impacted mangrove datasets (Library_Oil Mgv and Oil Mgv)

Discussion

In this study, diverse hydrolase sequences, encompassing esterases, amylases, proteases, epoxide hydrolases, lipases, chitinases, xylanases and cellulases, assigned to a broad range of microbial taxa, were recovered in the fosmid library. However, comparison with metagenomes from other mangroves showed the same overall distribution pattern, with esterases, proteases and amylases as the most abundant hydrolases in all datasets.

Taxonomic assignment of the reads related to the target enzyme sequences showed that Bradyrhizobiaceae, Pseudomonadaceae, Flavobacteriaceae, Alteromonadaceae, Vibrionaceae and Mycobacteriaceae were the most abundant families harboring such genes, and the majority of these taxa have already been described in oil-impacted sediments. Members of Bradyrhizobiaceae were dominant in the near-surface sediment in a landfill leachate-impacted groundwater environment (Yu et al. 2010). Pseudomonadales was one of the most abundant orders in bacterial communities in sediments of urban mangrove forests and is often involved in aerobic or anaerobic degradation of hydrocarbon pollutants (Gomes et al. 2008). Sequences related to PAH-degrading Flavobacteriaceae were found in mangrove sediments of China (HuiJie et al. 2011) and members of the Mycobacteriaceae were dominant among bacteria able to mineralize the PAHs phenanthrene, fluoranthene and pyrene (Willumsen et al. 2001).

Phylogenetic reconstruction based on the assembled contigs suggested that most of the protein sequences recovered are related to already known sequences. Five contigs showed interesting similarities to the genus Rhodopirellula, which is a member of the bacterial phylum Planctomycetes: Ary_4 (esterase), Lip_1 and Lip_2 (lipases), and Xyl_1 and Xyl_2 (xylanases). This phylum is abundant in marine environments and Rhodopirellula species have been detected in sediment samples in previous studies (Lage and Bondoso 2012). Planctomycetes members, as well as eukaryotes, have the particular capacity to synthesize and metabolize steroids, with the aid of esterases (Fuerst and Sagulenko 2011). These organisms are difficult to isolate and represent a set of bacteria poorly explored and with yet unknown potential (Lage and Bondoso 2012). Members of the family Planctomycetaceae, including Rhodopirellula, have been reported as potential xylanase producers, as they have in their genomes the domains responsible for the cellulose-degrading activity. These results were all obtained through in silico analysis, from either isolates (Naumoff et al. 2014) or metagenomes (Thompson et al. 2013), and few data are available on xylanase production by Planctomycetes within mangrove environments.

Protease-related contigs were assigned to Deltaproteobacteria and Chloroflexi classes. Proteolytic enzymes perform degradation and synthesis functions and are physiologically necessary for living organisms (Rao et al. 1998), being found in a large variety of taxa. However, there is no previous report on the production of ATP-dependent protease La by Desulfococcus multivorans and Herpetosiphon geysericola species, and further studies with these species may reveal proteases of unknown and untapped potential.

Proteases were the most abundant hydrolases in the oil-impacted mangrove datasets (Library_Oil Mgv and Oil Mgv) and interesting protease families were recovered. Due to the stressful conditions of the oil-impacted mangrove, ATP-dependent protease is expected to occur in high quantities, as its function is to prevent the accumulation of abnormal and denatured protein aggregates inside the cell (Sadeghi et al. 2014). Leucyl aminopeptidases cleave N-terminal residues from proteins and peptides, with broader specificity, and are considered cell maintenance enzymes (Matsui et al. 2006). Oligopeptidase A cleaves the peptides generated by the activity of the three primary ATP-dependent proteases from E. coli (Jain and Chan 2007), and since the fosmid library was constructed using E. coli as host, the presence of this protease family was expected in the Library_Oil Mgv dataset. Carboxyl-terminal protease cleaves bonds at the C-terminal end of polypeptides, and is involved in post-translational protein processing, maturation or degradation in prokaryotic cells (Li et al. 2012). Prolyl oligopeptidases are highly selective enzymes, as their activity is restricted to substrates of up to 30 amino acids. They are widely distributed in all domains of life and their exact physiological role, genomic distribution and evolutionary characteristics in bacteria are still unknown (Kaushik and Sowdhamini 2014). The high abundance of protease sequences and protein families found in the Library_Oil Mgv and Oil Mgv datasets is in accordance with the functions of these enzymes in microbial cells, and may reveal the high capacity of the microbial community to respond to the stress factors imposed by oil spill contamination.

Three esterase-related contigs were affiliated to the Actinobacteria phylum, which is commonly related to esterase production, with several species capable of producing lipolytic enzymes in marine and mangrove environments (Sathya and Ushadevi 2014). Among esterases, arylsulfatases were abundant in oil-impacted mangroves, probably due to the presence of oil, as these enzymes participate in the metabolism of sulphuric acid esters (Sahoo and Dhal 2009), an organosulphate highly present in petroleum (Kilbane 2006). Holliday junction DNA helicase RuvB is an enzyme involved in the formation of the Holliday junction structure, which is necessary for double-strand break repair, thus allowing correct replication and chromosomal segregation (Cañas et al. 2014). As this process is common to cells from all domains of life, occurrence of this protein family is expected. Concerning the high abundance of Ribonuclease R, according to Matos et al. (2014), levels of Ribonuclease R increase in the cell under stress conditions, suggesting a role for the protein under these conditions. The high incidence of arylsulfatase and Ribonuclease R in oil-impacted mangroves indicates an adaptation of the microbiota to the prevailing environmental conditions, both for sulfur metabolism and for stress response.

Contigs related to amylase coding sequences belonged to bacterial classes commonly found in mangroves; however, the affiliated species have not been characterized as amylase producers. Amy_5 showed 54% sequence similarity with Cesiribacter andamanensis, suggesting a potential new amylase sequence, not previously described or available in databases. The high amount of organic matter within mangroves could explain the occurrence of alpha-amylases. Glycogen debranching enzyme/alpha-amylase breaks polymers of glucose (Zmasek and Godzik 2014). Alpha-amylases break down long-chain carbohydrates by acting at random locations along the starch chain (Tiwari et al. 2015). Periplasmic and cytoplasmic alpha-amylases have been related to the E. coli maltose utilization system, probably when other sugars readily used as energy source are not available (Schneider et al. 1992; Kim et al. 2006). The presence of these enzymes is likely crucial for nutrient cycling and for recovery from natural or anthropogenic disturbances within mangroves.

The cellulase-related contig Cell_1 was affiliated to Actinotalea fermentans, a Cellulomonadaceae member already described as a cellulose-degrading bacterium. Members of this group are commonly found in environments other than mangrove sediments (Rainey et al. 1995), and this finding suggests not only the ability of this species to adapt to different environments, but also potential cellulase activity in mangroves.

One epoxide hydrolase-related contig, Epo_1, was affiliated to Alphaproteobacteria (family Bradyrhizobiaceae), a group already found within this dataset as an epoxide hydrolase producer (Jiménez et al. 2015). Contig Epo_2 showed low amino acid identity (54%) with Rhodopseudomonas palustris, indicating that this enzyme might be a potential novel epoxide hydrolase.

Apart from the Planctomycetes-related sequences, two xylanase sequences were affiliated to the Thermotogae and Firmicutes phyla. Contig Xyl_3 showed 53% sequence similarity to a xylanase from Thermotoga caldifontis, a species not previously reported as a xylanase producer. On the other hand, Thermotoga maritima and other halophilic species have shown the ability to degrade cellulose (Dalmaso et al. 2015). Xyl_4 showed only 35% amino acid identity to Marinococcus halotolerans. The low sequence similarity displayed by Xyl_3 and Xyl_4 may suggest they represent potentially new xylanases yet to be described.

Combined results gathered in this work showed that abundant hydrolases of distinct classes were found in this particular mangrove environment through metagenomics, including new sequences with potentially unknown properties of biotechnological interest. Metagenomic studies over the years have resulted in novel classes of hydrolytic enzymes, highlighting the relevance of the cultivation-independent approach (López-López et al. 2014). Further biochemical studies for enzymatic characterization are needed to unveil their novelty and confirm their application potential.

Results brought together in this work provided an in-depth insight into the microbial diversity of hydrolases in Brazilian mangrove sediments. In addition to the direct sequencing of the sediment samples, the sequencing of the metagenomic library constructed from sediments of an oil-impacted mangrove allowed a comprehensive scanning of the taxonomic diversity and functional potential recovered from the environment. Taxonomic affiliation of contigs unraveled a significant abundance of hydrolase sequences related to the genus Rhodopirellula, which belongs to the phylum Planctomycetes, suggesting a key role of such microorganisms in mangrove nutrient cycling and making them a promising target for further bioprospecting of the Library_Oil Mgv dataset. Finally, phylogenetic analysis revealed three contigs showing no relatedness with known sequences, potentially representing new sequences not previously deposited in databases, including one alpha-amylase (Amy_5), one epoxide hydrolase (Epo_2), and one xylanase (Xyl_4). These data demonstrate the biotechnological potential of Library_Oil Mgv in the search for enzyme sequences for further heterologous expression studies aimed at their industrial application in diverse processes.