Introduction

Deep subsurface petroleum reservoirs were long considered hostile environments, unfavorable to life and growth (Augustinovic et al. 2012). However, recent examinations of deep subsurface oil reservoirs have shown that microbial life can colonize these extreme habitats, which are usually characterized by high temperature and pressure and by high concentrations of salt, heavy metals, and organic solvents (Youssef et al. 2009). Consequently, microorganisms inhabiting such niches are considered poly-extremophiles that are adapted to harsh conditions and presumably harbor specialized biochemical capabilities of relevance for scientific and industrial applications (Wentzel et al. 2013).

The study of microbial communities in petroleum reservoirs has been motivated mainly by the increasing global demand for oil and by the understanding that microbial activities in oil reservoirs can have significant implications for oil quality and recovery. Some of the effects can be negative, such as degradation of petroleum hydrocarbons, reservoir souring, and corrosion of oil field equipment, whereas others may be beneficial, such as the application of microbial technologies for enhanced oil recovery (Roling 2003; Head et al. 2003). Therefore, in the last few decades, culture-dependent and culture-independent approaches have been widely used to describe populations of microorganisms inhabiting oil reservoirs. The large inventory of microorganisms detected in oil reservoirs has been reviewed elsewhere (Magot et al. 2000; Youssef et al. 2009) and includes several physiological groups, such as fermentative, sulfate-, sulfur-, iron-, and manganese-reducing microorganisms, acetogens, and methanogens, growing at either mesophilic or thermophilic temperatures.

Physicochemical conditions may vary greatly across petroleum reservoirs and are thought to influence microbial community composition. Temperature has been considered the most important factor controlling microbial growth in oil reservoirs, and the presence of indigenous bacteria has been suggested to be limited by a threshold temperature between 80 and 90 °C (Magot et al. 2000). Salinity, pH, and nutrient availability can also play important roles in determining the types of bacteria and metabolic activities in oil field environments (Magot et al. 2000; Youssef et al. 2009), as they are known to do in other environments. Therefore, environmental variation is expected to promote substantial differences in microbial communities among oil reservoirs. However, most studies on the microbial diversity of oil reservoirs have explored one or, at best, three oil wells in the same region, and only a few comparisons of microbial communities among reservoirs have been made (Li et al. 2012; Lewin et al. 2014). Consequently, there is a gap in our knowledge of how microbial communities differ among reservoirs and whether such variation correlates with environmental conditions or, alternatively, with geographical distance. The present work aimed to characterize the microbiota in Brazilian oil reservoir samples using 16S rRNA gene libraries and then to compare the relative abundances of microbial taxa with those observed in several other oil reservoirs worldwide. The hypotheses that guided this work were: (i) environmental parameters influence the microbial community structure in deep subsurface petroleum reservoirs, i.e., reservoirs with similar characteristics harbor more similar communities; (ii) reservoirs located closer together harbor more similar communities, i.e., geographic distance plays a deterministic role in shaping microbial communities in oil reservoirs; (iii) there is a core microbiome underlying ecosystem functioning in oil reservoirs.

Materials and methods

Sampling and geological settings

Samples were collected in December 2012 with logistical support from PETROBRAS R&D and the PETROBRAS operational units of the Recôncavo and Campos basins. Samples of oil–water fluids were obtained directly from the wellheads of two onshore production wells located in the Recôncavo Basin in the Northeast of Brazil (labeled BA-1 and BA-2) and one offshore exploratory well, located above a thick salt layer (post-salt) in the Campos Basin in the Southeast of Brazil (labeled ATL). Depths and temperatures of the wells and other geochemical characteristics are described in Table 1. None of the wells had undergone water flooding. Samples were collected in sterile Schott bottles filled completely with the oil–water mixture and transported at ambient temperature to the laboratory at CPQBA/UNICAMP (Campinas, Brazil) for further processing.

Table 1 Geochemical characteristics of the studied oil samples from the Recôncavo and Campos Basins

Hydrocarbon analysis was carried out for the crude oil samples. The hydrocarbon, aromatic, resin, and asphaltene fractions were obtained by silica gel column chromatography using hexane, hexane/toluene (1:1), and toluene/methanol (1:1) as eluents, respectively. Gas chromatographic (GC) analysis of the saturated hydrocarbons was carried out on the hexane fraction. GC was performed on an Agilent 7890A gas chromatograph connected to an Agilent 5975C-MSD mass detector and fitted with an HP5-MS coated capillary column (60 m length, 0.25 mm internal diameter, 0.25 μm film thickness). Conditions for analysis were as follows: split ratio 10:1, He as carrier gas at a flow of 1 mL min−1, and data acquisition in both SCAN and selective ion monitoring (SIM) modes. The temperature program was 50 °C for 5 min, then 4 °C min−1 to 300 °C, held for 10 min, with the injector temperature set to 310 °C.
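As a quick check of the oven program stated above, the total GC run time per injection can be worked out from the initial hold, the ramp, and the final hold. The sketch below is illustrative only: the function name is hypothetical, and it assumes that "300 °C for 10 min" denotes a 10 min hold at the final temperature.

```python
def oven_program_minutes(start_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total run time (min) of a single-ramp GC oven program."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    # Time spent heating from the start to the final temperature.
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the temperature program described in the text.
total = oven_program_minutes(start_c=50, initial_hold_min=5,
                             ramp_c_per_min=4, final_c=300,
                             final_hold_min=10)
print(total)  # 77.5
```

Under that reading, the 250 °C ramp takes 62.5 min, so each run lasts 77.5 min.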

DNA extraction

Aliquots (25 mL) of crude oil samples were first treated with 2,2,4-trimethylpentane (iso-octane) (Vetec), according to the method described by Yoshida et al. (2005). After centrifugation, resultant pellets were suspended in 10 mL of iso-octane and used directly for DNA extraction using PowerMax® Soil and PowerSoil™ DNA isolation kits (MO BIO Laboratories, Inc). In the case of the PowerMax Soil DNA isolation kit, final volumes were concentrated in a Speed-Vac concentrator (Savant, SPD 121P Thermo Scientific). For each sample, replicate extracts (5 tubes) from the PowerSoil DNA isolation kit were pooled together with DNA eluted from PowerMax kit and used for subsequent PCR amplification.

Construction of 16S rRNA gene clone libraries

Bacterial and archaeal 16S rRNA gene fragments were PCR-amplified from the total community DNA for each oil sample. The primer pairs 10F (5′-GAGTTTGATCCTGGCTCAG-3′)—1100R (5′-AGGGTTGCGCTCGTTG-3′) and 344F (5′-ACGGGGYGCAGCAGGCGCGA-3′)—915R (5′-GTGCTCCCCCGCCAATTCCT-3′) were used for the bacterial (Lane 1991) and archaeal (Casamayor et al. 2002) domains, respectively. The final volume of the reaction mixture (25 µL) contained 5 µL of total DNA, 4 U of Taq DNA polymerase (Invitrogen), 0.2 mM dNTP mix (GE Healthcare), 0.5 µM of each primer, 1× Taq Buffer (Invitrogen), and 1.5 mM MgCl2 (Invitrogen). Bacterial PCR amplifications were performed using 2 min of initial denaturation at 95 °C, followed by 30 cycles of 1 min at 94 °C, 1 min at 55 °C and 3 min at 72 °C and a final extension step of 72 °C for 3 min. Archaeal PCR reactions were performed as follows: after 2 min of an initial denaturation step at 94 °C, nucleic acids were amplified for ten cycles of 30 s at 94 °C, 30 s at 65 °C, 2 min at 72 °C, followed by 25 cycles of 30 s at 94 °C, 30 s at 60 °C and 2 min at 72 °C, and a final extension for 2 min at 72 °C. PCR reactions were performed in duplicate using serially diluted total community DNA (1:5, 1:10, 1:20) of the three different samples. PCR products from different DNA dilutions in each sample were pooled to minimize PCR bias (Polz and Cavanaugh 1998) and purified using Illustra GFX PCR DNA and Gel Band Purification kit (GE Healthcare), according to the manufacturer’s protocol. The purified amplicons were cloned using pGEM®-T Easy Vector (Promega) and transformed into Escherichia coli JM109 competent cells (Promega) by the heat-shock method, according to the manufacturer’s instructions.

Sequencing and phylogenetic analyses

The 16S rDNA inserts were PCR-amplified from the library of clones with the M13 forward (5′-CGCCAGGGTTTTCCCAGTCACGAC-3′) and reverse (5′-TTTCACACAGGAAACAGCTATGAC-3′) plasmid-specific primers. PCR reactions were performed in a 50-µL reaction volume, containing a 1:100 dilution of an overnight clone culture, 0.5 µM of each primer, 0.2 mM dNTP mix, 2 U Taq DNA polymerase (Invitrogen), 1× Taq buffer, and 1.5 mM MgCl2. The amplification program for the M13F/M13R primer set consisted of an initial denaturation step at 94 °C for 3 min, followed by 30 cycles of 94 °C/20 s, 60 °C/20 s and 72 °C/90 s and a final extension of 3 min at 72 °C. PCR products were purified using the PureLink® Pro 96 PCR Purification Kit (Invitrogen) and sequenced on an ABI3500XL sequencer (Applied Biosystems) using the primer pair 10F/1100R for bacterial and the reverse primer 915R for archaeal 16S rRNA clones.

Bacterial partial 16S rRNA sequences obtained from clones were assembled into contigs using the phredPhrap/CONSED package (Ewing et al. 1998; Gordon et al. 1998). Good-quality sequences were checked for putative chimeras using the ChimeraSlayer algorithm, and non-chimeric sequences were grouped into operational taxonomic units (OTUs) at a cutoff of 97% similarity using MOTHUR (Schloss et al. 2009). Rarefaction curves and OTU-based estimates of microbial richness and diversity indices were calculated in MOTHUR. Phylogenetic assignment of the OTUs was determined using the classifier tool of Ribosomal Database Project (RDP) II (Wang et al. 2007) with a confidence threshold of 80%. RDP Library Compare was used to estimate statistical differences in bacterial and archaeal composition between the libraries (P < 0.01). 16S rRNA gene sequences representing each OTU and their closest relatives were selected and automatically aligned to the SILVA reference alignment using the SINA Webaligner tool (Pruesse et al. 2012). The alignment was manually refined in BioEdit, and the closest type strains in RDP were also added to the alignment. Phylogenetic trees were constructed using MEGA software v6.06 (Tamura et al. 2013) with the neighbor-joining method and Kimura’s two-parameter model (Kimura 1980), with bootstrap values calculated from 1000 replicate runs. The nucleotide sequence data reported are available in the GenBank database under the accession numbers KX348396–KX348455.
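To illustrate what a rarefaction curve represents, the sketch below computes the expected number of OTUs in a random subsample of n clones with the analytical (hypergeometric) rarefaction formula. It is not the MOTHUR implementation, which may rely on random resampling instead; the function name and the OTU counts are hypothetical and are not data from this study.

```python
from math import comb


def rarefied_richness(otu_counts, n):
    """Expected number of OTUs observed in a random subsample of n clones."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the library size")
    denom = comb(total, n)
    # For each OTU: 1 minus the probability that none of its clones is drawn.
    return sum(1 - comb(total - c, n) / denom for c in otu_counts)


# Hypothetical library: 10 clones distributed among 4 OTUs.
toy_counts = [5, 3, 1, 1]
curve = [rarefied_richness(toy_counts, n) for n in range(len(toy_counts) + 7)]
print(rarefied_richness(toy_counts, 10))  # 4.0
```

A curve that flattens before the full library size is reached indicates that further sequencing would add few new OTUs.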

Meta-analysis of published data from oil reservoir microbial communities

In an effort to investigate microbial communities in petroleum reservoirs at a global scale, several studies of microbial communities from oil and production water fluids worldwide were reviewed. For the purpose of this work, only studies based on 16S rRNA gene sequences from clone libraries derived from DNA isolated directly from the reservoir fluid were considered. Data from high-throughput sequencing, laboratory enrichments, or isolates were therefore excluded, so that neither the scarcity of deep next-generation sequencing studies of oil reservoir samples nor the selective enrichment of some strains through cultivation would bias the analysis. Consequently, this study covers the microbial composition obtained with 16S rRNA clone libraries here and in previous studies, bearing in mind the limitations of the technique when compared with newer deep sequencing approaches. Bacterial and archaeal 16S rRNA gene sequences were downloaded from GenBank, encompassing ten different studies located in California (Orphan et al. 2000), Alaska (Pham et al. 2009), Canada (Grabowski et al. 2005; Hubert et al. 2012), Japan (Kobayashi et al. 2012a), China (Li et al. 2007), North Sea (Dahle et al. 2008; Kaster et al. 2009) and Brazil (Silva et al. 2013 and this study). Sequences of each dataset were subjected to multiple sequence alignment (CLUSTALW) and then clustered into OTUs using MOTHUR. Microbial classification of the OTUs was determined with the RDP Classifier tool (Wang et al. 2007) using the same parameters described above. Data from Kobayashi et al. (2012a) were separated into two different libraries based on the origin of the clone libraries (oil or water clone libraries). Comparative analyses were undertaken using pooled data from both the Archaeal and Bacterial domains using Primer-E version 6.1.5 (Clarke and Gorley 2006).
A hierarchical cluster analysis of the relative abundance of bacterial and archaeal members at the order level was performed based on data from this study and the studies detailed above. A dendrogram was constructed in Primer-E with a complete linkage clustering algorithm applied to Euclidean distances calculated on log-transformed abundance data.
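The clustering procedure described above can be sketched in a few lines. This is a minimal illustration rather than the Primer-E routine: the function name, the sample labels, and the abundance values are hypothetical, and the log(x + 1) form of the transformation is an assumption.

```python
from math import dist, log1p


def complete_linkage(profiles):
    """Agglomerative clustering of samples.

    Returns the merges in order as (members_a, members_b, height).
    """
    # Assumed transform: log(x + 1) applied to each relative abundance.
    points = {name: [log1p(v) for v in values]
              for name, values in profiles.items()}
    clusters = [(name,) for name in sorted(points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                height = max(dist(points[a], points[b])
                             for a in clusters[i] for b in clusters[j])
                if best is None or height < best[0]:
                    best = (height, i, j)
        height, i, j = best
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical relative abundances (%) of three orders in four samples.
toy_profiles = {
    "A": [80, 10, 10],
    "B": [78, 12, 10],
    "C": [5, 60, 35],
    "D": [6, 55, 39],
}
merges = complete_linkage(toy_profiles)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

In this toy data set, samples A and B merge first, then C and D, and the two pairs join last, which is the branching order a dendrogram would display.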

Results

Hydrocarbon analysis of crude oil samples

The BA-1 sample showed evidence of oil degradation, whereas samples BA-2 and ATL corresponded to non-degraded oils. The BA-1 sample had a prominent unresolved complex mixture (UCM), a relatively low proportion of n-alkanes in relation to iso- and cycloalkanes, and C30 17α,21β(H)-hopane and C29 17α(H)-25-norhopane were present (Fig. 1a). In addition, saturated hydrocarbons in BA-1 were less abundant and the API gravity was substantially lower than in sample BA-2 (Table 1). Such characteristics can be related to a biodegradation level between 4 and 5 on the Peters & Moldowan biodegradation scale (Peters and Moldowan 1993). The sample BA-2 showed no evidence of UCM, a high percentage of n-alkanes with a typical smooth profile in the distribution, and a relatively high API gravity (Fig. 1b; Table 1). These geochemical characteristics are distinctive features of preserved Recôncavo oils (Gaglianone and Trindade 1988). The sample ATL also showed no evidence of biodegradation, as indicated by the flat baseline in the GC trace and the presence of a wide range of n-alkanes (Fig. 1c). Although the Recôncavo and Campos oil samples used in this study are of lacustrine origin, each one has geochemical differences inherited from specific organic matter preserved under dissimilar conditions: freshwater lakes in the Recôncavo Basin, and saline to hypersaline alkaline lakes in the Campos Basin (Mello et al. 1988; Mello and Maxwell 1990; Trindade et al. 1995). Such source diversity in both basins can explain the differences in the abundance of saturated hydrocarbons and in the API gravity found in preserved oils from the Campos and Recôncavo basins (de Aguiar and Penteado 2005). The substantial difference between the δ13C of Recôncavo and Campos oils corroborates the dissimilarity of organic matter types (source and paleoenvironmental variations) in both basins (Fig. 1; Table 1). For more details on the petroleum systems that host the studied oils, see Mello et al. (1994) and Figueiredo et al. (1994).

Fig. 1

Total ion chromatograms (TIC) of hydrocarbon fractions from oil samples BA-1 (a) and BA-2 (b), collected from two wells in the same field within the Recôncavo Basin, and from oil sample ATL (c), collected from a well in the offshore Campos Basin. Ho C30 17α,21β(H)-hopane; 25-NH C29 17α(H)-25-norhopane; Pr pristane; Ph phytane

Microbial diversity in Brazilian oil samples

The microbial community composition of the biodegraded (BA-1) and non-biodegraded (BA-2 and ATL) oil samples was examined by the analysis of bacterial and archaeal 16S rRNA gene libraries. In total, six 16S rRNA clone libraries were constructed from the isolated DNA of the three petroleum samples: three archaeal clone libraries containing 132, 110, and 94 high quality 16S rRNA gene sequences (approx 300 bp) for the BA-1, BA-2, and ATL samples, respectively, and three bacterial clone libraries containing 116, 112, and 53 high quality sequences (approx 600 bp) for the BA-1, BA-2, and ATL samples, respectively (Table S1). Diversity analysis showed that 4–9 OTUs were identified in the archaeal libraries and 10–21 OTUs in the bacterial libraries (Table S1). Coverage values ranged between 92 and 99%, indicating that most of the prokaryotic diversity present in the samples was detected. Rarefaction curves of most libraries tended to approach the saturation plateau, except for the ATL and BA-1 bacterial libraries for which additional clone sequencing effort is needed to cover total diversity (Supplementary figure S1).
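The coverage values reported above can be related to the number of singleton OTUs in each library. The sketch below assumes that coverage refers to Good's estimator, C = 1 − n1/N, where n1 is the number of OTUs represented by a single clone and N is the total number of clones; the function name and the OTU counts are hypothetical and do not reproduce the libraries of this study.

```python
def goods_coverage(otu_counts):
    """Good's coverage: 1 - (singleton OTUs / total clones)."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("library contains no clones")
    singletons = sum(1 for count in otu_counts if count == 1)
    return 1 - singletons / total


# Hypothetical library: 100 clones in 7 OTUs, two of them singletons.
toy_library = [40, 30, 20, 5, 3, 1, 1]
print(round(goods_coverage(toy_library), 2))  # 0.98
```

A value of 0.98 means that roughly 2 of every 100 additional clones would be expected to belong to an OTU not yet observed.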

Total diversity in the three bacterial libraries encompassed 42 OTUs covering 10 different phyla: Thermotogae, Firmicutes, Deferribacteres, Proteobacteria, Synergistetes, Bacteroidetes, Actinobacteria, Chloroflexi, Chlorobi, and TM7, in addition to several phylotypes of unknown affiliation (Fig. 2). Archaeal clones from the libraries represented in total, 21 OTUs distributed among four taxonomic classes within the phylum Euryarchaeota (Fig. 3).

Fig. 2

Relative abundance of (a) bacterial phyla and (b) bacterial classes detected in 16S rRNA clone libraries from Brazilian oil samples

Fig. 3

Relative abundance of archaeal classes detected in 16S rRNA clone libraries from Brazilian oil samples

Phylogenetic analysis

All bacterial (42) and archaeal (21) representative phylotypes obtained from the libraries and their closest related sequences were used for phylogenetic analyses (Figs. 4, 5). Most of the closest relatives found in the databases corresponded to isolates or environmental clone sequences from oil-associated environments. Bacterial clades with a larger number of clone sequences were divided into three distinct trees, Proteobacteria (Fig. 4a), Firmicutes/Actinobacteria (Fig. 4b), and the remaining phyla (Fig. 5).

Fig. 4

Phylogenetic analysis based on partial 16S rRNA sequences from the BA-1, BA-2 and ATL libraries of organisms within the Proteobacteria lineage (a) and the Firmicutes and Actinobacteria phyla (b). The number of clones representing each OTU/number of total clones in the BA-1 (brown), BA-2 (green) and ATL (blue) libraries, respectively, is shown in brackets. Bootstrap values (n = 1000 replicate runs, shown as %) greater than 70% are listed. GenBank accession numbers are listed after species names in parentheses. Geoglobus acetivorans strain SBH6 was used as outgroup

Fig. 5

Phylogenetic analysis based on partial 16S rRNA sequences from the BA-1, BA-2 and ATL libraries depicting members of the phyla Bacteroidetes, TM7, Chloroflexi, Deferribacteres, Chlorobi, Synergistetes and Thermotogae and their related species. The number of clones representing each OTU/number of total clones in the BA-1 (brown), BA-2 (green) and ATL (blue) libraries, respectively, is shown in brackets. Bootstrap values (n = 1000 replicate runs, shown as %) greater than 70% are listed. GenBank accession numbers are listed after species names in parentheses. Geoglobus acetivorans strain SBH6 was used as outgroup

Within the Proteobacteria phylum (Fig. 4a), most of the sequences (130/148 sequences distributed in 6 OTUs) were affiliated with the class Gammaproteobacteria, in which numerous sequences (88 distributed in three OTUs, BA2_OTU4, BA2_OTU11 and ATL_OTU10) were grouped into a separate cluster closely related (99%) to sequences of Marinobacter sp. derived from microbial communities of oilfields in China (Tang et al. 2012), to Marinobacter-type microorganisms isolated from an oil-polluted saline soil in a Chinese oilfield (M. gudaonensis) (Gu et al. 2007) and to one halophilic isolate from benthic sediment of the South China Sea (M. segnicrescens) (Guo et al. 2007). Sequences associated with Marinobacterium sp. (41 sequences) shared 99% identity with environmental sequences from a non-flooded, high temperature petroleum reservoir in Japan (Kobayashi et al. 2012a). The remaining Proteobacteria sequences were affiliated with the Betaproteobacteria, Alphaproteobacteria and Deltaproteobacteria classes and were phylogenetically most closely related to sequences derived from seawater (Yin et al. 2013), anaerobic digesters (Riviere et al. 2009) and an alkaliphilic isolate from a soda lake (Zavarzina et al. 2006), respectively.

Phylogenetic analyses of the Firmicutes and Actinobacteria phylotypes (Fig. 4b) showed a close relatedness to organisms with interesting metabolic abilities, such as BA2_OTU5 and BA2_OTU7, which were 99 and 98% similar to a halophilic strain of Halanaerobium sp. isolated from hypersaline sediments in Utah (Kjeldsen et al. 2007). Also, phylotypes ATL_OTU3 and BA2_OTU2 showed the highest 16S rRNA identity (93%) with the thiosulfate-reducing type strain Dethiosulfatibacter aminovorans, isolated from marine sediments (Takii et al. 2007). Phylotype ATL_OTU2 showed 99% similarity with a Syntrophomonadaceae bacterial clone from enrichments of formation water samples associated with the complete metabolism of hydrocarbons to methane in an oil reservoir (Kobayashi et al. 2012b) and 91% similarity to the thermophilic type strain Thermosyntropha tengcongensis, which has a proven ability to degrade long-chain fatty acids syntrophically (Zhang et al. 2012a).

Figure 5 shows the phylogenetic relationships among the remaining bacterial OTUs, which were distributed in seven phyla. The largest number of sequences in this phylogenetic tree (59/92) grouped into two OTUs (BA1_OTU19 and BA2_OTU3) classified as Deferribacteraceae according to the SILVA database, and showed the highest similarities (>85%) with 16S rRNA sequences from published studies of microbial communities in contaminated sediments after the Prestige oil spill (Acosta-González et al. 2013) and in oil reservoirs in Denmark (Gittel et al. 2012). The closest type organism for these sequences was the thermophilic nitrate-reducing strain Calditerrivibrio nitroreducens, isolated from a hot spring in Japan (Iino et al. 2008). Other sequences were associated with representatives of the Thermotogae phylum, which are thermophilic anaerobic microorganisms from oil reservoirs, such as BA1_OTU21, which showed 99% similarity with Petrotoga halophila from an offshore reservoir in Congo, Africa (Miranda-Tello et al. 2007), and the phylotype BA2_OTU1, which showed 98% similarity with Thermosipho geolei from a thermophilic onshore reservoir in Siberia (L’Haridon et al. 2001).

Archaeal libraries contained, in total, 21 unique phylotypes. Methanomicrobia contained the largest number of sequences (204/336) (Fig. 6), which were closely related to sequences from the Niibori oilfield in Japan (Kobayashi et al. 2012a) and methanogenic alkane-degrading enrichments from an oily sludge (Cheng et al. 2014). Thermoplasmata sequences were related to uncultivated community members found in a methanogenic enrichment from the formation water of a petroleum reservoir in Russia (Nazina et al. 2013). Methanobacteria phylotypes were most closely related with sequences classified as either Methanothermobacter or Methanobacterium spp. and with uncultivated microorganisms from production waters (Cheng et al. 2011).

Fig. 6

Phylogenetic analysis of partial archaeal 16S rRNA sequences from the BA-1, BA-2 and ATL libraries. The number of clones representing each OTU/number of total clones in the BA-1 (brown), BA-2 (green) and ATL (blue) libraries, respectively, is shown in brackets. Bootstrap values (n = 1000 replicates, shown as %) greater than 70% are listed. GenBank accession numbers are listed after species names in parentheses. Pseudomonas aeruginosa DSM 50071T was used as outgroup

Comparison of the bacterial community compositions in the Brazilian oil samples

The BA-1 sample showed the most diverse and rich bacterial community when compared with BA-2 and ATL 16S rRNA bacterial libraries (Table 2; Figure S1). As detailed before (Table 2; Fig. 5), the majority of the 16S bacterial sequences in the BA-1 sample (50%) grouped in a single OTU (BA1_OTU19) classified as Deferribacteraceae. The second most abundant taxon at phylum level in BA-1 sample was Firmicutes (14%), with Bacillus and Tumebacillus as the most representative genera, followed by the phylum Thermotogae (12%), represented by the genera Kosmotoga and Petrotoga (Table 2). The Proteobacteria was the fourth most abundant phylum identified (10%) and encompassed the classes Alphaproteobacteria (3%), Betaproteobacteria (3%), Deltaproteobacteria (3%), and Gammaproteobacteria (1%). A minor proportion of the clone sequences belonged to the phyla Actinobacteria (6%), Bacteroidetes (3%), Chloroflexi (3%), Chlorobi (1%), and TM7 (1%) (Fig. 2; Table 2).

Table 2 Taxonomic classification and relative abundance of bacterial and archaeal 16S rRNA gene sequences in the clone libraries

The bacterial compositions of the non-degraded samples BA-2 and ATL were very similar to each other (Fig. 2; Table 2) and did not differ significantly according to the RDP Library Compare tool (P > 0.01). Sequences in samples BA-2 and ATL were predominantly related to the phylum Proteobacteria, represented only by the classes Gammaproteobacteria (largest proportion, 86 and 80% in BA-2 and ATL, respectively) and Deltaproteobacteria (minor proportion) (Fig. 2; Table 2). The second most abundant phylum for both samples BA-2 and ATL was Firmicutes (4 and 9%, respectively). Other less numerous phyla detected in the BA-2 sample were Thermotogae (1%), Deferribacteres (2%) and Synergistetes (2%), and in the ATL sample were Bacteroidetes (2%) and Chloroflexi (2%).

Significant differences were found between the biodegraded and the non-degraded samples, namely, between BA-1 and BA-2 and between BA-1 and ATL bacterial libraries. The Gammaproteobacteria class was the most differentially represented group, with the genera Marinobacter and Marinobacterium present only in the BA-2 and ATL libraries (Table 2; Fig. 4a). On the other hand, other taxonomic groups were significantly and uniquely represented in the sample BA-1, namely, the phylum Deferribacteres and the taxonomical classes Bacilli and Thermotogae, specifically the genera Bacillus and Petrotoga, respectively (Table 2).

Archaeal community composition in the Brazilian oil samples

The majority of sequences (57, 29, and 50%) in all libraries (BA-1, BA-2, and ATL, respectively) could not be classified below the class Methanomicrobia (Table 2), and they formed a cluster separate from their closest type organisms (Fig. 6), suggesting that these sequences could belong to phylotypes not described so far. Phylogenetic analysis of these sequences showed close relationships to reported Methanocellales-type organisms (Fig. 6). Nonetheless, 47 and 26% of sequences in the BA-2 and ATL archaeal libraries, respectively, showed high similarity levels (98 to 100%) at the genus level with sequences reported in RDP as Methanosaeta, Methanocalculus, Methanothermobacter, Methanobacterium, and Methanothermococcus (Table 2; Fig. 6).

In contrast to its bacterial community, which was the most diverse and rich, the BA-1 sample showed the least diverse and rich archaeal community when compared with the BA-2 and ATL samples (Table 2; Table S1). Furthermore, no archaeal phylotype in the BA-1 sample could be classified below the class level. The richest and most diverse archaeal composition was found in the BA-2 sample. However, as with the bacterial compositions, the archaeal compositions of the non-degraded samples BA-2 and ATL were more similar to each other (Table 2). Methanobacteria (class) and Methanosaeta (genus) were the two taxonomic groups significantly represented in the BA-2 and ATL archaeal libraries; these sequence types were not identified in the BA-1 archaeal library.

Comparative analysis of prokaryotic communities from Brazilian and worldwide oil reservoirs

To gain a holistic view of the microbial communities across oil reservoirs worldwide, the microbial diversity found in this study was compared to those reported in other studies (Table 3). In this sense, the relative abundances of phylotypes (order level) obtained from 16S rRNA clone libraries from nine published studies were analyzed and compared to the data described herein. In total, the analysis considered over 5100 16S rRNA gene sequences (approx 2900 bacterial and 2200 archaeal sequences), which were re-classified at the order phylogenetic level on RDP.

Table 3 General features of the studies examined in the meta-analysis for microbial communities in petroleum reservoir fluids

Figure 7 shows the results of the hierarchical clustering of the relative abundance data collected in the present work and in published studies, encompassing 14 different oil reservoirs. Two main groups were apparent: the first comprised relatively low temperature and mesophilic reservoirs up to 1000 m deep, and the second comprised high temperature and deeper reservoirs. A closer look at the cluster analysis showed that the most similar communities were the Kobayashi water and oil components, as might be expected, since these are two different phases (oil and water) of the same sample. However, the Brazilian samples BA-1 and BA-2, collected from two different wells in the Recôncavo Basin, did not cluster together. Instead, the microbial community in BA-1 oil was more similar to the one found in the PTS1 sample, collected in the Potiguar Basin (Silva et al. 2013), and the BA-2 sample clustered with the ATL sample, corroborating the results shown above. It is interesting to note that the BA-1 and PTS1 samples were obtained from similar in situ temperatures (50 and 48 °C, respectively) and depths (less than 1000 m), whereas the BA-2 and ATL samples were obtained from higher temperatures and greater depths.

Fig. 7

Dendrogram for hierarchical clustering of worldwide studies based on complete linkage of Euclidean similarities calculated on log transformed abundance data

A similarity percentage (SIMPER) analysis was used to determine which taxa contributed most to the differences observed along the clustering. Members of the orders Campylobacterales, Methanomicrobiales, and Methanosarcinales altogether accounted for 50% of the differences across the studies, followed by the bacterial orders Clostridiales, Desulfuromonadales, and Thermotogales.
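The logic of a SIMPER analysis can be sketched as follows. The example assumes the Bray-Curtis formulation that SIMPER conventionally uses: for each pair of samples drawn from two groups, the contribution of a taxon is its absolute abundance difference divided by the summed abundance of both samples, and contributions are then averaged over all pairs and expressed as percentages. The function name and the abundance values are hypothetical; only the order names are taken from the text.

```python
from itertools import product


def simper_contributions(group_a, group_b):
    """Percentage contribution of each taxon to the mean Bray-Curtis
    dissimilarity between two groups of samples (dicts of abundances)."""
    taxa = sorted({taxon for sample in group_a + group_b for taxon in sample})
    totals = dict.fromkeys(taxa, 0.0)
    for a, b in product(group_a, group_b):
        denom = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
        if denom == 0:
            continue  # two empty samples carry no information
        for t in taxa:
            totals[t] += abs(a.get(t, 0) - b.get(t, 0)) / denom
    overall = sum(totals.values())
    if overall == 0:
        return {t: 0.0 for t in taxa}
    # Rank taxa from the largest to the smallest contribution.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {t: 100 * value / overall for t, value in ranked}


# Hypothetical relative abundances (%) for one sample per group.
low_temperature = [{"Campylobacterales": 60, "Clostridiales": 40}]
high_temperature = [{"Campylobacterales": 10, "Clostridiales": 40,
                     "Methanosarcinales": 50}]
result = simper_contributions(low_temperature, high_temperature)
print(result)
```

In this toy case Clostridiales is equally abundant in both samples and therefore contributes nothing, while the other two orders each account for half of the dissimilarity.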

A Venn diagram was constructed to show the taxonomic classes shared by different geographic regions, in an attempt to determine whether a core microbiome inhabits oil reservoirs worldwide (Fig. 8). This analysis showed that no bacterial class was endemic to a single area, whereas one archaeal class, Thermoprotei, was detected only in the Niibori oil field, Japan (Kobayashi et al. 2012a). On the other hand, three microbial groups were found to be common to all reservoirs, namely, the Gammaproteobacteria, Clostridia, and Bacteroidia. Within the Gammaproteobacteria class, the order Alteromonadales was predominant in all sites, and Oceanospirillales and Pseudomonadales were present in at least four of the six geographical regions. Dominant taxonomic orders within Clostridia and Bacteroidia were the Clostridiales and Bacteroidales, respectively.
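The shared and endemic classes summarized by a Venn diagram reduce to simple set operations. The sketch below is illustrative only: the function name and the site labels are hypothetical, and the class lists per site are invented toy data rather than the lists used for Fig. 8.

```python
from collections import Counter


def core_and_endemic(classes_by_region):
    """Return (classes found in every region, classes unique to each region)."""
    occurrences = Counter(
        c for classes in classes_by_region.values() for c in set(classes)
    )
    n_regions = len(classes_by_region)
    core = {c for c, k in occurrences.items() if k == n_regions}
    endemic = {
        region: {c for c in classes if occurrences[c] == 1}
        for region, classes in classes_by_region.items()
    }
    return core, endemic


# Hypothetical class lists for three sites.
toy_regions = {
    "site_A": {"Gammaproteobacteria", "Clostridia", "Bacteroidia",
               "Thermoprotei"},
    "site_B": {"Gammaproteobacteria", "Clostridia", "Bacteroidia",
               "Methanomicrobia"},
    "site_C": {"Gammaproteobacteria", "Clostridia", "Bacteroidia",
               "Methanomicrobia"},
}
core, endemic = core_and_endemic(toy_regions)
print(sorted(core))
print({region: sorted(classes) for region, classes in endemic.items()})
```

Here the core set contains the three classes present at every site, and only site_A has an endemic class.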

Fig. 8

Venn diagram of shared communities at class level across different sites. Numbers in the Venn figure represent the number of taxonomical classes shared between the studies. Brazil included data from Silva et al. (2013) and from this study, North Sea included data from Dahle et al. (2008) and Kaster et al. (2009), North Ameri (North America) included data from Orphan et al. (2000) and Pham et al. (2009), and Japan included data from Kobayashi et al. (2012a). Bottom figure: size of each list represents the total number of taxonomical classes in each region. This figure was created with jvenn (Bardou et al. 2014)

Discussion

The present work focused on the evaluation of the distribution of bacteria and archaea in petroleum reservoirs worldwide, aiming to investigate the influence of key environmental factors in shaping the composition and structure of microbial communities in these environments. For this purpose, microbial communities in crude oil samples from petroleum reservoirs in the Northeast (BA-1 and BA-2) and Southeast (ATL) of Brazil were investigated through the analysis of 16S rRNA clone libraries and compared with microbial communities from similar culture-independent surveys in oil reservoirs worldwide.

Microbial diversity and metabolisms prevailing in Brazilian oil samples

The bacterial community in the degraded sample (BA-1) revealed a high abundance of sequences belonging to Deferribacteres, while communities from non-degraded samples (BA-2 and ATL) were predominantly composed of Marinobacter and Marinobacterium (Gammaproteobacteria). On the one hand, members of Deferribacteres have been reported in high abundance in heavily degraded oils from Bokor reservoir in China (Li et al. 2012) and in degraded oil samples from Brazil (Silva et al. 2013) and Alaska (Pham et al. 2009), and they dominated hydrocarbon-degrading anaerobic enrichments from oilfield fluids (Gieg et al. 2010). On the other hand, members of the genus Marinobacter are known to be aerobic hydrocarbon degraders (Gauthier et al. 1992; Nicholson and Fathepure 2004) and have been previously detected in Brazilian reservoirs using cultivation-independent (Sette et al. 2007) and cultivation-dependent approaches (Lopes-Oliveira et al. 2012). Likewise, Marinobacter has been detected in oil reservoirs worldwide (Orphan et al. 2000; Li et al. 2012), but because of its relationship with aerobic mesophilic lineages, it has long been considered a contaminant. However, its continuous detection in petroleum samples, as well as the proven ability of Marinobacter isolates to grow anaerobically, suggests that these organisms might be indigenous to this environment, actively participating in in situ syntrophic interactions (Köpke et al. 2005; Pham et al. 2009; Gray et al. 2011). Nevertheless, the fact that the dominance of Marinobacter sequences in this study was associated with the non-degraded samples suggests that, even as a common inhabitant of these reservoirs, its metabolic activity is probably limited by the in situ conditions.

The archaeal community found in the Brazilian oil samples (BA-1, BA-2, ATL) was composed mostly of methanogens of the class Methanomicrobia. Within Methanomicrobia, many Methanomicrobiales can utilize formate and alcohols as alternatives to H2 + CO2 to produce methane, whereas Methanosarcinales generally grow using methyl compounds (Methanolobus), H2 + CO2 and acetate (Methanosarcina), or acetate as sole energy source (Methanosaeta) (Grabowski et al. 2005; Kendall and Boone 2006). In this study, acetoclastic methanogens of the genus Methanosaeta were present in both non-degraded oil samples (BA-2 and ATL) but not in the degraded sample (BA-1), where methanomicrobial sequences were more closely related to Methanocella, a formate-utilizing and hydrogenotrophic archaeon. Hydrogenotrophic organisms were also found in both non-degraded samples (BA-2 and ATL), such as Methanobacterium and Methanothermobacter, along with the genera Methanocalculus and Methanothermococcus, detected exclusively in the ATL and BA-2 samples, respectively.

Culture-dependent studies have frequently shown hydrogenotrophic methanogenesis to prevail over acetoclastic methanogenesis in petroleum reservoirs (Orphan et al. 2000; Bonch-Osmolovskaya et al. 2003; Gieg et al. 2010; Mayumi et al. 2011). In fact, hydrogenotrophic methanogenesis is thought to be the main mechanism responsible for the formation of heavy oil in subsurface petroleum reservoir environments (Jones et al. 2008). However, a culture-independent study by Kobayashi et al. (2012a) showed differences in the microbial communities associated with each component of the reservoir sample: crude oil, large insoluble particles, and formation water. Acetoclastic methanogens were dominant in the oil-associated microbiota (oil and insoluble particles), whereas hydrogenotrophic methanogens were more abundant in the formation waters (Kobayashi et al. 2012a). Therefore, the low abundance of acetoclastic organisms in culture-based studies has been explained by the nature of their inoculum sources, for which water phases are commonly used. Here, we have shown that both hydrogenotrophic and acetoclastic methanogens are present in the oil samples, but because the degraded oil sample did not contain acetoclastic methanogens, hydrogenotrophic methanogenesis appears more likely to be responsible for the last steps of crude oil degradation. Additional cultivation experiments with specific substrates would be necessary to clarify the respective contributions of acetoclastic and hydrogenotrophic methanogenesis in the Brazilian oil reservoirs under study.

Finally, the archaeal community in the degraded oil sample BA-1 showed a high abundance of Thermoplasmatales-related sequences in the phylogenetic analysis. Although Thermoplasmatales-related organisms have been detected in relatively low abundance in other highly degraded oil reservoirs (Yamane et al. 2008; Kobayashi et al. 2012a; Hubert et al. 2012), no specific role has been assigned to these organisms so far. The fact that Thermoplasmatales sequences in this study were found only in the biodegraded oil sample (BA-1), and that they were most closely related to highly abundant sequences from a hydrocarbon-rich environment (an asphalt lake in Trinidad and Tobago) (Schulze-Makuch et al. 2011), suggests that this taxon might represent an important member of microbial communities in highly biodegraded oil reservoirs.

Microbes in degraded vs non-degraded oils

Our research group has previously sought to characterize the microbial composition of degraded and non-degraded Brazilian oil samples. In the work of Sette et al. (2007), the diversity analysis of bacterial communities in highly degraded and non-degraded oil samples did not show statistical differences based on ARDRA results; nevertheless, it was assumed that a greater sequencing effort could have characterized the communities better. In a more comprehensive study based on the sequencing of 16S rRNA gene libraries of microbial communities in moderately degraded and non-degraded petroleum samples, statistical differences were observed for the Deltaproteobacteria class, which was found only in the biodegraded oil sample (Silva et al. 2013).

In the present study, several significant differences between microbial communities in the degraded and non-degraded oil samples were shown. Mainly, Gammaproteobacteria were dominant in non-degraded samples. Furthermore, the degraded oil sample BA-1 showed higher bacterial diversity than the non-degraded oil samples (BA-2 and ATL), taking into account the taxonomic groups that were unique or significantly dominant in this sample: Thermotogae, Actinobacteria, Betaproteobacteria, Alphaproteobacteria, TM7, Bacilli, and Deferribacteres. Silva et al. (2013) also reported a higher bacterial diversity for the degraded oil sample. Nevertheless, no systematic study to date has scrutinized the differences among microbial communities inhabiting biodegraded and non-biodegraded petroleum reservoirs. In the literature data reviewed for this work, information on the extent of biodegradation of the petroleum reservoirs analyzed is scarce or absent. As far as we were able to consider in this survey, only Epsilonproteobacteria and Thermoplasmatales occurred commonly in biodegraded petroleum reservoirs. However, additional samples from degraded and non-degraded petroleum reservoirs need to be considered to conclude whether there is any distribution pattern of microbial populations that could be directly related to oil biodegradation processes in deep subsurface environments.

Microbial phylogeography in oil reservoirs worldwide

In the present study, several reports on the characterization of microbial communities from oil and production water fluids worldwide were reviewed aiming to determine the influence of environmental parameters in shaping the composition and structure of these communities at local and global scales.

Kaster et al. (2009) were the first to compare the taxonomic groups found in their study with those reported in previous works, showing that two thirds of the genera identified in their study had already been detected in other microbiological surveys of petroleum reservoirs and arguing that the chemical composition of reservoirs may affect microbial communities more than geological properties. Later, Li et al. (2012) explored the microbial diversity in petroleum samples from the South China Sea and compared it with that of other reservoirs worldwide. They observed co-occurrence of common petroleum-associated microbes, but no geographical or temperature relatedness was found among studies. Recently, Lewin et al. (2014) investigated the level of similarity of microorganisms in two physically separated but closely located deep subsurface oil reservoirs in the North Sea. The two oil wells appeared to contain similar microbial communities but in very different relative abundances.

In this study, hierarchical clustering of bacterial and archaeal relative abundances across oil reservoirs worldwide suggested an apparent influence of temperature and depth on the prokaryotic communities. Despite the likely bias inherent in the lack of standardization among the surveys under study (DNA isolation, primers used for 16S rRNA gene amplification, PCR conditions, etc.), the resulting dendrogram showed some degree of relatedness among the microbial communities of a group of biodegraded samples from low to mild temperature (up to 50 °C) and relatively shallow (up to 1000 m) reservoirs, which included the studies of Pham et al. (2009), Silva et al. (2013), Grabowski et al. (2005), and Hubert et al. (2012). A second group of more closely related microbial communities included all the remaining studies of reservoirs with higher temperature and depth. The main taxa that distinguish the two groups were identified. The first group, encompassing the low to mild temperature and relatively shallow petroleum reservoirs, apparently has in common the detection and abundance of Campylobacterales (Epsilonproteobacteria) and Desulfuromonadales (Deltaproteobacteria). The predominance of these groups in low temperature biodegraded reservoirs has been previously observed by Hubert et al. (2012). On the other hand, the dominant taxa in the second group of microbial communities, associated with higher temperature and deeper reservoirs, are Clostridiales and Thermotogales. Members of these groups are thermophiles with mainly fermentative metabolism. Accordingly, Gray et al. (2010) also noticed a higher frequency of Firmicutes and Thermotogae in high temperature petroleum reservoir communities.
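As an illustration of how relative abundances can be grouped hierarchically, the following minimal sketch clusters four hypothetical sites. Several assumptions are made here that the text above does not specify: the clone counts are invented placeholders, Bray-Curtis dissimilarity is used as the distance measure, and average linkage (UPGMA) is used to merge clusters. The names `relative_abundance`, `bray_curtis`, and `upgma` are introduced only for this example; the settings actually used for the dendrogram may differ.

```python
from itertools import combinations

# Hypothetical clone counts per taxonomic order (placeholders, NOT survey data).
counts = {
    "site_A": {"Campylobacterales": 40, "Desulfuromonadales": 30, "Clostridiales": 5},
    "site_B": {"Campylobacterales": 35, "Desulfuromonadales": 25, "Thermotogales": 10},
    "site_C": {"Clostridiales": 50, "Thermotogales": 40, "Campylobacterales": 2},
    "site_D": {"Clostridiales": 45, "Thermotogales": 30, "Desulfuromonadales": 5},
}


def relative_abundance(sample):
    """Convert raw clone counts into fractions that sum to 1."""
    total = sum(sample.values())
    return {taxon: n / total for taxon, n in sample.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    taxa = set(p) | set(q)
    shared = sum(min(p.get(t, 0.0), q.get(t, 0.0)) for t in taxa)
    return 1.0 - 2.0 * shared / (sum(p.values()) + sum(q.values()))


def upgma(profiles):
    """Average-linkage agglomerative clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = {name: [name] for name in profiles}
    merges = []

    def avg_dist(a, b):
        # Mean pairwise dissimilarity between the members of two clusters.
        pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
        return sum(bray_curtis(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(sorted(clusters), 2), key=lambda ab: avg_dist(*ab))
        merges.append((a, b, avg_dist(a, b)))
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
    return merges


profiles = {site: relative_abundance(sample) for site, sample in counts.items()}
merges = upgma(profiles)
for a, b, d in merges:
    print(f"merge {a} + {b} at dissimilarity {d:.3f}")
```

In this toy data set, the two sites rich in Campylobacterales and Desulfuromonadales join each other before joining the two sites rich in Clostridiales and Thermotogales, which is the kind of two-group structure a dendrogram would display. The sketch does not address the methodological bias among surveys noted above.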

The Venn diagram suggested the existence of a common microbiota inhabiting oil reservoirs worldwide, belonging to the classes Gammaproteobacteria, Clostridia, and Bacteroidia. The Gammaproteobacteria class comprises facultative anaerobes or strictly aerobic members, whereas the dominant groups belonging to Clostridia and Bacteroidia are obligate anaerobes (with fermentative heterotrophic or chemolithoautotrophic metabolism). These results agree with the previous observations of Gray et al. (2010), in which Firmicutes, Proteobacteria (mainly Gammaproteobacteria), and Bacteroidetes were the three bacterial phyla found in highest frequency in several hydrocarbon-impacted environments, including petroleum reservoirs but also hydrocarbon-contaminated aquifers, sediments, and soils. In this study, Methanomicrobia, which comprises both acetoclastic and hydrogenotrophic methanogens, was the dominant archaeal class in almost all reservoirs studied.

Results from the microbial community comparison at global scale support the general notion that syntrophic interactions are expected to occur in oil reservoir habitats (Wentzel et al. 2013). The core microbiota found is probably involved in complex syntrophic interactions responsible for the complete degradation of alkanes and other hydrocarbon components. Future deep sequencing efforts in combination with genome reconstruction and cultivation attempts will help to recover detailed information on the function, interactions, and ecological significance of key microorganisms. Immediate efforts toward the elucidation of ecosystem functioning should focus on Bacteroidia members, which, due to their syntrophic nature, are probably important actors underlying the interaction network in the reservoir environment.

Conclusions

This study described the microbial diversity of crude oil samples from three petroleum reservoirs in Brazil: two from the Recôncavo Basin in the Northeast and one from the Campos Basin in the Southeast. The vast majority of bacterial and archaeal genera found in this study have already been detected in oil fields or oil-associated environments. However, bacterial genera were not shared even between the Northeast samples (BA-1 and BA-2), and this was also true for the archaeal community compositions. Instead, the prokaryotic community of the BA-2 sample was notably more similar to that of the ATL sample (from Southeast Brazil), both being classified as non-degraded samples. These results suggest that geographic distance at local scale does not play a dominant role in shaping reservoir microbial communities. Indeed, rather than geographical distance, the comparison among microbial communities in reservoirs worldwide, taking physicochemical data into consideration, showed that the distribution of microbial taxa was correlated with temperature and depth. In relatively shallow and low temperature oil reservoirs, the groups Epsilonproteobacteria and Deltaproteobacteria seem to be more abundant, whereas the orders Clostridiales and Thermotogales were found in higher frequency in bacterial libraries from deeper and high temperature oil reservoirs. A core microbiome encompassing three main bacterial classes (Gammaproteobacteria, Clostridia, and Bacteroidia) and one archaeal class (Methanomicrobia) could be defined; its members are probably the main players in the syntrophic interactions in these environments.