Introduction

The microbial adaptive response to variations in environmental conditions is associated with processes of horizontal gene transfer (HGT) [1,2,3]. Stress conditions, like ultraviolet radiation or nutritional limitations, can trigger the activity of genetic elements like transposons and insertion sequence elements, thus fostering the genetic flexibility in microbiomes [4, 5]. Moreover, microhabitats that spur microbial activity and growth have been shown to also stimulate plasmid-mediated HGT [6, 7].

The study of the roles of both transposons and plasmids in natural systems is therefore essential for our understanding of the adaptive processes in microbiomes [8]. It is well accepted that plasmids are key to bacterial evolution, providing sources of antibiotic resistance and xenobiotic-degradation genes or conferring other fitness-enhancing features on their hosts [6, 7, 9]. Broad-host-range (BHR) plasmids are particularly important, as they are key agents in mobilomes that connect diverse species within microbiomes [1, 10]. They may also incite genomic rearrangements by homologous recombination, facilitating the gain or loss of genes [11, 12]. As a result of such processes, plasmids often have mosaic structures, in which each module may have an independent phylogenetic history [13].

Next to the well-studied IncP-1 plasmids, those of the PromA group [14] have attracted attention as prime agents that transfer genes across a broad range of hosts. Plasmids belonging to this group are known for their promiscuous nature due to their ability to transfer and replicate in phylogenetically diverse bacteria [15]. The reference plasmid of the PromA group, pIPO2, was isolated by triparental exogenous isolation from the rhizosphere of young wheat plants [6]. This BHR plasmid group currently includes seven members, namely, plasmids pIPO2 [16], pMOL98 [17], pSB102 [18], pTer331 [19], pXyl184, pMRAD02 [20], and pSFA231 [21].

All of these plasmids share, next to systems for replication and horizontal transfer, a par (partitioning) system, consisting of the parA, parB, and parS genes. PromA plasmids harbor a “type-1” Par system for partitioning. However, the mechanism of this partitioning has been elusive. It is known that the parA gene encodes the ParA protein, an NTPase that binds to and hydrolyzes ATP or GTP to provide energy that may drive DNA segregation [22]. In early work with plasmid RP1, the partitioning system (homologous to the Par system) was deleted without effects on the copy number control system [23]. Moreover, the Par system can be involved in both plasmid and chromosome segregation, and chromosomally encoded proteins can stabilize plasmids [24]. Thus, a certain functional redundancy exists across partitioning systems [25].

With respect to the PromA plasmid group, insertions of accessory genes have been empirically observed to occur exactly in the parA region, exemplified by the luxAB-tet gene cassette inserted into plasmid pIPO2 (yielding plasmid pIPO2T) [16], mercury resistance genes found on plasmid pSB102 [18], and heavy metal response genes on plasmid pMOL98 [17]. The parA gene region was found to have typical features associated with an insertional “hotspot” site, such as the presence of repetitive palindromic elements as well as a generally high A+T content [14]. In other plasmids, the region was deleted without effects on the plasmid copy number control system or plasmid functioning [22]. Hence, we hypothesized that the parA gene region is to some extent dispensable [23]. With respect to partitioning, parB appears to be key; it is found on all plasmids of the PromA group, and the protein encoded by this gene is important for the proper formation of the partitioning complex.

The goal of this study was to explore the PromA plasmid parA gene region as a potential insertional hotspot for mobilized genes. We thus analyzed conserved regions of the parA genes in the plasmids pIPO2, pTer331, pSB102, pSFA231, and pMOL98, and designed a specific PCR system that allowed the amplification of the region spanning the aforementioned inserts. We then generated amplicons from habitat microbiomes that were fingerprinted and subsequently sequenced. The results indicated that different sets of genes were inserted in the parA hotspot, and highlighted some function-related inserts with relevance for fitness in the respective habitats.

Materials and Methods

Description of Environmental Samples Used as Models

Mangrove Soils

Soil samples from three mangroves located on the coast of São Paulo State, Brazil, were used: two located close to the city of Bertioga [one affected by oil contamination (BrMgv01 and BrMgv02), and another affected by sludge and other urban waste (BrMgv03)], and the third located in the city of Cananéia (BrMgv04), under pristine conditions. For more information on these mangroves, see [26].

Salt marsh sites

Soil samples from five successional stages (0, 5, 35, 65, and 105 years of succession) were obtained from the salt marsh located on Schiermonnikoog Island (the Netherlands). For more information on the salt marsh and the samples, see Dini-Andreote et al. [27].

Mycosphere and bulk soil samples

Mycosphere samples were taken across a forest soil area in Noordlaren (the Netherlands). Triplicate individuals of different mushrooms (defined by morphology) were sampled, taking care that the soil at the mushroom feet was included [7]. For each sample, a corresponding bulk soil sample was collected near the mushroom (approximately 1 m away). For more details on sampling, see Inceoglu et al. [28].

Extraction, Purification, and Quantification of Total DNA

Total DNA was extracted from 0.5 g of initial material using the MoBio PowerSoil DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA, USA). The manufacturer’s protocol was slightly modified by the addition of glass beads (diameter 0.1 mm; 0.25 g) to the soil slurries followed by 3 cycles of bead beating (mini-bead beater, BioSpec Products, Bartlesville, OK, USA) for 60 s. Extracted DNA samples were quantified using gel electrophoresis and Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen, Carlsbad, CA, USA) on a Tecan Infinite M200 Pro (Maennedorf, Switzerland).

Design and Validation of Specific parA Primers

The primer design was based on the alignment of the parA gene found in five plasmids of the PromA group: pIPO2 [16] (GenBank accession number AJ297913), pTer331 [19] (GenBank accession number NC_010332.1), pSB102 [18] (GenBank accession number NC_003122.1), pSFA231 [21] (GenBank accession number KJ850907), and pMOL98 [15] (GenBank accession number FJ666348). These sequences were aligned using MEGA 4.0, and conserved regions were identified. We found two conserved regions flanking a variable region of approximately 185 bp, with characteristics that mark it as a possible hotspot for the insertion of exogenous DNA, such as a high percentage of A/T nucleotides and a high frequency of palindromic sequences. In addition, the sequences of plasmids pIPO2T and pTer331 presented insertions within this hotspot. These observations supported the primer design, resulting in the forward primer parAF (5′-TGCGCATGATTGACGACC-3′) and the reverse primer parAR (5′-CGCAGGACCTGTCTCTTG-3′).
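As a rough cross-check of the primer pair reported above, basic oligonucleotide properties can be computed directly from the two sequences. The sketch below is illustrative only and does not describe the validation procedure: the function names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse approximation for short oligonucleotides that ignores salt and primer concentrations.

```python
# Illustrative sketch: basic properties of the parAF/parAR primers.
# Function names are hypothetical; the Wallace rule is only a rough Tm estimate.

PRIMERS = {
    "parAF": "TGCGCATGATTGACGACC",
    "parAR": "CGCAGGACCTGTCTCTTG",
}


def gc_content(seq):
    """Fraction of G and C bases in the oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace-rule melting temperature in degrees Celsius."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt", f"GC={gc_content(seq):.2f}", f"Tm~{wallace_tm(seq)} C")
```

Both primers are 18 nt long and, by this estimate, differ in melting temperature by only 2 °C, which is a commonly desired property of a primer pair.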

The primers were validated using Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast) [29], which verified their specificity and complementarity to sequences of PromA plasmids. We also assessed the primer specificity by amplifying the parA regions found in the plasmids pIPO2 and pTer331. Further, the sequences obtained from environmental amplicons were used to construct a phylogenetic tree to observe the relationship of the recovered parA fragments with the known parA genes. The tree was constructed in MEGA 4.0 using the neighbor-joining method with the Kimura two-parameter model. Bootstrap analysis with 1000 replications was performed to obtain confidence estimates for tree topologies. As outgroup, we used the parA sequence from the IncP-1 plasmid pAKD16.
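For readers unfamiliar with the Kimura two-parameter model, the pairwise distance it assigns to two aligned sequences can be sketched as follows. This is a minimal illustration of the standard formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It is not the MEGA implementation, and the function name and the handling of gaps (sites with non-ACGT characters are skipped) are assumptions made for this example.

```python
# Minimal sketch of the Kimura two-parameter (K2P) distance between two
# aligned nucleotide sequences. Not the MEGA implementation; gap handling
# (skipping non-ACGT sites) is an assumption of this example.
import math

PURINES = {"A", "G"}
BASES = set("ACGT")


def k2p_distance(seq_a, seq_b):
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only alignment columns where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x in BASES and y in BASES]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transition: purine<->purine or pyrimidine<->pyrimidine change.
    transitions = sum(1 for x, y in pairs
                      if x != y and (x in PURINES) == (y in PURINES))
    # Transversion: purine<->pyrimidine change.
    transversions = sum(1 for x, y in pairs
                        if x != y and (x in PURINES) != (y in PURINES))
    p, q = transitions / n, transversions / n
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such distances is the input to neighbor-joining tree construction.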

Reactions were prepared in 50-μL mixtures containing 10 ng of DNA, 0.2 mM of primers parAF and parAR, 2.0 mM of MgCl2, 250 nM of deoxynucleoside triphosphates (dNTPs), 0.02 μL T4 gene 32 protein (5 mg/mL), and 0.4 μL Bioline Taq polymerase (5000 U/mL) in the buffer supplied by the manufacturer (Bioline Life Science, Germany). The optimized amplification program consisted of an initial denaturation step at 95 °C for 5 min, followed by 30 cycles of 45 s at 94 °C, 30 s at 59 °C, and 20 s at 72 °C, with a final extension of 7 min at 72 °C. These conditions were optimized using the plasmids pIPO2 and pTer331 as templates.

Amplification and Analysis of parA Sequences from Environmental Samples

Aliquots of DNA (approximately 50 ng) from environmental samples were subjected to amplification of the parA region using the reactions and procedures described above. Amplification patterns (made up of several insertion elements) were resolved by electrophoresis on agarose gels (1.0% w/v), which were then stained and visualized and photographed under UV light.

The band patterns were analyzed using GelCompar II software (Applied Maths, Sint-Martens-Latem, Belgium), in which patterns were normalized and cross-compared. Cluster analysis of PCR patterns was performed using UPGMA (unweighted pair group method with arithmetic mean) based on the similarity calculated by densitometric Pearson correlation [30].
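The clustering principle described above can be illustrated with a small, self-contained sketch: each lane is represented by a densitometric profile, the distance between two lanes is taken as 1 minus their Pearson correlation, and UPGMA repeatedly merges the two clusters with the smallest mean lane-to-lane distance. This is not the GelCompar II implementation. The lane names and intensity values are invented for illustration, and the use of 1 - r as the distance is an assumption of this example.

```python
# Illustrative UPGMA clustering of gel-lane profiles using Pearson correlation.
# Lane names and intensities are invented; distance = 1 - r is an assumption.
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def upgma(profiles):
    """Return the merge steps as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(name,) for name in profiles]  # start with one lane per cluster
    base = {frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}

    def cluster_distance(c1, c2):
        # UPGMA: unweighted mean of all lane-to-lane distances between clusters.
        dists = [base[frozenset((a, b))] for a in c1 for b in c2]
        return sum(dists) / len(dists)

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical densitometric profiles (band intensity along the lane).
lanes = {
    "mangrove_rep1": [0.1, 0.9, 0.2, 0.8, 0.1],
    "mangrove_rep2": [0.2, 0.8, 0.3, 0.9, 0.1],
    "marsh_rep1": [0.9, 0.1, 0.1, 0.2, 0.7],
    "marsh_rep2": [0.8, 0.2, 0.1, 0.1, 0.8],
}
merges = upgma(lanes)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.3f}")
```

In this toy data set, replicate lanes from the same habitat merge first, and the two habitat clusters join only at a much larger distance.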

The most prominent bands observed in patterns from each environment (six from mangrove soils, one from salt marsh soils, and four from mycosphere or bulk soil) were excised from the gels and used for cloning. DNA was purified from the gel bands using the Wizard® SV Gel and PCR Clean-Up System (Promega, USA). The resulting DNAs were cloned into the pGEM-T Easy vector and introduced into chemically competent Escherichia coli JM109 (Promega, USA) following the manufacturer’s instructions. Four positive clones per band were randomly selected and screened for the presence of the target bands. Primers M13F and M13R were used to determine which clones contained the correct-sized inserts. These were further purified and subjected to sequencing, performed by LGC Genomics (Berlin, Germany). Sequence chromatograms were trimmed using the Lucy algorithm [31] at a base quality score threshold of > 20. High-quality sequences were submitted to the National Center for Biotechnology Information (NCBI) database under the accession numbers MF037631 to MF037640. These sequences were also compared against the GenBank database using the Basic Local Alignment Search Tool algorithm (BLASTx) nt/aa [32] to provide information on the taxonomy and function of genes inserted into parA regions in distinct environmental samples.
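The idea behind quality-based trimming can be conveyed with a deliberately simplified stand-in. The sketch below is not the Lucy algorithm, which uses windowed error estimates. It merely keeps the longest contiguous stretch of bases whose quality score exceeds the threshold of 20. The function name and the example read are hypothetical.

```python
# Simplified stand-in for quality trimming (NOT the Lucy algorithm):
# keep the longest contiguous run of bases with quality score > threshold.

def trim_by_quality(seq, quals, threshold=20):
    if len(seq) != len(quals):
        raise ValueError("one quality score is required per base")
    best_start, best_len = 0, 0
    run_start = None
    # The appended sentinel (equal to the threshold, hence failing the test)
    # closes a run that extends to the end of the read.
    for i, q in enumerate(list(quals) + [threshold]):
        if q > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return seq[best_start:best_start + best_len]


# Hypothetical read with low-quality ends and one internal low-quality base.
read = "ACGTACGTAC"
scores = [10, 12, 30, 35, 40, 38, 15, 25, 30, 9]
print(trim_by_quality(read, scores))
```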

Results and Discussion

Alignment of parA Regions, Primer Design, and Validation

We first determined the level of “sharedness” of the parA region across the five PromA plasmids (Fig. 1a). We then removed the presumed inserts in silico and aligned the five remaining 698-bp regions (Fig. 1b). These were identical (100% identity) (Fig. 1b), supporting the design of a specific pair of primers annealing to regions starting at positions 181 and 366 (Fig. 1b) in the parA sequences of the five PromA plasmids. The parA gene region was, by our analyses, found to have a high number of putative recombination sites (A+T-rich interaction site with DNA sequences), as inferred from finding insertions in plasmids pSB102 and pMOL98 at almost exactly the same sites of the parA gene.

Fig. 1

a Phylogenetic tree of parA nucleotide sequences of selected PromA plasmids, constructed using the neighbor-joining method. The evolutionary distances were computed using the Kimura two-parameter method, and the bar indicates the dissimilarity scale on tree branches. b Schematic diagram of the linear alignment of the parA region of the five PromA plasmids

The parA locus revealed several sites rich in A+T, with the 185-bp hotspot region having 63% A+T richness relative to total parA (Fig. 1b). Palindromic sequences were also found (Fig. 1b). Such features are often present in hotspots for the insertion of exogenous genes. The highly conserved regions A and B (Fig. 1) that flank this hotspot site were well suited to the design of primers for amplification of the intervening region. Hence, we surmised that the parA region constitutes an insertional hotspot. In Fig. 1, we indicate the A+T-rich regions. The target in parA is the palindromic site; its orientation specificity is not surprising. Indeed, the known parA sequences contain a conserved region. These segments may be sequence-specific functional elements of the palindrome. The insertions collected by the PromA plasmids may have traveled through different bacterial host species (on pSB102 and pMOL98, on naturally occurring transposons), and they may have been acquired from different donors. However, the scope of the current study did not allow us to determine the source of these insertions.
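The two sequence features discussed above, A+T richness and palindromic sites, are straightforward to screen for computationally. The following sketch is a generic illustration, not the analysis pipeline used here: the function names, the length limits, and the toy sequence are assumptions. A palindrome is defined in the molecular sense, that is, a stretch identical to its own reverse complement, so only even lengths need to be checked.

```python
# Generic sketch: A+T fraction and reverse-complement palindromes in a sequence.
# Function names, length limits, and the toy sequence are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def at_fraction(seq):
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=6, max_len=12):
    """Return (start, site) for each window equal to its reverse complement."""
    seq = seq.upper()
    hits = []
    # Odd-length windows cannot equal their reverse complement, because the
    # central base would have to pair with itself.
    first_even = min_len + (min_len % 2)
    for size in range(first_even, max_len + 1, 2):
        for start in range(len(seq) - size + 1):
            window = seq[start:start + size]
            if window == reverse_complement(window):
                hits.append((start, window))
    return hits


toy = "CCGAATTCAA"  # contains the palindromic site GAATTC
print(at_fraction(toy), find_palindromes(toy))
```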

We tested the amplification conditions for specificity with plasmids pIPO2 and pTer331, using plasmid pSUP104 as a negative control. When tested on a suite of plasmids, the PCR system proved to be specific for the PromA plasmid class, with the primers generating amplification products with average sizes of (the predicted) 185 bp. The primers and the amplification conditions thus yielded a very specific and sensitive system that enabled the production of specific parA fragments. The sensitivity of the system was revealed by the fact that 10⁷, 10⁵, and 10³ copies of the plasmid, when used as the target, all yielded clear products (data not shown).

Amplification of parA Sequences from Environmental Samples

As the primer pair revealed specificity for parA and efficiently amplified the targets, we applied them to environmental microbiome DNAs that had previously shown good amplification with 16S rRNA gene primers (Fig. 2). Amplicons were produced for all samples, in the (overall) size range 392–1464 bp. Remarkably, the gel electrophoretic patterns were distinct and characteristic for each environment, as the replicates of all samples consistently revealed very similar patterns that clustered away from those of other habitats (Fig. 2). Clustering analysis confirmed this clear separation by sample type. The richness of the parA-inserted genes varied between 6 in mangrove sediments, 2 in bulk soil, and 1 in mycosphere and salt marsh ecosystems (Fig. 2). Hence, the system revealed a restricted view of potentially inserted genetic material in the parA sites.
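The relationship between amplicon size and insert size can be made explicit with a small in silico PCR sketch. It assumes that an uninterrupted hotspot yields the predicted 185-bp product and that any additional length reflects inserted DNA. Under that assumption, the observed size range of 392 to 1464 bp would correspond to inserts of roughly 207 to 1279 bp. The function names and the synthetic templates are hypothetical, and only exact primer matches on one strand are considered.

```python
# In silico PCR sketch: product length on a template and implied insert size.
# Assumptions: exact primer matches only; an empty site gives a 185-bp product.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAR_AF = "TGCGCATGATTGACGACC"
PAR_AR = "CGCAGGACCTGTCTCTTG"


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward=PAR_AF, reverse=PAR_AR):
    """Product length (primers included), or None if a primer site is missing."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)  # as it appears on the top strand
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


def estimated_insert_size(amplicon_bp, empty_site_bp=185):
    """Extra length relative to the product expected from an empty site."""
    return amplicon_bp - empty_site_bp


# Synthetic templates: the same primer sites with and without a 100-bp insert.
empty = "GG" + PAR_AF + "AT" * 10 + reverse_complement(PAR_AR) + "CC"
loaded = ("GG" + PAR_AF + "AT" * 5 + "C" * 100 + "AT" * 5
          + reverse_complement(PAR_AR) + "CC")
print(amplicon_length(empty), amplicon_length(loaded))
print(estimated_insert_size(392), estimated_insert_size(1464))
```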

Fig. 2

Cluster dendrograms of PCR patterns created using UPGMA based on similarity calculated by densitometric Pearson correlation analysis for a mangrove sediment samples (BrMgv01: area slightly affected by oil contamination in the oil-spill mangrove; BrMgv02: area highly impacted by the oil contamination; BrMgv03: mangrove near the city, under anthropogenic pressure; BrMgv04: located in a preservation area, under pristine conditions), b salt marsh soils from the island of Schiermonnikoog, the Netherlands (stages 0, 5, 35, 65, and 105, in years of soil development), and c Buinen and Gieterveen sandy clay loam soils and mycosphere samples. Black arrows indicate the position of selected bands excised from the gels

Interestingly, the mangrove microbiome patterns clustered according to the level of (previous) oil pollution. That is, the patterns from the less contaminated areas (BrMgv01 and BrMgv04) clustered away from those of the highly polluted ones (BrMgv02 and BrMgv03) (Fig. 2a). This divergence is in accordance with previous data on total microbial groups, which revealed the occurrence of distinct microbial communities in each of these mangrove sediments [26]. Similarly, the salt marsh-derived patterns grouped in accordance with the timespan within the chronosequence. The patterns derived from the younger habitats (0, 5, and 35 years) were distinct from one another and also distinct from the ones from the “old” habitats in the chronosequence (65 and 105 years) (Fig. 2b). Again, this separation corroborates previous data on the bacterial communities in the same chronosequence [27]. The comparison between bulk soils and the mycosphere also resulted in the differential display of bands from insertion sequences in the parA hotspot. Mycosphere-derived samples presented bands that were different from those found in bulk soil-derived patterns (Fig. 2c), as observed by Warmink and co-workers [33]. Overall, the amplification patterns obtained from the selected samples suggested that the parA inserts, within the constraints of the method, were consistently different between the evaluated habitats. Environmental characteristics probably drive and select the respective genes in their genetic and host context.

Fig. 3

Analysis of 80-bp regions directly flanking the forward and reverse primer annealing sites in the analyzed amplicons. a Forward primer and b reverse primer. The scale bar indicates the number of base substitutions per site, and values at nodes indicate bootstrap percentages (total of 1000 repetitions). The outgroup is the parA region of IncP-1 plasmid pAKD16

Identification of parA Gene Insertions Obtained from Soil DNA

Representative dominant bands (considering the cluster analysis) for each habitat were excised from the gel, reamplified, cloned, and sequenced, in order to assess the nature of the parA gene insertions (see Fig. 2). We confirmed the identity of these sequences by a separate analysis of the flanking regions of the amplicon inserts, approximately 80 bp upstream and downstream of the inserted sequences; invariably, these were closely related to parA gene regions found in PromA plasmids (Fig. 3).

Six bands from the mangrove patterns were analyzed [five from the highly impacted mangroves (BrMgv02 and BrMgv03) and one from the less impacted one (BrMgv01)]. In the low-oil-contaminated mangrove, the dominant insertion was associated with an acetoacetyl-CoA synthetase, a widely distributed enzyme associated with ATP generation processes (band 1, see Table 1) [34], whereas in the oil-polluted habitats BrMgv02 and BrMgv03, genes related to an ammonium transporter (band 2), an IS5/IS1182 family transposase (bands 3 and 4), an invertase (band 5), and a dioxygenase beta subunit (band 6) were detected, with complete genes (Table 1). The selective pressure exerted by anthropogenic influences thus presumably shaped the dominant microbial adaptations. The genes detected in the impacted environments may thus be associated with microbial adaptations to selective pressures (e.g., the dioxygenase beta subunit, an element with potential function in oil degradation, in the oil-impacted mangrove). Moreover, the detection of transposases could be associated with the pressures exerted on plasmids and hosts in the context of frequent environmental changes [35].

Table 1 Taxonomic affiliation of cloned parA-inserted genes amplified from mangrove, island, and bulk soils

The taxonomic affiliation of most of these sequences indicated that their potential hosts were, as expected, members of the Gammaproteobacteria (bands 1, 3, 4, and 6). Remarkably, other putative hosts, belonging to the Deltaproteobacteria (bands 2 and 5), were also indicated (Table 1). Overall, these taxonomic affiliations matched expectations, in accordance with the description of the most prominent microbial groups in mangrove sediments [26, 36].

With respect to the salt marsh microbiome-derived patterns, dominant band 7 was most pronounced in the 65- and 105-year samples (Table 1). This sequence, affiliated with gene regions found in Betaproteobacteria, revealed high identity with that of the Ralstonia solanacearum narG (partial) gene, which encodes the alpha subunit of nitrate reductase, an enzyme involved in the conversion of nitrate into nitrite. The DNRA (dissimilatory nitrate reduction to ammonium) process, represented by narG, was described as a prominent process of nitrate reduction in salt marsh habitats [37]. It is remarkable that such a key gene was found in the broad-host-range mobilome represented by PromA plasmids.

Analysis of the mycosphere and bulk soil-derived patterns revealed the existence of three dominant bands (one from mycosphere and two from bulk soil) (Fig. 2c). The mycosphere-derived sequence revealed high similarity with a putative dehydrogenase (gene Thioredoxin_like) from an uncultured bacterium (band 8). The bulk soil-derived sequences were closely affiliated with an esterase-encoding gene (gene Aes) (band 9) and a gene encoding a subunit of an endo-1,4-beta-xylanase (gene XynA) (band 10). Interestingly, both genes (esterase and xylanase) are associated with the degradation of organic matter derived from plants (lignocellulose), which is routinely found in high concentrations in soils [38]. As expected, these sequences were affiliated with those of Alphaproteobacteria (band 9) and Gammaproteobacteria (band 10), and were detected as complete genes; see Table 1.

In conclusion, the extent to which horizontally acting gene transfer agents provide adaptive value to members of natural microbiomes is an understudied area of research. Clearly, the PromA plasmid group allows high-frequency transfers across a wide range of hosts. The parA gene region appeared as a propitious insertional region. We here successfully established a specific parA-based PCR assay that facilitates the examination of microbiomes for gene insertions into this region. A remarkable habitat specificity was found on the basis of the patterns reporting on the parA insertions, indicating that each habitat, as driven by the specific local conditions, had distinct dominant insertional patterns. Moreover, diverse parA-borne traits were found per habitat, which suggests that the putative ecological drivers of these are different and habitat-specific.