Introduction

Microorganisms are considered the most diverse and abundant organisms on Earth [1]. Progress has been made in understanding the distribution of bacteria across several biogeographical scales, but questions remain about the factors important to community structure and diversity [2]. For example, soil properties such as pH, carbon, and nitrogen have been shown to have strong associations with bacterial community structure, but fewer studies have attempted to compare biogeographically separated and climatically different soils that have comparable parent materials [3, 4]. Most studies of microbial diversity have focused on a particular location, soil type, or landscape and have not addressed microbial community assembly across spatial scales [2, 5]. Microbial biogeography is an emerging field of microbial ecology that studies the distribution of microbial diversity across space and time. It aims not only to reveal the relationship between microorganisms and their habitat but also to determine the environmental factors that select or maintain these organisms in those habitats [5, 6]. Studying such relationships would provide more insight into the relative influence of the environmental and evolutionary changes that determine the structure of microbial communities [7, 8].

There have been varying views about microbial biogeography over the past decade. The ubiquity model proposed by Beijerinck [6] was considered a paradigm of microbial biogeography until the advent of modern molecular techniques. According to this theory, microorganisms have a cosmopolitan distribution because of their small size, abundance, and high dispersal rate. In essence, any bacterial species can be found anywhere in the world. This view is supported even in recent literature [9], where the authors argue that any organism smaller than 1 mm is likely to be present everywhere because of its unlimited capability for long-distance dispersal, whereas larger organisms have constrained distributions. The underlying assumption in their argument is that, because of their higher local abundance, microbes can successfully colonize any remote location by chance [5, 9, 10]. In contrast to the cosmopolitan theory, some researchers proposed the endemicity model. According to this model, an endemic taxon is restricted to a particular region or habitat type, and thus its distribution across the landscape exhibits a nonrandom relationship. Although it has been mentioned in the literature [5, 11], this model is the least studied. Most of the studies supporting this theory rely on limited dispersal and local adaptation. For example, extremophiles (microbes that inhabit extreme habitats) and obligate symbionts (such as ectomycorrhizal fungi associated with tree species) have much less potential for universal dispersal than non-extremophiles and non-symbionts [12, 13]. A conservative middle-ground model supports the idea that microbes can exist ubiquitously as long as the niche is suitable for their survival. The basic hypothesis of the studies that focused on this idea was based on Baas Becking’s statement “everything is everywhere, but the environment selects” [7]. According to this idea, habitats that have similar environmental and physical conditions would support similar microbial communities.

Theories that have been established in macro-ecology have recently been tested in microbial systems, with the aim of describing the factors that determine population abundance patterns [5, 14]. These factors are often divided into two major groups: environmental and spatial. Although these groups can co-vary and care must be taken during interpretation, both have been shown to affect bacterial community structure [4, 5, 15, 16].

In this study, bacterial community composition and diversity were described across a series of dune chronosequences developing from similar parent materials but under varying climatic conditions. The goal was to compare the environmental soil gradients created by pedogenesis in two climatically different (subtropical and cool-temperate) locations in the continental US. Habitat filtering along the pedogenic gradients was hypothesized to be a major determinant of bacterial community change (assessed using 16S rRNA gene sequences), with differences between locations helping to explain some of the biogeographic differences related to factors such as climate and vegetation. We hypothesized that the geographic distribution of bacterial communities would be more closely related to local environmental variation (soil physico-chemical characteristics) than to the physical distance between the soils.

Materials and Methods

Site Description

Two sites were chosen for this study: one in Wilderness Park, Emmet County, in the lower peninsula of Michigan (MI), and one in the Altamaha and Ohoopee river valleys of southeast Georgia (GA). These sites were selected because they are sandy, with similar parent materials in different climate zones (Table 1). At the MI study site, a series of approximately 108 eolian beach-dune ridges runs parallel to Lake Michigan. The parent materials, with depositional ages ranging from present day to approximately 4500 years [17], were derived from glacial deposits and the Paleozoic bedrock underlying the lake basin. The soil type is fine sand dominated by quartz but containing numerous other minerals in minor quantities. The chronology of the dunes was estimated using accelerator mass spectrometry (AMS) radiocarbon dating of the macrofossil remains from each dune [17]. The ridges are approximately 2.5 km long and 10–30 m wide, and vary between 3 and 5 m in height along the shore, reaching 15 m inland [18]. A set of nine differentially aged dunes (105, 155, 210, 450, 845, 1475, 2385, 3210, and 4010 years) was sampled. A previous study [18] has shown patterns of primary succession, from grasses and shrubs on younger dunes to mixed coniferous forests dominating the older dunes.

Table 1 Description of sampling sites

The GA dunes are located in the Altamaha and Ohoopee river valleys of southeast Georgia. Chains of eolian-deposited parabolic dunes ranging from 3.7 to 14.0 m high at the crests are found ∼6 km from the modern river channel [19]. They are formed from a well-drained sandy parent material and are surrounded by coastal plain lowlands that are more than 165,000 years old. The chronology of the dunes was estimated using the optically stimulated luminescence (OSL) procedure [20]. The GA dunes are eolian, blown into ridges from fluvial deposits of the Altamaha river valley, and were stabilized once they were dominated by vegetation [20]. A set of four differentially aged dunes (21, 38, 45, and 77 K; K = 1000 years) was sampled. The oldest soils were deposited during the Pleistocene interglacial period and were assigned an age of 500 K based on their geology and proximity to the present coastline. These deposits are sedimentary but represent sandy marine deposits that are much older formations than the dunes.

Soil and Vegetative Sampling

Five replicate soil samples were taken at 20-ft intervals along a 100-ft transect on the crest of each dune. From each sampling spot, 5–6 cores (0–10 cm depth, 5-cm diameter) were collected from the zone of dominant root activity (A-horizon) using a stainless steel soil corer. After sampling, soil from other zones was carefully removed, and A-horizon soil from the same plot was homogenized and quickly transferred to a Whirlpak® bag. The sample bags were frozen immediately in a cooler filled with dry ice. Upon arrival in the laboratory, soils were thawed for 25–30 min and homogenized through a 4-mm sieve; extraneous roots and organic materials were removed, and the soils were stored at −80 °C. Plant sampling was conducted on the selected dune ridges along the chronosequence by measuring the plant species composition, tree density, and percentage canopy cover across a 5 × 20 m strip (100 m²). The tree species composition was measured by counting the number of tree species within the sampling area, and understory species cover was measured at five random spots within the sampling area using a 1-m² quadrat. The tree species canopy cover was estimated by fitting the diameter at breast height (dbh) measurement into a conifer crown radius model [21].

Soil Characteristics

Soil organic matter content was estimated by measuring the mass difference before and after ignition (560 °C). Mineralizable C was estimated by measuring the cumulative CO2-C produced from 100 g of soil during a 1-month incubation (20 °C). Soil pH was measured in a 1:2 mixture of soil and 0.01 M CaCl2. Soil extractable cations were analyzed according to the Mehlich-3 extraction protocol [22].

DNA Extraction and Pyrosequencing

Total DNA was extracted from 0.5 g of soil from each soil sample using the ZR Soil-Microbe DNA™ kit (Zymo Research). After extraction and purification, the DNA was inventoried and stored at −80 °C. Small subunit bacterial rRNA gene fragments were amplified to the appropriate size from an amount of DNA equivalent to that found in 0.5 g of soil (varied with each sample) using 515R-M (5′-CCGCNGCKGCTGGCAC-3′) [23] and the sevenfold-degenerate primer 27F-YM + 3 [24]. The primers were synthesized so that the A and B sequencing adaptors (454 Life Sciences FLX) were immediately upstream of the 515R-M and 27F primer sequences, respectively. In addition, an 8-nt sample-specific barcode tag was inserted between the A-adaptor and primer 515R-M to allow multiplexing and the bioinformatic separation of each sample's sequences after sequencing. Each 25-μL PCR consisted of 12.5 pmol each of the forward and reverse primers, 1.25 μL of template DNA, and 22.5 μL of Platinum PCR SuperMix (Invitrogen). Samples were initially denatured at 95 °C for 3 min, then amplified using 20 or 30 cycles of denaturation at 94 °C for 30 s, annealing at 50 °C for 30 s, and extension at 72 °C for 1 min. A final extension of 4 min at 68 °C was added at the end of the program to ensure complete amplification of the target region on a Veriti® 96-well Thermal Cycler (Applied Biosystems). The PCR products were run on a 1% agarose gel and image quantified on a Typhoon Trio+ Variable Mode Imager (GE Healthcare) using Image Quant 5.2 (Molecular Dynamics). The PCR products from the 5 replicates of each soil age were then pooled at equimolar concentrations and gel eluted using the Zymoclean™ Gel DNA Recovery Kit (Zymo Research). The amplicons were then quantified on the Experion System (Bio-Rad), and a composite sample for pyrosequencing was prepared by pooling equal amounts of PCR amplicons from each soil age. The final mixed amplicon pool was further purified using the Agencourt AMPure XP system (Beckman Coulter Genomics) and submitted to the Environmental Genomics Core Facility at the University of South Carolina for pyrosequencing on a 454 Life Sciences Genome Sequencer FLX (Roche) using a standard protocol [25]. Pyrosequencing generated 151,794 quality short-read bacterial 16S rRNA sequences from 75 (14 dunes × 5 replications) samples across both sites. The gene fragments averaged approximately 530 bp (base pairs) in length. Sequences generated in this study have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession number SRP091769.
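As an illustration of the primer design and barcoding described above, the following minimal Python sketch expands the degenerate 515R-M primer into the concrete sequences it encodes and splits a read into its 8-nt barcode and remainder. The IUPAC table is standard; the function names, the barcode sequence, and the sample label `MI_105` are hypothetical, and the sketch assumes that adaptor trimming has already left the barcode at the start of the read. It is not the demultiplexing procedure used by QIIME.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def split_barcode(read, barcodes, barcode_len=8):
    """Return (sample, remainder) if the read starts with a known barcode,
    otherwise (None, read). `barcodes` maps barcode string -> sample name."""
    sample = barcodes.get(read[:barcode_len])
    if sample is None:
        return None, read
    return sample, read[barcode_len:]


# 515R-M from the text: one N (4 options) and one K (2 options).
variants = expand_degenerate("CCGCNGCKGCTGGCAC")

# Hypothetical barcode-to-sample map, for illustration only.
example_barcodes = {"ACGTACGT": "MI_105"}
sample, remainder = split_barcode("ACGTACGTCCGCAGCG", example_barcodes)
```

The N and K positions give 4 × 2 = 8 concrete primer sequences, which is how the degeneracy of a primer is counted.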

Processing of 16S rRNA Gene Data

A two-step pipeline was established to analyze the 16S rRNA gene sequence data. Quantitative Insights into Microbial Ecology (QIIME) was used to quality trim the raw sequences, remove primers and chimeras, and sort the reads based on their barcodes [26, 27]. The denoised data were then passed through MOTHUR v1.22.0 [28], a software package for describing and comparing microbial communities. To facilitate downstream analysis of the large sequence datasets, identical or artificially duplicated sequences, which can constitute a significant fraction of the dataset, were removed [29, 30]. The non-redundant sequence dataset was then aligned against the SILVA reference dataset (http://www.arb-silva.de/) [31]. A distance matrix of pairwise distances between all aligned sequences was generated in MOTHUR, and sequence reads were assigned (clustered) to Operational Taxonomic Units (OTUs) using the average-neighborhood algorithm at an evolutionary distance of D = 0.03, which grouped sequence reads that had 97 % sequence similarity.
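The clustering step can be sketched as follows. This is a minimal, unoptimized Python illustration of average-linkage (average-neighbor) agglomerative clustering stopped at a distance cutoff of 0.03; it is not MOTHUR's implementation, and the function name and the toy distance matrix are assumptions made for the example.

```python
def average_neighbor_clusters(dist, cutoff=0.03):
    """Agglomerative average-linkage clustering stopped at `cutoff`.

    dist: symmetric list-of-lists of pairwise sequence distances.
    Returns a sorted list of clusters, each a sorted list of sequence indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def avg_distance(a, b):
        # Mean of all pairwise distances between members of clusters a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > cutoff:
            break  # the closest pair of clusters is already too far apart
        _, x, y = best
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy matrix (hypothetical): sequences 0-2 are within 3 % of each other,
# sequence 3 is 10 % away from all of them.
toy = [
    [0.00, 0.01, 0.02, 0.10],
    [0.01, 0.00, 0.03, 0.10],
    [0.02, 0.03, 0.00, 0.10],
    [0.10, 0.10, 0.10, 0.00],
]
otus = average_neighbor_clusters(toy, cutoff=0.03)
```

In the toy matrix, sequences 0 and 1 merge first (distance 0.01), sequence 2 then joins at an average distance of 0.025, and sequence 3 remains its own OTU because its average distance to the merged cluster exceeds the cutoff.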

The alpha-diversity estimates (rarefaction, richness, evenness, Shannon index, Simpson’s reciprocal index, and Chao1) were calculated on OTUs at an evolutionary distance of D = 0.03. This level of DNA sequence similarity is typically used to assign sequences to the same taxa [28]. Finally, taxonomy was assigned to representative sequences from the OTUs using the SILVA reference taxonomy [28].
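For readers unfamiliar with these indices, a minimal Python sketch of how they can be computed from a vector of per-OTU read counts is given below. The formulas are common textbook forms (a finite-sample Simpson index and the bias-corrected Chao1) and are assumed here for illustration; they may differ in detail from the variants implemented in MOTHUR, and the function names are hypothetical.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over observed OTUs."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def simpson_reciprocal(counts):
    """Reciprocal Simpson index 1/D, with D = sum n_i(n_i-1) / (N(N-1))."""
    n = sum(counts)
    d = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return float("inf") if d == 0 else 1.0 / d


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1(F1-1) / (2(F2+1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def pielou_evenness(counts):
    """Pielou's evenness J = H' / ln(S_obs); defined for S_obs > 1."""
    s_obs = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(s_obs)


# Hypothetical OTU counts for one sample.
example = [5, 3, 1, 1, 2]
observed_fraction = sum(1 for c in example if c > 0) / chao1(example)
```

The ratio of observed OTUs to the Chao1 estimate (`observed_fraction`) is the kind of quantity used to judge how completely a community has been sampled.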

The 16S rRNA gene sequences averaged 530 bp in length and formed 15,131 OTUs in total at 97% sequence similarity (D = 0.03). Each soil age was represented by between 4779 and 11,248 sequences forming 537 to 2352 OTUs per soil age. The most abundant OTU was represented by 19,258 sequences, accounting for ∼13% of the entire sequence data set. The top 10 and top 100 OTUs represented 48 and ∼93 % of the entire sequence data set, respectively. Thus, the 100 most dominant OTUs, those with at least 0.05 % average abundance across all samples, were used for the ordination techniques. This was done to minimize the stress (a measure of poorness of fit between the ordination and measured ecological distances) that arose in distance-based ordination methods when the entire dataset with all OTUs was used.

Statistical and Multivariate Analyses

To visualize the differences in microbial community assemblages related to environmental factors across the pedogenic gradient, 16S rRNA gene-derived OTUs were analyzed using Bray-Curtis ordination and Mantel tests. PC-ORD software (MJM Software, Gleneden Beach, OR, USA) and the Sorensen’s metric were used to obtain the n-dimensional ordination, with Monte Carlo tests (1000 randomized runs) to determine the occurrence of significant differences between groups [32]. Mantel tests were used to determine correlations between bacterial community composition, site vegetation, and soil variables. Partial Mantel tests were performed to examine the correlation between the degree of similarity in bacterial community composition and the environmental and spatial variables, correlating the community with one group of variables while holding the remaining (third) matrix constant [15]. Correlations between soil and site variables were assessed using log-linear correlations and Pearson coefficients for multiple comparisons. The effect of site pedogenic age on the distribution of OTUs was assessed by using Bray-Curtis-derived ordination scores and testing statistical significance with a two-way general linear model (GLM; SAS version 9.2; SAS Institute Inc., Cary, NC, USA). Multiple mean comparisons were done using Fisher’s least significant difference (LSD) in SAS. Hierarchical cluster analysis of the most abundant OTUs was done using the PC-ORD software [33]. Factors that significantly influenced community composition were used to construct a soil variables matrix for variance partitioning analysis in CANOCO (ver 4.5) [34]. The significance of the correlations with each factor was evaluated through the Monte Carlo permutation test by applying 998 permutations.
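The two building blocks of these analyses, a Bray-Curtis (Sorensen) dissimilarity matrix and a simple Mantel test, can be sketched in plain Python as follows. This is an illustrative sketch only: it does not reproduce the PC-ORD ordination itself, the permutation test is one-sided, and all function names and the toy data are assumptions.

```python
import math
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors
    (0 = identical composition, 1 = no shared OTUs)."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(rows, metric):
    """Square matrix of metric(row_i, row_j) for all sample pairs."""
    n = len(rows)
    return [[metric(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mantel(a, b, permutations=999, seed=0):
    """Simple Mantel test: correlate two distance matrices and assess
    significance by permuting the rows and columns of `a` together."""
    rng = random.Random(seed)
    n = len(a)
    b_flat = upper_triangle(b)
    observed = pearson(upper_triangle(a), b_flat)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        shuffled = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(shuffled), b_flat) >= observed:
            hits += 1
    # One-sided p value; the observed statistic counts as one permutation.
    return observed, (hits + 1) / (permutations + 1)


# Toy data (hypothetical): OTU counts for five samples and one soil variable.
otu_counts = [[30, 5, 0], [25, 8, 2], [10, 15, 10], [5, 10, 20], [0, 5, 30]]
soil_ph = [7.6, 7.1, 5.5, 4.2, 3.5]
community = distance_matrix(otu_counts, bray_curtis)
environment = [[abs(p - q) for q in soil_ph] for p in soil_ph]
r_m, p_value = mantel(community, environment, permutations=999)
```

Permuting rows and columns together, rather than shuffling individual matrix cells, preserves the dependence among distances that share a sample, which is the reason a Mantel test is used instead of an ordinary correlation test.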

Results

Plant Community Succession Across the Chronosequences

At the MI site, changes in percentage cover were investigated for 13 plant species (including herbs, shrubs, and hardwood trees) whose cover changed considerably across the chronosequence. The change in plant species abundance was greater at the youngest sites than at the older sites (Table 2). Dune-building plant species were replaced by evergreen shrubs, which were later replaced by mixed pine forests. This shift from early-succession to late-succession species occurred at around 450 years of soil development (Table 2). The early-succession species started disappearing and the mixed pine forest started establishing as the soil ecosystem developed.

Table 2 Total percent ground and canopy cover of different plant species across the Michigan chronosequences

The GA dunes were dominated by turkey oak, sand live oak, and longleaf pine in the canopy, while sparkleberry, lichens, and mosses covered the dune ground. Vegetation along the GA chronosequence did not show the primary succession pattern observed across the MI chronosequence. However, the oldest site (500 K) had a different vegetation composition than the other ages dominated by oaks and holly (Table 3).

Table 3 Total percent ground and canopy cover of different plant species across the Georgia chronosequences

Change in Bacterial Community Structure and Ecosystem Properties

The distribution of the 100 most abundant OTUs, indicative of bacterial community change, showed patterns related to ecosystem development and to site characteristics, likely reflecting conditions in MI and GA (Fig. 1). Axis 1 explained 78% and axis 2 explained 6% of the variation in bacterial community composition (Fig. 1). The spread of the data was dominated by axis 1, which indicated that variation associated with age and pedogenesis within each site (MI and GA) explained a greater proportion of the variance than differences between locations. Several soil characteristics showed significant log-linear correlations with soil development (Table 4). Except at the oldest site (500 K), Ca and Mg showed a decreasing trend across the developing soil chronosequences, which is characteristic of age-related weathering. This effect was also reflected in soil pH, which declined from near-neutral (7.6) to acidic (3.5) as the soils aged and weathered. Carbon and nitrogen levels showed an increasing trend, with the oldest site showing the highest accumulation, an occurrence typical of developing soil ecosystems (Table 4). A hierarchical cluster analysis of the microbial community composition broadly grouped the bacterial community (OTUs) into two major clusters comprising the younger dunes and the older dunes. The older dunes were further grouped by location (Fig. 2). This pattern was similar to the vegetation distribution across the older sites (at both MI and GA), which were dominated by mixed-pine forests (Tables 2 and 3).

Fig. 1

Bray–Curtis ordination plot showing relationship between soil ecosystem development and bacterial community composition and structure at both the sites (Michigan and Georgia). Notes: 100 most abundant OTUs were used for the ordination. OTUs were formed using average neighborhood algorithm in MOTHUR at evolutionary distance of 0.03 (97% sequence similarity). Error bars represent standard error (n = 5)

Table 4 Mehlich-3 extractable soil cations and selected soil properties from the mineral soil across the chronosequences
Fig. 2

Hierarchical cluster analysis of the 100 most abundant bacterial OTUs. Notes: Cluster analysis was done using Ward’s method and relative Euclidean distance. The distance axis represents the similarity index

To understand the relative importance of the environmental variables, a canonical correspondence analysis (CCA) was performed on those variables that were shown to influence the bacterial community composition. This technique has been shown to be useful for identifying the best predictors of soil microbial communities [35]. Soil Ca, Mg, pH, total carbon (%), and total nitrogen (%) were chosen for the ordination because they had a significant Pearson correlation with the bacterial community distribution (see Table S1 for details). CCA significantly explained 70% of the OTU–environment relationship across the first two canonical axes. Axis 1 (62 %) explained more of the age-related pattern, while axis 2 (8 %) structured the community based on location (Fig. 3). The joint plot shows that CCA grouped the ages similarly to the other ordination techniques used in the study. Ca, Mg, and pH showed a significant correlation along canonical axis 1 (Monte Carlo test of significance, p = 0.001), while % C and N showed a significant correlation along canonical axis 2 (Monte Carlo test of significance, p = 0.001). Ca, Mg, and pH showed positive correlations along axis 1 (r = 0.81, r = 0.85, and r = 0.98, respectively), implying that they significantly decreased as the soil aged, whereas % C and N were negatively correlated along axis 2 (r = −0.71 and r = −0.56, respectively), increasing significantly as the soil aged.

Fig. 3

Canonical correspondence analysis of the relationship between physical and chemical parameters and bacterial relative abundances (OTU intensities). The projected length of the vectors centered in the panel on the two axes represents the strength of the soil factors. Notes: All environmental variables were significant (P < 0.05, Monte Carlo test of significance). Error bars represent standard error (n = 5)

Phylogenetic Affiliation of the 16S rRNA Genes

At the broad phylum-class level of classification, the most abundant bacterial phyla across the chronosequences were Actinobacteria, Proteobacteria (including α, β, and γ), and Acidobacteria, which together covered 93 % of all the sequences, with individual contributions of 42, 38, and 13%, respectively (Fig. 4). The remaining phyla present within the libraries, such as Bacteroidetes, Cyanobacteria, Firmicutes, and Planctomycetes, together comprised less than 7% of the sequences. None of the bacterial phyla showed a log-linear trend across the age gradient except Bacteroidetes, which showed a decreasing log-linear relationship, with higher abundance in younger than in older soils (Fig. 5). Although not statistically significant, Acidobacteria showed an increasing abundance in older soils (low pH) compared to young soils (high pH).

Fig. 4

Relationship between the relative abundance of bacterial phyla across Michigan and Georgia chronosequences

Fig. 5

Relationship between percentage relative abundance of nine individual bacterial phyla across both chronosequences during ecosystem development (105 to 500,000 years). Notes: Each point in the graph is the average (n = 5) of the percentage abundance of each phylum at each stage of development. Regression coefficient and p value for each phylum are shown

Bacterial Diversity

To calculate diversity indices (using D = 0.03), the number of sequences per sample was normalized to 4779 by randomly subsampling sequences using QIIME scripts. This was done to avoid the possible influence of sample size on diversity estimates, and the normalized subsets were used for further diversity measurements [36]. Based on the Shannon and Simpson’s reciprocal indices, bacterial diversity tended to decrease considerably across the chronosequence (Table 5). Diversity declined from 205 to 47 (Simpson’s 1/D) over 105 to 500,000 years of soil development. The Chao1 richness estimates showed that only 29–57% of the OTUs predicted by this estimator were actually observed, indicating that the diversity was not completely sampled at an evolutionary distance of 0.03. Estimates of bacterial diversity were much greater in MI (numbers of OTU, ACE, and Chao1) and remained 2–3× greater in MI than GA after (Table 5).
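The normalization step can be illustrated with a short Python sketch that draws a fixed number of reads without replacement from a sample's OTU counts. This is a generic rarefaction sketch under the assumption of simple random subsampling, not the QIIME script used in the study; the function name and the toy counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement.

    otu_counts: mapping of OTU identifier -> read count for one sample.
    Returns a Counter of subsampled read counts per OTU.
    """
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    # Expand the counts into one entry per read, then sample reads uniformly.
    pool = [otu for otu, count in otu_counts.items() for _ in range(count)]
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


# Hypothetical sample; in the study every sample was reduced to 4779 reads,
# the size of the smallest library.
sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20}
subsample = rarefy(sample, depth=40, seed=1)
```

Because rare OTUs are the most likely to be lost when reads are discarded, richness estimates computed after subsampling are comparable across samples but lower than those from the full libraries.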

Table 5 Diversity indices of the 16S rRNA gene sequences

Bacterial Biogeography

A Mantel test showed that there was no significant correlation between bacterial community dissimilarity and geographic distance (Table 6). The standardized Mantel statistic (rM) was not significant at the 95% confidence level using 999 permutations (rM = 0.13, P = 0.35) and did not support a pattern of community change related to distance. Because the distance factor is strongly weighted by many values below 50 km and those above 1700 km, a Mantel correlogram was developed to plot autocorrelation as a function of geographic distance [37, 38], with distance classes defined as class 1 (0–1 km), class 2 (1–50 km), and class 3 (50–1700 km). For each class, an n × n matrix was constructed containing zeroes for site pairs whose geographic distances fell within the class and ones for pairs that did not. Then, for each distance class matrix, a simple Mantel test was performed between the bacterial community distribution and the distance class matrix (Fig. 6). For the first two distance classes, no significant spatial autocorrelation was observed in the distribution pattern of bacterial community structure (p = 0.96 for class 1 and p = 0.25 for class 2) (Fig. 6). However, for the third distance class (50–1700 km), there was a weak but significant negative correlation (rM = −0.19, p < 0.001). A simple Mantel test also showed a significant correlation between bacterial community distribution and soil physico-chemical characteristics (rM = 0.59, p < 0.001) (Table 6). However, controlling for covarying variables is an important challenge in microbial biogeography studies when one tries to make meaningful comparisons between geographic distance and genetic differences, as they may make the comparisons more complex [39]. We therefore performed partial Mantel tests on our distance matrices. After controlling for geographic distance, a partial Mantel test showed that there was still a highly significant correlation between bacterial community distribution and soil physico-chemical characteristics (rM = 0.84, p = 0.001) (Table 6). However, when the soil physico-chemical characteristics were included as a control matrix, there was no significant correlation between bacterial community distribution and geographic distance (rM = 0.02, p = 0.41) (Table 6).
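The distance-class matrices and the partial Mantel statistic described above can be sketched in Python as follows. The partial statistic uses the standard first-order partial correlation, r(AB.C) = (rAB − rAC·rBC) / sqrt((1 − rAC²)(1 − rBC²)); the treatment of class boundaries (lower bound inclusive, upper bound exclusive), the one-sided permutation scheme, and all function names are assumptions made for this illustration rather than details taken from the software used in the study.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def class_matrix(geo, low, high):
    """0 for site pairs with low <= distance < high, 1 otherwise."""
    n = len(geo)
    return [[0 if low <= geo[i][j] < high else 1 for j in range(n)]
            for i in range(n)]


def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    denom = math.sqrt(max(0.0, 1 - rxz ** 2) * max(0.0, 1 - ryz ** 2))
    if denom == 0:
        return 0.0
    return (rxy - rxz * ryz) / denom


def partial_mantel(a, b, c, permutations=999, seed=0):
    """Partial Mantel test of a vs. b controlling for c (one-sided),
    permuting the rows and columns of `a` together."""
    rng = random.Random(seed)
    n = len(a)
    b_flat, c_flat = upper_triangle(b), upper_triangle(c)
    observed = partial_r(upper_triangle(a), b_flat, c_flat)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        shuffled = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if partial_r(upper_triangle(shuffled), b_flat, c_flat) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical geographic distances (km) among three sites.
geo_km = [[0, 0.5, 100], [0.5, 0, 100], [100, 100, 0]]
class_1 = class_matrix(geo_km, 0, 1)   # pairs 0-1 km apart are coded 0
```

Each class matrix can then be passed to a simple Mantel test against the community dissimilarity matrix, giving one point of the correlogram per distance class.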

Table 6 Mantel test correlations between the bacterial community distribution and selected environmental characteristics
Fig. 6

Mantel correlogram for the spatial autocorrelation analysis of bacterial OTU distribution across all ages. Notes: Standard Mantel statistics (rM) are plotted against the distance classes. Closed symbols represent significant autocorrelation at 95 % confidence level. Distance classes: Class 1 (0–1 km). Class 2 (1–50 km). Class 3 (50–1700 km)

Relative Contribution of Temporal and Environmental Filters on Bacterial Community Structure

Variance partitioning analysis (VPA) was performed to quantify the relative contributions of geographic distance and soil parameters to the taxonomic structure of the bacterial communities. A subset of environmental parameters (Ca, Mg, pH, total carbon (%), and total nitrogen (%)) that had the highest Pearson correlation with the bacterial communities was selected by the CCA procedure. The combination of selected soil characteristics and geographic distance showed a significant (p = 0.002) correlation with the bacterial community structure. Together these factors explained 50.1% of the observed variation, leaving 49.9% unexplained. The soil factors explained 38.2% (p = 0.003) and geographic distance alone explained 11.9% (p = 0.029) of the variation, and no interaction effect was detected (Fig. 7). Although the soil properties together explained more of the variation, geographic distance by itself explained 11.9% of the variation observed, more than any of the five individual soil variables (Fig. 7).
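The arithmetic behind a two-set variance partition can be sketched as follows. The subtraction scheme is the standard one for two explanatory sets; treating the reported 38.2% and 11.9% as the fractions explained by each set on its own is an assumption made for this illustration (with no overlap detected, these also equal the unique fractions), and the function name is hypothetical.

```python
def partition_variance(explained_soil, explained_geo, explained_both):
    """Split explained variation (in percent) into unique and shared parts.

    explained_soil: variation explained by the soil variables alone
    explained_geo:  variation explained by geographic distance alone
    explained_both: variation explained by both sets fitted together
    """
    return {
        "soil_only": explained_both - explained_geo,
        "geo_only": explained_both - explained_soil,
        "shared": explained_soil + explained_geo - explained_both,
        "unexplained": 100.0 - explained_both,
    }


# Values reported in the text (percent of total variation).
parts = partition_variance(38.2, 11.9, 50.1)
rounded = {name: round(value, 1) for name, value in parts.items()}
```

With these inputs the shared fraction is zero (to rounding), consistent with the statement that no interaction effect was detected, and the unexplained fraction is 49.9%.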

Fig. 7

Variance partition analysis of the effects of geographic distance and soil variables on the bacterial community structure

Discussion

The broad goal of this experiment was to characterize patterns of bacterial community composition and diversity during soil pedogenesis in two different climate zones. Parent materials in the two locations were similar, but the stages (and ages) of ecosystem development differed, with the site in MI representing an aggrading system and that in GA representing a mature mid-succession and possibly retrogressive system. An aggrading ecosystem is, by definition, one that gradually accumulates living biomass, dead wood, soil fertility, and physical and biological complexity. Essentially, the ecosystem builds biomass over time during the developmental period through primary succession [40, 41]. A retrogressive ecosystem, on the other hand, is characterized by reductions in ecosystem productivity and plant biomass, accompanied by shifts in aboveground and belowground communities toward dominance by stress-tolerant species. These ecosystems are often associated with plant communities reaching a stable and self-replacing climax [41, 42]. One of the important aspects of our study is that it represents the first large-scale molecular survey of soil bacterial 16S rRNA gene β diversity across naturally occurring sand chronosequences. Although β diversity estimates (between-sample variation) are considered important for the overall understanding of soil microbial community dynamics [8, 43], they have traditionally been less studied than α diversity (diversity within a single sample). Pedogenesis is the driver of soil change and is affected by parent material, climate, organisms such as plants and microbes [44], and the length of time over which these processes have occurred. As shown previously, bacterial community change is sensitive to environmental gradients [45, 46], especially soil pH [47]. There are, however, a number of other insights, with community change reflecting differences between the two locations.

The absolute diversity of taxa was shown to be underestimated; however, comparing similarly sized libraries provides confidence when making comparisons between soils. The soils in MI were estimated to contain up to 2500 different OTUs, while those in GA contained far fewer, between 500 and 600 OTUs. Even when pH and soil properties were considered, the number of OTUs remained ∼2× greater (∼1000) in the soils of MI. Greater evenness and 3× greater richness (ACE, Chao1) support the conclusion that there is greater diversity in the bacterial communities in the soils of MI.

Bacterial species loss and turnover could also affect the diversity of bacterial taxa in a soil. Weathering rates and temperature related to climate are considerably greater in southern GA than northern-lower MI. Turnover and decomposition of organic matter is greater in GA and therefore likely supports a lower standing microbial biomass. This was supported by the CCA analysis (Fig. 3) which showed that carbon and nitrogen were important drivers of community differences between the two locations [48] . In GA dunes, high summer temperature, low organic matter at the surface to modulate temperature swings, and the feast and famine of rainfall might create some of the most extreme environmental swings for bacteria existing in the top 10 cm of soil. Many studies have shown that alternate wetting and drying of soil in sub-tropical conditions may have a direct influence on microbial community structure and diversity [49, 50]. In MI, low temperatures might be the biologically most difficult variable to adapt; however, large swings are modulated to some extent by the regional climatic effects of Lake Michigan. We also attribute pH to be another important filter that shaped microbial community diversity. The chronosequences showed changes in soil properties, depicting the classic patterns of soil podzolization during ecosystem development [51] . During 450 years of soil development, carbonate mineral started weathering and large quantities of Ca and Mg started leaching from the system as the pH of the upper mineral soil decreased from 7.6 to 3.6. As the dunes got older, decomposing coniferous litter material would have helped in hastening the mineral weathering process by production of organic and carbonic acids [51, 52]. At phylum level classification, the changes in relative abundances of specific taxonomic groups across the chronosequences pH gradient are similar to the pH responses observed in other studies. 
For instance, the relative abundance of Acidobacteria has been shown to increase towards lower pH [3, 53, 54]. Consistent with those results, our results show that the relative abundance of Acidobacteria changed from lower abundance in near-neutral conditions to higher abundance in acidic conditions (Fig. 5). Dunes that were near neutral showed higher bacterial community diversity than the older, more acidic dunes, which supported less bacterial diversity. Therefore, it appears that the established pH–diversity relationship [47, 54] observed here and elsewhere could be driven by the dominance of a few taxonomic groups in low pH soils.

When we examined the effect of dispersal limitation on microbial community structure, community similarity was not significantly correlated with geographic distance (rM = 0.13, p = 0.35) when the distance matrix included all distances. Thus, the Mantel test showed no significant isolation by distance at the 95 % confidence level. However, when the distance variance was partitioned into classes, there was a marginal correlation at a spatial scale of 50–1700 km. Our results showed that, at this scale, community dissimilarity had a moderate negative correlation with geographic distance (Fig. 6). However, the partial correlation between community distances and environmental conditions, with geographic distance controlled, was stronger (rM = 0.84, p = 0.001). This was higher than the correlation with soil physico-chemical factors when the effect of geographic distance was not controlled (rM = 0.59, p < 0.001), indicating an increase in genetic dissimilarities once the geographic isolation effect was removed. Thus, the results show some evidence of dispersal limitation in shaping the community patterns, but its effects are difficult to distinguish from those of environmental heterogeneity [37, 55].
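To make the logic of the Mantel test concrete, the sketch below correlates the upper triangles of two distance matrices and assesses significance by jointly permuting the rows and columns of one matrix. It is a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the one-sided permutation p-value, and the toy matrices are all hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(community, geographic, permutations=999, seed=0):
    """Return (rM, one-sided permutation p-value) for two square,
    symmetric distance matrices over the same set of sites."""
    identity = list(range(len(community)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(community, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the sites of one matrix only
        if pearson(upper_triangle(community, order), geo) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: four sites on a line (positions are illustrative only),
# with community distance exactly proportional to geographic distance.
positions = [0.0, 1.0, 3.0, 7.0]
geo_dist = [[abs(a - b) for b in positions] for a in positions]
comm_dist = [[0.1 * d for d in row] for row in geo_dist]
r_m, p_value = mantel(comm_dist, geo_dist)
```

Permuting whole sites, rather than individual matrix entries, preserves the dependence among distances that share a site. A partial Mantel test, such as the one reported above, would replace the plain correlation with a partial correlation that controls for a third matrix; that step is not shown here.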

Thus, the results suggest that local environmental conditions could have a stronger effect than geographic distance, and could be a major contributor in shaping the bacterial community structure at smaller scales [5, 56]. Previous studies at smaller geographic scales have shown a similar effect on microbial communities, in which environmental conditions were considered a major driving factor in shaping the variability in community structure [14, 57]. At intermediate geographic scales, studies have shown an individual distance effect [58] and an environmental effect [7] on microbial communities. Other similar studies conducted at this intermediate scale have shown that both distance and environmental factors could be major determinants in shaping microbial communities [59]. Similar to what we observed, studies have shown that the correlation between community dissimilarity and geographic distance disappeared as the separation between soils increased [60, 61]. Thus, our results suggest that at a geographic scale of 50–1700 km, geographic distance and environmental conditions contributed to different extents in shaping the bacterial community structure.

In conclusion, this study surveyed bacterial spatial patterns across two pristine US dune chronosequences that are approximately 1700 km apart. Despite the large effect of environmental factors on community distances, we also found some evidence for residual spatial autocorrelation at closer spatial scales. We show that local geochemical features could be a dominant factor in driving bacterial community structure, while geographic distance as a single factor could contribute to some community variation at a specific scale (50–1700 km). Thus, the results show that bacterial abundance is spatially structured and could be more dependent on local filters such as soil characteristics than on global filters such as climatic factors or the presence of natural barriers. This supports Bass-Becking’s idea that “everything is everywhere, but, the environment selects”, which implies that similar habitats and physical conditions would support similar microbial communities. Soil pedogenesis is both a result and a driver of ecosystem development and change, and thus understanding how microbial communities change during this natural process will help develop and describe fundamental ecological theory.