Introduction

Bacteria are integral components of soil, where their community structure and diversity have been found to be linked to many soil environmental characteristics, such as the physical and chemical properties of the soil [1, 2]. Traditional microbial cultivation techniques frequently overlook the majority of microbes present in a sample [3], as most bacteria cannot be cultivated under laboratory conditions. During the last two decades, many studies have used high-throughput PCR amplified 16S rDNA sequencing to overcome this difficulty to identify the members of the prokaryotic community. The 16S rRNA gene is by far the most widely used genetic marker for phylogenetic and microbial community studies, as it has highly conserved regions that permit effective PCR primer design, and sufficient variable regions to allow for accurate taxonomic and phylogenetic identification of community members. Since this gene has been widely sequenced in microbial diversity surveys, there is a large amount of accumulated 16S rDNA sequence data in databases [4] such as the Greengenes, Silva, and RDP databases. The 16S rRNA gene in bacteria includes a total of 9 hypervariable regions (V1–V9), and the V1–V3 regions have been shown to be effective for bacterial identification [5].

The Desert of Maine is a tract of glacial silt with a surface area of 160,000 m2, surrounded by a pine forest, in southern Maine in the northeastern USA. The glacial “desert” was once covered by a farm, and exposed because of severe soil erosion due to crop rotation mismanagement (https://www.desertofmaine.com). It is not a true desert, as it receives an abundance of precipitation (76–120 cm/year), with a mean annual air temperature of 6–7 °C (https://websoilsurvey.sc.egov.usda.gov/App/HomePage.htm, Natural Resources Conservation Service). Although it is a tourist attraction, there is no imported sand nor designer landscaped dunes. This surface area was formed approximately 11,000 years ago, during the end of the last Ice Age of the Pleistocene Period [6]. The parent material of the soil is sandy glaciofluvial deposits derived from granite and gneiss. Soil and ground rocks were slowly scraped by glaciers into a sandy substance, forming a layer up to 25 m deep. Then, over many centuries, surface soils formed a cap, concealing the “desert sand,” and allowing a forest to grow, followed by the subsequent development of agriculture.

The soil of the Desert of Maine has a sandy texture, poor water-holding and nutrient-retention capacities, and an acidic pH. Mineral sandy loam soils generally contain less organic material and have a basic pH [7, 8], while low-pH soils, such as those of forests and some grasslands, generally contain more organic material [9,10,11]. The soil of the Desert of Maine therefore offers an unusual opportunity to identify the bacterial populations present in a mineral sandy loam soil with both a relatively low pH and a low concentration of organic material. We used pyrosequencing of PCR-amplified 16S rRNA genes from total DNA to assess bacterial diversity, community structure, and the relative abundance of bacterial taxa at two sites in the surface soil of the Desert of Maine, and compared this bacterial community with those from several other sandy desert environments, as well as from other mineral soils.

Materials and Methods

Sampling

Two surface sand samples from the Desert of Maine (Fig. 1) were obtained on September 2, 2011. The two soil samples were collected by scooping surface sand into 50 ml sterile polyethylene conical centrifuge tubes, in an area cordoned off from tourists and without any measurable rainfall for at least 4 days. The average air temperature during the week of sampling was 22 °C. After collection, the samples were treated as previously described [12]. To perform the analyses of selected physicochemical parameters of the sample site soil, samples from the two sites were pooled and sent for analyses using standard methods to the Laboratoire d’Analyses de Sols (Institut National de la Recherche Agronomique, Arras, France).

Fig. 1 Location of the Desert of Maine in the northeastern USA

Sample Preparation

Total DNA was extracted from each sample using the protocol of An et al. [12]. An aliquot of extracted DNA was adjusted to a final DNA concentration of 15 ng/μl in 1/10 TE buffer (1 mM Tris pH 8; 0.1 mM EDTA) using a NanoVue spectrophotometer (GE Healthcare, Little Chalfont, Buckinghamshire, UK), and the concentration verified by ethidium bromide fluorescence after electrophoresis through a 1% agarose gel in TAE buffer (2 mM Tris–acetate pH 8; 5 mM Na-EDTA). PCR reactions were performed in 25 μl reaction volumes. Each reaction contained one of two different thermostable DNA polymerases and their corresponding reaction buffers, 200 μM of each dNTP, 0.5 μM of each primer, and 1 to 10 ng of extracted DNA. The 16S rRNA genes were amplified using universal bacterial primers for pyrosequencing and covering hypervariable regions V1–V3: primer 27F (A adaptor + GAGTTTGATCMTGGCTCAG) and primer 518R (B adaptor + Mid + WTTACCGCGGCTGCTGG), where A and B represent the adaptors using the 454 Roche FLX Titanium pyrosequencing reaction platform. The Mid-sequences are eight nucleotide tags designed for sample identification barcoding according to the 454 protocol. PCR amplification conditions were adapted for the use of two different thermostable DNA polymerases: (A) Phusion High-Fidelity DNA Polymerase (Finnzymes, Finland): 98 °C for 2 min, followed by 28 cycles of 98 °C for 30 s, 54 °C for 20 s and 72 °C for 15 s, and a final elongation step at 72 °C for 5 min; (B) Pfu DNA Polymerase (Fermentas, Canada): 95 °C for 2 min, followed by 30 cycles of 95 °C for 30 s, 48 °C for 30 s and 72 °C for 1 min, and a final elongation step at 72 °C for 5 min. Each DNA sample was subjected to 3–5 different PCR reactions per DNA polymerase to minimize PCR bias. The PCR products were pooled and subjected to electrophoresis through a 1% agarose gel in TAE buffer. 
After electrophoresis and visualization of the PCR products by ethidium bromide staining and long-wave UV light illumination, NucleoSpin Extract® kits (Macherey–Nagel, Germany) were used to purify the 16S rDNA PCR products. Then, 40 ng of PCR products from each sample were mixed for pyrosequencing, performed using a 454 Roche FLX Titanium Pyrosequencer (Microsynth AG, Switzerland).
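Both primers contain a degenerate base (M in 27F, W in 518R), so each is in practice a mixture of two oligonucleotides. As an illustration only, the following minimal sketch expands the template-specific part of each primer using the standard IUPAC nucleotide codes. The function name `expand_primer` is hypothetical, and the adaptor and Mid sequences are left out because they are not given in the text.

```python
from itertools import product

# Standard IUPAC nucleotide codes (two-base ambiguity codes plus N).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT",
    "S": "CG", "Y": "CT", "K": "GT",
    "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer.

    Hypothetical helper for illustration; not part of the published pipeline.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Template-specific parts of primers 27F and 518R as given in the text.
    for name, seq in [("27F", "GAGTTTGATCMTGGCTCAG"),
                      ("518R", "WTTACCGCGGCTGCTGG")]:
        print(name, expand_primer(seq))
```

Each primer expands to two variants, which is worth keeping in mind when counting primer mismatches during read filtering.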

DNA Sequence Data Processing

The raw DNA sequences were first assigned to each sample via their Mid-tag using MOTHUR version 1.33 [13], and reads were removed if at least one of the following criteria was met: (i) length less than 200 nt or longer than 600 nt, (ii) one mismatch to barcode sequences or more than one mismatch to the primer, and (iii) the presence of homopolymers of > 8 bp in length. Adaptor sequences were removed using the “Cutadapt” tool [14] implemented in the Galaxy server of the Institut de Génétique et Microbiologie (IGM) of the Université Paris-Sud (https://galaxy.igmors.u-psud.fr). The sequences were then checked for quality with ConDeTri version 2.2 [15], using the criterion that 80% of the nucleotides in a sequence have quality scores > 25. We used UCHIME [16], with the Greengenes reference database version 2013_May, and Decipher (through web tools available at https://decipher.cee.wisc.edu/) to detect chimeric sequences [17]. Sequences detected as chimeras by both programs were removed from the data sets. The raw sequences have been deposited in the GenBank short-read archive (SRA), with accession number SRP056525.
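The length, homopolymer, and quality criteria above can be expressed compactly in code. The following is a minimal sketch for illustration, not the MOTHUR or ConDeTri implementation: the function names and the representation of quality scores as a list of integers are assumptions, and the barcode and primer mismatch checks are omitted.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of one repeated nucleotide in seq."""
    return max((sum(1 for _ in run) for _, run in groupby(seq)), default=0)


def passes_filters(seq, quals, min_len=200, max_len=600,
                   max_homopolymer=8, min_q=25, min_frac=0.80):
    """Apply the length, homopolymer, and quality criteria to one read.

    quals is assumed to hold one integer quality score per nucleotide.
    Default thresholds follow the values stated in the text.
    """
    if not (min_len <= len(seq) <= max_len):
        return False  # criterion (i): shorter than 200 nt or longer than 600 nt
    if longest_homopolymer(seq) > max_homopolymer:
        return False  # criterion (iii): homopolymer longer than 8 bp
    good = sum(1 for q in quals if q > min_q)
    return good >= min_frac * len(quals)  # 80% of bases with quality > 25


if __name__ == "__main__":
    read = "ACGT" * 60  # toy 240-nt read, not real data
    print(passes_filters(read, [30] * len(read)))
```

In the study these checks were carried out by separate tools, so combining them in one function is a simplification made here for readability.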

Taxonomic assignment of the remaining 16S rDNA reads was conducted using the Silva NGS website with Silva database release 123 [18, 19]. Sequences classified as Chloroplast, or that could not be classified as belonging to the domain Bacteria, were removed. Diversity analyses were performed using the software package MOTHUR. The correlation between the two samples in the relative abundance of bacterial groups at different taxonomic levels was calculated using the Pearson correlation coefficient with SPSS Statistics (Version 22). The bacterial communities from nine different regions were compared on the basis of relative abundance using the Unweighted Pair Group Method with Arithmetic mean (UPGMA), based on Bray–Curtis distance at 95% similarity. The clean reads were clustered into operational taxonomic units (OTUs) using UPARSE at a cutoff value of 97% sequence identity [20, 21]. The Chao1 and Shannon indices were calculated to estimate taxon richness and diversity [22]. The significance of differences between the two bacterial communities was calculated using Libshuff implemented in MOTHUR [23], with 1500 sequences randomly selected from each sample using PANGEA [24].
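For readers unfamiliar with the two indices, the sketch below shows how they can be computed from a list of per-OTU read counts. It is an illustration under stated assumptions rather than the MOTHUR code: Chao1 is written in its commonly used bias-corrected form, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs, and the Shannon index uses the natural logarithm. The counts in the example are invented.

```python
import math


def shannon(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with reads."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed form, see lead-in)."""
    s_obs = sum(1 for c in counts if c > 0)   # observed OTUs
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    toy_counts = [10, 5, 2, 2, 1, 1, 1]  # invented per-OTU read counts
    print("Chao1:", chao1(toy_counts))
    print("Shannon:", round(shannon(toy_counts), 3))
```

Chao1 rises with the number of singletons, which is why both indices are sensitive to sequencing depth and are usually compared after normalizing samples to the same number of reads.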

Results

Chemical and Physical Properties of the Sand Samples

Two areas of the Desert of Maine, separated by 3 m, were sampled on September 2, 2011 (GPS position 43°51′29.46″ N, 70°9′22.97″ W). The mean chemical and physical properties of the combined sand samples are shown in Table 1. The mean pH value of the soil at the sampling site was 5.09, indicating an acidic soil environment. The levels of total organic carbon and organic material were less than 1 g/kg soil.

Table 1 Average chemical and physical properties of the Desert of Maine soil samples

Diversity Analyses

The average length of the raw DNA sequences was 479 nt for Maine 1 and 480 nt for Maine 2, and the total number of reads was 23,405 for Maine 1 and 28,983 for Maine 2. After bioinformatic cleaning, approximately 95% of the sequences remained (22,320 for Maine 1 and 27,680 for Maine 2). The sequences were further filtered for quality and examined for chimeric sequences, leaving approximately 65% of the total reads (14,776 for Maine 1 and 19,085 for Maine 2). The average length of the sequences after processing was 396 nt for the two samples. The number of reads remaining after each step is presented in Table 2.

Table 2 Summary of the number of sequences and diversity indices before and after normalization to the same number of reads (14,776)

The clean sequences were clustered into OTUs at the 97% similarity level after removing sequences that were unclassified at the phylum level or classified as Chloroplast. In total, 1394 OTUs were observed in the two samples, 668 of which were common to both. The number of core taxa (the most abundant OTUs, each comprising more than 1% of the sequences) was 18 in the Maine 1 sample and 14 in the Maine 2 sample. The core taxa comprised 30% of the bacterial population in the Maine 1 sample and 23% in the Maine 2 sample. We observed no differences between the Shannon diversity indices of the two samples. We also normalized the two samples by subsampling sequences to the same number of reads (14,776) before performing the analysis based on OTU distribution; the results after normalization are shown in Table 2.
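Normalization by subsampling amounts to drawing a fixed number of reads at random, without replacement, from each sample. The sketch below illustrates the idea on a small invented OTU table; the function name `subsample`, the fixed random seed, and the example counts are assumptions, and the study itself performed this step with dedicated software.

```python
import random
from collections import Counter


def subsample(otu_counts, depth, seed=0):
    """Draw `depth` reads without replacement and return new OTU counts.

    otu_counts maps an OTU label to its read count. A fixed seed keeps
    the draw reproducible (an illustrative choice, not from the paper).
    """
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


if __name__ == "__main__":
    toy_sample = {"OTU1": 50, "OTU2": 30, "OTU3": 20}  # invented counts
    print(subsample(toy_sample, 40))
```

Here the larger sample (19,085 reads) would be reduced to the size of the smaller one (14,776 reads), so that richness and diversity estimates are compared at equal sequencing depth.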

Classification of DNA Sequences

The sequences from the two samples were classified at 6 taxonomic levels with the Silva NGS database, and comprise at least 22 phyla, 41 classes, 76 orders, 115 families, and 172 genera, plus a number of unclassified sequences at various taxonomic levels. The distribution of sequences at the phylum level is shown in Fig. 2a. Unclassified sequences at the phylum level represent 2.4% of the sequences in Maine 1 and 3.9% in Maine 2. The community structure at the phylum level shows a similar distribution for the two samples (Pearson correlation coefficient R = 0.989). The significance of the differences between the two bacterial communities was calculated using Libshuff in MOTHUR and gave a P value < 0.01, indicating that the two bacterial communities do not have exactly the same composition. The predominant phyla in the samples are the Proteobacteria, Actinobacteria, Chloroflexi, Bacteroidetes, and Acidobacteria, at 39.1%, 18.5%, 9.7%, 6.4%, and 5%, respectively, as the average for the two samples, followed by the Cyanobacteria (4.2%), Gemmatimonadetes (3.1%), Planctomycetes (2.5%), Armatimonadetes (1.9%), and Deinococcus-Thermus (0.5%).
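The Pearson coefficient used to compare the two samples is computed from paired relative abundances, one pair per taxon. The following minimal sketch shows the calculation; the abundance values are invented for illustration and are not the per-sample values of this study, which used SPSS for this step.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Invented relative abundances (%) of five phyla in two hypothetical samples.
    site_a = [40, 18, 10, 6, 5]
    site_b = [38, 19, 9, 7, 5]
    print(round(pearson_r(site_a, site_b), 4))
```

Because the coefficient is dominated by the most abundant taxa, a high R at the phylum level does not rule out differences among rarer groups, which is consistent with the Libshuff result reported above.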

Fig. 2 Relative abundance of bacterial 16S rRNA gene sequences from the two samples at the Phylum (a) and Proteobacteria classes (b) levels. See Materials and Methods for details

Within the phylum Proteobacteria, members of the Alphaproteobacteria represent the most abundant group in both samples (68.3% in Maine 1, 63.8% in Maine 2), followed by members of the Betaproteobacteria (15.4%, 16.9%) and Deltaproteobacteria (14.1%, 15.4%). The bacterial community structure at the Class level (Fig. 2b) presents a similar pattern of distribution in the two samples (Pearson correlation coefficient R = 0.959). At lower taxonomic levels, members of the order Sphingobacteriales were the predominant members of the Bacteroidetes phylum, with an average of 4.8% of the total reads for the two samples. The dominant orders in the two samples were the Acetobacterales (43%), Rhizobiales (13%), and Sphingomonadales (7.2%).

Among the 172 genera identified in the samples, 69 belong to the phylum Proteobacteria, 41 to the phylum Actinobacteria, 24 to the phylum Bacteroidetes, and 9 to the phylum Acidobacteria. The 20 most abundant genera in each sample are shown in Table 3. The similarity between the two bacterial communities at the genus level is 0.975 (Pearson correlation coefficient). Each of these abundant genera accounts for 0.6–11.3% of the bacteria identified in the Maine 1 sample and 0.8–13.6% in the Maine 2 sample, with the most abundant genus being Acidiphilium (phylum Proteobacteria) in both samples. Members of the genus Crinalium (1.9% on average for the two samples) are phototrophic bacteria belonging to the phylum Cyanobacteria, and species of this genus have been reported to be highly drought-resistant and are commonly isolated from coastal sand dunes [25, 26]. Members of the genus Arthrobacter are abundant in both samples (1.4% on average for the two samples); members of this genus are frequently involved in mineral weathering of soil and can secrete large amounts of oxalic acid [27, 28].

Table 3 Relative abundance of the 20 most abundant bacterial genera in the Desert of Maine samples

Discussion

In recent years, desert-related environmental effects have been increasing, as global warming and human activities contributing to desertification are increasingly threatening ecosystems around the world (www.unddd.int). Studies concerning microbial colonization and dispersion in deserts have been performed to estimate the function of microbial communities from hot desert sand, which may play an important role in soil stability, nutrient cycles, and environmental health [12, 29,30,31]. The Desert of Maine was previously productive agriculturally, and was covered by farm land. The underlying mineral soils were exposed because of severe soil erosion due to potato crop rotation mismanagement [6]. In this study, we used pyrosequencing of PCR amplified 16S rRNA genes to assess bacterial diversity and community structure of surface soil from the Desert of Maine. An examination of bacterial populations in samples of surface soil from this unique site can provide an opportunity to investigate bacterial diversity and community structure in a hot desert-like oligotrophic environment, and its relation with those from other soil types, even though it is not a true desert but is, in fact, surrounded by a pine forest.

Previous studies have revealed that many environmental factors, including pH, the concentration of organic material and that of sodium, can have large effects on the presence and distribution of bacterial community members in soil [7, 32, 33]. A large proportion of the variance in soil bacterial diversity and community composition appears to be strongly influenced by pH [34], at local [35] and even continental scales [1]. We thus compared our data on the distribution of the predominant phyla (Fig. 3) with those from studies using 16S rRNA gene pyrosequencing of samples taken from apparently similar soil environments or with similar physicochemical factors. The Desert of Maine is surrounded by a pine forest. Thus, its soil microbiome may be influenced by that of the nearby pine forest environment. Uroz et al. [36] studied the bacterial community of soils from an oak forest of Breuil-Chenue in Morvan (France), and found that members of Acidobacteria account for about 36% of the total bacterial population in their sample. Shah et al. [10] examined the sandy, acidic, and relatively nutrient-poor soil of a pine barrens region of Long Island (New York, USA), which is also composed of gravel deposited by the withdrawal of glaciers. This soil had an average pH value of 4.75, total organic carbon of 11.7 g/kg, and 0.13 g/100 g of Al and 0.21 g/100 g of Fe. Samples of surface soil from other hot, or cold, deserts were compared with our results including (1) a sand sample (Gobi 1) from the Gobi Desert of Northwestern China [12], which also has a low concentration of organic carbon and organic materials (< 1 g/kg); and (2) a sand sample from the Kumtagh Desert in China, collected in September 2011 (data not published). 
These samples were treated in the same manner as those from the Desert of Maine. The remaining samples were (3) a sand sample from the Australian deserts [37]; (4) a sample from the Sonoran Desert [38]; (5) a sample (Altamira) from the Atacama Desert in Chile [8]; (6) a sample (Upper Wright Valley) from the McMurdo Dry Valleys [39] in Antarctica (a cold desert); and (7) a sample (mineralized sandy soils) from Mayotte, France [31]. All of these studies were performed using pyrosequencing of 16S rDNA amplicons. We used the Pearson correlation coefficient to estimate the similarity of the distribution of the predominant phyla among the different sites, and built a UPGMA tree based on the Bray–Curtis similarity index. The results show that the distribution of the predominant phyla in both samples from Maine is closer to that of the two forest soils (R values > 0.7) than to that of either the hot or cold desert soils (R values < 0.5). For the phylum Proteobacteria, there is no significant difference between the samples from the Desert of Maine and those from the two forest soils. Moreover, the relative abundance of this phylum in our samples is greater than in the other desert samples (P < 0.05). In contrast, the Gobi Desert, Kumtagh, and Mayotte samples contained a much higher proportion of members of the Firmicutes than the others. Samples from the Atacama Desert and the McMurdo Dry Valleys contained a larger population of Actinobacteria members when compared with other groups (P < 0.05). There is no significant difference in the percentage of Bacteroidetes members among the different samples examined here. These results confirm those of earlier studies [7, 8, 40] suggesting that oligotrophic environments with a large mineral component and low levels of organic materials have a relatively large proportion of Gram-positive bacteria, such as those from the Firmicutes and Actinobacteria phyla. 
Previous studies [1] indicate that high pH soils typically have a higher relative abundance of members of the Actinobacteria and Bacteroidetes phyla, with a lower abundance of Acidobacteria, when compared with populations from more acidic soils. We did not find significant differences in the percentage of Acidobacteria among the different soils we examined, suggesting that other factors may affect the bacterial community structure of these types of soils.
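The clustering used for this comparison can be illustrated with a small example. The sketch below computes the Bray–Curtis dissimilarity (one minus the similarity) between phylum-level abundance profiles and joins them by average linkage (UPGMA). It is a simplified illustration under stated assumptions, not the software used in this study: the function names are hypothetical, the three profiles are invented, and the tree is returned as nested tuples without branch lengths.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def upgma(profiles):
    """Cluster named abundance profiles by UPGMA; return a nested-tuple tree."""
    names = list(profiles)
    trees = {i: name for i, name in enumerate(names)}
    sizes = {i: 1 for i in trees}
    dist = {(i, j): bray_curtis(profiles[names[i]], profiles[names[j]])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[tuple(sorted((i, k)))]
            d_jk = dist[tuple(sorted((j, k)))]
            # Size-weighted mean equals the average over all member pairs.
            dist[(k, next_id)] = ((sizes[i] * d_ik + sizes[j] * d_jk)
                                  / (sizes[i] + sizes[j]))
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    return next(iter(trees.values()))


if __name__ == "__main__":
    # Invented phylum-level profiles (relative abundance, %).
    toy = {"soil_A": [40, 20, 5], "soil_B": [38, 22, 6], "sand_C": [5, 30, 40]}
    print(upgma(toy))
```

With these invented profiles, the two similar soils are joined first and the dissimilar sand profile is attached last, which is the kind of grouping a UPGMA tree summarizes.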

Fig. 3 Comparison of the Desert of Maine bacterial communities with those from other selected soil samples at the Phylum level. The UPGMA tree was calculated using the Bray–Curtis similarity at 95%, based on the relative abundance of bacterial phyla in the two Desert of Maine samples; an oak forest soil in France [36]; a pine forest soil in the USA [10]; a Gobi desert sand sample from Mongolia [12]; a sand sample from the Kumtagh Desert in China (data not published); a sand sample from the Australian Desert [37]; a sand sample from the Sonoran Desert [38]; a sand sample from the Atacama Desert in Chile [8]; a sample from the McMurdo Dry Valleys in the Antarctic [39]; and a mineralized sandy soil sample from Mayotte [31]

Lauber et al. [1] examined bacterial communities in 88 soils from across North and South America using high-throughput sequencing of PCR-amplified 16S rRNA genes, and their results showed that, in soils with pH values of 5–6, the dominant groups were the Acidobacteria, Alphaproteobacteria, Actinobacteria, Bacteroidetes, and Beta/Gammaproteobacteria [41]. In our samples, which had a similar pH range, we observed a lower proportion of Acidobacteria (5.1% vs. 29.7%) and a higher proportion of Actinobacteria (18.5% vs. 8.8%). Our data also reveal a high level (10.3%) of members of the phylum Chloroflexi, which are not commonly found in studies of deserts [42]. A study of the Atacama Desert showed that the non-cyanobacterial phototrophic Chloroflexi were dominant in the hyper-arid core of the desert [43]. Previous studies have shown that members of the Chloroflexi may play an important role as soil photoautotrophs that contribute to CO2 uptake in the surface soil [44, 45]. Members of the family Acetobacteraceae were abundant in both samples, comprising 16.8% of total sequences on average. Members of this family have been described as nitrogen-fixing bacteria able to promote plant growth by a variety of mechanisms [46]. Koberl et al. [47] reported a greater proportion of N-fixing bacterial groups in desert soils than in farm soils, and suggested that this could be explained by the fact that plant growth-promoting bacteria play an important role as nitrogen donors in soils without compost treatment.

The most abundant identifiable genera are shown in Table 3. The dominant genus in both samples is Acidiphilium, accounting for 11.3% of the bacteria in the Maine 1 sample and 13.6% in the Maine 2 sample. Acidiphilium is a genus in the phylum Proteobacteria, and many species from this genus are acidophilic bacteria isolated from acidic mineral environments [48, 49], which is consistent with the physical characteristics of our sample site. Acidiphilium spp. are also involved in the iron cycle, reducing ferric iron while oxidizing organic matter at low pH [50]. In mineral soils, Fe-oxidizing bacteria are well represented [51, 52]. Lithotrophs, such as members of Microcoleus in the phylum Cyanobacteria, are typically the dominant microorganisms in the microbial community of hot (oligotrophic) desert soils, as are sulfate-reducing bacteria such as Desulfobacterales, but these two groups of bacteria do not appear to be abundant in the Desert of Maine soil [53]. Members of the Alphaproteobacteria, Acidobacteria, and Actinobacteria are ubiquitous in mineral environments, which is consistent with the distribution of bacteria in our samples. We found that the samples contain relatively low levels of mineral-weathering bacteria such as members of the Burkholderia, Agrobacterium, and Bacillus genera, and only one abundant genus, Arthrobacter, has been associated with mineral weathering [27, 28, 54].

We also observed that 5% of the total OTUs in the dataset contained > 50% of the total sequences, while approximately 80% of the total OTUs, although highly diverse, together comprised < 20% of the total sequences. Taxonomic assignment showed that > 30% of the sequences could not be classified at the genus level. This situation is common in studies of soil bacterial communities using high-throughput DNA sequencing techniques [55,56,57].
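The skew described here, with a few abundant OTUs and a long tail of rare ones, can be quantified by ranking OTUs by read count and summing the share held by the top fraction. The sketch below shows one way to do this; the function name `top_otu_share` and the counts are invented for illustration and are not the data of this study.

```python
def top_otu_share(counts, top_percent):
    """Fraction of all reads held by the most abundant top_percent of OTUs.

    top_percent is an integer percentage; the number of OTUs is rounded up
    so that at least one OTU is always included.
    """
    ranked = sorted(counts, reverse=True)
    k = max(1, (len(ranked) * top_percent + 99) // 100)  # integer ceiling
    return sum(ranked[:k]) / sum(ranked)


if __name__ == "__main__":
    # Invented community: 1 dominant OTU, 2 intermediate, 17 singletons.
    toy_counts = [60, 10, 10] + [1] * 17
    print(round(top_otu_share(toy_counts, 5), 3))
```

In this invented example, a single OTU out of 20 (the top 5%) holds more than half of the reads, which mirrors the kind of distribution reported above.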

In this study, we used pyrosequencing of PCR-amplified bacterial 16S rRNA genes to reveal a high degree of bacterial diversity, and to describe the community structure, in two soil samples from the Desert of Maine. This small sand-like environment presents unique bacterial community patterns when compared with sand from hot deserts, and also shows differences in the abundance of certain predominant microorganisms relative to other mineral soils, thus aiding in the understanding of soil mineralization and, hopefully, helping in land recovery to reverse extreme soil erosion.