Introduction

Wastewater treatment plants (WWTPs) produce sewage sludge in quantities of about 10 Mt dry matter (dm)/year in the EU and 7 Mt dm/year in the USA. Mesophilic anaerobic digestion (MAD) is the most common sludge stabilization process in Europe, accounting for approximately half of treatment facilities, yet the number of thermophilic anaerobic digestion (TAD) plants has increased considerably during the last decade (Kristensen 2014; Levantesi et al. 2014). The main advantages of TAD over MAD are (i) the ability to process larger volumes of organic waste in shorter time periods, (ii) increased production of biogas, and (iii) improved hygienization, which reduces the number of pathogenic microorganisms (Martín-González et al. 2011). Agricultural reuse, together with disposal to landfills, is the main fate of stabilized sewage sludge and can account for as much as 70 % of the total sludge produced. However, land application of treated sludge is disputed in many European countries and even prohibited in some (e.g., Switzerland and the Netherlands) (Heimersson et al. 2014). The main concern cited in connection with land application and agricultural use of treated sludge is the risk of a negative impact on human health or the environment in general (Harder et al. 2014). Many anthropogenic compounds in wastewater may persist despite treatment and accumulate in sludge, so that they are spread when the sludge is land-applied (Cincinelli et al. 2012; Stiborová et al. 2015; United States Environmental Protection Agency 2009). An additional threat is the exposure of humans to pathogenic microorganisms during sludge handling or after land application. Many pathogens have been detected in sewage sludge, including viruses (Bibby and Peccia 2013), protozoa (Kitajima et al. 2014), and bacteria (Cai and Zhang 2013; Cai et al. 2014). Of main concern are human pathogens that can grow rapidly and multiply under favorable conditions, in particular bacteria.

The aim of this study was to estimate the effect of two types of anaerobic digestion (mesophilic and thermophilic) on overall bacterial community structure and phylogenetic diversity in four sewage sludge samples from the Czech Republic, with a specific focus on the distribution and diversity of possible human bacterial pathogens. Several techniques have been established to monitor bacterial populations in WWTPs, such as plate counting (Guzman et al. 2007), qPCR (Yu et al. 2014), denaturing gradient gel electrophoresis (DGGE) (Boonnorat et al. 2014), terminal restriction fragment length polymorphism (T-RFLP) (Pervin et al. 2013b), fluorescence in situ hybridization (FISH) (Pervin et al. 2013a), and others. The characterization of microbial community structure via 16S ribosomal RNA (rRNA) gene amplicon sequencing is highly reproducible and has been greatly advanced in recent years by the introduction of next-generation sequencing (Pilloni et al. 2012). Therefore, bacterial community structure and phylogenetic diversity were assessed using 16S rRNA gene pyrotag analysis. This method, although currently widely used, still suffers from errors caused by PCR (DNA polymerase errors, formation of chimeric sequences) and pyrosequencing noise (Huse et al. 2010; Quince et al. 2011; Reeder and Knight 2010). Efforts have been made to minimize these biases (Schloss et al. 2011), yet some errors remain and thus inflate the observed diversity. The number of errors tends to increase with increasing length of amplicon reads. We therefore compared diversity analyses based on the V4–V5 regions and on the V4 region only (i.e., longer and shorter reads, respectively).

Materials and methods

Sample collection, preparation, and cleanup

Sewage sludge samples were collected from four different wastewater treatment plants (WWTPs) in the Czech Republic in August 2008. Anaerobic digestion in the treatment reactors was operated under mesophilic temperature conditions (36–40 °C) in WWTP Hradec Králové and Brno and under thermophilic conditions (55–60 °C) in WWTP Klatovy and Pilsen. Additional data for the WWTPs in 2008 are as follows: WWTP Hradec Králové (50° 12′ 37.674″ N, 15° 51′ 4.136″ E): volume of treated wastewater, 16 million m³; total length of sewer network, 496 km; and number of sewer connections, 16,775; WWTP Brno (49° 7′ 54.494″ N, 16° 37′ 49.225″ E): volume of treated wastewater, 31 million m³; total length of sewer network, 1350 km; and number of sewer connections, 49,930; WWTP Klatovy (49° 24′ 44.248″ N, 13° 16′ 12.022″ E): volume of treated wastewater, 2.3 million m³; total length of sewer network, 114 km; and number of sewer connections, 4069; and WWTP Pilsen (49° 43′ 14.354″ N, 13° 23′ 39.577″ E): volume of treated wastewater, 4 million m³; total length of sewer network, 500 km; and number of sewer connections, 16,500. The samples were pooled in jars, shipped on ice, and then stored at −20 °C until analysis.

For metagenomic DNA isolation, samples were defrosted on ice, and total DNA was extracted and purified from 10 g of sample using a PowerMax Soil DNA Isolation Kit (Mo Bio Laboratories Inc., USA) according to the manufacturer’s instructions.

16S rRNA gene amplification and sequencing

PCR with primers f563-577: 5′-AYTGGGYDTAAAGNG-3′ (Cole et al. 2009) and r1406-1392: 5′-ACGGGCGGTGTGTRC-3′ (Lane et al. 1985) was performed to amplify the V4–V8 regions of 16S rRNA genes. The cycling conditions were as follows: 95 °C for 2 min, 25 cycles of 95 °C for 30 s, 54 °C for 30 s, and 72 °C for 60 s, with a final extension at 72 °C for 7 min. Each 20 μL reaction contained 0.2 mmol/L dNTPs (Finnzymes, Finland), 0.25 μmol/L primers (Generi Biotech, Czech Republic), 0.1 mg/mL bovine serum albumin (New England BioLabs, Great Britain), 0.4 U of Phusion Hot Start II DNA Polymerase (Finnzymes, Finland) with the corresponding buffer, and template DNA (10–50 ng). Both forward and reverse primers bore 5′-end sequencing adapters (454 Sequencing Application Brief No. 001–2009, Roche), and the forward primer was also modified with different tags (454 Sequencing Technical Bulletin No. 005–2009, Roche) so that multiple samples could be pooled and sequenced at once. The PCR products were checked on a 1 % agarose gel, pooled, and purified using AMPure XP Beads (Agencourt, Beckman Coulter, USA) according to the manufacturer’s instructions to remove residual primer-dimers. Amplicons were unidirectionally sequenced from the forward primer using GS FLX+ chemistry (Roche).

Amplicon data analyses

The mothur software package version 1.31.1 (Schloss et al. 2009) was used for pyrosequencing data analyses. First, the flowgrams were trimmed using 650 and 800 as the minimal and maximal number of flows, respectively. The number of differences allowed in the barcode and primer was set to 0. Second, the flowgrams were denoised until the change in flowgram correction reached 10⁻⁶. The resulting fasta sequences were trimmed allowing no error in the barcode or primer and no more than eight bases in homopolymeric regions. Unique sequences were aligned against the merged SILVA bacterial and archaeal reference alignments, and the alignment was filtered to remove sequences that did not align well or were shorter than 400 bp. Unique sequences were pre-clustered using the pseudo-single linkage algorithm, merging sequences that differed by up to 1 bp per 100 bp of sequence length. Chimeric sequences were identified by Perseus (Quince et al. 2011) and removed along with singletons. Valid sequences were classified against Ribosomal Database Project (Cole et al. 2009) reference files (trainset 9) and clustered by the average linkage algorithm at 3 % to create operational taxonomic units (OTUs). The error rate was determined by analyzing mock community sequences as described previously (Uhlík et al. 2013). Sequence coverage, the Chao1 OTU richness estimate, and the Shannon entropy were also calculated in mothur. Alpha diversity was calculated as Euler’s number (e) raised to the power of the Shannon index (Jost 2006). Similarities between communities were assessed through both unweighted and weighted UniFrac (Lozupone and Knight 2005), as implemented in mothur.
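As an illustration of the coverage and richness measures named above, the following minimal Python sketch computes Good's coverage and a Chao1 estimate from a list of per-OTU read counts. It is not the mothur implementation: the bias-corrected form of Chao1 is assumed here, and the function names and the example counts are hypothetical.

```python
def goods_coverage(otu_counts):
    """Good's coverage: 1 - (number of singleton OTUs / total number of reads)."""
    total_reads = sum(otu_counts)
    singletons = sum(1 for c in otu_counts if c == 1)
    return 1.0 - singletons / total_reads


def chao1(otu_counts):
    """Bias-corrected Chao1 (assumed variant): S_obs + F1*(F1-1) / (2*(F2+1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for c in otu_counts if c > 0)
    f1 = sum(1 for c in otu_counts if c == 1)
    f2 = sum(1 for c in otu_counts if c == 2)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical OTU table for one sample (reads per OTU), not data from the study.
example_counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]
coverage = goods_coverage(example_counts)
richness = chao1(example_counts)
```

Both functions depend only on how many OTUs were seen once or twice, which is why singleton handling earlier in the pipeline influences these estimates.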

Detecting bacterial pathogens

The 16S rRNA genes of 122 pathogenic species (mostly type strains) belonging to 61 genera (Table S1) were retrieved from the Ribosomal Database Project (RDP) (Cole and Tiedje 2014) and used to construct a database against which the sequences retrieved from the sludge samples were compared. The comparison of 16S rRNA gene sequences of bacterial pathogens and pyrosequencing libraries was performed by local BLAST using BLAST+ release 2.2.26 (Zhang et al. 2000) with a 98 % identity threshold over the first 400 bp.
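The thresholding step can be sketched in Python as a filter over BLAST+ tabular output. This is only an illustration under stated assumptions: it assumes the default 12-column tabular format (`-outfmt 6`, where columns 3 and 4 are percent identity and alignment length), it interprets "over the first 400 bp" as a minimum alignment length of 400, and the function name and example rows are hypothetical.

```python
def filter_blast_hits(lines, min_identity=98.0, min_length=400):
    """Keep hits meeting the identity and alignment-length thresholds.

    Expects tab-separated BLAST+ tabular rows (default outfmt 6 columns):
    qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend,
    sstart, send, evalue, bitscore.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            hits.append((query, subject, identity, length))
    return hits


# Hypothetical rows for illustration only (not results from the study).
example_rows = [
    "read1\tMycobacterium_avium\t99.25\t400\t3\t0\t1\t400\t1\t400\t0.0\t720",
    "read2\tStreptomyces_somaliensis\t97.50\t400\t10\t0\t1\t400\t1\t400\t0.0\t680",
    "read3\tLegionella_anisa\t99.00\t250\t2\t0\t1\t250\t1\t250\t1e-120\t440",
]
pathogen_hits = filter_blast_hits(example_rows)
```

In this toy input only the first row passes: the second falls below 98 % identity and the third aligns over fewer than 400 positions.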

Sequence IDs

Sequences were submitted to the metagenomics RAST server (Meyer et al. 2008) under the MG-RAST Project ID 7696.

Results and discussion

Pyrosequencing data

Although the primers used in this study span five variable regions of 16S rRNA genes, pyrosequencing reads are not accurate over more than two of them. Typically, the distal ends of pyrosequencing reads have diminishing quality (Schloss 2013), which can make longer reads more error-prone. Using the mock community sequences, we verified that the optimal range of flows to be used for further processing lies between 650 and 800. With this span, the resulting sequences are at least 400 bp long, which is a sufficient length for genus-level classification (Cardenas and Tiedje 2008). These 400 bp cover the V4 and V5 regions.

Table 1 shows the number of sequences that passed each analysis step. Based on the analysis of the mock community, the overall error rate is estimated to be about 2.7 × 10⁻³. We were able to further decrease the error rate to less than 2 × 10⁻⁴ by trimming the sequences using the probe 5′-TACNVGGGTATCTAATCC-3′ (corresponding to positions 785–802). The coverage of the forward primer and the probe corresponding to positions 785–802 is 86 and 94 % of bacterial and 15 and 57 % of archaeal 16S rRNA genes allowing no or one mismatch, respectively. In other words, cutting sequences at position 785 should exclude only a minimal number of valid sequences from the original data set. The actual trimming, however, resulted in the loss of 1–19 % of sequences (Table 1), most likely because sequences with errors in the region 785–802 were lost. Nonetheless, the entire pyrosequencing analysis pipeline, based on the standard operating procedure (SOP) described earlier (Schloss et al. 2011), is designed to mask both PCR-generated and pyrosequencing errors and, as a result, the final number of OTUs is very similar (differences of up to 2.5 %) for both groups of sequences (Table 1). In addition, Wang et al. (2007) previously reported that classification accuracy is higher for longer reads, which allows us to conclude that trimming the sequences did not bring a significant benefit to the analysis.
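Trimming at a degenerate probe can be illustrated with a short Python sketch that expands IUPAC ambiguity codes into a regular expression and cuts each read at the first match. This is not the procedure used in the study: it allows no mismatches, and it assumes that the probe site appears in forward-oriented reads as the reverse complement of the probe as written. The function names and the toy read are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def trim_at_probe(read, probe):
    """Return the read up to the first probe match, or None if there is no match."""
    pattern = re.compile("".join(IUPAC[base] for base in probe.upper()))
    match = pattern.search(read.upper())
    return read[:match.start()] if match else None


PROBE = "TACNVGGGTATCTAATCC"          # probe sequence quoted in the text
SITE = reverse_complement(PROBE)       # assumed orientation within forward reads

# Toy read: 8 bases, one concrete instance of the probe site, 4 trailing bases.
toy_read = "ACGTACGT" + "GGATTAGATACCCTGGTA" + "TTTT"
trimmed = trim_at_probe(toy_read, SITE)
```

Because reads without an exact match return `None`, any sequencing error inside the probe site removes the read, which is consistent with the loss of sequences reported after trimming.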

Table 1 Number of resulting sequences depending on the analysis step taken

Community structure and diversity

Bacterial diversity in the sludge samples was assessed through both taxonomic and phylogenetic approaches. The effective number of OTUs, calculated according to Jost (2006) by taking the exponential of the Shannon index, indicates that the sample from Brno had the highest diversity, followed by that from Hradec Králové. The effective number of OTUs in the sample from Brno was twice that in the Klatovy and Pilsen samples (Table 2), in which the digestions were operated under thermophilic conditions. These data are consistent with other studies, in which the richness and diversity of microbial populations were higher under mesophilic than under thermophilic conditions (Gou et al. 2014; Pervin et al. 2013b).
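The conversion from Shannon index to an effective number of OTUs can be written out in a few lines of Python. The sketch below uses the natural logarithm, so that the effective number is exp(H); the function names and the example counts are hypothetical.

```python
import math


def shannon_index(otu_counts):
    """Shannon entropy H = -sum(p_i * ln p_i) over OTU relative abundances."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)


def effective_number_of_otus(otu_counts):
    """Effective number of OTUs: exp(H), the true diversity of order 1."""
    return math.exp(shannon_index(otu_counts))


# Hypothetical communities, each with four OTUs and 100 reads.
even_community = [25, 25, 25, 25]
uneven_community = [85, 5, 5, 5]
even_diversity = effective_number_of_otus(even_community)
uneven_diversity = effective_number_of_otus(uneven_community)
```

A community of four equally abundant OTUs has an effective number of 4, whereas the uneven community with the same OTU count scores lower, which is what makes the measure comparable between samples on a linear scale.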

Table 2 Sequence coverage and alpha-diversity indices for each sample

The genetic similarities between the communities were analyzed using the UniFrac platform (Lozupone and Knight 2005). The results (Table 3) show that, in terms of membership, the communities of Hradec Králové and Brno shared about one third of phylogenetic diversity (UniFrac distance of 0.66) and were the most similar of the investigated communities. The communities in the pairs Hradec Králové-Pilsen and Brno-Pilsen were the least similar, sharing less than 15 % of the total phylogenetic diversity. When the structure was examined (i.e., sequence abundance was taken into account by using the weighted UniFrac approach), the most closely related communities were those of Hradec Králové and Klatovy, whereas the communities of Brno and Pilsen were the most distant (Table 3).
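The link between an unweighted UniFrac distance and the shared fraction of phylogenetic diversity can be shown with a toy calculation. The Python sketch below assumes the tree has already been reduced to a list of branches, each with a length and flags indicating whether it leads to sequences from community A, community B, or both; the function name and the branch lengths are hypothetical and chosen only to reproduce a distance of 0.66.

```python
def unweighted_unifrac(branches):
    """Unweighted UniFrac: unique branch length divided by total branch length.

    branches: iterable of (length, in_a, in_b), where in_a / in_b indicate
    whether the branch has descendants in community A / community B.
    """
    unique_length = 0.0
    total_length = 0.0
    for length, in_a, in_b in branches:
        if in_a or in_b:
            total_length += length
            if in_a != in_b:  # branch leads to only one of the communities
                unique_length += length
    return unique_length / total_length


# Hypothetical tree summary: one shared branch and one unique branch per community.
toy_branches = [
    (0.34, True, True),
    (0.33, True, False),
    (0.33, False, True),
]
distance = unweighted_unifrac(toy_branches)
shared_fraction = 1.0 - distance
```

With these toy values the distance is 0.66 and the shared fraction is 0.34, i.e., about one third of the branch length is common to both communities.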

Table 3 Genetic distances between the communities as determined by UniFrac (Lozupone and Knight 2005) and weighted UniFrac (Lozupone et al. 2007)
Table 4 Weighted UniFrac distances (Lozupone et al. 2007) between the most abundant groups (phyla, classes) in the communities

In all samples, the bacterial community was dominated by reads affiliated with Proteobacteria (Fig. 1), with over 70 % of sequences clustering with this phylum in samples from Hradec Králové and Klatovy. Within the proteobacterial phylum, the classes Gammaproteobacteria and Alphaproteobacteria were the most abundant. Samples from Pilsen and Brno had a lower relative abundance of Proteobacteria-affiliated reads than the other samples (50 and 41 %, respectively) and a much higher relative abundance of Firmicutes and Bacteroidetes sequences. Both these phyla, Firmicutes and Bacteroidetes, and the proteobacterial classes Gammaproteobacteria and Alphaproteobacteria are common in both mesophilic and thermophilic digesters (Pervin et al. 2013a, b). One major difference between samples from thermophilic and mesophilic digesters is the increased presence of thermotolerant populations in the thermophilic digesters, as previously described (Martín-González et al. 2011; Pervin et al. 2013b). Our study also indicates that the sludge from the thermophilic digesters in Klatovy and Pilsen had a higher number of reads belonging to the phyla Deinococcus-Thermus and Thermotogae than the other samples. Additionally, bacteria of the genus Coprothermobacter, typical of thermophilic anaerobic digesters (Tandishabo et al. 2012), were detected only in the samples from Klatovy and Pilsen. Finally, samples from the mesophilic digesters of Brno and Hradec Králové had an increased number of reads clustering with Chloroflexi, specifically over 17 and 4 %, respectively. The phylogenetic differences within these phyla or classes are shown in Table 4.

Fig. 1 Percentage contribution of prokaryotic phyla reads to the total community composition. Proteobacteria are subdivided into classes

The UniFrac distances indicated that although the structure of Proteobacteria did not differ by more than 45 % (Proteobacteria in the samples from Klatovy and Brno), the differences within proteobacterial classes were higher: Gammaproteobacteria and Alphaproteobacteria differed by up to 64 and 80 %, respectively. Even more notably different were the populations of Firmicutes and Chloroflexi, where the distances reached 90 and 95 % for the community pairs Hradec Králové-Pilsen and Brno-Pilsen, respectively (Table 4).

Only 29 OTUs were shared among all four of the sampled communities. Only one OTU, which clustered with Rhodanobacter, was detected in all communities with a relative abundance >1 %. This OTU could be further subdivided into four clusters (“micro-OTUs”) represented by the sequences of the type strains (i) Rhodanobacter ginsengisoli GR17-7, (ii) Rhodanobacter spathiphylli B39, (iii) Rhodanobacter fulvus Jip2/Rhodanobacter soli DCY45, and (iv) R. spathiphylli B39/Rhodanobacter lindaniclasticus RP5557 (Table 5). Analysis of the sequences in these “micro-OTUs” showed that only the sequences related to R. ginsengisoli GR17-7 were represented in all the sampled communities. Rhodanobacter spp. have been previously described as common in aromatics-contaminated sites, such as contaminated groundwater (Green et al. 2012), aquifers (Prakash et al. 2012), sediment (Luo et al. 2008), or soil (Luo et al. 2008; Uhlík et al. 2012). Some Rhodanobacter populations have also been associated with the degradation of halogenated pollutants, such as lindane (Nalin et al. 1999), chlorobenzoate (Gentry et al. 2004), or chlorobiphenyl (Uhlík et al. 2013).

Table 5 Inner clustering of the Rhodanobacter OTU with respect to the closest type strains

Pathogenic bacteria

In 2008, the total sludge production in the Czech Republic was more than 175 thousand tonnes dm, with almost 78 % of total sludge disposal used for agricultural purposes such as composting and landfilling (Horackova 2008). Although sludge can act as an efficacious fertilizer, it can also accumulate toxic chemical substances and pathogenic organisms. We therefore investigated the abundance of reads affiliated with pathogenic microbial species in sludge samples after TAD and MAD stabilization.

The abundance of pathogenic bacteria ranged between 0.23 and 1.57 % of the total sequences (Table 6), which is comparable with other studies (Bibby et al. 2010; Ye and Zhang 2011). The abundance of sequences affiliated with pathogenic bacteria was higher in the sludge stabilized by MAD than in that stabilized by TAD, which can be ascribed to the higher temperatures during anaerobic digestion in TAD and thus better hygienization of the biosolids. Similar results showing that thermophilic waste treatment is more efficient in reducing pathogenic bacteria have also been described elsewhere (Arthurson 2008; Levantesi et al. 2014).

Table 6 Sequences of pathogenic bacteria detected in the samples. The comparison of 16S rRNA gene sequences in bacterial pathogens and pyrosequencing libraries was performed by local BLAST using a BLAST+ Release 2.2.26 (Zhang et al. 2000) using the 98 % identity threshold

Among the 61 genera listed in Table S1, only ten were identified in the sludge samples (Table 6). Neither Escherichia coli nor Salmonella, organisms which have been extensively studied in connection with surviving the stabilization process (Arthurson 2008), was detected in any of the analyzed samples. The most frequently detected genera in the sludge were Mycobacterium and Streptomyces. Among mycobacteria, the retrieved sequences matched those of the opportunistic pathogens Mycobacterium avium, Mycobacterium intracellulare, Mycobacterium phlei, and Mycobacterium kansasii. These nontuberculous mycobacteria are commonly found in environmental samples, mostly water, soil, and sewage sludge, and the infection threat depends on their concentrations and the route of exposure (Bibby et al. 2010; Cai and Zhang 2013; Lahiri et al. 2014; van Ingen et al. 2009).

The other abundant genus detected was Streptomyces (Table 6). Streptomycetes are Gram-positive, aerobic, filamentous actinomycetes which are ubiquitous in soil and have an important ecological role in the turnover of organic material. Streptomyces somaliensis, however, is a human pathogen which can cause actinomycetoma: a severe and debilitating infection affecting the deep tissues and bones (Kirby et al. 2012; Seipke et al. 2012). S. somaliensis was found mostly in the sludge sampled in Hradec Králové, where its reads accounted for more than 50 % of all sequences of pathogens. To the best of our knowledge, this is the first report describing the presence of this human pathogen in sewage sludge samples.

Other species detected in the sludge samples included Acinetobacter calcoaceticus, Alcaligenes faecalis, and Gordonia spp., which are opportunistic pathogens that readily colonize immunocompromised patients and are commonly found in soil and water (Choi et al. 2012; Nakano et al. 2013). Also detected in the investigated samples were Legionella anisa, Bordetella bronchiseptica, Enterobacter aerogenes, Brucella melitensis, and Staphylococcus aureus.

Conclusions

In conclusion, this study shows that phylogenetically diverse microbial populations inhabit sewage sludge, with only a few taxa detected in all investigated samples. Our data indicate that the diversity of the communities in the sludge is influenced by the temperature of the anaerobic digestion. Diversity was higher after the mesophilic stabilization process, while populations after the thermophilic treatment were less diverse, with notable shifts towards higher proportions of thermotolerant taxa and lower numbers of sequences affiliated with pathogens. Further studies are needed to determine the survival rates of these organisms in biosolids after land application.