Introduction

With a rapidly growing population, solid and liquid wastes are increasing at an alarming rate in global ecosystems (Guerrero et al. 2013). Limited resources and the lack of proper management systems create dumping sites and pollute river ecosystems, which leads to the spread of human pathogenic bacteria and antibiotic resistance (AR) genes in both terrestrial and aquatic ecosystems (Blomström et al. 2016). However, knowledge of pathogenic bacteria in the natural environment is limited. Thus, an in-depth microbiological investigation is important for understanding the risks associated with such environments (Mwaikono et al. 2016). Most of the bacterial species are not suitable for laboratory cultivation, and therefore information on pathogenic species present in contaminated ecosystems is limited.

High-throughput sequencing (HTS) allows in-depth metagenome research and can characterize entire bacterial communities in any ecosystem. It has been successfully applied to monitor bacterial communities and their metabolic potential in diverse environments (Panda et al. 2015; De Mandal et al. 2017). Deep amplicon sequencing of 16S rRNA has been used to detect rare species (Welch and Huse 2011). An analysis of the contaminated environmental metagenome may be useful for assessing pathogenic strains present in the environment. Sequencing of such metagenomes could also lead to the discovery of novel bacterial taxa, virulence genes, and mobile genetic elements.

Mizoram, a state in Northeast India, is known as a “biodiversity hotspot” and has a unique ecology and geography with a mountainous landscape. The Mizo tribal community has a municipal waste management system, but because of the hilly terrain, wastes often end up in streams and rivers. Discharges from hospitals, dumping sites, and domestic waste cause significant water pollution, which leads to an increase in infectious diseases among the local population and animals. To date, no studies have been conducted to evaluate the bacterial populations present in these contaminated sites. Therefore, it is important to determine the dominant bacterial communities, their diversity, and their distribution in these environments.

In the present study, we examined the bacterial community structure and diversity using paired-end Illumina sequencing of V3–V4 regions of 16S rRNA in the domestic and hospital wastes in polluted soil (dumping land) and sediment (river) of Mizoram, North East India. The aim of this study was to analyze the bacterial communities and their imputed metabolic potentials in the contaminated sites. Questions of interest include:

  1. What are the bacterial communities and putative human bacterial pathogens occurring in contaminated sites?

  2. What are the functional roles, especially of the antibiotic resistance and xenobiotic-degrading genes, prevailing in these environments?

Materials and methods

Site selection and soil sampling

Soil and sediment samples were collected from three rivers and one dumping site contaminated by hospital and domestic waste in Aizawl city, Northeast India. Aizawl is the state capital of Mizoram, located at an average altitude of about 1132 m. The population according to the 2011 census was 293,416. Based on data on the drainage system and sewage management from the Public Health Department and Aizawl Municipal Council, four sampling sites were selected where maximum pollution could occur. The Chite river site (CHR) is contaminated by discharges from Hospital 1, Hospital 2, Hospital 3, and Hospital 4. The Turial river site (TUR) is largely polluted by Hospital 5 and Hospital 6. The Tuikual river site (TUKR) is contaminated by discharges from Hospital 7, and most of the solid waste from Aizawl city is disposed of at the solid waste dumping site (SWD) (Figs. 1 and 2). Soils and sediments (about 50 g) were sampled from 0 to 10 cm depth at the three contaminated rivers and the dumping site. At each of the four sites, ten replicate samples were collected at different locations and mixed uniformly to make a composite sample. All collected samples were placed in ethanol-disinfected core tubes, transported to the laboratory in an ice box, and kept at −80 °C until processed.

Fig. 1

Geographical location of sampling sites: Tuikual river site (TUKR), Chite river site (CHR), Turial river site (TUR), and solid waste dumping site (SWD).

Fig. 2

Schematic representation of major household effluent and domestic waste flow. a Eastern side of Aizawl City. b Western side of Aizawl City (Public Health Dept., Govt of Mizoram, 2017).

DNA isolation and Illumina sequencing

Bacterial DNA was isolated from approximately 500 mg of soil and sediment samples using the Fast DNA spin kit (MP Biomedical, Solon, OH, USA) according to the manufacturer’s protocol. The DNA concentration was quantified using a microplate reader (Spectra Max 2E, Molecular Devices, CA, USA). Each sequenced sample was prepared according to the Illumina 16S Metagenomic Sequencing Library protocols. DNA quantity and quality were measured by PicoGreen and NanoDrop. Amplification was performed using the forward primer 341F (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG) with locus-specific V3 sequence (CCTACGGGNGGCWGCAG) and reverse primer 805R (GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG) with locus-specific V4 sequence (GACTACHVGGGTATCTAATCC) (Klindworth et al. 2013). Genomic DNA (2 ng) was amplified using 5 pmol forward-tailed target-specific primer, 5 pmol reverse-tailed target-specific primer, and Herculase II polymerase (Agilent). PCR was carried out in triplicate for each variable region in a 25-μl reaction as follows: 95 °C for 3 min, followed by 25 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, and a final elongation step at 72 °C for 5 min. The amplicon libraries were cleaned of excess nucleotides, salts, and enzymes using 20 μl of the Agencourt AMPure XP system (Beckman Coulter Genomics) and eluted in 25 μl of TE buffer. Ten microliters of this first-step reaction was subjected to a second amplification step using Nextera XT Index Primer (N7xx), Nextera XT Index Primer (S5xx), and Herculase II polymerase (Agilent). PCR for each variable region was carried out in triplicate in a 25-μl reaction as follows: 95 °C for 3 min, followed by 10 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, and a final elongation step at 72 °C for 5 min. The amplicon libraries were again cleaned of excess nucleotides, salts, and enzymes using 20 μl of the Agencourt AMPure XP system (Beckman Coulter Genomics) and eluted in 25 μl of TE buffer.
The final purified product was quantified using qPCR (KAPA Library Quantification kits for Illumina Sequencing platforms) and qualified using the TapeStation DNA screentape D1000 (Agilent). Paired-end Illumina sequencing of the V3–V4 hypervariable region of the 16S rRNA gene was performed (Macrogen Inc., Korea), and raw reads were deposited into the NCBI (SRA accession number SRR7079871–SRR7079874).
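The locus-specific primer sequences given above contain IUPAC degenerate bases (N, W, H, and V). As a minimal sketch, and not part of the published workflow, the following Python helper expands such a primer into a regular expression and counts how many distinct sequences it represents. The function names are illustrative assumptions.

```python
import re

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_to_regex(primer):
    """Turn a degenerate primer into a regex pattern string."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        # Unambiguous bases stay literal; degenerate ones become a character class.
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)

def degeneracy(primer):
    """Count the distinct concrete sequences a degenerate primer encodes."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total

# Locus-specific sequences quoted in the text.
FORWARD_V3 = "CCTACGGGNGGCWGCAG"
REVERSE_V4 = "GACTACHVGGGTATCTAATCC"

if __name__ == "__main__":
    for name, primer in (("forward", FORWARD_V3), ("reverse", REVERSE_V4)):
        print(name, primer_to_regex(primer), degeneracy(primer))
```

Under these definitions the forward sequence expands to 8 variants (N × W = 4 × 2) and the reverse sequence to 9 (H × V = 3 × 3).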

Analysis of Illumina sequences

The adapter sequences were removed from the raw fastq sequences using Scythe, and low-quality bases were trimmed using Sickle v1.33 (Joshi and Fass 2011). The adapter trimmed sequences were analyzed using QIIME software package v.1.8.0 (Caporaso et al. 2010). Sequences with poor quality were filtered using the split_libraries command. USEARCH was used to remove chimeric sequences followed by deletion of singletons (Edgar et al. 2011). The consensus sequences were clustered into operational taxonomic units (OTUs) based on their sequence similarity by the Uclust program (similarity cutoff = 0.97) (Edgar 2010). Finally, the representative sequence of each OTU was aligned to the Greengenes core set reference database using the PyNAST program and classified using Greengenes database (DeSantis et al. 2006). Shannon diversity and observed species indices were calculated from the Illumina dataset by QIIME software.
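The two alpha diversity indices named above can be illustrated with a short Python sketch. It is not the QIIME implementation. It assumes a base-2 logarithm for the Shannon index, which seems consistent with the values reported in the Results: an index of 11.18 for a sample with 6492 OTUs is not reachable with natural logarithms, since ln(6492) ≈ 8.78. The function names are illustrative.

```python
import math

def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with reads."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total          # relative abundance of this OTU
            h -= p * math.log(p, base)
    return h

def observed_otus(counts):
    """Observed species metric: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)

if __name__ == "__main__":
    # Hypothetical OTU counts for one sample, for illustration only.
    sample = [10, 10, 10, 10, 0]
    print(shannon_index(sample), observed_otus(sample))
```

Four equally abundant OTUs give a base-2 Shannon index of 2, the maximum for four OTUs; uneven abundances lower the index while leaving the observed OTU count unchanged.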

Human pathogen identification from Illumina sequencing data

The bacterial pathogen database (HBPD) was used to search for human bacterial pathogens among the Illumina sequencing reads (Cai and Zhang 2013). All the pre-processed V3–V4 16S rRNA Illumina reads obtained in this study were compared against the HBPD database for sequence similarity using the NCBI-BLAST algorithm. Only sequences with a BLAST hit of ≥ 97% identity were considered for the analysis (Banskar et al. 2016).
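The identity filter described above could be applied to BLAST results roughly as follows. This is a sketch under stated assumptions: it expects the standard 12-column tabular BLAST output (qseqid, sseqid, pident, and so on, ending with bitscore), keeps the highest-scoring hit per read, and uses made-up read and subject names. It does not reproduce the authors' actual script.

```python
def filter_hits(blast_lines, min_identity=97.0):
    """Return the best hit per read among hits with identity >= min_identity.

    Each line is assumed to be tab-separated with the 12 default tabular
    columns: qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        bitscore = float(fields[11])
        if identity < min_identity:
            continue
        # Keep only the highest-scoring qualifying hit for each read.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, identity, bitscore)
    return best

# Hypothetical BLAST rows, for illustration only.
EXAMPLE = [
    "read1\tSalmonella_enterica\t99.5\t200\t1\t0\t1\t200\t10\t209\t1e-100\t360",
    "read1\tEscherichia_coli\t97.0\t200\t6\t0\t1\t200\t10\t209\t1e-90\t330",
    "read2\tYersinia_pestis\t95.0\t200\t10\t0\t1\t200\t5\t204\t1e-80\t300",
]

if __name__ == "__main__":
    for read, (subject, identity, _) in filter_hits(EXAMPLE).items():
        print(read, subject, identity)
```

In the example, read2 is dropped because its only hit falls below 97% identity, and read1 is assigned to its higher-scoring hit.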

Predicted metagenome analysis

Phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt) (Langille et al. 2013) was used to predict the community function in the contaminated sites based on a constructed phylogenetic tree of 16S rRNA marker gene sequences. The OTU table obtained with the pick_closed_reference_otus.py script against the Greengenes database was normalized by predicted 16S rRNA gene copy number and then used for metagenome functional prediction. The resulting table was collapsed at KO level 3 within the KEGG pathway hierarchy using the categorize_by_function.py script.
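The two prediction steps described above, copy number normalization followed by functional prediction, can be sketched conceptually in Python. This is a simplified illustration of the idea, not PICRUSt's code: the OTU names, copy numbers, and per-OTU gene contents are hypothetical, and the real tool draws these values from precomputed tables.

```python
def normalize_by_copy_number(otu_counts, copy_numbers):
    """Divide each OTU's read count by its predicted 16S rRNA copy number."""
    return {otu: count / copy_numbers[otu] for otu, count in otu_counts.items()}

def predict_gene_abundance(normalized, gene_content):
    """Sum (normalized OTU abundance x predicted gene copies) for each KO."""
    totals = {}
    for otu, abundance in normalized.items():
        for gene, copies in gene_content.get(otu, {}).items():
            totals[gene] = totals.get(gene, 0.0) + abundance * copies
    return totals

if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    otu_counts = {"otu_a": 100, "otu_b": 30}
    copy_numbers = {"otu_a": 4, "otu_b": 2}
    gene_content = {
        "otu_a": {"K01467": 1},
        "otu_b": {"K01467": 2, "K05366": 1},
    }
    normalized = normalize_by_copy_number(otu_counts, copy_numbers)
    print(predict_gene_abundance(normalized, gene_content))
```

Normalizing first matters because an organism with four 16S copies contributes four times as many amplicon reads per cell as one with a single copy, which would otherwise inflate its predicted gene contribution.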

Statistical analysis

The Shannon index reflects OTU abundance and accounts for both richness and evenness, while the observed species metric counts only the unique OTUs present in a sample. In this study, beta diversity between the bacterial communities was determined by weighted and unweighted UniFrac (Lozupone and Knight 2005). Weighted and unweighted UPGMA trees were constructed with jackknife support. A heatmap of the relative abundance of the top 50 bacterial genera was generated using the CIMminer tool (http://discover.nci.nih.gov/). Differences in the metabolic potential of soil and sediment samples were examined using Welch’s t test in the STAMP software.
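For reference, the statistic behind Welch’s t test can be written in a few lines of Python. This is a generic sketch of the textbook formula with the Welch–Satterthwaite degrees of freedom, not the STAMP implementation, and the input values are hypothetical. It requires at least two observations per group and a nonzero pooled standard error.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error of each group mean (sample variance / n).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

if __name__ == "__main__":
    # Hypothetical pathway abundances (%) in two groups of samples.
    group_1 = [1.0, 2.0, 3.0]
    group_2 = [2.0, 4.0, 6.0]
    print(welch_t(group_1, group_2))
```

Unlike Student’s t test, this form does not assume equal variances in the two groups, which is why the degrees of freedom are generally non-integer.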

Results and discussion

Environmental monitoring is important for understanding the processes and activities of an environment and is also used to assess human activities that carry a risk of harmful effects on the natural environment. Next-generation sequencing (NGS) technologies have been applied in various complex microbial environments to answer diverse ecological questions and have enabled scientists to routinely use this technology for environmental monitoring (Kwon et al. 2011). In the present study, we employed a high-throughput Illumina sequencing method to characterize the taxonomic and imputed functional composition of four different sites contaminated with domestic and/or hospital wastes. A total of 1,113,884 16S rRNA sequence tags were obtained by Illumina sequencing with an average length of 200 bp (Table 1). The pre-processed high-quality Illumina sequences ranged from 37,999 (SWD) to 42,387 sequences (CHR) with a mean of 39,985 and SD 1791. A detailed description of the sequence statistics is provided in Table 1.

Table 1 Raw read statistics obtained from Illumina and nanopore sequencing

Bacterial diversity

The NGS method identified the abundance and diversity of bacteria in soil and sediment samples of Mizoram, Northeast India. Samples were rarefied to a depth of 37,999 reads. Rarefaction curves for richness in all four samples plateau at the maximum depth, indicating adequate sampling (Fig. 3). The Shannon index suggests that microbial diversity differs between the study sites, with the lowest diversity in SWD (8.58) and the highest in CHR (11.18). Chao richness estimates suggest that richness was greatest in TUR sediment and lowest in SWD (Table S1). A total of 20,108 OTUs were found, ranging from 2537 (SWD) to 6492 (CHR). Both soil and sediment samples contain high bacterial diversity. A large amount of untreated domestic waste is discarded at the studied sites, which may serve as an inoculum that increases the overall bacterial diversity in these environments. Differences between the bacterial communities were examined using principal coordinates analysis (PCoA) and UPGMA hierarchical clustering analysis in QIIME. PCoA (Fig. 4a, b) revealed that the microbial communities were different in the studied samples. Bacterial communities present in TUR and TUKR were more closely related than those in the CHR and SWD samples. Hierarchical clustering analysis (Fig. 4c, d) using both soil and sediment samples showed that the microbial communities are significantly different from each other. The differences in alpha diversity, as well as in bacterial communities between the samples, may be due to variation in physicochemical parameters (Ibekwe et al. 2013).
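Rarefying to a common depth, as done above at 37,999 reads, amounts to drawing reads without replacement from each sample. The Python sketch below illustrates the idea on a toy OTU table. The function name, the seed handling, and the counts are assumptions for illustration and do not reproduce QIIME's implementation.

```python
import random

def rarefy(otu_counts, depth, seed=0):
    """Subsample `depth` reads without replacement from an OTU count table."""
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the table into one entry per read, then draw without replacement.
    reads = [otu for otu, count in otu_counts.items() for _ in range(count)]
    rng = random.Random(seed)
    picked = rng.sample(reads, depth)
    rarefied = {}
    for otu in picked:
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sample = {"otu_a": 5, "otu_b": 3, "otu_c": 2}
    print(rarefy(sample, depth=4, seed=1))
```

Because every sample is reduced to the size of the smallest one, richness and diversity estimates are compared at equal sequencing effort; rare OTUs may drop out of deeper samples as a result.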

Fig. 3

Rarefaction curve of the samples (cutoff of 3%): alpha-rarefaction plots were generated in QIIME using the observed species metric for estimating alpha diversity.

Fig. 4

Principal coordinate analysis (PCoA) and UPGMA clusters obtained using the UniFrac distance matrix. a PCoA based on the unweighted UniFrac distance. b PCoA based on the weighted UniFrac distance. c UPGMA cluster based on the unweighted UniFrac measurements. d UPGMA cluster based on the weighted UniFrac measurements. The variance explained by each principal coordinate axis is shown in parentheses. Datasets were subsampled to equal depth prior to the UniFrac distance computation. Legend: TUKR Tuikual river site, CHR Chite river site, TUR Turial river site, SWD solid waste dumping site.

Composition of the bacterial communities

Domestic or hospital waste has a major impact on microbial populations (Liu et al. 2015, 2018). In the present study, 50 bacterial phyla were identified in the soil and sediment samples using Illumina sequencing (Fig. 5). Bacterial members (phyla, genera, species, etc.) with a higher relative abundance in the complete dataset are referred to as dominant bacteria. The most dominant bacterial phyla present in these samples were Proteobacteria (48.1%), Bacteroidetes (23.6%), Acidobacteria (4.9%), Verrucomicrobia (4.20%), Firmicutes (3.1%), Actinobacteria (2.6%), Chloroflexi (1.90%), Planctomycetes (1.9%), and OD1 (1.3%). Other bacterial phyla each accounted for < 1% of the total bacterial community. Abundant Gammaproteobacteria and Bacteroidetes have previously been reported in municipal waste landfill sites (Song et al. 2015). These phyla are commonly found in anaerobic ecosystems and play a major role in organic matter degradation and carbon cycling. Firmicutes (8.8%) and Actinobacteria (6.7%) were prominent in the dumping site; Acidobacteria (14.1%), Verrucomicrobia (7.6%), and Planctomycetes (5.3%) in Chite river sediment; Verrucomicrobia (4.5%) and Acidobacteria (3.8%) in Turial river sediment; and Verrucomicrobia (4.6%) and Firmicutes (1.8%) in Tuikual river sediment. The presence of the phylum Firmicutes suggests that anaerobic and methanogenic decomposition processes may occur in dumping sites (Song et al. 2015). Comparison of soil and sediment samples revealed that soil samples were dominated by the bacterial phyla Proteobacteria (50.90%), Bacteroidetes (31.10%), Firmicutes (8.80%), and Actinobacteria (6.70%), whereas sediment samples were dominated by Proteobacteria (47.00%), Bacteroidetes (20.90%), Acidobacteria (6.70%), Verrucomicrobia (5.60%), Chloroflexi (2.50%), Planctomycetes (2.50%), OD1 (1.80%), Nitrospirae (1.70%), Firmicutes (1.20%), and Actinobacteria (1.20%) (Fig. S1).
The presence of Verrucomicrobia in all the sediment samples is consistent with the previous literature, but little is known about their physiology (Shen et al. 2017) (Fig. 2).
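The percentages quoted in this section are relative abundances, that is, each taxon's read count divided by the total reads in the sample. The short Python sketch below shows that calculation on hypothetical counts, and also checks one figure from the text: the sediment value for Verrucomicrobia (5.60%) matches the mean of the three river sites (7.6%, 4.5%, and 4.6%), which suggests, although the text does not state it, that the sediment values are per-site averages.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into percentages of the sample total."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}

def mean_percentage(values):
    """Arithmetic mean of per-site percentages."""
    return sum(values) / len(values)

# Verrucomicrobia percentages reported in the text for CHR, TUR, and TUKR.
VERRUCOMICROBIA_SEDIMENT = [7.6, 4.5, 4.6]

if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    print(relative_abundance({"Proteobacteria": 3, "Bacteroidetes": 1}))
    print(round(mean_percentage(VERRUCOMICROBIA_SEDIMENT), 1))
```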

Fig. 5

Relative abundance of the bacterial community at the phylum level. The relative abundances of the bacterial community at the phylum level were calculated using QIIME. Each bacterial phylum is represented by a different color in the bar graphs. Legend: TUKR Tuikual river site, CHR Chite river site, TUR Turial river site, SWD solid waste dumping site.

Of the 95 bacterial genera identified at these four sites, the most abundant were Flavobacterium (4.1%), Acinetobacter (3.7%), Geobacter (2.5%), Prevotella (1.8%), Novosphingobium (1.70%), Cloacibacterium (1.4%), Corynebacterium (1.1%), Bacteroides (1.1%), Paludibacter (1.1%), and Comamonas (1.1%). A heatmap of the top 50 bacterial genera is shown in Fig. 6. Dominant bacterial genera present at > 1% of the total bacterial community in both soil and sediment samples were Acinetobacter, Flavobacterium, Prevotella, Corynebacterium, Comamonas, Bacteroides, Wautersiella, Cloacibacterium, Stenotrophomonas, Sphingobacterium, and Pseudomonas, whereas soil samples were specifically dominated by the bacterial genera Geobacter, Novosphingobium, and Paludibacter and sediment samples by Arcobacter and Myroides. Members of the genus Geobacter can reduce metals and sulfur (Yan et al. 2012). Further long-read nanopore sequencing identified two species of this genus: G. metallireducens and G. sulfurreducens. Both species are involved in metal reduction (Lovley et al. 1993; Caccavo et al. 1994). The second most dominant genus in the river sediments was Novosphingobium, a gram-negative bacterium involved in the degradation of polycyclic aromatic hydrocarbons (Sohn et al. 2004). Paludibacter is a gram-negative, strictly anaerobic, chemoorganotrophic, non-motile genus whose function in river sediment is not yet known (Qiu et al. 2013). Identification of the genus Dechloromonas suggests oxidation of aromatic compounds in river sediments (Coates et al. 2001). Nitrospira is found in different ecosystems and is involved in the nitrogen cycle (Lucker et al. 2010; Van Kessel et al. 2015). The presence of Nitrospira in the sediment samples suggests that nitrification may occur in these environments.
Members of the genus Cloacibacterium are facultatively anaerobic bacteria and have been identified in wastewater (Allen et al. 2006). The present study also identified the genus Acinetobacter. Members of this genus have previously been reported in healthcare-associated infections and can acquire AR much faster than other gram-negative organisms (Manchanda et al. 2010). Prevotella is typically found in the human gut microbiome, and its presence in the soil ecosystem indicates possible fecal contamination through domestic waste (Tanaka et al. 2008). The Illumina dataset also contained a large number of sequences assigned to Stenotrophomonas maltophilia, an opportunistic pathogen resistant to multiple antibiotics (Brooke 2012).

Fig. 6

Heatmap showing the top 50 major bacterial genera. Legend: TUKR Tuikual river site, CHR Chite river site, TUR Turial river site, SWD solid waste dumping site

Human pathogen in different samples

Improper management of domestic and hospital wastes is one of the main reasons for the occurrence of pathogens in different ecosystems (Watson et al. 2016). Understanding pathogens in the natural environment provides direct evidence of possible health risks and helps to develop preventive action strategies (Yang et al. 2008; Ivanek et al. 2009; Ibekwe et al. 2013). In the present study, the presence of putative bacterial pathogens in domestic- and hospital waste-contaminated soil and sediment samples was evaluated using the HBPD database. This database contains 16S rRNA gene sequences derived from well-characterized bacterial pathogens (Cai and Zhang 2013). A similar approach was used to identify pathogens from bat intestine using Ion Torrent sequencing of the V3 region of 16S rRNA (Banskar et al. 2016). Twenty-seven different species of well-known bacterial pathogens were identified with a 97% sequence similarity cutoff and ≥ 99% query coverage. The main pathogenic species dominant at all the sites were Salmonella enterica, Pseudomonas aeruginosa, Escherichia coli, and Staphylococcus aureus. Other dominant species found in SWD were Corynebacterium diphtheriae, Enterococcus faecalis, Brucella melitensis, Yersinia enterocolitica, Y. pestis, Shigella boydii, and S. flexneri. Yersinia enterocolitica and S. flexneri were also abundant in TUR and TUKR, respectively (Fig. 7).

Fig. 7

Heatmap showing the major bacterial pathogens present in the soil and sediment samples. Legend: TUKR Tuikual river site, CHR Chite river site, TUR Turial river site, SWD solid waste dumping site

In the present study, the major putative bacterial pathogen was Salmonella enterica, which causes a spectrum of diseases, including typhoid and gastroenteritis (Giannella 1996). Two species of Yersinia, Y. enterocolitica and Y. pestis, were also identified. While Y. enterocolitica can cause yersiniosis and diarrhea in humans, Y. pestis causes plague (Kenneth and Ray 2004; Fàbrega and Vila 2012). The presence of these species indicates the possible transmission of pathogenic bacterial species through animal activities in the dumping land. Another identified pathogen, P. aeruginosa, has been reported to cause several nosocomial and opportunistic infections (De Bentzmann and Plésiat 2011). Two important bacterial pathogenic species, Corynebacterium diphtheriae and Mycobacterium tuberculosis, were also observed in the Illumina dataset (Kenneth and Ray 2004). Both genera, Corynebacterium and Mycobacterium, can spread through droplet nuclei and have been reported from sediments affected by urban runoff (Kenneth and Ray 2004). The bacterial pathogen Enterococcus faecalis was abundant mainly in the dumping soil. Members of this species can acquire AR genes from medical waste-contaminated environments, which plays a major role in pathogenesis (Kenneth and Ray 2004). Three species of the genus Shigella (S. flexneri, S. boydii, and S. dysenteriae) were more abundant at the dumping site than in the sediment samples. Members of this genus are involved in several human and animal diseases and can acquire multidrug resistance (García et al. 2010; Doumith et al. 2012; Fritah et al. 2014; Mwaikono et al. 2016). The association of this genus with dumping sites was previously studied by Mwaikono et al. (2016), who postulated that interaction between animals, humans, and microbes at dumping sites could be a public health risk factor.

Predictive functional role of the bacterial community

The PICRUSt tool used in this study may provide useful insights into the imputed functions of the bacterial community (Langille et al. 2013). Based on the prediction, the identified genes were assigned to KEGG pathways involved in metabolism (50.02%), genetic information processing (16.23%), environmental information processing (12.90%), cellular processes (4.37%), organismal systems (0.79%), and human diseases (1.10%). A high representation of amino acid metabolism (20.78%), carbohydrate metabolism (19.97%), and energy metabolism (11.87%) was observed. In addition, high proportions of gene families involved in lipid metabolism (7.46%), metabolism of cofactors and vitamins (8.456%), nucleotide metabolism (6.42%), and xenobiotics biodegradation and metabolism (6.75%) were also predicted (Fig. S2). These dominant putative pathways are vital for the survival of bacterial communities (Erickson et al. 2012; Lamendella et al. 2011).

AR in pathogenic bacteria has become a worldwide problem caused primarily by the abuse and misuse of antibiotics, and a high prevalence of AR genes has commonly been found in contaminated ecosystems, including river sediments (Goossens et al. 2005). In the present study, the selected sites were contaminated with domestic and hospital waste. Therefore, it is important to study the AR genes in these environments (Allen et al. 2009). Abundant putative AR genes involved in beta-lactam antibiotic resistance (K01207, K01467, K03585, K03587, K05366, K05515, K08218, and K12340) were found primarily in 51 taxonomically diverse families (Table S2). In the sediment samples, these genes had a higher predicted prevalence among the bacterial families Bacillaceae, Bacteroidaceae, Bdellovibrionaceae, Chitinophagaceae, Comamonadaceae, Cytophagaceae, Corynebacteriaceae, Enterococcaceae, Erysipelotrichaceae, Flavobacteriaceae, Nocardiaceae, Hydrogenophilaceae, Hyphomicrobiaceae, Mycobacteriaceae, Sphingobacteriaceae, Staphylococcaceae, Streptococcaceae, Micrococcaceae, Streptomycetaceae, and Micromonosporaceae. In the soil sample, these KOs are predicted to occur mainly in Bacteroidaceae, Enterobacteriaceae, Corynebacteriaceae, Hyphomicrobiaceae, and Nocardiaceae (Fig. 5). The identified AR genes are involved in the degradation of β-lactam antibiotics such as ampicillin, which are commonly used as antimicrobial agents to treat bacterial infections (Elander 2003). Thus, the presence of such genes leads to speculation about the misuse or overuse of antibiotics in this region.

Because of the improper disposal system, household and hospital wastes containing xenobiotic compounds are deposited in the environment. The presence of recalcitrant xenobiotic compounds exerts selective pressure on the microbial community, leading to the elimination of susceptible species (Nojiri et al. 2004; Devpura et al. 2017). The surviving communities adopt a number of metabolic processes to persist in these environments (Nojiri et al. 2004; Devpura et al. 2017). They acquire novel enzymatic mechanisms as well as degradative metabolic pathways through natural selection for the decomposition of recalcitrant compounds (Galvão et al. 2005). Thus, an analysis of the metabolic activity of these communities will help to understand the possible impact of these domestic wastes on the ecosystems. In the present study, we analyzed genes involved in such metabolic processes in samples of contaminated soil and sediments in Northeastern India and predicted several genes encoding enzymes involved in the degradation of xenobiotic compounds. A total of 8, 4, 12, 7, 12, 8, 12, and 8 putative xenobiotic degradation genes involved in the degradation of benzoate, aminobenzene, naphthalene, fluorobenzoate, chloroalkane and chloroalkene, chlorocyclohexane and chlorobenzene, nitrotoluene, and toluene, respectively, were found in the soil and sediment samples. The major bacterial OTUs associated with benzoate degradation were assigned to Acinetobacter, Arthrobacter, Comamonas, Diaphorobacter, Geobacter, Novispirillum, Phenylobacterium, Pseudoxanthomonas, and Rhodoplanes. The predicted dominant genera associated with the degradation of aminobenzene were Sphingomonas, Novosphingobium, Sulfuricurvum, and Telmatospirillum. Similarly, the degradation of naphthalene was attributed to the major bacterial OTUs assigned to the genera Arthrobacter, Comamonas, Diaphorobacter, Geobacter, Klebsiella, Leptothrix, and Novosphingobium.
OTUs assigned to the bacterial genera Acinetobacter, Agrobacterium, Citrobacter, Devosia, Diaphorobacter, Erwinia, Hyphomicrobium, Klebsiella, Leptothrix, Novosphingobium, Perlucidibaca, Pseudomonas, Rhodoplanes, and Variovorax mainly contributed to fluorobenzoate degradation. A complete list of bacterial OTUs associated with xenobiotic degradation is shown in Table S3. The abundance of these genes in the metagenomic datasets suggests that the hospital and domestic wastes may contain high concentrations of benzoate, aminobenzene, naphthalene, fluorobenzoate, chloroalkane and chloroalkene, chlorocyclohexane and chlorobenzene, nitrotoluene, and toluene.

Conclusion

The present study examined the taxonomic and metabolic diversity of the bacterial communities present in contaminated soil and sediment samples. The bacterial community at the polluted sites was dominated by Proteobacteria and Bacteroidetes. All the sites harbored putative pathogenic bacteria, and the imputed functional analysis identified putative genes associated with antibiotic resistance as well as xenobiotic degradation in these environments. This study improves our understanding of the bacterial community and its metabolic activity in contaminated ecosystems, which will help to mitigate health hazards and develop environmental remediation plans.