1 Introduction

Arsenic (As) is a ubiquitous element in the earth’s crust. It is a metalloid that exists in different forms, including carbonate, sulfide, and elemental forms (Genchi et al., 2022a). Globally, the As concentration in soil varies widely, viz., Bangladesh (4 to 137.9 mg kg−1), United States (2.8 to 73 mg kg−1), China (1.9 to 36.0 mg kg−1), Japan (8.5 to 10.30 mg kg−1), South Korea (2 to 4.6 mg kg−1), and India (3.7 to 423 mg kg−1) (Khan et al., 2010; Kumar et al., 2016; Mandal & Suzuki, 2002; Ori et al., 1993; Patel et al., 2005; Srivastava & Sharma, 2013; Zhou et al., 2018).

Contamination with and exposure to As have been reported to severely affect the environment and human health. Arsenic is categorized as a group I human carcinogen causing skin, lung, bladder, liver, and kidney cancer (Genchi et al., 2022b). Exposure to As can cause skin lesions, cardiovascular diseases, birth defects, and cognitive impairment (Domingo-Relloso et al., 2022; Hamadani et al., 2011; Monteiro De Oliveira et al., 2021; Rudnai et al., 2014). Millions of people spanning 70 different countries are affected by the use of As-contaminated groundwater (Shrivastava et al., 2015). The As-tainted groundwater also contaminates agricultural soil, leading to toxicity and reduced growth in plants, e.g., straight-head disease in paddy (Mishra et al., 2021).

Soil-dwelling microorganisms can potentially detoxify As through the process of absorption, precipitation, accumulation, and further chemical transformation through redox and/or methylation (Singh et al., 2015). These microbial As detoxification/resistance mechanisms can be exploited for As bioremediation at the contaminated sites. Various bacterial and fungal species including Bacillus subtilis, Bacillus cereus, Acidithiobacillus ferrooxidans, Saccharomyces cerevisiae, and Aspergillus niger manifest arsenic biosorption (Chandraprabha & Natarajan, 2011; Giri et al., 2013; MasudHossain & Anantharaman, 2006; Mitra et al., 2017; Zoroufchi Benis et al., 2020). Some microbes such as Trametes versicolor are known to hyper-accumulate arsenic within their cells (Adeyemi, 2009). Bacillus sp. strain DJ-1 is reported to accumulate up to 9.8 mg of As per g of its dry weight (Joshi et al., 2009). Westerdykella aurantiaca, Neosartorya fischeri, Aspergillus sp., Rhizopus sp., and Humicola sp. have been reported to methylate As, suggesting their role in arsenic detoxification by formation of organic As+5 derivatives which are relatively less toxic (Čerňanský et al., 2009; Srivastava et al., 2011; Tripathi et al., 2020). Intracellular methylation of arsenate has been demonstrated in Trichoderma asperellum, Penicillium janthinellum, and Fusarium oxysporum (Su et al., 2012).

Genomic studies have led to the identification of novel microbial species and enzymes involved in the detoxification of As, which can further be used in As bioremediation. Metagenomics is one such approach that can serve as an effective tool to analyze native microbial communities and to provide comprehensive information on soil community composition and potential functional traits (Castro-Severyn et al., 2021). Unlike the culture-dependent approach, metagenomics targets all the microbes and their associated genes, thereby providing a holistic picture. Xiao et al. (2016) conducted a metagenomic study in five paddy soils contaminated with As (< 16 mg kg−1) and reported the relative abundance of distinct As metabolizing genes in different samples, among which the arsenate reduction genes were found to be dominant. In another study conducted on As-contaminated mine samples, a high abundance of Proteobacteria, with Gammaproteobacteria dominating all the samples, was found. The bacterial phyla Chlorobi and Bacteroidetes were observed to occur only in the soil samples with high As concentration (~ 821.23 mg kg−1). In terms of functional diversity, higher relative abundance and diversity of As-responsive genes including arsC, arrA, aioA, arsB, and ACR3 were reported in soil samples with high concentrations of As and antimony (Sb). The gene arsC, responsible for arsenate reduction, had the highest mean relative expression, whereas the arsenic methylating gene arsM showed the lowest expression (Luo et al., 2014).

An understanding of microbial dynamics at As-contaminated sites is needed to explore novel microbes expressing novel genes, enzymes, and processes that may prove helpful in As bioremediation, especially in agricultural ecosystems. The agricultural ecosystem is directly responsible for the health and well-being of the population dependent on it. Therefore, arsenic contamination in these areas may have serious implications. Since microbes inhabiting the rhizosphere of the crops grown in As contaminated regions influence As geochemical cycling, its chemical form, and bioavailability, it becomes essential to study the impact of changing soil arsenic concentration on the native microbial diversity and abundance (Gu et al., 2017). This study thus highlights the role of the As gradient in shaping the microbial community and their functions in the contaminated paddy soil using a comprehensive metagenomic-driven approach. It was hypothesized that the variation of soil arsenic contamination may have a differential impact on the soil microbial communities, their taxonomic diversity, and their functions. This study divulges microbial community compositions and functions at different levels of As-contaminated soils (DNA level), which provides a comprehensive understanding of the relationship between microbial community characteristics and As contamination in the soil.

2 Experimental Procedures

2.1 Sample Collection, Chemicals, and Reagents

Soil samples were collected from four distant sites namely, Ghorhat (SKNNNGH), Siswania (SKNNNSIS), Sakhi (SKNNNSK), and Majhaua (SKNNNMJ) of an administrative block Nathnagar in the district Sant Kabir Nagar (26.55N – 26.66N, 82.95E – 83.08E) lying in the North Eastern Plains of Uttar Pradesh, India (Fig. S1). The district Sant Kabir Nagar was selected based on our previous soil and groundwater arsenic contamination mapping data. Block Nathnagar was chosen within the district based on availability of a gradient of soil arsenic contamination and groundwater As concentration. The groundwater total As concentration (µg l−1) in samples SKNNNGH, SKNNNSIS, SKNNNSK, and SKNNNMJ were 19.93, 86.90, 98.66, and 143.93, respectively. The samples observed with the gradient of arsenic contamination were then further used for metagenomic studies.

Soil samples were collected from the rhizosphere of the paddy fields at the time of harvesting at physiological maturity. The same paddy variety was grown by the farmers at all four sites of sampling. The soil samples were drawn from the topsoil (0–10 cm) of agricultural land using an iron core (60 mm internal diameter). At a site, nine parallel samples from paddy fields located within a 1 ha area were collected and mixed thoroughly to form a site-wise representative sample. The samples were stored in sterile polyethene bags and aseptically transferred to the laboratory for sieving through a 2 mm sieve. One sub-section of the fresh soil sample was stored at -20 °C for further analysis. The remaining soil was air-dried and used to analyze soil physico-chemical parameters. Based on soil As contents, samples were classified into three ranges of arsenic gradient viz., low (4.88 mg kg−1), mid (15.89 and 24.84 mg kg−1), and high (43.67 mg kg−1).

The multielement standards were procured from Thermofisher Scientific, Germany. Trace metal grade Nitric acid and Hydrogen Peroxide were used for digesting the samples for ICP-MS analysis. The chemicals used for soil physicochemical and biological analysis were of reagent-grade procured from Himedia Pvt Ltd.

2.2 Soil Physico-Chemical and Biological Analysis

The physicochemical and biological properties of the soil samples were analyzed initially to obtain baseline information on soil characteristics. Soil parameters such as bulk density (BD) and soil texture were analyzed using the method given by Jackson (1969). Water holding capacity (WHC) was determined by Keen’s box method. Soil pH and EC were measured using a Multi 9630 IDS multimeter (WTW, Germany) in a soil suspension with a 1:5 ratio of soil to ultrapure Milli-Q water. Microbial biomass carbon (MBC) was analyzed by the chloroform fumigation-extraction method and total organic carbon (TOC) by the Walkley and Black method (Vance et al., 1987; Walkley, 1947). Soil available nitrogen and phosphorus were analyzed by the Kjeldahl and Olsen methods, respectively (Bremner, 1960; Olsen, 1954). Available potassium was analyzed using a flame photometer (Systronics 128). The activity of the soil enzyme dehydrogenase was analyzed by spectroscopic quantification of its reaction product triphenylformazan (TPF), formed by reduction of the substrate 2,3,5-triphenyltetrazolium chloride (TTC) (Małachowska-Jutsz & Matyja, 2019). Soil fluorescein diacetate (FDA) hydrolysis activity was analyzed by measuring the concentration of fluorescein (µg g−1 h−1) released upon hydrolysis of FDA (a colorless substrate) by several classes of soil enzymes such as lipases, esterases, and proteases (Schnürer & Rosswall, 1982).

All soil samples were analyzed in triplicate, and results are presented as mean ± SE.
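As an illustration of how a mean ± SE value can be derived from triplicate measurements, a minimal Python sketch is given below. It assumes that SE denotes the standard error of the mean computed from the sample standard deviation; the function name and the replicate readings are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_se(replicates):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(replicates)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate SE")
    mean = statistics.mean(replicates)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    se = statistics.stdev(replicates) / math.sqrt(n)
    return mean, se


# Hypothetical triplicate pH readings for one sample (illustrative only).
ph_replicates = [7.40, 7.41, 7.42]
mean, se = mean_se(ph_replicates)
print(f"{mean:.2f} ± {se:.3f}")
```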

2.3 Analysis of Total Soil As Using ICP-MS

The total As, Manganese (Mn), Zinc (Zn), Selenium (Se), and Iron (Fe) content of the soil samples was analyzed using ICP-MS (iCAP TQ Thermo Fisher Scientific). The dried soil samples were sieved (2 mm) and 500 mg of each sample was digested using 69% nitric acid and 30% hydrogen peroxide in a microwave digester (Mars 6 CEM). The digestion was carried out according to the EPA method 3051A (Element, 2007). The digested samples were diluted and filtered through 0.2 µm syringe filters and then analyzed by ICP-MS. The analysis of As (75) and Se (80) was done in Triple Quadrupole (TQ) mode using oxygen as the reacting gas and analyzed as 75As.16O and 80Se.16O respectively. The elements Mn, Zn, and Fe were analyzed in Single Quadrupole-Kinetic Energy Discrimination (SQ -KED) mode using Helium (He) gas in the collision cell. The mass calibration of the instrument was performed using the iCAP Q/Qnova calibration solution (Thermo Scientific). The intermediate performance of the instrument was checked using iCAP TQ Tune Solution BRE0009578 (Thermo Scientific) and BMSCIENTIFIC-442 (Inorganic Ventures) tune solution. The calibration and quality assurance of the samples were ensured by repeated analysis (n = 5) of CRMs with the recovery of the elements in the range of 98–105%. The detection limit for As was 1 µg L−1.
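The conversion from an ICP-MS reading of the digest to a soil concentration, and the CRM recovery check, can be sketched as follows. This is a minimal illustration under stated assumptions: the final digest volume (50 mL), the dilution factor (10), and the instrument reading are hypothetical, since only the sample mass (500 mg) is given in the text. The function names are likewise illustrative.

```python
def soil_concentration_mg_per_kg(reading_ug_per_l, final_volume_ml,
                                 dilution_factor, sample_mass_g):
    """Convert an ICP-MS reading of a diluted digest to mg of element per kg of soil."""
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    # Mass of analyte in the whole digest, in micrograms.
    analyte_ug = reading_ug_per_l * dilution_factor * (final_volume_ml / 1000.0)
    # ug per g of soil is numerically equal to mg per kg.
    return analyte_ug / sample_mass_g


def recovery_percent(measured, certified):
    """Recovery of a certified reference material, in percent."""
    return 100.0 * measured / certified


# 0.5 g of soil as in the text; the volume, dilution, and reading are assumptions.
as_soil = soil_concentration_mg_per_kg(
    reading_ug_per_l=20.0, final_volume_ml=50.0,
    dilution_factor=10.0, sample_mass_g=0.5)
print(f"{as_soil:.2f} mg/kg")
```

A recovery computed with `recovery_percent` would then be compared against the 98–105% range reported above.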

2.4 Metagenomic Sequencing

2.4.1 DNA Extraction and Library Construction

Total DNA from the As-contaminated soil samples was isolated using a NucleoSpin Soil kit (MACHEREY–NAGEL, Germany) as per the manufacturer’s instructions. DNA purity and concentration assessment and library construction were done as reported previously by Kaur et al. (2021). Briefly, fragmented DNA of ~ 350 bp (100 ng) was used for library preparation with the NEBNext® Ultra™ II DNA Library Prep Kit (New England Biolabs, UK). During this procedure, end repair, A-tailing, and adapter ligation were done sequentially. Following this, the selected fragments were enriched and purified, and the library DNA was quantified by Qubit.

2.4.2 Sequencing Quality Control and Assembly of Reads

Paired End Illumina libraries were loaded onto the Illumina platform for cluster generation and sequencing. For quality control, low-quality reads (≤ 40 on the phred scale) and adapter-contaminated reads were removed as a part of pre-processing. The QC-passed samples were assembled using optimized SOAPdenovo protocol or MEGAHIT for Soil Water (Brum et al., 1979; Scher et al., 2013). The assembly was done using different K values to get the largest N50. This was then taken for analysis and the Scaftigs were assembled using Soap 2.21 (Mende et al., 2012; Nielsen et al., 2014).
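The selection of the assembly with the largest N50 can be sketched as below. N50 is computed in the usual way, as the length of the shortest scaftig in the smallest set of longest scaftigs that together cover at least half of the total assembly length. The k values and scaftig lengths in the example are hypothetical, because the K values tested are not listed in the text.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/scaftig lengths."""
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings the sum to half the total.
        if running * 2 >= total:
            return length


def best_k(assemblies):
    """Given {k: [scaftig lengths]}, return the k whose assembly has the largest N50."""
    return max(assemblies, key=lambda k: n50(assemblies[k]))


# Hypothetical assemblies produced with two k-mer sizes (illustrative only).
assemblies = {21: [100, 100, 100], 41: [250, 50]}
print(best_k(assemblies), n50(assemblies[best_k(assemblies)]))
```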

2.4.3 Taxonomic and Functional Annotation

Taxonomic classification was done using the Kaiju aligner against the latest NCBI (nr + euk) database, with the following parameters: SEG low-complexity filter, yes; run mode, greedy; minimum match length, 11; minimum match score, 75; allowed mismatches, 5 (Menzel et al., 2016). The Kaiju output file for each sample contains the number of reads assigned per taxon. The functional classification was performed through the SEED subsystem database within MG-RAST. The analysis was done with default settings after the normalization of raw counts. Furthermore, the analysis was carried out by referring to different databases, viz., KEGG, eggNOG, and CAZy. The relative abundance of the As-responsive genes in different soil samples was obtained by annotating the metagenome assembly (with bit scores ≥ 40) against AsgeneDB, a manually curated database of As metabolizing genes (https://github.com/XinweiSong/Asgene) (Song et al., 2022). Gene prediction was carried out with MetaGeneMark based on the scaftigs assembled from single and mixed samples. The predicted genes were then pooled together and dereplicated to construct a gene catalog (Fu et al., 2012; Li & Godzik, 2006). Based on the clean data of each sample, the abundance of each gene in the catalog was obtained for each sample.
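The bit-score filtering of AsgeneDB hits described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the alignment results are available in the standard 12-column BLAST-style tabular format (bit score in the last column), keeps the best hit per query, and uses a hypothetical mapping from database sequence identifiers to gene names. The actual aligner and file layout used in the study are not specified in the text.

```python
from collections import Counter

MIN_BIT_SCORE = 40.0  # threshold stated in the text


def count_gene_hits(tabular_lines, seq_to_gene):
    """Count best hits per As gene from 12-column BLAST-style tabular lines."""
    best = {}  # query id -> (bit score, subject id)
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bit_score = fields[0], fields[1], float(fields[11])
        if bit_score < MIN_BIT_SCORE:
            continue
        # Keep only the highest-scoring hit for each predicted gene.
        if query not in best or bit_score > best[query][0]:
            best[query] = (bit_score, subject)
    counts = Counter()
    for _, subject in best.values():
        gene = seq_to_gene.get(subject)
        if gene is not None:
            counts[gene] += 1
    return counts


# Hypothetical hits and identifier-to-gene mapping (illustrative only).
hits = [
    "\t".join(["g1", "s1", "90", "100", "10", "0", "1", "100", "1", "100", "1e-20", "85.0"]),
    "\t".join(["g1", "s2", "80", "100", "20", "0", "1", "100", "1", "100", "1e-10", "60.0"]),
    "\t".join(["g2", "s2", "50", "60", "30", "0", "1", "60", "1", "60", "1e-2", "35.0"]),
    "\t".join(["g3", "s3", "70", "90", "27", "0", "1", "90", "1", "90", "1e-8", "40.0"]),
]
gene_map = {"s1": "arsC", "s2": "arsM", "s3": "arsC"}
print(dict(count_gene_hits(hits, gene_map)))
```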

3 Results and Discussion

3.1 Soil Arsenic and Micronutrient Quantification

The total As concentration of soil sample SKNNNGH was 4.88 mg kg−1, and it was categorized as low in this study. Similarly, samples SKNNNSIS and SKNNNSK, with As concentrations of 15.89 and 24.84 mg kg−1, were categorized as mid-level, and soil sample SKNNNMJ, with an As concentration of 43.67 mg kg−1, was categorized as high. This categorization was employed to explain the findings of this study. Numerous studies have reported the natural formation of As gradients in contaminated areas; these gradients may form over a long period due to the topography of the area (Liu et al., 2023; Morosini et al., 2023; Valverde et al., 2011; Yang et al., 2024). In most cases, the impact of As on soil microbial communities is studied in isolation, with each study area having its unique edaphic and climatic conditions. Even if these are similar, it is unlikely that the vegetation on the sites in question is also the same. Since these factors are important in shaping the existing microbial communities, their influence on the structure and functions of the microbial communities in As-contaminated areas cannot be ruled out. Under such conditions, studies involving an As gradient at contaminated sites are important to comprehend the effect of different As concentrations in shaping the microbial communities. Thus, in our study, we selected the sampling sites based on the similarity in soil physicochemical and biological properties, climatic conditions, and vegetation cover. Considering the direct impact of As in agricultural soil on human health through its entry into the food chain, such gradient studies on agricultural soils are of prime importance. One study carried out along an arsenic gradient at the Terrubias mine reported a shift of the microbial community towards Firmicutes as the dominant phylum with an increase in soil arsenic concentration (Valverde et al., 2011).
Another study carried out at the gradient of As contamination reported the abundance and increase in expression of As resistant genes in paddy fields with increasing As concentration (Zhang et al., 2021).

Apart from As, other micronutrients were also analyzed, as they may also influence microbial diversity. The total Mn concentrations in the soil samples SKNNNGH, SKNNNSIS, SKNNNSK, and SKNNNMJ were 33.7, 52.20, 56.17, and 60.51 mg kg−1, Zn concentrations were 12.31, 13.14, 11.21, and 14.90 mg kg−1, Se concentrations were 4.31, 5.05, 3.91, and 4.24 mg kg−1, and Fe concentrations were 90.06, 74.81, 89.06, and 60.5 mg kg−1, respectively. The concentrations of these micronutrients at the different sampling sites did not show significant differences, and hence the effect on the microbial community structure and function can be attributed to changing arsenic concentrations.

3.2 Soil Physico-Chemical Properties

The soil samples were sandy clay in texture with bulk density in the range of 1.31–1.36 g cc−1 and water holding capacity in the range of 48.9–51.2%. The soil samples were near neutral to slightly alkaline with a pH ranging from 7.41 to 7.61 and electrical conductivity in the range of 225–231 μS cm−1 when estimated at 28 °C. Soil total organic carbon content in the samples SKNNNGH, SKNNNSIS, SKNNNSK, and SKNNNMJ ranged between 0.23 and 0.27%, with MBC values of 527.2, 551.25, 549.45, and 544.1 μg g−1, respectively. The soil available N, P, and K ranged between 0.95–1.04%, 72.21–75.16 mg kg−1, and 78.92–82.15 mg kg−1, respectively. The dehydrogenase activity in the four soil samples SKNNNGH, SKNNNSIS, SKNNNSK, and SKNNNMJ was found to be 4.14, 3.81, 3.94, and 3.87 μg TPF g soil−1 h−1, respectively. Likewise, the soil FDA activity was in the range of 299.8 to 311.8 μg fluorescein g soil−1 h−1. Soil physicochemical properties of the four samples were not significantly different except for the soil As contents (Table S1). A correlation analysis was performed between total soil arsenic content and other soil physicochemical properties, as well as with soil enzyme activities (data not shown). The total As content of the soil samples was positively correlated with soil pH. This can be attributed to the fact that an increase in pH may decrease the adsorption of As+5 (Huang et al., 2006). In contrast, a significant (p < 0.05) negative correlation was found between total soil arsenic content and soil electrical conductivity. The EC of the soil is an indicator of dissolved ions, and with an increase in ionic strength As adsorption may decrease due to an increase in the net negative charges on the plane of sorption, which may facilitate leaching of As (Kim et al., 2021; Smith et al., 1999). Similarly, a negative correlation was found between the total soil As and microbial biomass carbon (MBC).
It has been shown earlier that chronic As exposure can adversely affect the MBC in contaminated soils (Ghosh et al., 2004). Thus, changes in soil properties were correlated with the soil As concentration, suggesting that they directly or indirectly influence the soil arsenic concentration and its bioavailability to crops.
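A correlation of this kind can be sketched in a few lines of Python. The example below assumes a Pearson correlation coefficient, which is an assumption because the type of coefficient used is not stated in the text. It uses the soil As and dehydrogenase values reported above purely as input data. With only four sites, such a coefficient is indicative at best.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (n >= 3)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Values reported in the text, in the order SKNNNGH, SKNNNSIS, SKNNNSK, SKNNNMJ.
soil_as = [4.88, 15.89, 24.84, 43.67]        # mg kg-1
dehydrogenase = [4.14, 3.81, 3.94, 3.87]     # ug TPF g soil-1 h-1
print(round(pearson_r(soil_as, dehydrogenase), 3))
```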

3.3 Taxonomic Composition of Bacterial Community

The analysis of microbial community composition in soil samples with different As concentrations was done to identify microbes with suitable functional traits to cope with As toxicity, which could be employed as potential bioremediation agents in contaminated agricultural fields. In the present study, the dominant bacterial phyla observed in all the samples, in the order of their abundance, were Actinobacteria, Proteobacteria, Cyanobacteria, Chloroflexi, and Acidobacteria (Fig. 1a). Upon PCoA analysis (data not shown), it was found that the bacterial phylum compositions of the two soil samples with the mid-As concentration were more similar to each other than to those with either high or low As concentration. This indicates that As has some role in shaping the bacterial communities in the contaminated soil. Among the bacterial orders, the most abundant ones were Rhizobiales, Solirubrobacterales, Propionibacteriales, Gaiellales, Streptomycetales, and Pseudonocardiales (Fig. 1b). To get a better picture of the bacterial community composition of the soil samples contaminated with different levels of As, relative abundance at the genus level was studied (Fig. 1c). This may eventually provide an overview of the bacterial genera that can be explored for their As remediating potential. The genera Bradyrhizobium, Tolypothrix, Scytonema, and Anaeromyxobacter were found to be sensitive, with their relative abundance decreasing as the soil As concentration increased. The two genera Scytonema and Anaeromyxobacter were absent in samples with high As concentrations, indicating their sensitivity. Most of the genera, such as Gaiella, Nocardioides, Solirubrobacter, Microvirga, and Nitrospira, were observed at the mid-As concentrations. The diversity of bacterial genera reported in the present study for As-contaminated soil samples was similar to that reported earlier from the contaminated soil of lead (Pb) and zinc (Zn) mines.
The common genera found were Solirubrobacter and Sphingomonas, and the similarity was more pronounced at the phylum level (Hemmat-Jou et al., 2018). Cui et al. (2018) conducted a study to evaluate the effect of high concentrations of heavy metals and metalloids on bacterial diversity using 16S rRNA gene sequencing and found Arthrobacter, Nocardioides, Aeromicrobium, Solirubrobacter, Blastococcus, Microvirga, Gaiella, and Candidatus to be tolerant to heavy metals as well as alkalinity stress. This coincided to an extent with the bacterial genera observed in the present study. The presence of the bacterial genus Gaiella in soil contaminated with heavy metals has previously been reported (Duan et al., 2021; Hu et al., 2021). Gaiella was also found to be positively correlated with chromium (Cr) concentration in soil irrigated with treated wastewater that exhibited Cr bioremediation (Xi et al., 2021). Thus, Gaiella might have a mechanism to tolerate high concentrations of heavy metals, but its specific response to As under culture conditions and the detoxification mechanism involved are yet to be explored. Nocardioides showed a similar trend, with higher relative abundance in soil samples with mid- and high-As concentrations. Nocardioides sp. L-37a has been reported to express arsenate reductase (arsC), with expression increasing fourfold on exposing the strain to As+5 (Bagade et al., 2016). The genus Solirubrobacter of the phylum Actinobacteria was observed to have a higher relative abundance in soil samples with the mid-As concentration. The genus has previously been reported in the rhizospheric soil of the Cd/Zn hyperaccumulating Sedum plant and was also found to be positively correlated with the bioaccumulation factor of Cd and Zn in the Sedum plant (Wu et al., 2022).
It was also reported from the rhizosphere of Echinocactus platyacanthus growing in soil with higher zinc concentrations up to 800 mg kg−1 and was found to be higher in relative abundance, indicating its tolerance to Zn and probably other heavy metals (Sarria Carabalí et al., 2019). Microvirga, an α-proteobacteria was observed to be higher in its relative abundance in the soil samples with the mid-As concentration. One of its strains Microvirga indica S-MI1b sp. nov. was known for its As+3 oxidizing potential, with an ability to oxidize 15 mM As+3 in 39 h. The oxidation of As+3 might be due to the presence of the aioA gene and its product arsenite oxidase, which is induced by the build-up of As concentration in soil. Apart from this, the strain was also capable of resisting other heavy metals such as Pb (1000 mg l−1), Hg (10 mg l−1), Sb+3 (100 mg l−1), Cd (20 mg l−1), Cr+6 (300 mg l−1), Ni (400 mg l−1), and Cu (30 mg l−1) (Tapase & Kodam, 2018). In general, it is seen that various bacterial genera adopt different mechanisms to combat As toxicity and some of these bacteria can be explored and used for bioremediation of As-contaminated agricultural soil. Suitable arsenate-reducing bacteria can also be used along with As hyperaccumulating plants to increase As bioavailability to the As-accumulating plants for successful implementation of phytoremediation in As-contaminated soils.
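The relative abundances discussed in this section can be derived from per-taxon read counts such as those produced by Kaiju, and community profiles can then be compared pairwise before an ordination such as PCoA. The sketch below is a minimal illustration. The Bray-Curtis dissimilarity is an assumption (the distance measure underlying the PCoA is not stated in the text), and the read counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert {taxon: read count} into {taxon: percentage of assigned reads}."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assigned reads")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    taxa = set(profile_a) | set(profile_b)
    numerator = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    denominator = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    return numerator / denominator


# Hypothetical genus-level read counts for two samples (illustrative only).
sample_1 = relative_abundance({"Gaiella": 300, "Nocardioides": 500, "Scytonema": 200})
sample_2 = relative_abundance({"Gaiella": 450, "Nocardioides": 550})
print(round(bray_curtis(sample_1, sample_2), 3))
```

A matrix of such pairwise dissimilarities is the usual input to a PCoA.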

Fig. 1

Bacterial community composition at a gradient of arsenic contamination (a) Relative abundance of bacteria at the phylum level, (b) Relative abundance of bacteria at the order level, (c) Relative abundance of bacteria at genus level in soil samples contaminated with different concentrations of arsenic

3.4 Taxonomic Composition of Fungal Community

Fungi can be exploited for As bioremediation as they are metabolically robust, having good surface area and biomass which may aid in efficient bioremediation. Fungal cell surface has several metal-binding functional groups and their hyphae extend deep into the soil, penetrating complex soil structures. They are predominantly present at sites contaminated with heavy metals including arsenic (Singh et al., 2015). The fungal counts observed at the different levels of soil As concentration in contaminated soil samples, their predicted gene statistics, and the number of As gene hits for each sample are summarised in Table S2. In the present study, maximum fungal counts obtained after Kaiju analysis were 532 and 460 in the soil samples SKNNNSIS and SKNNNSK, respectively in the mid-As concentration. The least number of fungal counts (388) were reported in the sample SKNNNMJ with the high-As concentration, and moderate fungal counts (426) were observed for sample SKNNNGH with the low-As concentration. On the contrary, the maximum number of hits for As metabolizing genes as per the AsgeneDB were observed in the soil sample SKNNNMJ with the highest As concentration. From this, it can be concluded that the high concentration of As in the soil triggers microbial communities to express genes having As remediating potential to survive under the As stress.

The phyla Ascomycota and Basidiomycota dominated the fungal community in the studied samples. Basidiomycota displayed a higher relative abundance in samples with higher As concentrations (Fig. 2a). The relative abundance of Ascomycota was higher in the sample with the mid-As concentration (15.89 mg kg−1) than in samples with low- and high-As concentrations (Fig. 2a). The dominant orders observed in the fungal communities of As-contaminated soil samples included Glomerellales, Hypocreales, Eurotiales, and Pleosporales (Fig. 2b). In this study, the fungal genera Verticillium, Beauveria, Tolypocladium, Talaromyces, Aspergillus, and Pyrenophora were found to be the most abundant. The relative abundance of the fungal genera Beauveria, Talaromyces, Aspergillus, Pyrenophora, and Valsa increased with the As concentration in the soil samples. The relative abundance of the genus Ophiocordyceps declined gradually with the build-up of the As concentration, while the genera Beauveria, Talaromyces, and Aspergillus increased in abundance with increasing soil As concentrations (Fig. 2c). The genus Beauveria was reported to produce metallothioneins (metal-binding proteins) under metal (Cu+2 and Cd+2) stress (Kameo et al., 2000). In one study, Beauveria bassiana was reported to bioaccumulate and remove heavy metals, viz., Ni+2, Cu+2, Zn+2, Cd+2, Cr+6, and Pb+2, with removal percentages of 75, 74.1, 67.8, 63.4, 61.1, and 58.4%, respectively (Gola et al., 2018). Furthermore, Talaromyces sp. was earlier reported to be As hyper-tolerant and to biosorb As from aqueous medium (Nam et al., 2019). In another study, Talaromyces helicus was reported to remove Cu and to exhibit co-tolerance to other heavy metals like Co, Cd, and Pb. This type of co-tolerance development and enhanced removal efficiency is crucial in remediating sites contaminated with multiple heavy metals (Romero et al., 2006).
Aspergillus has been reported earlier by many researchers as a promising genus for metal(loid) bioremediation. Aspergillus niger was studied for its As, Cu, and Cr removal efficiency from waste wood treated with chromated copper arsenate (CCA), which is used as a wood preservative to increase the durability of the wood. It was reported to remove 97% of arsenic, 49% of copper, and 55% of chromium from the CCA-treated wood chips in 10 d (Kartal et al., 2004). In another study, Aspergillus nidulans isolated from arsenic-contaminated soil was found to be resistant to 500 mg kg−1 of As and showed 84.3% arsenic adsorption from the contaminated soil after 11 d of cultivation, indicating its potential role in arsenic remediation of contaminated soil (Maheswari & Murugesan, 2009). In the present study, the relative abundance of Verticillium was higher in the soil sample with 15.89 mg kg−1 of total As and subsequently declined as the As concentration in the soil samples built up further. Although the genus has been previously reported for the removal of Pb, Zn, and Cd, the findings here suggest that the genus is sensitive to higher concentrations of As in soil. The genera Gaiella, Solirubrobacter, Beauveria, and Verticillium are reported here for the first time in As-contaminated soil samples, and understanding their functions in As bioremediation requires further experimentation. This study opens up an avenue to use these fungal strains under field conditions for reducing the bioavailability of As to plants and decreasing its entry into the food chain (Henrique et al., 2019).

Fig. 2

Fungal community composition at a gradient of arsenic contamination. (a) Relative abundance of fungi at the phylum level, (b) Relative abundance of fungi at the order level (c) Relative abundance of fungi at genus level in soil samples contaminated with different concentrations of arsenic

3.5 Functional Flux Against Metal Contamination

In the current metagenomic study, the overall community was assessed for its physiology and the collective functions performed by its constituent microbes as a function of the As concentration gradient in the agricultural soil samples analyzed. Insights into the functional diversity of a community can be obtained by annotating the metagenomic sequences with functions (Cantarel et al., 2009; Feng et al., 2015; Li et al., 2014; Qin et al., 2012). Various analyses were performed on the functional abundance obtained for each sample. The samples SKNNNSIS and SKNNNSK, with the mid-level As concentrations, had the highest numbers of predicted genes, 1,068,529 and 895,459, respectively. The lowest number of total predicted genes (591,738) was observed in the soil sample with the high-As concentration (Table S2).

Based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, the samples overall displayed the maximum number of genes involved in metabolism (Fig. 3). This was followed by environmental information processing genes, among which the genes for membrane transport were the most numerous. The genes for cellular processes, genetic information processing, and organismal systems were found in all the samples to a relatively lesser extent. Genes for processes like metabolism, environmental information processing, cellular processes, human diseases, and organismal systems were observed to increase in the soil samples with high-As concentration as compared to the samples with low-As concentration. The relative abundance of genes according to the KEGG database was analyzed, and a graph of the functional abundance for each sample was drawn (Fig. 3).

Fig. 3

Total Functional abundance based on KEGG

3.5.1 Abundance of Arsenic Metabolizing Genes

Microorganisms inhabiting different environments with As contamination are known to possess several As metabolizing genes. The mechanisms of As detoxification include the oxidation of the more toxic and more mobile As+3 species to the less toxic and less mobile As+5 species, catalyzed by the product of the aioA gene. As+5 may also be used for energy production by microbes, wherein As+5 reduction, catalyzed by the respiratory As+5 reductase (Arr), is coupled to ATP production (Slyemi & Bonnefoy, 2012). In another detoxification mechanism, As+5 is reduced to As+3 by a reductase (ArsC) and effluxed out of the cell by the ArsB and Acr3 permeases (Rosen n.d.). As+3 can be methylated by ArsM, a microbial As(III) S-adenosylmethionine (SAM) methyltransferase, which can convert inorganic As species to their volatile derivatives or into less toxic As+5 organic derivatives (Verma et al., 2016).

Considering the importance of As-metabolizing genes and their role in As detoxification, their diversity and relative abundance in the soil were studied along the gradient of soil As concentration (Fig. 4). Several genes involved in As transport pathways, namely ACR3, arsA, arsB, arsD, arsJ, arsP, GET3, glpF, pgpA, PiT, pstA, pstB, pstC, and pstS, were found in all the As-contaminated soil samples (Fig. 4a). The pstB gene, coding for a high-affinity phosphate uptake system involved in the uptake of As+5, was the most abundant among the As transport genes. Likewise, several genes responsible for As+3 oxidation (aioA, aioB, aioR, aioS, aioX, aoxC, arsH, arxA, arxB, arxR, arxS, and arxX) were found in all the As-contaminated soil samples, with arsH being the most abundant in every sample (Fig. 4b). A few genes of the As+5 respiratory pathway, such as gstB, mgsR, arrA, and arrB, were found in the soil samples studied. Genes involved in the As+5 reduction pathways, such as moeA, arsC, and arsR, were also observed in the As-contaminated samples, among which arsR was the most abundant (Fig. 4c).

Fig. 4

Number of arsenic metabolizing/transport genes, annotated against AsgeneDB, in soil samples contaminated with a gradient of As concentration: (a) arsenic transport; (b) As(III) oxidation; (c) As(V) respiration, As(V) reduction, and As methylation; (d) major genes from (a), (b), and (c) that showed significant variation across the As concentration gradient

The As metabolizing/transport genes that varied across the gradient of soil As concentration, based on AsgeneDB, are shown in Fig. 4d. The As transporter genes arsJ and pgpA were more abundant in the soil sample with higher As concentration. The gene arsJ was reported to provide As+5 resistance in Pseudomonas aeruginosa by efflux of As+5 as a phosphoglycerate derivative (Chen et al., 2016). Therefore, the higher relative abundance of arsJ with increasing As concentration could contribute to As+5 detoxification. Similarly, the higher relative abundance of the pgpA gene may be due to its role in As detoxification by transporting As+3 as a glutathione conjugate (As(GS)3) into the cellular vacuoles (Légaré et al., 2001). The gene arsM, responsible for As methylation, was also more abundant with increasing As concentration along the gradient, suggesting that it provides As resistance to the microbes by converting inorganic As species into their volatile derivatives (Qin et al., 2006). A few genes of the As+3 oxidation pathway, such as aioA, aioB, aioR, and arsH, were also more abundant in the samples with high As concentration, probably to provide As resistance. The aioA and aioB genes code for the large and small subunits, respectively, of arsenite oxidase, which catalyzes the oxidation of As+3 to the relatively less toxic As+5 (Li et al., 2021). The product of the aioR gene is a regulator that interacts with RNA polymerase to initiate the expression of the aio genes (Cai et al., 2013). The relative abundance of the arsH gene, which encodes an organoarsenical oxidase, correlated with the higher abundance of the arsM gene, mainly because ArsH oxidizes the toxic As+3 organoarsenicals to As+5 organoarsenicals and hence may provide resistance to microbes harboring both genes (Chen et al., 2015). The arsC gene, involved in the As+5 reduction pathway, was also higher in relative abundance at high As concentration. This gene is involved in the conversion of As+5 taken up by the cell to As+3, which is then acted upon by other As detoxification pathways such as the As methylation pathway (Cai et al., 2013). Hence, its higher abundance alongside other As detoxification genes is expected. Like arsC, the gstB gene, coding for glutathione S-transferase B, is also involved in the reduction of As+5 to As+3, but unlike arsC, the reduction mediated by gstB is not dependent on glutaredoxins. Instead, the As+5 entering the cell is reduced using the reduced form of glutathione as an electron donor (Chrysostomou et al., 2015). mgsR is a gene expressed under oxidative stress conditions, and its product in turn induces the expression of the MgsR subregulon, which consists of more than 50 genes (Reder et al., 2012). The higher relative abundance of mgsR with the build-up in As concentration might help combat the oxidative stress induced by As.
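One simple way to check whether a gene's relative abundance tracks the As gradient is a rank correlation between the two. The sketch below computes Spearman's rho in plain Python. It is an assumption made for illustration and is not the statistical procedure used in this study, which this section does not specify. The helper names `average_ranks` and `spearman_rho` are hypothetical, and the numbers are invented toy values.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


# Toy values, invented for illustration only (NOT measurements from this study).
toy_as_mg_per_kg = [5.0, 20.0, 45.0, 90.0]         # hypothetical soil As levels
toy_gene_fraction = [0.010, 0.012, 0.015, 0.021]   # hypothetical relative abundance

rho = spearman_rho(toy_as_mg_per_kg, toy_gene_fraction)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is used here because it only asks whether abundance rises or falls with As concentration, without assuming a linear relationship. With only a handful of samples, any such coefficient should be read as descriptive rather than as evidence of significance.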

4 Conclusion

This work presents a comprehensive report on the changes in microbial community dynamics as a function of soil arsenic concentration. Actinobacteria and Ascomycota dominated the bacterial and fungal communities, respectively, across the gradient. We also report the occurrence of Gaiella, Solirubrobacter, Beauveria, and Verticillium in the arsenic-contaminated soil; these genera can be further explored for their potential role in arsenic bioremediation. The study also examined the changes in the abundance of microbial genes that provide As tolerance to the resident microbial community exposed to increasing As stress, although further studies at the transcriptional (RNA) level are required to improve our understanding of how the soil microbial community responds to As contamination. The arsJ, arsM, aioR, arsH, and arsC genes were the ones most influenced by the changing As concentration across the gradient. The study concludes that As contamination of agricultural soil has a major influence on the soil microbial community, reducing its taxonomic diversity and altering its species composition. The same is true for the normal functional pathways operating in the soil, which are altered under As stress, probably to enhance the microbial capability to combat it. The microbes that are abundant in the As-contaminated soil can be further explored for their As bioremediation potential and field applicability.