
1 Introduction

Contemporary methods of environmental risk assessment are based on detecting critical impact levels, interpreted as the point at which ecosystem stability starts to decline, so that the basic structural components of the biocenosis begin to disappear or its functional links begin to break down [1, 2]. Where active experiments on real populations in the environment are not possible, the basic way to estimate hazardous concentrations of technogenic pollution is to use available laboratory-derived toxicity values for surrogate species. Correct extrapolation of toxic effects is complicated by differences in taxonomic structure, specific life-cycle stages, levels of biological organization, the accompanying physical and chemical parameters of the media, the temporal regimen of exposure, and spatial characteristics [3].

Analysis of the Species Sensitivity Distribution (SSD) is one of the statistical methods for extrapolating laboratory data to different natural environments [4,5,6,7,8]. The SSD curve is fitted to a panel of available acute or chronic toxicity values (as a rule, LC50) or other effect measures for different species with respect to a particular chemical, and is interpreted as the cumulative distribution function of some theoretical probability distribution. It was originally developed for the risk assessment of single substances through the setting of thresholds: a hazardous concentration affecting p % of species (HCp, i.e. the p-th percentile of the fitted distribution), or the fraction of species potentially affected by a certain concentration [5]. For example, if the threshold concentration is taken as HC5, it is hazardous (lethal) for the 5% most sensitive species and neutral for the rest, i.e. the null hypothesis of no harmful exposure is accepted at the 95% level. Probabilistic environmental risk assessment can generally be presented as the ratio of the distributions of exposure (observed) and sensitivity (predicted from SSDs) concentrations of pollutants [9, 10].
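The two threshold readings of an SSD described above (HCp as a percentile, and the fraction of species affected at a given concentration) can be illustrated with a minimal sketch. It assumes a log-normal SSD; the LC50 values and the function names `fit_lognormal`, `hc_p` and `fraction_affected` are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist

def fit_lognormal(values):
    """Fit a log-normal SSD: mean and standard deviation of ln(toxicity value)."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / (len(logs) - 1))
    return mu, sigma

def hc_p(mu, sigma, p):
    """Hazardous concentration for p percent of species (p-th percentile of the SSD)."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(p / 100.0))

def fraction_affected(mu, sigma, concentration):
    """Fraction of species whose toxicity value lies at or below the concentration."""
    return NormalDist(mu, sigma).cdf(math.log(concentration))

# Hypothetical LC50 values (mg/L) for eight species; not data from this study.
lc50 = [1.2, 3.5, 4.1, 7.8, 12.0, 15.5, 30.2, 55.0]
mu, sigma = fit_lognormal(lc50)
hc5 = hc_p(mu, sigma, 5)
print(f"HC5 = {hc5:.2f} mg/L")
print(f"Fraction affected at HC5 = {fraction_affected(mu, sigma, hc5):.3f}")
```

The two functions are inverses of each other, which is why the fraction affected at HC5 comes back as 5%.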

Since SSDs were first used, the importance and difficulty of laboratory-to-field extrapolation, and the bias in risk assessment it may cause, have been discussed [3, 11]. The most important differences cover a whole range of phenomena: bioavailability, spatial and temporal variance in field exposures, genetic or phenotypic adaptation, etc. In fact, SSDs make no use of information on community ecology (interspecific interactions, trophic links, habitat factors, or the specific importance of keystone species and functional groups).

Another source of bias, caused by data selection, is that the species used for toxicity testing are not a random sample from the community of species to be protected [12], or may not coincide with it at all [13]. Often only a very narrow range of species has been tested relative to the communities potentially exposed [14]. For example, micromycete and microbial communities in soil are almost inevitably underrepresented in toxicity data when SSDs are intended to include them. Large-scale laboratory determination of toxicometric indicators (NOEC, EC50) for diverse ecotoxicants and multiple types of soil biota is not actually feasible.

A pressing problem, therefore, is the search for approaches that would enable rapid assessment of the environmental condition of soil based solely on field observation data, without special toxicometric experiments.

Factorial ecology considers ways of statistically fitting the distribution function of the abundance of any species along a gradient of the environmental factors under study, such as the content of chemical substances, availability of resources, etc. [15,16,17,18]. The simplest models of such dependences, known as Species Response Curves, estimate three major parameters: the position of the optimum, its amplitude, and the width of the response (see Fig. 1). The optimum defines the preferable value PV of the factor at which the species is most likely to be found, that is, the location of the distribution peak. Tolerance is the ability of a population of a species to live and reproduce in a non-optimal environment. The tolerance interval estimates the range of the factor within which the basic indicators of physiological activity or population abundance can be maintained or restored. Its right boundary value TV actually corresponds to the maximum no-effect concentration (NOEC), and any exposure exceeding this threshold is considered hazardous. However, the basic difference between these estimates of ecological parameters and toxicity values such as LD5 is that they take into account the whole set of conditions of a specific habitat.

Fig. 1.

Abundance distribution curve of Palpomyia sp. along the water salinity gradient in the inflows of Lake Elton. PV: preferable value of salinity; the grey fill marks the tolerance range with right boundary value TV; the toxicological indicators LC5, LC50, LC100 are shown approximately.

In this article we propose an alternative to the SSD that allows for the absence of toxicological values for the species of the studied community and uses only raw field data from a limited number of observation points. We consider the possibility of using ordination procedures and multidimensional smoothing models to estimate the preferable value PV and the tolerance value TV for each species. Additionally, we propose a probabilistic estimate of the risk of a decrease in taxonomic richness that links the modeled species distribution with the variability of environmental exposure conditions. The applicability of the proposed methods is illustrated by a case study of the response of microscopic fungal communities, based on an assessment of the environmental risk of soil contamination from past uraniferous ore production.

2 Materials and Methods

2.1 Data Set for Illustration

Comprehensive studies of soil contamination were performed in the area of the Kadzhi-Sai settlement (Kyrgyzstan), where ore deposits with a low uranium concentration were mined in 1947–1965. Top soil layers were sampled in May 2014 at sites located both on the territory of the uranium mine tailings and in relatively clean areas on the slopes of adjoining hills, on the shore of Lake Issyk Kul, and in Boomsky canyon (42°08' 48'' N, 77°11' 10'' E).

The level of technogenic impact on the area was analyzed in terms of two groups of indicators: the activity of three radionuclides (U-238, Ra-226, Pb-210), measured with a Canberra (USA) spectrometer with an HPGe germanium detector, and the content of 16 heavy metals and other chemical elements in the top soil horizons, measured with a hand-held DELTA Classic (USA) XRF spectrometer. Total soil contamination with heavy metals (Zc) was derived using the modified Pollution Load Index (PLI), based on the geometric mean [19]:

$$ Z_{c} = n \left( K_{1} \cdot K_{2} \cdot \ldots \cdot K_{n} \right)^{1/n} - \left( n - 1 \right), $$
(1)

where n is the number of components, Ki = Ci/Cib, and Cib and Ci are the background and actual content of the i-th element in the soil. To account for the different toxicity of the elements, local PLI indexes were calculated separately for three hazard classes: high hazard Z c(1) (As, Cr, Pb, Zn), moderate hazard Z c(2) (Co, Mo, Cu), and low hazard Z c(3), comprising background and rare earth elements (Ba, Ti, Fe, Mn, Sr, K, Ca, Rb, Zr). The summary PLI index was calculated with correction factors for toxicity:

$$ Z_{c} = 1.5\,Z_{c(1)} + 1.0\,Z_{c(2)} + 0.5\,Z_{c(3)} $$
(2)
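As a worked illustration of Eqs. (1) and (2), the sketch below computes a class index and the weighted summary index. The element concentrations and background values are hypothetical, and the function names `pli_index` and `summary_pli` are introduced here only for illustration.

```python
import math

def pli_index(actual, background):
    """Eq. (1): Z_c = n * (K_1 * ... * K_n)^(1/n) - (n - 1), with K_i = C_i / C_ib."""
    n = len(actual)
    # The geometric mean is computed through logarithms for numerical stability.
    log_sum = sum(math.log(c / cb) for c, cb in zip(actual, background))
    return n * math.exp(log_sum / n) - (n - 1)

def summary_pli(z1, z2, z3):
    """Eq. (2): weighted sum of the indexes for hazard classes 1, 2 and 3."""
    return 1.5 * z1 + 1.0 * z2 + 0.5 * z3

# Hypothetical contents (mg/kg) of the moderate-hazard elements Co, Mo, Cu.
actual_class2 = [24.0, 3.0, 60.0]
background_class2 = [12.0, 1.5, 30.0]
z2 = pli_index(actual_class2, background_class2)
print(f"Z_c(2) = {z2:.2f}")
```

With every element at its background level, all Ki equal 1 and Eq. (1) returns 1, so values above 1 indicate contamination in excess of background.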

Soil fungi are among the most extensive and diverse groups of organisms used for biodiagnostics of the environmental condition of biotopes, for setting environmental standards, and for environmental risk assessment [20]. The ecosystem’s response was evaluated from bioindication studies of micromycete communities in soil sampled at 4 sites with disturbed habitats and 3 sites located in relatively clean zones (control). Cultivated microfungi were isolated by the standard procedure of plating aqueous soil suspensions at 1:100 dilution onto Czapek agar medium in three replicates. The frequency of occurrence (%) of each species was expressed as the share of soil subsamples from which that species was isolated. In total, 41 microfungal species were detected.

2.2 Statistical Analysis

Statistical processing to assess critical exposure levels and environmental risk with a preset certainty was conducted in two stages: (1) calculation, using ordination methods, of the contamination factor values corresponding to the maximal abundance of each fungal species, and (2) approximation of these data by a theoretical distribution curve for the probability of species occurrence.

Multidimensional ordination of communities consists in the optimal projection of the studied habitats onto a plane with latent axes S1 and S2 [16, 21]. A matrix of frequencies of occurrence of 41 microfungal species at 7 sampling sites was used as input data. The matrix D of distances between each pair of soil samples in the multidimensional species space was calculated by the Bray-Curtis formula [22].
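The distance matrix D can be sketched as follows, using the usual form of the Bray-Curtis dissimilarity (sum of absolute differences divided by the sum of all values in the two samples). The frequency table is hypothetical, and the helper names are illustrative only.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as species frequency lists."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Two empty samples are treated as identical (an assumption of this sketch).
    return numerator / denominator if denominator else 0.0

def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities between samples."""
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]

# Hypothetical frequencies of occurrence (%) of four species at three sites.
sites = [
    [33.0, 0.0, 67.0, 0.0],
    [0.0, 100.0, 33.0, 0.0],
    [33.0, 0.0, 67.0, 33.0],
]
D = distance_matrix(sites)
for row in D:
    print(" ".join(f"{d:.3f}" for d in row))
```

The dissimilarity is 0 for identical samples and 1 for samples with no species in common.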

The ordination of microfungal communities was built by the algorithm of nonmetric multidimensional scaling (NMDS). The algorithm searches for a minimum of the “stress” Δ, which reflects the degree of distortion of the mutual distances between sites when the initial multidimensional space is reduced to a 2-dimensional plot [23, 24]. The major advantage of the NMDS method is that, in contrast to approaches such as principal component analysis, it requires no a priori assumptions about the statistical distribution of the input data [21]. Further, the weighted average coordinates s 1 and s 2 on the NMDS projection were estimated for individual microfungal species, which identified their position relative to the sampling sites, and the ordination plot of the species was built.
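The weighted averaging step can be sketched as below. The sketch assumes that the weights are the frequencies of occurrence of a species at each site and that every species occurs at least once; the site coordinates and frequencies are hypothetical, and `species_scores` is an illustrative name.

```python
def species_scores(site_coords, frequencies):
    """Weighted-average position of each species on the ordination plane.

    site_coords: list of (s1, s2) pairs, one per site.
    frequencies: frequencies[i][j] is the frequency of species j at site i.
    """
    scores = []
    for j in range(len(frequencies[0])):
        weights = [row[j] for row in frequencies]
        total = sum(weights)  # assumed non-zero: each species occurs somewhere
        s1 = sum(w * c[0] for w, c in zip(weights, site_coords)) / total
        s2 = sum(w * c[1] for w, c in zip(weights, site_coords)) / total
        scores.append((s1, s2))
    return scores

# Hypothetical NMDS coordinates of three sites and frequencies (%) of two species.
sites = [(-1.0, 0.2), (0.0, -0.5), (1.5, 0.4)]
freq = [
    [100.0, 0.0],
    [0.0, 50.0],
    [0.0, 50.0],
]
scores = species_scores(sites, freq)
print(scores)
```

A species found at a single site receives the coordinates of that site, while a species shared between sites is placed between them in proportion to its frequencies.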

Environmental factors were used to interpret the ecological gradients in species composition: additional axes for these factors were added to the axes of the unconstrained ordination. The position of these vectors on the ordination diagram was defined by a multiple regression model in which each environmental factor was used as the response and the site coordinates s 1 and s 2 as explanatory variables. The significance of the models was tested by a permutation procedure.
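The regression and permutation test described above can be sketched in a simplified form: an ordinary least-squares fit of a factor on the two site coordinates, with significance judged by reshuffling the factor values among sites. This is an illustration under stated assumptions and not the routine used in the study; the coordinates, cobalt values, number of permutations and function names are all hypothetical.

```python
import random

def r_squared(y, s1, s2):
    """R^2 of the least-squares fit y ~ a + b1*s1 + b2*s2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(s1) / n, sum(s2) / n
    yc = [v - my for v in y]
    x1 = [v - m1 for v in s1]
    x2 = [v - m2 for v in s2]
    # Normal equations of the centred two-predictor model.
    a11 = sum(u * u for u in x1)
    a22 = sum(u * u for u in x2)
    a12 = sum(u * v for u, v in zip(x1, x2))
    b1 = sum(u * v for u, v in zip(x1, yc))
    b2 = sum(u * v for u, v in zip(x2, yc))
    det = a11 * a22 - a12 * a12
    beta1 = (b1 * a22 - b2 * a12) / det
    beta2 = (b2 * a11 - b1 * a12) / det
    sst = sum(v * v for v in yc)
    ssr = sum((v - beta1 * u - beta2 * w) ** 2 for v, u, w in zip(yc, x1, x2))
    return 1.0 - ssr / sst

def permutation_test(y, s1, s2, n_perm=999, seed=1):
    """Observed R^2 and the share of permutations with an R^2 at least as large."""
    rng = random.Random(seed)
    observed = r_squared(y, s1, s2)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if r_squared(shuffled, s1, s2) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical NMDS coordinates of seven sites and cobalt content (mg/kg).
s1 = [-1.2, -0.9, -0.3, 0.1, 0.4, 0.8, 1.1]
s2 = [0.3, -0.4, 0.5, -0.2, 0.6, -0.5, -0.3]
cobalt = [22.0, 19.5, 15.0, 12.5, 11.0, 8.0, 6.5]
r2, p_value = permutation_test(cobalt, s1, s2)
print(f"R^2 = {r2:.2f}, permutation p = {p_value:.3f}")
```

Permuting the factor values breaks any link with the ordination while keeping their distribution, so no distributional assumption about the residuals is needed.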

For each of the analyzed soil contamination factors Y, a generalized additive model (GAM) was built, and the fitted 3D smoothing surface was added to the same ordination plot. The models had the form:

$$ Y = \alpha + f_{1} \left( s_{1} \right) + f_{2} \left( s_{2} \right) + f_{3} \left( s_{1}, s_{2} \right) + \varepsilon, $$
(3)

where f 1, f 2, f 3 are specially selected functions of the NMDS coordinates s 1 and s 2 in the form of smoothing polynomials or penalised splines with k degrees of freedom [25]. The predicted values of the ecological maxima \( {\text{PV}}_{j} = \hat{Y}_{j} \), corresponding to the coordinates of the most probable position of each j-th fungal species, j = 1, 2, …, 41, on the NMDS projection, were found from the fitted models. The approximate tolerance threshold for the j-th species was taken as a value of Y so high that it is improbable under the smoothing GAM model, i.e. the upper boundary of the confidence interval \( {\text{TV}}_{j} = \hat{Y}_{j} + t_{\alpha /2} S_{{\hat{Y}_{j} }} \), where t α/2 is the quantile of Student’s t-distribution at the 95% confidence level (α = 0.05) and \( S_{{\hat{Y}_{j} }} \) is the standard prediction error of the regression.
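The tolerance threshold formula can be illustrated numerically. In the sketch below the predicted values and standard errors are hypothetical, and the Student quantile 2.571 (two-sided 95%, 5 degrees of freedom) is an illustrative choice, since the residual degrees of freedom of the fitted models are not stated here.

```python
def tolerance_threshold(pv, se, t_quantile):
    """TV_j = PV_j + t * S_j: upper confidence bound of the predicted optimum."""
    return pv + t_quantile * se

# Hypothetical GAM predictions: (PV in mg/kg, standard prediction error).
predictions = {
    "species A": (12.0, 1.5),
    "species B": (18.5, 2.0),
    "species C": (9.0, 0.8),
}
T_975_DF5 = 2.571  # illustrative quantile; depends on the model's degrees of freedom

thresholds = {
    name: tolerance_threshold(pv, se, T_975_DF5)
    for name, (pv, se) in predictions.items()
}
for name, tv in thresholds.items():
    print(f"{name}: PV = {predictions[name][0]:.1f}, TV = {tv:.2f}")
```

A species with a poorly determined optimum (large standard error) therefore receives a wider tolerance interval than one whose optimum is estimated precisely.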

Next, the resulting empirical distributions of the preferable values PV j and tolerance threshold values TV j along the Y-axis of the contamination indicator were approximated by a theoretical distribution of a continuous random variable. The best distribution was chosen from a set of candidates (normal, log-normal, Weibull, etc.) and its parameters were estimated by maximizing the log-likelihood function. Confidence intervals of the selected cumulative distribution functions F(PV(t)) and F(TV(t)) were estimated by a parametric bootstrap method [4, 26].
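Selection of a distribution by maximum log-likelihood can be sketched for two of the candidates, normal and log-normal, which have closed-form maximum-likelihood estimates. The Weibull candidate needs numerical optimization and is left out of this illustration; the data values and function names are hypothetical.

```python
import math

def normal_loglik(xs):
    """Maximized log-likelihood of a normal distribution fitted to xs."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # maximum-likelihood variance
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def lognormal_loglik(xs):
    """Maximized log-likelihood of a log-normal distribution fitted to xs."""
    logs = [math.log(x) for x in xs]
    # The change of variables x -> ln(x) adds the Jacobian term -sum(ln x).
    return normal_loglik(logs) - sum(logs)

def best_distribution(xs):
    """Name of the candidate with the highest log-likelihood, plus all scores."""
    scores = {"normal": normal_loglik(xs), "log-normal": lognormal_loglik(xs)}
    return max(scores, key=scores.get), scores

# Hypothetical tolerance thresholds TV (mg/kg) for eight species.
tv_values = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 40.0, 100.0]
name, scores = best_distribution(tv_values)
print(name, {k: round(v, 2) for k, v in scores.items()})
```

Both candidates here have two parameters, so comparing raw log-likelihoods is fair; candidates with different numbers of parameters would normally be compared with a penalized criterion such as AIC.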

All calculations were performed using the vegan package for the statistical environment R v. 3.02 [27].

3 Results

3.1 Multivariate Analysis of Data

The observed variability of the species structure of micromycete communities is associated mainly with the gradient of environmental conditions in the area under study. The ordination graph in Fig. 2a shows a rather clear differentiation of the sampling sites: soil samples from the uranium mine tailings (2 and 3) and from the natural reserve in Boomsky canyon (14) occupy extreme positions on the main axis S1 of the nonmetric projection. The variability of the microbiota structure in the other habitats, with an intermediate contamination level, is determined by the second ordination axis S2.

Fig. 2.

Ordination by nonmetric multidimensional scaling: (a) sampling sites (2, 3, 5: a dump; 8: a residential area of the Kadzhi-Sai settlement; 12, 13: the Issyk Kul lake shoreline; 14: Boomsky canyon); (b) microfungal species (for some codes see Table 1). The arrows denote additional axes of physical gradients: index Zc, U-238 and Ra-226 radionuclide activity in soil, and the Co, Cr and Zn content. Grey isoclines show the cobalt content (a) and Saet’s index Zc (b) calculated using the additive model.

Once the coefficients of correlation between the soil contamination indicators and the coordinates s 1 and s 2 on the ordination axes have been calculated, it is possible to plot additional axes of physical gradients reflecting the nature and strength of each factor’s impact. The arrows of factor loads shown in Fig. 2a are close to each other in both direction and length, so the soil contamination factors in the studied region are likely to form an interconnected, multicollinear complex. The best correlation (R 2 = 0.83, p = 0.022) was found between the variation in the structure of fungal communities (by frequency of occurrence of the detected species) and the concentration of cobalt (Co, mg/kg) in soil; see the surface smoothed by the GAM model in Fig. 2a.

The ordination of microfungal species (Fig. 2b) is closely connected with the ordination of habitats. If a species is encountered in only one sample, its position on the graph coincides with the point of that sampling site. Otherwise the species position is determined by the weighted average coordinates of the several habitats where it occurs. We regard this as the point of the “ecological optimum”, where the species is most likely to occur.

If a 3D smoothing surface (3) is built for any of the analyzed soil contamination factors, it is easy to calculate the preferable values PV j and the tolerance thresholds TV j at the optimum point of each j-th species, which can then be used to model the probabilistic distribution of sensitivity. The values calculated for some of the species in Fig. 2b are given in Table 1.

Table 1. Coordinates s 1 and s 2 on the ordination plot (Fig. 2b), preferable value PV of the cobalt concentration (mg/kg soil), standard error of the model, and right boundary TV of the tolerance interval for some micromycete species in soils of the former uranium-producing province
Table 2. Critical values of soil contamination indicators for various environmental risk levels (p %), calculated from the SMD and SSD curves (Fig. 3)

3.2 Statistical Distribution of Species Occurrence

Further calculations were performed for the soil contamination indicators presented in Fig. 2. Using the preferable values PV j for the 41 microfungal taxa, the parameters of the Species ecological Maxima Distribution (SMD) were estimated on the scale of each analyzed factor. Similarly, the Species Sensitivity Distribution (SSD) was fitted using the tolerance thresholds TV j. In all cases, the highest-likelihood approximation was the log-normal distribution.

Example SMD and SSD curves are shown in Fig. 3, where one can see how the response of microfungal communities varies under the impact of different factors. The occurrence distribution over the radionuclide activity and cobalt content scales is rather regular, whereas the sensitivity to other heavy metals and to the PLI index Z c (2) is more contrasting. The general pattern is a considerable reduction in the species richness and diversity of soil microfungal communities under the impact of heavy metals. However, in Fig. 3 it is easy to single out groups of species with elevated resistance to some forms of pollution unusual under normal conditions.

Fig. 3.

Log-normal curves of the probability distribution of microfungal species maxima occurrence (SMD) and of the species sensitivity distribution at hazardous concentrations (SSD) on the scale of soil contamination indicators: 3A: U-238 radioactivity, Bq/kg; 3B: chromium content, mg/kg. CI: lower and upper curves enveloping the 95% confidence interval of the SMD

If we take arbitrary critical probabilities, e.g. p = {5, 10, 16, 20 and 50%}, the cumulative distribution curves SMD and SSD can be used to estimate a set of isoeffective values of, respectively, the preferable concentrations PC p and the concentrations HC p hazardous for microfungal communities for the exposure factor (for examples see Fig. 3).
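Reading isoeffective values off a fitted log-normal curve, together with a parametric bootstrap interval of the kind mentioned in Sect. 2.2, can be sketched as follows. The TV values, the number of bootstrap replicates and the function names are hypothetical; the sketch is not the computation behind Table 2.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def isoeffective_values(values, probs):
    """Percentiles of a log-normal curve fitted to the values, for each p in probs (%)."""
    logs = [math.log(v) for v in values]
    dist = NormalDist(mean(logs), stdev(logs))
    return {p: math.exp(dist.inv_cdf(p / 100.0)) for p in probs}

def bootstrap_interval(values, p, n_boot=2000, level=0.95, seed=1):
    """Parametric bootstrap percentile interval for the p-th percentile."""
    rng = random.Random(seed)
    logs = [math.log(v) for v in values]
    mu, sigma = mean(logs), stdev(logs)
    estimates = []
    for _ in range(n_boot):
        # Simulate a sample of the same size from the fitted curve, then refit it.
        sample = [rng.gauss(mu, sigma) for _ in logs]
        refit = NormalDist(mean(sample), stdev(sample))
        estimates.append(math.exp(refit.inv_cdf(p / 100.0)))
    estimates.sort()
    lower = estimates[int((1.0 - level) / 2.0 * n_boot)]
    upper = estimates[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

# Hypothetical tolerance thresholds TV (mg/kg) for ten species.
tv = [4.2, 5.1, 6.8, 7.5, 9.0, 11.3, 12.7, 15.4, 19.8, 26.0]
hc = isoeffective_values(tv, [5, 10, 16, 20, 50])
low, high = bootstrap_interval(tv, 5)
for p, value in hc.items():
    print(f"HC{p} = {value:.2f} mg/kg")
print(f"95% bootstrap interval for HC5: {low:.2f} to {high:.2f} mg/kg")
```

Applying the same functions to the PV values instead of the TV values would give the preferable concentrations PC p from the SMD curve.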

4 Discussion

Models of species sensitivity distributions (SSDs) were developed to derive criteria for the protection of biological entities in contaminated media. Assessment endpoints vary depending on the protection goals and correspond to a certain level of conservatism. Hence, it is necessary to define how the meaning of the thresholds HC p derived from SSDs depends on the input data. Acute LC50 values are based on mortality or equivalent effects (i.e., immobilization) in half of the exposed organisms. At the ecosystem level, this means that at the hazardous concentration HC p approximately 50% of organisms will be lost in p % of populations. The SMD curve determines the point at which habitat conditions begin to deviate from the optimum for p % of species, and thus sets more stringent limits on the estimated critical concentrations. Assessment endpoints based on tolerance thresholds TV or no-effect values NOEC occupy an intermediate position (see Table 2) and can be interpreted as a “mild ecological hazard” for p % of populations, since the probability of their resistance remains high.

To bring assessment endpoints into accord with the protection goals in a specific situation, values of the uncertainty factors (UF) and protection levels p are selected [3,4,5]. Opinions differ on what proportion p of the community or taxon should be taken as the trigger value considered critically hazardous for an ecosystem [28]. Another uncertainty is the ambiguity in determining the share p of the maximum effect of impact. This is usually done taking into account both the statistical “elasticity” of the rated indicators and the degree of the researcher’s responsibility for the conclusion (i.e. it is usually the result of a political compromise rather than a scientific question). Taking into account the meaning of the tolerance thresholds TV and the ecological characteristics of the tested community and media (micromycetes in soil), we believe it is reasonable to use p values of 15%, which is within the precision range of ecotoxicological methods [29].

The method we propose for predicting NOECs and the distribution of species sensitivity from field data and spatial models is not a competitor to the classical SSD. The use of available toxicity values is necessary and desirable, since comparative analysis of the results of modelling by various methods reduces the uncertainty of assessment endpoints. We also note that a decrease in species richness is far from the only indicator for setting norms and standards in environmental regulations. Others include a reduction in functional diversity or productivity, a shift in the complex of dominant species, etc. For decision-making it is important to have all the available information on the response of the ecosystem along a gradient of influencing factors.

There are two reasons for regarding the cutoff values in Table 2 as tentative. First, the ecological meaning of the NOEC in our case is identical to finding the tolerance limits TV of existence of each species. Here, for an approximate estimate of the right tolerance boundary relative to the optimum, we suggest using the confidence statistics of the smoothing model. Is a more accurate estimate of the tolerance ranges of species occurrence possible?

Theoretically, the TV estimation method has been proposed only for a normal or log-normal distribution of the response (species abundance or occurrence) on the exposure concentration scale (Oksanen et al., 2001). In the more general case, the ecological optimum and tolerance ranges of occurrence can be found using generalized regression models for each species [30, 31]. However, this would require several tens of measurements under identical ecosystem conditions. In any case, a wide range of variation in the pollutant concentration is necessary.

Second, no limiting pollutant was singled out, the combined impact of a mixture of toxicants was not considered, and the influence of accompanying environmental parameters, such as soil characteristics, was not analyzed. Problems connected with predicting the potentially affected fraction of species, and hence with the risk assessment of chemical mixtures, can be at least partially solved by various approaches [7]. We therefore note only that the concentration values presented in Table 2, which are low in comparison with environmental quality standards (for example, 100 mg/kg for zinc), are explained by the additive effect of the components of the mixture.

Note in particular that the arrangement of the site points on the ordination plot in Fig. 2 is formal and depends only on the distances between them in the multidimensional species space. In turn, the configuration of the species points is defined by the sum of their statistical distributions under the influence of multiple stressors, which combine the influence of the analyzed pollutants with combinations of other environmental factors not considered here, including soil properties [29, 32]. Hence, the preferable values PV j for each individual stressor are predicted by the method described above against the background of, and taking into account, the influence of all the others. Thus, rather than being eliminated or minimized, sources of extraneous variance are explicitly acknowledged as part of our SSD methodology.

Finally, one more problem of exposure rating is associated with the high spatial heterogeneity of technogenic soil contamination, which undoubtedly affects the number, abundance and distribution of fungal species [33, 34]. The redistribution of concentrations by precipitation within the microrelief, the relative height of sites, the patchy nature of pollution, the variability of the “assimilation capacity”, the biological activity of soil, etc. strongly influence bioindication results as a whole. Spatial interpolation of pointwise observations in geographical coordinates, intended to compensate for random fluctuations, can be carried out by means of kriging models. We propose instead to model the smoothing surface after projecting the initial data onto a plane with latent axes directly related to the species structure of the community under study, that is, to set aside the natural spatial coordinates. In this case, the nonmetric multidimensional scaling method enables the modelling of even and stable smoothing surfaces.

5 Conclusion

Modelling of the Species Sensitivity Distribution makes it possible to establish critical (threshold) values of toxicant concentrations in soils using only field data, without special toxicometric experiments. In this work, threshold values of six soil pollutants were determined from an analysis of the structure of micromycete communities in soil samples from the former uranium mining province (Kyrgyzstan).