Introduction

Freshwater habitats are degrading more rapidly than any other ecosystem type, and the adverse effect of human-induced stress is considerable (Dudgeon et al. 2006; Lacoul and Freedman 2006). As a result, freshwater biodiversity is declining at far greater rates than biodiversity in terrestrial ecosystems (Ricciardi and Rasmussen 1999). Such changes in biodiversity have a strong potential to alter ecosystem properties, as well as the goods and services that ecosystems provide to human society (Hooper et al. 2005).

Considering the diversity of aquatic macrophytes, maintenance of species-rich plant communities may enhance the functioning and associated services of wetland ecosystems. For example, Engelhardt and Ritchie (2001) showed that higher macrophyte richness in wetlands may yield up to 25 % more algal biomass, thereby supporting a greater abundance of wildlife, and may retain up to 30 % more potentially polluting nutrients such as phosphorus. Moreover, the susceptibility of ecosystems to invasion by exotic species generally decreases with increasing species richness (Hooper et al. 2005). In general, the consequences of biodiversity loss might be complex and difficult to predict (Downing and Leibold 2002). Therefore, there is an urgent need for accurate information on diversity patterns and drivers of species richness that could be used in the management of freshwater ecosystems.

On a global scale, macrophyte species richness is highest in the Neotropical region and intermediate in the Oriental, Nearctic and Afrotropical regions. It is lower in the Palearctic, Australasia and the Pacific Islands, and lowest in the Antarctic region (Chambers et al. 2008). In Europe, macrophyte species richness peaks between the latitudes of 40° and 50°. Central-European countries, including Slovakia, typically have intermediate species richness (Chappuis et al. 2012).

At a local scale, the diversity of aquatic macrophytes is influenced by a plethora of variables that are scale-dependent and interacting, but they can be generally divided into three main groups: biogeographic, climatic and geomorphological (Lacoul and Freedman 2006). Primarily, water movements, light availability, substrate structure and nutrient contents in the water are the most important factors (Bornette and Puijalon 2011).

Here we present the results of an analysis of relationships between macrophyte species richness and environmental characteristics using an extensive set of data collected from 160 sites in two central-European bioregions. We modelled macrophyte richness using recursive partitioning methods with the following aims: (i) to assess the importance of various environmental predictors and (ii) to estimate the environmental thresholds of species richness in four types of aquatic habitats (rivers, streams, ditches and ponds). We believe that the results of this study are directly applicable to freshwater biodiversity management and conservation planning.

Material and methods

Study area

The study was conducted in two central-European bioregions (Futák 1966): the Pannonian and the Western Carpathian (hereafter referred to as the Carpathian bioregion). The regions cover a substantial part of the lowland and mountainous areas of central and south-eastern Europe. Two model areas were selected for the study of macrophyte diversity: the Borská nížina lowland (BNL) in the Pannonian bioregion and the Turčianska kotlina basin (TKB) in the Western Carpathian bioregion (Fig. 1).

Fig. 1 Map of the study area

Both areas are similarly dominated by agricultural landscape (more than 50 %) and forests (less than 30 %). They differ mainly in altitude (planar and colline belts prevail in BNL and TKB, respectively) and the associated climate. TKB belongs to the moderately warm climatic region and the humid to very humid subregion, with mean July temperature and mean annual precipitation totals of 14–16°C and 800–900 mm, respectively. By contrast, BNL represents the warm, moderately dry to semi-humid subregion, with mean July temperature and mean annual precipitation totals of 18–20°C and 500–600 mm, respectively (Miklós 2002).

Field sampling and laboratory analyses

Four water body types (rivers, streams, ditches and ponds) were studied within both bioregions. We followed the classification criteria described in Williams et al. (2003). Rivers and streams were defined as lotic water bodies created mainly by natural processes, with a width of 8 m as the cut-off between the two classes of watercourses. Ponds were defined as lentic water bodies and included both man-made and natural types, either permanent or seasonal (e.g. reservoirs, gravel and sand pits, fishponds, river oxbows). Finally, ditches were defined as man-made canals created mainly for agricultural and industrial purposes, with the following characteristics: (a) a linear planform; (b) linear field boundaries, often turning at right angles; and (c) little relationship with natural landscape contours (Williams et al. 2003).

Overall, 160 sampling sites were selected randomly from all potential sites. Sampling sites were stratified by bioregion (80 per bioregion) and water body type (20 sites per water body type in each bioregion). An area of 100 m² was evaluated within each sampling site, and all macrophytes were recorded on foot or from a boat (cf. Janauer and Dokulil 2006). The sampling area in lotic habitats was determined as the length × width of the watercourse. In ponds, the area was approximately triangular with the apex at the middle of the water body and the base following the water body margin. In order to suppress the effect of vegetation seasonality, water bodies were sampled in two periods (June and August/September) of the 2011 vegetation season. Species richness (number of observed species) was used as a response variable in subsequent analyses.
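The stratified random selection described above can be sketched as follows. This is only an illustration in Python (the analyses in this study were performed in R); the candidate pools, the site identifiers and the function name select_sites are hypothetical and are not taken from the study.

```python
import random

def select_sites(candidates, per_stratum=20, seed=0):
    """Draw a simple random sample of sites within each stratum.

    candidates maps (bioregion, water_body_type) -> list of candidate site ids.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    selected = {}
    for stratum, sites in candidates.items():
        if len(sites) < per_stratum:
            raise ValueError(f"stratum {stratum} has too few candidate sites")
        # sampling without replacement: no site can be drawn twice
        selected[stratum] = rng.sample(sites, per_stratum)
    return selected

# Hypothetical pools of 50 potential sites per stratum (assumption).
bioregions = ["Pannonian", "Carpathian"]
types = ["river", "stream", "ditch", "pond"]
candidates = {
    (b, t): [f"{b[:3]}-{t}-{i:02d}" for i in range(50)]
    for b in bioregions
    for t in types
}

design = select_sites(candidates)
total = sum(len(sites) for sites in design.values())  # 2 x 4 x 20 sites
```

Eight strata with 20 sites each reproduce the stated total of 160 sites, 80 per bioregion.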

Simultaneously with macrophyte sampling, 28 geographical, hydrological and chemical characteristics were evaluated and/or measured in the field, or subsequently calculated or measured in the laboratory (Table 1). Composite samples of water and soil, consisting of three spatially stratified subsamples, were collected at each sampling site. Water samples were quickly frozen and maintained at −18°C until ammonia and phosphate contents were measured (Hrivnák et al. 2010). Soil samples were dried at laboratory temperature and stored in a dry place until analysis (Hrivnák et al. 2013). The environmental variables obtained were only weakly correlated with each other (R < 0.6) and, thus, no variable selection procedure was necessary prior to data analyses.
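A pairwise correlation screen of this kind can be sketched as follows. The sketch assumes Pearson correlation and compares its absolute value with the 0.6 cut-off; the text does not state the coefficient used, and the variable names and values below are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strongly_correlated_pairs(data, threshold=0.6):
    """Return (var_a, var_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical measurements at six sites (not data from the study).
data = {
    "depth_m": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "width_m": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],  # exactly 2 x depth
    "water_ph": [8.3, 8.1, 8.2, 8.3, 8.1, 8.2],
}
flagged = strongly_correlated_pairs(data)
```

An empty result would correspond to the situation reported here, in which no variable had to be dropped; in the toy data only the depth and width pair is flagged.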

Table 1 Summary characteristics of environmental variables obtained in the field and laboratory

Data analyses

Recursive partitioning procedures were used to assess relationships between environmental variables and macrophyte diversity. Individual regression trees were built for each habitat type in order to identify the environmental characteristics most strongly associated with species richness and to estimate the environmental thresholds for macrophyte diversity. Regression trees are robust non-parametric methods ideally suited for complex ecological data where non-linear relationships and high-order interactions are the rule rather than the exception (De'ath and Fabricius 2000). The regression tree models are fitted by successively splitting the data into more homogeneous groups, using combinations of categorical and/or numeric explanatory variables, thus resulting in hierarchical trees of decision rules based on the threshold values of explanatory variables that define each group. In this study, a conditional inference approach based on multiple statistical tests was implemented to prevent overfitting and a selection bias towards variables with many possible splits (Hothorn et al. 2006). We used a 5 % significance level for variable selection and as the stopping criterion for growing the trees. To evaluate the predictive performance of the trees, cross-validated (leave-one-out) root mean square error (RMSELOO) was calculated for each model.
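The leave-one-out error can be illustrated with a minimal sketch. Note the substitution: the tree below is a one-split, least-squares (CART-style) stump, not the conditional inference tree used in this study, and the pH and richness values are hypothetical. Only the cross-validation logic is meant to carry over.

```python
import math

def fit_stump(x, y):
    """Fit a one-split regression tree by least squares.

    Returns (threshold, mean_left, mean_right); threshold is None when
    all x values are identical and no split is possible.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, ml, mr)
    if best is None:
        mean = sum(y) / len(y)
        return (None, mean, mean)
    return best[1:]

def predict(model, xi):
    threshold, mean_left, mean_right = model
    if threshold is None or xi <= threshold:
        return mean_left
    return mean_right

def rmse_loo(x, y):
    """Refit the model n times, each time predicting the one held-out site."""
    squared_errors = []
    for i in range(len(x)):
        model = fit_stump(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - predict(model, x[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical river sites: water pH and species richness (not study data).
ph = [7.6, 7.8, 8.0, 8.2, 8.3, 8.5, 8.6, 8.8]
richness = [5, 6, 4, 5, 6, 1, 0, 1]
model = fit_stump(ph, richness)
error = rmse_loo(ph, richness)  # expressed in species, like RMSELOO
```

Because each prediction is made for a site that was excluded from fitting, the resulting error is expressed in the units of the response (number of species) and reflects out-of-sample performance rather than goodness of fit.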

Individual regression trees, however, can be unstable, and even a small change in the data can produce highly divergent trees (Prasad et al. 2006). Moreover, the approach of adding one variable at each split, employed in individual regression trees, can lead to a locally optimal model that does not necessarily correspond to the globally best model over all possible combinations of variables (Strobl et al. 2009). Therefore, we used a random forest algorithm (Breiman 2001) to identify a small number of globally relevant predictors from the full set of environmental variables. Random forests combine multiple individual regression trees derived from resampled data, while every single tree is grown with a randomized subset of predictors for each split. An unbiased algorithm with subsamples drawn without replacement (Strobl et al. 2007) was employed in constructing the forests with 10,000 trees. In regression random forests, the number of input variables randomly sampled as predictors at each split (mtry) is suggested to be one third of the number of variables (e.g. Liaw and Wiener 2002). However, Hastie et al. (2009) showed that the optimal mtry depends on the data at hand and that it should be treated as a tuning parameter rather than a fixed value. Therefore, we systematically searched for the optimal mtry values that produced random forests with the best prediction accuracy according to the cross-validated (out-of-bag) root mean square error (RMSEOOB). Final random forests were grown with these tuned mtry parameters. The importance of environmental variables for the prediction of macrophyte species richness was calculated using a permutation procedure. For selection of significant predictors, we adopted the conservative approach suggested by Strobl et al. (2009), i.e. we excluded all variables whose permutation importance scores did not exceed the amplitude of the largest negative scores. In other words, all variables whose permutation importance was negative, zero, or positive but within the range spanned by the negative values were considered irrelevant, and such predictors were consequently excluded from further interpretation.
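The selection rule can be written down in a few lines. In this minimal sketch the importance scores and variable names are hypothetical, and the function name relevant_predictors is introduced only for illustration; only the rule itself follows the description above.

```python
def relevant_predictors(importance):
    """Apply the conservative selection rule to permutation importances.

    The cut-off is the amplitude (absolute value) of the most negative
    score; a variable is kept only if its importance exceeds that cut-off.
    """
    negatives = [score for score in importance.values() if score < 0]
    cutoff = abs(min(negatives)) if negatives else 0.0
    kept = {name for name, score in importance.items() if score > cutoff}
    return kept, cutoff

# Hypothetical permutation importance scores (not results from the study).
scores = {
    "fine_substrate": 0.85,
    "water_depth": 0.40,
    "water_ph": 0.12,
    "phosphates": 0.05,   # positive, but within the range of the noise
    "velocity": 0.0,
    "altitude": -0.03,
    "shading": -0.10,     # most negative score, sets the cut-off at 0.10
}
kept, cutoff = relevant_predictors(scores)
```

The rationale is that negative importances can arise only from random variation, so their largest amplitude gives a rough estimate of the noise level around zero.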

While random forests are powerful predictive tools and can effectively handle complex data, they are essentially a black box in which it is difficult to assess partial relationships between the response and predictors. To corroborate the random forest results, we used ordination methods to visualize the relationships between macrophyte species richness and significant predictors. Matrices of important predictors were submitted to principal component analysis (PCA). PCA for each habitat type was conducted on correlation matrices in order to equalize the weight of dimensionally heterogeneous environmental variables. Subsequently, species richness data were overlain on each PCA ordination plot as a smooth surface using predictions based on generalized additive models with thin plate splines and restricted maximum likelihood selection of the degree of smoothing (Oksanen et al. 2013).
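Why a correlation matrix equalizes the weights can be shown with a small sketch. PCA on a correlation matrix is equivalent to PCA on variables standardized to zero mean and unit variance. The two-variable case is used here because its eigenvalues have a closed form (1 + |r| and 1 − |r|); the conductivity and ammonia values are hypothetical.

```python
import math

def zscores(values):
    """Standardize to zero mean and unit (sample) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def covariance(x, y):
    """Sample covariance; covariance(x, x) is the sample variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# Hypothetical stream measurements in very different units (assumption).
conductivity = [210.0, 350.0, 480.0, 600.0, 720.0, 910.0]  # uS/cm
ammonia = [0.05, 0.08, 0.07, 0.15, 0.12, 0.20]             # mg/l

# On the raw scale, conductivity would dominate any covariance-based PCA.
raw_variances = (covariance(conductivity, conductivity),
                 covariance(ammonia, ammonia))

zc, za = zscores(conductivity), zscores(ammonia)
r = covariance(zc, za)  # covariance of z-scores equals the Pearson correlation

# Eigenvalues of the 2 x 2 correlation matrix [[1, r], [r, 1]].
eigenvalues = (1 + abs(r), 1 - abs(r))
explained_pc1 = eigenvalues[0] / sum(eigenvalues)
```

After standardization each variable contributes a variance of exactly 1, so the total variance equals the number of variables and no variable dominates merely because of its measurement unit.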

All analyses were performed in R (R Core Team 2013).

Results

In total, we recorded 72 macrophyte taxa, including vascular plants (86.1 %), bryophytes (9.7 %) and macroscopic algae (genera Chara and Nitella; 4.2 %). Species richness ranged from 0 to 14 species per site. Considering habitat types, the diversity of macrophytes decreased in the following order: ditches (mean [min-max] = 4.9 [0–12]) > ponds (3.7 [0–9]) > streams (2.6 [0–14]) > rivers (2.1 [0–9]).

The regression tree analysis identified six significant predictors of macrophyte species richness in four different aquatic habitats (Fig. 2). In rivers, the partition was based on the threshold value of water pH of 8.4. The river sites with pH greater than this threshold supported, on average, significantly lower macrophyte species richness than the sites with a lower pH. The prediction error of the model was relatively small (RMSELOO = 1.9 species). In streams, the first partition was based on soil conductivity: streams with conductivity higher than 588 μS/cm showed greater macrophyte richness. The model further partitioned the group of streams with lower conductivity based on whether soil pH was greater or less than the threshold value of 7.9. Among streams with low soil conductivity, those with low soil pH favoured macrophyte diversity. However, the prediction efficiency of the model was quite weak (RMSELOO = 3.6 species). In ditches, the proportion of sand substrate was the only significant predictor of macrophyte diversity. Ditches with sand cover higher than 30 % showed lower species richness. Again, the predictive power of the model was low (RMSELOO = 3.1 species). Finally, the macrophyte diversity of ponds was significantly related to the turbidity and average depth of the water. Ponds with higher turbidity (> 2) supported lower species richness than the group of sites with more transparent water. The subsequent partitioning of sites with lower turbidity was based on a water depth threshold of about 1.5 m. Shallow ponds showed higher macrophyte diversity than deeper ones. This model showed reasonable prediction performance (RMSELOO = 2.4 species).
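The fitted trees amount to a handful of nested decision rules, which can be sketched as follows. The thresholds are those reported above; the function names and node labels are hypothetical, the mean richness of each terminal node is not reproduced because it is not given in the text, and assigning sites that lie exactly on a threshold to the lower group is an assumption.

```python
def river_node(water_ph):
    # sites above the pH threshold had lower mean richness
    return "high_ph" if water_ph > 8.4 else "low_ph"

def stream_node(soil_conductivity_us_cm, soil_ph):
    if soil_conductivity_us_cm > 588:
        return "high_conductivity"            # greater richness
    # only the low-conductivity group is split further, by soil pH
    if soil_ph > 7.9:
        return "low_conductivity_high_ph"
    return "low_conductivity_low_ph"          # richer of the two pH groups

def ditch_node(sand_cover_pct):
    # ditches with more than 30 % sand cover had lower richness
    return "sandy" if sand_cover_pct > 30 else "not_sandy"

def pond_node(turbidity, depth_m):
    if turbidity > 2:
        return "turbid"                       # lower richness
    # clearer ponds are split further by average water depth
    if depth_m > 1.5:
        return "clear_deep"
    return "clear_shallow"                    # higher richness than clear_deep

examples = [
    river_node(8.6),
    stream_node(400, 7.5),
    ditch_node(10),
    pond_node(1.0, 0.8),
]
```

Each site falls into exactly one terminal node, and the predicted richness for a new site is the mean richness of the training sites in that node.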

Fig. 2 Conditional inference regression trees showing relationships between the environmental predictors and macrophyte species richness in the four aquatic habitats

Regression trees optimally fit the data, but a wider generalization of the results was achieved through random forest analysis. Variables significant in individual regression trees appeared as the important predictors in random forests as well (Fig. 3). Additionally, a couple of other globally relevant predictors of macrophyte species richness emerged. Geographical and landscape variables (cf. Table 1) were more important for slow-flowing and standing waters (ditches and ponds) than for rivers and streams. The cover of landscape structures in the vicinity of the sites did not appear relevant. Interestingly, pond origin (natural vs artificial) played a role in macrophyte diversity, with natural sites supporting higher richness (5.2 species on average) than artificial sites (3.0). Variables related to hydrology were relevant for each water body type. Among those, fine substrate cover and water depth appeared as the most general predictors across all but one aquatic habitat type. Substrate diversity, per se, did not affect macrophyte diversity. While the chemistry of water and soil appeared important for species richness, no general pattern emerged. Interestingly, phosphate loads in water and soil did not affect macrophyte richness. Prediction error of the random forests (RMSEOOB) increased in the following order: rivers (1.9) < streams (2.3) < ponds (2.4) < ditches (3.0).

Fig. 3 Variable importance plots for environmental predictors from random forest models of the four aquatic habitats. The permutation importance of a variable is the mean decrease in the mean squared error for the out-of-bag data when the values of that variable are randomly permuted. Dashed lines indicate variable importance that can be due to random variation only. Relevant predictors are denoted by black bars. Description of variables is given in Table 1.

The relationships between important variables and macrophyte species richness were visualized in principal component space (Fig. 4). Apparently, macrophyte diversity and environmental variables are related in a complex and non-linear way. Streams are a notable exception, where a rather linear pattern emerged. Richness of stream macrophytes showed a positive linear association with nutrient gradients represented by conductivity and ammonia.

Fig. 4 Principal component plots of important predictors from random forest models of the four aquatic habitats. Contour lines represent the smooth surface of macrophyte species richness fitted using a GAM. The size of a centroid is proportional to the species richness of a given site. The variance explained by a particular component is displayed in parentheses.

Discussion

The regression tree analysis revealed several hydrological and chemical variables as important predictors of aquatic macrophyte species richness. These findings were further corroborated within a more general framework of random forests. Importantly, the effect of regional differences was negligible for each habitat type, which may allow for a wider generalization of the results.

Association with chemical variables

In our study, water and soil pH emerged as significant predictors of species richness in running waters (rivers and streams). Numerous other studies have found pH and associated factors to be principal determinants of macrophyte diversity (Lacoul and Freedman 2006). The effect of pH relates to physiological differences among plants, some of which can use both bicarbonate and carbon dioxide as carbon sources. Others, however, rely on carbon dioxide, which is unavailable in waters with even moderately high pH. Consequently, many aquatic plant species are distributed among aquatic habitats according to their ability to use only carbon dioxide or both sources of carbon (Bornette and Puijalon 2011). Hydrochemical parameters, such as pH, limit aquatic diversity at both very low and very high values, and the relationship between these parameters and biodiversity may be described by a unimodal curve (Karatayev et al. 2008). Macrophyte species richness is usually very low in acidic conditions, peaks at pH > 7 and declines in waters with high pH. For instance, some studies of North European lakes showed that macrophyte species richness is highest under slightly alkaline conditions, at pH values between 7 and 8 (Vestergaard and Sand-Jensen 2000; Karatayev et al. 2008). However, such a unimodal pattern is noticeable only when large pH gradients are examined. Here, we sampled a relatively short pH gradient and found pH values of 8.4 and 7.9 to be the thresholds for species richness in rivers and streams, respectively. Sites with pH values lower than these thresholds supported higher species richness than sites with more alkaline conditions. Considering the left-truncated gradient of pH examined in this study, our findings from running waters are consistent with the previous results obtained in lakes. Apparently, physiological limits imposed by pH and related factors on aquatic plants are mirrored in the similar responses of macrophyte diversity in standing and running waters.

Nutrient concentration alone did not seem to be related to species richness in our study (cf. Fig. 3). On the other hand, soil conductivity showed a significant association with macrophyte diversity in streams. Conductivity, an integration of all the dissolved materials, is usually linked with the trophic state and productivity of water bodies (e.g. Brylinsky and Mann 1973; Heino et al. 2010). Our data show that the best regression model relating the soil conductivity of streams to the other environmental predictors involved the concentration of ammonia in soil and water, phosphates in water and the proportion of fine sediment (F(4,35) = 33.2, P < 0.0001, R² = 0.79). All were positively related to conductivity and, therefore, conductivity can be considered as a complex proxy for productivity in this study. Recursive partitioning of this productivity gradient showed that streams with soil conductivity > 588 μS/cm have more diverse communities than less productive streams with lower conductivity. A positive relationship between conductivity and diversity of stream macrophytes is rarely reported explicitly; however, the link between conductivity and macrophyte distribution in running waters is well known (e.g. Manolaki and Papastergiadou 2013). Again, the pattern of the response of macrophyte richness to conductivity depends on the length of the gradient examined. In general, the richness of aquatic vegetation responds in a manner consistent with the predictions of the hump-back model of the productivity-diversity relationship (Rørslett 1991; Sand-Jensen and Borum 1991; Murphy 2002): aquatic vegetation in oligotrophic conditions is species-poor because of nutrient limitation, while the vegetation diversity peaks at an intermediate productivity level. In strongly eutrophicated water bodies, aquatic macrophytes decline in species richness or even disappear due to light limitation, anoxia and other indirect effects of eutrophication.
A unimodal response of macrophyte richness to conductivity was not observed in our study, but this is likely the result of the relatively short gradient covered in the study that lacked strongly eutrophicated streams.
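The test statistics reported above for the conductivity model can be cross-checked with the standard identity linking the overall F statistic of a linear regression to its coefficient of determination, F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below assumes an ordinary least-squares model with k = 4 predictors fitted to the n = 40 stream sites (20 per bioregion); the function name is introduced only for illustration.

```python
def f_from_r2(r2, n_predictors, df_residual):
    """Overall F statistic of a linear model computed from its R-squared."""
    return (r2 / n_predictors) / ((1 - r2) / df_residual)

n_sites = 40                                   # stream sites in both bioregions
n_predictors = 4                               # numerator degrees of freedom
df_residual = n_sites - n_predictors - 1       # denominator degrees of freedom

f_value = f_from_r2(0.79, n_predictors, df_residual)

# R-squared is reported to two decimals, so bracket the rounding interval.
f_low = f_from_r2(0.785, n_predictors, df_residual)
f_high = f_from_r2(0.795, n_predictors, df_residual)
```

The residual degrees of freedom equal 35, matching the reported value, and the rounded R² of 0.79 gives F of roughly 32.9. The reported F = 33.2 lies within the interval implied by rounding R² to two decimals, so the three reported quantities are mutually consistent.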

Association with hydrological variables

It is generally known that aquatic plants are primarily influenced by factors inherent in the limnology of water bodies, including geomorphology, climate, hydrology and related sediment quality (Lacoul and Freedman 2006). Among the variables linked to the hydrological conditions, we found water depth, turbidity and sediment composition to be important predictors of macrophyte species richness.

The proportion of fine sediment was the most general predictor of aquatic plant diversity across all habitats (Fig. 3). Not surprisingly, fine sediment also played an important role in the species composition of aquatic macrophytes in the Carpathian bioregion (Hrivnák et al. 2013). In general, the physical texture of the substrate directly affects the recruitment, rooting and growth of aquatic plants (Bornette and Puijalon 2011). Specifically, fine sediment is a suitable substrate for many vascular plants, including true aquatic plants (Willby et al. 2000), which represent a substantial portion of macrophytes in our study. In all habitats studied, macrophyte richness significantly increased with increasing fine substrate cover, or peaked at high proportions of fine substrate (pairwise GAM models, details not shown). Since rooted aquatic plants are able to take up dissolved nutrients from the sediment (Barko and James 1998), the observed pattern can be a consequence of differences in nutrient availability among substrates. Jones et al. (2012) argued that the effects of fine sediment on macrophytes closely parallel the effects of nutrient enrichment (see above). This is rather unsurprising since fine sediment often contains high concentrations of organic matter, carbon, nitrogen and phosphorus (Sand-Jensen 1998). We showed that fine substrates can increase species richness in all types of aquatic habitats, but the most pronounced effect is expected in nutrient-poor, fast-flowing streams with stony bottoms, where the accumulation of fine sediment can increase the patches of stream bed suitable for macrophyte anchoring and provide a nutrient-rich substrate for plant growth.

Turbidity and water depth were also identified as significant hydrological predictors of macrophyte species richness in this study. Indeed, these characteristics are important mainly in lentic ecosystems (Akasaka et al. 2009; Broyer and Curtet 2012). However, average water depth emerged as a general predictor in all but one of the examined habitats (Fig. 3). Turbidity and water depth are closely connected with light conditions, which represent a physiological limit for macrophyte distribution. It has long been recognized that the maximum depth of colonization by aquatic macrophytes and water transparency are positively related (Canfield et al. 1985). Greater transparency is usually accompanied by increases in species richness (e.g. Akasaka et al. 2009; Capers et al. 2009). The associated mechanism is probably linked to a larger area available for colonization and increased habitat richness generating more niches for the growth of aquatic plants (Vestergaard and Sand-Jensen 2000; Sand-Jensen et al. 2008). These findings support our observations in central-European ponds, where the sites with high water transparency and low water depth harboured the highest macrophyte diversity (cf. Fig. 2).

Association with geographical and landscape variables

Macrophyte species richness showed only weak or no relationships with the geographical and landscape variables examined in this study. Landscape structures influence the local diversity of aquatic plants through multiple direct and indirect processes involving the distribution of propagules in adjoining habitats, availability of dispersal routes, or changes in water quality, to name a few (Houlahan et al. 2006). However, these processes operate at different spatial extents and, for example, the landscape structures that exhibit the dominant effects on diversity at one spatial extent may be insignificant at another (Akasaka et al. 2009). Apparently, the proportion of landscape structures in the close vicinity of the study sites did not affect macrophyte richness. This was not surprising in rivers and streams, where flowing water may carry buoyant seeds long distances, and local plant communities may receive a large proportion of their seeds by long-distance dispersal. In lentic systems, existing studies conducted over multiple spatial scales report that relationships between land use and diversity are detected at rather wider spatial extents (Houlahan et al. 2006; Akasaka et al. 2009). It seems that 100 m from a habitat edge (the distance examined in this study) is probably too short to detect land-use effects on macrophyte species richness. It is more likely that aquatic plant diversity is influenced by land use over wider spatial extents.

A notable exception among landscape predictors is the effect of pond origin. On average, the natural ponds supported almost twice as many species as the man-made ponds. One of the reasons for low macrophyte diversity in man-made ponds might be their lower physical complexity. For example, most of the studied artificial ponds are relatively deep (mean depth = 1.7 m), permanent water bodies with steep banks and a stable hydrological regime, while natural ponds are shallow habitats (< 1.0 m) with gradually sloping banks and an often unstable hydrological regime with dry periods. Admittedly, artificial water bodies can be beneficial as secondary habitats for preserving the diversity of freshwater organisms. Under some circumstances, man-made ponds can create favourable conditions for the maintenance of diverse plant communities (e.g. Linton and Goulder 2000). However, this is not applicable as a general rule. The differences observed here suggest that the loss of natural ponds might result in the impoverishment of the plant communities in pond habitats.

Performance of recursive partitioning models

Recursive partitioning methods were employed due to their ability to fit interactions and non-linear relationships, which is advantageous when analysing the complex relationships typically found in ecological data (e.g. De'ath and Fabricius 2000). In this study, for example, the response of species richness to important environmental predictors was somewhat non-linear (cf. Fig. 4). The preliminary examination of pairwise relationships using a flexible GAM approach revealed that more than 60 % of all significant models showed curvilinear patterns (equivalent degrees of freedom > 1.5). Moreover, random forests can deal with large numbers of predictor variables, even exceeding the sample size (Strobl et al. 2009), which can be challenging for traditional statistical methods. If only linear methods were used in such cases, some important responses might go unnoticed (e.g. the effect of fine sediment on plant diversity in this study).

The combination of random forests and individual regression trees allowed us to detect globally important predictors and to identify the environmental thresholds of macrophyte diversity. The predictive performance of random forests was comparable to or even better than the performance of individual regression trees. The latter method showed a lower predictive power in streams and ditches. In general, random forests outperform single trees in predictive power in most real data applications, especially in cases where many predictor variables work in complex interactions (Strobl et al. 2009). However, the predictive accuracy of all models was quite weak, with a mean error of prediction ranging from 1.9 to 3.6 species. Field studies show that aquatic macrophyte richness is related to a wide range of chemical, hydrological and landscape variables, and their effects interact and are scale-dependent (for a review see Lacoul and Freedman 2006; Bornette and Puijalon 2011; Bakker et al. 2013). This makes macrophyte diversity generally hard to predict. Such weak explanatory power of diversity-environment models and apparent randomness in the response led Edvardsen and Økland (2006) to the conclusion that water bodies accumulate species more or less individualistically. Nevertheless, we believe that some mechanisms and patterns can be inferred from field surveys, as we outlined in the discussion above.

Implications for conservation and management

Maintaining higher diversity of aquatic macrophytes may sustain wetland ecosystem functioning and promote the services of these ecosystems to humans (Engelhardt and Ritchie 2001). We offer a series of simple rules for maintaining higher macrophyte species richness, which can have important applications for the conservation and management of aquatic habitats in central Europe.

The key effect of hydrological variables is noteworthy. Changes in substrate composition, especially in the proportion of fine sediment, are likely to have a direct effect on the number of species in each type of aquatic habitat. A higher proportion of fine substrate was associated with higher macrophyte richness. Our observations also indicate that the protection of high water transparency would be generally beneficial for macrophyte richness. Thus, a key component of maintaining diverse plant communities is to keep turbidity at a low level.

Water chemistry, particularly the pH and conductivity of streams and rivers, is also important. A slightly alkaline water reaction and elevated conductivity seem to maximize species richness in running waters.

There is also evidence that natural ponds support a greater number of plant species than constructed habitats. In order to effectively conserve macrophyte diversity, natural ponds must be protected, since these habitats are more likely to provide suitable and heterogeneous conditions for aquatic plants than artificial, secondary habitats.

Finally, our results suggest that land use within 100 m of a habitat edge has a limited impact on aquatic plant richness. Under restricted resources, attention should be focused more on managing hydrological features and water chemistry than on adjacent land use, although all of these features need to be considered in general.