Introduction

Invasive alien plant species (IAPS) are recognized as a major component of global change (Higgins and Richardson 1996; Higgins et al. 2000) with effects on biodiversity, ecosystem services, disturbance regimes, biogeochemical cycles, and human societies (Van Wilgen et al. 2001; Pauchard and Alaback 2004; Sharma et al. 2005; Thuiller et al. 2007; Gassó et al. 2009; Rödder et al. 2009; Pejchar and Mooney 2009).

Species Distribution Models (SDMs) have been used for identifying suitable areas for species to invade and thus prioritizing IAPS surveys and management actions. One of the factors influencing model robustness is the selection of meaningful predictor variables in relation to the spatial scale and geographical extent (Austin 2002; Araújo and Guisan 2006; Elith and Leathwick 2009; Franklin 2010). Robust SDMs are useful to improve our understanding of the importance of environmental and human-related variables in influencing the ranges of species (Luoto et al. 2006; Ohlemuller et al. 2006).

The selection of variables should be based on the ecological and physiological characteristics of the IAPS, data resolution, quality, and availability (Thuiller et al. 2003; Dormann 2007; Rödder et al. 2009; Austin and Van Niel 2011). Many studies have used SDMs to predict the invasion suitability of areas based on climatic variables (Peterson and Vieglais 2001; Rödder et al. 2009), which at large biogeographical scales, set the physiological niche of plant species for survival and reproduction (Rouget and Richardson 2003; Thuiller et al. 2003). However, opinions differ on whether climatic variables are sufficient for explaining species distributions (Pearson and Dawson 2003; Thuiller et al. 2004; Pearson et al. 2004). There is a need for more evidence in support of the idea that purely climate-based modeling proves sufficient to predict the distribution of IAPS (Araújo and Guisan 2006; Araújo and Luoto 2007; Austin and Van Niel 2011), compared to using other environmental and human-related variables (e.g., Pearson et al. 2004; Thuiller et al. 2004; Pauchard and Alaback 2004; del Barrio et al. 2006; Coudun et al. 2006; Luoto et al. 2006; Ficetola et al. 2007).

Other factors such as land cover, soils, fire frequency, and anthropogenic variables also determine the presence or absence of a species in a particular area (Rouget and Richardson 2003; Pearson and Dawson 2003; Staver et al. 2011). Unfortunately, these variables are rarely taken into account due to the scarcity of adequate data sets (Thuiller et al. 2003, 2005). SDMs that incorporate these factors could have greater predictive power and reveal the specific factors responsible for detailed distributional phenomena (Roura-Pascual et al. 2004). Only a few studies have compared the importance of variables and how this importance varies across scales (Ohlemuller et al. 2006; Luoto et al. 2006; Funk et al. 2016). Consequently, recent studies have started to incorporate human activity and disturbance variables (e.g., Pino et al. 2005; Thuiller et al. 2006; Chytrý et al. 2008; Gassó et al. 2009). Roura-Pascual et al. (2004) stated that SDMs that do not take anthropogenic disturbance into account could underestimate the predicted species distribution. Hence, there is a need to assess the importance of human-related predictor variables for successful modeling of IAPS (Dormann 2007; Broennimann et al. 2007).

South Africa represents an interesting case study since it has been invaded by numerous species with significant ecological and economic implications (Richardson et al. 2005; Sharma et al. 2005). The number of IAPS and their impacts have increased since the first concerns regarding plant invasion emerged in the 1770s with the introduction of Opuntia ficus-indica, and of Prosopis glandulosa, P. juliflora and P. velutina in the 1800s (Milton and Dean 2010). IAPS, especially large trees, use more water than native species (Calder and Dye 2001; Mallory et al. 2011), resulting in the decline in annual runoff registered in the past decades and adversely affecting water supplies (Le Maitre et al. 1996, 2000). According to Cullis et al. (2007), the impact of IAPS in South Africa is approximately 4% of current registered water use, and could increase to 16% in the future. These adverse impacts of IAPS on water flows were the prime motivation for the establishment, in 1995, of South Africa’s national Working for Water programme (WfW) (Le Maitre et al. 2016). The rationale is that the most cost-effective way to increase water supplies is to remove IAPS from catchments: more water could be made available at a lower cost where IAPS control operations are in place than by developing additional water supply schemes (Le Maitre et al. 1998). Although this public program is one of the largest ecosystem restoration programs in the world (Hobbs 2004; Richardson et al. 2005), its effectiveness in minimizing the spread has recently been questioned (McConnachie et al. 2012; van Wilgen et al. 2016). Robust and reliable SDMs for IAPS can help stakeholders to prioritize interventions by control programs and monitor their progress, in order to reach successful and efficient implementation.

This study tested the appropriateness of including human-related variables in SDMs for predicting the distribution of IAPS, and assessed the variables’ importance on the presence and abundance of these species. Using a newly available dataset on IAPS distribution and density, the National Invasive Alien Plant Survey (NIAPS), conducted at quaternary catchment (QC) level, we addressed the following questions: (1) what are the main factors influencing the spatial distribution of the main IAPS of South Africa? (2) are there differences in the influence that these variables have on the potential presence and abundance of these IAPS? (3) does the inclusion of anthropogenic variables in the SDMs result in a better prediction and successful modeling of these invasive plants? Considering that IAPS present a growing challenge for environmental managers and policymakers (Higgins and Richardson 1996; Rouget et al. 2002), we analyzed the predicted potential spread of the main South African IAPS, in relation to provinces, biomes, and species’ residence time with the intent of providing to decision makers a clearer understanding of the impact of each factor.

Methods

Study area

South Africa is the southernmost country in Africa. Owing to its varied topography and oceanic influence, it spans a great variety of climatic zones, ranging from the extreme desert of the southern Namib in the northwest to the lush subtropical climate along the eastern coast. As a result of this wide range of climatic and geomorphological conditions, South Africa hosts a variety of distinct plant and animal communities (Rutherford and Westfall 1986) that share characteristics formed in response to a common physical climate (biomes). Eight biomes are identified in South Africa: the grassy dwarf shrublands of the Nama and Succulent Karoo and the annual plant formations of the environmentally harsh Desert in the western part of the country (Mucina and Rutherford 2006); the Mediterranean-like shrubland and heathland vegetation of the Cape Fynbos biodiversity hotspot; the Grasslands of the central plateaux, where the combination of frost, fire, and grazing prevents the establishment of trees; the eastern subtropical Albany Thicket, a closed shrubland/low “forest” dominated by evergreen, sclerophyllous, or succulent trees, shrubs, and vines; the eastern Indian Ocean Coastal Belt, dominated by evergreen temperate multilayer forests; and the northern Savanna, characterized by the dynamic coexistence of grasses (mainly C4 type) and woody vegetation.

Response variable

Plant surveys provide essential information that can determine the geographic extent of an IAPS, define its ecological requirements, indicate the potential for further spread, and provide a historical account of the IAPS’s introduction and expansion (Henderson 1999). The NIAPS was implemented to establish a cost-effective and statistically sound IAPS monitoring system for South Africa at a quaternary catchment level. It used a stratified systematic sampling approach, consisting of 72,682 sample points distributed proportionally among environmental strata representing the steepest gradient contributing the most to the IAPS occurrence within the country. An aerial field survey was conducted, and at each 100 m² survey plot the following attributes were captured: (1) overall density of IAPS; (2) the three dominant IAPS; (3) density per dominant IAPS; and (4) size class per dominant IAPS according to the WfW mapping standards (Table 1) (Working for Water Mapping Standards 2003; Kotzé et al. 2010).

Table 1 List of explanatory variables divided into 25 environmental (E) and 5 anthropogenic variables (A)

Data were grouped into quaternary catchments (QCs), which are the basic units for water resource management in South Africa and the operational delineation for WfW clearing projects. A total of 215 species were included in the survey, drawn from three referenced species classifications (Nel et al. 2004; Robertson et al. 2003; Marais et al. 2004). A final list of 27 major plant invaders was produced, and these species were mapped and included in the NIAPS database (Table 2). They were recorded as species (e.g., Arundo donax) or as genera when referring to multiple species that are not easily differentiable but have similar ecological requirements (e.g., Opuntia spp. includes Opuntia ficus-indica and Opuntia stricta). The majority of species are alien phanerophytes likely reflecting the high introduction pressure of trees since and the facilitation of alien tree spread through deliberate and massive planting (Richardson 1998).

Table 2 List of NIAPS species/genus referred to as IAPS from here on (Kotzé et al. 2010)

As response variable we used the condensed area value for each taxon, which expresses the equivalent of the total invaded area with the canopy cover rescaled to 100% (Versfeld et al. 1998). For example, 100 ha that was covered by 10% with IAPS was expressed as a condensed area of 10 ha with 100% cover. The condensed area was calculated by multiplying the average IAPS density (%) by the invaded area (ha) and dividing by 100. We then normalized the condensed area of each IAPS within the QCs (Kotzé et al. 2010).
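The condensed-area calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated formula (average density in % multiplied by invaded area in ha, divided by 100); the function name and the input checks are assumptions made for the example and are not part of the NIAPS workflow.

```python
def condensed_area(mean_density_pct: float, invaded_area_ha: float) -> float:
    """Invaded area (ha) rescaled to the equivalent area at 100% canopy cover."""
    if not 0.0 <= mean_density_pct <= 100.0:
        raise ValueError("density must be a percentage between 0 and 100")
    if invaded_area_ha < 0.0:
        raise ValueError("invaded area cannot be negative")
    # Average density (%) times invaded area (ha), divided by 100.
    return mean_density_pct * invaded_area_ha / 100.0


# Worked example from the text: 100 ha covered at 10% density.
example = condensed_area(10.0, 100.0)  # 10.0 ha at 100% cover
```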

Explanatory variables

The selection of variables was based on the understanding of the ecological and biophysical processes influencing the dispersal of the species, the availability of data and the purpose of the model (Thuiller et al. 2003, 2005). The choice of the climatic and land use variables follows Rouget et al. (2002, 2004), and Thuiller et al. (2007). The use of other environmental variables, such as soil water, as well as some human-related variables, such as roads and urban areas was based on Richardson et al. (2005).

We estimated the values of environmental (climate, soil, land use, etc.) and human-related variables (roads, railways, urban areas) from two databases with spatially explicit GIS data layers (Table 1). We identified nine climatic variables (Schulze 2008), slope, and ten soil variables from the ARC’s Land Type Survey database (Land Type Survey Staff 1972–2006 (2006)), while land use and other human-related variables were obtained from the National Land Cover database (CSIR and ARC 2005). Considering also the importance of fire history and soil nitrification for the spread of alien plant species in limiting environments such as South Africa (e.g., Funk et al. 2016), we included fire frequency (F), from a fire count map for the period 2000–2013 based on the MODIS MCD64 burnt area product (Giglio et al. 2009), as well as the nitrogen content (N) of soils at a depth of 20 cm, obtained from the AfSIS Africa Soil Map at a spatial resolution of 250 m (Hengl et al. 2015).

Land cover, soil type, and human-related variables were standardized for the area of each QC. The rest of the variables were calculated by averaging the data within each QC (e.g., climatic variables, soil pH, etc.).
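A minimal sketch of the two aggregation rules follows. It assumes that "standardized for the area" means dividing the extent of a feature inside a QC by the QC area; the text does not spell this out, so that reading, the function names, and the numbers are all hypothetical.

```python
from statistics import mean


def area_standardized(feature_extent: float, qc_area: float) -> float:
    """Extent of a land cover, soil, or human-related feature per unit QC area.

    ASSUMPTION: standardization is a simple division by QC area.
    """
    if qc_area <= 0.0:
        raise ValueError("QC area must be positive")
    return feature_extent / qc_area


def qc_average(cell_values: list) -> float:
    """Mean of the grid-cell values (e.g., soil pH) that fall inside one QC."""
    return mean(cell_values)


# Hypothetical QC of 1000 ha containing 250 ha of urban land.
urban_share = area_standardized(250.0, 1000.0)   # 0.25
# Hypothetical soil pH values of the cells inside the same QC.
mean_ph = qc_average([6.0, 6.5, 7.0])            # 6.5
```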

Model

Zero-inflated generalized linear models (GLMs) were used to examine the variation in the occurrence and abundance of the IAPS for each QC across the study area (Segurado and Araújo 2004). We fitted two-part models to analyze the relationship and identify factors explaining occurrence and abundance. The first part of the model is a logistic regression model to analyze the influence of the explanatory variables on the presence of IAPS. The second part is a multivariate linear regression model for the log-abundance, which highlights the influence of the explanatory variables on the abundance of IAPS (Di Lorenzo et al. 2011). Since the NIAPS database had both presence and absence data, and presence-only models tend to overpredict the actual invasion distribution, we applied a model that uses presence/absence data (Senan et al. 2012). The log-likelihood of the two-part model was expressed as the sum of the log-likelihood of each part, and maximum likelihood estimation was then performed simultaneously on the two parts to avoid bias in parameter and standard error estimates.
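To make the structure of the joint log-likelihood concrete, the sketch below adds a Bernoulli (logistic) term for occurrence to a Gaussian term for the log-abundance of the QCs where the taxon is present. It is an illustration under stated assumptions, not the fitted model: both parts share the same predictor rows, the first column is an intercept, and the names `beta`, `gamma`, and `sigma` are chosen for the example.

```python
import math


def inv_logit(z: float) -> float:
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(coefs, row):
    """Dot product of coefficients and one row of predictors."""
    return sum(c * x for c, x in zip(coefs, row))


def two_part_loglik(abundance, X, beta, gamma, sigma):
    """Return (occurrence part, abundance part, total) of the log-likelihood.

    abundance: response per QC, 0 meaning absence
    X: predictor rows, each starting with 1.0 for the intercept (assumption)
    beta: logistic coefficients; gamma: coefficients for log-abundance
    sigma: residual standard deviation of log-abundance
    """
    ll_occurrence = 0.0
    ll_abundance = 0.0
    for y, row in zip(abundance, X):
        p = inv_logit(linear_predictor(beta, row))
        if y > 0:
            ll_occurrence += math.log(p)
            # Gaussian log-density of log(y) around its linear predictor.
            resid = math.log(y) - linear_predictor(gamma, row)
            ll_abundance += (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
                             - resid ** 2 / (2.0 * sigma ** 2))
        else:
            ll_occurrence += math.log(1.0 - p)
    return ll_occurrence, ll_abundance, ll_occurrence + ll_abundance
```

Because the total is a plain sum, an optimizer can maximize it over `beta`, `gamma`, and `sigma` in a single pass, which is the sense in which the two parts are estimated simultaneously.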

Model selection was performed using a forward stepwise algorithm based on minimization of the Akaike Information Criterion (AIC), which penalizes redundant predictors and so helps to limit multicollinearity, and which leads to parsimonious models with good predictive properties. Collinearity in the final multivariate model was checked through variance inflation factors.
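The selection loop can be sketched generically as follows. The sketch uses the standard definitions AIC = 2k − 2 ln L and VIF = 1 / (1 − R²); the `toy_fit` function and its log-likelihood gains are invented placeholders standing in for a refit of the two-part model, and the variable names are hypothetical.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def forward_stepwise(candidates, fit):
    """Greedy forward selection; fit(vars) returns (log_likelihood, n_params)."""
    selected = []
    remaining = list(candidates)
    best = aic(*fit(tuple(selected)))
    while remaining:
        # Score every model that adds exactly one more variable.
        scored = [(aic(*fit(tuple(selected + [v]))), v) for v in remaining]
        cand_aic, cand = min(scored)
        if cand_aic >= best:
            break  # no addition lowers the AIC any further
        selected.append(cand)
        remaining.remove(cand)
        best = cand_aic
    return selected, best


def vif(r_squared: float) -> float:
    """Variance inflation factor from the R^2 of one predictor on the others."""
    return 1.0 / (1.0 - r_squared)


# HYPOTHETICAL gains in log-likelihood, for illustration only.
GAINS = {"rain": 10.0, "temp": 4.0, "roads": 0.5}


def toy_fit(variables):
    """Placeholder for refitting a model with the given variables."""
    loglik = -100.0 + sum(GAINS[v] for v in variables)
    return loglik, 1 + len(variables)  # intercept plus one slope per variable


selected, best_aic = forward_stepwise(["rain", "temp", "roads"], toy_fit)
```

In this toy run the third variable is rejected because its small gain in log-likelihood does not offset the penalty of one extra parameter.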

The two-part model was run twice: first, using only the environmental variables, and secondly using both environmental and anthropogenic variables. To understand which set of variables is more appropriate we compared the out-of-bag prediction error values of random forest models. The prediction error is the squared difference between observed and predicted values, and this can be estimated repeatedly on subsets of observations that are not used for estimation (out-of-bag observations). The model with the lowest error-rate was considered the best.
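The out-of-bag logic can be illustrated without a full random forest. In the sketch below each "tree" is replaced by a deliberately trivial learner (the mean of the in-bag responses), which is an assumption made to keep the example short; only the bookkeeping of in-bag and out-of-bag observations mirrors the procedure described above.

```python
import random
from statistics import mean


def oob_squared_error(y, n_rounds=200, seed=0):
    """Mean squared out-of-bag prediction error with a stand-in learner."""
    rng = random.Random(seed)
    n = len(y)
    oob_predictions = [[] for _ in range(n)]
    for _ in range(n_rounds):
        # Bootstrap sample: n draws with replacement.
        in_bag = [rng.randrange(n) for _ in range(n)]
        in_bag_set = set(in_bag)
        # STAND-IN for a fitted tree: predict the in-bag mean everywhere.
        prediction = mean(y[i] for i in in_bag)
        for i in range(n):
            if i not in in_bag_set:
                oob_predictions[i].append(prediction)
    # Average the predictions of the rounds in which each observation was
    # out of bag, then take the squared difference from the observed value.
    errors = [(y[i] - mean(preds)) ** 2
              for i, preds in enumerate(oob_predictions) if preds]
    return mean(errors)
```

Running the same procedure once per predictor set (ENV and TOT) with a real learner would give the two error values that are compared.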

We also analyzed whether the differences between the error-rates for each species were statistically significant, by applying the Wald test (Agresti 2002) after estimation of the standard errors via resampling. The null hypothesis for the selection of the explanatory variables was that the difference between the error-rates of the two models was 0; in other words, that the use of either model does not result in significant differences in the response predictions. If the p value is < 0.05, the difference between the models is significant, and hence the inclusion of anthropogenic variables results in a more adequate prediction. Similarly, the null hypothesis for the selection of the species was that all the coefficients of the regression model were 0 for each species.
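A Wald test on a single difference reduces to a z statistic compared against the standard normal distribution. The sketch below assumes a two-sided test and a standard error taken as the standard deviation of the resampled differences; both choices, and the function names, are assumptions for illustration.

```python
import math
from statistics import stdev


def resampled_standard_error(resampled_deltas):
    """Standard error estimated as the spread of resampled differences."""
    return stdev(resampled_deltas)


def wald_test(delta: float, standard_error: float):
    """Two-sided Wald test of H0: delta = 0; returns (z, p value)."""
    z = delta / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

With this convention, a difference about 1.96 standard errors away from zero sits at the 0.05 threshold used in the text.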

We then used a Random Forest (RF) model to assess the importance of the explanatory variables in influencing the distribution and abundance of IAPS (Breiman 2001). In an RF model, bootstrap samples are used to construct multiple trees with a subset of randomized variables. The measure of importance was derived from the contribution of each explanatory variable accumulated along all nodes and trees (Senan et al. 2012). We analyzed the model accuracy by computing the Gini Impurity Measure. We then compared the current IAPS distribution with the predictions from the RF model, taking into account the relationship with the variables. In order to have maps for this comparison (Fig. 1), we converted the probability data into binary presence/absence data, using a probability threshold value above which the IAPS were deemed to be present. We used the method of the least predicted value: the lowest presence probability associated with the areas where the species was actually observed was taken as the minimum value beyond which a species was considered present. All the statistical analyses were performed in R (R Core Team 2014). We finally analyzed the potential distribution of the IAPS in relation to the South African provinces and biomes, and we also assessed the relative occupancy as a function of the species’ residence time.
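Two of the quantities mentioned above are easy to state in code: the Gini impurity of a set of class labels, and the least predicted value threshold used to turn probabilities into presence/absence. The sketch assumes that a QC is classed as present when its probability is at or above the threshold, so that every observed presence is retained; the function names and the four example values are hypothetical.

```python
def gini_impurity(labels) -> float:
    """Gini impurity of a node: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def least_predicted_value(probabilities, observed_present) -> float:
    """Lowest predicted probability among QCs where the taxon was observed."""
    at_presences = [p for p, seen in zip(probabilities, observed_present) if seen]
    if not at_presences:
        raise ValueError("no observed presences to derive a threshold from")
    return min(at_presences)


def to_presence_absence(probabilities, threshold):
    """ASSUMPTION: present when probability is at or above the threshold."""
    return [p >= threshold for p in probabilities]


# Hypothetical predictions for four QCs; the taxon was observed in the first two.
probs = [0.9, 0.4, 0.2, 0.6]
observed = [True, True, False, False]
threshold = least_predicted_value(probs, observed)   # 0.4
predicted = to_presence_absence(probs, threshold)
# The fourth QC is predicted suitable although the taxon is not yet recorded there.
```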

Fig. 1 Map of South Africa showing the currently invaded QCs and potentially new QCs suitable for invasion. In white are the QCs for which the two-part model showed invasion potential values that were lower than the set presence threshold. (Color figure online)

Results

Assessment of variables

We assessed whether the use of anthropogenic variables could improve the accuracy of the predictive model. Table 3 shows the values for each species of the error-rate related to the model run using only environmental variables (OOB error-rate ENV) and to the model run including the human-related variables (OOB error-rate TOT). For each taxon, the best model is expressed with the lowest error-rate value. The “Δ error-rate (ENV-TOT)” shows the differences between the error-rate of the two models: if Δ error-rate is negative, the best model is the one with the environmental set of variables alone, if Δ error-rate is positive, the model with both environmental and anthropogenic variables is the one with a better prediction.
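The sign convention of the Δ error-rate column can be written out as a small helper. The function name and the example error-rates are hypothetical and are not values from Table 3.

```python
def compare_error_rates(oob_env: float, oob_tot: float):
    """Return (delta, preferred model) using delta = ENV error - TOT error."""
    delta = oob_env - oob_tot
    if delta > 0:
        preferred = "TOT"   # adding anthropogenic variables lowered the error
    elif delta < 0:
        preferred = "ENV"   # environmental variables alone did better
    else:
        preferred = "tie"
    return delta, preferred


# Hypothetical error-rates for one taxon.
delta, preferred = compare_error_rates(0.25, 0.125)
```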

Table 3 Out-of-bag (OOB) error-rate for each species for the two RF models built with only environmental variables (ENV) and with the inclusion of anthropogenic variables (TOT)

For 13 IAPS, the model built with environmental and anthropogenic variables (TOT) seemed to be more suitable for the prediction. For 10 IAPS, the model built only with environmental variables (ENV) was more suitable. For the remaining 4 IAPS, no difference was observed. We then analyzed the statistical significance of these differences, through the use of the Wald test. This test indicated that for nearly all the IAPS, excluding Caesalpinia decapetala and Senna didymobotrya, there are no significant differences in the predictions given by the two models (p value > 0.05).

For the rest of the analysis, we decided to use the ENV model, by excluding the human-related variables, which were not found to be instrumental in improving the accuracy of the model.

Potential spread

The average decrease in the Gini impurity measure showed the species for which the RF model provided a statistically significant prediction. Out of the 27 IAPS considered, only 14 had significant results (p value < 0.05); the remaining IAPS were those with the fewest QCs currently invaded (generally fewer than 50; Table 4). The latter IAPS were consequently excluded from further analysis. The analyzed species behaved differently: some had small current and predicted invaded areas, e.g., Cereus jamacaru and Acacia cyclops, while others had large invaded areas, such as Acacia spp. (A. baileyana, A. dealbata and A. mearnsii) and Eucalyptus spp. (E. camaldulensis and E. conferruminata). Some species have a low relative occupancy and a great potential invasion increase, such as Lantana camara and Melia azedarach, while others already occupy most of their potential range (Table 5). We assessed whether this was related to differences in residence time. However, the association between relative occupancy and minimum residence time was weak, and no clear relationship was identified.

Table 4 The number of QCs currently invaded is reported for each species
Table 5 For each IAPS, we reported the number of QCs currently invaded and potentially invaded (in numbers and  %, calculated as a proportion of the total 1840 QCs), the relative occupancy (RO), being the proportion of potential distribution range currently occupied by each species, the percentage of invasion increase, and the residence time (RT)
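The two proportions defined in the caption of Table 5 can be computed as below. The total of 1840 QCs comes from the caption; the function names and the example counts are hypothetical and do not correspond to any taxon in the table.

```python
TOTAL_QCS = 1840  # total number of QCs, as stated in the Table 5 caption


def relative_occupancy(current_qcs: int, potential_qcs: int) -> float:
    """Proportion of the potential range (in QCs) that is currently occupied."""
    if potential_qcs <= 0:
        raise ValueError("potential range must contain at least one QC")
    return current_qcs / potential_qcs


def percent_of_total(n_qcs: int, total_qcs: int = TOTAL_QCS) -> float:
    """Number of QCs expressed as a percentage of all QCs."""
    return n_qcs / total_qcs * 100.0


# Hypothetical taxon: 115 QCs currently invaded, 230 QCs potentially invaded.
ro = relative_occupancy(115, 230)       # 0.5
potential_pct = percent_of_total(230)   # 12.5
```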

The biomes differed markedly in their invasibility. According to the model predictions, no species seemed able to invade the Desert, only two IAPS (Agave spp., Opuntia spp.) could potentially invade the Nama-Karoo, and four IAPS (Acacia cyclops, Eucalyptus spp., Pinus spp. and Acacia spp.) could potentially invade the Succulent Karoo. The other biomes are more prone to invasion, with the Savanna and Grasslands suitable for almost all the IAPS considered (Table 6).

Table 6 For each biome, we reported the number of QCs within the biome, the number of QCs currently and potentially invaded, the percentage area suitable for invasion, the invasion increase, the percentage of new QCs invadable compared to the total, and the number of IAPS for which the biome is suitable for invasion

The areas most affected by the potential spread of the IAPS were found to be those adjacent to the major urban areas, coastal settlements, and the northwestern plains. With the exception of the Northern Cape Province, all South African provinces were found to have a percentage of potentially invaded QCs higher than 50%. The eastern coastal and central-eastern provinces (KwaZulu-Natal, Mpumalanga, and Gauteng) were those that had the highest percentage of potentially invaded area. Gauteng and KwaZulu-Natal, according to the predictive model, would be completely suitable for the spread of the IAPS (Table 7, Fig. 2).

Table 7 For each province we reported the number of QCs within the province, the number of QCs currently and potentially invaded, the percentage area suitable for invasion, the invasion increase, the percentage of new QCs invadable compared to the total, and the number of IAPS for which the province is suitable for invasion
Fig. 2 Maps of South Africa showing the currently invaded QCs (left) and potentially invaded QCs (right) with the related number of IAPS currently/potentially found in each QC. Provinces: WC Western Cape, EC Eastern Cape, NC Northern Cape, NW North West, FS Free State, KZN KwaZulu-Natal, GP Gauteng, MP Mpumalanga, LI Limpopo. (Color figure online)

Determinants of invasion

The two-part model highlighted both positive and negative correlations between the explanatory variables and the species distribution. First, the influence of the explanatory variables on the presence of the IAPS was highlighted with the logistic part (Table 8 in Appendix), and second, the influence of the explanatory variables on the abundance of the IAPS was highlighted with the regression part (Table 9 in Appendix).

Table 8 Log odds or estimate values of the Logistic part of the two-part model
Table 9 Coefficient or estimate values of the regression part of the two-part model

The presence of IAPS was found to be positively correlated with the mean precipitation of autumn season, mean annual temperature, mean growing season duration, density of natural areas, mean soil water content, and most of the soil variables. IAPS presence was found to be negatively correlated with the mean annual precipitation, mean precipitation of winter and summer season, mean maximum temperature of hottest month, mean minimum temperature of coldest month, and mean duration of frost period.

IAPS abundance was positively correlated with the mean precipitation of spring and autumn season, mean moisture growing season duration, density of rivers, and two soil variables (index of rocky areas and soils with a plinthic horizon). IAPS abundance was found to be negatively correlated with the mean precipitation of summer season, mean annual temperature, mean duration of frost period, density of water bodies, and two soil variables (red-yellow well-drained, massive or weakly structured soils and well-structured soils, generally with a high clay content).

Importance of variables

The RF model provided a measure of the importance of the explanatory variables influencing the distribution of the IAPS. These importance values were species-specific, but we identified similarities when comparing the 14 IAPS. Climatic variables have an overwhelming importance for the distribution of IAPS, particularly annual and summer precipitation. Land cover, pedological, and anthropogenic variables are all of minor importance, the only exceptions being pH for Lantana camara, Agave spp. and Opuntia spp., and density of water bodies for Acacia cyclops (Table 10 in Appendix).

Table 10 Classification of explanatory variables importance predicted with the Random Forest (RF) model

Discussion

Modeling approach

We have modeled the distribution of the major South African IAPS, on the assumption that these species are at equilibrium with their environment (Rouget et al. 2004; Guisan and Thuiller 2005). Although most of our IAPS have a long history in the South African region (Table 5), it is likely that they have not yet reached equilibrium due to dispersal limitations (Rouget et al. 2004; Pearson 2010).

Five major factors could affect the accuracy and the predictive capacity of our models. First, SDMs are based on the assumption that the current distribution of IAPS and the environmental characteristics of their current range provide a good indication of their potential range. However, the predicted potential range could have been overestimated for IAPS occurring in few scattered locations, or underestimated for IAPS currently occurring in a small range (Rouget et al. 2004). Second, spatial bias in the NIAPS database, related to the two overlapping layers (landscape and riparian), may have led to underestimation, in the riparian areas, of the current and potential distribution of IAPS that are underrepresented in the landscape layer. Third, the NIAPS may have underestimated or overestimated the current distribution of those species that are not very conspicuous or are underrepresented for other reasons (e.g., taxonomic identification difficulties). Fourth, averaging the values of the explanatory variables for each QC is based on the assumption that the mean values represent the point of occurrence of the IAPS. The likelihood of this assumption being erroneous depends on the level of variability of the explanatory variables in the QCs, and will be higher in topographically complex areas (Le Maitre et al. 2004). Fifth, the modeled invasion could have been underestimated because, based on the Gini impurity measure, we excluded from our SDM the 13 response variables that did not yield statistically significant results.

However, this study is intended to provide guidance for decision-makers, and none of the factors mentioned above significantly affect the overall accuracy or usefulness of our results.

Invasion determinants

Our study showed that adding the human-related variables to the environmental set of variables did not result in statistically significant improvements in the accuracy of the SDMs for 25 out of the 27 IAPS (Table 3). Improvements were obtained only for Caesalpinia decapetala and Senna didymobotrya, which are located in highly urbanized and degraded areas. The lack of a significant difference between the two models could be related to the extent of the study area and the rather coarse scale of the study. This scale could mask the effect of some human-related variables that require higher resolution to be influential.

Moreover, for only 14 out of the 27 IAPS considered were the variables statistically significant in explaining the potential distributions (Table 4, Fig. 1). Consequently, we excluded the other 13 IAPS from subsequent analysis; these were the species with the fewest QCs currently invaded (generally fewer than 50).

We observed some general trends when analyzing the results obtained from the two-part model and the RF model. The importance values obtained from the RF model showed that at this scale of analysis climatic variables are overall more important than land use, soil and anthropogenic variables in predicting the potential distribution of the IAPS, in line with Ohlemuller et al. (2006)’s findings (Table 10 in Appendix).

The two-part model analysis confirmed the importance of the climatic variables, as they proved to be influential for many of the IAPS. The logistic part of the model also highlighted some influence of the soil variables (Fig. 2, Table 8 in Appendix). We observed general trends of both positive and negative correlations. The fact that the growing period and the mean autumn precipitation had an influence on the presence of plant species was in line with Rouget et al. (2004) and Coudun et al. (2006). We analyzed the maps of the soil types and noticed that the distribution of some soil variables matched the potential distribution patterns of the IAPS. For example, the plinthic horizon soils are usually associated with humid and subhumid warm climates with a distinct dry season (Fey 2010); such climatic conditions are characteristic of the Grasslands (Rutherford and Westfall 1986), where Eucalyptus spp., Populus spp. and Salix babylonica are major invaders (Henderson 1999; Van Wilgen 2009). The logistic part outputs also showed a negative correlation between the presence of most of the IAPS and the following climatic variables (Table 8 in Appendix): mean maximum temperature of hottest month, mean minimum temperature of coldest month (as in Rouget et al. 2002, 2004), frost period, and winter and summer precipitation (as in Randin et al. 2009).

In the regression part of the model (Table 9 in Appendix), which analyzed the influence of the environmental variables on species abundance, the fact that the growing days and the frost period had an influence on the abundance of IAPS was in line with Rouget et al. (2004), whereas the influence of the summer precipitation was in line with Pearman et al. (2008) and Randin et al. (2009). The influence of the mean annual temperature on IAPS was already demonstrated in many studies (e.g., Leathwick 1998; Pearman et al. 2008; Rickebusch et al. 2008; Randin et al. 2009).

Aside from the climatic variables, this study seems to confirm that the invasion process is highly species-specific as well as spatially and temporally specific (Mack 1996; Wilson et al. 2007; Theoharides and Dukes 2007). As a result, the spread of the IAPS can be explained only through a combination of environmental factors, species characteristics and anthropic use for each species. Nonetheless, we are confident that by including land cover and soil variables in our models, the predictions are more accurate than by only including climatic variables (Ibáñez et al. 2009; Gassó et al. 2012). Moreover, progress in using SDMs will only be made by increasing the understanding of the ecological drivers of IAPS’ distributions, and the extent of their relationship with determinant variables (Hijmans and Graham 2006). Failure to incorporate an influential predictor decreases model performance and outcome relevance (Austin and Van Niel 2011). Hence great attention should be given to the relative weight, causality, and estimation of each SDM predictor (Araújo and Guisan 2006).

Potential distribution

Once we had selected the variables and the species for which the model provided statistically significant predictions, we analyzed the differences between the current and potential distributions, in order to assess the overall invasion risk for each species and the invasion susceptibility of the QCs. The potentially invaded QCs predicted by the model for the 14 IAPS considered amounted to 1299 of the 1840 QCs, i.e., 568,089.8 km² (46.5% of South Africa). Compared to the 1018 QCs currently invaded by our IAPS, the model predicted that 181 new QCs have the potential to be invaded, corresponding to 120,972.0 km², which would represent a 21.3% increase of the current invasion (Fig. 1). These new QCs are in the proximity of the currently invaded QCs: along the western and eastern coasts, and across the northern plains. The QCs that appeared to be suitable for a higher number of IAPS were located along the eastern coast of KwaZulu-Natal and in the plains, grasslands and savanna of Limpopo, Gauteng, and Mpumalanga (Fig. 2).

The invasion process proved highly dependent on biome type. The results showed that the Desert had not been invaded and was not predicted to be invaded; in the Nama-Karoo and the Succulent Karoo, only 2.7 and 13.5% of the QCs, respectively, could potentially be invaded (Table 6). Conversely, the other biomes (Albany Thicket, Fynbos, Grasslands, Savanna, and the Indian Ocean Coastal Belt) were found to be suitable for invasion by IAPS, with over 80% of their area suitable. This result may be explained by the higher levels of human disturbance, propagule pressure and nutrients that differentiate these biomes from those of the arid and hyperarid zones.

From a catchment and water stress perspective, the suitability of the Grasslands to invasion, particularly by a high number of woody IAPS including riparian Salix babylonica and Populus spp., is of great concern (Table 7). In fact, in this area, catchments have generally high water yields and these invaders can significantly reduce catchment runoff and cause great water scarcity (Le Maitre et al. 2000; Rouget et al. 2004).

Richardson and Rejmánek (2011) have shown that the Savanna is one of the biomes that most recently experienced a high increase in plant invasion due to propagule pressure from other savannas outside South Africa. Earlier literature indicated that plant invasion was not considered a major issue in South African savanna (Parsons 1972); however, subsequent studies highlighted the increased invasion in humid and riparian areas of this biome (Henderson and Wells 1986; Turpie 2004). The major invaders in the Savanna biome were found to be Cereus jamacaru, Melia azedarach and species of Eucalyptus, Acacia, and Agave.

The model also showed the suitability of South African provinces to invasion. The Northern Cape was the only province with a very low suitability for IAPS: 5.4% of its area could potentially be invaded (Table 7). In fact, the Northern Cape is mostly occupied by biomes with a very low suitability: Desert, Succulent Karoo and Nama-Karoo. The lack of potential IAPS in the mountain areas of the Western Cape is mainly due to the fact that the Fynbos is invaded by alien species that are fire dependent and well specialized to survive on nutrient-poor soils (Richardson et al. 1997), and that were excluded from the modeling because of the low number of sample points in the NIAPS. The other provinces were found to be highly suitable for IAPS invasion: from 50% (Free State province) to 100% (Gauteng province) of the total province area could potentially be suitable (Table 7, Fig. 2).

KwaZulu-Natal was suitable for all the 14 IAPS considered, except Acacia cyclops. Given the low invasion increase value (17.9%), it seems that future invasion in this province will be related to an increase in the abundance of the IAPS currently present, rather than to an increase in the number of invaded QCs. In this case, control actions should focus on eradicating IAPS from the most affected areas rather than on preventing further spread into other QCs. Gauteng province emerged as the most threatened by our IAPS, in line with the fact that two of the most suitable biomes, namely Grasslands and Savanna, are found within it, and that it is characterized by large urban areas, such as Johannesburg. Its invasion increase value is also very low (6.7%), since the current IAPS distribution is already extensive (Table 7).

Table 5 shows the differences in invasion increase percentages among the IAPS considered. Some species had a rather limited current and potential invasion area, e.g., Acacia cyclops and Senna didymobotrya. These species tend to occupy specific and often very limited biomes, such as the Fynbos in the case of Acacia cyclops, and the Indian Ocean coastline in the case of Senna didymobotrya. Other species are characterized by a wider potential distribution: species of the genera Acacia, Eucalyptus, and Pinus are potentially able to invade several biomes, notably Savanna and Grasslands. Moreover, these genera (Eucalyptus, Pinus, and Acacia) show the highest relative occupancy, with more than 80% of the suitable area occupied, and thus appear to have nearly reached the full extent of their potential distribution range. Other species seem to be still in full expansion, e.g., Lantana camara has an invasion increase of 432.2%, and Melia azedarach of 229.6%. This difference in invasion dynamics is due to the following factors: climate, characteristics of the species, characteristics of the invaded biome, and anthropic plant use. In South Africa, the largest biomes are Savanna, Grasslands, and the Karoos; therefore, the species more suited to invading these biomes will have a greater chance of invading a broader area than species adapted to other biomes (Wilson et al. 2007). Moreover, Maestre and Cortina (2004) showed that arid environments are likely to be less competitive than humid environments and, being more open and exposed, more susceptible to invasion.
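As a rough illustration of how the two indicators discussed above can be derived from a current and a potential distribution, the following Python sketch computes a relative occupancy and an invasion increase for a single species. The definitions used here (relative occupancy as the occupied share of the suitable area, invasion increase as the additional suitable area relative to the occupied area) are assumptions made for illustration and may differ from those underlying Table 5; the function names and the input values are hypothetical.

```python
def relative_occupancy(current_area, potential_area):
    """Percentage of the modeled suitable area that is already occupied.

    Assumed definition: 100 * current / potential.
    """
    if potential_area <= 0:
        raise ValueError("potential_area must be positive")
    return 100.0 * current_area / potential_area


def invasion_increase(current_area, potential_area):
    """Percentage growth from the current to the potential distribution.

    Assumed definition: 100 * (potential - current) / current.
    """
    if current_area <= 0:
        raise ValueError("current_area must be positive")
    return 100.0 * (potential_area - current_area) / current_area


# Hypothetical species: 20 QCs currently invaded, 100 QCs modeled as suitable.
ro = relative_occupancy(20, 100)
increase = invasion_increase(20, 100)
print(f"relative occupancy: {ro:.1f}%, invasion increase: {increase:.1f}%")
```

Under these assumed definitions the two indicators carry the same information (a low relative occupancy implies a high invasion increase), so values computed in different units, such as QC counts versus area, should not be mixed.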

Residence time is often identified as having a positive correlation with the extent of occurrence of IAPS, i.e., species introduced earlier occupy, on average, a higher proportion of their potential distribution range (Rejmánek 2000; Castro et al. 2005; Gassó et al. 2012). However, in South Africa, Thuiller et al. (2006) showed that minimum residence time did not explain the distribution patterns of invaders, even after removing the confounding effect of the environment, and that minimum residence time is of limited value when considering distribution patterns at regional scale after a century of residence. Similarly, in this study, no significant relationship was observed, as species introduced earlier (e.g., Salix babylonica, RO = 52.4%, RT = 337 years) showed a relative occupancy that was comparable to that of other species introduced more recently (e.g., Chromolaena odorata, RO = 49.6%, RT = 158 years). Some species introduced a long time ago still had a restricted distribution with respect to their modeled potential distribution range. Species introduced more recently (Eucalyptus spp.) had a much higher relative occupancy (RO = 74.1%) than species introduced only a few years earlier (Lantana camara, RO = 13%). The link between relative occupancy and minimum residence time could be weak for species introduced many centuries ago (Salix babylonica, Pinus pinaster and Opuntia ficus-indica) because of the high uncertainty related to old first records (Gassó et al. 2012). Furthermore, as mentioned, SDMs are based on the assumption that organisms are at equilibrium with their environment; however, this might not be the case for recently introduced species (Opuntia stricta and Cereus jamacaru, introduced less than a century ago), which could undermine predictions (Gassó et al. 2012; Thuiller et al. 2006).
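One simple way to examine the relationship between residence time and relative occupancy discussed above is a rank correlation. The sketch below is a minimal, standard-library Python implementation of Spearman's coefficient; it is offered as an assumption about how such a check could be run, not as the test used in this study, and the residence times and occupancy values in the example are hypothetical placeholders rather than data from Table 5.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder values (years of residence, relative occupancy in %).
residence_time = [50, 120, 160, 340]
occupancy = [60.0, 15.0, 50.0, 52.0]
rho = spearman(residence_time, occupancy)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient close to zero would be in line with the absence of a clear relationship, although with only 14 species any such estimate carries wide uncertainty.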

Conclusions

IAPS pose a significant threat to biodiversity and ecosystem functioning, and billions of dollars are spent to control them (de Lange and van Wilgen 2010). The most cost-effective approach is prevention (McConnachie et al. 2012). Hence, the success of biological invasion management depends on the ability to predict the potential distribution of IAPS and identify the invasion determinants (Sharma et al. 2005). Van Wilgen et al. (2012) suggested that South African plant control programs should prioritize both the species and the areas: IAPS control operations begin by identifying the priority species and catchments, and then consulting with the decision-makers and key stakeholders (Balmford 2003). SDMs are powerful tools that can be used to plan IAPS management programs through: (i) the classification of invaded areas for differentiated management actions, (ii) the support of control initiatives for preventing IAPS’ spread, (iii) the reallocation of funds away from controlling IAPS in areas where suitability is expected to decrease in the future, and (iv) the identification of opportunities for relatively inexpensive invasion prevention (Kriticos et al. 2011).

In this paper, we modeled the distribution of IAPS using as sampling units the quaternary catchments, which are also the basic hydrological unit for water management. This approach provides both scientific insights into IAPS spread processes and some indications for decision-makers planning control operations, for which efficiency and prioritization are fundamental (Rouget et al. 2004). At the national scale, we confirmed the prevailing importance of climatic variables with respect to land cover and anthropogenic factors. Moreover, we found that, overall, the newly potentially invaded QCs are far fewer than the currently invaded ones; therefore, control operations should focus on managing the density of priority IAPS within their current range (see Van Wilgen et al. 2007), rather than on preventing the expansion of the distribution (Rouget et al. 2004). This type of approach should necessarily be accompanied by finer-scale analysis to identify local components of the problem, so that control operations can be structured as efficiently as possible.