Introduction

In recent years, the number of studies applying species distribution models (SDMs) has increased remarkably (Guisan and Thuiller 2005; Guisan and Zimmermann 2000; Jeschke and Strayer 2008). SDMs are used for various applications, such as assessments of possible climate change impacts (e.g. Araújo et al. 2004, 2006; Rödder and Schulte 2010), potential establishment of invasive species (e.g. Bomford et al. 2009; Peterson and Vieglais 2001; Rödder and Lötters 2010; Rödder et al. 2008a, b), reserve selection (e.g. Araújo et al. 2004), spatial epidemiology (e.g. Peterson 2007; Rödder et al. 2008a, b, 2009a) or species delimitation related to taxonomy or historical biogeography (Brown and Twomey 2009; Raxworthy et al. 2007; Lötters et al. 2010; Rödder et al. 2010). Generally, SDMs try to characterize the niche of a species and project it into geographic space, including regions from which the species is unknown. This can be achieved by applying mechanistic models based on detailed information on the study species' ecology and physiology (e.g. Kearney and Porter 2004; Kearney et al. 2008). Alternatively, correlative models can be developed using georeferenced species records and environmental information stored in grid-based geographic information system (GIS) layers (e.g. Heikkinen et al. 2006; Jeschke and Strayer 2008). The latter characterize the target species' niche by comparing environmental conditions at presence records with conditions observed at localities where the species is absent. If reliable absence records are unavailable, as is the case for the majority of species (Gu and Swihart 2004), pseudoabsence data or background data reflecting the available climate space can be used (Jeschke and Strayer 2008; Phillips et al. 2006).
The general idea of these correlative models is the identification of statistically significant relationships between a species' presence at a given locality and features of the environment, which are subsequently used to assign probability values or an index of relative habitat suitability to all grid cells covering the study area (e.g. Guisan and Thuiller 2005; Guisan and Zimmermann 2000; Jeschke and Strayer 2008).

When interpreting SDMs, it is important to distinguish between a species' fundamental and its realized niche (e.g. Godsoe 2010). As defined by Hutchinson (1957, 1978) and later extended by Soberón and Peterson (2005), a species' fundamental niche is equivalent to the complete set of environmental conditions under which population growth rate is ≥1. A species' realized niche is the most that can be captured when using species records in SDMs. It is commonly only a subset of the fundamental niche, determined by geographic dispersal barriers, intrinsic dispersal limitations of the species and biotic interactions such as competition and predation (Godsoe 2010; Soberón 2007; Soberón and Peterson 2005).

Although many advances have been made during the last two decades, the theoretical background and the validity of the underlying biological assumptions of SDM applications are still not well understood (Heikkinen et al. 2006; Jiménez-Valverde et al. 2008). Crucial assumptions in SDMs are that the range of the study species is in equilibrium with environmental variables (Araújo and Pearson 2005) and that its fundamental niche is ‘conservative’ in space and time (Pearman et al. 2008; Wiens and Graham 2005). However, to date only a limited number of studies have assessed the validity of these assumptions (e.g. Araújo and Pearson 2005; Broennimann and Guisan 2008; Broennimann et al. 2007; Graham et al. 2004; Pearman et al. 2008; Peterson et al. 1999; Rödder and Lötters 2009; Warren et al. 2008). It has been shown that, although the predictive ability of SDMs can be robust within the area for which the models have been fitted, their transferability into novel areas, or even to the same area at a different time, may be problematic (e.g. Araújo and Rahbek 2006; Beaumont et al. 2009; Fitzpatrick et al. 2007; Randin et al. 2006). Possible reasons for limited model transferability include, among others, problems caused by non-analogous climate (Fitzpatrick and Hargrove 2009) or the use of indirect predictors that are correlated with the physiologically meaningful predictors, in combination with differences in the correlation matrices of predictors among areas (Heikkinen et al. 2006). Apart from this, differences in species composition between areas, which affect biotic interactions or evolutionary processes and thereby alter niche properties (e.g. Peterson et al. 2007; Randin et al. 2006; Vanreusel et al. 2007 and references therein), may affect model transferability.

Invasive species, as unanticipated experiments, are well suited to provide valuable insights for ecology and evolutionary biology, including evolutionary change in niche properties (Holt et al. 2005; Kozak et al. 2008; Sax et al. 2008). It is well known that successful establishment of a non-indigenous species in an existing species community depends on factors such as species richness, competition, predation, food availability and human footprint (Ehrlich 1989; Rödder 2009; Williamson 1996). Perhaps more central in this context, the climatic similarity of the invaded area to the source ecosystem is often, though not always (see below), a useful predictor (Bomford et al. 2009; Peterson and Vieglais 2001). As a consequence, invasive species allow insight into, and characterization of, the interaction between available climate space within a species' range, its realized distribution, its potential distribution and the overall structure of its climatic niche (Rödder and Lötters 2009). However, it has to be noted that the realized distributions of invasive species may be in disequilibrium with climate, again complicating SDMs.

Although an important assumption when projecting SDMs through space and time is niche conservatism (e.g. Graham et al. 2004; Jeschke and Strayer 2008; Peterson et al. 1999), recent studies have noticed a mismatch between realized climate niches in species' native and invasive ranges. It has been suggested that niche shifts may have taken place during biological invasion (e.g. Broennimann et al. 2007). Such a shift may concern only the realized niche or even both the realized and fundamental niches (e.g. Beaumont et al. 2009; Broennimann and Guisan 2008; Broennimann et al. 2007; Fitzpatrick et al. 2007). The possibility of niche shift over a period of relatively short time would seriously violate basic assumptions behind SDMs and weaken their reliability.

Besides a possible evolutionary response (e.g. Broennimann et al. 2007), Peterson and Nakazawa (2008), Rödder and Lötters (2009) and Rödder et al. (2009a, b) have demonstrated that the choice of environmental data sets used for model building strongly affects the SDM output. For instance, climatic variables may vary strongly among geographic regions. Some of them may actually be biologically ‘meaningful’, affecting the fitness of the species under study and restricting its geographic range. Others may have a minor impact in a particular area, because other variables are more limiting (see Fig. 1; termed ‘relaxed’, following Rödder and Lötters 2009). It is likely that different parts of the environmental (or climatic) niche space suitable to a species are captured by different variable sets (Fitzpatrick et al. 2008; Godsoe 2010; Rödder and Lötters 2009). Hence, it is reasonable to assume different predictive abilities of models if a species occupies a different climatic space in its native and invasive ranges (Fitzpatrick et al. 2008); this may lead to truncated response curves in SDMs. This might be especially true if the available gradients of particular variables vary between a species’ native and invasive ranges (e.g. due to landscape heterogeneity). As a consequence, different sets of variables may be limiting in different regions, and response curves in SDMs may hence represent only parts of the environmental tolerance of the species. Further understanding can be expected from directly comparing the available gradients of particular variables within a species' native and invasive ranges, i.e. from more information on the species' physiological limits, representing its fundamental niche breadth, and on the conditions available in its range. Figure 1a illustrates such possible dependencies for three variables in three distinct areas of occurrence (i.e. native and invasive ranges 1 and 2), neglecting possible interplays of variables for simplicity. The available gradient of a particular variable can either cover only parts of the species' fundamental niche breadth or exceed it (e.g. variables 1 and 3 of the native range in Fig. 1a). As a result, a particular variable may or may not be directly limiting, depending on its available gradient within the species' general area of occurrence in relation to the breadth of its fundamental niche. Furthermore, differences in the composition of local species communities may result in differences in biotic interactions, which in turn may lead to varying habitat occupancy.

Fig. 1

a Theoretical differences in the available gradients of three niche variables (en-dashed boxes), for each of which the fundamental niche breadth (solid light grey boxes) is assumed to be constant, in native versus invasive ranges of a species, neglecting interplays of variables for simplicity. The fundamental niche breadth of multiple variables is delimited by their minimum overlap, as indicated by the solid arrows. The potential niche breadth, illustrated as em-dashed boxes, can be defined either through the available gradient of variables within a species' geographic range (native, invasive 1), as indicated by em-dashed arrows, or through the species' fundamental niche breadth of a single variable within a species' geographic range (invasive 2). b In an SDM, ideally, the potential niche breadth is identified via variable information obtained at species records. When using species absence or pseudoabsence data (i.e. random points), the explanatory power of variables used for SDM development depends strongly on the overlap between the available gradient of variables within the geographic range and the potential niche breadth (as defined in a). For the explanatory power of variables in SDMs, it matters whether random points lie only within the potential niche breadth or exceed it. The former case provides little or no contrast and hence is less informative (e.g. variable 3 in invasive range 1), while the latter is more informative (e.g. variable 2 in native range). Note that variable interactions are neglected herein; for a discussion of possible effects of interactions between variables affecting response curves see Godsoe (2010)

Considering multiple variables, the distribution of a species is limited by its tolerance ranges for all local environmental factors. The variable that first exceeds this tolerance is actually restricting the range according to Shelford's (1931) ‘law of tolerance’. This variable can also define the species’ potential niche (Jackson and Overpeck 2000) within a particular area of occurrence (e.g. variables 1 and 3 of the native range in Fig. 1a). Note that herein the ‘potential niche’ refers to the niche space potentially suitable to a species, while the ‘realized niche’ refers to the niche space actually occupied by it. Due to differences in the available gradients of variables within a species’ native and invasive ranges, the variable(s) actually limiting the range may differ between regions. In Fig. 1, the realized niche breadth in the native range is defined by variables 1 and 3, but variables 1 and 2 actually limit invasive range 1. Additionally, if the fundamental niche breadth of the species regarding one variable is smaller than its niche breadth (regarding all other variables and the available gradient of this variable), this is not considered as limiting within a species’ range. The fundamental niche breadth itself may limit its geographic distribution (see Fig. 1a, invasive range 2). Note that the potential niche breadth is defined as the complete set of conditions suitable to the species, which, in contrast to its realized niche, may not necessarily be entirely covered by the species' realized distribution, e.g. due to biotic interactions or dispersal limitations. Furthermore, in reality the dependencies may be more complex with regard to possible interdependency and interaction among variables (Godsoe 2010).

These theoretical considerations have strong implications for the transferability of SDMs in space and time. The available gradient of environmental variables within a species' native or invasive range may markedly influence the explanatory power of single variables. If this translates into different weightings of variables in models trained with native (or invasive) records, models may become misleading when projected into areas where the available variable gradients exceed the calibration range. In this paper, we demonstrate this by assessing differences in realized climatic niches in the native and invaded ranges of the Caribbean Greenhouse frog (Eleutherodactylidae: Eleutherodactylus planirostris) in terms of climate variables commonly applied in SDMs. We analyze which variables are more ‘conserved’ in comparison to ‘relaxed’ variables (i.e. subject to niche shift) and whether the observed patterns can be traced back to different available variable gradients or active habitat choice. The results are discussed in the theoretical framework presented above.

Methods

Study species

The Neotropical frog genus Eleutherodactylus includes some of the most successful invasive vertebrate species (Bomford et al. 2009). Their success in the colonization of new habitats can cause major environmental and socio-economic problems (e.g. as shown for Eleutherodactylus coqui; Beard and Pitt 2005). The Greenhouse frog, E. planirostris, is a moderate-sized (maximum female snout-vent length 36 mm), tan to brown species (Schwartz and Henderson 1991). Native to Cuba and the Bahamas, it is found in various habitats including both mesic and xeric broadleaf forest as well as gardens (Schwartz and Henderson 1991).

Invasive populations of E. planirostris are known from Grenada, Guam, Jamaica, Mexico (Veracruz), Turks and Caicos Islands and US mainland (Alabama, Florida, Georgia, Louisiana, Mississippi) as well as from the Hawaiian Islands. Climatically, the native and invasive distributions of the Greenhouse frog can be considerably different (Jensen 2008; Kraus 2008; Lever 2003). First reports of Greenhouse frogs on the US mainland (Florida) were documented between 1863 and 1875 (Dundee and Rossman 1989; Kraus 2008). Today, E. planirostris is one of the most common frogs on Key West (Wilson and Porras 1983) and has spread northward as far as Lowndes and Thomas counties in southern Georgia (Jensen 2008). Several invasive populations occur in the area between Florida and New Orleans. Here, the Greenhouse frog's distribution is in equilibrium with climate, as inferred from the much shorter reproductive periods in the more northern invasive range (Goin 1947; Dundee and Rossman 1989). Generally, reproduction takes place between May and September in Florida (Goin 1947), but is restricted to mid-June to July in New Orleans and elsewhere in Louisiana (Dundee and Rossman 1989). Females lay three to 26 eggs in moist soil, from which, after direct development (i.e. without a free-swimming larval stage), froglets hatch after 13–20 days and reach maturity within 1 year (Goin 1947; Schwartz and Henderson 1991). Egg development appears to be most successful at 100% substrate humidity (Lazell 1989), implying a strong dependency on precipitation or waterlogged soil.

The impact of invasive E. planirostris populations on native fauna is still poorly studied. While populations in Florida are expected to have little or no impact on native species (Nonindigenous Aquatic Species (NAS), the data base of the US Geological Survey (USGS); http://nas.er.usgs.gov/queries/CollectionInfo.asp?SpeciesID=61), Kraus et al. (1999) and Kraus and Campbell (2002) suggested that invasive populations in Hawaii may be a serious threat to native arthropods. Since the Greenhouse frog was introduced into Florida more than 130 years ago, impacts may be obfuscated by their long presence in this state. Hawaiian populations may have established in the early 1990s only (Kraus and Campbell 2002).

Pough et al. (1977) showed that climatic conditions are directly correlated with activity patterns and habitat choice in E. planirostris, wherein temperature and moisture conditions of its preferred habitats are closely related to its physiological properties. According to laboratory experiments conducted by these authors, the preferred temperature of E. planirostris is 27.3 ± 0.66°C with its critical maximum temperature ranging 36.4–41.8°C (acclimated to 20°C: mean = 38.7 ± 0.38°C, range = 36.4–40.0°C; acclimated to 30°C: mean = 40.5 ± 0.35°C, range = 39.0–41.8°C) (Pough et al. 1977). Although laboratory experiments describing the frog's thermal tolerance regarding lower limits are lacking, Zippel et al. (2005) reported that Greenhouse frogs may even survive in a bag of cypress mulch stored under freezing conditions for at least 32 days.

Species records

We use records of E. planirostris obtained from collections linked to the Global Biodiversity Information Facility (GBIF) (2008), the HerpNet (2008) database, the NAS database of USGS (see above) and the Herpetological Database of the Florida Museum of Natural History (http://www.flmnh.ufl.edu); in addition, distribution data were adopted from published references (Lever 2003; Kraus 2008; Schwartz and Henderson 1991). Only two other invasive Eleutherodactylus species are known from Florida and Hawaii (E. coqui and Eleutherodactylus portoricensis; NAS database of USGS), which are easily distinguished from E. planirostris. For model computation, only records from within areas with confirmed reproduction listed in the NAS database are considered. Georeferencing was conducted when necessary with the BioGeoMancer (http://bg.berkeley.edu/latest/). Records are only included if they can be unambiguously assigned to a single grid cell leaving a total number of 188 (67 native, 121 invasive in mainland USA, 35 invasive in Hawaii; Fig. 2). We used DIVA-GIS 5.4 (Hijmans et al. 2002; http://www.diva-gis.org) to test the accuracy of coordinates (Check Coordinates tool) by comparing information provided with the species records to locality information extracted from an administrative boundaries database at the smallest possible level (state/country/city). This information should be the same and any mismatches probably reflect errors. In the cases of such incongruence, data were corrected.

Fig. 2

Eleutherodactylus planirostris records used in this study. Native records in Cuba and on the Bahamas are indicated as black dots and invasive records on mainland USA (a) and Hawaii (b) as triangles

Climate data

Information on current climate was obtained from the Worldclim database (version 1.4), which is based on weather conditions recorded between 1950 and 2000 with a grid cell resolution of 30 arc sec (corresponding to ca. 1 × 1 km within the study areas) (Hijmans et al. 2005; http://www.worldclim.org). Raw climate data include monthly mean variables of minimum and maximum temperature and precipitation. Based on these data, 19 so-called bioclimatic variables can be calculated with DIVA-GIS 5.4 (Table 1). Bioclimatic variables have proven useful for many large-scale SDM approaches (Busby 1991; Nix 1986). With the goal of assessing possible intercorrelations of variables and comparing conditions at native and invasive Greenhouse frog records with the conditions available, we extracted all 19 bioclimatic variables at each record and computed pair-wise Pearson's correlation coefficients. For subsequent analyses, we used only ten variables with R² < 0.75 that are thought to be biologically relevant for the Greenhouse frog based on the information given above (i.e. ‘annual mean temperature’, ‘mean monthly temperature range’, ‘maximum temperature of the warmest month’, ‘mean temperature of the wettest quarter’, ‘mean temperature of the warmest quarter’, ‘mean temperature of the driest quarter’, ‘annual precipitation’, ‘precipitation of the wettest month’, ‘precipitation of the driest month’ and ‘precipitation of the warmest quarter’). Furthermore, we assessed bioclimatic conditions within the general area of occurrence defined by a minimum convex polygon (MCP) including all native (and likewise invasive) records. For illustrative purposes, we here plot them as boxplots using XLSTAT 2009 (Addinsoft).
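The pair-wise correlation screening described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study (which relied on DIVA-GIS and XLSTAT); the function names, the toy values and the variable labels are hypothetical, and only the threshold of R² = 0.75 is taken from the text.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_pairs(columns, r2_threshold=0.75):
    """List variable pairs whose squared correlation reaches the threshold.

    `columns` maps a variable name to its values extracted at the records.
    """
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r2 = pearson_r(columns[first], columns[second]) ** 2
            if r2 >= r2_threshold:
                flagged.append((first, second, r2))
    return flagged

# Hypothetical toy values at 50 records (not data from this study).
temp = [20 + 0.1 * i for i in range(50)]
toy = {
    "annual_mean_temp": temp,
    "warmest_quarter_temp": [t + 3 + 0.05 * (-1) ** i for i, t in enumerate(temp)],
    "annual_precip": [1000 + 200 * math.sin(i) for i in range(50)],
}
flagged = correlated_pairs(toy)
```

In such a screening, one variable of each flagged pair would be dropped; which one to keep is a biological judgement and is not decided by the code.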

Table 1 Relative variable contribution in three times 100 Maxent models trained either with native records (nat), invasive records on mainland USA (inv USA) or on Hawaii (inv HW) or both native and invasive records of Eleutherodactylus planirostris and a set of ten bioclimatic variables

Species distribution models

To develop SDMs, we use Maxent 3.3.2 (Phillips et al. 2006; http://www.cs.princeton.edu/~shapire/Maxent), a machine-learning algorithm. In different comparative tests, Maxent often performed better than other methods using presence records and random background data (e.g. Elith et al. 2006; Hernandez et al. 2006; Wisz et al. 2008). The Maxent algorithm computes a probability distribution covering the study area that satisfies a set of constraints. These constraints are derived from environmental conditions at species presence records and require that the expected value of an environmental variable or a function thereof must be within a confidence interval derived from its empirical mean. Maxent chooses a distribution with maximum entropy (Jaynes 1957) within all possible distributions, which satisfies these constraints (Phillips et al. 2006). Computations were conducted using the default values for all program settings. Background points were randomly chosen within those areas enclosed by MCPs comprising all native (and invasive) records reflecting areas accessible for the species, as recommended by various authors (Phillips 2008; Warren et al. 2008; Anderson and Raza 2010; Godsoe 2010). SDMs, to derive the species' potential distribution, are computed (1) for the frog's native and invasive ranges in mainland US and Hawaii, respectively, using all ten bioclimatic variables simultaneously in order to determine the relative contribution of each variable (see below) and (2) for each bioclimatic variable separately in order to assess realized niche overlap, realized niche equivalency and similarity per variable. In these approaches Maxent is operated via a Perl tool developed by Warren et al. (2008) (see below). However, due to computation limitations, the second approach is restricted to the Greenhouse frog's native and invasive ranges on mainland USA.

(1) Assessment of possible differences in the variable contribution in Maxent models derived from the set of ten bioclimatic variables, as expected according to our theoretical framework. When trained with multiple predictors, Maxent allows tracking of the relative contribution of each variable, permitting comparisons of models developed for a species’ native range, its invasive range and both native and invasive ranges combined. To this end, relative variable contributions are determined in each step of the training process, when the gain of the model is increased by modifying the coefficient for a single feature. Subsequently, Maxent assigns this increase to the variable(s) the feature depends on and converts the totals to percentages (Phillips et al. 2006; Phillips and Dudík 2008). To test the performance of models based on all ten bioclimatic variables, we trained 100 models each with 70% of the native records, the invasive records or both native and invasive records combined, together with random background points taken from MCPs enclosing the respective records. The remaining 30% of the records were used as test points, as suggested by various authors (Fielding and Bell 1997; Jeschke and Strayer 2008; Phillips et al. 2006). Using Spearman's rank correlation coefficients, we tested for the significance of variation in variable contribution. This was based on the average contribution per variable over all 100 models computed for the frog's native range, its invasive ranges on mainland USA and Hawaii and for all pooled records. Furthermore, similarities of model projections onto a grid comprising the entire known range of the Greenhouse frog (i.e. native and invasive ranges on mainland USA and Hawaii) were assessed using Schoener's D and a modified Hellinger distance (I), as proposed by Warren et al. (2008, see below).

(2) When trained with single variables, the relative contribution of each variable is 100%. Therefore, it is not possible to evaluate the explanatory power per variable using the same approach as above. However, Maxent allows for model testing by calculation of the area under the receiver operating characteristic curve (AUC) (Fielding and Bell 1997). AUC values represent the ability of the model to distinguish presence data from random background data (Phillips et al. 2006), wherein an AUC value of 0.5 suggests no better discrimination ability than expected by chance in presence/absence evaluation data sets and an AUC value of 1.0 suggests perfect discrimination (Swets 1988). However, maximum AUC values computed with presence/background or presence/pseudoabsence evaluation datasets may be <1 (Phillips et al. 2006). As a nonparametric ranking tool, AUC values can therefore be used as a quantitative description of how tightly the observed range of a specific variable mapped into geographic space reflects the actual species occurrences, and hence as a proxy for the explanatory power per variable. However, it needs to be noted that AUC values do not necessarily reflect the goodness of fit to the species' true response as a function of the environment (Lobo et al. 2008).
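The ranking interpretation of the AUC can be made concrete with a short Python sketch. It uses the standard rank-based formulation (the proportion of presence/background pairs in which the presence point receives the higher score, ties counting one half) and is an illustration only, not the routine implemented in Maxent or DIVA-GIS; the function name and the scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: chance that a presence score exceeds a background score."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0   # presence ranked above background
            elif p == b:
                wins += 0.5   # ties count one half
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical suitability scores, not results from this study.
perfect = auc([0.9, 0.8], [0.1, 0.2])   # every presence outranks every background point
mixed = auc([0.9, 0.4], [0.5, 0.1])     # three of four pairs are ranked correctly
```

Because background points may themselves fall in suitable cells, the value obtained with background data cannot be expected to reach 1 even for a well-fitting model, in line with the caveat given above.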

To identify possible variation among different algorithms, we computed additional SDMs for the frog's native and invasive ranges on mainland USA using single variables with BIOCLIM (Busby 1991) and DOMAIN (Carpenter et al. 1993) as implemented in DIVA-GIS 5.4; neither of these has a variable weighting function, however. To evaluate BIOCLIM and DOMAIN models, we applied the same approach as for Maxent models but used DIVA-GIS for AUC computation. It needs to be noted that, when trained with multiple variables, BIOCLIM and DOMAIN assign an equal weight to each variable, which is why the analyses described in approach (1) above remain restricted to Maxent.

Realized niche overlap, similarity and equivalency

In order to compare the Greenhouse frog's realized climatic niches in terms of potential distributions, we used Schoener’s index for niche overlap (D) and a modified Hellinger distance (I), as proposed by Warren et al. (2008). Both indices allow quantitative similarity assessments of two potential distribution maps (i.e. GIS grid layers) by computing the differences between them cell by cell. D and I values range from 0, indicating that two maps are completely different, to 1, indicating that both maps are identical. In the past, different comparative methods and null hypotheses have been used to quantify and define niche conservatism (Warren et al. 2008). Peterson et al. (1999), for example, assessed realized niche similarity to determine whether SDMs derived from occurrences of one species predict occurrences of a second species better than expected under the null hypothesis that SDMs provide no information on one another's range. On the other hand, Graham et al. (2004) performed a test of realized niche equivalency to evaluate whether the niches of two species are effectively indistinguishable, i.e. identical (see also Knouft et al. 2006; Pfenninger et al. 2007). Realized niche similarity and equivalency represent two extremes within a spectrum of niche conservatism. Hence, applying one or the other may lead to conflicting conclusions, as pointed out by Warren et al. (2008).
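The two overlap indices can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Perl script of Warren et al. (2008): the function names are hypothetical, each map is represented as a flat list of non-negative suitability scores, and I is computed as 1 minus half the squared Hellinger distance, which is the form assumed here to yield the 0 to 1 range described above.

```python
import math

def normalize(suitability):
    """Rescale the suitability scores of one map so that they sum to 1."""
    total = sum(suitability)
    return [s / total for s in suitability]

def schoener_d(map_x, map_y):
    """Schoener's D: 1 minus half the summed absolute cell-wise differences."""
    px, py = normalize(map_x), normalize(map_y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))

def hellinger_i(map_x, map_y):
    """I statistic, assumed here to be 1 minus half the squared Hellinger distance."""
    px, py = normalize(map_x), normalize(map_y)
    h_squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(px, py))
    return 1.0 - 0.5 * h_squared

# Hypothetical four-cell maps, not projections from this study.
native_map = [0.8, 0.6, 0.1, 0.0]
invasive_map = [0.1, 0.5, 0.7, 0.2]
d_value = schoener_d(native_map, invasive_map)
i_value = hellinger_i(native_map, invasive_map)
```

Both functions return 1 for two identical maps and 0 for two maps that share no suitable cell, matching the interpretation given in the text.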

For realized niche equivalency, we applied a randomization test based on the metrics D and I. We computed 100 pseudoreplicate datasets by randomly resampling the combined native and invasive records into sets of the original size of the native (or invasive) records. Subsequently, we computed SDMs from each pseudoreplicate using each of the ten bioclimatic variables separately and computed their D and I values. The observed D and I values were compared with the percentiles of the resulting null distributions. Observed overlap values falling within the null distribution suggest niche equivalency, whereas observed values outside the null distribution indicate significant differences. Note that in the latter case, observed values can only be lower than the null distribution, not higher, which makes this a one-tailed test. It evaluates the hypothesis that SDMs for native and invasive records are not significantly different. This test evaluates niche conservatism in the strictest sense, i.e. the effective equivalency of the realized climatic niches in the native and invasive ranges.
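The logic of the equivalency test can be sketched as follows. To keep the example self-contained, a simple histogram of a single variable stands in for a Maxent model; this stand-in, the function names, the bin edges and the toy values are all assumptions made for illustration and are not part of the original analysis. Only the pooling, the random repartitioning into sets of the original sizes and the one-tailed comparison follow the description above.

```python
import random

def histogram_model(values, edges):
    """Toy stand-in for an SDM: share of records falling in each bin of one variable."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(counts)):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no record falls inside the bin edges")
    return [c / total for c in counts]

def schoener_d(px, py):
    """Schoener's D for two vectors that each sum to 1."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))

def equivalency_test(native, invasive, edges, n_reps=100, seed=0):
    """One-tailed randomization test: is the observed overlap lower than the null?"""
    rng = random.Random(seed)
    observed = schoener_d(histogram_model(native, edges),
                          histogram_model(invasive, edges))
    pooled = list(native) + list(invasive)
    null = []
    for _ in range(n_reps):
        rng.shuffle(pooled)                      # random repartition of pooled records
        pseudo_native = pooled[:len(native)]     # same size as the native set
        pseudo_invasive = pooled[len(native):]   # same size as the invasive set
        null.append(schoener_d(histogram_model(pseudo_native, edges),
                               histogram_model(pseudo_invasive, edges)))
    # Share of pseudoreplicates with an overlap at or below the observed one.
    p_value = sum(1 for d in null if d <= observed) / n_reps
    return observed, null, p_value

# Hypothetical temperature values at records (not data from this study).
native = [20 + 0.1 * i for i in range(50)]
invasive = [10 + 0.1 * i for i in range(50)]
observed, null, p_value = equivalency_test(native, invasive, edges=[10, 15, 20, 25])
```

In this toy case the two record sets occupy disjoint bins, so the observed overlap is 0 and falls below every pseudoreplicate, which corresponds to rejecting equivalency.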

In a second test, for realized niche similarity, another randomization test introduced by Warren et al. (2008) is applied separately to each of the ten bioclimatic variables as predictors. It compares the actual similarity of SDMs based on native records, in terms of D and I values, to the distribution of similarities obtained by comparing them to SDMs built from randomly chosen cells; the same number of cells as originally included as invasive records is drawn from among the cells in the study area of the invasive records. The procedure is performed in both directions, using either invasive records and native background or native records and invasive background, 100 times each, to construct an expected distribution of D and I values between SDMs based on actual occurrences and random background data points. Appropriate selection of background points is important here, since they can influence the significance of the test (Anderson and Raza 2010). A background restricted to areas accessible to the species is considered optimal, since environmental conditions in other regions may be less informative with regard to its environmental preferences (e.g. Phillips 2008). We therefore restrict background points to the area defined by the MCP comprising all native (and likewise invasive) records. These null distributions served to test the following null hypothesis: measured realized niche overlap between native and invasive ranges is explained by regional similarities or differences in available habitat (Warren et al. 2008). This hypothesis is rejected if the actual similarity falls outside the 95% confidence limits of the null distribution. The test is two-tailed since results outside the null distributions in either direction are both biologically plausible and interesting (Warren et al. 2008).
Significantly higher values suggest that SDMs are more similar than expected by chance indicating active habitat choice and lower values indicate greater differences due to habitat avoidance (Rödder and Lötters 2009; Warren et al. 2008). This test, applied to models computed with single variables, allows for an assessment of similarities in the estimated habitat suitability, taking into account that the potential niche is limited due to multiple variables (Fig. 1b). All computations of D, I, realized niche similarity and equivalency were performed with a Perl script written by Warren et al. (2008).
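Restricting random background points to a minimum convex polygon, as described above, can be sketched in Python as follows. This is a simplified illustration with hypothetical function names: coordinates are treated as planar (x, y) pairs, which ignores the curvature of geographic coordinates, the hull is built with Andrew's monotone chain algorithm, and points are drawn by rejection sampling from the bounding box; the study itself sampled grid cells rather than continuous coordinates.

```python
import random

def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Minimum convex polygon (counter-clockwise) via Andrew's monotone chain."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on the border of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def sample_background(records, n_points, seed=0):
    """Draw random points inside the MCP of the records by rejection sampling."""
    hull = convex_hull(records)
    if len(hull) < 3:
        raise ValueError("records must span an area, not a point or a line")
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    rng = random.Random(seed)
    background = []
    while len(background) < n_points:
        p = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if inside_hull(hull, p):
            background.append(p)
    return background

# Hypothetical record coordinates, not localities from this study.
records = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
hull = convex_hull(records)
background = sample_background(records, 25)
```

For the similarity test, one such set of background points, of the same size as the record set being replaced, would be drawn in each of the 100 repetitions.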

Results

Environmental conditions at native records, invasive records and within the general area of occurrence

Figure 3 provides an overview of bioclimatic conditions at native and invasive records of E. planirostris affecting the Maxent probability distributions compared in our tests. The ranges of the ‘mean temperature of the wettest quarter’, ‘mean temperature of the warmest month’, ‘annual precipitation’, ‘precipitation of the wettest month’ and ‘precipitation of the warmest quarter’ at invasive records on the US mainland either fall completely within the variable gradients observed at native records or only slightly exceed them. At the same time, the variable gradients reflect climatic conditions within the species' general area of occurrence well. High proportions of the ranges of the ‘annual mean temperature’, ‘mean monthly temperature range’, ‘mean temperature of the driest quarter’ and ‘precipitation of the driest quarter’ at invasive records on the US mainland clearly differ from those observed at native records. The available ranges of these variables also show the largest differences between the Greenhouse frog's general areas of native and invasive occurrence. In contrast, the available climate space within the Greenhouse frog's native range and its invasive range on the US mainland is much more restricted than the climate space available in its Hawaiian invasive range. However, the species often utilizes only relatively small portions of the available climate space, which are more similar to the conditions within the frog's native range (‘annual mean temperature’, ‘maximum temperature of the warmest month’, ‘mean temperature of the wettest quarter’, ‘mean temperature of the warmest quarter’, ‘mean temperature of the driest quarter’; Fig. 3). These patterns suggest that particular climate conditions actually limit the species' range here.

Fig. 3

Comparison of bioclimatic variable scores as observed at native and invasive records of Eleutherodactylus planirostris (NAT REC native, FL REC invasive mainland USA, HW REC invasive Hawaii) and their general availability (BAC) throughout the frog's native and invasive ranges defined by a minimum convex polygon enclosing all native (and likewise invasive) records
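
The restriction of background cells to a minimum convex polygon (MCP) around the records can be illustrated with a short sketch. It is a hypothetical Python example under simplifying assumptions: coordinates are treated as planar (x, y) pairs, grid cells are represented by their centre points, and the function names are ours. It uses the standard monotone-chain convex hull and is not the GIS workflow used in the study.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def background_cells(records, grid_cells):
    """Keep only those grid cell centres that fall within the MCP of the records."""
    hull = convex_hull(records)
    return [c for c in grid_cells if inside_hull(hull, c)]


# Hypothetical record coordinates and a small regular grid (not study data).
records = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
grid = [(x, y) for x in range(-1, 6) for y in range(-1, 6)]
print(len(background_cells(records, grid)))
```

For real longitude and latitude data, projecting to an equal-area coordinate system before building the polygon would be a sensible precaution.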

Variable importance and similarity of SDMs considering different ranges

The relative variable contribution in Maxent models trained with either native records only, invasive records only or both native and invasive records of E. planirostris combined and using the set of ten bioclimatic predictors simultaneously is shown in Table 1. The relative variable contribution varies in all three cases, as expected from our theoretical considerations. Comparing all variable contributions, we find significant correlations in three comparisons (nat/invHW; nat/comb; invUS/comb) (Spearman's rank correlation coefficients: ρ nat/invUS = 0.442, P = 0.197; ρ nat/comb = 0.709, P = 0.026; ρ invUS/comb = 0.745, P = 0.017; ρ nat/invHW = 0.648, P = 0.046; ρ invHW/comb = 0.612, P = 0.063). However, when only variables contributing more than 5% are compared, no significant correlations are detectable except for the comparison invHW/comb (Spearman's rank correlation coefficients: ρ nat/invUS = 0.024, P = 0.977; ρ nat/comb = 0.643, P = 0.096; ρ invUS/comb = 0.690, P = 0.069; ρ nat/invHW = 0.214, P = 0.619; ρ invHW/comb = 0.762, P = 0.037).
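
The rank correlations reported above compare two vectors of per-variable contributions. A minimal sketch of Spearman's ρ is given below, assuming the contributions are available as two equally long lists; the function names and the example values are hypothetical and are not taken from Table 1. Ties receive average ranks, and the P values are not computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning the mean rank to tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical percent contributions of ten predictors in two models.
model_a = [35.0, 20.0, 12.0, 9.0, 8.0, 6.0, 4.0, 3.0, 2.0, 1.0]
model_b = [10.0, 30.0, 5.0, 15.0, 2.0, 18.0, 6.0, 8.0, 1.0, 5.0]
print(round(spearman_rho(model_a, model_b), 3))
```

Restricting both lists to variables above a contribution threshold, such as the 5% used in the text, only requires filtering the paired entries before calling `spearman_rho`.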

Explicit spatial comparisons among SDMs trained with either native or invasive records or both native and invasive records combined and subsequently projected into the Greenhouse frog's entire range reveal niche overlaps at values D = 0.39–0.62 and I = 0.58–0.72, respectively (D comb/nat = 0.57; I comb/nat = 0.70; D comb/invUS = 0.62; I comb/invUS = 0.72; D comb/invHW = 0.52; I comb/invHW = 0.67; D nat/invUS = 0.39; I nat/invUS = 0.58; D nat/invHW = 0.50; I nat/invHW = 0.65; D invUS/invHW = 0.53; I invUS/invHW = 0.67). This indicates merely marginal similarities of the potential distributions derived from SDMs trained within the different ranges. In one case, D values indicate that the potential distributions derived from the Greenhouse frog's native range and its invasive range on mainland USA are even contradictory.

Realized niche overlap, similarity and equivalency

Table 2 summarizes the realized niche overlap, similarity and equivalency tests among the Greenhouse frog's native and invasive ranges on mainland USA. D values ranged 0.28–0.87 and I values 0.55–0.91 (note that I values are always higher than D values; Warren et al. 2008). Highest realized niche overlap is found in the ‘mean temperature of the warmest quarter’, ‘mean temperature of the warmest month’ and ‘precipitation of the wettest month’. Interestingly, the mean value for the variable ‘mean temperature of the warmest quarter’ observed at the native and invasive records is close to the Greenhouse frog's preferred temperature determined by Pough et al. (1977). This suggests the existence of a direct link to the species' natural history. Lowest overlap is detected in the ‘annual mean temperature’ and ‘mean temperature of the driest quarter’. Values of the other variables are intermediate. The hypothesis of realized niche equivalency is thus rejected in most cases, with the exception of ‘maximum temperature of the warmest month’ and ‘mean temperature of the warmest quarter’. The results obtained from different algorithms (BIOCLIM, DOMAIN, Maxent) are highly coincident (P < 0.05) when compared with Spearman's rank correlation test (Table 3).

Table 2 AUC values per model, niche overlap in terms of I and D values and assessment of niche identity and equivalency via randomization tests (see text)
Table 3 Correlation and significance of AUC values per BIOCLIM, Maxent and DOMAIN models computed with single variables and corresponding niche overlap between the native and invasive ranges on mainland USA in terms of I and D values in terms of Spearman's rank correlation coefficients

Results from the Maxent realized niche similarity test based on native records compared to the invasive background on mainland USA revealed that climatic conditions described by most variables are more similar than expected by chance when applying both D and I measures (Table 2). One test applying D and four tests applying I are not significant. Results based on invasive records compared with the native background show that seven tests applying D and I are not significant, and tests for two variables are contradictory. Climatic conditions described by only one variable are more different and those described by three are more similar than expected by chance (Table 2). With increasing overlap, Maxent model performance in terms of AUC values decreases (Fig. 4). However, these relationships are not significant (R² I, nat = 0.223, P = 0.169; R² I, invUS = 0.051, P = 0.531; R² D, nat = 0.261, P = 0.131; R² D, invUS = 0.032, P = 0.622).
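
The relationship between overlap and AUC summarized by the R² values above corresponds to a simple linear regression with one predictor. The following sketch shows how slope, intercept and R² could be obtained, assuming one overlap value and one AUC value per variable; the function name and the numbers are hypothetical and do not reproduce Table 2. Significance testing is left out.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical overlap (I) and AUC values for four single-variable models.
overlap = [0.3, 0.5, 0.7, 0.9]
auc = [0.90, 0.85, 0.88, 0.80]
slope, intercept, r_squared = linear_fit(overlap, auc)
print(slope < 0, round(r_squared, 3))
```

A negative slope with a low R² would match the pattern described in the text: AUC tends to fall as overlap rises, but the trend explains little of the variation.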

Fig. 4

Relationship between niche overlap of native (nat) and invasive (inv) ranges of Eleutherodactylus planirostris on mainland USA in terms of D and I values and model performance (AUC)

Discussion

Our results clearly demonstrate variation in bioclimatic variables with regard to the available gradient, their power to characterize the realized distributions of E. planirostris and the degree of similarity of native and invasive range realized climatic niches. Results presented herein have important implications for model transferability across space and time, studies on biological invasion, impacts of climate change on species' ranges and niche evolution.

Explanative power of variables

Several studies have shown that the selection of variables can strongly influence SDM results (e.g. Beaumont et al. 2005; Peterson and Nakazawa 2008; Rödder and Lötters 2009; Rödder et al. 2009a, b). Commonly, a specific number of variables is necessary to achieve optimal predictions (Peterson and Cohoon 1999; Rödder et al. 2009a, b). Rödder and Lötters (2009) suggested that ‘relaxed’ variables, which have no or only local impact within a species’ range, should (in terms of statistics) be considered poor predictors for species' realized distributions. This becomes obvious when comparing the overlap between the available gradient of variables within a range and a species' potential niche among different areas (Fig. 1b). For instance, the overlap is relatively small in the hypothetical native range in variable 1 but almost complete in invasive ranges 1 and 2 in Fig. 1b, providing a poor contrast between climatic conditions at the species’ records and those available in the general area of occurrence. Indeed, our results indicate that model performance in terms of AUC values varies largely among predictors but not significantly among different algorithms using single variables. Comparing differences among AUC values of the same predictor within models developed with native and invasive records also reveals high variation in some cases (e.g. ΔAUCBio2 = 0.168, ΔAUCBio12 = 0.184, ΔAUCBio18 = 0.136; Table 2) but not so in others (e.g. ΔAUCBio9 = 0.019, ΔAUCBio13 = 0.020, ΔAUCBio10 = 0.000; Table 2). Interestingly, AUC values of models developed with single variables slightly decreased while the corresponding D and I values increased (Fig. 4), suggesting lower predictive ability when environmental conditions within the native and invasive ranges become more similar. A similar pattern was described by Rödder and Lötters (2009).

Applying multiple predictors, the relative contribution per variable varies in models developed for different regions (Table 1). From this, we derive that the explanative power of variables can strongly vary among general areas of occurrence, i.e. native and invasive ranges. This confirms our expectation formulated in the introduction. These differences in the explanatory power of variables when models are trained in different regions may lead to erroneous predictions when models are projected into novel areas not used for model training and especially when computing algorithms which weight predictors according to their explanative power. Such weighting functions certainly improve the predictive ability of SDMs within the region they are developed for, especially when complex dependencies among predictors are implemented. On the other hand, weighting functions and the implementation of variable interactions may lead to limited model transferability through space as shown in the case of E. planirostris.

Comparing the training and test AUCs of our SDMs, it is evident that training AUC values are always higher than test AUC values. Such differences can be taken as a measure of overfitting, which is likely to result from truncated response curves. Some sophisticated algorithms such as Maxent allow compensation of such effects by manually adjusting regularization parameters and feature classes (Phillips and Dudík 2008; Elith et al. 2010). Future studies should focus on a broader screening of crosswise predictions in invasive species to determine the best settings. Until then, simple profile algorithms without any weighting function, such as BIOCLIM or DOMAIN, may be considered to provide more robust predictions when transferring SDMs through space, although more sophisticated algorithms such as Maxent may frequently outperform them in the training region (Elith et al. 2006; Wisz et al. 2008). However, this requires that predictors be carefully chosen so that natural history properties of the target species are reflected. The implementation of predictors which are biologically less meaningful may over-restrict a potential distribution prediction (e.g. Rödder et al. 2009b). The observed trade-off between model complexity and transferability should be subject to further studies. Some suggestions on how to create SDMs with better transferability have recently been formulated by Elith et al. (2010). These comprise specific treatments of background data, selection of biologically relevant predictors (as also suggested by Rödder et al. 2009a, b) and manual down-regulation of partial model complexity (‘smoothing’ of response curves). At the current stage, more examples of how to apply these treatments are necessary to derive general guidelines.

Niche conservatism versus niche shift

We show that there are varying degrees of conservatism of realized climatic niches among predictors in the Greenhouse frog. This pattern can be explained either by limited gradients of available climate in one region which are extended in others, or by climate niche shift, which may be facilitated by genetic bottlenecks (Jakob et al. 2010). The results of our realized niche similarity test suggest that the observed niche difference between native and invasive populations is more often a function of habitat selection and/or suitability than an artifact of the underlying environmental difference between the suites of habitats available. In eight cases, we detect active habitat selection in one region but no such signal in the other region (Table 2). This indicates that the spatial distribution of climatic features suitable and unsuitable to E. planirostris varies among the native and the invasive ranges. Given the complex nature of climatic niches, it might be reasonable to expect such patterns, with niches being maintained along some environmental axes but not along others (Fitzpatrick et al. 2008). Our results apparently mirror a common pattern since they are largely coincident with those revealed for the invasive Mediterranean house gecko (Rödder and Lötters 2009) and other invasive species (authors' unpublished data). When interpreting patterns of apparent niche shift, distinguishing between patterns traceable to changes in habitat selection and those caused by different habitat availability is therefore essential. Once identified, observed niche shift may be assigned either to shift in both the fundamental and the realized niches (Pearman et al. 2008) or to shift in the realized niche only (e.g. due to extended accessibility or due to relaxation of biotic constraints). While the latter can easily be assessed when comparing environmental conditions of native and invasive ranges, detection of shift in a species' fundamental niche remains difficult.

Holt et al. (2005) have theorized that evolution of environmental tolerance is most likely if a species is introduced into a novel environment marginal to its tolerance, a process recently empirically confirmed by Jakob et al. (2010). Our results suggest that the variables limiting a species’ potential distribution actually vary regionally (Fig. 1). Hence, selective pressures may also vary spatially, most likely leading to a locally restricted, spatially non-homogeneously distributed shift in the fundamental niche. Such spatial inhomogeneities were detected by Brand and Grossenbacher (1979). They found differences in a frog species’ developmental properties correlated with different environmental tolerance between lowland and Alpine populations in Europe. The degree of the resulting spatial inhomogeneities of the fundamental niche breadth within a species may partly be compensated through genetic exchange among populations and may therefore depend on dispersal abilities and landscape permeability.

Earliest observations of established populations of E. planirostris were reported from Florida more than 130 years ago. These sites are most proximal to the species’ native distribution and show a high climatic similarity. By now, this frog has colonized most parts of the Florida peninsula and its distribution is close to equilibrium with climate (Dundee and Rossman 1989; Goin 1947). Similar effects accompanied with reduced reproductive success and decrease in female body conditions were recently detected along a latitudinal gradient in the Cuban tree frog (Osteopilus septentrionalis), native to the same region as E. planirostris (McGarrity and Johnson 2009). These findings are congruent with a decreasing climatic suitability in northern Florida derived from the tree frog’s native distribution suggesting niche conservatism (Rödder and Weinsheimer 2009). The invasion histories of the tree frog and E. planirostris in Florida are comparable. Therefore, it a priori is unlikely that the time since initial invasion is sufficient for greater shifts in the fundamental niche of E. planirostris.

Methodical caveats

One important assumption when applying SDMs is that environmental variables are actually limiting the target species' range. Consequently, it has to be asked whether the Greenhouse frog, an island species, is a suitable study organism with regard to our questions. We consider it suitable, since it is not equally distributed throughout Cuba and the Bahamas or the Hawaiian Islands. For example, its native altitudinal range is only 0–730 m above sea level, while Cuba's altitudinal range exceeds 1,100 m above sea level. This suggests that some parts of Cuba may be climatically unsuitable to E. planirostris. In addition, this provides sufficient contrast for successful SDM development, which is also supported by the comparison between the available variable gradients and the conditions observed at species records (e.g. ‘maximum temperature of the warmest month’, ‘precipitation of the coldest quarter’; Fig. 3). On the other hand, gradients of some variables available throughout Cuba and the Bahamas are completely reflected in the Greenhouse frog's realized native distribution (e.g. ‘annual mean temperature’, ‘mean temperature of the driest quarter’, ‘mean temperature of the coldest month’; Fig. 3). These different degrees of overlap agree well with our theoretical approach presented in Fig. 1. Note that the theoretical approach made here is also supported by the results presented for the Mediterranean house gecko by Rödder and Lötters (2009), in which native and invasive ranges are predominantly limited by unfavorable climate conditions. However, further studies are necessary to evaluate the general applicability of our theoretical framework.

Conclusions

Our findings have implications for the transferability of SDMs over space and time, which is strongly affected by the choice of predictors (Peterson and Nakazawa 2008; Rödder and Lötters 2009; Rödder et al. 2009b). Broennimann and Guisan (2008) and Beaumont et al. (2009) suggested that training models with records from both native and invasive ranges can improve model output by incorporating more information on the target species' niche. The advantage of this is that it is possible to overcome (at least regionally) the effects described above. However, this may only be valid for the area the model is trained for, i.e. the environmental conditions within the native and invasive ranges, which are actually represented in the species records and absence/pseudoabsence data used for model training. Generally, the incorporation of ‘relaxed’ predictors correlated with biologically meaningful predictors may lead to erroneous predictions for novel areas or time slices (Heikkinen et al. 2006). Because of this, unsuitable areas might be characterized as suitable and vice versa. Furthermore, thorough assessments of invasive species’ ecology, enabling the identification of biologically meaningful predictors and of the degree of conservatism among predictor variables, may help to find more general patterns.