Introduction

Invasive species are a major component of global environmental change, contributing to biotic homogenization and creating threats to biodiversity, ecosystem services, human health, and economic activities (Vitousek et al. 1997; Pejchar and Mooney 2009; Pyšek and Richardson 2010). However, knowledge gaps about the basic and applied ecology of species invasions and a research implementation gap between the knowledge generated by ecologists and that which is used by policy and management practitioners are persistent problems that impede the monitoring, management, and scientific understanding of invasive species (Esler et al. 2010; Bayliss et al. 2013). Tools that synthesize and transfer scientific knowledge effectively are needed in order to narrow these knowledge gaps, and to foster collaboration between the scientific, management, and policy development communities (Bayliss et al. 2013; Matzek et al. 2014; Giakoumi et al. 2016).

The spatial distributions of invasive species are a key piece of knowledge required for effective monitoring and management, but there is often considerable uncertainty about where invasive species do and do not occur. Unfortunately, few large-scale multi-species monitoring programs exist, and much of the information available on the distributions of invasive species comes from scattered, incidental occurrence records (e.g. Johnson et al. 2006; Gormley et al. 2011). Where dedicated surveys or monitoring programs have been implemented, they are often focused exclusively on hubs of introduction and transport vector activity (e.g. ship traffic and aquaculture; Arenas et al. 2006; Carman et al. 2010). Because of logistical constraints, surveys that sample beyond hubs of vector activity are almost inevitably limited to smaller spatial extents and/or coarser spatial resolutions than are needed for effective management action. The deficiencies of occurrence information are exacerbated in marine ecosystems, where observing and monitoring invasive species is typically more difficult, and fewer potential observers are active, than on land.

Anthropogenically-induced environmental change is altering the biophysical properties of the ocean and has the potential to transform marine ecosystems and the human systems that rely on them (Cheung 2019). For example, the distributions of a wide variety of marine taxa, including important fisheries species, are already tracking changes in local climate (Pinsky et al. 2013). There is strong scientific and practical interest in predicting how the distributions of invasive species are likely to change in response to future climate change, but present-day and historical occurrence records alone can tell us little about this (Jeschke and Strayer 2008; O’Donnell et al. 2012; Lowen et al. 2017). Consequently, reliable information about the current and future distributions of invasive species remains a major knowledge gap, and has contributed to the omission of invasive species from more than 95% of conservation plans, globally (Giakoumi et al. 2016; Mačić et al. 2018).

The limitations of existing occurrence data have increasingly caused scientists to turn to species distribution models (SDMs), which predict the distribution of organisms based on the distribution of appropriate biophysical conditions (Elith and Leathwick 2009). Species distribution models allow us to use known, spatially incomplete occurrence locations to predict comprehensive distributions with spatially variable, quantitative estimates of occurrence probability (Rondinini et al. 2006). When coupled to climate projections, they also allow us to obtain first-order predictions of how these distributions might shift in response to expected environmental change (Jeschke and Strayer 2008; O’Donnell et al. 2012; Lowen et al. 2017). Moreover, maps derived from individual species’ distribution models are powerful communication tools that can be used to support policy development, guide management decision-making, inform systematic conservation planning, and optimize monitoring and detection programs (Ball et al. 2009; Venette et al. 2010; Baxter and Possingham 2011; Guisan et al. 2013). However, ecosystems are at risk of invasion by many non-indigenous species, each of which is likely to have its own distribution, environmental requirements, potential impacts, and associations with vectors of introduction and spread. Understanding and communicating this complexity and detail is difficult. One way to deal with this complexity is to identify invasion hotspots, or areas subject to risk of invasion by multiple invaders (O’Donnell et al. 2012; Li et al. 2016). Just as biodiversity hotspots have been used extensively to focus conservation efforts, the concept of invasion hotspots can help to provide a focus for policy and management and support efforts to optimize our allocation of the limited resources available for monitoring and management in ecosystems at risk of invasion by many species (O’Donnell et al. 2012).

The concept of invasion hotspots has been used frequently in invasion ecology and management, but its use has often been vague, and instances where it has been formalized or applied in aquatic ecosystems are comparatively rare. Hotspots of marine invasions have been identified via counts of invasive species within individual embayments, governmental units, or ecoregions (e.g. Cohen and Carlton 1998; Ruiz et al. 2011; Marchini et al. 2015), or by using the intensity of vector activity as a proxy for invasion risk (e.g. Drake and Lodge 2004; Tidbury et al. 2016). Recent studies have combined information on global ship traffic patterns and the environmental distance between ports to model invasion risk and identify hotspots at the port and ecoregion level (Seebens et al. 2013, 2016). An alternate approach is to integrate (or “stack”) multiple high resolution species distribution models to identify areas of high invasion potential as indicators of invasion hotspots and risk, as has been done in some terrestrial systems (O’Donnell et al. 2012; Li et al. 2016). While the other approaches have proved informative and useful, they produce assessments that are focused on relatively small scales (e.g. single embayments), that are focused on large scales but provide spatially homogeneous estimates for large areas (e.g. entire ecoregions), or that provide spatially variable but disjunct assessments over large scales (e.g. for large ports around the world). Stacked species distribution models are a promising tool because they can provide higher resolution, spatially comprehensive and variable assessments of present-day invasion risk over large spatial extents and predictions of how this risk might respond to environmental change. However, there are few studies using stacked species distribution models to assess the risk of multiple marine invasions or identify marine invasion hotspots (but see Lowen et al. 2017). In fact, even the number of studies modelling the ranges of individual marine invasive species and how they are likely to respond to climate change has been minimal compared to those on terrestrial or freshwater ecosystems (13 marine studies versus 314 terrestrial and 98 freshwater studies; Bellard et al. 2018).

In this paper, our goals were to use species distribution models to describe spatial patterns in predicted invasive species richness, to identify present-day and future hotspots as spatial indicators of risk of invasion, and to describe how maps of predicted invasive species richness and invasion hotspots could be used as simple communication tools to guide monitoring and management decisions. We coupled occurrence data of invasive species on the east and west coasts of North America with high resolution seasonal climatologies for environmental variables via species distribution models to predict current and future distributions of 12 high-risk invasive species per coast. Integrating these species distribution models allows us to compare patterns of predicted invasive species richness to observed patterns, to identify hotspots and coldspots of invasion risk, and to examine how these patterns are predicted to change by 2075 under the RCP 8.5 climate change scenario. Finally, we describe how hotspots of invasion risk can be used to inform managers and decision makers.

Methods

Study regions

We focused on the northwest Atlantic and the northeast Pacific for this study, concentrating on areas where more than a decade’s worth of systematic marine invasive species monitoring data and high-resolution marine climate models from Fisheries and Oceans Canada could be leveraged. Within both regions, there is significant interest from government in using information about the present day and potential future distributions of marine invasive species in ecosystem management, environmental protection, and the design of monitoring programs. In the Atlantic, our model domain was from 32° to 53° N and 49° to 80° W. In the Pacific, our model domain was 24°–62° N and 111°–155° W. Although our present-day species distribution models covered these broader domains, we focused on smaller domains, where we were able to use 2075 climate projections (Atlantic: 41°–53°N, 52°–73° W; Pacific: 45.5°–56°N, 122.25°–135.75° W). A map of each study region and the places mentioned in the text can be found in Online Appendix 1.

Species occurrence data

To identify invasion hotspots, we used a subset of 12 invasive species on each coast. We focused on seaweeds and benthic/demersal invertebrates of shallow-water coastal habitats as little information is available about potential pelagic and offshore invasive species in our study regions. However, we selected species covering a range of taxonomic and life history variation that have already invaded the broader region (northwest Atlantic or northeast Pacific) and that have been identified as moderate to high risk of invasion and impact in Canadian waters (Drolet et al. 2016). We selected well-established (> 20 years) species with occurrence records from at least 30 unique locations to ensure that our focal species would have had time to spread within the study region, and that we had sufficient data for species distribution modelling. For the Northwest Atlantic, our focal species included solitary (Ascidiella aspersa, Ciona intestinalis, Styela clava) and colonial (Botrylloides violaceus, Botryllus schlosseri, Diplosoma listerianum, Didemnum vexillum) tunicates, a skeleton shrimp (Caprella mutica), two shore crabs (Carcinus maenas, Hemigrapsus sanguineus), a colonial bryozoan (Membranipora membranacea) and a seaweed (Codium fragile). For the Northeast Pacific, the focal species included six of the same species used in the Atlantic (B. violaceus, B. schlosseri, C. mutica, C. maenas, D. vexillum, S. clava), four bivalves (Mya arenaria, Crassostrea gigas, Venerupis philippinarum, Nuttallia obscurata), a gastropod (Ocinebrellus inornatus), and a seaweed (Sargassum muticum). All these species are algae or ectotherms that have adapted differently across cold to warm temperate saltwater environments, and whose development rates and distributions are expected to be strongly delineated and constrained by temperature and salinity.

We compiled occurrence data (latitude and longitude of each observation of each species) primarily from Fisheries and Oceans Canada’s (DFO) Aquatic Invasive Species Monitoring and the Ocean Biogeographic Information System (OBIS 2018), as well as several other online databases and primary and grey literature publications (Online Appendix 1), using data only from the focal region. We chose to exclude data from other parts of the world to ensure our models would reflect the environmental responses of the genetic lineages that have invaded our study regions. For plots of observed invasive species richness, we rasterized the occurrence points at a resolution of 0.5°, counting the number of unique invasive species (from our focal species) in each grid cell. We did not include grid cells without observations, as it is often likely that no sampling for our focal species was conducted in these areas. Prior to model fitting, we spatially rarefied the occurrence points to a 10 km resolution using SDMtoolbox 2.0 for ArcGIS to reduce potential effects of spatial autocorrelation on our results (Brown et al. 2017).
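The gridding step for observed richness can be illustrated with a minimal sketch. It assumes occurrence records are available as (species, latitude, longitude) tuples; the function name and the example records are hypothetical and are not taken from our dataset or workflow.

```python
import math
from collections import defaultdict

def observed_richness(records, cell_size=0.5):
    """Count unique species per grid cell.

    records: iterable of (species, latitude, longitude) tuples.
    Returns {(row, col): number of unique species}. Cells with no
    records are simply absent, mirroring the exclusion of unsampled cells.
    """
    species_by_cell = defaultdict(set)
    for species, lat, lon in records:
        # floor() assigns each point to the cell whose south-west corner
        # is at (row * cell_size, col * cell_size)
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        species_by_cell[cell].add(species)
    return {cell: len(names) for cell, names in species_by_cell.items()}

# Hypothetical records, for illustration only
records = [
    ("Carcinus maenas", 44.62, -63.57),
    ("Carcinus maenas", 44.70, -63.60),    # same species, same cell
    ("Ciona intestinalis", 44.65, -63.55),
    ("Codium fragile", 43.80, -66.10),
]
richness = observed_richness(records)
```

Using a set per cell means repeated observations of the same species in a cell are counted once, which is what a richness count requires.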

Environmental predictors

We selected surface water temperature, salinity, and wave action (i.e., significant wave height, Atlantic only), which are known to influence the distributions of coastal marine invertebrates and algae (Burrows et al. 2008; Lowen et al. 2016), as predictors in our species distribution models. Our decision not to include other predictors reflects a lack of available data (particularly future projections) for potentially relevant variables when we began this work. We intentionally excluded variables such as depth, latitude, and distance from shore, which are sometimes used as proxies for factors that control distributions, but whose relationships to those factors are likely to change in space and time. These decisions reflect recommendations to limit variables to functionally relevant predictors, and the dangers of using distal and proxy variables, which often make models more error-prone when they are extrapolated in space or time (Elith and Leathwick 2009). As our work involves extrapolation in time, including these proxies might improve model fit for the present day, but would risk significantly degrading future projections. Data for each predictor were averaged into seasonal climatology rasters, including winter (Atlantic: January to March, Pacific: December to February), spring (Atlantic: April to June, Pacific: March to May), summer (Atlantic: July to September, Pacific: June to August), and fall (Atlantic: October to December, Pacific: September to November). The differences in the definitions of the seasons reflect how physical oceanographers and ocean climate modellers typically define the seasons in these regions, and are related to the timing of events like stratification/destratification, warming, cooling, and the spring phytoplankton bloom.
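The two season definitions can be made concrete with a short sketch. It assumes that monthly mean values are already available for a grid cell; the dictionary names, the function, and the monthly temperatures are hypothetical, and the actual rasters may have been averaged in a different order (e.g. across years first).

```python
# Month numbers (1 = January) assigned to each season on each coast
ATLANTIC_SEASONS = {
    "winter": (1, 2, 3), "spring": (4, 5, 6),
    "summer": (7, 8, 9), "fall": (10, 11, 12),
}
PACIFIC_SEASONS = {
    "winter": (12, 1, 2), "spring": (3, 4, 5),
    "summer": (6, 7, 8), "fall": (9, 10, 11),
}

def seasonal_means(monthly, seasons):
    """Average monthly values for one grid cell into seasonal means.

    monthly: {month number: value}; seasons: {season name: month numbers}.
    """
    return {name: sum(monthly[m] for m in months) / len(months)
            for name, months in seasons.items()}

# Hypothetical monthly sea surface temperatures (deg C) for one cell
monthly_sst = {1: 2.0, 2: 1.0, 3: 3.0, 4: 5.0, 5: 8.0, 6: 11.0,
               7: 15.0, 8: 17.0, 9: 16.0, 10: 12.0, 11: 8.0, 12: 4.0}

atlantic = seasonal_means(monthly_sst, ATLANTIC_SEASONS)
pacific = seasonal_means(monthly_sst, PACIFIC_SEASONS)
```

The same monthly series yields different seasonal values under the two definitions, which is why the seasonal predictors are not directly interchangeable between coasts.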

In the Atlantic, sea surface temperature and salinity values were assembled at 0.01° resolution from AMSR-E Level 3 sea surface temperature satellite data (Advanced Very High Resolution Radiometer data, AVHRR Atlantic; 2002–2012; compiled by Fisheries and Oceans Canada) and global oceanographic salinity composites (BioOracle; Tyberghein et al. 2012). Seasonal significant wave height values were derived from numerical models (Wavewatch III, Rascle and Ardhuin 2013, 0.167° resolution; Guo and Sheng 2017, 0.125° resolution). Future seasonal climatologies of salinity and temperature matching the resolution (0.01°) of “present day” values were derived from numerical model projections (BNAM RCP 8.5 2075 monthly anomalies; Brickman et al. 2016), as were future seasonal significant wave height climatologies (Wavewatch III driven by Canadian Regional Climate Model, CanRCM4, predictions for 2070–2071, 0.125° resolution, Guo and Sheng 2017). Lower resolution wave height values (present and future) were resampled, using the nearest neighbor value, to match the 0.01° temperature and salinity values. Our choice of scenario RCP 8.5 2075 reflects the fact that our oceans are already warming significantly faster than previously expected (Cheng et al. 2019). It also reflects the availability of ocean climate projections for this scenario on both the Atlantic and Pacific coasts produced by models developed by Fisheries and Oceans Canada (including several of the authors of this manuscript), and our familiarity with them.

In the Pacific, we compiled salinity and temperature values from hindcasts from the University of British Columbia’s Salish Sea Nucleus for European Modelling of the Ocean (NEMO) model (2014–2017 hindcast, 0.006° resolution, Soontiens et al. 2016; Soontiens and Allen 2017), a Regional Ocean Modeling System (ROMS) model of the British Columbia shelf (BC ROMS 1981–2010 hindcast, 0.04° resolution, Peña et al. 2019), as well as the MARSPEC database (0.00833° resolution, Sbrocco and Barber 2013). The higher resolution data from the NEMO model and MARSPEC were resampled using bilinear interpolation to match the 0.04° resolution of the ROMS model data. Future climatological scenarios for salinity and temperature were derived from projections of the BC ROMS model (RCP 8.5, 2041 to 2070, 0.04° resolution, Peña et al. 2018). Extrapolation detection analyses (Mesgaran et al. 2014) indicated that neither novel values for individual predictors (i.e. values that fell outside the range of our present-day values) nor novel combinations of covariates were present in our future climate scenario for either the Atlantic or Pacific study area (Fig. A1.3, Online Appendix 1), indicating that our SDM projections did not involve extrapolation into non-analogous environmental conditions.
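The first of the two extrapolation checks, novel values for individual predictors, amounts to a range test, which the following sketch illustrates. It is not the method of Mesgaran et al. (2014) itself: the check for novel combinations of covariates is not covered here, and the function name and example values are hypothetical.

```python
def novel_cells(present, future):
    """Return indices of future cells with any predictor outside the
    present-day range of that predictor.

    present, future: {predictor name: list of values, one per grid cell}.
    """
    ranges = {name: (min(vals), max(vals)) for name, vals in present.items()}
    n_cells = len(next(iter(future.values())))
    flagged = []
    for i in range(n_cells):
        for name, (low, high) in ranges.items():
            if not low <= future[name][i] <= high:
                flagged.append(i)
                break  # one novel predictor is enough to flag the cell
    return flagged

# Hypothetical predictor values, for illustration only
present = {"summer_sst": [10.0, 14.0, 18.0], "summer_sal": [28.0, 30.0, 32.0]}
future = {"summer_sst": [12.0, 17.5, 19.2], "summer_sal": [29.0, 31.0, 31.5]}
flagged = novel_cells(present, future)
```

In this toy example the third cell is flagged because its projected summer temperature exceeds the present-day maximum; an empty result would correspond to the situation reported for our study areas.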

Although our focal species live primarily in the intertidal zone or very shallow subtidal areas close to shore, and the vast majority of our occurrence records were from depths shallower than 30 m, some focal species (e.g. C. maenas, C. intestinalis, D. vexillum) have been observed in deeper waters. We initially planned to limit our analysis to areas shallower than 100 m on both coasts. In the northwest Atlantic, this would have caused us to include data from large offshore banks, stretching up to several hundred km from shore, in the background environmental data used to fit our SDMs. Our occurrence dataset comprised records almost entirely from coastal areas (most < 30 m), possibly because of increased sampling near shore. Such spatially biased sampling will generally lead to environmental differences between occurrence and background datasets that may result in inaccurate models (Phillips et al. 2009). Restricting the background environmental data used in SDMs is an effective way to reduce potential bias in model results (Kramer-Schadt et al. 2013; Brown et al. 2017). Thus, for the Atlantic, we clipped environmental data layers to areas shallower than 30 m in order to eliminate thousands of square km of offshore banks between 30 and 100 m deep. Of the 1314 occurrence records remaining after spatial rarefaction, 11 were from depths greater than 100 m, and an additional 31 were excluded as a result of this change. In the Pacific, the 100 m depth contour is typically much closer to shore than in the Atlantic, offshore banks do not exist to the same extent, and we had more records from areas 30 to 100 m deep. Consequently, we limited our Pacific analysis to areas shallower than 100 m, as originally intended.

Species distribution models

We modelled each species distribution using MaxEnt 3.4.1 (Phillips et al. 2017) with seasonal salinity, temperature, and significant wave height (Atlantic only) as predictors. MaxEnt is a presence-only method that estimates species distributions by identifying the distribution with maximum entropy, subject to constraints derived from the values of environmental covariates at presence locations. This is equivalent to minimizing the relative entropy between the probability distribution estimated for the covariates at presence locations and that estimated for the covariates for the entire background landscape (Elith et al. 2011; Phillips et al. 2017). Although MaxEnt was originally developed from a machine learning perspective, it is equivalent to an inhomogeneous Poisson process model (Aarts et al. 2012; Renner and Warton 2013).

We used the default complementary log–log transform option within MaxEnt to produce estimates of occurrence probability, due to its better theoretical justification compared to the previously favored logistic transform (Phillips et al. 2017). To choose the feature class (i.e. potential response curve complexity) and regularization parameter (i.e. penalty for model complexity) settings for each species’ model, we used the ENMeval package (Muscarella et al. 2014) for R version 3.5.2 (R Core Team 2018) to evaluate a broad range of combinations. We selected the model settings that resulted in the lowest corrected Akaike information criterion (AICc) for each species and fit a model using those settings in MaxEnt using the maximum possible number of background points (depending on which study area) and 30-fold random cross-validation to evaluate the models and estimate standard deviations for the model predictions. We also used this model, together with 2075 climate projection datasets to predict each species’ future distribution.
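The AICc-based choice among candidate settings can be sketched as follows. The sketch uses the standard small-sample correction, AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1); how the log-likelihood and parameter count are obtained from a MaxEnt model is not shown, and the candidate settings and values are hypothetical.

```python
import math

def aicc(log_likelihood, k, n):
    """Corrected AIC for a model with k parameters fit to n occurrence records."""
    if n - k - 1 <= 0:
        return math.inf  # correction undefined; treat the model as unselectable
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical candidates:
# (feature classes, regularization multiplier, log-likelihood, parameter count)
candidates = [
    ("L", 1.0, -520.0, 4),
    ("LQ", 1.0, -505.0, 8),
    ("LQH", 0.5, -498.0, 25),
]
n_occurrences = 60

scored = [(aicc(ll, k, n_occurrences), fc, rm) for fc, rm, ll, k in candidates]
best_aicc, best_fc, best_rm = min(scored)  # lowest AICc wins
```

In this toy comparison the most complex candidate has the highest likelihood but is penalized heavily for its 25 parameters, so an intermediate setting is selected.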

Stacked species distribution models

We estimated grid cell-level invasive species richness by stacking our species distribution model predictions for present day and 2075 projections. We also calculated the centre of gravity (i.e. weighted centroid) of our present day and 2075 species richness estimates to examine the predicted latitudinal shift in the spatial distribution of invasive species richness within our study area. Our estimates of species richness are limited to the richness of our 12 focal species (per coast), so the potential range of our estimates (and observations) of species richness ranges from 0 to 12.
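A minimal sketch of the centre of gravity calculation and of the distance between two centres follows. It assumes a simple richness-weighted mean of cell latitudes and longitudes and a great-circle (haversine) distance on a sphere of radius 6371 km; the text does not specify the projection or distance method used, so both are assumptions, as are the function names and the example cells.

```python
import math

def centre_of_gravity(cells):
    """Richness-weighted mean position.

    cells: list of (latitude, longitude, richness) tuples.
    """
    total = sum(w for _, _, w in cells)
    lat = sum(la * w for la, _, w in cells) / total
    lon = sum(lo * w for _, lo, w in cells) / total
    return lat, lon

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Hypothetical cells, for illustration only
cells = [(45.0, -64.0, 2.0), (46.0, -65.0, 6.0)]
cog = centre_of_gravity(cells)

# Distance between the rounded Atlantic centres reported in the Results
atlantic_shift = haversine_km(45.30, -64.60, 45.47, -64.25)
```

Applied to the rounded Atlantic coordinates reported in the Results, this calculation gives a shift of roughly 33 km, close to the reported 33.6 km; the small difference is plausibly due to coordinate rounding or to the distance method.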

Species richness was estimated as the sum of the occurrence probability predictions from our individual models (Calabrese et al. 2014):

$$E(S_j) = \sum_{k=1}^{K} p_{j,k}$$

where $E(S_j)$ is the expected species richness ($S$) at site $j$, $K$ is the number of species in the dataset (12 per coast), and $p_{j,k}$ is the occurrence probability prediction for species $k$ at site $j$. This is equivalent to summing species suitability (based on occurrence probability) maps.

Assuming the $p_{j,k}$ are exact, known quantities, the variance of $S_j$ can be estimated as (Calabrese et al. 2014):

$$Var(S_j) = \sum_{k=1}^{K} (1 - p_{j,k})\, p_{j,k}$$

However, in reality, the $p_{j,k}$ are estimated with uncertainty. To incorporate this uncertainty, we propagated the error estimated within our individual models by summing the estimated variances (Li and Wu 2006) and adding this sum to the expected variance, changing the equation to:

$$Var(S_j) = \sum_{k=1}^{K} (1 - p_{j,k})\, p_{j,k} + \sum_{k=1}^{K} sd_{j,k}^{2}$$

where $sd_{j,k}$ is the standard deviation of the model prediction for species $k$ at site $j$. We used these estimates to construct Wald-type 95% confidence intervals for each site.
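The calculation for a single site can be worked through in a short sketch. It assumes the Wald-type interval takes the usual form of the estimate plus or minus 1.96 standard errors; the function name and the example probabilities and standard deviations are hypothetical.

```python
import math

def stacked_richness(probs, sds, z=1.96):
    """Expected richness, variance, and Wald-type interval for one site.

    probs: occurrence probability for each species at the site.
    sds:   standard deviation of each species' prediction at the site.
    """
    expected = sum(probs)
    # Bernoulli variance of each species plus the propagated model uncertainty
    variance = sum(p * (1 - p) for p in probs) + sum(sd ** 2 for sd in sds)
    half_width = z * math.sqrt(variance)
    return expected, variance, (expected - half_width, expected + half_width)

# Hypothetical four-species site, for illustration only
probs = [0.9, 0.8, 0.5, 0.1]
sds = [0.05, 0.10, 0.10, 0.02]
expected, variance, (lower, upper) = stacked_richness(probs, sds)
```

For these values the expected richness is 2.3 and the variance is 0.6129, of which 0.59 comes from the Bernoulli term and 0.0229 from the propagated model uncertainty.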

To check the performance of the stacked species distribution models, we visually and statistically compared their predictions to observed species richness. For the statistical comparisons, we aggregated our model predictions to the same 0.5° resolution used for the observed values, taking the mean of the finer resolution prediction grid cells. We then used linear regression to examine the relationship between the predictions and the observations. Ideally, these regression models would have an intercept near 0 and a slope near 1, and our predictions would statistically explain (predict) a significant proportion of the variation in observed species richness. However, we expected the proportion of explained variation to be modest and model predictions to frequently exceed observed species richness, because we believe some invaders may not have had the opportunity to disperse to all potentially suitable areas, and because the observed number of invaders in many areas is likely to be an underestimate of the true value owing to insufficient sampling.
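The aggregation and regression steps can be sketched as follows, assuming an ordinary least squares regression of observed richness on predicted richness. The function names and example values are hypothetical, and the sketch is not the code used for the analysis.

```python
import math
from collections import defaultdict

def aggregate_mean(fine_cells, cell_size=0.5):
    """Mean of fine-resolution predictions within each coarse cell.

    fine_cells: iterable of (latitude, longitude, predicted richness).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in fine_cells:
        key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def ols(x, y):
    """Ordinary least squares of y on x: returns (intercept, slope, r squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy ** 2 / (sxx * syy)

# Hypothetical fine-resolution predictions falling in one 0.5 degree cell
coarse = aggregate_mean([(44.61, -63.59, 3.0), (44.73, -63.52, 5.0)])

# Hypothetical coarse-cell values, for illustration only
predicted = [2.0, 4.0, 6.0, 8.0]
observed = [1.0, 4.0, 5.0, 8.0]
intercept, slope, r2 = ols(predicted, observed)
over_predicted = sum(p > o for p, o in zip(predicted, observed)) / len(predicted)
```

The last line gives the proportion of cells in which predicted richness exceeds observed richness, the quantity used to compare over-predictions with under-predictions.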

Hotspot/coldspot identification

In many other fields, hotspots and coldspots are defined using measures of local spatial association, such as Getis-Ord Gi*, which allow one to identify clusters of values in weighted point data that are particularly high, or low, relative to surrounding areas (Getis and Ord 1992). In order to evaluate invasion risk and impact, the magnitudes of the predicted values are more important than their spatial association, so we defined hotspots as the areas with the highest invasive species richness and coldspots as the areas with the lowest species richness, without considering spatial clustering. Areas of intermediate species richness can be considered “warmspots”. Several approaches could be used to distinguish hotspots and coldspots from other areas. We used two to demonstrate how and why they might be used. The first evaluated whether the invasive species richness estimate in each pixel was significantly above or below specific thresholds (above 8 for hotspots, below 4 for coldspots), according to its 95% confidence interval. This approach takes uncertainty in model estimates into account and is effective in communicating how the magnitude and spatial distribution of invasion risk might change through time. We used this approach to identify hotspots and coldspots across our entire study regions. The second approach delineated hotspots and coldspots according to whether pixels fell within the top and bottom deciles (i.e. top and bottom 10%) of the estimates. This approach enables one to focus attention on a fixed proportion of a focal area, regardless of the actual values, and whether they change through time or between different focal areas. It will be useful in situations where logistical constraints impose a limit on the area that can be studied, sampled, monitored, or protected, or where one wants to produce similar results across changing contexts. We used this approach to identify hotspots and coldspots in different ecoregions within our broader study areas that have different numbers of invasive species. The species richness and quantile thresholds we set in our examples were arbitrary but, as for choosing a method to define or delineate hotspots and coldspots, there may be legislative, policy-based, scientific, or logistical reasons for choosing thresholds in other instances.
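The two delineation approaches can be contrasted in a short sketch. It assumes that "significantly above" a threshold means the lower 95% confidence limit exceeds it, and "significantly below" means the upper limit falls under it; the decile cut-offs are taken from ranked values, which is one of several possible quantile conventions. Function names and example values are hypothetical.

```python
def classify_by_threshold(lower, upper, hot=8.0, cold=4.0):
    """Classify one pixel from the limits of its 95% confidence interval."""
    if lower > hot:
        return "hotspot"
    if upper < cold:
        return "coldspot"
    return "warmspot"

def classify_by_decile(values, fraction=0.10):
    """Label the top and bottom fraction of pixels within a focal area."""
    ranked = sorted(values)
    n = len(ranked)
    k = max(1, int(n * fraction))   # number of pixels in each tail
    low_cut, high_cut = ranked[k - 1], ranked[n - k]
    labels = []
    for v in values:
        if v >= high_cut:
            labels.append("hotspot")
        elif v <= low_cut:
            labels.append("coldspot")
        else:
            labels.append("warmspot")
    return labels

# Hypothetical pixels, for illustration only
by_threshold = [classify_by_threshold(lo, hi)
                for lo, hi in [(8.4, 10.9), (1.0, 3.5), (7.5, 9.0)]]
by_decile = classify_by_decile([float(v) for v in range(1, 21)])
```

The third pixel in the threshold example is a warmspot even though its interval extends above 8, because its lower limit does not exceed the threshold. With tied values at a cut-off, the decile approach can label slightly more than 10% of pixels.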

Results

Species distribution models

Our individual species distribution models exhibited good performance, with the area under the receiver operating characteristic curve (AUC) ranging from 0.760 to 0.950 and the Boyce Index ranging from 0.7 to 0.98 in the Atlantic (Table 1). In the Pacific, AUC ranged from 0.815 to 0.948 and the Boyce Index ranged from 0.69 to 0.97. Seasonal temperature variables were generally the best predictors of the distributions of our focal species in the Atlantic, with salinity and wave height variables each playing secondary or tertiary roles for different species (Table 1a). In the Pacific, seasonal temperature variables were also more important than salinity for all species except two molluscs, Nuttallia obscurata and Ocinebrellus inornatus (Table 1b).

Table 1 Evaluations of (a) Atlantic and (b) Pacific model performance and variable importance for each species

Observed invasive species richness and stacked species distribution model predictions

In the Atlantic, most coastal grid cells contained observations of at least one of our focal species, except in much of Quebec and parts of Newfoundland (Fig. 1a). Where data were available, the number of our focal invasive species observed in the 0.5° grid cells ranged from 1 to 12, with particularly high numbers of invaders observed in parts of southwestern Nova Scotia, southern Maine, New Hampshire, Massachusetts, and Rhode Island. Few invaders were observed in the inner Bay of Fundy, northern New Brunswick, Quebec, and most of Newfoundland (see Figure A1.1 in Online Appendix 1 for a map of locations). Our present-day stacked species distribution model estimates for invasive species richness ranged from 0.2 to 11.1 of the possible 12 species (4.0 ± 2.6; mean ± standard deviation; Fig. 1b), and exhibited a broadly similar spatial pattern to observed richness of our focal invasive species (Fig. 1a). The stacked species distribution model did a good job of predicting observed species richness, with the intercept and slope close to their ideal values and more than half the variation in observed species richness explained by predicted richness (intercept: −0.20, slope: 0.91, r² = 0.52, Online Appendix 2, Fig. A2.1). As expected (see Methods), over-predictions of observed species richness (62% of cells) were more common than under-predictions (38% of cells). Our stacked species distribution model for 2075 predicts the average number of invasive species present per cell to increase slightly, from 4.0 to 4.5 (± 2.6; range 0.3–11.0; Fig. 1c). These relatively small changes in the mean and range of predicted invasive species richness reflect pixel-level changes that were mostly small and positive (range −1.8 to 2.5), as highlighted in the anomaly plot (Fig. 1d). The centre of gravity of our SSDM estimates shifted 33.6 km to the northeast, from (45.30° N, 64.60° W) to (45.47° N, 64.25° W).

Fig. 1 Number of focal invasive species in the northwest Atlantic according to observations (a), present day stacked species distribution model (SSDM) (b), and 2075 SSDM projection (c), and the predicted change in number of invaders according to SSDMs (d)

In the Pacific, occurrence data were available from almost all the grid cells in our study area and the number of observed invaders in these grid cells ranged from 1 to 11 (Fig. 2a). High invasive species richness was observed primarily in the Salish Sea and some areas of West Coast Vancouver Island (Fig. 2a), low richness was observed along the central and northern coasts of British Columbia (BC), and low to moderate invasive species richness was observed along the outer coast of Washington and Oregon (see Figure A1.2 in Online Appendix 1 for a map of locations). Our present-day stacked species distribution model estimates for invasive species richness ranged from 0.4 to 11.6 species (4.6 ± 2.4). The stacked species distribution model again exhibited a broadly similar spatial pattern to observed species richness, though it predicted that more invasive species would be present along much of the outer Washington and Oregon coastline (Fig. 2b). The stacked species distribution model did a similar job of predicting observed species richness to the one for the Atlantic (intercept: 0.01, slope: 0.81, r² = 0.42, Online Appendix 2, Fig. A2.2), and over-predictions of observed species richness (69% of cells) were more common than under-predictions (31% of cells). The stacked species distribution model for 2075 predicted the average number of invasive species present per cell to increase from 4.6 to 6.4 (± 1.5; range 2.8–10.6; Fig. 2c). The change in predicted richness (1.9 ± 1.4; range −5.8 to 6.2, Fig. 2d) was positive in 90% of the grid cells in our study area, but there were a few small areas in the southern Strait of Juan de Fuca, Padilla Bay, Skagit Bay, Samish Bay, and Bellingham Bay where the expected number of these invaders was predicted to fall. The centre of gravity of our SSDM estimates shifted 95 km to the northwest, from (50.56° N, 126.61° W) to (51.14° N, 127.60° W).

Fig. 2 Number of focal invasive species in the northeast Pacific according to observations (a), present day stacked species distribution model (SSDM) (b), and 2075 SSDM projection (c), and the predicted change in number of invaders according to SSDMs (d). Note that there are slight differences in the coverage of present day and 2075 environmental datasets and, thus, the coverage of associated model predictions. Predicted change is calculated only for areas with data for both time periods

Hotspots and coldspots

Our northwest Atlantic analysis using species richness thresholds identified hotspots through much of the area between Cape Cod and southern Maine, in Passamaquoddy Bay, and in several bays in southwestern Nova Scotia for the present day (Fig. 3a). Coldspots were confined to the northern Gulf of St. Lawrence, around Newfoundland, southeast of Cape Cod, and small shallow areas far offshore. For 2075, there was little change in the overall pattern of hotspots and coldspots, apart from slight expansion of existing hotspots and contraction of existing coldspots (Fig. 3b). In the Pacific, much of the southern Salish Sea was identified as a present-day hotspot, as were a few areas along the outer coast of Vancouver Island, Washington and Oregon (Fig. 3c). Large areas around Haida Gwaii, northern BC and southern Alaska were identified as coldspots. Our analysis predicts that coldspots will be almost completely eliminated by 2075, replaced by large areas of intermediate invasive species richness (warmspots), and hotspots will expand to cover many areas on the western coast of Vancouver Island, the central coast of British Columbia, and most of the outer coast of Washington (Fig. 3d). Conversely, hotspot areas are predicted to disappear from parts of the southern Salish Sea, replaced by warmspots.

Fig. 3

Location of present and future hot and cold spots in the northwest Atlantic (a, b) and northeast Pacific (c, d). Hot (red) and cold (blue) spots identified as areas with predicted species richness significantly higher than 8 and lower than 4, respectively. Other areas (i.e. “warm spots”) are depicted in yellow
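
The threshold rule in this caption can be sketched in a few lines. This is a simplified illustration, not the authors' procedure: it compares each cell's predicted richness directly with the cut-offs of 8 and 4 and omits the significance test implied by "significantly higher" and "lower". The function name `classify_by_threshold` and its parameter names are hypothetical.

```python
def classify_by_threshold(richness, cold_below=4.0, hot_above=8.0):
    """Label one grid cell from its predicted invasive species richness.

    Assumption: plain comparison with fixed cut-offs; no uncertainty in
    the prediction is taken into account.
    """
    if richness > hot_above:
        return "hotspot"
    if richness < cold_below:
        return "coldspot"
    return "warmspot"


# Illustrative richness values only (not cell values from the study).
for value in (2.5, 6.0, 9.3):
    print(value, classify_by_threshold(value))
```

Fixed cut-offs of this kind make maps comparable across regions and time periods, at the cost of leaving some regions with no hotspots at all.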

Using deciles to delineate hotspots at the ecoregion level, we found similar present-day hotspots in the same areas of the Scotian Shelf as we identified in the broader-scale species richness-based analysis (Fig. 4a). Coldspots were identified in offshore areas around Sable Island, south of Yarmouth, and in the inner Bay of Fundy. These patterns changed only slightly for the 2075 projections, with some hotspots and coldspots increasing or decreasing in size (Fig. 4b). However, new coldspots were predicted to develop in Minas Basin. In the Gulf of St. Lawrence ecoregion, hotspots were concentrated around the eastern Northumberland Strait and northern Cape Breton Island, with coldspots spread around the northern Gulf (Fig. 4c). For 2075, new hotspots are predicted to occur in Port au Port Bay, Newfoundland and a formerly very small hotspot in St. Georges Bay, Newfoundland is predicted to expand considerably (Fig. 4d). A coldspot north of the Magdalen Islands is predicted to disappear, while new coldspot areas are predicted to occur in the Saint Lawrence Estuary. In the Strait of Georgia, hotspots are concentrated in the middle part of the strait on the Vancouver Island and mainland coastlines, with coldspots to the southeast and northwest (Fig. 4e). For 2075, hotspot areas are largely predicted to move to the northwest while coldspots are predicted to shrink in the northwest and expand in the southeast (Fig. 4f).

Fig. 4

Location of present and future hot and cold spots in the Scotian Shelf (a, b), Gulf of St. Lawrence (c, d), and Strait of Georgia (e, f) ecoregions. Hot (red) and cold (blue) spots identified as areas with top and bottom 10% of predicted species richness within the ecoregion, respectively. Other areas that are neither hotspot nor coldspot (i.e. “warmspots”) are depicted in yellow
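
The decile rule in this caption is relative to each ecoregion rather than fixed. A minimal sketch follows, assuming predicted richness is available per grid cell for a single ecoregion. The function name `classify_by_decile`, the cell identifiers, and the choice of quantile interpolation method are all assumptions made for illustration.

```python
from statistics import quantiles


def classify_by_decile(richness_by_cell):
    """Label cells in one ecoregion by the top and bottom 10% of richness.

    Assumption: decile cut points use the "inclusive" interpolation method;
    the original analysis may have handled ties and cut points differently.
    """
    cuts = quantiles(list(richness_by_cell.values()), n=10, method="inclusive")
    low_cut, high_cut = cuts[0], cuts[-1]
    labels = {}
    for cell, richness in richness_by_cell.items():
        if richness >= high_cut:
            labels[cell] = "hotspot"
        elif richness <= low_cut:
            labels[cell] = "coldspot"
        else:
            labels[cell] = "warmspot"
    return labels


# Toy ecoregion with 20 cells of evenly spaced richness (illustration only).
toy_region = {f"cell{i}": float(i) for i in range(1, 21)}
toy_labels = classify_by_decile(toy_region)
print(sorted(c for c, lab in toy_labels.items() if lab == "hotspot"))
```

Because the cut points are recomputed for each ecoregion, every ecoregion has hotspots and coldspots by construction, which is why this approach can reveal relative risk in regions that contain no hotspots under fixed thresholds.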

Discussion

Hotspots of invasion risk based on the history of invasions and the predicted distributions of invasive species within an area can help to identify areas where invasive species might pose the greatest ecological and socioeconomic threats, and where limited resources available for prevention, early detection, and management should be focused (Kulhanek et al. 2011; O’Donnell et al. 2012; Bellard et al. 2014; Li et al. 2016). Assessing where invasive species are most likely to establish is critical for marine ecosystem management, but a lack of this information prevents invasive species from being effectively integrated into biosecurity monitoring programs and conservation plans (Giakoumi et al. 2016). Previous research has used species distribution models to examine the current and potential future distributions of individual marine invasive species (e.g. de Rivera et al. 2011; Lowen et al. 2016), or closely related guilds of invasive species that exhibit differences in their temperature and salinity tolerances that translate to differences in their potential to spread across cold to warm temperate and sub-arctic environments (Lowen et al. 2017). We advanced this approach by using species distribution models to identify areas that are at a high risk of invasion from a larger, more diverse suite of moderate- to high-risk invasive species with well-known invasion histories under present-day and future climatological conditions. Results based on this methodology provide a broader, more integrative assessment of the spatial patterns of the threat posed by marine invasive species in the coastal waters of the northwest Atlantic and northeast Pacific.

Spatial patterns in invasion risk

Previous work focusing exclusively on invasive tunicates predicted moderate to high numbers of invasive species to occur in most coastal areas of SW New Brunswick, the Atlantic coast of Nova Scotia, and in the southern Gulf of St. Lawrence, and few species to occur in most areas around Newfoundland and the northern Gulf of St. Lawrence (Lowen et al. 2017). Although we focused on a broader suite of invasive species, and used slightly different environmental and occurrence datasets, our results were broadly similar. Focusing on the full extent of our study areas, we found that the predicted number of invasive species generally decreased with latitude, such that present day invasion hotspots were concentrated within the southern portion of our focal area and coldspots were generally concentrated in the north. This pattern is related to our choice of species, which are thought to have been initially introduced to the south of our focal areas, or towards their southern end, and subsequently spread northward (Fofonoff et al. 2018). Moreover, conditions in our focal area also tend to be towards the cooler end of the thermal tolerances of these species, except Membranipora membranacea (Fofonoff et al. 2018). However, we are not aware of marine invasive species thought to have been initially introduced to North America in northern waters. Several lines of evidence support the existence of a latitudinal gradient in marine invasive species richness, which mirrors the general latitudinal diversity gradient. The existence of a global latitudinal gradient in coastal marine invertebrate and plant biodiversity (Worm and Tittensor 2018) means that the pool of potential invasive species declines with latitude. Although invasive species richness is often negatively correlated with native biodiversity in experiments, and at very fine spatial scales, the opposite is typically true at coarser spatial scales, and latitudinal gradients of invasive species richness have been observed in other taxa (Fridley et al. 2007; Sax 2001). Moreover, the U.S. has a higher concentration of ports and, thus, greater exposure to shipping as a vector of introduction and spread. Thus, the southern ends of our study areas have historically experienced a higher likelihood and number of introductions of marine invasive species. Finally, southern portions of the study areas also have a better climate match to European and Asian source regions than those in the north.

Widely cited global analyses have used the intensity of commercial vessel traffic in major ports (e.g. Drake and Lodge 2004), or vessel traffic and environmental conditions within major ports (e.g. Seebens et al. 2013, 2016) to assess regional-scale invasion risk. Neither patterns of observed marine invasive species richness, nor predicted hotspots, closely matched the locations of major ports within our study area (Figs. 1, 2, 3). Our results suggest that large ports do not necessarily reflect regional-scale invasion risk or experience the highest risk of invasion, despite high levels of vessel activity. Although the ports of Boston, Seattle, and Vancouver overlap with hotspots, they have lower observed and/or predicted invasive species richness (for our focal species) than nearby non-port areas, as do other large ports that fall outside hotspots [e.g. Halifax, Saint John, St. John’s, Prince Rupert (see Figures A1.1 and A1.2 in Online Appendix for maps of port locations)]. This underscores the importance of considering where species can spread via natural dispersal following initial introduction, as well as the influence of other vectors of initial introduction and spread, such as aquaculture (Naylor et al. 2001), recreational vessels (Clarke Murray et al. 2011), oil platforms (Pajuelo et al. 2016), and the trade of live organisms for food, bait, education, research, and public and private aquaria (Weigle et al. 2005) when assessing local and regional-scale invasion risk.

Our results suggest that projected change in invasive species richness by 2075 will be far more dramatic in the northeast Pacific than the northwest Atlantic, with many species currently confined to warmer southern waters expected to move north, thus reducing or eliminating present day coldspots concentrated on the northeast Pacific Coast (Figs. 2, 3). Whereas most Pacific species expanded their range to the north in our future projections, there is less change in the Atlantic (Online Appendix 3). This may be due to differences between the models used to project the future climate scenario, including the magnitude of the projected changes in temperature and salinity, which are larger in the Pacific than the Atlantic (Brickman et al. 2016; Peña et al. 2018), or to a larger number of species expected to cross thermal thresholds under future scenarios in the Pacific.

Applications of invasion hotspot maps

Different scientific, management, and communication objectives require different information, and should guide how one chooses to define invasion hotspots. Maps of predicted invasive species richness provide the best depiction of our results, but discretized hotspot maps help to simplify communication and focus attention. In some cases, there may be thresholds in the number of invaders that we are, or are not, concerned about. In other cases, resource limitations might constrain the number of sites or total area we can monitor, manage, or mitigate, so prioritizing hotspots (i.e. according to quantiles) makes sense. We focus on applications of hotspots and coldspots because of the frequent need to prioritize (or deprioritize) specific areas for attention. However, this does not mean that areas of intermediate invasion risk should be ignored. Moreover, the more detailed species richness maps and/or the full coldspot-warmspot-hotspot rankings can be used when decision makers have greater capacity to deal with complexity or to allocate resources more broadly.

Marine invasion hotspots and coldspots are most directly applicable in invasive species monitoring and management. Guisan et al. (2013) discuss how species distribution models can be used in all stages of a structured decision-making process for dealing with biological invasions, and the principles they discuss apply equally to our stacked species distribution models and hotspot maps. First, the models can help to identify potential problems, such as increasing invasion risk in an area due to climate change, and help to provide a frame of reference to define objectives and possible actions. By indicating where species are most likely to become established now and spread in the future, predictive models facilitate early detection through optimization of effective, cost-efficient monitoring programs (Honrado et al. 2016), and can allow one to anticipate and plan regulatory or management interventions to prevent, mitigate or eradicate invasions. Then, the models can help to assess how effective different actions might be and trade-offs between the costs and benefits of different options prior to a final decision (e.g. Baxter and Possingham 2011). Finally, errors associated with model estimates contribute to uncertainty assessments throughout the decision-making process.

Marine invasive species intersect with other conservation and marine management objectives through their vectors of introduction and dispersal, and through their ecological, economic, and social/cultural effects in affected areas. Vessel traffic (ballast water, hull fouling) and aquaculture are considered the primary vectors responsible for the introduction and spread of marine invasive species (Williams et al. 2013), including all the species in our study. In recent years, the threat posed by introduced and invasive species has been reduced through regulatory measures and voluntary codes of practice, including ballast water management (Scriven et al. 2015), and control of introductions and transfers of aquaculture species, and movement of aquaculture gear (ICES 2005). However, a degree of threat remains. Rates of compliance with existing regulations are less than 100%, and many vessels (e.g. recreational vessels, vessels moving within or between regions within national boundaries) and aquaculture activities are not yet subject to regulations (Castro et al. 2018). Moreover, hull fouling is a major component of the risk posed by vessel traffic but, unlike ballast water, it is largely unregulated (Sylvester et al. 2011). Hotspot maps depicting areas that are at high and low risk of invasion by multiple species could be used to focus new and existing regulatory measures in ways that minimize the risk of introduction and spread of invasive species by these vectors and the regulatory burden placed on both industries.

Examining how invasion hotspots intersect with areas of conservation concern (e.g. habitats of threatened species, marine protected areas, biodiversity hotspots) or areas providing important economic and ecosystem service benefits (e.g. ports, marinas, beaches) is another way to focus biosecurity measures and guide decisions about conservation and marine spatial planning. For instance, in addition to being a vector of marine invasive species, the aquaculture industry suffers some of their most significant economic impacts, through bio-fouling of aquaculture organisms and gear, as well as losses due to competition and predation by marine invasive species (Beal and Kraus 2002; Carver et al. 2003). Maps of invasion hotspots could be used to assess the current and future threat to existing aquaculture operations and potentially influence where new aquaculture leases are located. Aquaculture operations in the southern Gulf of St. Lawrence have been strongly affected by marine invasions (Locke et al. 2009), but this area has only moderate invasive species richness relative to the entire northwest Atlantic study region, and no hotspots were identified for this area in our broadest scale analyses (Figs. 2, 3). However, defining hotspots according to regionally specific thresholds reveals that hotspots within the Gulf of St. Lawrence ecoregion are concentrated in the southern Gulf, and that a high proportion of aquaculture leases are in current or projected invasion hotspots (Figs. 4, 5). This demonstrates the value of setting regionally and/or objective-specific criteria to define and delineate hotspots and illustrates how potentially serious and significant impacts on ecosystem goods and services from invasive species exist outside the broadest-scale hotspots.

Fig. 5

Existing aquaculture sites (green diamonds) in relation to present day and future invasion risk hotspots and coldspots in the southern Gulf of St. Lawrence, demonstrating a projected increase in the severity of invasions near aquaculture sites in the Magdalen Islands. Hotspots (red) and coldspots (blue) are delineated as top and bottom 10% of predicted invasive species richness in the Gulf of St. Lawrence ecoregion (as in Fig. 4). Other areas that are neither hotspot nor coldspot (i.e. “warmspots”) are depicted in yellow

Our analyses provide information about invasion risk, in terms of the identity and number of invasive species that are likely to become established in an area. However, in many management applications, risk of impacts may be more important than risk of invasion. Estimating cumulative invasion impacts requires information about the abundance of each invasive species, as well as knowledge of the ecological and human systems present in an area, and how particular invasive species affect them. If such information cannot be obtained, it may sometimes be possible to use cumulative invasion risk as a proxy for cumulative risk of impact, but only with great caution.

Limitations

Our analysis of invasive species richness and invasion risk hotspots is subject to the same limitations as other analyses using species distribution models. All estimates provided by species distribution models are subject to errors because of data deficiencies and uncertainty introduced through imperfect model specification (Barry and Elith 2006). Predictions will be imperfect because occurrence data used to fit the models are incomplete and may be biased towards areas where observations are more likely; long term climatologies will not capture the effects of intra-annual variability on species distributions; even the best high resolution climate data will not perfectly match the environmental conditions organisms actually experience; and species distribution models never capture all the factors that control a species’ distribution (Rondinini et al. 2006).

The differences between observed species richness and model-based estimates can be informative. These differences can be caused by shortcomings of either the SDMs or the observation data (e.g. underestimates due to insufficient sampling). Where model predictions are higher than observed species richness, three scenarios are worth investigating. First, some invasion(s) may have gone undetected due to insufficient sampling. Second, although our focal species are likely to have had many opportunities to spread to representative (in terms of environmental suitability) habitats in the 20-plus years since each invasion began, they may not have reached or established in all suitable areas. In suitable areas where species are not yet established there may be an opportunity for preventative management. Third, there may be a limiting environmental factor or biotic interaction in that area that has not been captured in the model. Underpredictions of observed species richness were half as common as overpredictions in our results. Underpredictions could be indicative of source-sink dynamics, where repeated introductions or on-going dispersal (via natural or anthropogenic vectors) help to support populations in marginal habitats (Lockwood et al. 2005; Franklin 2010). Future research combining predictions of invasive species richness and information about currents and vector activity could help to identify areas where high dispersal maintains higher than predicted species richness, as well as environmentally suitable areas that go uncolonized due to dispersal limitation.
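
The kind of comparison discussed here can be summarized per grid cell with a few statistics. The sketch below is a hypothetical illustration using toy numbers, not data from the study. It assumes observed richness is regressed on predicted richness by ordinary least squares (the direction of the regression is an assumption), and `compare_richness` is an invented name.

```python
def compare_richness(observed, predicted):
    """Summarize agreement between observed and predicted richness per cell.

    Assumption: ordinary least squares of observed on predicted richness,
    with over- and underprediction counted cell by cell.
    """
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    slope = s_po / s_pp
    return {
        "intercept": mean_o - slope * mean_p,
        "slope": slope,
        "r2": s_po ** 2 / (s_pp * s_oo),
        "share_over": sum(p > o for p, o in zip(predicted, observed)) / n,
        "share_under": sum(p < o for p, o in zip(predicted, observed)) / n,
    }


# Toy values for illustration only (not data from the study).
observed = [1, 3, 2, 5, 4]
predicted = [2.0, 2.5, 3.5, 4.0, 5.5]
summary = compare_richness(observed, predicted)
print(summary)
```

Mapping the cells that fall into the over- and underpredicted groups, rather than only reporting their shares, is what would allow the scenarios above to be investigated site by site.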

Including additional environmental parameters (e.g. nutrients, pH, etc.) might have further improved our models. We did not include additional predictor variables, either because we did not have access to environmental projections that would allow us to project our models, or because no appropriate present-day data exist. For example, most of the species we investigated rely on hard substrates (Fofonoff et al. 2018). To our knowledge, high resolution substrate data throughout our study areas do not exist. Without such data, our models are likely to over-estimate invasive species richness in sandy or muddy areas, unless sufficient rocky outcrops or anthropogenic structures are present within each of the grid cells to which a probability is assigned. Thankfully, the accessibility of high resolution, historical and present-day data for an increasingly wide array of environmental parameters that can be used to model the distributions of marine species is steadily increasing (e.g. Bio-ORACLE v2.0, Assis et al. 2018). Acquiring projections for environmental parameters under future climate scenarios is more difficult, and remains an impediment to further improving predictions of the future distributions of invasive and native species.

Distribution models based on species’ environmental responses typically do not account for biotic interactions, or how they might change in response to environmental change (Davis et al. 1998), and simple stacking of distribution models involves an implicit assumption that species occurrence probabilities are independent after accounting for the effects of the environmental predictors (Calabrese et al. 2014). Nor do species distribution models typically capture the influence of genetic variability, phenotypic plasticity, or evolutionary changes (Elith and Leathwick 2009). Accounting for genetic population structure, such that all local populations are not assumed to have the same physiological responses and environmental tolerances, significantly improves model predictions (Lowen et al. 2019). Thus, a lack of information about the plasticity, genetic structure, and evolutionary responses of populations, along with the uncertainty associated with climate projections, will constrain the predictability of species’ future distributions.
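
The independence assumption has a simple quantitative consequence. The sketch below assumes richness in a cell is estimated by summing per-species occurrence probabilities, which is one common way of stacking and is consistent with the non-integer richness values reported above, though the exact stacking procedure is not restated here. Under independence the variance of richness is the sum of p(1 − p) across species. The function name and probabilities are hypothetical.

```python
def stack_probabilities(probabilities):
    """Expected richness and its variance for one grid cell.

    Assumption: species occur independently given the environment, so
    richness follows a Poisson binomial distribution.
    """
    expected = sum(probabilities)
    variance = sum(p * (1.0 - p) for p in probabilities)
    return expected, variance


# Hypothetical occurrence probabilities for four species in one grid cell.
cell_probabilities = [0.9, 0.5, 0.25, 0.1]
expected, variance = stack_probabilities(cell_probabilities)
print(f"expected richness = {expected:.2f}, SD = {variance ** 0.5:.2f}")
```

If species interact, for example through competition or facilitation, the expected richness from summing is unchanged but the variance formula no longer holds, so uncertainty statements derived this way should be treated with caution.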

Predicting species distributions in the context of biological invasions and climate change can cause additional difficulties because the models assume that species are at equilibrium with their environment and that the relevant environmental gradients have been well sampled, yet such contexts can involve predictions for novel and non-equilibrium conditions (Elith and Leathwick 2009). We have tried to reduce the challenges posed by invasions by selecting species that were well established and have already spread over broad spatial and environmental ranges, but it is not possible to know if they have all reached equilibrium.

Despite these limitations, correlative species distribution models remain one of the few practical approaches to predicting past, current, and future species distributions, and their ability to use incomplete data to produce spatially comprehensive estimates for the present day and future distributions is a major advantage (Elith and Leathwick 2009; Rondinini et al. 2006). With today’s high-resolution oceanographic data and models, these estimates can be made for a broad range of spatial extents and resolutions that are useful in the context of management applications. These estimates are likely to improve as data for more factors that control species’ distributions become available and existing oceanographic models are refined further. Finally, like any analysis of cumulative invasion risk or impact, our findings are also limited by the set of species included in the analysis. Although we chose to select invasive species covering a broad range of taxonomic and life history diversity in hopes of making our results more representative, they may not reflect the patterns of the species we could not include due to lack of data or logistical constraints.

Local adaptation can cause distribution models for different populations and genetic lineages to differ markedly from one another, and for combined models to differ markedly from models based on the occurrence data for the entire species (Hällfors et al. 2016; Lowen et al. 2019). Our study uses occurrence data from within each of our focal regions to model the species distributions within each of those regions. We chose not to include occurrence data from other parts of the species’ global distributions so that our models would reflect the potential distributions and environmental responses of the genetic lineages that have established populations in each of our focal regions. However, this means that our models may not capture the entire range each species could potentially achieve if secondary introductions bring additional genetic lineages with different responses to temperature, salinity, and wave action to our shores.

Conclusions

Understanding, managing, and monitoring species invasions are complex, difficult tasks. There are many established and potential invaders to consider, each associated with environmental requirements, dispersal vectors and pathways, potential impacts and mitigation measures. Additionally, environmental change is altering the susceptibility of native ecosystems to invasion (Stachowicz et al. 2002; Bellard et al. 2018). Tools that synthesize across this complexity are needed to clarify our thinking, to identify where the cumulative risks (and thus potential impacts) of invasive species are greatest, and to shape management responses. By stacking species distribution models for 12 invasive species in the northwest Atlantic and northeast Pacific to identify invasion hotspots, this study demonstrates that there is significant spatial heterogeneity in cumulative invasion risk in the coastal northwest Atlantic and northeast Pacific, and that these patterns are likely to change over the next 55 years in response to climate change, particularly in the Pacific. Stacked species distribution models and the hotspot maps they generate provide an integrated perspective that can help focus attention and provide guidance to those tasked with generating new scientific knowledge, prioritizing locations for monitoring and implementing policy around marine invasive species, typically with limited resources.