Introduction

While species distributions are determined by various environmental factors, resources, and conditions, the role of ecosystem engineers, which directly or indirectly modulate the availability of resources to themselves and to other species, is particularly important (Jones et al. 1994; Guisan and Thuiller 2005). Autogenous engineers, such as trees or corals, modify, maintain, and/or create habitats through their own structures (Curdia et al. 2013). Gorgonian soft corals are characterized by high biodiversity, both because of the high number of gorgonian species and because they play an important role as habitat providers (Garrabou and Harmelin 2002; Coma et al. 2004; Linares et al. 2007, 2008; Buhl-Mortensen et al. 2010). Additionally, they provide numerous ecological services such as nutrient cycling, carbon sequestration, sediment stabilization, and flow redirection, and they benefit human populations by supporting fisheries and tourism, protecting coastal areas from wave and storm action, and delaying the spread of invasive algae (Costanza et al. 1998; Ballesteros 2003; Piazzi and Balata 2009; Cerrano et al. 2010; Casas-Güell et al. 2015; de Ville d'Avray et al. 2019). The changes in edaphic conditions caused by gorgonian forests influence larval settlement and recruitment processes of benthic communities by supporting diverse food webs, nurseries for numerous associated species, and biodiversity of marine communities (Reaka-Kudla 1997; Thomsen et al. 2010; Ponti et al. 2014; Liconti et al. 2022). Therefore, conservation of gorgonian forests is crucial to avoid depletion or degradation of coralline ecosystem functioning, especially in the early stages of recruitment (Cerrano et al. 2000; Cerrano and Bavestrello 2008; Coma et al. 2004; Linares et al. 2005; Ponti et al. 2014). The local disappearance of gorgonians may lead to a shift in epibenthic assemblages from crustose coralline algae to filamentous algae and reduce the resilience of coralline bioconstructions (Ponti et al. 2014).
The spatiotemporal distribution of gorgonians is determined by the combined effects of biological and environmental factors that can influence the recruitment, growth, and mortality rates in populations of different species (Gori et al. 2011). In sessile marine organisms such as gorgonians, the interaction between recruitment and survival of individuals results in patchy distribution patterns and spatially structured populations (Sebens 1991; Karlson 2006; Gori et al. 2011). Such patterns have a fundamental influence on ecological processes in the short term and on spatial structure in the long term (Illian et al. 2008).

The distribution and abundance of gorgonians are increasingly threatened by several stressors, including global warming, changes in seawater carbonate chemistry, the spread of alien species, and local anthropogenic activities (Cerrano et al. 2000; Milazzo et al. 2002; Coma et al. 2009; Huete-Stauffer et al. 2011; Verdura et al. 2019; Cebrian et al. 2018; Galil et al. 2019). Mediterranean gorgonians are generally considered cold-affinity species and are particularly sensitive to rising temperatures. In the northwestern Mediterranean Sea, mass mortality events (MMEs) in gorgonian forests have increased in frequency and intensity since the end of the last century (Martin et al. 2002; Rivetti et al. 2014; Chimienti 2021; Iborra et al. 2022). The Mediterranean Sea is a semi-enclosed basin where global warming has significant impacts, and it is considered a climate change hot spot (Tuel and Eltahir 2020). Thermal stress, water stratification, and diseases are the most likely causes of MMEs in gorgonian species associated with high water temperatures at depths greater than 40 m (Coma et al. 2009; Vezzulli et al. 2013).

We aimed to assess the potential environmental drivers that shape the regional distribution and habitat suitability of three gorgonian species, Paramuricea clavata, Eunicella cavolinii and Eunicella singularis, which have undergone MMEs in the Mediterranean Sea due to global warming (Piazzi et al. 2021; Garrabou et al. 2022). The red gorgonian P. clavata is a long-lived, slow-growing species characterized by colonies that can exceed 1.5 m in height and live for over a century. Its bathymetric range extends from 5 to 200 m (Mokhtar-Jamaï et al. 2011). E. cavolinii is very common in the north-western Mediterranean Sea and very abundant on the coasts of Marseille; it is also found in the Adriatic Sea and, at low population density, in the North Aegean Sea (Sini et al. 2015). Although it is broadly distributed, its abundance is patchy and it spans a wide depth range (5–150 m) (Russo 1985; Sini et al. 2015). E. cavolinii lives mainly on rocky hard substrates in coralligenous and pre-coralligenous habitats where light irradiance is not too low, often in association with colonies of P. clavata. The white gorgonian E. singularis is one of the most representative habitat-forming species of rocky bottoms and Mediterranean coralligenous assemblages (Pey et al. 2013). It is the only Mediterranean gorgonian that hosts symbiotic autotrophic dinoflagellates. From the very first records in 1999 up to the most recently published ones, increasing evidence of thermal-related stress events has been recorded (Cerrano et al. 2005; Cerrano and Bavestrello 2008; Gambi et al. 2018). P. clavata is one of the species most impacted by MMEs and is listed as Vulnerable in the IUCN Red List (IUCN Red List, https://www.iucnredlist.org/). E. cavolinii and E. singularis are both listed as Near Threatened in the IUCN Red List. E. singularis has been particularly affected by MMEs because it has a very low recovery capacity.
According to Fava et al. (2010), species of the genus Eunicella are more resistant and resilient than P. clavata, and assemblages dominated by E. cavolinii tolerate higher radiation levels than assemblages with P. clavata. The thermotolerance of P. clavata and E. cavolinii could be mediated by the bacterial communities of the two species (Tignat-Perrier et al. 2022).

Species distribution modelling (SDM) and ecological niche modelling (ENM) are quantitative empirical methods used to predict the probability of occurrence of species or habitats using a suite of environmental predictor variables (Elith et al. 2006; Elith and Leathwick 2009; Quattrini et al. 2013; Melo-Merino et al. 2020; Charney et al. 2021). SDMs and ENMs can predict species’ distributions in unsampled locations or at different time scales, as well as niche shifts under processes of disturbance, invasion, or speciation (Murphy and Smith 2021). The species–environment relationships are developed using species location data (abundance, occurrence) and those environmental variables thought to influence species distributions. Assessing species distributions under present and future environmental conditions could play an important role in studying the impacts of global change on species habitats and extinction rates in natural ecosystems (Franklin 2010; Van Echelpoel et al. 2015). In the framework of SDMs and ENMs, several statistical methods and algorithms are available, including a growing number based on machine learning (Williams et al. 2009; Elith et al. 2008, 2011; Assis et al. 2014; McKenna and Kocovsky 2020; Bellin et al. 2022).

Here, we applied an SDM framework combined with machine learning algorithms to assess the potential environmental drivers that shape the regional distribution and habitat suitability of three gorgonian species, P. clavata, E. cavolinii and E. singularis, in the Mediterranean Sea. Our modelling framework was used to identify the location and overlap of vulnerable habitats and areas in need of further sampling, and to predict their future habitat suitability under the worst-case IPCC scenario RCP8.5 (Bellin et al. 2022). We hypothesized that global warming may negatively affect the predicted habitat suitability of these gorgonian species in the Mediterranean Sea. We compared results between genera and, within the genus Eunicella, between the symbiotic E. singularis and the non-symbiotic E. cavolinii. According to the literature, we hypothesized that P. clavata would be at a higher risk than species of the genus Eunicella. We further hypothesized that the sensitivity of the symbiotic algae (zooxanthellae) may affect the susceptibility of the holobiont E. singularis to thermal stress (Fitt et al. 2001; Baker et al. 2004; Fava et al. 2010).

Materials and methods

Species presence data and environmental variables

For all species, presence points within the Mediterranean Sea (Fig. 1) were obtained from the Global Biodiversity Information Facility database (GBIF.org) (dataset keys in Supplementary Table 1) using the R package rgbif (Chamberlain et al. 2022) and from Liconti et al. (2022). Although P. clavata was also recorded in the Atlantic Ocean and in the Aegean Sea (Boavida et al. 2016) and E. cavolinii was observed along the Tunisian and Algerian coasts and in the Aegean Sea and Marmara Sea (Sini et al. 2015; Masmoudi et al. 2016), our collection of presence points was limited to the northwestern Mediterranean Sea, the Ligurian Sea, the Tyrrhenian Sea, the Ionian Sea, and the Adriatic Sea because the majority of occurrences were recorded in this area.

Fig. 1

Occurrences of P. clavata, E. cavolinii and E. singularis reported as black points within study area in the Mediterranean Sea

Presence data were recorded as spatial points whose longitude and latitude coordinates referred to the Coordinate Reference System (CRS) WGS84. The dataset downloaded from GBIF included the coordinate uncertainty, in meters, of each occurrence. To reduce the geo-localization error, each occurrence with an uncertainty higher than 250 m was discarded. The duplicated function in R (R Core Team 2021) was used to delete duplicated records and avoid data redundancy. Geo-localization mismatches were also checked and excluded when found (Supplementary Table 1). The final cleaned dataset contained a total of 2474 data points: 843 for P. clavata, 938 for E. cavolinii and 693 for E. singularis.
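The two cleaning rules described above (the 250 m uncertainty cut-off and duplicate removal) can be illustrated with a minimal Python sketch. The authors worked in R; the `Occurrence` record layout and the function name below are hypothetical, and treating records as duplicates when species and coordinates match exactly is an assumption.

```python
from typing import NamedTuple


class Occurrence(NamedTuple):
    """Hypothetical record layout for one presence point."""
    species: str
    lon: float            # decimal degrees, WGS84
    lat: float            # decimal degrees, WGS84
    uncertainty_m: float  # coordinate uncertainty in metres


MAX_UNCERTAINTY_M = 250.0  # threshold stated in the text


def clean_occurrences(records):
    """Drop records above the uncertainty threshold, then drop duplicates.

    Assumption: two records are duplicates when species, longitude and
    latitude are identical; the first one encountered is kept.
    """
    seen = set()
    cleaned = []
    for rec in records:
        if rec.uncertainty_m > MAX_UNCERTAINTY_M:
            continue  # too imprecise to be matched to a ~1 km grid cell
        key = (rec.species, rec.lon, rec.lat)
        if key in seen:
            continue  # redundant copy of a point already kept
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

Records with a missing uncertainty value are not handled here, since the text does not say how they were treated.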

To model present-day (2000–2014) habitat suitability for the selected species, 23 physicochemical and environmental variables were retrieved from Bio-Oracle (Assis et al. 2018). Four geophysical variables were retrieved from MARSPEC (Sbrocco and Barber 2013) (Table 1) with the R package sdmpredictors (Bosch and Fernandez 2022). All environmental variables were available at a resolution of 30 arc-seconds (~1 km²).

Table 1 List of the selected physico-chemical and geophysical environmental variables

For the predicted future climatic conditions (2040–2050), the RCP8.5 emission scenario (Schwalm et al. 2020) was selected, together with the three environmental variables available from Bio-Oracle: current velocity, temperature, and salinity of both surface and benthic layers (Assis et al. 2017). The future pH of the surface layer was estimated using the annual trend of pH reduction (−0.0044 units per year) calculated from high-frequency observational data in the Mediterranean Sea (Flecha et al. 2015) for the years from 2014 to 2045. The other environmental variables were kept constant at present values. All environmental variables were used at a resolution of 30 arc-seconds in the CRS WGS84.
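As a worked illustration of the pH projection, the linear trend can be applied to a present-day value as follows. This is a sketch only: the function name and the example pH of 8.10 are hypothetical, and applying the trend as a simple linear offset over 2014 to 2045 is our reading of the text.

```python
PH_TREND_PER_YEAR = -0.0044  # annual pH change reported by Flecha et al. (2015)


def project_surface_ph(ph_present, start_year=2014, end_year=2045):
    """Linearly extrapolate surface pH from start_year to end_year."""
    return ph_present + PH_TREND_PER_YEAR * (end_year - start_year)


# Hypothetical grid cell with a present-day surface pH of 8.10:
# 31 years x (-0.0044) = -0.1364 pH units.
future_ph = project_surface_ph(8.10)
```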

Modelling approach

Spatial thinning and environmental variable filtering

To reduce the spatial bias of the occurrence data, an operation of spatial thinning was applied with a minimum distance of 3 km among points (Steen et al. 2020). This procedure was carried out with the R package spThin (Aiello-Lammens et al. 2015).
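The idea behind distance-based thinning can be sketched in a few lines of Python. This is a simplified greedy variant written for illustration, not the randomized algorithm implemented in spThin; the function names are hypothetical and great-circle distances are computed with the haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def thin_points(points, min_dist_km=3.0):
    """Greedy thinning: keep a point only if it lies at least
    min_dist_km away from every point already kept.

    points is a list of (lon, lat) tuples; the result depends on input order.
    """
    kept = []
    for lon, lat in points:
        if all(haversine_km(lon, lat, k_lon, k_lat) >= min_dist_km for k_lon, k_lat in kept):
            kept.append((lon, lat))
    return kept
```

Because the greedy result depends on the order of the input, a randomized approach that repeats the thinning many times and keeps the largest retained set is generally preferable in practice.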

To evaluate multicollinearity among the selected environmental variables, the variance inflation factor analysis (VIF) was performed (James et al. 2014). A conservative threshold of VIF = 4 was used and only environmental variables with VIF values ≤ 4 were kept within the modelling framework. The VIF analysis was carried out with R package usdm (Naimi et al. 2014).
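For reference, the VIF of predictor j is 1 / (1 − R²_j), where R²_j is obtained by regressing predictor j on all the others; equivalently, it is the j-th diagonal element of the inverse correlation matrix. The pure-Python sketch below uses that identity together with a stepwise exclusion rule (drop the predictor with the largest VIF until all are at or below the threshold). The stepwise rule and all function names are assumptions for illustration; the text does not state which usdm routine was used.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    centred = [[v - sum(c) / n for v in c] for c in columns]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


def stepwise_vif_filter(named_columns, threshold=4.0):
    """Repeatedly drop the variable with the largest VIF until all VIF <= threshold."""
    names = list(named_columns)
    while len(names) > 1:
        values = vif([named_columns[name] for name in names])
        worst = max(range(len(names)), key=lambda i: values[i])
        if values[worst] <= threshold:
            break
        del names[worst]
    return names
```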

To identify the most informative environmental variables and to reduce model complexity, we applied lasso regression (Tibshirani 1996). Lasso is a regression technique based on penalized maximum likelihood and controlled by a shrinkage parameter (λ). When λ = 0, no shrinkage of the coefficients is performed; as λ increases, the coefficients are shrunk more strongly towards zero. The lasso regression was run using a binomial family with 100 different values of λ. Tenfold cross-validation was performed, and the best set of predictors was selected as the one with the minimum average mean absolute error (MAE). The MAE is the average of the absolute differences between predictions and actual observations over all test samples, with all individual differences given equal weight. The lasso regression was performed with the R package glmnet (Friedman et al. 2010).
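The selection criterion can be made concrete with a short Python sketch: the MAE is computed per cross-validation fold, averaged for each value of λ, and the λ with the smallest average is retained. The function names and the dictionary layout are hypothetical; fitting the lasso itself is left to a dedicated library.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def select_lambda(cv_errors):
    """Return the lambda whose per-fold MAE values have the smallest mean.

    cv_errors maps each candidate lambda to a list of per-fold MAE values.
    """
    return min(cv_errors, key=lambda lam: sum(cv_errors[lam]) / len(cv_errors[lam]))
```

With presence coded as 1 and pseudo-absence as 0, the MAE measures how far, on average, the predicted probabilities fall from the observed class.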

Pseudo-absences generation

For all species, the occurrences were presence-only data, consisting of the locations of species observations but lacking absence points (Renner et al. 2015). VanDerWal et al. (2009) carried out modelling experiments with MAXENT on 12 species and found that the trade-off between the geographic extent used to sample the pseudo-absences, model performance, and the importance of environmental factors was optimal at 200 km. Moreover, Barbet-Massin et al. (2012) recommended that, when machine learning algorithms are integrated with species distribution models (SDMs) and the number of occurrences is less than 1000, pseudo-absences should be drawn with a balanced design using a geographical exclusion of 2 latitudinal degrees from presences. Pseudo-absences sampled from small areas might yield spurious results, whereas sampling over a broad area might produce model overfitting and oversimplified relationships controlled by a few environmental variables. In this study, random sampling with a balanced design and a buffer radius of 150 km was used as a trade-off. The pseudo-absence sampling was carried out with the R package ENMTools (Warren and Dinnage 2022).
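One possible reading of this design is sketched below in Python: as many pseudo-absences as presences (balanced design) are drawn at random from locations lying within 150 km of at least one presence. This interpretation of the buffer, the bounding-box rejection sampling, and every name in the code are assumptions made for illustration; a real workflow would also restrict candidates to marine grid cells.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sample_pseudo_absences(presences, bbox, radius_km=150.0, seed=0, max_tries=100_000):
    """Draw len(presences) random points inside the buffer around presences.

    presences: list of (lon, lat); bbox: (lon_min, lat_min, lon_max, lat_max).
    Candidates are drawn uniformly in longitude/latitude (not equal-area)
    and accepted when within radius_km of any presence.
    """
    rng = random.Random(seed)
    lon_min, lat_min, lon_max, lat_max = bbox
    sampled = []
    tries = 0
    while len(sampled) < len(presences):
        tries += 1
        if tries > max_tries:
            raise RuntimeError("bounding box barely overlaps the presence buffers")
        lon = rng.uniform(lon_min, lon_max)
        lat = rng.uniform(lat_min, lat_max)
        if any(haversine_km(lon, lat, p_lon, p_lat) <= radius_km for p_lon, p_lat in presences):
            sampled.append((lon, lat))
    return sampled
```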

Model selection

To model habitat suitability of the three species, three Machine Learning (ML) models were tested: XGBoost, Random Forest (RF) and the K-nearest neighbour (KNN) (Bellin et al. 2022; Konowalik and Nosol 2021; Valavi et al. 2021). The XGBoost, Random Forest (RF) and KNN models were fitted by R packages xgboost, ranger and caret (Chen et al. 2022; Wright and Ziegler 2017; Kuhn 2021).

Cross-validation is one of the most widely used data resampling methods to estimate the true prediction error of models and to tune model parameters (Berrar 2018). To select the best algorithm to model habitat suitability, the tenfold cross validation was repeated 10 times (100 model runs).
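The resampling scheme (10 folds repeated 10 times, giving 100 train/test splits) can be sketched as an index generator in Python. The function name and the seed are hypothetical, and no stratification by class is applied here because the text does not mention it.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV.

    Each repeat reshuffles the sample indices and partitions them into k
    folds; every sample appears in exactly one test fold per repeat.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(k):
            test = indices[fold::k]
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield rep, fold, train, test
```

Each split would be used to fit a model on the training indices and to compute a performance metric on the held-out fold, producing 100 values per algorithm.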

To quantify the performance of each machine learning model, we used the area under the receiver operating characteristic curve (AUC). The AUC ranges from 0 to 1 and measures the area under the entire receiver operating characteristic (ROC) curve. An AUC of 1 represents a perfect predictor, while 0.5 represents a random predictor. Differences in model performance were tested with pairwise Wilcoxon rank sum tests applying the Holm correction.
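The AUC can be computed without tracing the ROC curve explicitly, because it equals the probability that a randomly chosen presence receives a higher score than a randomly chosen (pseudo-)absence, with ties counted as one half. A minimal Python sketch of this pairwise formulation follows; the function name is hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for presence, 0 for (pseudo-)absence; scores: model outputs.
    Each presence/absence pair contributes 1 if the presence scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

The pairwise form costs O(n²) comparisons, which is fine for a few hundred points per species; a rank-based version is preferable for large datasets.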

Global sensitivity and uncertainty analysis (GSUA)

The global sensitivity and uncertainty analysis (GSUA) was used to identify the most important environmental variables that shape the habitat suitability of the species and to disentangle the model's interaction terms among environmental variables (Convertino et al. 2014; Pianosi et al. 2016). For each environmental variable, a uniform distribution was used, and 10,000 observations were sampled using a sample design based on quasi random numbers. To quantify the first order and total order indices (Si and Ti) of each environmental variable, the Sobol method was computed using 1000 permutations (Saltelli et al. 2008). The second order interactions (Si+j) were quantified for the environmental variable with the highest importance. GSUA was carried out using the R package sensobol (Puy et al. 2021).
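For reference, the variance-based indices mentioned above have the following standard definitions (Saltelli et al. 2008), where Y is the model output (habitat suitability), X_i is the i-th environmental variable, and X_{~i} denotes all variables except X_i. The notation is ours; the second-order index is written S_{ij} here and corresponds to the Si+j used in the text.

```latex
S_i = \frac{V_{X_i}\!\left(E_{X_{\sim i}}[\,Y \mid X_i\,]\right)}{V(Y)},
\qquad
T_i = \frac{E_{X_{\sim i}}\!\left(V_{X_i}[\,Y \mid X_{\sim i}\,]\right)}{V(Y)},
\qquad
S_{ij} = \frac{V\!\left(E[\,Y \mid X_i, X_j\,]\right)
              - V\!\left(E[\,Y \mid X_i\,]\right)
              - V\!\left(E[\,Y \mid X_j\,]\right)}{V(Y)}
```

The first-order index S_i measures the share of output variance explained by X_i alone, while the total-order index T_i also includes every interaction involving X_i, so T_i − S_i indicates how much of a variable's influence acts through interactions.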

Ecological niche, response curves, habitat suitability and spatial shift (present and future)

To visualize the ecological niche of the three gorgonian species, the whole set of environmental variables was used to compute a Principal Component Analysis (PCA). A set of 10,000 random samples was drawn across the study area and used to build the PCA components. To avoid overfitting, an additional independent set of 1000 random samples was projected into the PCA space. For each species, the habitat suitability values of each independent sample were plotted to highlight each ecological niche and the differentiation between species.

For each of the three gorgonian species, the partial response curve of the most important environmental variable was estimated with the other variables fixed at their mean values, except for the environmental variable showing the highest second-order interaction with the most important one, which was fixed in turn at the extremes and at the mean value of its environmental gradient (Bellin et al. 2022).

To obtain the present (2000–2014) and future (2040–2050) predictions of habitat suitability within the study area, the selected algorithm was trained and the model output (habitat suitability) was predicted across the study area. The niche overlap between pairs of species was computed using Schoener's D index (Schoener 1968) and the similarity I index (Warren et al. 2008), based on the habitat suitability values across the study area. Both indices range from 0 to 1, where zero means no overlap and one means complete overlap. To estimate the spatial shift of occupancy between present and future conditions, a thresholding procedure was applied according to Liu et al. (2013). The threshold that maximizes the sum of sensitivity and specificity was selected, and the model output (habitat suitability) was converted into a binary representation: present (1) or absent (0). The spatial shift of occupancy was computed as the difference between present and future conditions considering the occupied areas. To highlight potential stable states and spatial shifts of occupancy between future and present climatic conditions, a weighted generalized linear model (glm) with a binomial error distribution was implemented. For each location and species, the difference in sea surface temperature between future and present climatic conditions was used as the independent variable. This difference represents an important environmental change across space and time in the Mediterranean Sea and was used to predict the probability of habitat loss. For each location and species, the spatial shift of occupancy between future and present climatic conditions was computed as the difference between presence and absence and used as the dependent variable. For each species, the glm’s slope coefficient was tested by computing z-values. Moreover, the goodness of fit was tested using an analysis of deviance with a Chi-squared test against a glm with only the intercept.
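The two overlap indices and the thresholding rule can be sketched in Python as follows. Suitability values are first normalised to sum to one over the study area; D is then one minus half the summed absolute differences, and I is one minus half the summed squared differences of the square roots (the corrected form of the Warren et al. index). The function names are hypothetical, and classifying a cell as present when its score is greater than or equal to the threshold is an assumed convention.

```python
import math


def _normalise(values):
    """Rescale suitability values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def schoener_d(suit_a, suit_b):
    """Schoener's D: 1 - 0.5 * sum(|pA_i - pB_i|)."""
    pa, pb = _normalise(suit_a), _normalise(suit_b)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(pa, pb))


def warren_i(suit_a, suit_b):
    """Similarity I: 1 - 0.5 * sum((sqrt(pA_i) - sqrt(pB_i))^2)."""
    pa, pb = _normalise(suit_a), _normalise(suit_b)
    return 1.0 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(pa, pb))


def max_sss_threshold(labels, scores):
    """Threshold maximising sensitivity + specificity.

    Candidate thresholds are the observed scores; a point is classified
    as present when score >= threshold (assumed convention). On ties the
    lowest such threshold is returned.
    """
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    best_threshold, best_value = None, -1.0
    for t in sorted(set(scores)):
        true_pos = sum(1 for label, s in zip(labels, scores) if label == 1 and s >= t)
        true_neg = sum(1 for label, s in zip(labels, scores) if label == 0 and s < t)
        value = true_pos / n_pos + true_neg / n_neg
        if value > best_value:
            best_threshold, best_value = t, value
    return best_threshold
```

Both indices compare the relative distribution of suitability across cells, so they are insensitive to a uniform rescaling of either suitability map.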

The PCA was computed with the R package factoextra (Kassambara and Mundt 2020). Schoener's D index and the similarity I index were computed with the R package dismo (Hijmans et al. 2022). The threshold selection was performed with the R package PresenceAbsence (Freeman et al. 2008). The glm was fitted with the R package stats (R Core Team 2021).

Results

After the spatial thinning procedure, a total of 173 presences for P. clavata, 189 presences for E. cavolinii, and 147 presences for E. singularis were obtained.

According to the VIF analysis, several environmental variables showed multicollinearity problems (VIF > 4). For the benthic layer these included temperature, nitrate, phosphate, dissolved oxygen, and phytoplankton, and for the surface layer they included salinity, diffuse attenuation, and PAR. These variables were therefore removed from the modelling framework, leaving a total of 15 variables (Supplementary Table 2). For the three species, the lasso regression showed that the minimum average MAE was reached by retaining all the remaining environmental variables in the modelling framework. The average MAE values for P. clavata, E. cavolinii and E. singularis were 0.40, 0.38, and 0.41, respectively.

For all species, the repeated cross-validation showed that Random Forest (RF) and XGBoost produced higher median AUC values (Fig. 2) than KNN (Wilcoxon rank sum test, p < 0.0001). The difference in model performance between RF and XGBoost was not significant (p = 0.43). XGBoost was selected for further analyses to allow comparison with other boosting approaches used to model the habitat suitability of P. clavata (Boavida et al. 2016).

Fig. 2

The performance metric AUC was reported to assess the model selection. The horizontal dashed line represented the baseline of a random model

The first three components of PCA explained 74% of the total variance (43.1% PCA1, 19.8% PCA2 and 11.1% PCA3) (Fig. 3). Bathymetry (-0.86), silicate (0.33) and current velocity (-0.36) of the surface layer were the variables with the highest loading values for PCA1 (Supplementary Table 3). The E-W aspect strongly influenced PCA2 (-0.96), while silicate of the surface layer (-0.51), bathymetry (0.43) and temperature of the surface layer (0.35) showed the highest contribution to PCA3 (Supplementary Table 3). For both P. clavata and E. cavolinii, the habitat suitability values in the PCA space showed a clear differentiation from low to high values along the PCA1 axis, while for E. singularis the habitat suitability split from low to high values along the PCA1 and PCA3 axes.

Fig. 3

The loadings of the environmental variables were represented in the PCA space built on 10,000 random sample points of the study area (a). (S) and (B) referred to the surface and benthic layer, respectively. The barplot represented the percentage of explained variance for the 15 dimensions of the PCA (b). For each species, the scatterplots represented the 1000 points sampled from the study area in the PCA space [P. clavata (c), E. cavolinii (d) and E. singularis (e)]. The predicted habitat suitability values were reported as a viridis color gradient

For all the three species, the most important variable ranked by GSUA was bathymetry, with the highest first and total effect indices (P. clavata: Sbathymetry = 0.57 and Tbathymetry = 0.82, E. cavolinii: Sbathymetry = 0.44 and Tbathymetry = 0.61, E. singularis: Sbathymetry = 0.35 and Tbathymetry = 0.46) (Fig. 4). Another important geophysical environmental variable was concavity, especially for E. singularis (P. clavata: Sconcavity = 0.06 and Tconcavity = 0.15, E. cavolinii: Sconcavity = 0.12 and Tconcavity = 0.20, E. singularis: Sconcavity = 0.31 and Tconcavity = 0.45). Other environmental variables appeared not important in shaping habitat suitability of any of the three species, with first and total effects values near or equal to zero (surface layers: nitrate and calcite, benthic layer: silicate and bottom light).

Fig. 4

For each environmental variable and species, the sensitivity indices (Sobol indices) were reported in different colours: first order (Si) in grey and total order (Ti) in orange. The error bars represented the standard errors (1000 permutations). (S) and (B) referred to the surface and benthic layer, respectively

For P. clavata the most important second order interaction with bathymetry was silicate in the surface layer followed by concavity (Sbathymetry+silicate = 0.078 and Sbathymetry+concavity = 0.059) and for E. cavolinii was salinity in the benthic layer followed by concavity (Sbathymetry+salinity = 0.050 and Sbathymetry+concavity = 0.040). For E. singularis, the most important second order interaction with bathymetry was the concavity (Sbathymetry+concavity = 0.054) (Supplementary Fig. 1).

According to the response curves, habitat suitability decreased with depth: below 1000 m for P. clavata and E. cavolinii, and below 250 m for the zooxanthellate E. singularis (Fig. 5). For P. clavata, the habitat suitability between 0 and 1000 m was higher at the lowest silicate concentration (1.83 mol.m−3). For E. cavolinii, the habitat suitability between 0 and 1000 m was higher at low and medium salinity (36.9 PSS). For E. singularis, the habitat suitability was highest between 0 and 250 m. Between 250 and 750 m, the habitat suitability was higher in valleys than on hills. For all three gorgonian species, the highest values of habitat suitability were concentrated close to the coasts. For E. singularis, high values of habitat suitability were recorded in the North and Central Adriatic Sea, where the seafloor is shallower than in other areas of the Mediterranean Sea (Fig. 6). Under future climatic conditions, the habitat suitability of P. clavata was expected to shift from lower to higher latitudes, mainly in the Adriatic Sea. For E. cavolinii, the main pattern of variation between present and future climatic conditions was along the Tyrrhenian coast, where the main reduction in habitat suitability was predicted. Under future climatic conditions, the reduction in habitat suitability of E. singularis was expected mainly along the coasts of the Adriatic Sea.

Fig. 5

For each species [P. clavata (a), E. cavolinii (b) and E. singularis (c)], response curves of the most important variable (bathymetry) were computed considering the extremes and the mean value of the environmental variable with the highest interaction, while the other variables were kept fixed at their mean values. For concavity, only the extreme values, hill or valley, were considered. (S) and (B) referred to the surface and benthic layer, respectively

Fig. 6

Present and future (2040–2050 RCP8.5) habitat suitability of the three gorgonian species: P. clavata (a, d), E. cavolinii (b, e) and E. singularis (c, f)

To estimate the spatial shift of occupancy between present and future conditions, a thresholding procedure was applied: the thresholds were 0.5 for P. clavata, 0.38 for E. cavolinii and 0.39 for E. singularis. In the future, P. clavata was expected to increase its occupancy area by 757 km² with respect to the present (+0.6%) (Fig. 7), while E. cavolinii and E. singularis were expected to reduce their occupancy areas by 59,335 km² and 23,341 km² (−49% and −15%), respectively (Fig. 7). Under present climatic conditions, the Schoener's D index values computed for species pairs were: P. clavata–E. cavolinii = 0.70, P. clavata–E. singularis = 0.63 and E. cavolinii–E. singularis = 0.62, while the similarity I index values were: P. clavata–E. cavolinii = 0.89, P. clavata–E. singularis = 0.84 and E. cavolinii–E. singularis = 0.82. Under future climatic conditions, the Schoener's D index values computed for species pairs were: P. clavata–E. cavolinii = 0.60, P. clavata–E. singularis = 0.72 and E. cavolinii–E. singularis = 0.61, while the similarity I index values were: P. clavata–E. cavolinii = 0.86, P. clavata–E. singularis = 0.90 and E. cavolinii–E. singularis = 0.86. Under future climatic conditions, the niche overlap between P. clavata and E. cavolinii decreased with respect to the present, while the niche overlap between P. clavata and E. singularis increased. For E. cavolinii and E. singularis, the similarity I index indicated that niche overlap increases in the future (Fig. 7).

Fig. 7

Area of suitable habitat (km²) in present and future (2040–2050 RCP8.5) climatic conditions of the three gorgonian species (a). For each pair of species, the niche overlap was computed with Schoener’s D index (black) and similarity I index (grey), in present and future climatic conditions (b)

The glm model indicated a positive relationship between probability of habitat loss and sea temperature increase from present to future (Fig. 8). The slope coefficient was significant for P. clavata and E. singularis, but it was not significant for E. cavolinii. In fact, along the whole range of temperature increments of the surface layer, the habitat loss probability was the highest and most consistent for E. cavolinii (Supplementary Table 4). The goodness of fit for P. clavata and E. singularis against a model with only the intercept was significant (p < 0.0001), while for E. cavolinii it was not significant (p = 0.83). The probability of habitat loss for P. clavata increased approximately linearly with difference in temperature of the surface layer, while for E. singularis it rose more steeply.

Fig. 8

The difference in sea temperature of the surface layer between future and present conditions was reported as a color gradient. For each species [P. clavata (a), E. cavolinii (b) and E. singularis (c)], habitat loss and gain were reported on the delta temperature maps. For each species, the relationship between Δ temperature (difference between future and present climatic conditions) of the surface layer and the probability of habitat loss estimated by the glm was reported with 95% confidence intervals (grey shading) [P. clavata (d), E. cavolinii (e) and E. singularis (f)]. (S) referred to the surface layer

Discussion

In this study, we explored a species distribution modelling framework combined with machine learning algorithms to assess the potential environmental drivers that shape the regional distribution of the three Mediterranean gorgonian species P. clavata, E. cavolinii and E. singularis. The modelling framework was also used to predict their future habitat suitability under the worst-case IPCC climate change scenario, RCP8.5. For all species, the supervised machine learning algorithms XGBoost and RF reached the highest values of the AUC performance metric. XGBoost was chosen to model the habitat and ecological niche of the three species, to assess the factors that influence their distribution patterns, and to predict their response to climate change, allowing comparison with Boavida et al. (2016), who used a similar algorithm to model the habitat suitability of P. clavata. They showed that temperature (11.5–25.5 °C) and a geophysical variable, the slope, are the most important predictors defining the niche of P. clavata. The prediction from these variables modelled a wider distribution than previously known. In agreement with the observations by Sini et al. (2015), Boavida et al. (2016) and Masmoudi et al. (2016), our modelling framework predicted suitable habitat for P. clavata and E. cavolinii along the Algerian and Tunisian coasts; it may be useful for identifying new populations in poorly sampled areas such as the South and East of the Mediterranean and for considering different bathymetries. Within the Western Mediterranean, where most of the quantitative studies were carried out, a decline of between 10 and 80% was observed for several shallow coastal populations of P. clavata (Linares et al. 2008; Gómez-Gras et al. 2021). Information is generally lacking for populations in other parts of the Mediterranean and in deep environments (below 40 m depth).
Gori et al. (2013) highlighted the importance of exploring the ecological and evolutionary features of the deep sublittoral gorgonian populations and their connectivity with shallow populations exposed to more frequent perturbations. The distribution of such species depends on a combination of biotic and abiotic factors, but the predicted distribution and habitat suitability will also depend on the resolution and extent of the data used for modelling (Connor et al. 2019). Despite inherent uncertainties, the modelling framework may serve as a valuable first step in planning field surveys and identifying areas where further sampling is needed (Boavida et al. 2016). By analyzing variable importance and response curves, we showed that the three gorgonian species all responded to similar environmental conditions and that their spatial distribution in the study area is influenced primarily by bathymetry. Several studies reported that gorgonian habitat suitability is strongly related to geophysical factors such as bathymetry, which is related to seafloor topography and hydrography (Yesson et al. 2012; Kinlan et al. 2020). In the northwestern Mediterranean Sea, P. clavata and E. singularis have been reported to occur in areas characterized by intense benthic currents (Gori et al. 2013). Bathymetry, which is related to topography, temperature, water flow, and the concentration of nutrients, particulate organic matter and microzooplankton, affects the physiology and ecological interactions of gorgonians (Coma and Ribes 2003; Coma et al. 2004; Ezzat et al. 2013; Mortensen and Buhl-Mortensen 2004; Jenkins and Steven 2021). According to our results, the critical depth for P. clavata and E. cavolinii is about 1000 m, whereas the reported bathymetric ranges of the two species are 5 to 200 m and 5 to 150 m, respectively (Russo 1985; Mokhtar-Jamaï et al. 2011; Sini et al. 2015; Gori et al. 2019).
These discrepancies could be due to the resolution of the bathymetric layer (1 km × 1 km). In grid cells encompassing rocky shores, bathymetry may be very deep and hide the true slope of the shoreline; however, these results fall within the typical range of deep-sea gorgonians in the Mediterranean Sea, which extends from 200 m to 1000 m (Mortensen and Buhl-Mortensen 2005). Gori et al. (2013) found populations of P. clavata and E. singularis in the deep sublittoral, emphasizing the importance of studying the distribution of gorgonian species over a large bathymetric range.
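
As a purely illustrative aside, the AUC metric used above to rank the algorithms can be read as the probability that a randomly chosen presence record receives a higher suitability score than a randomly chosen absence (or background) record. The minimal Python sketch below computes it by pairwise comparison. The function name, the labels, and the two score vectors are hypothetical and are not taken from the study data or its actual implementation.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence record")
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Hypothetical presence (1) / absence (0) labels and predicted suitabilities
# from two candidate models; these numbers are invented for illustration.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
model_b = [0.6, 0.7, 0.3, 0.5, 0.4, 0.2]

auc_a = auc_from_scores(labels, model_a)
auc_b = auc_from_scores(labels, model_b)
best = "model_a" if auc_a >= auc_b else "model_b"
print(f"AUC model_a={auc_a:.3f}, model_b={auc_b:.3f}, selected: {best}")
```

In this toy case the model with the higher AUC would be retained, which mirrors the selection logic described above without reproducing the actual validation procedure.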

For E. singularis, the zooxanthellate species, the critical depth was shallower (about 250 m) than for the other two species, owing to the light dependence of its algal endosymbionts, which transfer photosynthetically produced carbon to E. singularis and provide additional autotrophic nutrition. However, the occurrence database is biased towards shallow waters, possibly because a large percentage of occurrence data comes from recreational and scientific (not technical) diving, which is limited to 40–50 m depth. It might be helpful to encourage deeper surveys by emphasizing the potential role that citizen science and technical diving could play in data collection (Pulido Mantas et al. 2022).

For each species, bathymetry showed a second-order interaction with several environmental variables. For P. clavata, habitat suitability was highest between 0 and 1000 m at the low silicate concentrations typical of an oligotrophic sea such as the Mediterranean, where silicate concentration ranges from 1 to 4 μM (Bergamasco and Malanotte-Rizzoli 2010; Schroeder et al. 2010; Sospedra et al. 2018). Our result is consistent with previous studies that found a negative relationship between coral abundance, suitability, and silicate concentration (Davies et al. 2008; Reyes Bonilla and Cruz Piñón 2002; Barbosa et al. 2020). Silicate concentration is associated with primary productivity in surface waters and would be highest in zones depleted of diatoms, with low carbon fixation rates and moderate amounts of energy input for corals (Reyes Bonilla and Cruz Piñón 2002). Species suitability distribution is clearly affected by temperature, depth, and dissolved oxygen, but silicate concentrations were also indicated to have a negative relationship with coral species richness at different latitudes. It seems reasonable that high silicate concentrations are associated with low primary productivity and, through a cascade effect, with low gorgonian abundance and suitability.

For E. cavolinii, habitat suitability was high between 0 and 1000 m depth at low and medium salinity. In general, gorgonians appear to tolerate hypersaline conditions more readily than reduced salinity, with an optimal range of 29.5–42.5 PSS (Kupfner Johnson and Hallock 2020). For E. singularis, habitat suitability was highest between 0 and 250 m. Between 250 and 750 m, suitability was higher in valleys than in hills. Plan concavity and curvature are important topographical shapes that affect the convergence and divergence of water flows and influence food availability for benthic suspension feeders. The enhanced water transport of food particles is due to local advection and resuspension or vertical currents from deeper waters (Boavida et al. 2016). Indeed, E. singularis is commonly found on horizontal or sloping sediment-covered substrates exposed to irradiance ranging from 3 to 44% of surface values. This observation could explain the importance of concavity in relation to bathymetry, where a slight difference in habitat suitability was observed between valleys and hills.

Under present conditions, the modelling approach correctly captured the expected distribution of the species, mainly along rocky coasts of the Ligurian and Tyrrhenian Seas (Italy), in Corsica (France), around the island of Elba (Italy), and in the Gulf of Lion (France). In addition, the presence of populations of the three species was detected around the islands of Sardinia (Italy). High habitat suitability values were predicted for E. singularis in the northern Adriatic Sea. This is probably a misleading result arising from the omission of substrate type as a predictor. At least for the Italian part of the northern Adriatic, high sedimentation rates and the predominance of sandy bottoms greatly limit the distribution and occurrence of gorgonian species. Substrate type could not be considered in the modelling because there are no high-resolution products with full basin coverage for hard substrates. Future projections under the worst-case IPCC climate change scenario, RCP8.5, showed a decrease in habitat suitability for both Eunicella species, especially E. cavolinii, over the next 30 years. The range of this species in our study area is expected to decrease by a startling 49%. Contrary to our initial hypothesis, P. clavata was predicted to increase its range throughout the study area, with habitat suitability shifting from lower to higher latitudes, mainly in the Adriatic Sea. As previously highlighted, the predicted shift is possibly a consequence of not including substrate type among the input variables. The availability of these data could improve the distribution assessments of rocky-bottom sessile species. For E. singularis, a 15% reduction in range was predicted, suggesting that sensitivity of symbiotic algae (zooxanthellae) is not the primary cause of the corresponding susceptibility of Eunicella to thermal stress (Kupfner Johnson and Hallock 2020). E. cavolinii appears to be the most endangered species. 
This result could be due to the habitat suitability of the species, which we found to be higher at low and medium salinity (36.9 PSS). Increasing temperatures should lead to more evaporation, resulting in increasing water salinity, especially in the Atlantic and Mediterranean Seas. However, reduced input from the main rivers in the area, especially the Po River in the northern Adriatic, might also affect future salinity values. In our prediction, salinity in the worst emission scenario ranged between 37.16 and 39.21 PSS, higher than the optimal values of the response curve of E. cavolinii (Fig. 5) (Topçu and Oztürk 2016).
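
To make the reported range changes easier to interpret, the sketch below shows one common way such a percentage can be derived from present and future suitability maps: suitability is converted to suitable or unsuitable with a threshold, and the numbers of suitable cells are compared. This is an assumption for illustration only. The threshold of 0.5, the equal-area grid cells, the function name, and the toy values are all hypothetical and do not reproduce the procedure or data of this study.

```python
def range_change_percent(present, future, threshold=0.5):
    """Percent change in the number of suitable cells between two suitability maps.

    Assumes equal-area grid cells and a hypothetical suitability threshold.
    """
    if len(present) != len(future):
        raise ValueError("both maps must cover the same grid cells")
    n_present = sum(1 for s in present if s >= threshold)
    n_future = sum(1 for s in future if s >= threshold)
    if n_present == 0:
        raise ValueError("no suitable cells under present conditions")
    return 100.0 * (n_future - n_present) / n_present


# Invented suitability values for six grid cells (flattened raster).
present_map = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
future_map = [0.8, 0.6, 0.4, 0.3, 0.2, 0.1]

change = range_change_percent(present_map, future_map)
print(f"Range change: {change:.1f}%")
```

A negative value indicates range contraction and a positive value indicates expansion. The result is sensitive to the chosen threshold, so any such figure should be read together with the thresholding rule used.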

Our simulation highlights the importance of bathymetry for gorgonian habitat suitability and the need to account for the bias of the occurrence database towards shallow waters. Depth and geographical remoteness are factors that lead to exposure to different stressors and potential population divergence (Sánchez 2016). The environmental gradients of depth and related variables are the primary factors in ecological niche differentiation (Quattrini et al. 2013). As Gori et al. (2011) and Pivotto et al. (2015) pointed out, new studies, including molecular approaches, are important to explore the ecological characteristics, differences, and adaptations of deep sublittoral populations. Their possible connectivity with shallow populations, which are more often exposed to less stable conditions and frequent disturbances, could play an important role in recolonizing the shallow areas. According to the ‘deep reef refugia’ hypothesis (DRRH), deeper reefs are less affected by thermal stress events and thus have the potential to act as refugia (Lesser et al. 2009; Bongaerts et al. 2010). The connectivity patterns and larval exchange between different depths described in P. clavata and E. cavolinii using a microsatellite marker and a proteomic approach supported the hypothesis that deeper subpopulations, unaffected by surface warming peaks, may provide larvae for shallower populations (Pilczynska et al. 2016; Padrón et al. 2018; Beauvieux et al. 2023).

Some recovery of P. clavata populations was recorded despite the limited effective dispersal, low recovery capacity, and low genetic connectivity of the species (Palumbi 2003). Although spatial variation in environmental factors is a key component in determining species distribution and abundance patterns, these patterns may also result from interactions among organisms and from tolerance to rapid environmental change through acclimation, genetic adaptation, and migration (Hoegh-Guldberg 2014; Tignat-Perrier et al. 2022). For example, the susceptibility of E. singularis to heat was recently studied in controlled stress experiments, which revealed differential sensitivity among shallow-water populations with intrinsic physiological mechanisms of acclimation (Pey et al. 2014). Ultimately, the species' reproductive cycle, recruitment, larval dispersal ability and stochasticity determine survival, growth, and reproduction of new individuals (Chiappone and Sullivan 1996; Edmunds 2000; Baird et al. 2003; Gori et al. 2011).

Our approach involved only three species of gorgonians, but further research could include several coral species found in the Mediterranean and elsewhere in the world, and could be greatly enhanced by considering biological interactions, community composition, functional traits, and the introduction of all possible classes of environmental variables that might act at regional scales and different depths. The habitat suitability predicted by our model depended strictly on spatial processes relative to the scale of the study, its resolution, and static informative layers. Causality is normally replaced with predictability, and for large non-linear systems it is important to disentangle causal interactions and complex relationships among large sets of explanatory variables. Many conceptual frameworks and computational methods might be applied to ecological niche models: the Wiener–Granger causality model (Granger 1969), Convergent Cross Mapping (CCM) (Clark et al. 2015), dynamic Bayesian networks (Trifonova et al. 2017), PCMCI (Runge et al. 2019) and the Optimal Information Flow (OIF) model (Li and Convertino 2021).

Measurements and modelling of water temperature over the past decade have made it possible to link MME to increases in water temperature at mesophotic depths, including shifts in the lower boundary of the thermocline. Heat waves and global warming, along with massive mucilaginous aggregates and the overgrowth of macroalgae on living corals, represent a combination of stressors that threaten gorgonian forests in unprecedented ways. If left alone, nature has a tremendous capacity to take care of itself (Garrabou et al. 2022; Gómez-Gras et al. 2021). However, anthropogenic activities are playing a key role in global environmental changes that are both driving biodiversity loss and altering the functioning of ecosystems (Bramanti et al. 2017; Ponti et al. 2018; Sini et al. 2019; Coppari et al. 2019). SDM combined with ML algorithms and GSUA proved to be an effective tool for detecting biodiversity patterns and making spatial predictions using environmental layers. These methods are increasingly used in endangered species management, habitat suitability studies, and assessments of environmental change impacts and sustainability (Guisan and Zimmermann 2000; Peterson and Vieglais 2001; Hirzel et al. 2006; Lauria et al. 2017; Zhang et al. 2019; Jenkins and Stevens 2022). Understanding the distribution patterns of species in space and time is crucial for identifying sites of special interest both inside and outside existing marine protected areas and for establishing management and conservation actions and strategies (Fortin and Dale 2005).