1 Introduction

In the Mediterranean Basin, woodlands dominated by Quercus suber cover about 1 million hectares in Northern Africa (Morocco, Algeria and Tunisia) and more than 1.5 million hectares in Southern Europe, mainly in Corsica, Sardinia and the western Iberian Peninsula (Fig. 1). Palynological and molecular studies show that the presence of Q. suber in both Italy and the Iberian Peninsula is very ancient, dating back at least to the Upper Pleistocene (Carrión et al. 2000; Magri et al. 2015). In particular, the western part of the Iberian Peninsula has played an important role as a glacial refugium for this species (Carrión et al. 2000). The current extent of cork oak woodlands is thought to be the remnant of a wider native forest (Rivas-Martínez 1975; Rivas-Martínez et al. 2003; Houston Durrant et al. 2016) that occupied the semiarid zone between 37° and 45°N latitude during the Late Miocene and Pliocene in the Mediterranean Basin (Palamarev 1989). Cork oak stands have also historically been managed as savanna-like woodlands with 30–60 trees per hectare (Houston Durrant et al. 2016) in Portugal (montados), Spain (dehesas) and Italy. These open woodlands are the result of a long history of human activities, such as agro-silvo-pastoralism and cork harvesting (Bugalho et al. 2009).

Fig. 1 Range of Quercus suber according to EUFORGEN (http://www.euforgen.org/species/quercus-suber/)

At present, the availability of large vegetation plot databases covering extensive areas of Europe, such as the European Vegetation Archive project (EVA; Chytrý et al. 2016), together with dedicated software for storing and analysing big data (Hennekens and Schaminée 2001; Tichý 2002), makes it possible to use pan-European vegetation plot databases to explore and characterize the compositional and ecological features of habitat types. In the past, there has been an intensive debate on methods of standardized and formalized vegetation classification (Mucina 1997; De Cáceres et al. 2015). Several methods have been proposed with the aim of creating an international classification protocol that would integrate the different classification approaches adopted so far. Experience from similar studies on vegetation classification at the continental scale (Douda et al. 2016; Willner et al. 2017; Marcenò et al. 2018) suggests that formal classification tools can be successfully applied to a broad-scale classification of Q. suber woodland.

In 1992, the European Union adopted the European Habitats Directive (92/43/EEC), recognizing the significance of the conservation of habitat types and species. A set of habitats that deserve specific conservation measures by Member States is listed in Annex I of the Directive. The list was compiled by experts, who selected the habitat types of main conservation concern from the European Corine Biotopes Program, later reclassified in the Palaearctic (Devilliers and Devilliers-Terschurenn 1996) and EUNIS (Davies et al. 2004) classifications. The codes were last amended by Directive 97/62/EC.

Given their natural and cultural value, cork oak woodlands in Europe (see Palaearctic classification code 45.2; Bugalho et al. 2009) are listed in Annex I under code “9330 Q. suber forests”, which according to the Palaearctic hierarchical subdivision comprises four geographically differentiated subtypes (see Interpretation Manual; European Commission 2013).

Area and indicator species are the criteria most often used to assess the conservation status of habitat types (Lengyel et al. 2008), and they are officially recognized as the main parameters for monitoring according to Article 11 of the Habitats Directive (Evans and Arvela 2011). Currently, this assessment relies on regional experts, who produce maps and lists of species and habitats, but this leads to controversial results among and within Member States. One of the current issues for habitat conservation in Europe is, in fact, related to the definition and interpretation of habitat types (Evans 2010).

Here, we explore the use of georeferenced vegetation databases as a new resource to support standardized protocols for habitat interpretation, mapping and assessment at the European scale. Specifically, we propose a methodology based on the analysis of large vegetation plot data sets to explore the compositional patterns, and the actual and potential distributional patterns, of Q. suber woodland types in Europe.

2 Methods

2.1 Study area

The study area (Fig. 1) comprises the whole present-day range of Q. suber in Europe (Portugal, Spain, France, Italy). Cork oak occurs in large territories along the Atlantic coast, from the southwestern coasts of France to the Iberian Peninsula. In the western and central Mediterranean Basin, the habitat is scattered within a narrow belt along Catalonia, Languedoc, Provence, the Tyrrhenian coast of the Italian peninsula and the major islands (Pausas et al. 2009). Moreover, it occurs in a few small stands along the southeastern coasts of the Ionian Sea (Apulia) and in northern Dalmatia (Simeone et al. 2009).

2.2 Data set

Data sets were obtained from the European Vegetation Archive (Chytrý et al. 2016), drawing on databases registered in the Global Index of Vegetation-Plot Databases (Dengler et al. 2011): “Vegetation-Plots Database Sapienza University of Rome” (EU-IT-11; Agrillo et al. 2017); “SOPHY Phytosociological Database” (EU-FR-003; Brisse et al. 1995); “SIVIM Iberian and Macaronesian Vegetation Information System” (EU-00-004; Font et al. 2012). The vegetation plots (1538 in the initial data set) were combined into a single database using the TURBOVEG/2 software (Hennekens and Schaminée 2001), and later imported into the JUICE/7.0 program for the multivariate analyses (Tichý 2002). Plant nomenclature follows Tutin et al. (2001), although further modifications based on national floras and EURO + MED (http://www.emplantbase.org/home.html) were made when necessary. As only some authors recorded mosses and lichens, we excluded these taxa from the analysis. Taxa identified only to the genus level and hybrids were also excluded. Subspecies were aggregated into the corresponding species. Species recorded in more than one vegetation layer by the original authors were merged into a single layer.

The final data set consisting of 1032 plots and 1020 taxa was obtained according to the following criteria:

  • Plots were stratified geographically on a 6′ × 6′ grid and filtered by heterogeneity-constrained resampling to reduce oversampling in certain regions (Lengyel et al. 2011).

  • Plots with Q. suber cover above 50% were selected to be consistent with the definition of the Habitat type.

  • We used plots ranging from 20 to 500 m², excluding those that were too small or too large, to reduce potential problems of analysing vegetation data with very different plot sizes (Dengler et al. 2009) or with different sampling designs (see also Michalcová et al. 2011; Ewald 2003).

  • Plots with no indication of sampled area (plot size) were included only when species richness was between 7 and 49 species (Online Resource 1). This interval corresponds to a plot size between 50 and 300 m², based on the correlation analysis between species richness and plot size conducted using the remaining plots of our database.
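The selection criteria listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure run in TURBOVEG or JUICE: the record fields (`suber_cover`, `area_m2`, `richness`) and both function names are hypothetical, the interval bounds are assumed to be inclusive, and the heterogeneity-constrained resampling itself is not reproduced (only the assignment of a plot to its 6′ × 6′ grid cell is shown).

```python
import math

def grid_cell(lat_deg, lon_deg, cell_arcmin=6):
    """Return the (row, col) index of the grid cell containing a plot.

    Coordinates are in decimal degrees; the cell size is in arc-minutes.
    """
    row = math.floor(lat_deg * 60 / cell_arcmin)
    col = math.floor(lon_deg * 60 / cell_arcmin)
    return (row, col)

def keep_plot(plot):
    """Apply the cover, plot-size and richness criteria to one plot record.

    `plot` is a dict with hypothetical keys: 'suber_cover' (percent),
    'area_m2' (None or missing when the plot size is unknown), 'richness'.
    """
    # Q. suber cover must be above 50%.
    if plot["suber_cover"] <= 50:
        return False
    area = plot.get("area_m2")
    if area is not None:
        # Known plot size: keep only 20-500 m2 (bounds assumed inclusive).
        return 20 <= area <= 500
    # Unknown plot size: fall back on the species-richness interval.
    return 7 <= plot["richness"] <= 49
```

Plots that share the same `grid_cell` value would then form one stratum for the resampling step.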

2.3 Data analysis

Flexible beta-clustering (β = −0.25) in combination with the Bray–Curtis coefficient, standardized by sample unit totals, was applied to square-root-transformed data to obtain floristic groups. In a comparative analysis, this method proved effective and informative when coupled with informal expert-defined ecological and biogeographical judgement criteria (Lötter et al. 2013). The optimal number of groups was identified according to the crispness values obtained by maximizing the floristic distinctiveness of clusters using the overall diagnostic values of species (Botta-Dukát et al. 2005).
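As a minimal sketch of the ingredients named above, the following Python functions compute the transformed abundances, the Bray–Curtis dissimilarity and the Lance–Williams update used by flexible beta-clustering. The function names are hypothetical, plots are assumed to be non-empty, and the order of operations (square root first, then division by the plot total) is an assumption, since the text does not state it; the analysis itself was run in JUICE.

```python
import math

def transform(plot):
    """Square-root transform cover values, then divide by the plot total.

    `plot` maps species name -> cover value; the plot must not be empty.
    """
    rooted = {sp: math.sqrt(cover) for sp, cover in plot.items()}
    total = sum(rooted.values())
    return {sp: value / total for sp, value in rooted.items()}

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two species -> abundance dicts."""
    species = set(a) | set(b)
    numerator = sum(abs(a.get(sp, 0.0) - b.get(sp, 0.0)) for sp in species)
    denominator = sum(a.values()) + sum(b.values())
    return numerator / denominator

def flexible_beta_update(d_ik, d_jk, d_ij, beta=-0.25):
    """Lance-Williams dissimilarity between merged cluster (i+j) and cluster k.

    Flexible beta uses alpha_i = alpha_j = (1 - beta) / 2 and gamma = 0.
    """
    alpha = (1.0 - beta) / 2.0
    return alpha * d_ik + alpha * d_jk + beta * d_ij
```

With β = −0.25 both alpha weights equal 0.625, which makes the merged cluster slightly more distant from the others and favours compact, well-separated groups.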

The resulting groups were characterized by identifying diagnostic and constant species (i.e. those with frequent occurrence). Diagnostic species were identified using the phi coefficient of fidelity (Bruelheide 2000; Chytrý et al. 2002), selecting the species with phi values > 0.25 using presence/absence data and standardization of the size of all clusters (Tichý and Chytrý 2006). Constant species were selected as those with a relative frequency > 30% in a given cluster.
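The fidelity calculation can be illustrated as follows. This is a sketch under one reading of the group-size standardization, in which every cluster is treated as if it held the same number of plots (total plots divided by the number of clusters) and occurrences are rescaled accordingly; the function names and argument layout are hypothetical and do not reproduce the JUICE implementation.

```python
import math

def phi_equalized(occurrences, group_sizes, target):
    """Phi coefficient of one species for cluster `target`, equalized sizes.

    occurrences[g] = number of plots in cluster g containing the species
    group_sizes[g] = number of plots in cluster g
    """
    k = len(group_sizes)
    n_total = sum(group_sizes)
    s = n_total / k  # equalized size assigned to every cluster
    # Rescale occurrences as if each cluster contained s plots.
    scaled = [occ / size * s for occ, size in zip(occurrences, group_sizes)]
    n = sum(scaled)        # rescaled occurrences in the whole data set
    n_p = scaled[target]   # rescaled occurrences in the target cluster
    denominator = math.sqrt(n * s * (n_total - n) * (n_total - s))
    if denominator == 0:
        return 0.0  # species absent everywhere or present in every plot
    return (n_total * n_p - n * s) / denominator

def is_constant(occurrences, group_sizes, target, threshold=0.30):
    """True when the relative frequency in the target cluster exceeds 30%."""
    return occurrences[target] / group_sizes[target] > threshold
```

A species would then be listed as diagnostic for a cluster when `phi_equalized(...)` exceeds 0.25, and as constant when `is_constant(...)` is true.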

Detrended correspondence analysis (DCA) on the log-transformed data was applied to identify the main ecological gradients among groups (Lepš and Šmilauer 2003). Climatic variables (WorldClim at 30 arc-second resolution, Hijmans et al. 2005) were plotted as vectors in the ordination space of the groups. To select the variables best explaining variation in the climate of the plots and to avoid collinearity among them, we used a PCA bi-plot of their scaled values. The Kaiser–Guttman criterion was used to decide the PCA axes to interpret.
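The Kaiser–Guttman criterion retains the axes whose eigenvalue exceeds the mean eigenvalue, which equals 1 for a PCA of scaled variables. A minimal sketch, with a hypothetical function name and to be fed with eigenvalues from any PCA routine, is:

```python
def kaiser_guttman(eigenvalues):
    """Return the indices of PCA axes whose eigenvalue exceeds the mean."""
    mean = sum(eigenvalues) / len(eigenvalues)
    return [i for i, value in enumerate(eigenvalues) if value > mean]
```

For example, with the purely illustrative eigenvalues `[2.6, 1.3, 0.7, 0.3, 0.1]`, the first two axes would be retained for interpretation.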

The main groups identified by cluster analysis were mapped to visualize their geographic distributions. We used spatial predictive modelling to estimate the geographic range of each group and its correlation with major climatic factors (WorldClim at 30 arc-second resolution, Hijmans et al. 2005), which have good discriminatory power for evergreen forest communities at small scales (Petroselli et al. 2013). This modelling approach, which is based on vegetation types instead of species, has recently been shown to be a useful method for estimating distribution patterns of natural habitats (Potts et al. 2013; Keith et al. 2014; Jiménez-Alfaro et al. 2018). The models were computed using Maxent 3.3 (Phillips et al. 2006) with the default parameters within the geographic range of Q. suber in Europe. The resulting spatial models show the climatic suitability for each cluster, with a probability threshold applied to exclude the lower suitability values (see Phillips et al. 2006 for details). The variable contributions reported by each model were used to assess the main environmental drivers influencing the spatial distribution of the groups, and to compare their climatic envelopes.
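The thresholding step can be sketched independently of Maxent. The example below assumes that the model output has already been read into a nested list of suitability values between 0 and 1; the function names are hypothetical, and the threshold is left as a parameter because its value is not given in the text.

```python
def apply_threshold(suitability, threshold):
    """Mask grid cells below the threshold (None) and keep the others."""
    return [[v if v >= threshold else None for v in row] for row in suitability]

def suitable_fraction(suitability, threshold):
    """Fraction of grid cells whose suitability reaches the threshold."""
    cells = [v for row in suitability for v in row]
    return sum(v >= threshold for v in cells) / len(cells)
```

Comparing `suitable_fraction` for a low and a high threshold gives a quick impression of how strongly the mapped range depends on the threshold choice.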

3 Results

3.1 Classification and ordination

The crispness curve (Online Resource S3) peaked at three groups, with a secondary optimum at four clusters. We adopted the four-cluster solution, which provides a formal description of Q. suber woodlands in Europe and helps to improve the description of the habitat subtypes given in the Interpretation Manual of European Union Habitats (EUR28).

The four groups obtained by flexible beta-clustering showed a clear geographical pattern (Fig. 2) from the Italian to the Iberian Peninsula. With a fidelity threshold of phi > 0.25, the first group had 28 diagnostic species, the second 15, the third 16 and the fourth 29 (Table 1). All the groups shared a set of species characteristic of Mediterranean maquis (Rubia peregrina, Smilax aspera, Ruscus aculeatus, Erica arborea, Arbutus unedo, Rubus ulmifolius, Cistus salviifolius). However, they were differentiated by: Atlantic and western Mediterranean species (Group 1: the Atlantic Lusitanian province and the Mediterranean Baetic provinces in the western part of the Iberian Peninsula); Atlantic and other mesophilous species (Group 2: the Aquitanian region, southwestern part of France, and the Cantabrian region, northern Iberian Peninsula); Mediterranean species (Group 3: the coast of the Catalan-Provençal territories); central and eastern Mediterranean species (Group 4: the central Mediterranean territories of the Tyrrhenian Italian coast, the main Italian islands, Corsica and Apulia).

Fig. 2 Distribution of the four groups classified by flexible beta-clustering (Gr.1 Western Iberian Peninsula, Gr.2 Cantabrian and Aquitanian regions, Gr.3 Catalan-Provençal and Gr.4 Tyrrhenian and main islands)

Table 1 Diagnostic and constant species for the four groups

The DCA (Fig. 3) showed a good differentiation among the four groups in the ordination space (the gradient length is 5.23 SD units along the first ordination axis and 3.48 SD units along the second). Among the climatic variables, the precipitation of the driest quarter, the precipitation of the wettest quarter, the minimum temperature of the coldest month, the mean temperature of the warmest quarter and temperature seasonality explain most of the climatic variation among the groups (Fig. 4). The first ordination axis is negatively correlated with the precipitation of the wettest quarter and positively correlated with temperature seasonality, indicating a gradient from Atlantic sites, characterized by abundant precipitation and smaller temperature variation during the year, to eastern Mediterranean sites with scarcer precipitation and stronger temperature seasonality. The second axis is negatively correlated with the minimum temperature of the coldest month and the mean temperature of the warmest quarter, and positively correlated with the precipitation of the driest quarter. This pattern identifies a gradient from typical Mediterranean sites, characterized by warm temperatures and a marked seasonality in precipitation, to sites with colder conditions but abundant precipitation during the summer.

Fig. 3 DCA bi-plots showing species and the four clusters obtained (Cl.1 Western Iberian Peninsula, Cl.2 Cantabrian and Aquitanian regions, Cl.3 Catalan-Provençal and Cl.4 Tyrrhenian and main islands). a Phanerophytes occurring in more than 100 plots, with centroids of the four groups represented by numbers. b Distribution of plots grouped according to the classification. Species abbreviations: Arbutus unedo (Arb.une), Calicotome spinosa (Cal.spi), Calicotome villosa (Cal.vil), Clematis flammula (Cle.fla), Crataegus monogyna (Cra.mon), Cytisus villosus (Cyt.vil), Daphne gnidium (Dap.gni), Erica arborea (Eri.arb), Erica scoparia (Eri.sco), Hedera helix (Hed.hel), Juniperus oxycedrus (Jun.oxy), Lonicera implexa (Lon.imp), Lonicera periclymenum (Lon.per), Myrtus communis (Myr.com), Olea europaea (Ole.eur), Phillyrea angustifolia (Phi.ang), Phillyrea latifolia (Phi.lat), Pinus pinaster (Pin.pin), Pistacia lentiscus (Pis.len), Prunus spinosa (Pru.spi), Quercus coccifera (Que.coc), Quercus faginea (Que.fag), Quercus ilex (Que.ile), Quercus pubescens (Que.pub), Quercus suber (Que.sub), Rhamnus alaternus (Rha.ala), Rubia peregrina (Rub.per), Viburnum tinus (Vib.tin)

Fig. 4 DCA bi-plot showing the four groups with WorldClim environmental variables fitted as vectors. Abbreviations of environmental variables: PDrQ the precipitation of the driest quarter, PWeQ the precipitation of the wettest quarter, MiTCoM the minimum temperature of the coldest month, MeTWaQ the mean temperature of the warmest quarter, TS temperature seasonality

3.2 Suitability map

The habitat suitability models computed with Maxent (Fig. 5) reproduced the spatial pattern of the four groups with high performance after tenfold cross-validation, as measured by the AUC (Table 2). The areas of medium to high habitat suitability envelop the main geographical pattern of each cluster, while areas of low suitability showed some overlap (see for instance group 4 with groups 1 and 3, Fig. 5). The models did not show overfitting according to the values of AUCdiff, suggesting that the restricted distributions were driven by climatic conditions rather than by spatial autocorrelation. The variable contributions showed different climatic patterns for the four clusters (Table 2). Annual precipitation was the most important variable for clusters 1 and 4, whereas mean temperature of the wettest quarter was the most relevant variable for clusters 2 and 3. The other four variables differed among groups, with precipitation seasonality as the most relevant climatic factor.
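The two evaluation statistics mentioned above can be sketched as follows. The AUC is computed here in its rank-based form, as the probability that a presence record receives a higher suitability than a background point; AUCdiff is assumed to be the training AUC minus the test AUC, as it is commonly defined, since the text does not spell it out. Both function names are hypothetical and the sketch is independent of Maxent.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: chance that a presence outscores a background point.

    Ties between a presence and a background score count as one half.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def auc_diff(train_auc, test_auc):
    """Training AUC minus test AUC; values near zero suggest little overfitting."""
    return train_auc - test_auc
```

In a tenfold cross-validation, `auc` would be evaluated on the held-out fold of each replicate and `auc_diff` averaged across the ten replicates.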

Fig. 5 On the left: plot distribution (dots). On the right: climatic suitability models computed with Maxent (low to high suitability is indicated in blue to red) for all the clusters (Cl.1 Western Iberian Peninsula, Cl.2 Cantabrian and Aquitanian regions, Cl.3 Catalan-Provençal and Cl.4 Tyrrhenian and main islands)

Table 2 Percentage contribution of environmental variables (WorldClim) used to model the distribution of four types of cork oak forests (clusters 1–4) with Maxent in Europe

4 Discussion

4.1 Characterization of European cork oak woodlands

The floristic classification of vegetation plot data into four Q. suber woodland types in Europe revealed a clear geographic pattern. The species composition and spatial distribution of the groups partially overlap with the subtypes described for Habitat type 9330 (see Interpretation Manual of European Union Habitats, EUR28), although our results show a clear distinction of the Catalan-Provençal group and merge the northwest Iberian and Aquitanian types into group 2 (Cantabrian and Aquitanian regions).

Moreover, these results also appear coherent with the patterns that emerged in the phylogeographic studies of Magri et al. (2007, 2015), which suggest a differentiation of Q. suber populations during the Tertiary reflecting the geological and palaeoecological history of the Mediterranean Basin (Palamarev and Tsenov 2004). Accordingly, the four groups defined here correspond to distinct biogeographical regions (Meusel et al. 1965; Millington et al. 2011) and to major alliances defined in the European vegetation conspectus (see EuroVegChecklist, Mucina et al. 2016). In particular, the first (Western Iberian Peninsula) and second (Cantabrian and Aquitanian regions) groups include the southern Iberian alliance Oleo sylvestris–Quercion rotundifoliae (syntaxon code: QUI-01B) and the western Iberian alliance Quercion broteroi (QUI-01C). The third group (Catalan-Provençal) is linked to Quercion ilicis (QUI-01), a thermophilous alliance typical of the Valencian-Catalan-Provençal biogeographic district (Rivas-Martínez 1975; Barberis and Mariotti 1979), but it also contains plots from northwestern Italy. The fourth group (Tyrrhenian Italian coast and main islands) is linked to the alliances Erico-Quercion ilicis (QUI-01E) and Fraxino orni-Quercion ilicis (QUI-01D), mainly composed of evergreen forests dominated by Q. suber or co-dominated by Quercus ilex.

The four groups of European Q. suber woodlands are also differentiated by their climatic envelopes, as shown by the DCA and the distribution modelling. In particular, groups 1 and 2 are characterized by wetter conditions due to the influence of Atlantic winds and currents, while groups 3 and 4 are characterized by more continental conditions, even though group 3 (Western Mediterranean) is also differentiated by a milder climate than group 4 (Italian-Tyrrhenian).

Despite these differences, our results confirm the capacity of Q. suber woodlands to survive within environmental extremes given a minimum amount of moisture (Petroselli et al. 2013). In fact, the summer aridity characterizing the Italo-Tyrrhenian group can be offset by soil water availability (Dowgiallo et al. 1997; Lacambra et al. 2010; Petroselli et al. 2013), most likely due to the copious precipitation during the wettest seasons and the high values of the humidity index along the coastlines.

4.2 Implications for conservation

One of the current challenges for habitat conservation in Europe is related to the definition and interpretation of habitat types. The results of this study provide information on the compositional and distributional patterns of Q. suber woodlands in Europe, offering a list of indicator species for the major eco-geographical groups. By defining diagnostic and constant species, we provide a framework to investigate, from a vegetation point of view, the structure and functions (including typical species) of each habitat subtype, so that an ad hoc assessment can be made for each of them.

The suitability maps show the pattern of potential areas against which the actual current distribution can be assessed. A certain overlap among the groups appears only at low probability values (see the extent of group 4 with groups 1 and 3), while the core areas are clearly differentiated. The choice of a probability threshold to distinguish core and peripheral potential distribution is a topic of ongoing debate, as no single best solution for threshold selection has been found to be appropriate in all situations (Jiménez-Valverde and Lobo 2007; Jiménez-Valverde 2014). Irrespective of the criterion adopted, this differentiation is useful to elaborate and implement proper conservation strategies consistent with the ecological and compositional requirements of each subtype. Low threshold values correspond to large estimates of the habitat niche and could, therefore, be used to design broad conservation strategies that encompass the whole potential niche of the habitat, while higher thresholds correspond to more conservative estimates of the environmental niche or distribution and could be used to focus on the core niche of the habitat. For instance, critical populations are those that are isolated or highly fragmented at the limit of the potential range defined by the suitability map for each group. Conversely, populations that are limited in size but surrounded by a large suitable area should be the focus of restoration plans. Moreover, suitability maps can be used to estimate range parameters, which are useful for assessing the conservation status at both regional (Álvarez-Martínez et al. 2017) and continental (Jiménez-Alfaro et al. 2018) scales. Our results also provide useful information for establishing favourable reference values for the habitat type and its subtypes (Evans and Arvela 2011). We, therefore, encourage vegetation ecologists and conservation agencies to make use of the European vegetation databases for developing a consistent characterization of habitat types of conservation concern.