Introduction

Predicting which introduced alien species are likely to become invasive requires identifying the mechanisms that may favor their establishment and spread after their introduction into new regions and novel environments (Rejmánek and Richardson 1996; Pyšek and Richardson 2007; Pyšek et al. 2015). For plants, reproductive traits such as high seed production, early and rapid germination, and a capacity to germinate under a broad range of environmental conditions distinguish many invasive species from their non-invasive congeners (Pyšek and Richardson 2007; Gioria and Pyšek 2017; Gioria et al. 2018) and have long been regarded as important determinants of weediness or invasiveness (Baker 1965; Erfmeier and Bruelheide 2005; Colautti et al. 2006; Moravcová et al. 2015).

Soil seed banks (hereafter ‘seed banks’) are a major component of plant community dynamics (Harper 1977), acting as reservoirs of propagules and genetic diversity for many species (Templeton and Levin 1979; Venable and Brown 1988; Levin 1990; Chesson 1994). The potential role of seed banks in promoting the successful establishment of alien species in new distribution ranges and their persistence, even under unfavorable conditions for growth and development, has been recently highlighted in a number of studies (Gioria et al. 2012; Pyšek et al. 2015; Gioria and Pyšek 2016, 2017). In their role as genetic reservoirs (Templeton and Levin 1979), the formation of a seed bank can play a critical role in maintaining or enhancing the genetic diversity found in invasive populations and reduce the rate of genetic erosion and possible inbreeding (Levin 1990; Fennell et al. 2010, 2014; Mandák et al. 2012; Baskin and Baskin 2014), potentially facilitating their adaptive responses to environmental changes in space and time (Chesson 1994; Fennell et al. 2010). This is important both for alien species, as it affects their responses to the novel conditions encountered in their new ranges, as well as for native species, by affecting how they respond to the novel conditions created by the alien species (Gioria et al. 2012).

An important function of seed banks is the regulation and promotion of dispersal, not only through space but also through time (Venable and Brown 1988), because they are formed by seeds possessing varying degrees and types of dormancy (including non-dormancy) (Cohen and Levin 1991; Fenner and Thompson 2005; Baskin and Baskin 2014). In this respect, the formation of seed banks can be regarded as a bet-hedging strategy that promotes the coexistence of species due to differential responses of individual seeds/species to varying biotic and abiotic conditions and differences in resource use (Venable and Brown 1988; Chesson 1994; Venable 2007). Differences in the timing, percentage, and speed of germination determine the post-germination biotic and abiotic conditions experienced by the seedlings (Donohue et al. 2010; Gioria and Osborne 2014; Gioria et al. 2018). This, in turn, affects the probability of establishment of a species, its distributional range, and its evolutionary potential (Harper 1977; Donohue et al. 2005, 2010).

Seed banks have been classified into transient or persistent depending on how long seeds of a species can retain their viability in the soil, with viability of less than 1 year typically being considered transient (Thompson et al. 1997). Such a distinction is very important in invasion ecology, as it provides an indication of how long an invasive species can persist in the recipient communities following eradication attempts and in the absence of further introductions from nearby sources (Gioria et al. 2012). Moreover, it improves our ability to estimate the size of the pool of seeds that invasive species can accumulate over time and that can ultimately germinate under certain environmental conditions (Gioria and Pyšek 2016).

Given these functional roles, seed banks affect both ecosystem resistance and resilience (Pugnaire and Lázaro 2000) and could thus contribute substantially to the naturalization and invasion potential (invasiveness) of alien species as well as the invasibility of the recipient communities and their recovery potential (Gioria et al. 2012, 2014; Pyšek et al. 2015; Gioria and Pyšek 2016, 2017; Gioria et al. 2018; see Donohue et al. 2005, 2010). For these reasons, the number of studies examining the characteristics of the seed bank of invasive species has increased substantially in recent years (Gioria and Pyšek 2016), especially with the aim of developing effective and sustainable management measures and assessing the restoration potential of native communities (Richardson and Kluge 2008; Gioria et al. 2012; Gioria and Pyšek 2016). However, our understanding of the role of seed banks in the invasion process remains poor (Gioria et al. 2012; Gioria and Pyšek 2016) and only recently has information on the characteristics of seed banks been incorporated into large-scale analyses aimed at assessing the probability of establishment or spread of alien species (Pyšek et al. 2015).

To address this issue, we compiled a unique global database comprising information on the type (transient vs persistent) and density of the seed banks formed by 2566 species including 727 invasive species. These data encompass 14,293 records from different sites/communities and habitat types, based on assessments using seedling emergence approaches (sensu Thompson et al. 1997), and thus capturing the viable components of seed banks (Thompson and Grime 1979). This database was used here to address three main questions: whether the characteristics of the seed bank of invasive species differ in their native and alien ranges (Q1), and among invasive and non-invasive congeners, in their native (Q2) and alien ranges (Q3). To our knowledge, this is the first study comparing seed bank data collected globally, providing information on the characteristics of the viable pool of seeds that invasive and non-invasive species can accumulate in different habitat types, in their native and introduced distribution ranges.

Methods

Soil seed bank database

We compiled our seed bank database using literature sources identified by searching the Web of Science (ISI) and Google Scholar, using the keyword ‘seed’ in combination with ‘bank’, ‘below-ground’, ‘buried’, ‘community’, ‘flora’, ‘reservoir’, ‘soil’, and ‘stored’. Additional studies were searched by screening the reference lists provided in the resulting papers as well as papers citing the papers originally retrieved. We also conducted a search for grey literature and experts in the field were contacted directly for potential unpublished material and dissertations. For studies published before 1994, we also screened the reference list available in Thompson et al. (1997), a database of soil seed banks of North-West Europe that was based on 275 sources published between 1882 and the beginning of 1994. The last search for published references was conducted in April 2018.

In this database we only included studies that (1) presented data on seed bank type (transient vs persistent, sensu Thompson et al. 1997) and mean seed bank density at the site level (excluding mean density values from multiple sites); (2) provided mean seed bank density values calculated from multiple independent samples at each site (excluding studies that only provided minimum and maximum, or total number of seeds/seedlings emerging from all samples collected at one site); (3) examined natural seed banks (excluding results from laboratory germination experiments, burial experiments, or manipulative field studies); (4) reported seed bank data for one or more species separately; and (5) provided information on the habitat, ecosystem, or vegetation type for each study site. For the purpose of this paper, we only included studies that assessed the seed bank using the seedling emergence approach (sensu Thompson et al. 1997), as it allows estimation of the viable component of the seed bank (Thompson and Grime 1979; Notes S1). However, for species with large seeds such as Heracleum mantegazzianum (Gioria and Osborne 2009a) and Acacia species (Marchante et al. 2010), we also included seed bank estimates based on seed counts. We excluded those studies examining seed banks using seed extraction methods to avoid any potential confounding effect of the method of seed bank estimation on the final results (see Price et al. 2010).

The final database comprised 14,293 unique seed bank records for 2566 species extracted from 201 studies (Table S1). Each record is composed of information on the characteristics of the seed bank of individual species at individual sites: (1) seed bank type: transient (< 1 year) vs persistent (> 1 year), and (2) mean seed bank density, expressed as the mean number of seeds per square meter. Information on seed bank type was derived directly from the original papers or from combined information on seed bank depth, presence/absence in the vegetation and sampling time (before or after seed dispersal), using the criterion described by Thompson et al. (1997). A persistent seed bank was thus assigned to species that were (1) absent from the standing vegetation but present in the seed bank and/or (2) in deep soil layers sampled before seed dispersal but after seed germination in the field. Where information on seed bank density was not directly available, we converted the number of seeds per sample recorded at each site into seeds per square meter, based on the size of the samples.
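The conversion from per-sample counts to seeds per square meter can be illustrated with a minimal sketch. The function names, the use of cm² as the input unit, and the assumption of cylindrical cores are illustrative choices, not details taken from the source studies.

```python
import math


def core_area_cm2(diameter_cm):
    """Surface area (cm^2) of a cylindrical soil core, from its diameter."""
    return math.pi * (diameter_cm / 2.0) ** 2


def seeds_per_square_meter(seed_count, sample_area_cm2, n_samples=1):
    """Convert a raw count into a density in seeds per square meter.

    seed_count      -- seeds (or emerged seedlings) counted across all samples
    sample_area_cm2 -- surface area of ONE soil sample, in cm^2 (assumed unit)
    n_samples       -- number of samples pooled into seed_count
    """
    if sample_area_cm2 <= 0 or n_samples <= 0:
        raise ValueError("sample area and number of samples must be positive")
    # 1 m^2 = 10,000 cm^2
    total_area_m2 = sample_area_cm2 * n_samples / 10_000.0
    return seed_count / total_area_m2


# Hypothetical site: 12 seedlings emerged from ten cores of 5 cm diameter.
density = seeds_per_square_meter(12, core_area_cm2(5.0), n_samples=10)
```

Expressing every record on the same area basis is what makes densities from studies with different core sizes and sample numbers comparable.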

For each record, we included information on (3) habitat/ecosystem or vegetation type at each site, taken directly from the source papers, and biogeographic information. This included (4) origin status (native vs alien; i.e. whether a species was native or alien at the study site or in the sampling region), derived directly from the source papers, from regional or local floras, or from a range of databases; (5) local invasive status (invasive vs non-invasive), depending on whether a species has been listed or classified as invasive locally, regionally, or globally, in a range of databases (Notes S2); and (6) the global invasive status of a species (yes/no), based on the presence of a species in the Global Invasive Species Database (www.iucngisd.org/gisd/) (see Notes S2 for details on the methods/sources of collection of the data included in the seed bank database).

For each species we also added information on (7) seed weight (mg), obtained from the Royal Botanic Gardens Kew Seed Information Database (http://data.kew.org/sid); (8) growth form (grass, herb, shrub, or tree); and (9) life history (annual, biennial or perennial), based on a combination of sources including eFloras (2016) (www.efloras.org), the United States Department of Agriculture database (http://plants.usda.gov/java), the Online Atlas of the British and Irish Flora (www.brc.ac.uk/plantatlas), and regional floras. The taxonomic status of each species was validated using The Plant List (2017) database (http://www.theplantlist.org, Version 1.1). For species whose status in this database is unresolved, we maintained the name provided in the original source, although these species were not included in the statistical analyses performed in this study. Details of the number of records for invasive and non-invasive species by origin, habitat type, and study region, as well as the cumulative number of post-1996 records, are presented in Figs. S1–S4.

Statistical analyses

To address each research question, we created three separate datasets, each including species for which information on origin status, local invasive status, seed weight, life form, and habitat type was available (Table 1, Table S2). To assess whether the seed banks of invasive species differ in their native vs alien distribution ranges (Q1), we included only records for species that were classified as ‘invasive’ and for which seed bank data were available in both native and alien ranges. To assess whether the seed banks of invasive species differ between invasive and non-invasive congeners in their native (Q2) or alien (Q3) distribution ranges, we only included those genera for which seed bank data were available for at least one invasive and one non-invasive congener in the respective datasets.
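The congener filter used for Q2 and Q3 can be sketched as follows. This is a simplified illustration under stated assumptions: the record keys ('genus', 'species', 'invasive') and the genus names are hypothetical, and invasive status is treated as a single flag per record.

```python
from collections import defaultdict


def genera_with_both_statuses(records):
    """Return genera holding at least one invasive and one non-invasive record.

    records -- iterable of dicts with hypothetical keys
               'genus', 'species' and 'invasive' (bool)
    """
    statuses = defaultdict(set)
    for rec in records:
        statuses[rec["genus"]].add(bool(rec["invasive"]))
    return {genus for genus, seen in statuses.items() if seen == {True, False}}


def congener_dataset(records):
    """Keep only records from genera that allow a congeneric comparison."""
    records = list(records)
    keep = genera_with_both_statuses(records)
    return [rec for rec in records if rec["genus"] in keep]


# Hypothetical records: GenusA qualifies, GenusB (invasive only) does not.
example = [
    {"genus": "GenusA", "species": "GenusA one", "invasive": True},
    {"genus": "GenusA", "species": "GenusA two", "invasive": False},
    {"genus": "GenusB", "species": "GenusB one", "invasive": True},
]
filtered = congener_dataset(example)
```

Applying the filter separately to native-range and alien-range records would yield the two datasets, one per range, that the congeneric comparisons require.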

Table 1 Description of the variables used in GLMM and MCMCglmm models used to address three research questions

We used generalized linear mixed models to identify which variables (and their interaction terms) best explain the type or density of the seed bank (origin [Q1] or local invasive status [Q2 and Q3], life form, and seed weight) (Table 1). To control for any taxonomic dependency in the data, we first tested the significance of three nested random effects, i.e., family, genus nested in family, and species nested in genus nested in family. This approach allows accounting for statistical non-independence in the data owing to shared life history and identifying the taxonomic level at which these unexplained effects might occur (Lutz et al. 2015; Bridge et al. 2016). As both seed bank type and density are known to vary considerably across habitat types (Fenner and Thompson 2005), we also included species nested in habitat as a potential random factor. For each research question, we only included the random effects that were significant in the final models.

For both responses (type and density of the seed bank), we identified the suitable distribution and link function. To model seed bank type, we performed logistic GLMMs with a binomial error distribution. To identify the link function most suitable for our dataset, we fitted the maximum model with three link functions (logit, probit, and complementary log–log) and checked the residual deviance (Thiele and Markussen 2012). As the logit link performed best (lowest Akaike Information Criterion), this link was used to calculate the best model. To model seed bank density, we performed linear mixed-effects models (LMMs) of the [log(y + 1)]-transformed response with a Gaussian error distribution and identity link. Diagnostic analyses showed that the Poisson distribution, which is usually used to model count data, was a poor fit for the seed bank density data, whereas the Gaussian distribution was more suitable.
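For readers less familiar with the three candidate links, the sketch below gives their inverse functions, the AIC used to rank fitted models, and the log(y + 1) transform. It is an illustration only: the function names are hypothetical, the natural logarithm is an assumption (the base is not stated here), and the actual models were fitted in R.

```python
import math


def inv_logit(eta):
    """Inverse logit link: linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse probit link: the standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better model."""
    return 2.0 * n_params - 2.0 * log_likelihood


def log_density(y):
    """log(y + 1) transform of a seed bank density (natural log assumed)."""
    return math.log(y + 1.0)
```

The logit and probit links are symmetric around a probability of 0.5, whereas the complementary log–log link is not, which is one reason the three can fit the same binary data differently.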

The same combinations of fixed predictors and random effects were used to model seed bank type and density (GLMMs and LMMs). The same datasets (same number of species and records) and procedures were used to model seed bank type and density in the GLMMs using (1) seed bank data for each record, (2) mean seed bank density values and proportion of persistent records for each species, and (3) seed bank density data for persistent seed bank records only, for each of the three research questions. Models using only persistent seed bank records were run to assess differences in seed bank accumulation over time while avoiding variation associated with potential under- or over-estimation of the density of transient seed banks due to differences in sampling time (Notes S1).

We also ran models including habitat type as a fixed rather than random effect, to identify the habitats where origin status or invasive status might contribute to determine the characteristics of the seed bank, although the results of these models are only briefly mentioned.

The significance of fixed effects was tested using F-tests of type III hypothesis and p values calculated based on Satterthwaite’s approximations for LMMs and Wald Chi-square tests for the binomial GLMMs (Kuznetsova et al. 2016). Likelihood Ratio Tests were performed to test the significance of the random effects and thus select the most suitable structure of the random effects of the models. Marginal and conditional \(R^{2}\) values \((R^{2}_{GLMM(m)} ,R^{2}_{GLMM(c)} )\) were calculated using the procedure described by Nakagawa and Schielzeth (2013) in the MuMIn package (Bartoń 2017). The precision of each fixed predictor was assessed by calculating 95% confidence intervals. All analyses were performed using R 3.4.3 (R Development Core Team 2018). The functions lmer and glmer in the R package lme4 (Version 1.1-13; Bates et al. 2015) were used to fit LMMs and GLMMs while the lmerTest package was used to test for the significance of fixed and random effects (Kuznetsova et al. 2016).
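The marginal and conditional R² of Nakagawa and Schielzeth (2013) are ratios of variance components, which the following sketch makes explicit. The function and argument names are hypothetical, and the variance components would in practice come from the fitted model; the constant π²/3 is the distribution-specific variance commonly used for binomial models with a logit link.

```python
import math

# Distribution-specific variance for a binomial GLMM with a logit link.
LOGIT_RESIDUAL_VARIANCE = math.pi ** 2 / 3.0


def r2_glmm(var_fixed, var_random, var_residual):
    """Return (marginal R^2, conditional R^2) from variance components.

    var_fixed    -- variance of the fixed-effect linear predictor
    var_random   -- iterable with one variance per random effect
    var_residual -- residual variance (LMM) or distribution-specific
                    variance (GLMM)
    """
    var_random_total = sum(var_random)
    total = var_fixed + var_random_total + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    marginal = var_fixed / total                           # fixed effects only
    conditional = (var_fixed + var_random_total) / total   # fixed + random
    return marginal, conditional


# Hypothetical components: small fixed-effect variance, large random variance.
r2_m, r2_c = r2_glmm(0.1, [2.0, 1.5], LOGIT_RESIDUAL_VARIANCE)
```

A low marginal value combined with a high conditional value indicates that most of the explained variation is attributable to the random structure rather than to the fixed predictors.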

Phylogeny

Since phylogenetic relatedness may lead to the statistical non-independence of data (Felsenstein 1985), we reconstructed phylogenies in order to account for shared evolutionary history (relatedness). For this we collated genetic data for the ribulose-bisphosphate carboxylase (rbcL) and Maturase K (matK) gene regions for all taxa with available data in the online GenBank repository (www.ncbi.nlm.nih.gov). In some instances, for species with no available DNA data for one or both of these genes, data from phylogenetically closely related species were used (4 out of 356 instances or 0.37% of 1072 species; see Table S3 for details of sequencing data; phylogenetic trees are presented in Fig. S5). DNA sequence data were aligned in BioEdit version 7.0.5.3 (Hall 1999) and manually edited. Flanking regions for both genes were trimmed to avoid the presence of excessive missing data resulting in a final dataset consisting of 1447 characters (base pairs). Phylogenies were estimated using Bayesian search criteria with parameter estimates obtained from the program jModelTest version 2.1.3 (best fit model GTR + I + G; Darriba et al. 2012) in MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003). MrBayes was run for 1,000,000 generations and trees were sampled every 1000 generations. Nodal support for the retrieved tree topology was determined as posterior probabilities in MrBayes. The phylogeny resolved all taxa with high overall support. Separate phylogenies were reconstructed for Q1–Q3 based on the species for which sequences were available (i.e. 104, 286, and 67 species, respectively).

Phylogenetic comparative analyses

To account for phylogenetic relatedness, we performed phylogenetic generalized linear mixed models (PGLMMs; Hadfield and Nakagawa 2010) as implemented in the MCMCglmm R package (Hadfield 2010); this allowed the use of reconstructed phylogenies as a random factor. Phylogenies were considered as inverse phylogenetic covariance matrices. For each question we used two new sets of response variables: (1) within-species proportion of persistent records (WS-persistence), and (2) mean within-species seed bank density (WS-density). We fitted Gaussian models with a model structure (i.e. response and predictor variables) similar to that of the GLMMs described above, using [log(y + 1)]-transformed density and persistence data for each research question. We fixed the covariance structure and used weakly informative priors (improper prior with ν = 0.02) for each question (Hadfield and Nakagawa 2010). Each model was run for 5,000,000 MCMC steps, with an initial burn-in phase of 1000 and a thinning interval of 500, resulting in posterior distributions with 10,000 samples (de Villemereuil and Nakagawa 2014). From these posterior distributions, we calculated mean parameter estimates (lambda), and 95% Highest Posterior Density (HPD) and Credible Intervals (CI). Significance of model parameters was estimated by examining CIs, where parameters with CIs overlapping zero were considered not significant (Carboni et al. 2013). As phylogenetic models (PGLMMs) were performed on a subset of species for which phylogenetic data were available, we compared these models with GLMMs using WS-persistence and WS-density as the response variables for the same species used in phylogenetic models. PGLMMs were also performed using persistent seed bank records only, for each research question.
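The two within-species response variables and the interval-based significance rule can be illustrated with a short sketch. The record keys ('species', 'persistent', 'density') and function names are hypothetical, and whether densities were averaged before or after transformation is not specified here, so the sketch averages raw densities as an assumption.

```python
from collections import defaultdict


def within_species_summary(records):
    """Collapse site-level records to one (WS-persistence, WS-density) per species.

    records -- iterable of dicts with hypothetical keys 'species',
               'persistent' (bool) and 'density' (seeds per square meter)
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["species"]].append(rec)
    summary = {}
    for species, recs in grouped.items():
        n = len(recs)
        ws_persistence = sum(1 for r in recs if r["persistent"]) / n
        ws_density = sum(r["density"] for r in recs) / n  # mean of raw densities
        summary[species] = (ws_persistence, ws_density)
    return summary


def interval_excludes_zero(lower, upper):
    """True when a credible interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# Hypothetical records for two species.
example = [
    {"species": "sp_a", "persistent": True, "density": 100.0},
    {"species": "sp_a", "persistent": False, "density": 300.0},
    {"species": "sp_b", "persistent": True, "density": 50.0},
]
ws = within_species_summary(example)
```

Collapsing records to one value per species matches the tips of the phylogeny, which is why the phylogenetic models use WS-persistence and WS-density rather than record-level data.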

Results

Seed banks of invasive species in native vs alien ranges

Logistic GLMMs of seed bank type (based on 4336 records for 140 invasive species within 103 genera and 32 families, in nine habitat types) showed that the probability of invasive species forming a persistent seed bank was significantly lower in the alien than in the native range when accounting for taxonomic (species nested in genus) and habitat-related patterns (species nested in habitat type) (Fig. 1a; Table 2a). The probability of forming a persistent seed bank was significantly negatively related to seed weight (Pseed_weight < 0.001; Table 2a). PGLMMs and GLMMs of WS-persistence based on a subset of 104 species within 78 genera and 28 families (74% of the species used in the GLMMs using seed bank type data), however, did not show significant differences between the native and alien range in the proportion of persistent records among the total number of records (Table 3).

Fig. 1 Probability of persistent seed bank formation by different life forms based on the results of GLMMs accounting for the significant taxonomic- (species nested in genus) and habitat-structure (species nested in habitat type) of the data, in nine habitat types. Comparisons were made between seed banks of (a) 140 invasive species in their native vs alien ranges (4336 records), (b) 955 invasive vs non-invasive congeneric species in their native ranges (6824 records), and (c) 162 invasive vs non-invasive congeneric species in their alien ranges (1149 records). Species were grouped as annual graminoids (A_gram) and herbs (A_herb), perennial graminoids (P_gram) and herbs (P_herb), and woody species (see Table 1)

Table 2 Probability of forming a persistent seed bank based on generalized linear mixed models testing the effects of (a) origin status (native vs alien range) in comparisons of the seed bank of 140 invasive species (4336 records) and of (b) invasive status (invasive vs non-invasive) in comparisons of the seed bank of 955 invasive vs non-invasive congeneric species in their native range (6824 records)
Table 3 Summary of Bayesian models of within-species proportion of persistent records over the total number of records (WS-persistence) and mean within-species seed bank density (WS-density), where n is the number of species

The final LMM of seed bank density of invasive species, including taxonomy and habitat type as significant random effects, showed significant differences in mean seed bank density between the native and alien range (PAlien = 0.003; Table 4a.1). Invasive woody species and, to a lesser extent, perennial graminoids and other annual and perennial herbaceous species, formed denser seed banks in the alien than in the native range (Fig. 2a). On the other hand, invasive annual graminoids formed denser seed banks in the native range (Fig. 2a). Seed weight was not a significant predictor of seed bank density. A significant effect of origin status (PAlien_P = 0.002) was also observed when including persistent seed bank records only (2426 observations for 92 species in 73 genera and 26 families), with denser seed banks found in the alien range for invasive woody species and smaller seed banks for annual graminoids (Table 4a.2; Fig. 2b), while the effect of seed weight was not significant. Neither PGLMMs nor GLMMs, however, identified a significant effect of origin status on WS-density, whether based on all records or on persistent seed bank records only.

Table 4 Linear mixed effects models of mean seed bank density (seeds per square meter) testing the effects of (a) origin status (native vs alien range) in comparisons of the seed bank of invasive species and (b) invasive status (invasive vs non-invasive) in comparisons of the seed bank of invasive vs non-invasive congeneric species in their native range, based on all records (a.1, b.1) and only seed bank persistent records (a.2, b.2)
Fig. 2 Mean seed bank density values [log(y + 1)] for different life forms based on the results of LMM Gaussian models accounting for the significant taxonomic- (species nested in genus) and habitat-structure (species nested in habitat type) of the data as random factors. Comparisons of seed banks were made between (a) invasive species in their native vs alien distribution range (4336 records, for 140 species in 103 genera and 32 families) and (b) a subset of species based on persistent seed bank records only (2426 records, for 92 species in 73 genera and 26 families). Species were grouped as annual graminoids (A_gram) and herbs (A_herb), perennial graminoids (P_gram) and herbs (P_herb), and woody species (see Table 1)

Models of seed bank density including habitat among the fixed effects rather than in the random structure of the models showed that, overall, origin status was not a significant predictor of seed bank density using all records, but it was significant using only persistent records (PAlien_P_Habitat = 0.002), with significantly denser seed banks in the alien range in anthropogenic and arid habitats, and in shrubland.

Seed banks of invasive vs non-invasive congeners in their native range

Robust patterns were identified when modelling the type and density of the seed bank of invasive and non-invasive congeneric species in their native range. The final logistic GLMMs (based on 6824 observations, for 955 species within 166 genera and 50 families, in nine habitat types), accounting for taxonomy and habitat type as random effects, showed that invasive status was a significant predictor of seed bank persistence, with invasive species having a significantly higher probability of forming a persistent seed bank than their non-invasive congeners (PInv < 0.001), for all life forms (Table 2b; Fig. 1b). The probability of forming a persistent seed bank was significantly negatively related to seed weight [log(seed weight + 1), Pseed_weight < 0.001] (Table 2b). PGLMMs also showed a significant effect of invasive status on WS-persistence (Table 3), even though they were based on records for only the 286 species (29%) for which phylogenetic data were available (in 69 genera and 31 families). Analyses accounting for habitat as a fixed rather than a random factor also showed a significant effect of invasive status on seed bank type (PInv_H < 0.001), and of seed weight (Pseed_weight_H < 0.001), with the probability of forming a persistent seed bank being significantly smaller in invasive congeners in riparian, shrubland, and wetland habitats.

Invasive species formed significantly denser seed banks in the LMMs based on all 6824 seed bank records (PInv < 0.001) and in those based on persistent seed bank records only (3356 observations, for 461 species in 90 genera and 33 families; PInv_P = 0.009), accounting for the taxonomic- and habitat-related structure of the data (Table 4b.1). Higher seed bank densities were observed especially in annual graminoids and woody species (Fig. 3a). Seed weight was significantly negatively related to seed bank density (Pseed_weight = 0.01, Table 4b.2). In LMMs where habitat was included as a fixed factor, invasive status was a significant predictor of seed bank density (PInv_H < 0.001). PGLMMs showed a significantly higher WS-density in invasive than in non-invasive congeners when accounting for both transient and persistent seed bank records (286 species), but not when only persistent seed bank records (174 species) were used (Table 3).

Fig. 3 Mean seed bank density values [log(y + 1)] for different life forms based on the results of LMM Gaussian models accounting for the significant taxonomic- (species nested in genus) and habitat-structure (species nested in habitat type) of the data as random factors. We compared seed bank densities between invasive and non-invasive congeners in their native range based on (a) all records (6824 records, for 955 species in 166 genera and 50 families) or (b) persistent seed bank records only (3358 records for 461 species in 90 genera and 33 families), in nine habitat types. Species were grouped as annual graminoids (A_gram) and herbs (A_herb), perennial graminoids (P_gram) and herbs (P_herb), and woody species (see Table 1)

Models of seed bank density accounting for habitat as a fixed rather than random effect also showed that invasive status was a significant predictor of seed bank density, with invasive congeners forming denser seed banks than non-invasive congeners (PInv_H = 0.007), although within-habitat differences were not significant. The same models based on persistent seed bank records only also revealed a significant effect of invasive status on seed bank density (PInv_P_H = 0.025).

Seed banks of invasive vs non-invasive congeners in their alien range

Invasive species showed a significantly higher probability of forming a persistent seed bank than their non-invasive congeners (PInv_A < 0.001), based on 1149 observations for 162 species within 49 genera and 21 families in eight habitat types (Fig. 1c). The final GLMM model included only invasive status as a fixed predictor and accounted for taxonomy (species nested in genus) and habitat-related patterns (species nested in habitat) as significant random factors. The fixed component of this model, however, explained little variation in the data (\(R_{\text{m}}^{2} = 1.5\%\)), which was instead explained in large part by the random structure (\(R_{\text{c}}^{2} = 62\%\)). PGLMMs including 41% of the species (67 species within 22 genera and 14 families) did not identify significant differences in WS-persistence among invasive and non-invasive congeners in the alien range (Table 3).

No significant differences were identified in the density of the seed bank of invasive and non-invasive alien congeners in LMMs when including all records or only persistent seed bank records (474 observations for 36 species, Fig. 4a). These models did not reveal significant effects of invasive status on seed bank density, when accounting for the taxonomic structure of the data only, for the effect of species nested in habitat among the random factors, or for habitat as a fixed factor. Different patterns were however observed for different life forms (Fig. 4b). Significant differences were identified in PGLMMs based on records for 67 species (Table 3), but not in LMMs based on the same number of species and mean WS-density values.

Fig. 4 Mean seed bank density values [log(y + 1)] for different life forms based on the results of LMM Gaussian models accounting for the significant taxonomic structure of the data (species nested in genus) as a random factor. We compared seed banks of invasive and non-invasive congeners in their alien distribution range for (a) all records (1149 records, for 162 species in 49 genera and 21 families) and (b) persistent seed bank records only (473 records for 85 species in 26 genera and 13 families), in eight habitat types. Species were grouped as annual graminoids (A_gram) and herbs (A_herb), perennial graminoids (P_gram) and herbs (P_herb), and woody species (see Table 1). Neither model revealed significant differences in seed bank density between invasive and non-invasive congeneric species

Discussion

Seed bank of invasive species at home and abroad

Comparing the performance of invasive species in their native and alien ranges is critical for identifying the mechanisms that may promote the invasive potential of certain alien species in their introduced range (Hierro et al. 2005, 2009; Parker et al. 2013). Comparisons among 140 invasive species as well as phylogenetic analyses for 104 of these species showed that the probability of forming a persistent seed bank in invasive species is generally smaller in their alien than in their native range, for all life forms and across habitat types. In terms of density, however, invasive woody species formed significantly denser seed banks abroad, while the seed banks of annual graminoids were denser at home.

A range of mechanisms and processes could underlie these patterns at home and abroad, by affecting seed inputs (seed production and seed rain) and seed outputs (germination or mortality associated with the presence of natural enemies or seed decay). Some seed traits are highly variable and can evolve rapidly in response to environmental uncertainty (Venable and Brown 1988; Cohen and Levin 1991; Donohue et al. 2005, 2010). Differences in the biotic and abiotic conditions at home and abroad can translate, via plastic and/or adaptive processes, into differences in seed production, the degree or type of dormancy and seed size, or the requirements for the breaking of dormancy and for seed germination (Donohue et al. 2010; see Gioria et al. 2012; Gioria and Pyšek 2016 for reviews of these mechanisms). These differences may include, for example, the strength and importance of competitive interactions (e.g., resource competition hypothesis) and indirect competition (see Gioria and Osborne 2014) and the type and density of natural enemies (enemy release hypothesis, e.g., Keane and Crawley 2002; Bossdorf et al. 2004, 2005; Maron et al. 2004).

Unfortunately, only a few direct comparisons between the seed banks of invasive species in their native and alien ranges have been made, making it difficult to generalize about the potential mechanisms and underlying patterns at home and abroad. Among such comparisons, Herrera et al. (2011) studied the reproductive traits and seed bank characteristics of 13 native (Mediterranean Basin) and 15 introduced (California, USA) populations of the invasive broom Genista monspessulana, and found that the seed rain was four times higher in alien than in native populations and seed bank density was 15 times higher in the alien range, with higher post-dispersal seed predation in the native range partially explaining these results. For another invasive broom, Cytisus scoparius, Fowler et al. (1996) found that seed production was higher in introduced populations (Australia, New Zealand) than in native ones (France, UK). However, for this species the density of seed banks was similar in both ranges, mainly due to seed predation by vertebrate seed-feeders in the introduced range, indicating that both rates of seed production and predation contributed to patterns in the seed bank (Fowler et al. 1996).

The fact that woody species formed denser seed banks in the alien range despite a lower (though not significantly) probability of forming persistent seed banks supports previous evidence of larger seed production and lower seed predation in the alien range. For instance, it is well known that invasive woody Australian Acacia species form dense, persistent seed banks in their alien ranges (Richardson and Kluge 2008; Marchante et al. 2010). All Acacia species for which native and alien records were available in our database (i.e. A. dealbata, A. longifolia and A. saligna) formed denser seed banks in the alien than in the native range (Tozer 1998; Holmes 2002; Marchante et al. 2011; Mason et al. 2007; Fourie 2008; González-Muñoz et al. 2012; Meers et al. 2012). This is also consistent with direct evidence of higher seed production (as well as heavier seeds) in the alien range for A. dealbata and A. longifolia, compared to their native range (Correia et al. 2016). Denser seed banks abroad also support previous evidence of the importance of high propagule pressure for woody species in determining their invasiveness in the introduced range (Křivánek and Pyšek 2006; Pyšek et al. 2009).

Higher seed densities are also consistent with evidence of lower seed bank outputs via seed predation in the alien range for a number of invasive woody species. Correia et al. (2016) reported an absence of pre-dispersal predation of A. dealbata and A. longifolia seeds in the alien range as well as a lower proportion of aborted seeds compared to the native range, resulting in the formation of denser seed banks abroad than at home for these two species. Similarly, lower levels of herbivory and higher fecundity have been reported in alien populations (Germany) of the invasive shrub Buddleja davidii compared to native range populations from China (Ebeling et al. 2008), which may explain the consistently denser seed banks observed for this species in its alien range (Gioria and Osborne 2009a, 2010; Li et al. 2011; Kundell et al. 2014). The extent to which lower seed predation affects the seed bank of this species in the alien range, however, remains unclear. It is possible that lower seed predation contributes to denser seed banks in many woody species in their new ranges. However, some species might acquire new enemies in the new range (Vanhellemont et al. 2014), with potential negative effects on seed bank density (Hulme 1998; Krushelnycky 2014; Pearson et al. 2014).

There is little evidence suggesting that differences in germination success (e.g. percentage germination) might contribute to patterns in seed bank density for invasive woody species. Seeds of Rhododendron ponticum (Erfmeier and Bruelheide 2005) and Ulex europaeus (Udo et al. 2017) have been found to germinate more rapidly, but to similar percentages, in the alien than in the native range, suggesting similar seed bank outputs through germination in both ranges. The generality of these findings is, however, unknown. For annual graminoids, in contrast, lower seed bank density in the alien range is consistent with experimental evidence of higher germination success for seeds collected from alien than from native populations for many invasive herbs (see Gioria and Pyšek 2017 for a review). High germination success of seeds produced in a year by alien species represents a useful strategy to overcome a range of reproductive and environmental barriers they encounter in their new ranges (Richardson et al. 2000; Richardson and Pyšek 2012). Higher seed outputs via a larger effect of seed enemies, or lower seed production, are also possible explanations for these species.

Differences in seed bank density might also reflect differences in the structure of native vs alien populations, especially in terms of population density, which in turn depends on the characteristics of the native vegetation and on the residence time (i.e. time since arrival) of aliens (Notes S1). Denser seed banks abroad would be expected for those species that tend to form virtually monospecific stands in their introduced ranges but not in their native ranges (Beerling et al. 1994; Shimoda and Yamasaki 2016), as we observed for woody species.

Analyses assessing the interaction between origin status and habitat type support evidence of increased seed bank persistence and density linked with the degree of habitat disturbance or unpredictability (Harper 1977; Thompson and Grime 1979; Thompson et al. 1993, 1998). For instance, when considering persistent seed bank records only, denser seed banks were found in anthropic habitats in the alien range than in the native range. Denser seed banks were also found in arid habitats in the alien range, suggesting that survival of alien species in these habitats requires larger seed reservoirs (see Pugnaire and Lázaro 2000; Volis and Bohrer 2013). In contrast, smaller seed banks were detected in wetlands in the alien range. Wetlands are generally composed of species forming long-term persistent seed banks that increase in density over time (Leck 1989; Thompson and Grime 1979; Gioria and Osborne 2010). In our study, seed banks in wetlands were all classified as persistent. Lower seed bank densities in the alien range thus suggest lower seed bank accumulation over time, possibly associated with a shorter residence time and/or higher seed bank outputs.

Seed banks of invasive versus non-invasive congeners in the native range

Congeneric comparisons based on data collected from the native distribution range are regarded as a useful approach to assess the role of preadaptation in the invasiveness of alien species (Hamilton et al. 2005; Pyšek and Richardson 2007). Here we used this approach to assess whether the characteristics of the seed bank in the native range across habitat types are useful to discriminate between invasive and naturalized, but not invasive, congeneric species. Comparing the characteristics of seed banks for 955 such congeners revealed some robust patterns: invasive species had a higher probability of forming a persistent seed bank and formed significantly denser seed banks than their non-invasive congeners, regardless of their life form and across habitat types. Phylogenetic analyses for 286 of these species confirmed a significantly greater within-species proportion of records of persistent seed banks and higher mean density values in invasive than in non-invasive congeners. This indicates that the type and density of native seed banks are important attributes that should be included in risk assessments aimed at identifying those species that are more likely to become invasive if introduced into new regions with suitable climatic conditions.

In terms of the potential mechanisms underlying these patterns, the fact that the probability of forming a persistent seed bank was higher in invasive species, irrespective of life form, suggests that higher seed bank densities might be associated with the accumulation of seeds over time in persistent seed banks. This further suggests that invasive congeners might be characterized by a higher degree of dormancy, a higher proportion of dormant seeds, or that their germination requirements might be stricter than those of non-invasive congeners. However, the fact that seed bank density was higher for both transient and persistent components of native seed banks, especially for annual graminoids and woody species, also suggests a greater seed bank input in invasive than non-invasive species. Many invasive species, in fact, produce more seeds than their non-invasive congeners in their native range, even when accounting for differences in plant size (Jelbert et al. 2015), supporting evidence that high seed production is a crucial trait in promoting the invasiveness of alien species (Colautti et al. 2006; Moravcová et al. 2015).

Seed banks of invasive vs non-invasive congeners in the alien range

Comparing congeneric invasive and non-invasive species in their introduced range can help identify the mechanisms that explain why some species become widespread and others do not (Hamilton et al. 2005; Pyšek and Richardson 2007; Gioria and Pyšek 2017). In our study, we did not find major differences in the probability of forming a persistent seed bank or in seed bank density. This suggests that, in the alien range, many invasive and non-invasive congeners share similar seed bank strategies that are useful during the naturalization phase, consistent with recent evidence showing that the formation of a persistent seed bank is an important predictor of the naturalization of alien species in North America (Pyšek et al. 2015). Similarities in the type of seed bank of both invasive and non-invasive congeners in the alien range were expected, as species in both groups must cope with novel conditions and face similar environmental barriers in the introduced range in order to become established (Richardson et al. 2000; Richardson and Pyšek 2012).

This further indicates that factors other than a capacity to form persistent and/or dense seed banks might contribute to the spread of invasive species. These include seed dispersal characteristics (Moravcová et al. 2015) or, for invasive species that reproduce both sexually and asexually, the reliance on vegetative propagation for the colonization of new areas (Fennell et al. 2010; Gioria and Osborne 2010, 2013). Moreover, human-mediated long-distance dispersal often occurs, changing the relative importance of species traits in the invasion process (Richardson et al. 2000; Richardson and Pyšek 2012). However, it is worth noting that congeneric comparisons in the alien range could only be run for a substantially smaller number of species than in the native range (162 species vs 955 species in the native range for GLMMs and 67 vs 286 species for phylogenetic comparisons), potentially contributing to masking the effects of invasive status on seed bank type and density.

Direct comparisons of seed inputs and outputs for invasive and non-invasive congeners in their alien range could provide important insights into the factors that contribute most to the absence of differences in the type and density of the seed bank for these species (Gioria and Osborne 2014; Gioria and Pyšek 2017). In terms of seed inputs, higher seed production in invasive than naturalized species is often reported (Burns 2006; Moravcová et al. 2015; Burns et al. 2013). In terms of seed bank outputs, however, the benefits of higher fecundity may be lost through higher mortality rates in the alien range (Pearson et al. 2011; Connolly et al. 2014), while there is no consistent evidence of differences in germination success between invasive and non-invasive congeners in the alien range (Gioria and Pyšek 2017).

Common characteristics of the seed bank of invasive and non-invasive species

Our study showed that information on seed bank type (transient vs persistent) and, to a lesser extent, seed bank density, acquired in the native range, helps discriminate between invasive species and those that may become established but not invasive. The fact that these seed bank characteristics differed consistently between invasive and non-invasive species in their native range, across life forms and habitats, indicates that information on seed banks is an important factor to include in risk assessments that aim at preventing the introduction of potentially invasive species and/or at identifying the invasive species whose management should be prioritized. However, seed bank persistence is a more reliable characteristic distinguishing invasive from non-invasive species.

Neither attribute of the seed bank can be regarded as a species trait; both depend on a range of species traits and on plastic and/or adaptive responses of these traits to a range of biotic and abiotic conditions during seed development, seed maturation and after seed dispersal (e.g. Fenner and Thompson 2005; Burgos et al. 2008; Donohue et al. 2010; Volis and Bohrer 2013; Baskin and Baskin 2014). The density of seed banks also depends on demographic factors (population density and age structure; Harper 1977), the patchy distribution of the vegetation in a community (Cohen and Levin 1991; Volis and Bohrer 2013), and on residence time, especially for species forming persistent seed banks (Gioria and Pyšek 2016). For such species, seed bank density is expected to increase not only with increasing population density and dominance in the vegetation, but also with the number of seed-rain events (Mason et al. 2007; Richardson and Kluge 2008; Gioria and Osborne 2009b; Zenni et al. 2009; Marchante et al. 2010, 2011; Gioria et al. 2011; see Gioria and Pyšek 2016), potentially resulting in positive feedbacks between increased above- and below-ground abundances (Cox and Allen 2008; Robertson and Hickman 2012).

Such a correlation has not always been observed for invasive plants (Alexander and D’Antonio 2003), with seed accumulation in the soil often being prevented by high mortality rates associated with soil-borne seed pathogens (Orrock et al. 2012) or high levels of predation (Hulme 1998; Krushelnycky 2014; Pearson et al. 2014; Saatkamp et al. 2014). Observed differences in the density of the seed bank might thus be a by-product of successful invasions rather than their cause. The dependence of this variable on so many interacting factors limits our understanding of the mechanisms underlying differences in this attribute of the seed bank in comparative studies. Moreover, most seed bank studies used in our analyses were based on samples collected at one point in time, and residence times and population densities were almost always unknown. This might have led to an under- or over-estimation of seed bank densities, especially for transient seed banks (see Notes S2), making density a less reliable seed bank attribute than seed bank persistence in assessments of the invasive potential of alien species.

Overall, phylogenetically constrained traits played an important role in determining the characteristics of the seed bank, indicative of a propensity for certain genera or species to become invasive. The high contribution of habitat type in explaining the variation in seed bank attributes for invasive and non-invasive species confirmed the highly context-dependent nature of seed-related traits and seed banks and their importance in determining the successful establishment of alien species (Gioria and Osborne 2013; Gioria and Pyšek 2016). Habitat-related patterns found in the native and alien range are also consistent with previous work illustrating that certain invasive plant species form seed banks of different densities depending on the type of habitat in their native (Figueroa et al. 2004) and alien ranges (Gioria and Osborne 2009b).

Significant interactions between the predictors of interest (i.e. origin and invasive status) and life form that we observed in some instances are consistent with field evidence that annual species typically possess more persistent and denser seed banks than perennial species (Thompson and Grime 1979; Thompson et al. 1998; Grime 2001). The largest differences were observed between woody species on the one hand, and annual and perennial species, on the other. This was expected, given the fact that woody species generally possess larger seeds than herbaceous species (Moles et al. 2000; Baskin and Baskin 2014), which often persist for shorter times than small seeds (Thompson et al. 1993, 1998). This is often attributed to the fact that large seeds are less likely to become incorporated in the soil and more prone to predation and/or pathogen infections (Thompson et al. 1993, 1998; Rees and Westoby 1997; Bekker et al. 1998; Turnbull et al. 1999).

Seed weight, however, did not always have a significant effect on the probability of persistent seed bank formation or on seed bank density, such as in models comparing the seed bank of invasive and non-invasive congeners in their alien range, or that of invasive species in their native and alien range. This supports evidence that seed weight is not a reliable predictor of seed persistence or density (see Moles et al. 2000, 2003), although it is so in certain world regions (Thompson et al. 1993). Thus, this trait should not be used as a surrogate for seed bank attributes in risk assessments aimed at identifying the alien plant species that are more likely to become invasive if introduced into new ranges.

Conclusions

Our study is the first to examine the relationship between the characteristics of soil seed banks and the invasive status of alien species for a broad range of invasive and non-invasive congeners across different habitat types and at a global scale. While shared life history traits and the high spatial variability of seed bank data did not allow broad generalizations about the mechanisms underlying the characteristics of the seed bank at different stages of the invasion process (naturalization and invasion), a number of robust patterns emerged from our study. A higher probability of forming a persistent seed bank in invasive than in non-invasive congeneric species in the native range indicates that seed bank type is a useful attribute to be included in risk assessments. Less robust patterns in congeneric comparisons in the alien range suggest that the formation of a persistent seed bank might be useful for the establishment of alien species, but is not necessarily required for them to become widespread. Other factors might thus play a greater role in promoting the spread of alien species, including the mode of seed dispersal or the probability of long-distance dispersal associated with human-related activities. Denser seed banks found for invasive woody species in their alien ranges confirm the role of high propagule pressure in determining the invasiveness of these species and provide support for the critical role of early detection and rapid eradication programs in preventing the formation of substantial seed banks. Overall, we showed that both seed bank persistence and density are important for assessing the risks of naturalization and invasion, besides providing critical information on the magnitude and duration of the effort required in the control of these species. However, differences in seed bank density might be a by-product of successful invasions rather than their cause, making this attribute of the seed bank less reliable than seed bank persistence in risk assessments.