Main

A critical question in global efforts to reduce biodiversity loss is how best to allocate scarce conservation resources. To what extent should conservation be focused on modified and fragmented landscapes where threats are potentially greatest, versus landscapes that are largely intact9? Although it is expected that both approaches have value, in some human-influenced habitats, many species seem surprisingly resilient to habitat loss and fragmentation, and can coexist with humans in highly modified landscapes10,11, provided that habitat loss does not exceed critical thresholds2. Theory predicts that abrupt biodiversity declines are most likely to occur when habitat availability is reduced to very low levels in the landscape (10–30%)3,4,12. Alternatively, recent evidence indicates biodiversity is best conserved by minimizing human intrusion into intact and relatively unfragmented landscapes, which implies concentrating the impacts of anthropogenic disturbance elsewhere5,13. This is because initial intrusion may result in rapid degradation of intact landscapes, not only via the direct effects of habitat loss, but also the coinciding effects of overhunting, wildfires, selective logging, biological invasions and other stressors5. Such evidence has led to recent calls to increase the protection of substantial intact areas of the Earth’s terrestrial ecosystems14,15. Testing the extent to which these alternative hypotheses explain patterns of extinction risk globally can improve the effectiveness of conservation efforts and inform the formulation of policies, affecting the future of life on Earth.

Recent advances in remote sensing have enabled the development of a spatially explicit, high-resolution global dataset on rates of forest change6, which provides the capacity to quantify the effects of contemporary global forest loss on biodiversity16. We quantified the association between global-scale forest loss and gain within the ranges of 19,432 species and their International Union for Conservation of Nature (IUCN) Red List category of extinction risk, recent genuine changes in extinction risk, and overall population trend direction. The species spanned three vertebrate classes, and included 4,396 (22.6%) listed as threatened (Vulnerable, Endangered, or Critically Endangered) and 15,214 (78.3%) associated with forest habitats. Under the ‘habitat threshold’ hypothesis, we expected the effects of recent forest loss to be most detrimental for species that have already lost a substantial proportion of forest within their ranges. Under the ‘initial intrusion’ hypothesis, we expected species with relatively intact forest within their ranges to show the most severe effects of deforestation.

We obtained range maps for amphibians and mammals from the IUCN Red List17 and those for birds from BirdLife International and NatureServe18. We classified species as ‘non-forest’, ‘forest-optional’, and ‘forest-exclusive’ based on the IUCN Red List habitat classification data17. Within each species’ range, we used fine-resolution forest-change data (2000–2014)6 to calculate the amount of recent forest cover, loss, and gain (Fig. 1). Given that many species were assessed for the Red List in the early period of our recent forest-loss data (or even before this; Methods), it would be ideal to have contemporary forest loss data from before 2000. The most spatially contiguous dataset for 1990–200019 covered >80% of the ranges for only 58.7% of the species in our analyses. However, locations of forest loss were highly spatiotemporally correlated at the scale of species’ ranges between 1990–2000 and 2000–2014 (Methods, Intermediate-term forest change; Extended Data Fig. 7).

Figure 1: Spatial distribution of the six variables used to predict species’ IUCN Red List response variables.

a–d, Forest cover in the year 2000 (a), forest loss between 2000 and 2014 (b), forest gain (2000–2012) (c), and human footprint (d). e, The interaction term ‘forest loss × cover’ tested alternative hypotheses that forest loss exerts the greatest negative influence on biodiversity at low versus high initial levels of forest cover. High values of this variable (shown in e) correspond to regions of both high forest cover and loss. f, Historical forest loss represents long-term forest loss in years preceding 2000. Values plotted are averages taken over 0.4° grid cells. The maps are derived from current forest change maps6 (a–c, e, f), an intact forest landscapes map32 (f), biomes of the world33 (f), and human footprint22 (d).

We also expected that historical deforestation over much longer temporal scales could influence species vulnerability, a phenomenon known as ‘extinction debt’20,21. We calculated historical forest loss as the difference between the extents of area within species’ ranges that historically supported forest cover and the area that remained forested in the year 20006 (Fig. 1). We also calculated the mean ‘human footprint’ value22 within each species’ range, because forest loss could be confounded with other broad-scale anthropogenic pressures (Fig. 1). Using these data, we fit a spatial autologistic regression model to test whether forest loss within species’ ranges is associated with the likelihood that a species: (i) is listed as threatened; (ii) has qualified for uplisting to a higher category of extinction risk in recent decades (see Methods); and (iii) has a declining population trend (as classified by IUCN Red List assessors).

As expected, we found a strong association between rate of recent forest loss and each response variable. The odds of threatened status, declining population trends, and uplisting increased by 5.06% (95% confidence interval: 1.01–9.27), 11.34% (6.45–16.45), and 8.39% (1.53–15.70), respectively, for each 1% increase in recent forest loss for forest-exclusive species. This is not surprising, given that estimated or inferred rates of habitat loss are used to inform IUCN Red List assessments under criterion A2, particularly for species lacking direct data on population trends17. Nevertheless, our results confirm that previous categorical estimates of habitat decline (based on a mixture of inference, qualitative and quantitative analysis) match with our global, systematic analysis of quantitative data on forest loss16.

More importantly, we found strong support for the initial intrusion hypothesis for both forest-optional and forest-exclusive species. Species were more likely to be threatened, exhibit declining population trends and have been uplisted if their ranges contained intact landscapes (>90% forest cover) with high rates of recent forest loss (Fig. 2). Evidence for this lies in the strong positive statistical interaction between forest loss and cover (that is, forest loss × cover, Figs 2a, 3) on all response variables for both forest-exclusive and forest-optional species (maximum false discovery rate (FDR)-adjusted P = 0.025, minimum z = 2.51, Fig. 2a, Supplementary Table 3). For example, at high proportions of initial forest cover (90%), the odds of a forest-exclusive species being uplisted were 15.78% (95% confidence interval: 6.99–25.30) greater for each 1% increase in deforestation. At average proportions of forest cover (57%), the effect of an equivalent increase in deforestation was much smaller: the odds of a forest-exclusive species being uplisted increased by only 3.45% (95% confidence interval: −3.91 to 11.36) (Fig. 3). These results were generally similar across vertebrate classes, but amphibians showed the strongest and most consistent effects across response variables (Extended Data Fig. 2). Predictably, forest loss and its interaction with forest cover had little effect on non-forest species (Figs 2, 3).
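
The dependence of the forest-loss effect on initial cover can be illustrated with a short calculation. The sketch below is in Python (the analysis itself was run in R) and back-calculates, from the two reported percentages, the log-odds slopes they imply. It assumes the slope varies linearly with forest cover measured in raw percentage points, whereas the fitted model used standardized predictors, so the derived values `b_loss` and `b_inter` are illustrative and are not the published coefficients.

```python
import math

def log_odds_slope(pct_increase):
    """Log-odds change per 1% forest loss implied by a percent change in odds."""
    return math.log(1.0 + pct_increase / 100.0)

# Reported effects on uplisting for forest-exclusive species.
slope_at_90 = log_odds_slope(15.78)  # initial forest cover of 90%
slope_at_57 = log_odds_slope(3.45)   # initial forest cover of 57%

# Assumption: slope(cover) = b_loss + b_inter * cover (linear in raw cover).
b_inter = (slope_at_90 - slope_at_57) / (90.0 - 57.0)
b_loss = slope_at_57 - b_inter * 57.0

def pct_odds_change(cover):
    """Percent change in odds of uplisting per 1% forest loss at a given cover."""
    return 100.0 * (math.exp(b_loss + b_inter * cover) - 1.0)

for cover in (57, 75, 90):
    print(cover, round(pct_odds_change(cover), 2))
```

By construction the function reproduces the two reported values at 57% and 90% cover. Values at other cover levels are interpolations under the stated linearity assumption.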

Figure 2: Effects of four predictors on the status of 19,432 vertebrate species worldwide.

a, Positive ‘forest loss × cover’ terms indicate that the negative effects of forest loss are amplified in landscapes with greater initial forest cover. b–d, Forest gain tended to have a positive effect on forest-optional and forest-exclusive species (b), whereas historical forest loss (c) and human footprint (d) tended to have negative effects. ‘Threatened status’ refers to IUCN Red List categories of ‘Vulnerable’, ‘Endangered’, or ‘Critically Endangered’. ‘Uplisted in threatened status’ means that the most recent genuine Red List category change for a species has been in the direction of higher endangerment. Forest loss and cover variables were included as main effects, but coefficient estimates are not shown here as they are not readily interpretable in the presence of the interaction term. Error bars represent 95% confidence intervals. Categories for P values are listed as ranges (that is, 0 < P ≤ 0.05, 0.05 < P ≤ 0.1, P > 0.1), and sample sizes (also given in Supplementary Table 1) for non-forest/forest-optional/forest-exclusive are 4,218/3,430/4,218, 10,457/8,827/10,457, 4,757/4,073/4,757 for Threatened status, Declining trend, and Uplisted in threatened status, respectively.

Figure 3: Predicted probabilities of species status as a function of recent forest loss and total forest cover within a species range.

All other covariates (forest gain, historical forest loss, and human footprint) were statistically held at their average values when estimating probabilities. For forest-optional and forest-exclusive species, the effect of forest loss is stronger at high levels of initial forest cover; deforestation in intact forests has the most negative impact, supporting the initial intrusion hypothesis.

Historical forest loss also exhibited a strong negative influence on vertebrate biodiversity (Fig. 2), which may be evidence of an extinction debt in which some species are capable of persisting in landscapes long after initial forest loss has occurred, but subsequently decline. Predictably, increased human footprint has had a generally negative influence on the status of vertebrates associated with forest and non-forest systems (Fig. 2). We also found recent forest gain decreased the likelihood of threatened status (forest-exclusive and forest-optional species) and declining population trend (forest-optional species; Fig. 2). However, amphibians primarily drove these relationships; bird and mammal biodiversity did not show statistically significant responses to forest gain (Extended Data Fig. 2), indicating that young secondary forest does not appear to be ameliorating biodiversity declines for these taxa8.

Overall, the global spatial autologistic regression model performed remarkably well (area under the receiver operating characteristic curve (AUC) = 0.78; Extended Data Fig. 1), even when we conservatively excluded entire regions one at a time (Africa, Americas, Asia, Oceania) and evaluated models on these independent data (AUC = 0.74). Furthermore, results remained consistent when we statistically accounted for phylogenetic dependencies, latitude, and time since each species was initially described (Extended Data Fig. 3). We also applied alternative approaches to account for spatial autocorrelation and excluded species designated as threatened due to characteristically small and declining or fragmented ranges (that is, under IUCN Red List criterion B) (Extended Data Figs 3, 6). Results were also robust to degree of threat; Critically Endangered, Endangered and Vulnerable species all showed similar patterns in response to forest loss (Extended Data Fig. 9).

Strong support for the initial intrusion hypothesis may be surprising, given existing theory3,23 and that a considerable number of conservation programs focus on areas that have already lost substantial forest24. However, such highly deforested landscapes may have already passed through a substantial local extinction filter, whereby the most sensitive species have been lost25. A recent broad-scale study conducted in the Brazilian Amazon revealed that landscapes still exceeding 80% forest cover have lost 46–60% of their conservation value5. Our results suggest that initial forest loss is a potential indicator of other threats to forest biodiversity that are more challenging to measure at large spatial extents. Mechanisms for intrusion effects include increased unregulated hunting26 (especially near new logging roads27), disease and human disturbance, and invasive species28, as well as the direct effects of habitat loss for interior forest specialists29. Indeed, many of the species with ranges that were characterized by high initial forest cover (before 2000), but intensive recent deforestation, tend to be under hunting pressure (for example, Sira curassow (Pauxi koepckeae)) or are habitat specialists (Mendolong bubble-nest frog (Philautus aurantium), Mentawi flying squirrel (Iomys sipora)) (Supplementary Table 4). If specialists’ habitat is targeted in the initial phases of deforestation (for example, accessible high-economic-value forest (bottomland forest adjacent to rivers)), habitat will be lost at much greater rates than indicated by the overall rate of forest loss within a species’ range30.

As a further exploration of the habitat threshold hypothesis, we fit a model to test whether the strongest negative effects of recent forest loss occurred in landscapes with both high and low levels of remaining forest cover (a statistical interaction between forest loss and forest cover squared; Methods). We found no evidence for such an effect for either threatened status or recent uplisting (Extended Data Figs 4, 5). Notably, the odds of a declining population trend showed evidence for this dual effect for forest-optional and -exclusive species; we speculate that the increased likelihood of a declining trend with deforestation in landscapes with low levels of forest cover, but no relationship for threatened status, may constitute early signs of an extinction debt that remains to be fully paid. Thus, our results do not imply deforestation effects are benign in regions with low levels of remaining forest cover. Although species exposed to deforestation in such landscapes are less likely to be designated as threatened than those exposed to similar rates of deforestation in more intact areas, their populations will continue to decline with further habitat loss, which will in time inevitably lead to increased extinction risk.

The spatially explicit nature of our model enabled quantitative predictions of global hotspots where biodiversity is at particularly high risk given reduced (halving current rates), continued, or accelerated (1.5×) future rates of forest loss (each assumes no future forest loss in protected areas with IUCN categories I–VI; Fig. 4). High-risk hot spots emerged in southeast Asia (particularly Borneo), the central-western Amazon and the Congo Basin where the numbers of threatened forest-exclusive species are predicted to increase by 82–134, 34–74, and 5–11, respectively, over the next 30 years under current rates of deforestation. Together, the number of threatened species for these three regions is predicted to increase by 121–219. Currently, only 17.9% of these areas are formally protected (IUCN classes I–VI; Supplementary Table 5) and only 8.9% have strict protection (IUCN classes I–III). These results, alongside evidence of ongoing erosion of intact forest landscapes31, highlight that areas until recently considered to be of “low vulnerability”9 are in fact where anthropogenic disturbance is increasingly putting species at most risk of extinction. New large-scale efforts to reduce both degradation and loss of intact forest landscapes7 are needed to protect against an intensified wave of extinctions in the world’s last wildernesses.

Figure 4: Projected increases in the number of threatened species under three scenarios of future forest loss.

Projections are estimated using the global model. Increased threatened richness (blue to red colour scale) is relative to the fitted probabilities of a species being threatened. For example, a value of 20 would indicate a projected increase of 20 threatened species in a 0.2° grid cell. Only locations with projected increases in threatened species are shown and only forest-exclusive species were used for this projection. Column labels show time spans where the lower limit assumes the effects of forest loss on status are entirely due to deforestation from 2000–2014; the upper limit assumes effects could be partly a function of forest loss in the decades before 2000 (global locations of forest loss are temporally autocorrelated; see Methods, section ‘Intermediate-term forest change’). IUCN protected areas (categories I–VI) are shown in greyscale shading. The maps are derived from the following sources: IUCN Red List species range maps18, recent forest change6, intact forest landscapes32, human footprint22, world biomes33, and the World Database of Protected Areas34.

Methods

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.

Species data

We obtained data on three classes of terrestrial vertebrates (mammals, amphibians, and birds) from the IUCN Red List17. We defined threatened species as those classified as Vulnerable, Endangered, or Critically Endangered on the Red List. We also obtained population trends (‘Increasing’, ‘Stable’, ‘Decreasing’, or ‘Unknown’) from the Red List. We excluded ‘Data deficient’ and ‘Extinct in the wild’ species from our analysis with threatened status as the response variable. Similarly, for the decreasing population trend response, we excluded species with unknown population trends.

For analyses in which we examined change in Red List category, it is necessary to compare time points in which all species in the taxonomic group were assessed, and to consider only those Red List category changes between such assessments that resulted from genuine improvement or deterioration in status (that is, excluding changes owing to improved knowledge or revised taxonomy). These genuine changes underpin the Red List Index35,36. We considered species to have been uplisted if their most recent genuine Red List category change was in the direction of increasing endangerment (Least Concern < Near Threatened < Vulnerable < Endangered < Critically Endangered). These data were obtained from Hoffmann et al.37, and were updated to match the taxonomy on the 2016 IUCN Red List; the set of genuine changes for birds was also updated using data in BirdLife International38. The relevant periods of our primary uplisting dataset are 1980–2004 for amphibians, 1996–2008 for mammals, and 1988–1994, 1994–2000, 2000–2004, 2004–2008, 2008–2012, and 2012–2016 for birds. Additionally, we used all available genuine category change data from 2008–2016 for mammals and 2006–2016 for amphibians. Although these more recent category change data (approximately 100 category changes) are not yet comprehensive (that is, not all species in these taxa have been checked for genuine category changes over these times), they cover a wide range of species and are likely to be reflective of recent changes in forest cover for these species. Genuine category change data are currently unavailable for other time periods.
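
The uplisting rule described above can be written compactly. This is a minimal Python sketch, not the authors' code; the function name and the input layout (a chronological list of genuine category changes per species) are assumptions made for illustration.

```python
# Red List categories in order of increasing extinction risk, as given in the text.
RANK = {
    "Least Concern": 0,
    "Near Threatened": 1,
    "Vulnerable": 2,
    "Endangered": 3,
    "Critically Endangered": 4,
}

def is_uplisted(genuine_changes):
    """Return True if the most recent genuine change moved to a higher-risk category.

    genuine_changes: chronological list of (old_category, new_category) pairs,
    already filtered to exclude changes due to new knowledge or taxonomy.
    """
    if not genuine_changes:
        return False  # no genuine change recorded
    old, new = genuine_changes[-1]
    return RANK[new] > RANK[old]

print(is_uplisted([("Least Concern", "Vulnerable")]))
print(is_uplisted([("Vulnerable", "Endangered"), ("Endangered", "Vulnerable")]))
```

Only the latest genuine change determines the outcome, so a species that was uplisted and later downlisted is not counted as uplisted.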

We classified non-avian species according to habitat usage (forest-exclusive, forest-optional, and non-forest) using the IUCN Red List data coding species against the IUCN habitats classification scheme (http://www.iucnredlist.org/technical-documents/classification-schemes/habitats-classification-scheme-ver3). We treated species using only forest habitat as forest-exclusive, those using forest habitat and at least one other habitat type as forest-optional, and those not using forest at all as non-forest. To categorize bird species, we used higher-quality data on forest dependency from BirdLife International38, treating species with high forest dependency as forest-exclusive, medium and low forest dependency as forest-optional, and not normally using forest as non-forest.
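
The classification rules in this paragraph can be sketched as follows. The Python below is illustrative only: the habitat label "Forest" and the forest-dependency strings are assumed encodings, and the actual IUCN and BirdLife data fields may be coded differently.

```python
def classify_non_avian(habitats):
    """Classify a mammal or amphibian from its top-level IUCN habitat names."""
    is_forest = [h.strip().lower() == "forest" for h in habitats]
    if not any(is_forest):
        return "non-forest"
    # Forest only -> exclusive; forest plus any other habitat -> optional.
    return "forest-exclusive" if all(is_forest) else "forest-optional"

# Hypothetical string labels for the BirdLife forest-dependency levels.
BIRD_CLASSES = {
    "high": "forest-exclusive",
    "medium": "forest-optional",
    "low": "forest-optional",
    "non-forest": "non-forest",
}

def classify_bird(forest_dependency):
    """Classify a bird from its forest-dependency level."""
    return BIRD_CLASSES[forest_dependency.strip().lower()]

print(classify_non_avian(["Forest"]))
print(classify_non_avian(["Forest", "Wetlands (inland)"]))
print(classify_bird("Medium"))
```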

The species range maps used in the analysis were derived from the IUCN Red List for mammals and amphibians, and from BirdLife International and NatureServe18 for birds. For each species, we used only range polygons where presence was classified as ‘Extant’ or ‘Probably extant’. Vertebrates without range maps available were omitted from the analyses (108 mammals, 39 amphibians, and 30 birds). Reptiles were excluded from the analysis as IUCN reptile data are relatively limited39.

After screening for data availability using the steps above, the dataset consisted of 19,432 species (19,615 including data-deficient species), 4,396 (22.6%) of which are listed as threatened. The entire dataset represents 58.2% of the terrestrial vertebrate species globally (98.9% birds, 84.9% mammals, 63.1% amphibians) (based on described species totals from IUCN Red List summary table 1).

Predictor variables

We used six predictor variables in our primary analysis (Fig. 1). Here, we describe these variables in detail.

We used the forest change maps (version 1.2) given in Hansen et al.6 for our analyses. The forest cover map indicates the percentage forest cover in each 30 m pixel in the year 2000. The forest loss and gain maps are both binary and indicate whether forest loss or gain occurred in each pixel. Following Hansen et al., we considered forest to have been ‘lost’ if a stand-replacing disturbance (that is, complete removal of tree cover canopy at the Landsat pixel scale) had occurred between 2000 and 2014, and ‘gained’ if establishment of tree canopy from a non-forest state had occurred between 2000 and 2012. In addition, we included a forest loss × forest cover interaction term to test the hypothesis that the effects of forest loss are dependent upon the total amount of forest within a species’ range. A positive coefficient for such a term would indicate that the effect of recent forest loss on our response variables was amplified when initial forest cover was high (support for the initial intrusion hypothesis). Conversely, a negative coefficient for this interaction term would indicate that the effect of recent forest loss on our response variables was greatest at low forest cover (support for the habitat threshold hypothesis; see main text).

The human footprint map that we used (Global Human Footprint v.2, 1995–2004) measures the extent of human impacts on the environment and is created from nine global data layers covering biome type and biogeographic realm, human population density, human land use and infrastructure (that is, built-up areas, night-time lights, land use/land cover), and human access (coastlines, roads, railroads, navigable rivers)40. Among land cover types, built-up environments increase the human influence index the most, followed by agricultural land cover, and mixed-use land cover (other types do not contribute to the index)22. Thus, loss of forest to these land cover types could cause human footprint to be partially confounded with our forest loss variable, potentially causing our analysis to underestimate the effects of forest loss. A more recent version of this map (1993–2009) was recently released41,42 but the original and updated human footprint maps are highly correlated (r = 0.935 at 2° resolution), so our choice of human footprint map is unlikely to have influenced the results.

In our analysis, ‘historical forest loss’ is an estimate of long-term patterns in forest loss that is not captured by contemporary forest change. To construct this variable, we took the following steps. First, we used a random forest regression model to develop a historical (or potential) forest cover map. We modelled the continuous variable ‘percentage forest cover’ in the year 2000 (from Hansen et al.6) as a function of x and y coordinates, 19 bioclimatic variables (derived from monthly temperature/precipitation) from the WorldClim database13 along with a categorical variable representing forest biomes33. Importantly, to exclude the effects of contemporary anthropogenic disturbance on percentage forest cover we only used data from within ‘intact forest landscapes’ (IFLs) in the regression model. An IFL is defined as “an unbroken expanse of natural ecosystems within areas of current forest extent, without signs of significant human activity, and having an area of at least 500 km²”32. We assumed that forest cover in IFLs is representative of the degree of canopy cover that could be historically supported across the globe. We then extrapolated the fitted values of this model to areas outside IFLs to produce a map of potential or historical forest cover (Extended Data Fig. 10a). Second, we subtracted recent forest cover from historical cover to estimate historical loss (Extended Data Fig. 10b), yielding a map of historical forest loss (Extended Data Fig. 10c). We restricted our modelling to within forest biomes, excluding non-forest biomes and the boreal forest/taiga. Although some forest cover may be present outside forest biomes (for example, in savannahs), limitations in available IFL data for these cover types and the taiga make reconstructing historic forest cover in these biomes impractical. Moreover, forest-obligate species, our primary focus, seldom occur outside forest biomes.

Modelling was conducted at 5-km resolution using rasters in Behrmann cylindrical equal-area projection. We used ArcGIS 10.1 and R for the geospatial analyses43,44. The random forest model was fit using the Rborist R package with the default settings45. We acknowledge that the period of time since historical deforestation can vary widely across locations globally. Nevertheless, in the absence of globally available forest loss data before 2000, this variable is the best available test of whether long-term reductions in forest cover within a species’ range affect Red List category and overall direction of population trend.

Statistical analysis

We used a 2-decimal degree equivalent equal-area grid (constructed using the Behrmann cylindrical equal-area projection). This resolution is considered appropriate for macroecological analyses that involve species’ range maps46. We rescaled covariates to this resolution by taking their average values across each grid cell (ignoring regions over water). We rescaled species’ ranges to the grid by treating a species as present in a grid cell if any part of its range overlapped that cell. We then averaged covariates across species ranges using the averages of their cell values weighted by the proportion of land in each grid cell.
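
The final averaging step can be sketched as a land-weighted mean over the grid cells a species occupies. The Python below is a minimal illustration with hypothetical cell identifiers and values; the function name and the dictionary layout are assumptions, not the authors' R workflow.

```python
def range_mean(covariate, land_fraction, range_cells):
    """Land-weighted mean of a gridded covariate over a species' range.

    covariate, land_fraction: dicts keyed by grid-cell id.
    range_cells: ids of every cell that the species' range overlaps.
    """
    num = sum(covariate[c] * land_fraction[c] for c in range_cells)
    den = sum(land_fraction[c] for c in range_cells)
    return num / den if den > 0 else float("nan")

# Hypothetical toy grid: percent forest loss and proportion of land per cell.
forest_loss = {"a": 2.0, "b": 6.0, "c": 10.0}
land = {"a": 1.0, "b": 0.5, "c": 0.0}   # cell "c" is entirely water

print(range_mean(forest_loss, land, ["a", "b", "c"]))
```

Cells that are mostly water contribute little to the mean, so a coastal range is not dominated by cells with only a sliver of land.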

We modelled the probability of species being threatened, having a declining population trend, or having been uplisted (three separate binary responses) using autologistic regression to account for potential spatial autocorrelation47. The spatial autocovariate was calculated for each species using a symmetric spatial weights matrix as:

$$A_i = \sum_{j \in k_i} w_{ij}\, y_j$$

where i is the ith species, k_i is the set of its neighbours, y_j is the response for the jth species, and w_ij = 1 corresponding to the (i, j) entry of the binary spatial weights matrix48. Geographic distance was calculated using species’ range centroids. The spatial weights matrix and spatial autocovariate were calculated using the spdep package for R44,49.
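
A spatial autocovariate of this kind can be sketched in a few lines. The Python below is a hedged illustration rather than the spdep implementation: it assumes the autocovariate is the unnormalized sum of neighbours' responses under binary weights, that neighbours are the k nearest range centroids, that the matrix is made symmetric by linking i and j if either is among the other's nearest neighbours, and that distances are Euclidean on projected coordinates.

```python
import math

def knn_symmetric_weights(centroids, k):
    """Binary weights: w[i][j] = 1 if j is among i's k nearest centroids, or vice versa."""
    n = len(centroids)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(centroids[i], centroids[j]))
        for j in order[:k]:
            w[i][j] = 1
            w[j][i] = 1  # symmetrize
    return w

def autocovariate(w, y):
    """Sum of neighbours' binary responses for each species."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]

# Hypothetical range centroids (projected units) and threatened status (1 = threatened).
centroids = [(0, 0), (1, 0), (2, 0), (10, 0)]
status = [1, 0, 1, 1]
w = knn_symmetric_weights(centroids, k=1)
print(autocovariate(w, status))
```

The resulting vector would enter the logistic regression as one additional covariate.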

We used the generalized linear model (GLM) function glm in R to fit the logistic regression model, including the covariates described above, the spatial autocovariate, and taxonomic class (as a fixed effect). We estimated standardized coefficients and 95% confidence intervals for all predictor variables (each was standardized (z-transformed) before analysis). Our hypothesis tests were conducted across all three vertebrate classes with six predictor variables, which risks inflating Type I error rate. Sequential Bonferroni-type multiple comparisons are sometimes used to account for such error inflation, but are highly conservative50. Therefore, we used an FDR procedure (the ‘graphically sharpened method’50) which does not suffer from the same loss of power but corrects for multiple comparisons. FDR-adjusted P values were calculated with p.adjust in R44,51.
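
To show how an FDR adjustment works in practice, the sketch below implements the standard Benjamini-Hochberg step-up adjustment in Python. This is a simpler, related procedure swapped in for illustration: the text names the graphically sharpened method and R's p.adjust but does not state which `method` argument was used, so the code should not be read as the exact procedure applied.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    # Visit P values from largest to smallest, tracking the running minimum.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values from four tests.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Unlike a Bonferroni correction, which would multiply every P value by the number of tests, the multiplier here shrinks with rank, which is why the approach retains more power.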

Projecting future status changes

We used our model for threatened status of forest-exclusive species to map the predicted increase in threatened species richness at multiple forest loss rates over time. We did this by simulating continued forest loss at rates of 50%, 100% and 150% of the current rates for the time spans 2030–2045 and 2045–2070. For example, at the current loss rate the area of forest lost would double in 15 years (by 2030). We modified these forest loss projections by setting predicted future loss to zero within IUCN category I–VI protected areas using the polygon type protected area maps in the World Database of Protected Areas (WDPA)34. Assuming that there are no substantial time lags between forest loss and species being listed as threatened, the resulting predictions (probabilities of species being threatened) correspond to 2030. In the event that intermediate term (approximately 1950–2000) forest loss is also closely linked to threatened status (that is, there are time lags between forest loss and status decisions), we included conservative upper time limits corresponding to half the stated forest loss rates. In all cases, predicted current probabilities of being threatened (from the fitted model) were subtracted from the estimated future probabilities; we then mapped the result by summing probabilities for all species in each raster grid cell. As the maps show qualitatively similar patterns, they can conservatively be interpreted as showing ‘relative hot spots’, an interpretation that is valid even if the true intermediate-term rate of forest loss is substantially higher than in our scenario.
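
The projection logic can be sketched as follows. This Python example is a simplified illustration under several assumptions: the coefficients and species records are hypothetical (not fitted values), the other covariates are folded into the intercept, one further period of loss is added at a multiple of the observed rate, and protection is represented as a share of each range rather than by zeroing loss inside protected-area polygons.

```python
import math

def p_threatened(loss, cover, coef):
    """Logistic model with a forest loss x cover interaction."""
    eta = (coef["b0"] + coef["loss"] * loss + coef["cover"] * cover
           + coef["inter"] * loss * cover)
    return 1.0 / (1.0 + math.exp(-eta))

def projected_increase(species, coef, rate=1.0):
    """Per-cell sum over species of future minus current P(threatened)."""
    increase = {}
    for sp in species:
        # Additional loss accrues only in the unprotected share of the range.
        extra = rate * sp["loss"] * (1.0 - sp["protected_share"])
        delta = (p_threatened(sp["loss"] + extra, sp["cover"], coef)
                 - p_threatened(sp["loss"], sp["cover"], coef))
        for cell in sp["cells"]:
            increase[cell] = increase.get(cell, 0.0) + delta
    return increase

# Hypothetical coefficients and species records (loss and cover in percent).
coef = {"b0": -2.0, "loss": 0.02, "cover": 0.0, "inter": 0.001}
species = [
    {"loss": 5.0, "cover": 90.0, "protected_share": 0.2, "cells": ["x", "y"]},
    {"loss": 5.0, "cover": 30.0, "protected_share": 0.0, "cells": ["y"]},
]
for rate in (0.5, 1.0, 1.5):
    print(rate, projected_increase(species, coef, rate))
```

Each cell's value is an expected number of additional threatened species, which is how a sum of probability differences should be read.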

To assess overlap between existing protected areas and hot spots (at high risk of increases to the Red List), we used the ‘predicted increase in threatened species richness’ map for 2030–2045 at the current loss rate. Within each regional panel of this map set in Fig. 4, we considered hot spot areas to be those with at least one quarter of the maximum predicted increase in threatened richness for that region. We estimated the percentage of these areas that is protected using the World Database of Protected Areas (WDPA)34. For this analysis, we report both strictly protected areas: IUCN categories Ia (Strict Nature Reserve), Ib (Wilderness Area), II (National Park), and III (Natural Monument or Feature) and all other IUCN categories (IV, Habitat/Species Management Area; V, Protected Landscape; VI, Protected area with sustainable use of natural resources). In addition, we only consider protected areas with polygon data in the WDPA, which results in a conservative estimate of the percentage of high-risk area that is protected.
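
The hot spot rule and the protected percentage can be sketched as below. This Python example assumes equal-area grid cells and a per-cell protected fraction; the function name, cell identifiers and values are hypothetical.

```python
def hotspot_protection(increase, protected_fraction, threshold=0.25):
    """Percent of hot spot area that is protected within one region.

    increase: predicted increase in threatened richness per cell.
    protected_fraction: share of each cell inside protected areas (0 to 1).
    Hot spots are cells with at least `threshold` times the regional maximum.
    """
    peak = max(increase.values())
    hot = [c for c, v in increase.items() if v >= threshold * peak]
    return 100.0 * sum(protected_fraction[c] for c in hot) / len(hot)

# Hypothetical region with three grid cells.
increase = {"a": 20.0, "b": 6.0, "c": 4.0}
protected = {"a": 0.1, "b": 0.3, "c": 0.9}
print(hotspot_protection(increase, protected))
```

In this toy region the cutoff is 5, so cell "c" is excluded even though it is the best protected.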

Assessing model performance

We used the area under the receiver operating characteristic curve (AUC) to assess model performance for our primary model (predicting threatened status for forest exclusive species). The AUC reflects the true versus false positive rates for a binary classifier with continuous output as a function of the threshold used to determine which outputs correspond to which categories52.
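
The AUC has a convenient rank interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is the Mann-Whitney (Wilcoxon rank-sum) statistic divided by the number of positive-negative pairs. A minimal Python sketch with hypothetical scores follows; the pairwise loop is quadratic and intended only for illustration.

```python
def auc(scores, labels):
    """AUC as the share of positive-negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical fitted probabilities and observed threatened status.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 3 of 4 pairs ordered correctly -> 0.75
```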

We calculated AUC both for the ‘All species’ model (with ‘class’ as a fixed effect) and separately for each class using models fit to individual classes. We did this with and without the spatial autocovariate term. In each case, we also quantified model performance using fourfold cross-validation by regions of the world (Supplementary Table 2). We used a regional grouping (Africa, Americas, Asia, Oceania) based on the United Nations Statistics Division classification system53. Using entire regions as hold-out test datasets further reduces the positive effects of dependency (spatial, taxonomic, and so on) on model performance metrics54. The raw and cross-validated AUCs (0.784 without cross-validation, 0.743 with cross-validation) for ‘All species’ together (with the autocovariate) indicate that our models perform well (Extended Data Fig. 1). For each model, we also calculated P values from a Wilcoxon rank-sum test55 to quantify whether the AUCs were significantly greater than 0.5 (a baseline at which the model is performing no better than random chance). All P values except the one for mammals with cross validation and no auto-covariate term were highly significant (<0.001) (Extended Data Fig. 1).
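
Cross-validation by region amounts to holding out every species assigned to one region, fitting on the rest, and evaluating on the held-out set. The Python sketch below only generates the index splits; the function name and the one-region-per-species assignment are assumptions made for illustration.

```python
def region_folds(regions):
    """Yield (held_out_region, train_indices, test_indices) for each region."""
    for held_out in sorted(set(regions)):
        train = [i for i, r in enumerate(regions) if r != held_out]
        test = [i for i, r in enumerate(regions) if r == held_out]
        yield held_out, train, test

# Hypothetical region label for each species.
regions = ["Africa", "Asia", "Africa", "Oceania", "Americas"]
for name, train, test in region_folds(regions):
    print(name, train, test)
```

Because no held-out species shares a region with the training data, spatially dependent observations cannot inflate the test AUC in the way a random split would allow.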

Alternative statistical methods to account for spatial autocorrelation

We tested the residuals of our global autologistic regression model for spatial autocorrelation; for all response variables, Moran’s I was <0.15 across all distance classes, indicating that the autocovariate had removed spatial autocorrelation. Furthermore, to ensure that our results were robust to the type of spatial model applied, we fit other spatial logistic regression models (that is, Moran eigenvector filtering, simultaneous spatial autoregressive models (SAR), and Bayesian conditional autoregressive models (CAR)) to assess sensitivity to the procedure used for modelling or accounting for spatial autocorrelation. For reference, we also fit a non-spatial generalized linear model, along with a version of our primary spatial autologistic regression model that used the 50 nearest neighbours of each species (instead of 5). In each case, the models were fit using forest-exclusive species with threatened status as the response. We fit models for each taxonomic class separately, as not all of the procedures could readily incorporate the hierarchical structure of the data. Our results were robust to the spatial autocorrelation modelling method (Extended Data Fig. 6). Details on other spatial models applied are given below.
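For readers unfamiliar with the statistic, the following sketch computes Moran’s I for a set of values (for example, model residuals) and a single spatial weights matrix. It is a generic textbook formulation under assumed inputs, not the distance-class procedure used in the analysis; the function name and the dense list-of-lists weights format are illustrative.

```python
def morans_i(values, weights):
    """Moran's I for values x_i and spatial weights w_ij (dense square matrix)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    s0 = sum(sum(row) for row in weights)           # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]     # weighted cross-products
                for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)                 # total squared deviation
    return (n / s0) * (cross / total)


# Four sites on a line, each linked to its immediate neighbours.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
print(morans_i([1, 2, 3, 4], w))    # smooth gradient: positive autocorrelation
print(morans_i([1, -1, 1, -1], w))  # alternating values: negative autocorrelation
```

Values near zero indicate little residual spatial structure, which is the pattern reported above.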

We fit a Moran eigenvector GLM filtering model by adding covariates to the generalized linear model that were computed using the ME function in the spdep R package49,56. This spatial filtering model involves augmenting the predictor matrix with eigenvectors computed from the spatial weights matrix so as to reduce the spatial autocorrelation of the residuals (as estimated using the Moran’s I statistic). The smallest subset of eigenvectors that causes the permutation-based Moran’s I test P value to exceed a threshold α is chosen for inclusion (we used α = 0.2, which is a common default value).

We fit CAR and SAR models using the binary spatial weights matrix described above. The conditional autoregressive model was fit using the CARBayes R package57. Markov chain Monte Carlo sampling errors were encountered when fitting a few of the CAR models. In such cases, the model results are not available. The simultaneous autoregressive model was fit using the splogit function in the McSpatial package58. It is based on an approximation (linearization), which allows the model to be fit to large datasets59.

Estimates within taxonomic classes

While the primary results presented in the main text (Fig. 2) are for all classes together (with class included as a fixed effect), we also fit models using data from each class separately (Extended Data Fig. 2). We did this to assess the extent to which our results, particularly for the forest loss × cover interaction, are consistent between classes.

Accounting for the effects of latitude

We fit models including latitude as a main effect (Extended Data Fig. 3a). We did this to test whether our results were robust to this potential confounding variable, which is correlated with numerous variables that may be linked to endangerment such as net primary productivity (NPP) and per capita gross domestic product (GDP). The estimated forest loss × cover interaction term did not change substantially when accounting for (absolute) latitude (Extended Data Fig. 3a).

Quadratic models (loss × cover squared interaction)

We fit models with quadratic interaction terms corresponding to forest loss × cover² to test whether the models with only the linear forest loss × cover terms were adequate for forest-exclusive and forest-optional species. Support for a quadratic interaction term would provide evidence for both the initial intrusion hypothesis and the threshold hypothesis; in other words, the effects of forest loss on species status and trends are most substantial both at very high and very low initial forest amounts (see main text). These quadratic terms were generally non-significant (Extended Data Fig. 4), supporting the hypothesis that the effect of forest loss on the odds of species being threatened, declining, or uplisted varies linearly with forest cover. However, in the overall (all species) models, we found strong evidence that the forest loss × cover² term was positive when declining trend was the response variable (Extended Data Fig. 4). This suggests that the effect of loss on population trends may be most negative at both low and high levels of forest cover, and smallest (near zero) at intermediate levels of forest cover (Extended Data Fig. 5).
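The interpretation of the quadratic interaction can be made explicit with a short derivation. The notation is illustrative and is not taken from the original model specification: η denotes the linear predictor (log odds), L forest loss, C forest cover, and the β terms the corresponding coefficients; all other model terms are elided.

```latex
\eta = \cdots + \beta_{L} L + \beta_{C} C + \beta_{LC}\, L C + \beta_{LC^{2}}\, L C^{2} + \cdots
\quad\Longrightarrow\quad
\frac{\partial \eta}{\partial L} = \beta_{L} + \beta_{LC}\, C + \beta_{LC^{2}}\, C^{2},
\qquad
C^{*} = -\frac{\beta_{LC}}{2\,\beta_{LC^{2}}}
```

When the quadratic coefficient is positive, the per-unit effect of loss on the log odds is a convex function of cover with its minimum at C*, so the effect is weakest at intermediate cover (provided C* lies within the observed range) and strengthens toward both low and high cover. When the quadratic coefficient is zero, the expression reduces to the linear loss × cover interaction.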

Tropical forest species

As the ecology of tropical forests often differs from that of non-tropical forests, we also examined model results for species found exclusively in tropical forests (Extended Data Fig. 3b). We did this by restricting the species set to those with ranges containing only grid cells that overlap tropical forests. We determined tropical forest regions using a map of biomes33 and treating the following biomes as tropical forest: ‘Tropical & Subtropical Moist Broadleaf Forests’, ‘Tropical & Subtropical Dry Broadleaf Forests’, ‘Tropical & Subtropical Coniferous Forests’ and ‘Mangroves’. The restriction of our dataset to tropical forest species did not substantially alter our primary results, although it did weaken the forest loss × cover effect on the likelihood of declining population trends (Extended Data Fig. 3b).

Range area

Species’ geographic range area is a key predictor of extinction risk, and extent of occurrence and area of occupancy are two parameters used to assess species under criterion B of the IUCN Red List. This can pose a circularity issue for comparative extinction risk analyses, particularly those that attempt to assess the effect of geographic range area relative to the effects of other predictors on species endangerment60. A common remedy is to run the analysis on species classified as Least Concern and those that are listed as Near Threatened or threatened for reasons not directly linked to small geographic range area (that is, not under criterion B)60. We followed this procedure as part of our sensitivity analysis. Specifically, we excluded species listed as threatened under criterion B. Such species made up 2,529 (approximately 58%) of the 4,396 threatened species in our full dataset. The results (Extended Data Fig. 3c) show that our overall conclusions are robust to the exclusion of these species.

Forest loss and cover threshold

In our primary analysis, we used the forest loss and cover variables directly as given in Hansen et al.6. Forest cover is a continuous variable ranging from 0% to 100% cover within each pixel and forest loss is a binary variable indicating whether or not tree cover canopy had been completely removed between 2000 and 2014. Since the effects of forest loss and cover on endangerment (status/trends/uplisting) probably vary depending on the initial amount of forest cover, we replicated our analyses, but truncated forest loss and cover at the 75% threshold (Extended Data Fig. 3d). That is, we treated cover and loss as zero in pixels that had less than 75% initial forest cover. This change did not influence our results substantially (Extended Data Fig. 3d).
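The truncation rule can be sketched per pixel as below. The function name and list-based inputs are assumptions for illustration; the rule itself (cover and loss set to zero where initial cover is below 75%) follows the description above.

```python
def truncate_at_threshold(cover, loss, threshold=75.0):
    """Zero out cover and loss in pixels below the initial-cover threshold.

    cover: percent tree cover per pixel (0 to 100) at the start of the period.
    loss:  0/1 loss indicator per pixel over the period.
    """
    new_cover, new_loss = [], []
    for c, l in zip(cover, loss):
        if c < threshold:
            new_cover.append(0.0)  # pixel no longer counts as forest
            new_loss.append(0)     # so loss there is ignored as well
        else:
            new_cover.append(c)
            new_loss.append(l)
    return new_cover, new_loss


print(truncate_at_threshold([80.0, 74.9, 75.0, 10.0], [1, 1, 0, 1]))
```

Range-level covariates would then be averages of these truncated pixel values across each species’ range.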

Forest loss and gain standardization

The forest loss and gain variables in our analysis can be thought of in terms of percentages of species’ ranges since they are averages of spatial variables across species’ ranges. An alternative way to compute the forest loss and gain variables is as percentages of forest cover within species’ ranges. We used these standardized loss and gain variables (that is, loss divided by cover and gain divided by cover) as part of our sensitivity analysis (we similarly standardized historical loss by dividing by potential cover), and found that their use had little effect on our results (Extended Data Fig. 3e). This provides another way of quantifying forest loss and gain, which may be particularly appropriate for species that have little forest cover within their ranges. This was uncommon in our core dataset as we focused on forest-optional and -exclusive species, which tend to have high forest cover across their ranges.
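A small worked example of the standardization: if loss averages 2% of a species’ range and forest covers 40% of that range, the standardized loss is 5% of the forest within the range. The helper below is hypothetical, and returning None for ranges with no forest cover is an assumption of this sketch rather than a rule stated in the text.

```python
def standardize_by_cover(change, cover):
    """Express range-level loss or gain as a fraction of range-level cover."""
    if cover <= 0:
        return None  # assumption: undefined when the range has no forest cover
    return change / cover


# Loss is 2% of the range, cover is 40% of the range.
print(standardize_by_cover(2.0, 40.0))  # fraction of forest lost
```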

Accounting for phylogeny

The models that we fit assume that the dependence structure of the observations is purely spatial. However, this may not be valid as species that are phylogenetically similar may be more likely to have the same status, trend, or uplisting variable values, even after accounting for the covariates in the models. To explore this issue of potential phylogenetic dependence and its effect on our results, we fit generalized linear mixed models using glmer in the lme4 R package61, including random effects by taxonomic order (Extended Data Fig. 3f). We were unable to fit more complex phylogenetic models that use full trees (for example, phylogenetic logistic regression) because detailed phylogenetic data are not available for many of the species in our analysis62. However, the addition of taxonomic-based random effects did not substantially alter our results, suggesting that the effects of phylogenetic dependence are weak after accounting for spatial autocorrelation and the other predictors (Extended Data Fig. 3f).

Assessing sensitivity to resolution

We tested the sensitivity of the results to the spatial resolution used in our analysis (2 decimal degree equivalent equal-area) by re-computing the covariates (averages across species’ ranges) at a finer resolution of approximately 5 km. In this analysis, we refined the species’ ranges by clipping them using the species’ altitude limits coded on the IUCN Red List, when available (6,047 of 19,615 species). We also excluded forest loss, gain, and cover inside of known tree plantations using a map of plantations for seven tropical countries63. Covariate averages at high resolution were calculated using Google Earth Engine. Coefficient estimates show relatively low sensitivity to our choice of resolution, clipping ranges by altitudinal limits, and masking out forest variables within known plantations (Extended Data Fig. 3g).

Intermediate-term forest change

Our primary forest change variables are from 2000 to 2014. We also included a derived ‘historical forest cover’ variable to account for long-term forest change. However, given that many species were listed in the early period of our recent forest-loss data (or even before this), it would be ideal to have contemporary forest loss data from before 2000. Unfortunately, no spatially contiguous datasets exist for this period. Nevertheless, to extend the time span for the more recent forest change variables, we added 1990–2000 forest loss and gain estimates to the 2000–2014 estimates, producing estimates of loss and gain for the period 1990–201419. This summed dataset covered >80% of the ranges for only 58.7% of the species in our analyses. Using these data, the forest loss × cover interaction term was weaker. However, consistent with our primary analyses, estimates still tended to be positive for forest-optional and -exclusive species (Extended Data Fig. 3h). It is likely that the smaller effect size estimates are related to uncertainty in the 1990–2000 dataset caused by missing data (Extended Data Fig. 7). Importantly, we found a high correlation between 1990–2000 and 2000–2014 forest loss at low levels of missing data, which suggests that locations of intermediate-term and recent forest loss are correlated at the scale of species’ ranges (there is temporal autocorrelation in forest loss; Extended Data Fig. 7). This correlation is further supported by the country-level correlations between 1990–2000 and 2000–2015 net forest loss (that is, change in percentage cover) obtained using the Food and Agriculture Organization’s (FAO) Global Forest Resources Assessment country-level data64 (unweighted correlation 0.705, country land-area-weighted correlation 0.805; Extended Data Fig. 8). This explains strong effects of forest loss during the 2000–2014 period even though some species may not yet have fully felt the effects of this most recent loss (or had their status updated accordingly).
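The difference between the unweighted and land-area-weighted country-level correlations can be illustrated with a weighted Pearson correlation. This is a generic formulation with hypothetical inputs; the text does not state the exact estimator used, so the choice here is an assumption.

```python
import math


def weighted_correlation(x, y, w=None):
    """Pearson correlation of x and y with optional non-negative weights."""
    if w is None:
        w = [1.0] * len(x)  # unweighted case
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total

    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Hypothetical net loss in two periods for three countries, with land areas as weights.
loss_early = [0.0, 1.0, 2.0]
loss_late = [0.0, 1.0, 0.0]
print(weighted_correlation(loss_early, loss_late))                   # unweighted
print(weighted_correlation(loss_early, loss_late, [1.0, 1.0, 0.0]))  # third country given zero weight
```

Weighting by land area lets large countries dominate the estimate, which is why the two reported correlations can differ.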

Year of discovery

Newly described species are often from remote areas (that is, with initial high forest cover) where development is starting to take place (discovery was facilitated by access); such species are highly likely to be classed as threatened65. To explore how time since initial species description influenced our results, we conducted a sensitivity analysis including ‘year of species description’ as a predictor. We gleaned year of description from the taxonomic authority sections of Red List fact sheet accounts. For 18 of the species in our analysis, two adjacent years were reported (for example, “Highton, 1971 (1972)”). In these cases, we used the average of the two years. In addition to a main effect for year, we included the three-way forest loss × forest cover × year interaction. This directly tests the hypothesis that the initial intrusion effect (the statistical interaction between forest loss and cover) is mediated by the time when a species was initially described, with the expectation that the most recently described species are more likely to show such effects. However, there was little statistical support for this hypothesis; the strength of the forest loss × forest cover interaction (our primary focus) was largely unchanged (Extended Data Fig. 3i).
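Extracting the year of description from an authority string, including the two-year case, might look like the sketch below. The function name and the regular expression are assumptions for illustration; averaging every four-digit year found in the string is a simplification that matches the rule above when two adjacent years are given.

```python
import re


def description_year(authority):
    """Return the (mean) year found in a taxonomic authority string, or None."""
    # Match plausible four-digit years from 1600 to 2099.
    years = [int(y) for y in re.findall(r"\b(1[6-9]\d{2}|20\d{2})\b", authority)]
    if not years:
        return None
    return sum(years) / len(years)


print(description_year("Highton, 1971 (1972)"))  # two adjacent years are averaged
print(description_year("(Linnaeus, 1758)"))      # single year
```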

Threshold for threatened species

It is possible that species in different threat categories could respond in contrasting ways to forest loss. For instance, we expected species listed as Endangered and Critically Endangered to be more likely to support the habitat threshold hypothesis; these species only become extremely threatened when forest continues to be lost at high rates after most original habitat has been lost. Therefore, we tested effects of forest loss, forest amount and their interaction on successive levels of IUCN threat categories (Extended Data Fig. 9). We compared model results to those obtained when threatened species were taken to be Endangered or Critically Endangered species and Critically Endangered species alone. Our overall conclusions were consistent across threat categories (Extended Data Fig. 9).
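The successive response definitions can be expressed as a single binary coding with a movable cut-off. The sketch below is illustrative: the category abbreviations follow standard Red List usage, but the function and constant names are hypothetical, and categories outside the ordered list (for example Data Deficient) are not handled.

```python
# Red List categories ordered from lowest to highest extinction risk.
CATEGORY_ORDER = ["LC", "NT", "VU", "EN", "CR"]


def is_threatened(category, minimum="VU"):
    """True if the category is at or above the chosen minimum threat level."""
    return CATEGORY_ORDER.index(category) >= CATEGORY_ORDER.index(minimum)


species = ["LC", "VU", "EN", "CR"]
for cutoff in ("VU", "EN", "CR"):
    # Each cut-off yields a different binary response for the same species.
    print(cutoff, [is_threatened(c, cutoff) for c in species])
```

Refitting the same model with each of these responses gives the comparison across threat levels described above.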

Data availability

Data that support the findings of this study have been deposited with figshare at: https://doi.org/10.6084/m9.figshare.4955465.v4.