Introduction

Insect defoliation of forest canopies arises from complex interactions among insect populations and their endogenous and exogenous drivers (Turchin and Taylor 1992). Outbreaks occur when defoliator populations increase exponentially and disturb large areas of forest (Cooke et al. 2006). Spatial propagation of outbreak populations remains poorly understood, in part because defoliation effects are often ephemeral and difficult to quantify. Ephemeral spatial patterns, however, may reveal processes that drive disturbance behavior (Fortin et al. 2003). For example, defoliation that mirrors the distribution of host tree species suggests that populations are responding to bottom-up resource constraints, whereas unrelated patterns suggest top-down (e.g., predation, parasitism, disease) or dispersal-related processes (Hunter et al. 1997), or some combination (Liebhold et al. 2000; McCullough 2000; Peltonen et al. 2002). Improved analysis of the relative role of these processes will enhance landscape models of outbreak occurrence, spread and impact that often lack spatial detail.

Spatial patterns are increasingly used to explain and predict defoliation outbreaks (Liebhold et al. 1991, 1998), albeit primarily over regional scales (Magnussen et al. 2004; Seidling and Mues 2005; Tobin and Blackburn 2008). The use of spatial tools to understand finer-grain, intermediate- and landscape-scale defoliation patterns has lagged behind. In this paper, we examine patterns at an “intermediate” scale, which falls between the traditional scales of defoliation research (regional scale) and population ecology research (plot scale). We define intermediate scales to range from 100s to 10,000s of ha, which allows for characterization of important processes that contribute to defoliation patterns, such as dispersal or disease spread (Weseloh 2003).

Past analyses of defoliation patterns have relied on limited data, primarily aerial sketch maps delineated by hand from a moving airplane or from aerial photographs (Hohn et al. 1993; Candau et al. 1998). Such remote visual identification works where defoliation exceeds 30 % (MacLean and MacKinnon 1996) and produces polygons classified as defoliated or not. The resulting regional estimates of defoliation extent are useful for studies of defoliator population dynamics (Gray et al. 2000; Candau and Fleming 2005). These data, however, are limited in the types of processes they can describe because they: (1) provide only binary indicators of defoliation occurrence, which oversimplifies the range of defoliation intensity (0–100 % of canopy foliar biomass) (Dobbertin 2005), and (2) aggregate sketch-map polygons to coarse resolutions (typically >1 km²) to account for geospatial uncertainty, and thus lack the detail needed to understand finer-scale processes.

Recent technological and analytical advances, however, enable more detailed mapping of the intensity of defoliation events (Fraser and Latifovic 2005; Townsend et al. 2012). For example, Townsend et al. (2012) mapped defoliation intensity across mid-Atlantic forests by measuring changes in vegetation indices from Landsat TM images before and during a European gypsy moth (GM, Lymantria dispar L.) outbreak and tying them to ground-based defoliation measurements. This methodology quantitatively maps the full intensity and extent of defoliation caused by an outbreak of the exotic, but naturalized, gypsy moth (Fig. 1).

Fig. 1 Study area and maps of defoliation intensity for the two-year gypsy moth outbreak, derived from Landsat TM and ground-based defoliation measurements. Red: heavily defoliated; black: non-forest land use or clouds; yellow: approximate state forest boundaries. (Color figure online)

Maps of defoliation intensity enable us to improve our understanding of the bottom-up resource factors and spatial processes that contribute to intermediate-scale defoliation patterns. Statistical models can explain how some components of defoliation pattern depend on exogenous factors (i.e., factors that influence insect populations in a density-independent fashion). Exogenous factors include the distribution and phenology of host species, topography, and population suppression activities. Once we have accounted for exogenous factors, we can isolate residual patterns that may indicate important endogenous (i.e., density-dependent) processes such as insect dispersal. Such residual spatial variation is rarely addressed, though defoliation research frequently focuses on the importance of underlying factors to local-scale disturbance dynamics (Campbell and Sloan 1977; Davidson et al. 2001a).

Forests in the eastern US suffer widespread defoliation by gypsy moth populations and are well-suited for the spatial analysis of defoliation patterns. The gypsy moth was introduced to the United States in Massachusetts in 1869. Since then, it has expanded its range north, south and west, establishing populations that fluctuate to outbreak levels with a periodicity of 5–10 years (Baker 1941; Johnson et al. 2006). The gypsy moth is a voracious generalist that feeds on hundreds of deciduous species but has limited endogenous dispersal capabilities in the US (Doane and McManus 1985; Liebhold et al. 1995).

We used linear modeling and hierarchical partitioning to examine the direction and explanatory strength of exogenous drivers hypothesized to affect gypsy moth defoliation intensity. Our analysis addressed two questions: (1) How much of the spatial variation in defoliation intensity is explained by host abundance, phenology, topography, and suppression patterns? (2) Are residual patterns spatially autocorrelated across the landscape in a manner that agrees with gypsy moth movement and dispersal?

We expected exogenous patterns to explain a significant portion of the variance in gypsy moth defoliation intensity, but that some residual spatial pattern would remain. We predicted that residual patterns would be spatially autocorrelated over distances that are biologically relevant to gypsy moth life history traits, such as dispersal distances, and would vary from the first year of an outbreak (2000) to the second (2001). We expected first year variance to be higher with shorter spatial autocorrelation if the initial outbreak was patchier. In the second year, we expected that less variance would be explained by exogenous factors and that residual autocorrelation would range over longer distances if residual patterns reflected dispersal from high density populations to less desirable sites.

Methods

Study area

Our study focuses on the Green Ridge State Forest (GRSF, Fig. 1), which covers 19,000 ha in the Ridge and Valley physiographic province of western Maryland, U.S.A. The forest is managed for timber, wildlife, recreation and water resources. Topography ranges from 400 to 900 m elevation and is characterized by steep ridges that run northeast to southwest. Oak species (Quercus spp.) dominate diverse deciduous forests (Foster and Townsend 2004), providing a matrix of gypsy moth hosts that vary subtly in preference and quality (Montgomery 1991). Gypsy moth populations invaded this area in the 1980s (Francis Zumbrun, former Forest Manager, Maryland Department of Natural Resources (MDNR), personal communication) and have since undergone periodic outbreaks in the 1980s, 1990s, and in 2000–2001.

The abundant oaks at GRSF mix with occurrences of pine and hemlock communities (Foster and Townsend 2004). Red oak (Quercus rubra L.) dominates along flattened mesic ridgetops. Chestnut oak (Q. prinus L.) forests are common on drier ridges and black oak (Q. velutina Lam.) forests occupy midslopes. A large swath of white oak (Q. alba L.) forest occurs at lower elevations mixing with scarlet oak (Q. coccinea Muench.) on xeric hill tops. Mixed hard pine stands, composed primarily of Virginia pine (Pinus virginiana Mill.), occur on xeric aspects near white oak forest. White pine (P. strobus L.) is more common in red oak forests. Deciduous bud-burst phenology is heterogeneous in this landscape at spatial and temporal scales that may interfere with gypsy moth larval dispersal (Foster et al. 2013).

Data

We analyzed spatial datasets that included maps of defoliation intensity and forest composition derived from remote sensing imagery as well as plot-based forest inventory data (Table 1). In 1998–1999, just prior to the gypsy moth outbreak, the Maryland DNR collected Continuous Forest Inventory (CFI) plot data at GRSF. They measured 436 plots for standard forest structure variables such as tree species, diameter (DBH), height, and age. Variable radius plots averaged approximately 0.08 ha in size and were distributed on a systematic grid with 550 m spacing. GPS coordinates for plots were updated in 2004 and 2007.

Table 1 Data sources

We specified general explanatory variable categories (Table 1) using a variety of measurements and indices common to defoliation research (Table 2). Our goal was to use a parsimonious model within the “driver-response paradigm” (Cushman et al. 2007) to understand the relative importance of habitat characteristics within these categories to defoliation intensity. We calculated tree biomass using standard allometric equations (Jenkins et al. 2001). We grouped tree species into gypsy moth food preference classes following Liebhold et al. (1995): susceptible genera such as Quercus (susceptibility class #1), resistant genera such as Acer (class #2), and immune genera such as Fraxinus (#3). Basal area (BA) of preferred species is a common predictor of defoliation susceptibility (Kleiner and Montgomery 1994; Liebhold et al. 1997; Davidson et al. 2001b). Phenological asynchrony is the absolute value of the difference between satellite derived leaf-out dates and model derived gypsy moth egg-hatch dates (Foster et al. 2013). Greater difference between these critical phenological events should decrease defoliation risk by interacting with early stage larval survival. Pesticide status is a binary classification of sprayed areas. Because defoliation is often observed to vary with topography, we derived topographic indices from a DEM (Table 1).
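The two simplest derived variables described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing chain: the function names, the day-of-year inputs, and the example plot are hypothetical, and the basal area calculation treats every tallied tree equally (as on a fixed-area plot), ignoring the expansion factors that a variable radius plot would require.

```python
import math

def phenological_asynchrony(leaf_out_doy, egg_hatch_doy):
    """Absolute difference, in days, between leaf-out and egg-hatch dates (day of year)."""
    return abs(leaf_out_doy - egg_hatch_doy)

def basal_area_m2(dbh_cm):
    """Stem cross-sectional area (m^2) at breast height from diameter in cm."""
    radius_m = dbh_cm / 200.0  # cm diameter -> m radius
    return math.pi * radius_m ** 2

def relative_basal_area(trees, target_class):
    """Fraction of total basal area in one susceptibility class.

    trees: list of (dbh_cm, susceptibility_class) tuples, classes 1-3
    as in the text (1 = susceptible, 2 = resistant, 3 = immune).
    """
    total = sum(basal_area_m2(dbh) for dbh, _ in trees)
    if total == 0:
        return 0.0
    in_class = sum(basal_area_m2(dbh) for dbh, cls in trees if cls == target_class)
    return in_class / total

# Hypothetical plot: four equal-sized trees, two of them in class 1.
plot_trees = [(30.0, 1), (30.0, 1), (30.0, 2), (30.0, 3)]
rba_susceptible = relative_basal_area(plot_trees, 1)
asynchrony_days = phenological_asynchrony(118, 125)  # made-up dates
```

Taking the absolute value means the index treats early and late hatch relative to leaf-out symmetrically, which matches the definition given in the text.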

Table 2 Variables tested to explain mapped defoliation intensity from four categories of habitat characteristics

Map sampling

We derived explanatory variables for CFI plot locations, excluding plots harvested or obscured by clouds in defoliation maps, for a total of 376 plots (hereafter called the plot-based dataset). We used defoliation intensity from plot locations as the dependent variable for plot-based regression models.

The plot-based dataset provided quantitative field measurements of host abundance, but did not permit us to assess spatial autocorrelation for distances shorter than 550 m. Windborne dispersal of gypsy moth larvae occurs over distances as short as 100–200 m, and rarely exceeds 1 km (Mason and McManus 1981; Hunter and Elkinton 2000). To measure spatial autocorrelation over shorter distances, we also created models using map-based data (30 m by 30 m resolution). To generate the map-based dataset, we randomly sampled 5,000 points from each mapped data layer, with a minimum inter-point distance of 45 m. The main difference from the plot-based dataset was that we did not have unique forest inventory data for these random locations. To create point estimates of variables within the host abundance category (Table 2), we calculated means from CFI plots that intersected with forest classes in an existing map of twenty forest community types (Table 1) (Foster and Townsend 2004). Therefore, random points were assigned class means as host abundance values. We used the plot-based dataset for model selection and comparison of variable importance and used the map-based dataset to assess fine-scale autocorrelation for selected models.
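Random sampling with a minimum inter-point distance can be sketched with simple rejection sampling. The example below is an assumption about how such a sample could be drawn, not a description of the software actually used: it samples a hypothetical rectangular extent rather than the irregular forest boundary, and the function name and parameters are illustrative.

```python
import math
import random

def sample_min_distance(n_points, width_m, height_m, min_dist_m,
                        seed=0, max_tries=1_000_000):
    """Draw random points in a rectangle, rejecting any candidate closer
    than min_dist_m to an already accepted point."""
    rng = random.Random(seed)
    grid = {}    # cell index -> accepted points in that cell
    points = []
    tries = 0
    while len(points) < n_points and tries < max_tries:
        tries += 1
        x = rng.uniform(0.0, width_m)
        y = rng.uniform(0.0, height_m)
        cx, cy = int(x // min_dist_m), int(y // min_dist_m)
        # Cells have side min_dist_m, so any conflicting point must lie
        # in the 3 x 3 block of cells around the candidate.
        too_close = any(
            math.hypot(x - px, y - py) < min_dist_m
            for i in range(cx - 1, cx + 2)
            for j in range(cy - 1, cy + 2)
            for px, py in grid.get((i, j), ())
        )
        if not too_close:
            points.append((x, y))
            grid.setdefault((cx, cy), []).append((x, y))
    return points

# Smaller demonstration than the 5,000 points described in the text.
points = sample_min_distance(500, 6000.0, 6000.0, 45.0, seed=42)
```

A 45 m minimum spacing is 1.5 times the 30 m pixel size, so two sampled points cannot fall in the same or a directly adjacent pixel, although they may still fall in diagonally adjacent pixels.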

Statistical analyses

We expected mapped defoliation intensity to vary in relation to four exogenous factors following the basic linear model (Eq. 1).

$$ {\text{Defoliation}} = \beta_{0} + \beta_{1} ({\text{Host Abundance}}) - \beta_{2} \left| {({\text{Phenological Asynchrony}})} \right| + \beta_{3} ({\text{Topography}}) - \beta_{4} ({\text{Pesticide}}) + e_{i} $$
(1)

Rather than create a predictive model, we used a linear modeling framework to understand the significance and direction of the relationship between these factors and defoliation intensity (Mac Nally 2002). We made this distinction because defoliation intensity is not normally distributed. Rather, it may be bimodal and is theoretically bounded from 0 to 100 % of canopy foliage (Fig. 2). Though less suited for prediction in such cases, linear models produce estimates of mean response that are robust to non-normal data, and thus may be used to assess explanatory variable importance.

Fig. 2 Density histogram (grey) of defoliation intensity from a random sample of the defoliation map (2000). Defoliation approximates a skewed Beta distribution with asymmetric peaks near 0 and 100 %. Black line shows the empirical cumulative distribution function

We evaluated all possible models with up to four predictors based on their ability to explain variation in defoliation intensity as described by R² and Mallows’ Cp using the SAS REG procedure (Version 9.1 of the SAS System for Windows. Copyright © 2002–2003 SAS Institute Inc.). We selected models that explained the most variation and retained significant coefficients for up to one variable of each category in Eq. 1. We then tested the selected models for ecologically plausible interactions. We graphically evaluated model residuals and employed hierarchical partitioning analysis (Chevan and Sutherland 1991) to assess the relative importance of explanatory variables (Mac Nally 2002).
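An all-subsets search scored by R² and Mallows' Cp can be sketched as follows. This is an illustrative stand-in for the SAS REG procedure, not a reproduction of it: the helper names, the synthetic predictors, and the coefficients are all hypothetical. Cp is computed as RSS_p / s² − n + 2p, where s² is the residual mean square of the model with all candidate predictors and p counts the intercept.

```python
import itertools
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def rss(X_cols, y, subset):
    """Residual sum of squares of an OLS fit (with intercept) on the chosen columns."""
    rows = [[1.0] + [X_cols[k][i] for k in subset] for i in range(len(y))]
    p = len(subset) + 1
    xtx = [[sum(r[a] * r[c] for r in rows) for c in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bk * v for bk, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def all_subsets(X_cols, y, max_size=4):
    """Score every predictor subset up to max_size by R^2 and Mallows' Cp."""
    n, k = len(y), len(X_cols)
    ybar = sum(y) / n
    ss_tot = sum((v - ybar) ** 2 for v in y)
    s2_full = rss(X_cols, y, tuple(range(k))) / (n - k - 1)
    results = []
    for size in range(1, max_size + 1):
        for subset in itertools.combinations(range(k), size):
            r = rss(X_cols, y, subset)
            p = size + 1  # parameters including the intercept
            results.append({"subset": subset,
                            "r2": 1.0 - r / ss_tot,
                            "cp": r / s2_full - n + 2 * p})
    return results

# Synthetic data: five candidate predictors, only the first two matter.
rng = random.Random(1)
n = 120
X_cols = [[rng.random() for _ in range(n)] for _ in range(5)]
y = [5.0 + 3.0 * X_cols[0][i] - 2.0 * X_cols[1][i] + rng.gauss(0.0, 0.5)
     for i in range(n)]
results = all_subsets(X_cols, y, max_size=4)
best = min(results, key=lambda r: r["cp"])
```

R² can only increase as predictors are added, so it tends to favor larger models, whereas Cp penalizes each added parameter. The additional rule described above (at most one variable per category) would be applied as a filter on the candidate subsets.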

We next used semivariograms to understand residual spatial autocorrelation and thereby to quantify the connectivity of defoliation in the forest landscape (Fortin and Dale 2005; Tobin and Blackburn 2008). We created empirical semivariograms from residuals of defoliation intensity from plot-based models using the geoR package for the R statistical software (maximum distance = 10,200 m, lag distance intervals = 600 m) (R Development Core Team 2008). We assessed statistical significance using 95 % confidence intervals and Monte Carlo simulation with 1,000 permutations. We evaluated spatial linear models against non-spatial models with nested likelihood ratio tests.
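The empirical semivariogram and its permutation envelope can be sketched without geoR. The code below is an approximation of that procedure under stated assumptions: the function names are hypothetical, the coordinates form an idealized 550 m grid, the residuals are synthetic with a built-in east-west trend, and only 199 permutations are used to keep the demonstration fast (the analysis in the text used 1,000).

```python
import math
import random

def pair_bins(coords, lag_width, max_dist):
    """List (i, j, bin) for every point pair separated by less than max_dist."""
    pairs = []
    for i in range(len(coords)):
        xi, yi = coords[i]
        for j in range(i + 1, len(coords)):
            d = math.hypot(xi - coords[j][0], yi - coords[j][1])
            if d < max_dist:
                pairs.append((i, j, int(d // lag_width)))
    return pairs

def semivariogram(pairs, values, n_bins):
    """Mean of 0.5 * (z_i - z_j)^2 per distance bin (None for empty bins)."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for i, j, b in pairs:
        sums[b] += 0.5 * (values[i] - values[j]) ** 2
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def permutation_envelope(pairs, values, n_bins, n_perm=199, seed=0):
    """Approximate 2.5 and 97.5 percentiles of the semivariogram when
    values are randomly reassigned to locations."""
    rng = random.Random(seed)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        sims.append(semivariogram(pairs, shuffled, n_bins))
    lower, upper = [], []
    for b in range(n_bins):
        col = sorted(s[b] for s in sims if s[b] is not None)
        if not col:
            lower.append(None)
            upper.append(None)
            continue
        lower.append(col[int(0.025 * (len(col) - 1))])
        upper.append(col[int(math.ceil(0.975 * (len(col) - 1)))])
    return lower, upper

# Synthetic example: 10 x 10 grid at 550 m spacing with an east-west trend.
spacing = 550.0
coords = [(i * spacing, j * spacing) for i in range(10) for j in range(10)]
rng = random.Random(7)
residuals = [x / 1000.0 + rng.gauss(0.0, 0.1) for x, _ in coords]

lag_width, max_dist = 600.0, 10200.0
n_bins = int(math.ceil(max_dist / lag_width))
pairs = pair_bins(coords, lag_width, max_dist)
gamma = semivariogram(pairs, residuals, n_bins)
lower, upper = permutation_envelope(pairs, residuals, n_bins, n_perm=199)
```

A semivariance that falls below the lower envelope at short lags indicates that nearby residuals are more alike than expected by chance. In this synthetic example the trend guarantees that outcome; residuals with no spatial structure would instead fall inside the envelope at all lags.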

We evaluated finer-scale autocorrelation in heavily defoliated areas with local spatial statistics (Fortin and Dale 2005). We fit local semivariograms centered on individual pixels using spherical models for highly defoliated patches (maximum lag = 3,000 m) with the program VESPER (Minasny et al. 2005). We delineated patches with Local Moran’s I, which provides a measure of the direction and strength of autocorrelation between a pixel and its neighbors within concentric lag distances (Anselin 1995). We defined heavily defoliated areas as positively autocorrelated with Moran’s I values >0.10 (>97.5th percentile, lag = 555–600 m) calculated using ENVI software (ENVI Version 4.3, Copyright © 2006, ITT Industries, Inc. 4990 Pearl East Circle, Boulder, CO 80301, USA).
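Local Moran's I for a single location compares its deviation from the mean with the average deviation of its neighbors. The sketch below follows the general form in Anselin (1995) with row-standardized binary weights inside a distance ring. The weighting scheme, function name, and toy grid are assumptions for illustration and may differ from what ENVI computes.

```python
import math

def local_morans_i(coords, values, d_min, d_max):
    """Local Moran's I per point, using neighbors at distances in (d_min, d_max].

    I_i = (z_i / m2) * mean(z_j over neighbors j), where z is the deviation
    from the global mean and m2 is the mean squared deviation.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n
    out = []
    for i, (xi, yi) in enumerate(coords):
        nb = [z[j] for j, (xj, yj) in enumerate(coords)
              if j != i and d_min < math.hypot(xi - xj, yi - yj) <= d_max]
        if not nb or m2 == 0:
            out.append(None)  # no neighbors in the ring, or no variation
        else:
            out.append(z[i] / m2 * (sum(nb) / len(nb)))
    return out

# Toy 6 x 6 grid of 30 m pixels: western half undefoliated, eastern half defoliated.
coords = [(30.0 * i, 30.0 * j) for i in range(6) for j in range(6)]
values = [100.0 if x >= 90.0 else 0.0 for x, _ in coords]
local_i = local_morans_i(coords, values, 0.0, 45.0)
```

Pixels deep inside either block get strongly positive values because they resemble their neighbors, while pixels along the boundary get values nearer zero. Note that positive values mark both high-high and low-low clusters, so identifying heavily defoliated patches also requires checking that the pixel value itself is high.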

Results

Models

CFI plot dataset

The strongest individual predictors of defoliation intensity in 2000 were current-year phenological asynchrony (R² = 0.14) and landform index (LFI) (R² = 0.07) (McNab 1992). Tree density explained the most variance among measures of host abundance (R² = 0.06). The final multivariate model for 2000 explained 21 % of the variance in mapped defoliation (Table 3). This model used four predictors and one significant interaction (see Eq. 1): pesticide spray status (Pest00), landform index (LFI), phenological asynchrony (Phen00), tree density (Dens), and the interaction [LFI × Dens]. All the predictors explained significant variation (ANOVA F-tests), although tree density became non-significant when its interaction was included (Table 3). The direction of the variable effects generally agreed with our expectations (Eq. 1). Phenological asynchrony and pesticide spraying negatively affected defoliation intensity, while landform index and tree density had positive effects.

Table 3 Coefficients for linear models of defoliation

In 2001, the best-fit model explained 34 % of the variance (Table 3) and included five variables: previous year’s defoliation (Defo00) (R² = 0.14 individually), pesticide spray (Pest01), relative slope position (RSP), phenological asynchrony (Phen01), and relative basal area of immune hosts (RBA#3). This model had significant interactions between prior year defoliation and pesticide spray [Defo00 × Pest01] and asynchrony and relative slope position [Phen01 × RSP]. Semivariograms of CFI plot-based model residuals showed no significant spatial autocorrelation in either year.

Hierarchical partitioning of final models with and without interaction terms decomposed the variance explained (R²) by each variable (Fig. 3). Large independent contributions to explained variance (i.e., large relative to joint contributions) show that variables explain unique partitions of defoliation variance and that collinearity is inconsequential. Interaction terms complicate this analysis because they are the product of two variables, and thus must jointly explain variance. Hierarchical partitioning of the final model in 2000 with no interactions showed that three of the four variables had high independent contributions, supporting the conclusion that they explained unique aspects of the data (Fig. 3a). Landform index, the topographic variable, had a higher joint contribution. Phenological asynchrony explained the most variance (58 %), followed by tree density (21 %), landform index (19 %), and spray status (2 %). With interaction terms, the joint effect of landform index decreased while that of density increased, presumably because the interaction terms were jointly explaining defoliation variance (Fig. 3b). The joint contribution of the interaction term, [LFI × Dens], was negative, meaning that this interaction had a suppressing effect on the explanatory power of other variables. For this reason, coefficients for one of the raw predictors became non-significant once interactions were included.
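The independent and joint contributions reported above come from averaging each variable's gain in R² over all orderings of model hierarchy (Chevan and Sutherland 1991). A minimal sketch follows, assuming the R² of every predictor subset has already been computed. The function name and the two-predictor R² values are hypothetical and are not results from this study.

```python
import itertools

def hierarchical_partition(predictors, r2):
    """Split each predictor's contribution into independent and joint parts.

    r2 maps frozenset(subset of predictors) -> R^2 of the model using that
    subset, including frozenset() -> 0.0 for the intercept-only model.
    Returns {predictor: (independent, joint)}.
    """
    k = len(predictors)
    out = {}
    for target in predictors:
        others = [p for p in predictors if p != target]
        level_means = []
        for size in range(k):
            # Gain in R^2 from adding the target to every subset of this size.
            gains = [r2[frozenset(s) | {target}] - r2[frozenset(s)]
                     for s in itertools.combinations(others, size)]
            level_means.append(sum(gains) / len(gains))
        independent = sum(level_means) / k
        joint = r2[frozenset([target])] - independent
        out[target] = (independent, joint)
    return out

# Hypothetical R^2 values for two correlated predictors "A" and "B".
r2 = {
    frozenset(): 0.0,
    frozenset(["A"]): 0.30,
    frozenset(["B"]): 0.20,
    frozenset(["A", "B"]): 0.40,
}
parts = hierarchical_partition(["A", "B"], r2)
```

The independent contributions always sum to the R² of the full model (here 0.25 + 0.15 = 0.40). A negative joint contribution, as reported for the interaction term, arises when a variable's gain in R² is larger in the presence of other variables than when it is fitted alone.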

Fig. 3 Hierarchical partitioning results. Variance explained, as a percentage of the total model R², by predictors included in final plot-based models for 2000 without interaction terms (a), the full model in 2000 (b), without interaction terms in 2001 (c), and the full model in 2001 (d). Independent contributions in grey, joint contributions hatched. Variables include pesticide spray status in 2000 or 2001 (Pest00 or Pest01), phenological asynchrony in 2000 or 2001 (Phen00 or Phen01), landform index (LFI), relative slope position (RSP), tree density (Dens), and basal area of tree species immune to gypsy moth (S3RBA)

Hierarchical partitioning of the 2001 plot-based model showed that defoliation from the previous year explained the greatest proportion of the explained variance (60 %), followed by relative slope position (14 %), phenological asynchrony (13 %), pesticide spraying (7 %) and relative basal area #3 (immune species) (5 %). All of the predictor variables decreased in the proportion of variance explained in comparison to 2000 except for pesticide spray status, which explained more variance (Fig. 3c). Both previous year defoliation and pesticide spray had negative joint contributions, suppressing some of the variance potentially explained by other variables.

Map-based data set

Map-based variables allowed improved characterization of fine-scale spatial patterns (distances <550 m), but were constrained to mean measures of host abundance. As a result, map-based data explained less of the variance in defoliation than the plot-based variables. The final model explained only 16 % of the defoliation variance in 2000, and the coefficient for pesticide spray became marginally non-significant (Table 3). The final model for 2001 (Table 3) explained 20 % of the defoliation variance, but some terms became non-significant.

Residuals from map-based models revealed significant spatial autocorrelation at distances both shorter and longer than plot-based minimum spacing (550 m) (Fig. 4). Spherical semivariogram models uncovered global spatial autocorrelation ranging up to 788 m (2000) and 461 m (2001). Sill semivariance in the first year was significantly higher than in 2001, indicating more variability in residual defoliation, as we expected. Plot-based models remain better suited for interpretation of variable importance, due to the observed independence of plot-based model residuals.
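The autocorrelation range reported here is the distance parameter of a fitted spherical semivariogram model. The sketch below shows the model form and a crude fit: for each candidate range, the nugget and partial sill are estimated by simple linear regression, and the range with the smallest squared error is kept. The fitting strategy, function names, and parameter values are assumptions for illustration and do not describe the fitting routines used in the analysis. This simplified fit also does not constrain the nugget to be non-negative.

```python
def spherical(h, nugget, psill, a):
    """Spherical semivariogram: rises from the nugget to nugget + psill at range a."""
    if h <= 0:
        return 0.0
    if h >= a:
        return nugget + psill
    r = h / a
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(lags, gammas, candidate_ranges):
    """Grid search over the range; nugget and partial sill by least squares."""
    n = len(lags)
    gbar = sum(gammas) / n
    best = None
    for a in candidate_ranges:
        # Unit-sill, zero-nugget shape for this candidate range.
        s = [spherical(h, 0.0, 1.0, a) for h in lags]
        sbar = sum(s) / n
        sxx = sum((v - sbar) ** 2 for v in s)
        if sxx == 0:
            continue
        psill = sum((v - sbar) * (g - gbar) for v, g in zip(s, gammas)) / sxx
        nugget = gbar - psill * sbar
        sse = sum((g - nugget - psill * v) ** 2 for v, g in zip(s, gammas))
        if best is None or sse < best[0]:
            best = (sse, nugget, psill, a)
    return best[1], best[2], best[3]

# Noise-free synthetic semivariances with made-up parameters.
lags = [15.0 + 30.0 * i for i in range(60)]
gammas = [spherical(h, 50.0, 300.0, 800.0) for h in lags]
nugget_hat, psill_hat, range_hat = fit_spherical(lags, gammas, range(200, 2001, 50))
```

Beyond the fitted range the model is flat at the sill (nugget plus partial sill), which is why the range is read as the distance beyond which residuals are no longer spatially correlated.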

Fig. 4 Semivariograms for map-based defoliation model residuals (points) with fitted spherical models (lines) show significant autocorrelation. Dashed vertical lines correspond to global autocorrelation ranges derived from spherical semivariogram models

Local autocorrelation

Positive local Moran’s I values indicated spatial autocorrelation with surrounding neighborhood pixels in heavily defoliated forests (Fig. 5). Local autocorrelation measured from individual pixels followed a right-skewed distribution with a mean of 323 m in 2000 (Fig. 6a), with 95 % of ranges falling below 596 m. Defoliation in 2001 was locally autocorrelated with a right-skewed distribution of ranges peaking between 50 and 300 m and a mean of 341 m (Fig. 6b).

Fig. 5 Local Moran’s I at 600 m lag distances. Red: defoliated patches with strong positive autocorrelation. Weakening of autocorrelation patches in the southern part of the study area suggests localized population crashes in 2001. (Color figure online)

Fig. 6 Frequency distributions of local semivariogram ranges for spherical models of defoliation in 2000 (a) and 2001 (b). While most defoliated pixels are autocorrelated with neighbors from 50 to 300 m away, some pixels are autocorrelated up to longer neighborhood distances (1,000–2,000 m). Spikes are an artifact of interactions between bin distance definition and Landsat pixel resolution

Discussion

Our linear models demonstrate that some of the intermediate-scale variation in defoliation intensity (20–34 %) can be explained by exogenous habitat characteristics: distribution of host abundance, phenology, topography, pesticide spray and prior-year defoliation. Significant interactions occurred in the first year of the outbreak between host abundance and topography. The direction and strength of the model coefficients and their interactions are ecologically informative. Phenological asynchrony and pesticide spraying reduced defoliation intensity, while landform index and tree density had positive effects in the model. Surprisingly, relative measures of host abundance (BA or RBA) did not explain as much variance as tree density or foliar biomass. This may reflect the even dominance of preferred gypsy moth hosts at GRSF (mean RBA of preferred hosts = 67 %); relative host abundance varies too little across the landscape to sufficiently explain defoliation patterns. In theory, landform index, phenological asynchrony and pesticide spray status should modify the effect of host abundance, by indirectly affecting host or habitat quality and suitability. The fact that these variables explained more variance than host abundance suggests that modifiers become more important when defoliation occurs in host-dominated landscapes.

While host abundance and topography are known to affect gypsy moth outbreaks, plot-level research analyzing the role of phenological asynchrony on gypsy moth populations has produced equivocal results (Hunter and Elkinton 2000). In our models, phenological asynchrony explained 60 % of the total explained variance in 2000 (Fig. 3), more than three times as much as host abundance, topography, or spray status. Prior studies have identified phenological asynchrony as a factor important to population dynamics of deciduous (Hunter et al. 1997; Hunter and Elkinton 2000) and coniferous defoliators including jack pine budworm (Choristoneura pinus pinus Freeman) (McCullough 2000) and spruce budworm (Choristoneura fumiferana Clemens) (Nealis and Regniere 2004). Until now, however, phenological asynchrony has not been measured across intermediate scales in a way that unlocks its potential to explain defoliation patterns. Our analysis demonstrates that variations in asynchrony between gypsy moth egg-hatch and bud-burst helped explain which areas of host-dominated forest were defoliated, especially when considered in light of variations in host abundance or topography (Foster et al. 2013). Hunter et al. (1997) found a similar pattern for winter moth (Operophtera brumata), concluding that host plant foliage quality (e.g., phenological asynchrony) controlled the spatial variation in population densities, though not the temporal fluctuations. Although this relationship may not hold for every landscape, the pronounced variation in phenology among red and white oak forests at Green Ridge clearly produces heterogeneity in foliage availability beyond the susceptibility classes defined by Liebhold et al. (1995).

Defoliation intensity from the previous year accounted for only 20 % of defoliation variance in 2001 (i.e., 60 % of the 34 % variance explained by the model). One might expect that defoliation caused by a species with limited dispersal, such as the gypsy moth, would be highly correlated from 1 year to the next. Gypsy moth population densities, however, are known to change rapidly and unpredictably, especially during later phases of an outbreak. For example, Montgomery (1990) found that overwintering egg-mass densities explained only 25 % of the variability in defoliation occurring in the same year and generation. Accordingly, defoliation intensity from prior years should explain a smaller proportion of variance, which is consistent with the 20 % we observed.

Although pesticide spray history explained significant variance in both years, its explanatory strength was low. Defoliation varied significantly in relation to spray patterns in 2000 according to univariate ANOVAs, but with heavier defoliation in sprayed areas. We attribute this counterintuitive observation to the fact that spray programs select areas for pesticide suppression based on criteria such as high egg mass counts, the presence of favorable hosts, and high economic value. Forests with these characteristics are more likely to be defoliated if spraying is ineffective or inadequate. In 2000, spraying of Bacillus thuringiensis (Bt) was followed by rain that could have washed the toxic bacteria off leaves. Dimilin and Mimic were sprayed in other treatments, both of which are less susceptible to rain. A variable response of defoliation to Bt spraying is consistent with previous studies (Liebhold et al. 1996). In addition, spray efforts often focused on ridgetops, hence landform index or relative slope position may have accounted for variance that could have been explained by spray status in the models. In 2001, gypsy moth defoliated sprayed areas less than unsprayed forests, as expected, and spraying had more predictive power. Our results suggest that pesticide spraying may have added noise to defoliation patterns that dampened the strength of relationships with other exogenous variables.

At finer spatial scales, residual defoliation was locally autocorrelated primarily from 0 to 600 m, demonstrating variability in connectivity of gypsy moth populations over these scales. We speculate that dispersal and disease patterns help determine the range of endogenous autocorrelation in residual defoliation. Gypsy moth dispersal does not occur in the adult stage because females are flightless in the U.S. (Keena et al. 2008); they mate and lay eggs on the same trees where they pupate. Gypsy moths disperse primarily by ballooning during the earliest instars as tiny caterpillars, and to a lesser extent when larger caterpillars move around the forest floor. Passive dispersal of early instar gypsy moth (100–200 m, <1 km) (Mason and McManus 1981) agrees well with the ranges of spatial autocorrelation found in residual mapped defoliation intensity.

If dispersal explained local autocorrelation patterns, we expected defoliation to be patchier and more dependent on resource conditions in the first year (2000). In the second year (2001), we expected passive dispersal to spread populations to less desirable sites, producing more diffuse patterns, characterized by lower residual variance and longer autocorrelation ranges. Residual variance was higher in 2000 than in 2001 (Fig. 4), consistent with a patchier first year. However, the range of autocorrelation became shorter in the second year, suggesting that patches were shrinking rather than expanding. Some patches shrank from 2000 to 2001 (Fig. 5) (NW study area) while others expanded (NE study area) and some disappeared (SE study area), consistent with local population crashes. Heterogeneity in local patch behavior complicates interpretation of changes in mean autocorrelation range. Exogenous variables explained less variance in defoliation intensity in 2001 than in 2000, consistent with defoliation becoming less dependent on bottom-up constraints, and more dependent on spatial adjacency. A temporal weakening of the relationship between exogenous resource factors and defoliation impacts has also been observed for a spruce budworm outbreak (Campbell et al. 2008). These factors may be most useful to explain where defoliation outbreaks initiate, while patterns of subsequent generations respond more to dispersal or top-down processes as outbreaks develop.

Exogenous habitat characteristics accounted for significant variation in defoliation intensity (16–34 %) at GRSF. These results are similar to the variance (17 %) attributed to bottom-up effects on population densities of winter moth (Hunter et al. 1997). This suggests that gypsy moth defoliation patterns may depend similarly on bottom-up constraints. Population fluctuations also vary across space in a stochastic fashion, possibly in response to unmeasured factors whose spatial distribution is stochastic or density-dependent in nature, such as predation pressure from small mammals (Elkinton et al. 1996; Goodwin et al. 2005), and variability in occurrence and spread of nuclear polyhedrosis virus (NPV) (Dwyer and Elkinton 1995) and the fungal pathogen Entomophaga maimaiga (Dwyer et al. 1998; Weseloh 2003). Gypsy moth demographic data that capture some of these factors increase the explanatory power of defoliation models. For example, Davidson et al. (2001b) included egg mass density with host basal area to model gypsy moth defoliation, which could explain as much as 39 % of the variance (n = 48). Montgomery (1990) used gypsy moth population variables to predict defoliation for plots (n = 32) spread over New England. Demographic data, including density of egg masses, percent of eggs hatched, disease incidence, and early instar survival, combined with some host resource variables, explained 70 % of defoliation variance in his data. Inclusion of these endogenous variables in our models might account for some of the unexplained variance in defoliation intensity, but currently it is infeasible or impossible to measure these variables continuously at landscape scales. Our limited knowledge of the spatial distributions of these processes provides further support for modeling defoliation patterns in a semi-stochastic fashion with some dependence on measurable exogenous factors.

We designed our model selection strategy to identify the most parsimonious models with significant predictors from each of four exogenous categories. This approach facilitated comparison of the explanatory power of these categories and helped minimize collinearity. However, some models using different combinations of explanatory variables performed as well as the final models we report here. If we had relaxed our model selection criteria to allow up to two variables from one category, we could at times explain slightly more variance in defoliation intensity with fewer predictor variables. For the first year of the outbreak, multiple measures of host abundance were often selected in the top-performing models. These model sets often combined generic abundance variables such as total foliar biomass or tree density with relative measures of host abundance. In contrast, in 2001, topographic variables increased in importance as predictors, with elevation and topographic convergence index often selected together. Models aimed at the highest predictive accuracy could incorporate more explanatory variables than we have, though increased model complexity can make ecological interpretation of such models more difficult.

Another source of uncertainty in our models is the defoliation maps (root mean square error of 11 % defoliation (2000) and 8 % (2001)), which causes some error in defoliation intensity (Townsend et al. 2012). However, these limitations are small when compared to sketch map uncertainty. Finally, though every effort was made to mask out disturbance patterns that were related to human management, it is possible that some defoliation resulted from causes other than gypsy moth. Such unrelated defoliation cannot easily be distinguished by remote sensing methods and may produce variance that is unlikely to be explained by models specific to gypsy moth defoliation.

Our data permitted unbiased sampling designs (systematic and random) and model estimates that otherwise are not usually possible for field-based studies involving a limited number of plots selectively placed in heavily defoliated areas. The bimodal landscape distribution of defoliation intensity resembled a skewed Beta distribution (Fig. 2). Zeide and Thompson (2005) observed a similar defoliation distribution among individual trees, but caused by a sawfly species with gregarious feeding behavior, which is a distinct mechanism from those that might explain gypsy moth defoliation distributions.

Future studies seeking to model defoliation intensity from exogenous spatial variables should consider a Beta distribution in their analysis, as it has been observed empirically and suits a response that is bounded between 0 and 100 %. The range of residual spatial autocorrelation observed here should also be considered, as plot-based sampling designs with lag distances smaller than the observed autocorrelation ranges may not produce independent observations or error structures. Spatial models should be considered for gypsy moth defoliation models derived from sampling designs with inter-plot distances shorter than 600 m.
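As a rough illustration of the Beta distribution suggestion, the two shape parameters can be estimated from defoliation fractions by the method of moments. This sketch is not part of the analysis reported here: the function name is hypothetical, the data are simulated, and a full model would more likely use Beta regression on the explanatory variables rather than a single marginal fit.

```python
import random

def beta_method_of_moments(fractions):
    """Estimate Beta shape parameters (alpha, beta) from values in [0, 1]."""
    n = len(fractions)
    mean = sum(fractions) / n
    var = sum((f - mean) ** 2 for f in fractions) / (n - 1)
    common = mean * (1.0 - mean) / var - 1.0
    if common <= 0:
        raise ValueError("variance too large for a Beta distribution")
    return mean * common, (1.0 - mean) * common

# Simulated defoliation fractions (0-1) from a U-shaped Beta; parameters are made up.
rng = random.Random(3)
sample = [rng.betavariate(0.4, 0.6) for _ in range(5000)]
alpha_hat, beta_hat = beta_method_of_moments(sample)
```

When both estimated shape parameters are below 1, the fitted density is U-shaped with peaks near 0 and 100 %, which is the bimodal form described for Fig. 2. Unequal parameters give the asymmetric peaks.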

Summary

Improved data sources that capture the full dynamic range of defoliation intensity at intermediate scales allowed us to compare the relative importance of explanatory variables measured intensively across the landscape. This analysis illustrates how spatial patterns of defoliation depend on underlying habitat characteristics and provides new evidence for the importance of host phenological asynchrony. We quantified residual patterns of spatial autocorrelation which were consistent with gypsy moth dispersal distances. Similar dependencies may be expected for defoliation events in different defoliator systems, with habitat characteristics accounting for relatively more or less explanatory power when top-down versus bottom-up factors dominate. A more complete understanding of the spatial variability in defoliation intensity will improve efforts to understand the long-term impact that defoliation cycles have on aboveground forest carbon dynamics.