Introduction

The practice of monitoring seasonal biological events, such as date of spring bloom for specific plant species, is centuries old. Phenological calendars were used by the ancient Chinese and Romans to guide agricultural operations (Hopp 1974). Records of phenological events for plants and animals collected during the past several decades are currently being examined for another purpose—quantifying a possible biological response to recent climate change (Schwartz 1999; Abu-Asab et al. 2001; Penuelas and Filella 2001; Walther et al. 2002; Parmesan and Yohe 2003).

Some of our most valuable historical phenological data have been collected informally, often as a hobby by a single individual or as part of a family tradition (e.g., Bradley et al. 1999; Fitter and Fitter 2002). Native annual flowering plants are often measured, but these data can be problematic to interpret because of year-to-year variation of the plants measured and their physical location (microclimate). For this reason, woody perennials have some advantages as indicator species compared to annuals (Lechowicz and Koike 1995), but perennials in natural ecosystems are subject to year-to-year and site-to-site variation in plant age and genetic make-up (i.e., ecotypes) within a species. A more scientifically rigorous approach involves the establishment of a network of gardens of the same species planted at the same time, such as the common lilac (Syringa vulgaris) plantings in Germany in the late nineteenth century (Ihne 1885). A particularly ambitious project was initiated in Europe in the mid-twentieth century, with the planting of phenological gardens that involved 20 countries and several species of deciduous trees, shrubs, and conifers (Schnelle and Volkert 1974; Chmielewski 1996; Menzel 2000).

Plant phenology data for North America are less robust (fewer species, fewer years of data, less inter-regional coordination) than those for Europe. The most geographically dispersed data come from several regional plantings of honeysuckle (Lonicera tatarica and L. korolkowii species) and lilac (S. vulgaris and S. chinensis species) established during the 1960s (Hopp 1974; Schwartz 1994). The original purpose of these plantings was to use phenological information to optimize farming practices (e.g., seeding date and pest control) and to predict the yield potential of several economically important crops. Cayan et al. (2001) and Schwartz and Reiter (2000) have evaluated some of these data and found evidence for a general advance in spring phenology in the USA since the 1970s. Schwartz and Reiter (2000) noted substantial geographic variability, suggesting that more detailed analyses would be warranted in northeastern USA and other regions where the initial analysis suggested that shifts in the timing of spring events were strongest.

In the present study we focus specifically on northeastern USA, where average annual temperatures have increased 1.0°C, and winter temperatures (December through February) have increased 1.6°C from 1899 to 2000 (C. Wake, University of New Hampshire, personal communication). We evaluated records from 72 locations in this region where genetically identical lilac plants (S. chinensis clone, “Red Rothomagensis”) were grown and monitored for first leaf and first flower date during the period 1965–2001. Most sites were missing records for some years, and so we also evaluated simulated lilac-honeysuckle phenology for the same time period using a previously validated “spring index” model and temperature data from weather stations located near the actual lilac sites. In addition, we evaluated trends in mid-bloom date for apple (Malus domestica) collected at three sites and grape (Vitis vinifera) collected at one site in the northeast during approximately the same time period. The rates of change in ground observations of spring phenology for these three horticultural woody perennials are discussed in relation to climatic change in the region, model predictions, and phenological shifts reported in other recent studies.

Materials and methods

Our primary database included first leaf date (FLD) and first flower date (FFD) for the lilac (S. chinensis) clone, “Red Rothomagensis,” collected from 96 locations in northeastern USA, encompassing the period 1960–2001. Details of the protocol for initial site selection, observer training, and monitoring methods were summarized in Dube et al. (1984). Locations with fewer than 10 years of record for either FLD or FFD were excluded from our analysis, leaving 72 sites with an average of 21 years of record. The 72 sites ranged in latitude from 38.57° to 44.56° N, in longitude from 68.40° to 79.35° W, and in elevation from 1.8 to 165.6 m.
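The record-length screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout, the function name `sites_with_enough_record`, and the two example sites are hypothetical, and only the 10-year threshold comes from the text.

```python
from collections import defaultdict

MIN_YEARS = 10  # threshold stated in the text


def sites_with_enough_record(records, min_years=MIN_YEARS):
    """Return sorted site IDs with at least `min_years` of both FLD and FFD.

    `records` is an iterable of (site_id, year, fld, ffd) tuples, where fld
    and ffd are day-of-year values, or None when the event was not recorded.
    (This layout is an assumption made for the sketch.)
    """
    fld_years = defaultdict(set)
    ffd_years = defaultdict(set)
    for site_id, year, fld, ffd in records:
        if fld is not None:
            fld_years[site_id].add(year)
        if ffd is not None:
            ffd_years[site_id].add(year)
    sites = set(fld_years) | set(ffd_years)
    # A site is dropped if EITHER event has fewer than min_years of record.
    return sorted(
        s for s in sites
        if len(fld_years[s]) >= min_years and len(ffd_years[s]) >= min_years
    )


# Hypothetical example: site "A" has 12 years of both events;
# site "B" has 12 years of FLD but only 8 years of FFD.
records = (
    [("A", 1965 + i, 110 + i % 3, 130 + i % 4) for i in range(12)]
    + [("B", 1965 + i, 112, 133 if i < 8 else None) for i in range(12)]
)
kept = sites_with_enough_record(records)
```

In this made-up example only site "A" is retained, because site "B" falls short of 10 years for FFD.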

Linear regression analysis of variance (ANOVA), with Julian day of FLD or FFD as the dependent variable and year as the independent variable, was performed for pooled data from all 72 stations. Linear regression analysis was also performed on data from each of the 72 stations individually, and a frequency distribution for the range of regression coefficient values obtained (slopes for FLD and FFD vs year) was developed.
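As a rough illustration of the pooled and per-site regressions described above, the following Python sketch fits day of year against year by ordinary least squares. The function names and the synthetic observations are assumptions made for the example, not the software or data used in the study. The sketch reports the slope and its t statistic; converting the t statistic to a P value requires the Student's t distribution with n − 2 degrees of freedom, which a library routine such as `scipy.stats.linregress` provides.

```python
import math


def linear_trend(years, days):
    """Ordinary least-squares fit of day of year on year.

    Returns (slope in days/year, intercept, t statistic of the slope).
    """
    n = len(years)
    if n < 3 or n != len(days):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(years) / n
    mean_y = sum(days) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, days))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, days))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.copysign(math.inf, slope)
    return slope, intercept, t_stat


def trends_by_site(observations):
    """Fit one regression per site from (site_id, year, day_of_year) tuples."""
    by_site = {}
    for site_id, year, day in observations:
        xs, ys = by_site.setdefault(site_id, ([], []))
        xs.append(year)
        ys.append(day)
    return {site: linear_trend(xs, ys) for site, (xs, ys) in by_site.items()}


# Synthetic (hypothetical) observations: two sites with different record
# lengths, a built-in negative trend, and alternating +/- 1 day noise.
obs = [("S1", y, 130.0 - 0.1 * (y - 1965) + (1 if y % 2 else -1))
       for y in range(1965, 2002)]
obs += [("S2", y, 128.0 - 0.2 * (y - 1965) + (1 if y % 2 else -1))
        for y in range(1970, 1995)]

pooled = linear_trend([y for _, y, _ in obs], [d for _, _, d in obs])
per_site = trends_by_site(obs)
```

Collecting the first element of each `per_site` entry gives the set of slope values from which a frequency distribution can be built.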

In addition to analysis of the available northeast lilac record as described above, linear regression analyses were also performed on simulated FLD and FFD for the same time period and region, using a previously validated “spring index” model described by Schwartz (1997). This model was developed from over 2,000 station-years of weather, lilac, and much smaller numbers of honeysuckle first leaf and first bloom data collected from 1961 to 1994, from sites throughout the north-central and northeastern USA. Model FLD and FFD predictions are based in part on the number of high degree-day accumulation and synoptic weather events after winter chilling (vernalization) has been satisfied, particularly those events within about 1 week of first leaf. We ran this model using daily weather records from 68 COOP Network stations in the northeast, selected for their proximity to the 72 lilac observation sites. The COOP Network is maintained by the National Climatic Data Center, Asheville, North Carolina, and our data were obtained through the Northeast Regional Climate Center (http://climod.nrcc.cornell.edu/climod). In contrast to the actual lilac measurements, the simulated FLD and FFD results have the advantage of no missing years within the 40-year time span of the analysis for all 68 stations included.
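To make the idea of degree-day accumulation concrete, the sketch below shows a generic calculation from daily maximum and minimum temperatures. It is not the “spring index” model of Schwartz (1997), whose formulation is not reproduced here; the simple-average method, the base temperature, the threshold, and the temperature series are all hypothetical choices made for illustration.

```python
def degree_days(tmax, tmin, base):
    """Degree-days for one day using the simple-average method."""
    return max(0.0, (tmax + tmin) / 2.0 - base)


def first_day_reaching(daily, base, threshold, start_day=1):
    """Return the day of year on which cumulative degree-days reach `threshold`.

    `daily` is a list of (tmax, tmin) pairs in degrees C; index 0 is day 1.
    Accumulation begins on `start_day`. Returns None if never reached.
    """
    total = 0.0
    for day, (tmax, tmin) in enumerate(daily, start=1):
        if day < start_day:
            continue
        total += degree_days(tmax, tmin, base)
        if total >= threshold:
            return day
    return None


# Hypothetical series: 60 cold days (daily mean -1 C) followed by
# 30 mild days (daily mean 7 C). Base and threshold are illustrative only.
daily = [(2.0, -4.0)] * 60 + [(12.0, 2.0)] * 30
event_day = first_day_reaching(daily, base=0.0, threshold=50.0)
```

Because the calculation works on daily values, a few unusually warm days can move the threshold date even when the monthly mean temperature barely changes.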

For comparison with lilac, we also evaluated historical records of grape phenology for the variety ‘Concord’ grown in Fredonia, New York on the eastern shore of Lake Erie (42.440°N, 79.331°W). The vines monitored were the same vines in a long-term trial that has been continually managed in the same way since 1959. Observations of several phenological stages, including mid-bloom date which is reported here, were taken precisely every year on the basal cluster on similar shoots for uniformity.

Records of apple phenology (a mean of mid-season varieties typified by “Delicious” and “Empire”) were gathered at Cornell University Agricultural Research Station in Geneva, New York (42.868°N, 76.978°W), which also was a northeastern USA lilac observation site. During the approximately 40-year study period there were changes in observers and minor modifications in methods, but in all years mid-bloom was defined as the date when the trees were visually judged to have 80–100% of the central blossoms in the flower clusters open. Similar data were collected from apple plantings at Poughkeepsie (41.700°N, 73.921°W) and Peru (44.666°N, 73.569°W), New York. We pooled apple data from all three locations and tested for a significant linear trend in mid-bloom date versus year (regression analysis). The database for apple mid-bloom is larger and reflects a larger geographic area than that for grape, but the apple measurements are less precise because they integrate results for several apple varieties observed by several research and extension specialists.

Results

Measured lilac, apple and grape phenology

Regression analysis of pooled data from the 72 lilac observation locations in northeast USA revealed negative slopes for Julian day versus year of −0.048 and −0.092 day/year for FLD and FFD, respectively (Fig. 1). This linear trend was not statistically significant at the P<0.05 level for FLD (P=0.234), but was significant at P=0.003 for FFD. The scatter in the data for both FLD and FFD was anticipated, since many factors besides long-term climate trends, particularly within-year site-to-site variation in microclimate, will have affected phenological events.

Fig. 1

Julian day (days after January 1) of lilac first leaf (a) and first flower (b) versus year from 1965 to 2001, pooled from 72 sites in northeastern USA. Results of linear regression analyses are shown

Figure 2 shows a map of the study region, and summarizes the results of regression analyses on phenological trends for each of the 72 sites individually. These results are also summarized as frequency distributions of slope values obtained at the 72 sites in Fig. 3. For FLD, 60% of the 72 sites had a negative slope (i.e., earlier first leaf), and for FFD 89% had a negative slope. For FLD, 21% of the sites had a significant linear trend at the P<0.05 level, and of these, 73% had negative slopes. For FFD, 28% of the 72 sites had a significant linear trend at the P<0.05 level, and 100% of these had negative slopes. When we subsequently conducted pooled regression analyses for the subset of sites with significant slopes (i.e., n=15 and 20 for FLD and FFD, respectively), we found a highly significant negative trend (P<0.001) for both phenological events, with regression coefficient values of −0.344 and −0.259 day/year for FLD and FFD, respectively.
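The percentages quoted above follow directly from the site counts given in the text and in the Fig. 2 caption, as this short Python check shows. The variable and function names are illustrative only.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


N_SITES = 72                      # sites retained for analysis
SIG_FLD, SIG_FLD_NEG = 15, 11     # significant FLD trends, and negative ones
SIG_FFD, SIG_FFD_NEG = 20, 20     # significant FFD trends, and negative ones

summary = {
    "FLD significant, % of all sites": percent(SIG_FLD, N_SITES),
    "FLD negative, % of significant": percent(SIG_FLD_NEG, SIG_FLD),
    "FFD significant, % of all sites": percent(SIG_FFD, N_SITES),
    "FFD negative, % of significant": percent(SIG_FFD_NEG, SIG_FFD),
}
```

The four values come out as 21, 73, 28, and 100, matching the figures reported in the paragraph.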

Fig. 2

Map of northeastern USA showing all 72 lilac sites used in the phenology study. Fifteen of the sites had significant slopes (P<0.05) for first leaf date (FLD) versus year; 11 of these were negative (i.e., earlier FLD) and 4 positive (later FLD). Twenty of the 72 sites had significant slopes (P<0.05) for first flower date (FFD) versus year, and all 20 were negative (i.e., earlier FFD)

Fig. 3

Frequency distribution of regression coefficient (slope) values of first leaf (a) and first flower (b) date versus year calculated for each lilac site individually. Shaded regions of the bars indicate number of sites where linear trends were significant at P<0.05

Records of mid-bloom date of apple at three locations (Fig. 4a) and grape at one location (Fig. 4b) in the New York state region of northeastern USA showed results similar to those found for the more robust lilac data set. Pooling apple data from the three observation locations resulted in a regression coefficient of −0.20 day/year (about 2 days per decade earlier mid-bloom), significant at P<0.01. A regression analysis for each apple site individually (not shown) indicated a negative slope for each, statistically significant at P<0.08, 0.05, and 0.05, for the Geneva, Peru, and Poughkeepsie sites, respectively. The slope for mid-bloom date for grape was also negative (−0.146 day/year), with a P value of 0.14.

Fig. 4

Julian day of mid-bloom for apple versus year (a) at three locations (indicated in figure), and grape versus year (b) at one location (Fredonia, New York) in northeastern USA from 1965 to 2001. Results of linear regression analyses are shown

Temperature trends for the region

The advanced spring phenology for lilac, apple and grape was consistent with an approximate +0.7°C increase in average annual temperatures, based on data collected from 68 COOP Network weather stations in close proximity to the lilac observation sites between 1960 and 2001 (Fig. 5a, linear trend significant at P=0.01). However, the rate of increase in average monthly March and April temperatures between 1960 and 2001 was not statistically significant at P<0.05 (Fig. 5b).

Fig. 5

Mean values for average annual (a) and monthly average March and April (b) temperatures based on weather records for 68 weather stations in northeastern USA selected for their proximity to the 72 lilac observation sites used for the regression analysis of Fig. 1

Model predictions of lilac phenology

The lilac-honeysuckle “spring index” model developed by Schwartz (1997) predicted statistically significant (P<0.001) negative slopes of −0.21 and −0.123 day/year for FLD and FFD, respectively, when daily temperatures from 1960 to 2001, collected from weather stations in proximity to the lilac observation sites, were used as inputs (Fig. 6). It should be noted that the model reflects trends in daily maximum and minimum temperatures, as opposed to average monthly or annual temperatures such as those shown in Fig. 5.

Fig. 6

Simulated first leaf (a) and first flower (b) dates using the “spring index” model described by Schwartz (1997), and daily weather inputs from 68 stations in northeastern USA selected for their proximity to the 72 lilac observation sites used for the regression analysis shown in Fig. 1

Discussion

A unique attribute of the lilac data set examined here is that the same clone of lilac was used at all 72 observation sites, so that genotypic variation could be ruled out as a contributing factor to observed variation between sites. However, spatial variation was still high due to site-to-site microclimate differences (documented by COOP Network weather station data), as has been reported for other phenological studies encompassing a relatively large geographic range (Menzel et al. 2001; Parmesan and Yohe 2003). High inter-annual temperature variation was also a constraint to detecting phenological trends, as is common for studies of this type (Badeck et al. 2004). Despite these anticipated large sources of variation, we detected a significant (P=0.003) 1 day per decade earlier FFD since 1960 in the pooled regression analysis (Fig. 1b). Variability in the FLD data was such that the trend for earlier FLD was not statistically significant at P<0.05 in the pooled analysis, but analysis of individual sites found that the vast majority of sites with significant slopes had negative slopes for both FLD and FFD (Figs. 2, 3).

Our examination of trends for each lilac site individually found that 21% and 28% of the sites had significant (P<0.05) linear trends for FLD and FFD, respectively (Fig. 3). This is similar to a recent analysis of 751 springtime phenological observations from the International Phenological Gardens in Europe for the period 1951–1996 (Menzel 2000), where 27% of the trends were significant at P<0.05. All of the lilac sites in our study with a significant linear trend for FFD had a negative slope (i.e., earlier flowering), and 73% of the sites with a significant linear trend for FLD had a negative slope.

We examined in more detail the aberrant four sites where ground observations indicated a significant positive, rather than negative, slope for FLD. None of these sites showed a cooling, as opposed to warming, trend in average monthly temperatures (based on nearest COOP Network weather records). In contrast to what we observed for the vast majority of sites in the study region, at these four sites the “spring index” model output was not qualitatively in agreement with actual lilac measurements (i.e., did not confirm a positive slope for FLD). It is not possible to conclusively determine whether the actual lilac measurements or model output are more reliable in this case. Menzel (2000) also reported a minority of sites in Europe with positive slopes for some phenological ground observations, which was contrary to the overall trend for the region. Experimental error is of course a possibility in all studies of this type, which encompass many observers, sites, and years. Delayed spring development associated with a plant aging effect, as reported for some forest tree species, could be a factor (Nienstaedt 1974). However, we found no clear evidence to suspect these were problems in our lilac data set. There was no consistency with regard to latitude, longitude, elevation, duration of the monitoring period, or proximity to large urban areas among these sites. It may be relevant that all four sites with positive FLD slope were near large water bodies (see map, Fig. 2). The large water bodies could have had an impact on some aspect of the diurnal pattern in temperature (or other climate factors affecting phenology of plants in the region) that is not captured in the “spring index” model.

The pooled “spring index” model output (Fig. 6) indicated a 2.1 day per decade earlier FLD and a 1.2 day per decade earlier FFD for lilac for northeastern USA during the 1960–2001 period. The model appears to be a better predictor of the observed trends in FLD and FFD (Fig. 1) than a simple examination of monthly average March or April temperature (Fig. 5), which showed no significant trend. This suggests that the daily maximum and minimum temperatures and synoptic events incorporated in the model are important factors affecting springtime phenology in lilac. As suggested by Zhao and Schwartz (2003), actual or simulated plant phenology may reflect subtle climate changes over long periods better than average temperatures do.

Although the pooled “spring index” model predictions are qualitatively in agreement with pooled ground observations for the region (all negative slopes, Figs. 1, 6), the statistical certainty and the rate of change in spring phenology (magnitude of slope values) are greater for the model compared to pooled observed data. For the period 1965 to 2001, an advance in FLD of 2 days was found for actual observations (Fig. 1) compared to 8 days for model output (Fig. 6). An advance in FFD of 3 days and 4 days was found for actual observations and model output, respectively. The greater statistical confidence (lower P values) for the regression ANOVA of the model output compared to actual lilac observations can be explained in part by two factors: (1) a larger sample size because the model relies on weather station data which are not missing years; and (2) no variation due to experimental errors that can occur in the field when directly measuring FLD or FFD.
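The whole-period advances quoted above can be reproduced from the regression slopes reported in the Results, on the assumption that each advance is simply the slope multiplied by the 36 years from 1965 to 2001. The Python sketch below performs that arithmetic; the dictionary layout is illustrative only.

```python
SPAN_YEARS = 2001 - 1965  # 36 years; assumed span for the whole-period advance

# Slopes in days/year as reported in the Results (Figs. 1 and 6).
slopes = {
    ("observed", "FLD"): -0.048,
    ("observed", "FFD"): -0.092,
    ("model", "FLD"): -0.21,
    ("model", "FFD"): -0.123,
}

# Advance (days earlier) over the whole period and per decade.
advance_days = {key: -slope * SPAN_YEARS for key, slope in slopes.items()}
advance_per_decade = {key: -slope * 10 for key, slope in slopes.items()}

rounded = {key: round(value) for key, value in advance_days.items()}
```

Rounded to whole days, the products are 2 and 3 days for observed FLD and FFD, and 8 and 4 days for the model output, in line with the values stated in the paragraph.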

The model output has the power to essentially reconstruct missing phenological records because it is based on a more comprehensive data set (available daily weather records) than actual lilac measurements, and the model has been previously well-validated based on a data set from a much broader geographic area (Schwartz 1997). Weather data needed to run the model for this study were available from each location and for all 40 years of the study period. In contrast, the actual lilac measurement data set has gaps within the 40 year time span for most of the 72 sites. Nevertheless, the “spring index” model was designed to be used as an indicator of phenology trends at the continental scale, not necessarily to match or take the place of actual lilac plant measurements for an area when they are available. The important point is that, in our study, the overall trends of both the model output and ground observations corroborate each other.

The more geographically limited apple and grape data sets revealed a similar trend (advance in mid-bloom date of 5 to 7 days, Fig. 4) as found for lilac in the region, although the grape trend was only significant at P=0.14. The sensitivity of apple and grape springtime phenology to temperature has been known for some time (Winkler et al. 1974; Seem and Szkolnik 1978). More recently, Penuelas et al. (2002) reported a significant slope for Julian day of apple bloom versus year for the period 1952–2000 in the Catalonia region of Spain, with a magnitude of −0.55 day/year. They also examined a similar data set for grape in the region and did not find a significant linear trend for bloom, but did find a significant negative slope for FLD (−0.36 day/year). One explanation for the stronger evidence of a shift in bloom date for apple compared to grape in all of these studies may be that apples bloom in very early spring, typically several weeks earlier than grapes, and in most northern latitude locations, winter and early spring temperatures have increased more than summer temperatures.

Conclusions

We used several data sets and analytical approaches to examine trends in spring phenology for lilac, apple, and grape in the northeastern USA during the latter half of the twentieth century. Each data set and approach had some limitations, but collectively the results converged and provided evidence that the general warming trend of the past several decades in northeastern USA has resulted in an advance in spring phenology ranging from 2 to 8 days for these species. The magnitude of this climate impact on phenology is similar to that found in other analyses of ground observations for other plant species in mid- and northern latitudes (Bradley et al. 1999; Abu-Asab et al. 2001; Menzel 2000; Menzel et al. 2001; Penuelas et al. 2002). Our results are also qualitatively in agreement with advancement in “green-up” based on satellite normalized difference vegetation index (NDVI) data (Zhou et al. 2001), and changes in the phases and amplitude of the seasonal atmospheric CO2 drawdown in the northern hemisphere (Keeling et al. 1996).

This and other recent phenology studies have relied on historical records that were initially maintained for purposes other than examination of climate change. Given the importance of reliable data on biological responses to climate change for policy-makers, we concur with Badeck et al. (2004) and others who have suggested it would be prudent to strengthen regional and global phenology monitoring networks. Data collected from such networks will not only document responses of individual species to climate change, but also can be used to provide “ground truth” and complement more integrative measures of shifts in plant seasonality, such as atmospheric CO2 drawdown and NDVI data collected from satellite imagery.