1 Introduction

The frequency of extremes is affected by changes in the underlying distribution, including trends in the mean and higher moments, such as variance (Parey et al. 2013). This fact, along with the large number of recent extreme events, has led to several investigations of variance as it pertains to global, regional, interannual, intraannual, and seasonal surface temperatures, e.g., Michaels et al. (1998), Yin and Knapp (1999), Simolo et al. (2011), Shen et al. (2011), Donat and Alexander (2012), and Hansen et al. (2012). However, the choice of a time period is often somewhat arbitrary, yet important, e.g., Liebmann et al. (2010). For example, Karl et al. (1995) found that while interannual temperature variance in the USA has increased in recent decades (1974–1993), it has decreased over a longer time span (1911–1994). Additionally, while trends in the mean are, for the most part, qualitatively uniform (increasing) regardless of data (Hartmann et al. 2013), this is not so for trends in higher moments. Motivated by Karl et al.’s results, the varied effect of selected data on trends in variability, and the tendency of more recent research to focus primarily on trends following 1950, we ask how variability of monthly means has evolved within the time period 1900–2013. For example, compared to Karl et al.’s results, do the additional 19 years (1995–2013) change the observed trend in the USA? And, in particular, does the early twentieth century continue to impact the observed trend in variability?

To answer these questions, we turn to record-breaking statistics. Due to the distributional invariance, simplicity of underlying concepts, and frequently striking results, e.g., Meehl et al. (2009), the use of record-breaking events (records) in analysis of temperature trends has become more widespread in the past decade, e.g., Benestad (2004), Redner and Petersen (2006), Newman et al. (2010), Wergen and Krug (2010), Munasinghe et al. (2011), Rahmstorf and Coumou (2011), and Rowe and Derry (2012). By way of brief summary, a record is a value that exceeds all previous values in the sequence, so the first value is always a record. The statistics are based on the following result: for a time series with independent and identically distributed values from a continuous distribution, X = x_1, x_2, x_3, …, x_i, the probability that x_i is a record is 1/i. It follows that the expected number of records, E(r), for a time series with n values is E(r) = 1 + 1/2 + 1/3 + … + 1/n. Because this sum grows only logarithmically with n, records become increasingly rare in a stationary time series.
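The 1/i result can be checked numerically. The following is a minimal sketch, not code from the original analysis: the function names `count_records` and `expected_records` are chosen here for illustration, and the simulated values are drawn from a uniform distribution purely as an example of a continuous distribution.

```python
import random


def count_records(values):
    """Count record highs: values strictly greater than every earlier value.

    The first value always counts as a record.
    """
    records = 0
    best = float("-inf")
    for v in values:
        if v > best:
            records += 1
            best = v
    return records


def expected_records(n):
    """Expected number of records for n i.i.d. continuous values: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))


# Illustrative comparison of a small simulation with the harmonic sum.
rng = random.Random(0)
n, trials = 90, 2000
simulated = sum(count_records([rng.random() for _ in range(n)]) for _ in range(trials)) / trials
print(f"n={n}: simulated mean {simulated:.2f}, harmonic sum {expected_records(n):.2f}")
```

Because the result depends only on the ranks of the values, any other continuous distribution could be substituted for the uniform draws without changing the expected count.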

In Anderson and Kostinski (2010), we used this result to develop an index that indicates the presence of a trend in variability. Since stationary time series have no trends in the mean or higher moments, they are statistically invariant with respect to time-reversal. Therefore, the expected number of record highs (or lows) does not change upon time-reversal. We make use of this time-reversal symmetry (or lack thereof) in the variability index, α,

$$\begin{array}{@{}rcl@{}} \alpha & =& (RH_{fwd} - RH_{bwd}) + (RL_{fwd} - RL_{bwd}) \\ &=& (RH + RL)_{fwd} - (RH + RL)_{bwd} \end{array} $$
(1)

where RH and RL are the numbers of record-breaking highs and record-breaking lows, respectively, and the subscripts indicate direction in time (forward and backward). For time series without a trend in variance, 〈α〉=0 (we use angular brackets to indicate an average over all individual α values in an ensemble). Meanwhile, increasing (decreasing) variance is accompanied by an excess (dearth) of records (highs and lows) in the forward direction and a dearth (excess) of records in the backward direction. Thus, a positive (negative) 〈α〉 indicates an increasing (decreasing) variance and a qualitative trend can be extracted.
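Eq. (1) can be written out as a short sketch. The function names below (`count_records`, `alpha`, `mean_alpha`) are illustrative and not taken from the original analysis; a time series is assumed to be a plain list of values in chronological order.

```python
def count_records(values):
    """Return (record highs, record lows); the first value counts as both."""
    highs = lows = 0
    hi, lo = float("-inf"), float("inf")
    for v in values:
        if v > hi:
            highs += 1
            hi = v
        if v < lo:
            lows += 1
            lo = v
    return highs, lows


def alpha(values):
    """Variability index of Eq. (1): (RH + RL) forward minus (RH + RL) backward."""
    rh_fwd, rl_fwd = count_records(values)
    rh_bwd, rl_bwd = count_records(values[::-1])
    return (rh_fwd + rl_fwd) - (rh_bwd + rl_bwd)


def mean_alpha(ensemble):
    """Ensemble average of alpha over a list of time series."""
    return sum(alpha(series) for series in ensemble) / len(ensemble)


# A series whose swings grow over time sets many records in the forward
# direction and few in the backward direction, so alpha is positive.
growing = [0, 1, -1, 2, -2, 3, -3]
print(alpha(growing), alpha(growing[::-1]))
```

Reversing a series flips the sign of α by construction, which is the time-reversal symmetry the index relies on.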

Here, we investigate the evolution of interannual temperature variability in the contiguous USA within the period 1900–2013. To do this, we systematically evaluate 〈α〉 for time periods of varying initial and final years. By examining all possible time periods, we avoid a time period bias. Also, the reversibility of α prevents a preference for the beginning or end of a time period. In particular, we find that this method proves useful for observing the clustering of records and, thus, periods of increased variability. We also compare results to the evolution of temperature variability for a globally distributed set of stations, finding a surprising similarity of results.

2 Data and definitions

2.1 Selection and preparation of data

In order to examine the evolution of temperature variability in the USA, we use the adjusted monthly mean temperatures from the United States Historical Climatology Network, version 2.5, ushcn.v2.5.0.20140508 (USHCN), Menne et al. (2009). The adjusted data set accounts for inhomogeneities in the raw data; see Menne et al. (2009, 2012) and Vose et al. (2014) for details regarding the process of making adjustments to the data. We use all time series from the contiguous USA that have at least 90 years of data between the years 1900 and 2013. The 90-year minimum is based on the number of records expected in the stationary case: for 90 years, it is 5.08 and for 114 years, it is 5.31. If any of the monthly means in a time series are estimates, they are removed from the time series, leaving that year without a data entry (estimated values are monthly means that are calculated from incomplete daily data). The method does not require data to be evenly spaced in time. Removed estimates do not count toward the 90-year minimum length of time series. Each month comprises its own time series, and this results in 18,264 time series (station-months) from about 1500 stations. For example, a single time series would consist of average monthly temperatures for January at a single station, starting in 1900 and continuing until 2013. The total number of time series per year fluctuates, so to aid in interpretation of results, we include a graph of the number of time series per year in Appendix A.
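The selection rule described above can be sketched as follows. This is only an illustration: the input layout (a dictionary mapping year to a `(temperature, is_estimate)` pair) and the function name `clean_series` are assumptions made here, not the USHCN file format or the original processing code.

```python
def clean_series(raw, first_year=1900, last_year=2013, min_years=90):
    """Drop estimated values and keep the station-month only if enough years remain.

    raw: dict mapping year -> (temperature, is_estimate)   (hypothetical layout)
    Returns a dict year -> temperature, or None if fewer than min_years remain.
    """
    kept = {
        year: temp
        for year, (temp, is_estimate) in raw.items()
        if first_year <= year <= last_year and not is_estimate
    }
    # Removed estimates do not count toward the minimum length.
    return kept if len(kept) >= min_years else None


# Example: 114 years of January means with the first 20 years flagged as estimates.
january = {year: (10.0, year < 1920) for year in range(1900, 2014)}
result = clean_series(january)
print(None if result is None else len(result))
```

Keeping the series as a year-keyed dictionary leaves gaps where estimates were removed, which matches the statement that the method does not require evenly spaced data.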

We also examine the adjusted monthly mean temperatures in the Global Historical Climatology Network, version 3, ghcnm.v3.2.1.20130521 (GHCN), Lawrimore et al. (2011). We remove USHCN stations from GHCN to enable comparison. Excluding the USHCN stations, there are 6092 station-months with 90 or more years of data in GHCN. The stations that fit this criterion are predominantly distributed in the northern hemisphere. Preparation and analysis of GHCN data are identical to those for USHCN.

While the variability index is generally insensitive to trends in the mean, isolating a variability trend makes interpretation more straightforward. Therefore, we remove the mean trend from the data (detrending) prior to analysis. Each station-month time series is detrended individually using a local regression (LOWESS) with 50-year smoothing; see Cleveland (1979). The LOWESS regression is well suited because it does not assume a functional form and it limits the effect of boundary biases. For each time series, the regression is used to identify a mean trend and then that trend is subtracted from the time series. Only two decimal places are maintained since USHCN and GHCN data have a resolution of 0.01 °C (the results are robust and the observed trends discussed in the following sections remain qualitatively the same regardless of detrending; see Appendix B for details). Once these trends are removed, the data are ready for record counting and calculation of 〈α〉 values, as discussed in Section 3.
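The detrending step can be illustrated with a simplified local regression. The sketch below is a single-pass, tricube-weighted local linear fit with a fixed window; it is not Cleveland's full LOWESS (it omits the robustness iterations and the nearest-neighbor span), and the 25-year half-width is an assumed reading of "50-year smoothing". The function names are illustrative.

```python
def local_linear_trend(years, temps, half_width=25.0):
    """Tricube-weighted local linear fit evaluated at each year (simplified LOWESS)."""
    trend = []
    for x0 in years:
        sw = swx = swy = swxx = swxy = 0.0
        for x, y in zip(years, temps):
            u = abs(x - x0) / half_width
            if u >= 1.0:
                continue  # outside the smoothing window
            w = (1.0 - u ** 3) ** 3  # tricube weight
            dx = x - x0
            sw += w
            swx += w * dx
            swy += w * y
            swxx += w * dx * dx
            swxy += w * dx * y
        denom = sw * swxx - swx * swx
        if denom > 1e-12:
            # Intercept of the weighted least-squares line at dx = 0.
            trend.append((swxx * swy - swx * swxy) / denom)
        else:
            # Too few distinct points for a line: fall back to the weighted mean.
            trend.append(swy / sw)
    return trend


def detrend(years, temps, half_width=25.0):
    """Subtract the local trend and keep two decimal places."""
    trend = local_linear_trend(years, temps, half_width)
    return [round(t - m, 2) for t, m in zip(temps, trend)]


# Usage with an unevenly spaced series (three missing years) and a linear warming.
years = [y for y in range(1900, 2014) if y not in (1925, 1926, 1960)]
temps = [5.0 + 0.01 * (y - 1900) for y in years]
print(max(abs(r) for r in detrend(years, temps)))
```

A local linear fit reproduces a purely linear trend exactly, so the residuals in the usage example vanish; with real data the residuals carry the year-to-year variability that the record counting then examines.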

2.2 Stationary Monte Carlo ensembles

We provide 3σ values for ensembles of stationary time series that mimic the data, USHCN or GHCN, as a reference to aid in interpretation of results. The procedure for calculating the 3σ values is as follows. For each data set, USHCN and GHCN, we create 1000 ensembles of stationary time series. Each simulated ensemble is created by replacing all USHCN or GHCN temperatures with values drawn from a stationary distribution. We analyze the resulting stationary ensembles with the same methods that are used for the actual USHCN or GHCN data. We use the resulting 〈α〉 values, 1000 〈α〉 values for every time series duration between 1900 and 2013, to calculate 3σ values (see figures of 〈α〉 in Section 3). This approach maintains the exact locations and times for which the time series in USHCN or GHCN do not have data (see Appendix A) and results in fluctuating values for 3σ as time series duration changes. Notably, 3σ goes to zero when there is only one possible result for α: at the beginning of record counting, α must be zero.
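The Monte Carlo reference can be sketched as below. The sketch assumes standard normal draws for the stationary distribution (the text does not name one; by distributional invariance the choice should not matter for continuous distributions), represents each station-month only by its list of available years, and uses illustrative function names.

```python
import random
import statistics


def alpha(values):
    """Variability index: forward (highs + lows) minus backward (highs + lows)."""
    def records(seq):
        count, hi, lo = 0, float("-inf"), float("inf")
        for v in seq:
            if v > hi:
                count, hi = count + 1, v
            if v < lo:
                count, lo = count + 1, v
        return count
    return records(values) - records(values[::-1])


def stationary_three_sigma(year_lists, n_ensembles=1000, seed=0):
    """3-sigma of the ensemble-mean alpha for surrogate stationary data.

    year_lists: one list of available years per station-month, so each
    surrogate series has the same length and gaps as the real one.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_ensembles):
        alphas = [alpha([rng.gauss(0.0, 1.0) for _ in years]) for years in year_lists]
        means.append(sum(alphas) / len(alphas))
    return 3.0 * statistics.pstdev(means)


# Usage: 10 surrogate station-months covering 1900-1929, 50 ensembles.
print(stationary_three_sigma([list(range(1900, 1930))] * 10, n_ensembles=50))
```

For series with a single value, α can only be zero, so the returned 3σ is exactly zero, in line with the remark above about the beginning of record counting.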

2.3 Mean anomalies

As a reference, we also compute mean anomalies for the data described above. Here, the mean anomaly is calculated in two ways: (1) each time series has its mean (1900–2013) subtracted, resulting in anomalies, then all the time series of anomalies are used to calculate average anomalies for each year; (2) each time series is first detrended via a local regression (see Section 2.1), then the procedure in (1) is applied to the detrended series. This results in anomalies for both the detrended and original data. We also compute standard deviations for the distributions of anomalies occurring in each year.
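Procedure (1) can be sketched as follows, with each time series held as a year-keyed dictionary. The function name is illustrative, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
def yearly_anomaly_stats(series_list):
    """Return dict year -> (mean anomaly, standard deviation of anomalies).

    series_list: list of dicts mapping year -> temperature (gaps allowed).
    """
    anomalies_by_year = {}
    for series in series_list:
        series_mean = sum(series.values()) / len(series)
        for year, temp in series.items():
            anomalies_by_year.setdefault(year, []).append(temp - series_mean)

    stats = {}
    for year in sorted(anomalies_by_year):
        anoms = anomalies_by_year[year]
        mean = sum(anoms) / len(anoms)
        # Population standard deviation (assumed here).
        std = (sum((a - mean) ** 2 for a in anoms) / len(anoms)) ** 0.5
        stats[year] = (mean, std)
    return stats


# Usage with two tiny station-month series.
example = [{1900: 1.0, 1901: 3.0}, {1900: 10.0, 1901: 14.0}]
print(yearly_anomaly_stats(example))
```

Procedure (2) would pass detrended series to the same function.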

3 Analysis and results

3.1 Evolution of temperature variability in the USA

Here, we return to our initial questions. Does interannual temperature variability change between 1900 and 2013 within the contiguous USA? Also, if there is a change, where and why does it occur? To answer these questions, we consider 〈α〉 for varying durations of time. First, we hold the initial year constant and vary the final year (Fig. 1a), calculating 〈α〉 for each possible set of time series beginning in 1900. Then, we hold the final year constant and vary the initial year (Fig. 1b), calculating 〈α〉 for each possible set of time series ending in 2013. To simplify interpretation, we analyze the detrended USHCN (see Section 2.1). For reference, we include a grey region that marks 3σ for stationary Monte Carlo ensembles in both panels of Fig. 1 (see Section 2.2 for details). Additionally, we examine mean anomalies and the associated standard deviations (Fig. 2).
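The two sweeps can be sketched as below. This is one possible reading of the procedure, not the original code: each series contributes whatever data it has inside the requested window, series are year-keyed dictionaries, and all function names are illustrative.

```python
def alpha(values):
    """Variability index: forward (highs + lows) minus backward (highs + lows)."""
    def records(seq):
        count, hi, lo = 0, float("-inf"), float("inf")
        for v in seq:
            if v > hi:
                count, hi = count + 1, v
            if v < lo:
                count, lo = count + 1, v
        return count
    return records(values) - records(values[::-1])


def mean_alpha(series_list, first_year, last_year):
    """Ensemble-mean alpha over the window [first_year, last_year]."""
    alphas = []
    for series in series_list:
        window = [series[y] for y in sorted(series) if first_year <= y <= last_year]
        if window:
            alphas.append(alpha(window))
    return sum(alphas) / len(alphas) if alphas else None


def sweep_final_year(series_list, first_year=1900, last_year=2013):
    """Fixed initial year, varying final year (as in panel a)."""
    return {end: mean_alpha(series_list, first_year, end)
            for end in range(first_year, last_year + 1)}


def sweep_initial_year(series_list, first_year=1900, last_year=2013):
    """Fixed final year, varying initial year (as in panel b)."""
    return {start: mean_alpha(series_list, start, last_year)
            for start in range(first_year, last_year + 1)}


# Usage on a toy series with growing swings, 1900-1906.
toy = [{1900 + i: v for i, v in enumerate([0, 1, -1, 2, -2, 3, -3])}]
print(sweep_final_year(toy, 1900, 1906))
print(sweep_initial_year(toy, 1900, 1906))
```

Both sweeps coincide for the full window, and both return zero for a window of a single year, where no trend in variability can be expressed.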

Fig. 1 Evolution of the variability index α for detrended monthly mean time series in USHCN. The variability of temperatures decreases with respect to years prior to 1950, with the largest decreases occurring for longer time series. In panel a, we consider all possible durations of time series with a fixed initial year of 1900. The grey region in each panel marks 3σ for stationary Monte Carlo ensembles. The maximum value of 〈α〉 is attained at 1918, associated with increasing variability over the period 1900–1918. Alternatively, in panel b, we consider all possible durations of the time series with a fixed final year of 2013. The minimal value of 〈α〉 is attained at 1917, associated with a decreasing variability over the period 1917–2013.

Fig. 2 Panels a and b depict mean anomalies for USHCN time series. The standard deviations of anomalies each year appear as error bars. The mean anomalies point to a cause of the decreasing variability observed in Fig. 1: an excess of early records. For example, maximum and minimum 〈α〉 values are associated with some of the coldest and hottest years, 1917 (record low) and 1921 (record high), in particular. Panels c and d depict the standard deviations of the anomalies occurring each year. Linear trends are given in red: c −0.18±0.07 °C/century (no detrending) and d −0.16±0.07 °C/century (with detrending). A decreasing variability is observed, in agreement with analysis of records.

In Fig. 1a, we see that on average variability of time series increases (〈α〉>0) during time series that start in 1900 and end between 1909 and 1938. This is indicated by an excess of records (highs and lows) approaching 1938 relative to what is expected in the stationary case. For example, the maximum value of 〈α〉, 0.85, occurs in 1918, associated with the time span 1900–1918. Meanwhile, time series beginning in 1900, but ending between 1939 and 2013, generally indicate decreasing variability (〈α〉<0). In this case, many of the records that occur between 1900 and 1938 are not overcome later in the time series. The cluster of record-breaking temperatures that occurs between 1909 and 1938 continues to dominate, even up to 2013.

In Fig. 1b, we also see the effect of early record-breaking: variability decreases during time series that begin between 1900 and 1938 and end in 2013. For example, the minimum value of 〈α〉, −1.4, occurs for 1917–2013. Here, a period of excess records (1909–1938) occurs at the beginning of the time series relative to the stationary case, so variability decreases relative to those years. When years prior to 1939 are excluded from analysis, changes in variability diminish.

Perhaps most notable is the general decline of 〈α〉 as years are added to time series as depicted in Fig. 1a. Part of this is due to the increasing length of the time series: the maximum possible value for α increases as time series duration increases. Yet part of the decline must also be due to reigning record-breaking temperatures from early in the time series. In panel b, we see this too. Interestingly, the results are in agreement with Karl et al.’s observations despite the additional years of data. For time series that begin in 1985 or later, variability is seen to increase (〈α〉 > 0), but this increase is easily overwhelmed when time series are extended further back in time. When years prior to 1985 are added, 〈α〉 is seen to decay, dropping below zero for all initial years prior to 1937.

One last note regarding the interpretation of Fig. 1: not all time series have an initial year of 1900 or a final year of 2013. However, in Fig. 1, 〈α〉 is recorded according to the variable year regardless of the actual fixed year. For example, when the final year is fixed, the number of time series for 1900–2013 is only 84 % of that for a fixed initial year (10,880 versus 12,900). Indeed, 〈α〉=−0.59 for fixed initial year (1900–2013) and 〈α〉=−0.65 for fixed final year (1900–2013). Therefore, while the discrepancy is minor, we add for comparison that 〈α〉=−0.65 for all time series (18,264) at maximum length.

The mean anomalies for these data also support observations of early, extreme record-breaking and the associated decreasing variability. This is evident in Fig. 2, panels a (not detrended) and b (detrended); Section 2.3 discusses computation of mean anomalies. For 1900–1918, 〈α〉 reaches its maximum value, 0.85. Notably, in the detrended data, the average coldest year (1917) occurs at the end of this time span (see Fig. 2b). For 1917–2013, 〈α〉 reaches its minimum value, −1.4. Here, the average coldest (1917) and hottest (1921) years occur at the beginning of the time series (again, see Fig. 2b). Without detrending, the coldest year (1917), the third hottest year (1931), and the fourth hottest year (1921) all occur between 1909 and 1938 (see Fig. 2a). Indeed, these temperatures were extreme enough to warrant early notice, e.g., Brooks (1918), where it is noted that in 1917 abnormally cold weather began in August, continued through October, abated in November, and continued again in December. Concerning December in particular, “In the Appalachian and Atlantic coast region ...this cold wave established new low December records for most states, and at many stations ...the weather was the coldest ever recorded since 1870 at least.” Our warmest (detrended) year, 1921, was also noted early on, e.g., Henry (1921). It still shows up in lists of the hottest years on record: 1921 has the fourth hottest mean temperature between 1895 and 2008 for the contiguous USA, Shen et al. (2012) (since we do not use gridded data, our rankings are not identical but similar to those found by Shen et al. (2012)). Considering the mean trend, it is not surprising that the low temperatures of 1917 should remain record-breaking; the continuing effect of 1921 is more surprising.

Since trends in variability are observed via 〈α〉 and record-breaking temperatures, we look for further evidence of the trends. In Fig. 2c, d, we see the standard deviation of monthly mean temperature anomalies each year. These standard deviations also appear as error bars in panels a and b. The standard deviations decrease slightly over time and a linear trend is given in red: −0.18±0.07 °C/century (no detrending) and −0.16±0.07 °C/century (with detrending). In either case, we see a decrease of about 0.2 °C between 1900 and 2013. This is in agreement with results from analysis of record-breaking temperatures above (Fig. 1). We also note that the range of the annual mean anomalies clearly decreases between 1950 and 1970; see panels a and b. Since this trend is associated with the mean anomalies, rather than the individual anomalies, it is distinct from the trends we are exploring here.
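A linear trend of the yearly standard deviations, expressed per century, can be obtained with ordinary least squares. The sketch below is illustrative only: the text does not state how the quoted ± values were computed, so the standard error of the slope returned here is an assumption, as is the function name.

```python
def linear_trend_per_century(years, values):
    """Ordinary least-squares slope and its standard error, both per 100 years."""
    n = len(years)
    x_mean = sum(years) / n
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in years)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares and the usual standard error of the slope.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(years, values))
    stderr = (rss / (n - 2) / sxx) ** 0.5
    return 100.0 * slope, 100.0 * stderr


# Usage with synthetic standard deviations that decline by 0.2 per century.
years = list(range(1900, 2014))
stds = [1.0 - 0.002 * (y - 1900) for y in years]
slope, err = linear_trend_per_century(years, stds)
print(f"{slope:.2f} +/- {err:.2f} per century")
```

The input would be the per-year standard deviations of anomalies, one value per year, such as those shown in panels c and d of Fig. 2.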

3.2 Global evolution of temperature variability

Next, we analyze a globally distributed set of time series (GHCN), excluding the contiguous US stations used above, in the same manner as for USHCN. First, we hold the initial year constant and vary the final year (Fig. 3a). Then, we hold the final year constant and vary the initial year (Fig. 3b). In both panels, we include a grey region that marks 3σ for stationary Monte Carlo ensembles; see Section 2.2 for details. For comparison, we again examine mean anomalies (Fig. 4). For GHCN, we again see an indication of decreasing variability over the period 1900–2013, though 〈α〉 values are smaller than those for the contiguous USA (above). Furthermore, the same “check mark” patterns (Fig. 1) appear in the GHCN data (Fig. 3) with correlation coefficients of 0.86 for the trends in both panels (a) and (b). This indicates that record-breaking occurs, to some extent, in unison throughout the northern hemisphere.
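The comparison of the two 〈α〉 curves can be sketched with a Pearson correlation over the years they share. The text does not specify which correlation coefficient was used, so Pearson is an assumption here, and the function name and input layout (dictionaries mapping the variable year to 〈α〉) are illustrative.

```python
def pearson(curve_a, curve_b):
    """Pearson correlation between two year-keyed curves over their common years."""
    years = sorted(set(curve_a) & set(curve_b))
    a = [curve_a[y] for y in years]
    b = [curve_b[y] for y in years]
    a_mean = sum(a) / len(a)
    b_mean = sum(b) / len(b)
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Usage with two made-up curves; the second is a scaled copy of the first.
ushcn_like = {1900: 0.0, 1901: 0.5, 1902: 0.2, 1903: -0.4}
ghcn_like = {year: 0.5 * value for year, value in ushcn_like.items()}
print(pearson(ushcn_like, ghcn_like))
```

Because the correlation is insensitive to scale, curves with the same shape but smaller amplitude, as described for GHCN relative to USHCN, can still correlate strongly.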

Fig. 3 Similar to Fig. 1, but for global data (GHCN) excluding stations from the contiguous U.S. (USHCN). Again, variability decreases with respect to early decades. Additionally, the same “check mark” patterns seen for USHCN (Fig. 1) are apparent (correlation coefficients of 0.86 each). Here, however, the values of 〈α〉 are lower. In panel a, we consider all possible durations of the time series with a fixed initial year of 1900. Again, we see maximum values of 〈α〉 in 1917 and 1918; 1917 is the coldest year. In panel b, we consider all possible durations of the time series with a fixed end year of 2013. The grey region in each panel marks 3σ for stationary Monte Carlo ensembles. The greatest decreases are for 1913 and 1914, followed by 1917, associated with time series 1913–2013, 1914–2013, and 1917–2013.

Fig. 4 Panels a and b depict mean anomalies for GHCN time series excluding stations from the contiguous USA (USHCN). The standard deviations of anomalies each year appear as error bars. The anomalies suggest a reason for the decreasing variability observed in Fig. 3: an excess of early records. For example, maximum 〈α〉 values are associated with some of the coldest years, 1917 and 1918 (record lows). Panels c and d depict the standard deviations of the anomalies occurring each year. Linear trends are given in red: c −0.02±0.04 °C/century (no detrending) and d −0.03±0.04 °C/century (with detrending). By this measure, there is no change.

First, let us consider Fig. 3a. Increasing variability is associated with most time series beginning in 1900 and ending between 1914 and 1969. The maximum 〈α〉 value, 0.42, occurs for 1917, associated with 1900–1917. Indeed, as with USHCN, 1917 is an extreme cold year, a likely contributing factor in the high correlation between 〈α〉 values in USHCN and GHCN. Alternatively, let us consider Fig. 3b. Time series beginning between 1900 and 1956 and ending in 2013 generally suggest decreasing variability and an excess of records occurring early in the century. For example, the minimum 〈α〉 value, −0.49, occurs for 1913, associated with 1913–2013. This minimum is closely followed by that for 1917–2013, −0.44. In short, we have agreement with USHCN: an early cluster of records occurring between 1910 and 1940 that continues to dominate through 2013.

We note that, as was true for the USHCN data in Fig. 1, 〈α〉 is recorded according to the variable year regardless of the actual fixed year. This is particularly true for the GHCN data where many time series used do not have data before 1915 and many time series do not have data after 1990. For example, when the first year is fixed, the number of time series for 1900–2013 is only 85 % of that for a fixed last year. Indeed, 〈α〉=−0.19 for fixed initial year (1900–2013) and 〈α〉=−0.10 for fixed final year (1900–2013). Therefore, we add for comparison that 〈α〉=−0.09 for all time series at maximum length.

Is there also evidence of the changing variability within the standard deviations of monthly mean anomalies? Unlike USHCN, the average standard deviations for GHCN do not appear to change much over time: the linear trends are an order of magnitude smaller than for USHCN. A linear trend is given in red: −0.02±0.04 °C/century (no detrending) and −0.03±0.04 °C/century (with detrending). By this measure, unlike for USHCN, there is essentially no trend.

4 Concluding remarks

In terms of both attention and magnitude, trends in surface temperature variability have been largely overshadowed by trends in mean surface temperature. Yet, to predict future trends and extremes, we must understand how both the mean and higher moments are changing. There are multiple ways to do this; for example, one can focus on seasonal, intraannual, or interannual trends in variance. Here, we focus on the evolution of variability of interannual monthly mean temperatures.

To do this, we build on prior research in which we found a significant decrease in variability for GHCN version 2 (including USHCN stations); see Anderson and Kostinski (2010). Here, we offer a substantial expansion of those results by asking about the evolution of the signal and treating USHCN separately from GHCN (given its dominance in the GHCN ensemble). We also now use the recently updated versions of GHCN (v3) and USHCN (v2.5). As before, we make use of our variability index, α, but here we calculate and interpret it for varying initial and final years, thus avoiding a time period bias and learning about its evolution. We also add an analysis of standard deviations for comparison.

We find that variability has decreased for both USHCN and GHCN between 1900 and 2013. In the case of USHCN, this result is corroborated by an observed trend in standard deviation of anomalies each year. Our analysis indicates that the decrease in variability is caused largely by early high and low record-breaking temperatures, particularly those clustered around 1920. Indeed, 1917 (record low value) and 1921 (record high value) are among the coldest years and hottest years between 1900 and 2013. However, when the earlier decades are removed, especially 1900–1940, the trend is diminished or reversed. This is consistent with an earlier study by Karl et al. (1995), even though we add 30 years of data (1900–1910 and 1995–2013). They found that interannual variance had decreased in the USA between 1911 and 1994, but increased in more recent decades, 1974–1993. We find the largest interannual variability between 1900 and 2013 is still found early in the century despite the current warming trend.

Lastly, while 〈α〉 values indicate a decreased variability for GHCN monthly mean temperatures, unlike USHCN, standard deviations of anomalies each year do not corroborate this. This indicates that 〈α〉 may be able to detect differences in variability that are not observable using more traditional methods. Indeed, the similarity of the results for the USHCN and GHCN data sets is striking. Though they share no stations, both data sets contain “check mark” patterns caused by an early clustering of records surrounding 1920 (see Figs. 1 and 3).