Main

Although most land regions show robust warming over the past century13, the pattern of change has not been spatially uniform14. This heterogeneity results from regional differences in the response of the climate system to increasing radiative forcing, and from the background noise of climate variability. Together, these factors substantially increase the challenge of climate change detection, attribution and projection at regional and local scales14,15,16.

The spatial pattern of changes in extreme weather events has generated arguments that global warming has caused dynamic and/or thermodynamic changes that have differentially altered extreme event probabilities1,17. Thermodynamic arguments are well understood and observed. For example, the accumulation of heat in the atmosphere has resulted in upward trends in hot extremes, downward trends in the majority of cold extremes, and more intense hydroclimatic events1,2. Dynamic arguments have greater uncertainties15,16,17,18,19. Changes in the large-scale atmospheric circulation—for instance, an increase in the occurrence or persistence of high-amplitude wave patterns—could alter the likelihood of extreme events20. Recent extremes in the Northern Hemisphere mid-latitudes1,2,17 have motivated hypotheses of a dynamic linkage between ‘Arctic amplification’, altered atmospheric circulation patterns, and changes in the probability of mid-latitude extremes3,4,5,17. Despite divergent views on the causal direction of this linkage17, altered atmospheric dynamics are consistently invoked. Although trends in mean-seasonal mid-atmospheric geopotential heights have been identified (fig. 2.36 of ref. 21; Fig. 1), evidence of changes in the occurrence of sub-seasonal atmospheric patterns remains equivocal, as does their contribution to extreme event probabilities6,7,8.

Figure 1: Trends in mid-atmospheric geopotential heights.

Northern Hemisphere polar projections of 1979–2013 seasonal trends (m yr−1) in 500 hPa geopotential heights. Trends are computed for winter (a; December, January, February (DJF)), spring (b; March, April, May (MAM)), summer (c; June, July, August (JJA)) and autumn (d; September, October, November (SON)) seasons. Geopotential height fields are sourced from NCEP-DOE-R2.


Previous efforts to detect trends in atmospheric circulation may have been hampered by narrowly defined, spatially sensitive, and/or non-standardized metrics3,6,7,8,17. We therefore employ a large-scale spatial characterization approach—self-organizing map (SOM) cluster analysis—to track the occurrence of highly generalized mid-atmospheric circulation patterns. We use 500 hPa geopotential height anomaly fields to describe daily circulation, and group each day’s pattern into one of a predefined number of SOM clusters based on a measure of pattern similarity9,10,11,12 (Methods). The number of clusters is largely dependent on the degree of specificity/generality required to test a particular hypothesis9,10,11,12. To facilitate generalized large-scale mid-atmospheric classification, we use four clusters per domain. Using three reanalyses (Methods), we calculate linear trends (yr−1) in the time series of annual values of (1) the total number of days in each season on which each SOM pattern occurs (‘occurrence’ (d yr−1)); (2) the mean length of consecutive occurrence (‘persistence’ (d event−1)); and (3) the longest consecutive occurrence (‘maximum duration’ (d event−1)). We consider trends in each metric to be robust if matching circulation patterns from all three reanalyses have statistically significant trends of the same sign. We assess the robustness of trends for seven mid-latitude regions (Figs 2 and 3a), over both the era of satellite observation (1979–2013; ‘satellite era’) and the era of rapidly diminishing Arctic sea ice22 (1990–2013; ‘ice era’). We report circulation patterns that pass these robustness criteria, but also discuss results in the context of (1) comprehensive multiple hypothesis testing; (2) removal of the assumption of linear time-series relationships; (3) use of fewer/more clusters; and (4) addition of atmospheric thermal dilation controls (Methods; Extended Data Table 1 and Extended Data Figs 1, 2, 3).

Figure 2: Trends in surface temperature extremes and atmospheric circulation patterns.

Trends are calculated for each Northern Hemisphere season (December, January, February (DJF), winter; March, April, May (MAM), spring; June, July, August (JJA), summer; September, October, November (SON), autumn) for two periods: 1979–2013 (satellite era) and 1990–2013 (ice era). Regional domains (see Fig. 3a) in which one or more of the four SOM circulation patterns demonstrate robust trends in mid-atmospheric circulation pattern occurrence (O), persistence (P), or maximum duration (M) are shown in green (Extended Data Figs 1 and 2). Positive (+) and negative (−) symbols are displayed when all three reanalyses show statistically significant trends in a particular circulation pattern (5% significance level; Methods), and agree on the sign of those trends. Multiple symbols within a box indicate multiple robust pattern trends. White boxes without symbols indicate no statistically significant trends and/or reanalysis disagreement (see Methods). Regional domains with positive and/or negative trends in cold (05) and/or hot (95) extremes receive (+) or (−) symbols when the three reanalyses agree on the sign of the area-weighted trend. Red and blue boxes indicate that the extreme temperature trend results in warming and cooling, respectively, while grey boxes indicate reanalysis disagreement.


Figure 3: Trends in circulation patterns and hot extremes over Europe.

a, 1979–2013 trends in summer hot extreme occurrences for all regional domains based on 2-m maximum/minimum temperatures from NCEP-DOE-R2. b–e, SOM-derived mid-atmospheric circulation patterns (500 hPa geopotential height anomalies) over Europe. White boxed values show pattern frequencies in the top left and SOM node numbers in the top right. f–i, Time series of SOM circulation pattern occurrence (black (d yr−1)), persistence (blue (d event−1)) and maximum duration (red (d event−1)). The slope of the trend line (yr−1) and P values (in parentheses) are colour coded, with the values from 1979–2013 (solid trend line) displayed above those from 1990–2013 (dashed trend line). j–m, Spatially rendered trends in hot extreme occurrences for days that correspond to each SOM circulation pattern. n–q, Time series of the area-weighted mean of hot extremes per pattern occurrence, referred to throughout the text as a measure of the intensity of temperature extremes associated with each pattern. Statistically significant trends (5% significance level; Methods) are shown by stippling in the mapped panels and by bold font in the scatter plots.


Of the 112 total circulation patterns analysed in each period (Methods), the three reanalyses exhibit statistically significant trends in pattern occurrence for a total of 17, 16 and 16 patterns in the satellite era, and 15, 13 and 14 patterns in the ice era (Extended Data Table 1a). Of these significant occurrence trends, 12 satellite-era and 10 ice-era patterns are robustly significant across all three reanalyses (Fig. 2). The majority of robust satellite-era trends occur in summer and autumn, while robust ice-era trends are more evenly distributed over summer, autumn and winter. These patterns are diverse, and include anticyclonic, cyclonic and ‘dipole’ circulations (Extended Data Figs 4 and 5). Patterns with robust trends in both satellite and ice eras are limited to summer and autumn over western Asia and eastern North America.

While the number of significant trends in pattern persistence varies from 5 to 10 across the individual reanalyses (Extended Data Table 1a), only three robust pattern persistence trends are identified in each period (Fig. 2). Robust maximum duration trends are more prevalent, including five in the satellite era and six in the ice era. These are predominantly associated with summer anticyclonic patterns, although the maximum duration of central Asia winter troughing events demonstrates a robust ice-era increase (Extended Data Figs 4 and 5). In regions with robust trends in multiple patterns, those patterns are generally complementary. For example, in summer over eastern North America, robustly increasing satellite-era trends in anticyclonic patterns co-occur with robustly decreasing trends in cyclonic patterns.

We next explore the extent to which trends in mid-atmospheric circulation patterns have influenced the likelihood of temperature extremes. For each period, we compute area-weighted trends in the seasonal occurrence of temperature extremes for all days, and for those days associated with each SOM pattern (for example, Fig. 3a, j–m; see Methods). The three reanalyses generally agree on the direction of all-days trends: consistent with enhanced radiative forcing and global warming, most regions and seasons show positive trends in hot occurrence, and negative trends in cold occurrence (Fig. 2).

Hot extremes are projected to increase due to the dynamic and thermodynamic effects of global warming1,23. Consistent with other assessments1,2, we find substantial increases in extreme heat occurrence over the mid-latitudes (Extended Data Figs 6 and 7). For instance, the regional-mean occurrence of summer hot days over Europe, western Asia and eastern North America has increased 0.10, 0.16 and 0.13 d yr−1 yr−1, respectively, over the satellite era (Fig. 3a and Extended Data Table 2a–c). By definition, one would expect (on average) ∼4.5 5th/95th percentile events per 3-month season, meaning that an increase of 0.10 d yr−1 yr−1 accumulated over the course of the satellite era (35 years) yields an additional ∼3.5 d yr−1, an ∼75% increase.

Heatwaves, similar to those which occurred in western Russia in 2010 and Europe in 2003, develop when persistent anticyclonic patterns, often referred to as ‘atmospheric blocking’, initiate a cascade of self-reinforcing, heat-accumulating physical processes24,25. In addition to the increasing trends in extreme heat occurrence, robust positive trends in the occurrence, persistence and maximum duration of satellite-era summer mid-atmospheric anticyclonic patterns are detected over Europe (Fig. 3c, g), western Asia (Fig. 4a, e), and eastern North America (Extended Data Fig. 4c). Robust positive trends in the occurrence of satellite-era anticyclonic patterns are also detected—along with increasing hot extremes—in autumn over eastern North America (Fig. 4c, g), eastern Asia (Fig. 4d, h) and central North America, and in spring over Europe (Extended Data Fig. 1).

Figure 4: Circulation pattern and thermal extreme trends for selected regions.

Trends in thermal extreme occurrences for selected regions and seasons based on 2-m maximum/minimum temperatures from NCEP-DOE-R2. a–d, SOM-derived mid-atmospheric circulation patterns (500 hPa geopotential height anomalies) over western Asia in summer (a), central Asia in winter (b), eastern North America in autumn (c), and eastern Asia in autumn (d). White boxed values show pattern frequencies in the top left and SOM node numbers in the top right. In contrast to Fig. 3, just one of the four SOM circulation patterns is displayed from each region. e–h, Time series of SOM circulation pattern occurrence (black (d yr−1)), persistence (blue (d event−1)) and maximum duration (red (d event−1)). The slope of the trend line (yr−1) and P values (in parentheses) are colour coded, with the values from 1979 to 2013 (solid trend line) displayed above those from 1990 to 2013 (dashed trend line). i–l, Spatially rendered trends in thermal extreme occurrences for days that correspond to each SOM circulation pattern. m–p, Time series of the area-weighted mean of temperature extremes per pattern occurrence, referred to throughout the text as a measure of the intensity of temperature extremes associated with each pattern. Statistically significant trends (5% significance level; Methods) are shown by stippling in the mapped panels and by bold font in the scatter plots. Refer to Extended Data Figs 6 and 7 for satellite-era and ice-era trends in temperature extremes over the regional domains.


Increases in hot extremes may result from dynamic changes (namely greater occurrence and persistence of anticyclonic patterns) as well as from thermodynamic changes such as land cover change or global warming (reflected in the increased intensity of extreme temperature when anticyclonic patterns occur). Over Europe, the summer occurrence of circulations similar to dipole patterns with ridging over the eastern half of the domain (Fig. 3c) increased 0.45 d yr−1 yr−1 over the satellite era, while the persistence and maximum duration increased 0.05 and 0.19 d event−1 yr−1, respectively (Fig. 3g). The trend in the frequency of hot events coincident with this pattern (0.06 d yr−1 yr−1; Fig. 3k and Extended Data Table 2a) accounts for ∼61.5% of the total trend in hot extremes over Europe (0.10 d yr−1 yr−1; Fig. 3a). In addition, the number of hot extremes per pattern occurrence has increased for all four patterns (Fig. 3n–q and Extended Data Table 2a). Under the assumption of pattern stationarity, we perform a quantitative partitioning of the dynamic and thermodynamic contributions to the overall extreme temperature trend, as well as to the extreme temperature trends associated with each pattern10 (Methods). This partitioning reveals that ∼27.3% of the 0.10 d yr−1 yr−1 overall increasing trend in hot extremes is driven by the dynamic influence of increased occurrence of the dipole pattern (along with ∼35.2% thermodynamic and ∼−1.0% interaction influences). Additionally, of the 0.06 d yr−1 yr−1 portion of the hot extremes trend associated with this pattern, ∼57.3% is attributable to thermodynamic influences and ∼44.3% to increased pattern occurrence (Extended Data Table 2a). Together, these results suggest that the observed increase in extreme summer heat over Europe is attributable to both increasing frequency of blocking circulations and changes in the surface energy balance. Similar results are found in other regions that exhibit robust upward trends in anticyclonic patterns (Fig. 4 and Extended Data Table 2).

Global warming is also generally expected to decrease the frequency of cold extremes1. In autumn over eastern Asia, the occurrence of satellite-era cold extremes decreased 0.08 d yr−1 yr−1 (Extended Data Fig. 6g and Extended Data Table 2f), indicating a reduction of ∼60% over the 35-year period. A majority of this decreasing trend (0.05 d yr−1 yr−1; Extended Data Table 2f) is attributable to changes associated with one pattern type: cyclonic circulations capable of advecting cold air equator-ward (Fig. 4d). Less frequent occurrence of cyclonic patterns (Fig. 4h), in conjunction with less intense cold temperature anomalies when cyclonic patterns occur (Fig. 4p), drives ∼62.5% of the overall decreasing trend (partitioned ∼35.4% dynamic, ∼21.9% thermodynamic, ∼5.3% interaction). Of the 0.05 d yr−1 yr−1 decrease in extreme cold associated with the trend in cyclonic patterns, partitioning indicates ∼56.5% dynamic and ∼35.0% thermodynamic influences (Extended Data Table 2f).

In contrast to this expected extreme cold decrease, winter cold extremes over central Asia have increased 0.07 d yr−1 yr−1 over the ice era (Extended Data Fig. 7a). One-hundred-and-forty-nine per cent of this trend (0.10 d yr−1 yr−1; Fig. 4j) occurred when mid-atmospheric circulation was similar to a pattern of troughing in the south and east, and ridging in the northwest (Fig. 4b; ∼111.4% dynamic, ∼25.6% thermodynamic, ∼11.6% interaction). Trend percentages exceeding 100% indicate that other circulation patterns provide negative contributions. Occurrence and persistence of this dipole pattern robustly increased (1.0 d yr−1 yr−1 and 0.12 d event−1 yr−1, Fig. 4f) at the expense of all other circulations (Extended Data Table 2d). Partitioning indicates that ∼75.0% of the extreme cold trend associated with this pattern is due to the dynamic influence of increased pattern occurrence, with ∼17.2% linked to thermodynamic influences (Extended Data Table 2d).

Substantial dynamic contributions to the overall trend in cold extremes could be expected given that circulations that support the equator-ward advection of Arctic air will bring anomalously cold temperatures to lower-latitude locales20. Increased occurrence of such patterns has previously been observed, and linked to reduced regional sea-ice and decreased baroclinicity over the Barents and Kara seas4,17,26,27,28. Positive thermodynamic contributions to the extreme cold trend indicate processes that are in opposition to the direct warming effects of enhanced radiative forcing. For example, positive thermodynamic contributions from three of the four winter patterns over central Asia (Extended Data Table 2d) suggest that these thermodynamic contributions are largely independent of atmospheric circulation, and therefore potentially related to surface processes such as increased snow cover and enhanced diabatic cooling4,28.

The circulation trends detected here cannot as yet be attributed to anthropogenic or natural causes, nor can they be projected to continue into the future. Attribution and projection will require an increased understanding of the causes of the circulation trends, including the ability to identify the signal of an anthropogenically forced trend from the noise of internal decadal-scale climate variability16,29. However, our quantitative partitioning, in conjunction with targeted climate model simulations16,29,30, offers the potential to fingerprint dynamic and thermodynamic climate influences in isolation, which in turn may facilitate attribution of the observed trends, and projection of future trends. We hypothesize that the main assumption of our quantitative partitioning—pattern stationarity—is justified given the expectation that circulation responses to enhanced radiative forcing are likely to reinforce pre-existing modes of natural variability15,16. A related assumption is that the reanalyses act as reasonable proxies for the state of the three-dimensional atmosphere through time. Given uncertainties in the data assimilation and numerical modelling that underpin atmospheric reanalysis, we have restricted our identification criteria to those trends that are statistically significant in all three reanalyses.

Our approach finds robust trends in mid-atmospheric circulation patterns over some regions, and suggests that both dynamic and thermodynamic effects have contributed to observed changes in temperature extremes over the past 35 years. Although thermodynamic influences have largely dominated these changes, dynamic influences have been critical in some regions and seasons. Long-term projections of future dynamic contributions are challenging given the substantial underlying decadal-scale variability, as well as the uncertain impact of anthropogenic forcing on mid-latitude circulation15,16. However, given our finding that many patterns have exhibited increasing (decreasing) intensity of extreme hot (cold) events, and that those trends are coincident with a nearly categorical increase in thermodynamic forcing, the observed trends of increasing hot extremes and decreasing cold extremes could be expected to continue in the coming decades, should greenhouse gases continue to accumulate in the atmosphere.

Methods

Categorization of circulation patterns

We use SOM cluster analysis9,10,11,12 to categorize large-scale circulation patterns over seven Northern Hemisphere domains31 using daily 500 hPa geopotential height anomaly fields from the NCAR/NCEP-R1 (ref. 32), NCEP-DOE-R2 (ref. 33) and ECMWF ERA-Interim (ref. 34) reanalyses. Daily anomalies are calculated by subtracting the seasonal cycle (calendar-day mean) from each grid cell. Reanalyses are analysed individually to maintain their physical consistency, and to facilitate their intercomparison. The SOM’s unsupervised learning algorithm requires neither a priori knowledge of which types of circulation patterns might be detected, nor the specific geographic regions in which they might occur. Geopotential height anomaly fields from each day are assigned to one of a pre-defined number of nodes, according to pattern similarity. The final SOM patterns are obtained by minimizing the Euclidean distance between iteratively updated nodes and their matching daily geopotential height anomaly fields11. Each SOM pattern can therefore be viewed as a representative composite of relatively similar circulation patterns.
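The node-assignment and iterative-update steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the one-dimensional node layout, Gaussian neighbourhood, linear decay schedules, initialisation from randomly chosen days, and the names `train_som` and `best_matching_node` are all assumptions made for illustration. Each daily anomaly field is assumed to be flattened into a plain list of grid-cell values.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two flattened anomaly fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matching_node(nodes, field):
    """Index of the node whose pattern is closest to the daily field."""
    return min(range(len(nodes)), key=lambda k: euclidean(nodes[k], field))

def train_som(fields, n_nodes=4, n_epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a small SOM on a list of flattened daily anomaly fields."""
    rng = random.Random(seed)
    # Assumption: nodes start as copies of randomly chosen days.
    nodes = [list(f) for f in rng.sample(fields, n_nodes)]
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac)                     # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 1e-3)  # shrinking neighbourhood
        for field in rng.sample(fields, len(fields)):  # shuffled pass over days
            bmu = best_matching_node(nodes, field)
            for k in range(n_nodes):
                # Gaussian neighbourhood on an assumed 1-D line of nodes.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * radius ** 2))
                nodes[k] = [w + lr * h * (x - w) for w, x in zip(nodes[k], field)]
    return nodes
```

After training, `best_matching_node(nodes, field)` gives the SOM pattern label for any day, and the sequence of daily labels is the input to the occurrence, persistence and maximum duration metrics.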

Owing to global-scale warming, trends in geopotential height anomalies record both altered atmospheric circulation patterns and the thermal expansion of the troposphere. To isolate the signal of circulation pattern change, previous clustering analyses have assumed uniform thermal dilation and removed either the domain average35 or domain average linear trend36 from the daily-scale anomalies. In our analysis, we find that 1979–2013 trends in Northern Hemisphere geopotential height anomalies are non-uniform in both magnitude and sign (Fig. 1), and demonstrate substantial seasonal, regional and latitudinal differences (Extended Data Fig. 3e). These findings suggest that for the relatively short period of our analysis, an assumption of uniform thermal dilation is inappropriate. Moreover, the strong spatial heterogeneity indicates the importance of large-scale dynamics in the regional geopotential height trends, and so the removal of local geopotential height trends would conflate dynamic changes with thermal dilation. Therefore, in the main text we present results and conclusions based on raw geopotential height data. However, despite the lack of uniform expansion, we have conducted an analysis that attempts to account for the effects of thermal dilation. SOM analyses are performed on geopotential heights that have been detrended by removing the seasonal mean hemispheric trend (Extended Data Fig. 3e) from each grid cell. Results from this analysis indicate that the magnitude, significance and sign of circulation trends are sensitive to the method of controlling for thermal expansion (Extended Data Fig. 3f–j). Despite this sensitivity, the conclusions presented in the main text are supported, in that both raw and detrended analyses generally suggest trends of similar magnitude, sign and significance.

Based on domain-wide pattern correlations between daily height field anomalies and different SOM node counts12 (Extended Data Fig. 8), we divide circulation patterns over each domain into four SOM nodes (for example, Fig. 3b–e). To determine a suitable number of nodes, a suite of different node counts was analysed. We found that four nodes were sufficiently great in number to capture a diversity of highly generalized circulation patterns, but sufficiently few to facilitate convenient presentation and, critically, to prevent overly similar SOM patterns12. To test the sensitivity of our results to the number of nodes, we present 2-, 4-, 8- and 16-node SOMs for the summer season over the European domain (Extended Data Figs 1 and 2). Based on these analyses, it is apparent that a 2-node SOM is insufficient to capture the diversity of circulation patterns that are found in the reanalysis data (Extended Data Fig. 1a), whereas 8- and 16-node SOMs produce nodes with overly similar circulation patterns (Extended Data Figs 1c and 2). Examination of the 4-, 8- and 16-node SOMs largely verifies the conclusions drawn from the 4-node SOM: that the occurrence of patterns with ridging over the eastern half of the European domain has increased over time, while the occurrence of complementary patterns has decreased. We note that these pattern trends are not identified in the 2-node SOM, confirming that two nodes are too few to capture specific circulation patterns that are critical for extreme temperature occurrence. Similar node-count analyses for other regions/seasons likewise verify the conclusions drawn from the 4-node analyses that are presented in the main text (not shown).

Calculation of robust trends in circulation patterns

For each season in each year, we calculate (1) the total number of days on which each SOM pattern occurs (occurrence (d yr−1)); (2) the mean length of consecutive occurrence (persistence (d event−1)); and (3) the longest consecutive occurrence (maximum duration (d event−1)). A trend in one of these characteristics is considered robust when the trend in that pattern is (1) statistically significant in all three reanalyses, and (2) of the same sign in all three reanalyses. Trends with regression coefficients that surpass the 5% significance (95% confidence) threshold are considered statistically significant. Trends are calculated across satellite-era and ice-era annual time series using the approach of ref. 37, which allows us to account for temporal dependence. Here, the trends are calculated using linear least squares regression, but to account for temporal dependence, the confidence bounds of annual time-series trends with lag-one autocorrelation greater than the 5% significance level are recalculated by adjusting the number of degrees of freedom used to compute the regression coefficient significance37. As a result, unlike a simple linear regression, this approach does not rely on the independent and identically distributed assumption for the residuals, but instead accounts for temporal dependence using a red noise assumption.
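The three per-season metrics defined at the start of this subsection can be computed from a sequence of daily SOM labels with a short run-length calculation. The sketch below is illustrative only: the function name `pattern_metrics` is hypothetical, and returning zero persistence for a season in which the pattern never occurs is an assumption, since the text does not say how such seasons are treated.

```python
from itertools import groupby

def pattern_metrics(labels, pattern):
    """Occurrence, persistence and maximum duration of one SOM pattern.

    labels: SOM node labels, one per day, for a single season of one year.
    """
    # Length of every unbroken run of days assigned to `pattern`.
    runs = [sum(1 for _ in grp) for key, grp in groupby(labels) if key == pattern]
    occurrence = sum(runs)                                 # total days (d yr^-1)
    # Assumption: persistence is reported as 0 when the pattern never occurs.
    persistence = occurrence / len(runs) if runs else 0.0  # mean run (d event^-1)
    max_duration = max(runs, default=0)                    # longest run (d event^-1)
    return occurrence, persistence, max_duration
```

For the label sequence `[0, 0, 1, 1, 1, 0, 2, 1, 1, 0]`, pattern 1 occurs in two events of 3 and 2 days, giving an occurrence of 5 days, a persistence of 2.5 days per event and a maximum duration of 3 days.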

The approach of ref. 37 also assumes that the distribution of the residuals is Gaussian. Using the Anderson–Darling38 test for normality, we find that 91–100% of residual distributions in each metric of each reanalysis do not reject the null hypothesis of Gaussianity when multiple hypothesis testing controlling the familywise error rate39 (FWER) at the 5% significance level is considered (Extended Data Table 1b). The Gaussianity assumption is therefore largely appropriate. However, due to the identification of non-normality in some distributions, particularly in the persistence and maximum duration metrics, we apply Box–Cox power transformations40 to all distributions. Using the Anderson–Darling test, we find that 96–100% of the distributions of the residuals in the transformed setting do not reject the null hypothesis of Gaussianity at the 5% significance level. Furthermore, when multiple hypothesis testing is considered by controlling the FWER at the 5% level, 100% of the individual tests are non-significant (Extended Data Table 1b). In addition, the number of pattern trends identified as significant in the transformed case is largely consistent with the non-transformed regression analysis. However, additional significant trends in the persistence and maximum duration metrics are identified when Box–Cox transformations are used (Extended Data Table 1a). The large overlap of results between the transformed and non-transformed analyses suggests that although individual residual distributions may vary, the Gaussian assumption applied throughout this study is for the most part quite robust, although in some cases not valid. In sum, the non-transformed analysis that fits a linear relationship to circulation pattern metrics allows for a relatively simple classification of two short analysis periods, while simultaneously accounting for temporal dependence in a large number (>2,000) of individual time series (7 regions × 4 nodes × 4 seasons × 3 characteristics × 2 time periods × 3 reanalyses).
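The idea of a least-squares trend whose significance test uses a reduced number of degrees of freedom under a red noise assumption can be sketched as follows. This is not a reproduction of the procedure in ref. 37: the effective sample size formula n(1 − r1)/(1 + r1), the approximate 5% bound of 1.96/√n for the lag-one autocorrelation r1, and the function names are all assumptions chosen because they are common in the climate literature. The sketch returns the t statistic and degrees of freedom; converting these to a P value requires a Student's t distribution, which is omitted here.

```python
import math

def ols_fit(y):
    """Least-squares slope and intercept of y against time steps 0, 1, 2, ..."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    slope = sum((i - t_mean) * (v - y_mean) for i, v in enumerate(y)) / sxx
    return slope, y_mean - slope * t_mean, sxx

def lag1_autocorrelation(x):
    """Sample lag-one autocorrelation of a sequence."""
    mean = sum(x) / len(x)
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(len(x) - 1))
    return num / den if den > 0 else 0.0

def adjusted_trend(y):
    """Return (slope, t statistic, degrees of freedom) for an annual series."""
    n = len(y)
    slope, intercept, sxx = ols_fit(y)
    resid = [v - (intercept + slope * i) for i, v in enumerate(y)]
    r1 = lag1_autocorrelation(resid)
    n_eff = float(n)
    # Assumption: adjust only when r1 exceeds an approximate 5% white-noise bound.
    if r1 > 1.96 / math.sqrt(n):
        n_eff = n * (1.0 - r1) / (1.0 + r1)  # assumed effective sample size
    dof = n_eff - 2.0
    if dof <= 0:
        raise ValueError("too few effective samples for a trend test")
    sse = sum(e * e for e in resid)
    std_err = math.sqrt(sse / (dof * sxx))
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return slope, t_stat, dof
```

When the residuals are not positively autocorrelated the function reduces to ordinary least squares with n − 2 degrees of freedom; with persistent residuals the degrees of freedom shrink, so the same slope is judged against a wider confidence bound.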

Because SOM nodes are calculated independently for each reanalysis, individual SOM patterns must be matched between the three reanalyses in order to determine whether an individual pattern shows robust results across all three reanalyses. To determine which SOM patterns are the closest match between the three reanalyses, the root mean square error (RMSE) is calculated between the SOM patterns of one reanalysis and those of the other reanalyses. Patterns with the smallest RMSEs are considered matches. Although we undertake a multi-reanalysis robustness evaluation, further work is needed to confirm that other available reanalyses (such as CFSR and MERRA) show the same trends.
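A minimal sketch of the RMSE-based matching is given below, assuming each SOM pattern has been flattened to a list of grid-cell values on a common grid. The function names are hypothetical, and the sketch simply takes the lowest-RMSE candidate for each reference pattern; whether the study additionally enforces a one-to-one assignment is not stated, so none is imposed here.

```python
import math

def rmse(a, b):
    """Root mean square error between two flattened SOM patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def match_patterns(reference, other):
    """For each reference pattern, index of the lowest-RMSE pattern in `other`."""
    return [
        min(range(len(other)), key=lambda j: rmse(ref, other[j]))
        for ref in reference
    ]
```

For example, with reference patterns `[[1, 1], [-1, -1]]` and candidate patterns `[[-0.9, -1.1], [1.1, 0.9]]`, the first reference pattern matches candidate 1 and the second matches candidate 0.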

Multiple hypothesis testing of linear trends

It is possible that some of the trends identified as significant in any individual reanalysis could occur by chance. In addition to screening for those patterns that are significant in all three reanalyses (our ‘robustness’ criterion), we also employ formal multiple hypothesis testing using three error metrics. The first is the familywise error rate (FWER). This metric controls the probability of falsely rejecting any null hypothesis, and is considered one of the strictest forms of error control39. The second is the k-familywise error rate41 (k-FWER), which formally accounts for the fact that a certain number of false rejections can happen by chance alone. The k-FWER controls the probability of falsely rejecting k or more null hypotheses. One option for the value of k is to use the expected number of hypotheses that will be rejected at a given significance level. For instance, in our study, out of 112 total ‘local’ hypotheses, 5 or 6 hypotheses are expected to be significant at the 5% significance level by chance (112 × 0.05 = 5.6). In this case, one can evaluate the probability that 7 or more hypotheses are falsely rejected, since on average about 6 could be rejected as significant by chance. The third metric is the false discovery rate (FDR), which controls the expected ratio of the number of false rejections to the total number of rejections39.

All of the above measures of error control aim to guard against hypotheses being falsely declared as significant in the context of multiple tests. To be thorough, we have implemented all three types of error control at both global significance levels of 5% and 10%. The results of these analyses are summarized in Extended Data Table 1. We note that all three metrics heavily favour the null as they are designed to protect against the possibility of false positives. Despite this, the presence of local tests that reject the null represents a strong confirmation of the significance of those local tests. The fact that a number of local hypotheses still prevail as significant, even after imposing much stricter multiple testing error controls, arises partly from the fact that some of the local P values indicate trends that are so highly significant that they can withstand the stricter multiple testing error control metrics. This rigorous multiple-testing error control lends increased credibility to the scientific conclusions of robust trends in pattern occurrence.
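The three kinds of error control can be illustrated on a list of local P values. The text does not name the specific procedures used, so the choices below are assumptions: Holm's step-down procedure for the FWER, a single-step generalized Bonferroni rule (reject when p ≤ kα/m) for the k-FWER, and the Benjamini–Hochberg step-up procedure for the FDR. Function names are hypothetical.

```python
def holm_fwer(pvalues, alpha=0.05):
    """Holm step-down procedure (assumed choice for FWER control)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvalues[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger P values fail too
    return reject

def k_fwer_bonferroni(pvalues, k, alpha=0.05):
    """Single-step generalized Bonferroni rule (assumed choice for k-FWER)."""
    m = len(pvalues)
    return [p <= k * alpha / m for p in pvalues]

def benjamini_hochberg_fdr(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed choice for FDR)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject
```

With P values `[0.001, 0.012, 0.025, 0.045, 0.2]` at the 5% level, the Holm procedure rejects the first two hypotheses, the generalized Bonferroni rule with k = 2 also rejects the first two, and the Benjamini–Hochberg procedure rejects the first three, which shows the ordering in strictness described above.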

Temperature extremes

Daily-scale hot and cold extreme occurrences are calculated using temperature anomalies at each grid cell. Temperature anomalies are computed by removing the seasonal cycle from daily reanalysis 2-m maximum/minimum temperatures. Similar to previous studies1,2, temperature extremes are calculated based on the statistical distribution of daily temperature anomalies20. Hot/cold extreme thresholds are defined as the 95th/5th percentile value of the 1979–2013 daily 2-m maximum/minimum temperature anomaly distribution (for example, for JJA, the population of daily-maximum temperature anomalies from the months of June, July and August in the years 1979–2013). Hot/cold extreme occurrences are defined as days on which the daily temperature anomalies are greater/less than (or equal to) the hot/cold extreme thresholds. Reanalysis temperature extremes are qualitatively similar to those found in station-based observations2. Given this similarity, we use the reanalysis temperatures in order to maintain internal physical consistency between daily 2-m temperatures and daily atmospheric circulation (as represented by the 500 hPa SOM circulation patterns). Trends in temperature extreme occurrence are computed across satellite-era and ice-era annual time series following the methodology of ref. 37.
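The threshold-and-count step for a single grid cell and season can be sketched as below. The input layout (a dictionary mapping each year to that season's daily temperature anomalies), the linear-interpolation percentile and the function names are assumptions for illustration; the text does not specify how the percentile is interpolated.

```python
def percentile(values, q):
    """Percentile (q in [0, 100]) using linear interpolation (assumed method)."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def count_extremes_by_year(anoms_by_year, q=95.0, hot=True):
    """Count extreme days per year at one grid cell for one season.

    anoms_by_year: {year: [daily anomalies for that season]} (assumed layout).
    The threshold is taken from the pooled distribution over all years.
    """
    pooled = [a for vals in anoms_by_year.values() for a in vals]
    threshold = percentile(pooled, q)
    if hot:
        # Hot extremes: anomalies greater than or equal to the threshold.
        return {yr: sum(a >= threshold for a in vals)
                for yr, vals in anoms_by_year.items()}
    # Cold extremes: anomalies less than or equal to the threshold.
    return {yr: sum(a <= threshold for a in vals)
            for yr, vals in anoms_by_year.items()}
```

Calling the function with `q=95.0, hot=True` on daily-maximum anomalies gives hot extreme counts, and with `q=5.0, hot=False` on daily-minimum anomalies gives cold extreme counts; the resulting annual series is what the trend calculation is applied to.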

Quantitative partitioning

To determine the dynamic and thermodynamic contributions to trends in temperature extreme occurrence, we adapt the climate change partitioning methodology of Cassano et al.10. Our adapted methodology partitions the contributions of dynamic and thermodynamic changes to (1) the overall trend in temperature extreme occurrence and (2) the trends associated with individual SOM circulation patterns. Previous applications of the Cassano et al.10 methodology indicate that partitioning is largely insensitive to the number of SOM nodes used in the analysis42. All trends in temperature extremes in the methodology below are area-weighted averages. Following Cassano et al.10:

$$E = \sum_{i=1}^{K} E_i f_i$$

where $E$ is the frequency of extreme temperature occurrence, $f_i$ is the frequency of occurrence of SOM pattern $i$, $E_i$ is the frequency of extreme temperature occurrence when SOM pattern $i$ occurs, and $K$ is the total number of SOM nodes. We decompose $E_i$ and $f_i$ into time-mean (overbar) and deviation-from-time-mean (prime) components:

$$E = \sum_{i=1}^{K} \left(\overline{E_i} + E_i'\right)\left(\overline{f_i} + f_i'\right)$$

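Now we differentiate the above equation with respect to time, noting that the time-mean values (overbars) are constants while the deviations (primes) vary in time:

$$\frac{dE}{dt} = \sum_{i=1}^{K} \left( \overline{f_i}\,\frac{dE_i'}{dt} + \overline{E_i}\,\frac{df_i'}{dt} + \frac{d\left(E_i' f_i'\right)}{dt} \right)$$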

The derivative on the left-hand side provides the area-weighted average trend in the seasonal occurrence of temperature extremes for all days. The summation on the right-hand side, from left to right, provides the thermodynamic, dynamic and interaction contributions for days associated with each SOM pattern, i.

The thermodynamic contribution of each circulation pattern’s extreme temperature trend assumes that each SOM pattern is stationary in time, and that trends in extremes that result during this pattern are the result of influences unrelated to circulation, such as changes in long-wave radiation from increasing greenhouse gas concentrations, or changes in surface fluxes of moisture and/or radiation resulting from changes in land cover. The thermodynamic contribution associated with each circulation pattern is determined by taking the product of the trend in the intensity of temperature extremes and the mean occurrence of the circulation pattern. Trends in the intensity of temperature extremes are computed by calculating the trend in area-weighted extreme occurrence per pattern occurrence (for example, Fig. 3n–q).

The dynamic contribution of each circulation pattern’s extreme temperature trend assumes that, on average, a circulation pattern is associated with a portion of the total extreme event trend, and that changes in the occurrence frequency of that circulation pattern will modify the occurrence frequency of extreme events. The dynamic contribution associated with each circulation pattern is determined by taking the product of the trend in circulation pattern occurrences and the mean number of extreme events per pattern occurrence.

The third component represents the interaction between dynamic and thermodynamic changes, and captures contributions that result from changes in the dynamic component acting on changes in the thermodynamic component, such as the positive/negative feedbacks of surface–atmosphere interactions. The interactive term is determined by computing the trend in the product of circulation pattern occurrence deviations and intensity of temperature extreme deviations.
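The three contributions described above can be sketched numerically as follows. This is a minimal illustration, not the analysis code: it assumes annual series of pattern occurrence frequency and of extremes per pattern occurrence are available as arrays of shape (years, SOM nodes), that every pattern occurs in every year, and that trends are ordinary least-squares slopes (the trend method actually used follows ref. 37). All function and variable names are hypothetical.

```python
import numpy as np

def trend(y, t):
    """Ordinary least-squares slope of y against t (one slope per column of y)."""
    return np.polyfit(t, y, 1)[0]

def partition(E_i, f_i, t):
    """Split the trend in extreme occurrence into per-pattern contributions.

    E_i : array (n_years, K), extremes per occurrence of each pattern.
    f_i : array (n_years, K), occurrence frequency of each pattern.
    t   : array (n_years,), time axis (for example, years).
    Returns (thermodynamic, dynamic, interaction), each of shape (K,).
    """
    E_bar, f_bar = E_i.mean(axis=0), f_i.mean(axis=0)   # time means
    E_dev, f_dev = E_i - E_bar, f_i - f_bar             # deviations from time mean
    thermo = f_bar * trend(E_dev, t)        # mean occurrence x trend in intensity
    dynamic = E_bar * trend(f_dev, t)       # mean intensity x trend in occurrence
    interaction = trend(E_dev * f_dev, t)   # trend in the product of deviations
    return thermo, dynamic, interaction

# Synthetic example with K = 4 patterns over 35 years.
rng = np.random.default_rng(1)
t = np.arange(1979, 2014, dtype=float)
f_i = rng.dirichlet(np.ones(4), size=t.size)            # frequencies sum to 1 each year
E_i = 0.05 + 0.001 * (t - t[0])[:, None] + 0.01 * rng.random((t.size, 4))
thermo, dynamic, interaction = partition(E_i, f_i, t)
total_trend = trend((E_i * f_i).sum(axis=1), t)
```

Because a least-squares slope is linear in the series and the slope of a constant is zero, the three contributions summed over all patterns reproduce the total trend up to floating-point error, which provides a convenient consistency check.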

Code availability

SOM code is available at http://www.cis.hut.fi/projects/somtoolbox/. All other analysis code is available upon request from the corresponding author (danethan@stanford.edu).

Reanalysis data sets

NCAR/NCEP-Reanalysis 1 and NCEP-DOE-Reanalysis 2 data were downloaded from http://www.esrl.noaa.gov/psd/; ECMWF ERA-Interim data were downloaded from http://www.ecmwf.int/.