1 Introduction

The analysis of the spatial structure of co-variability of climatic time series at different locations forms an integral part of meteorological and climatological research. Traditional techniques in this context are based on principal component analysis (PCA) of the covariance matrix of the dataset under consideration. By construction, such approaches only capture the first two statistical moments of the distributions of the individual time series, and the resulting empirical orthogonal functions (EOFs) thus do not describe the behavior of extreme events. By combining a nonlinear synchronization measure with complex network theory, we introduce a methodology that can fill this gap and show how it can be applied not only to climatological analysis but also to the statistical prediction of extreme rainfall events.

In the recent past, so-called climate networks have attracted great attention as tools to analyze spatial patterns of climatic co-variability, complementarily to traditional PCA-based techniques (e.g., Donges et al. 2009a,b, 2011; Gozolchiani et al. 2011; Ludescher et al. 2013; Steinhaeuser et al. 2012; Tsonis and Roebber 2004; Tsonis and Swanson 2008; Van Der Mheen et al. 2013). Here, we show how these approaches can be extended to capture the dynamical characteristics of extreme events. The key idea of the methodology that shall be presented in the following sections is to identify rainfall time series measured at different locations with network nodes and represent strong synchronizations of extreme events in these time series by network links connecting the respective nodes. The climatological mechanisms driving the synchronization and propagation of extreme rainfall events are assumed to be encoded in the topology of the resulting climate network. Different aspects of this topology can be quantified by means of suitable network measures, and upon providing climatological interpretations of these network measures, we will show that the spatial patterns they exhibit reveal the underlying climatological mechanisms (Boers et al. 2013). Furthermore, using directed and weighted networks, we will show how this approach can be used for statistical prediction of extreme events (Boers et al. 2014a), given that the synchronization patterns are sufficiently pronounced.

While we restrict ourselves to presenting its application to satellite-derived rainfall data, the methodological framework is more general and can in principle be applied to analyze collective synchronization patterns of extreme events in many types of complex systems. The methodology should be considered a general data exploration tool that can provide the basis for building scientific hypotheses on the mechanisms underlying the synchronization of extreme events in large, interactive systems.

2 Climatic Setting

The monsoon season in South America from December to February (DJF) is characterized by a southward shift of the Intertropical Convergence Zone (ITCZ) and by an amplification of the trade winds due to the differential heating between ocean and land (Zhou and Lau 1998) (Fig. 15.1). These low-level winds transport moist air from the tropical Atlantic Ocean toward the tropical parts of the continent, where they cause abundant rainfall. Substantial fractions of this precipitation are recycled back to the atmosphere by evapotranspiration, and the winds carry the water vapor farther west across the Amazon Basin toward the Andes. There, the shape of the mountain range forces the winds southward toward the subtropics (Marengo et al. 2012; Vera et al. 2006). The specific exit regions of this moisture flow vary considerably from the central Argentinean plains to southeastern Brazil. These variations are associated with frontal systems approaching from the south, which are triggered by Rossby waves of the polar jet streams (Carvalho et al. 2010; Siqueira and Machado 2004). A dominant southward component of the flow leads to the South American Low-Level Jet (SALLJ) east of the Andes (Marengo et al. 2004), which conveys large amounts of moisture from the tropics to southeastern South America (SESA). The occurrence of this wind system is associated with huge thunderstorms (so-called Mesoscale Convective Systems; Durkee et al. 2009) in this region (Salio et al. 2007). On the other hand, if the flow to the subtropics is directed mainly eastward, it leads to the establishment of the South Atlantic Convergence Zone (SACZ), a convective band that extends from the central Amazon Basin to southeastern Brazil (SEBRA) (Carvalho et al. 2004). The oscillation between these two circulation regimes leads to the so-called South American rainfall dipole and constitutes the dominant mode of intraseasonal variability of the monsoon (Nogués-Paegle and Mo 1997).

Fig. 15.1 Topography of South America and key features of the South American monsoon system, including the main low-level wind directions, the Intertropical Convergence Zone (ITCZ), the South Atlantic Convergence Zone (SACZ), and the South American Low-Level Jet (SALLJ). The geographical regions southeastern South America (SESA), southeastern Brazil (SEBRA), and Amazon Basin are referred to in the main text.

3 Data and Methods

Data

We employ satellite-derived rainfall data from the Tropical Rainfall Measuring Mission (TRMM 3B42 V7, Huffman et al. 2007) with 3-hourly temporal and 0.25° × 0.25° spatial resolution, resulting in N = 48,400 time series with values measured in mm h\(^{-1}\). Daily (3-hourly) extreme events are defined locally as points in time at which the rainfall rate exceeds the 90th (99th) percentile of the corresponding time series, confined to the monsoon seasons (DJF) from 1998 to 2012.
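The event definition above can be illustrated with a minimal Python sketch. It assumes the input series has already been restricted to the DJF seasons, and it uses a nearest-rank percentile; the source does not state which percentile estimator was used, so both the estimator and the function name `extreme_event_times` are illustrative assumptions.

```python
import math


def extreme_event_times(rain, percentile=90):
    """Return the time indices at which `rain` exceeds its own percentile.

    Assumption: nearest-rank percentile; the series is already confined
    to the season of interest (e.g. DJF).
    """
    ordered = sorted(rain)
    # Nearest-rank position of the requested percentile (1-based).
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    threshold = ordered[rank - 1]
    # An event is a time step whose rainfall rate lies strictly above the threshold.
    return [t for t, value in enumerate(rain) if value > threshold]


# Toy series of ten rainfall rates (mm/h); values are made up for illustration.
toy_rain = [0, 0, 1, 0, 5, 0, 0, 2, 0, 9]
toy_events = extreme_event_times(toy_rain, percentile=80)
```

Applying the function to every grid cell yields one list of event times per node, which is the input to the synchronization measure introduced next.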

Event Synchronization

The nonlinear synchronization measure we employ is called Event Synchronization and was first introduced in Quian Quiroga et al. (2002). It quantifies the synchronicity between events in two given time series \(x_{i}\) and \(x_{j}\) by counting the number of events that can be uniquely associated with each other within a prescribed maximum delay, while taking into account their temporal ordering: Consider two event series \(\{e_{i}^{\mu }\}_{1\leq \mu \leq l}\) and \(\{e_{j}^{\nu }\}_{1\leq \nu \leq l}\) containing \(l\) events each, where \(e_{i}^{\mu }\) denotes the time index of the \(\mu\)-th event observed at grid point \(i\). In order to decide if two events \(e_{i}^{\mu }\) and \(e_{j}^{\nu }\) with \(e_{i}^{\mu }> e_{j}^{\nu }\) can be assigned to each other uniquely, we first compute the waiting time \(d_{ij}^{\mu,\nu }:= e_{i}^{\mu } - e_{j}^{\nu }\) and then define the dynamical delay:

$$\displaystyle{ \tau _{ij}^{\mu \nu } =\min \frac{\{d_{ii}^{\mu,\mu -1},d_{ ii}^{\mu,\mu +1},d_{ jj}^{\nu,\nu -1},d_{ jj}^{\nu,\nu +1}\}} {2} }$$
(15.1)

We further introduce a maximum delay \(\tau _{\mathrm{max}}\) which shall serve as an upper bound for the dynamical delay. If then \(0 <d_{ij}^{\mu,\nu } \leq \tau _{ij}^{\mu \nu }\) and \(d_{ij}^{\mu,\nu } \leq \tau _{\mathrm{max}}\), we count this as a directed synchronization from j to i:

$$\displaystyle{ S_{ij}^{\mu \nu } = \left \{\begin{array}{l l} 1&\quad \text{if}\quad 0 <d_{ij}^{\mu,\nu } \leq \tau _{ij}^{\mu \nu }\quad \text{and}\quad d_{ij}^{\mu,\nu } \leq \tau _{\mathrm{max}}, \\ 0&\quad \text{else}.\\ \end{array} \right. }$$
(15.2)

Directed Event Synchronization from j to i is given as the sum of all \(S_{ij}^{\mu \nu }\) (for fixed i and j) (Boers et al. 2014a, 2015b):

$$\displaystyle{ \mathrm{ES}_{ij}^{\mathrm{dir}}:=\sum _{\mu \nu }S_{ ij}^{\mu \nu }. }$$
(15.3)

A symmetric version of this measure can be obtained by also counting events at the very same time as synchronous and taking the absolute value of the waiting time in Eq. (15.2),

$$\displaystyle{ \overline{S}_{ij}^{\mu \nu } = \left \{\begin{array}{l l} 1&\quad \text{if}\quad \vert d_{ij}^{\mu,\nu }\vert \leq \tau _{ij}^{\mu \nu }\quad \text{and}\quad \vert d_{ij}^{\mu,\nu }\vert \leq \tau _{\mathrm{max}}, \\ 0&\quad \text{else},\\ \end{array} \right. }$$
(15.4)

and computing the corresponding sum:

$$\displaystyle{ \mathrm{ES}_{ij}^{\mathrm{sym}}:=\sum _{\mu \nu }\overline{S}_{ ij}^{\mu \nu }. }$$
(15.5)

A major advantage of this measure is that it allows for a dynamical delay between events in the original time series \(x_{i}\) and \(x_{j}\). In classical lead-lag analysis (using, e.g., Pearson’s correlation coefficient), this is not the case, since it only provides one single delay between the two time series, namely, the time window by which the time series \(x_{i}\) and \(x_{j}\) are shifted against each other. Since the various climatological mechanisms underlying the interrelations between time series measured at different locations cannot be assumed to operate on one single time scale, the temporal homogeneity assumed by a classical lead-lag analysis is not justified. Furthermore, the identification of the correct lead (or lag) is not a well-defined problem, as there may be several maxima of the correlation value over the range of leads or lags.
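To make the counting rule concrete, the following Python sketch implements the directed and symmetric variants for two sorted lists of event times. It is a sketch under stated assumptions rather than the authors' implementation: gaps to neighbouring events are taken as positive numbers, events at the start or end of a series use only the neighbour that exists, and the symmetric variant bounds the absolute waiting time by \(\tau _{\mathrm{max}}\) so that the count does not depend on the order of the two series. The function names are hypothetical.

```python
def dynamical_delay(ei, mu, ej, nu):
    """Half the smallest gap to a neighbouring event in either series (cf. Eq. 15.1).

    Assumption: gaps are positive; boundary events use only existing neighbours.
    """
    gaps = []
    for events, k in ((ei, mu), (ej, nu)):
        if k > 0:
            gaps.append(events[k] - events[k - 1])
        if k < len(events) - 1:
            gaps.append(events[k + 1] - events[k])
    # With a single event per series there is no gap; only tau_max then applies.
    return min(gaps) / 2.0 if gaps else float("inf")


def event_sync(ei, ej, tau_max, symmetric=False):
    """Count synchronizations from series j to series i (cf. Eqs. 15.2-15.5).

    `ei` and `ej` are sorted lists of event time indices.
    """
    total = 0
    for mu, t_i in enumerate(ei):
        for nu, t_j in enumerate(ej):
            d = t_i - t_j  # waiting time
            tau = dynamical_delay(ei, mu, ej, nu)
            if symmetric:
                # Assumption: |d| is bounded by both tau and tau_max.
                hit = abs(d) <= tau and abs(d) <= tau_max
            else:
                hit = 0 < d <= tau and d <= tau_max
            total += int(hit)
    return total


# Toy event series (time indices); values are made up for illustration.
events_i = [10, 20, 30]
events_j = [9, 19, 31]
es_dir_j_to_i = event_sync(events_i, events_j, tau_max=3)
es_dir_i_to_j = event_sync(events_j, events_i, tau_max=3)
es_sym = event_sync(events_i, events_j, tau_max=3, symmetric=True)
```

In the toy example, two events at i follow events at j by one time step, while only one event at j follows an event at i, so the directed counts differ and the symmetric count equals their sum. The double loop is quadratic in the number of events; for long series one would restrict the inner loop to events within \(\tau _{\mathrm{max}}\).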

Network Construction

In the following, the notation ES will be used for the measure, and likewise for the corresponding similarity matrix, if a statement applies to both versions of Event Synchronization. From the matrix ES, we derive networks by representing its strongest entries by network links. It has to be ensured that these values are statistically significant. For this purpose, we construct 10,000 surrogates of event time series that preserve the block structure of subsequent events by uniformly randomly distributing the original blocks of subsequent events, and compute ES for all possible pairs. From the resulting histogram of values, we obtain the threshold \(T^{0.95}\), the 95th percentile of the surrogate distribution, corresponding to the 5 % significance level. The link density of the network is then chosen such that the smallest entry of ES that is represented by a network link is above \(T^{0.95}\). In terms of the adjacency matrix A, this is captured by

$$\displaystyle{ A_{ij} = \left \{\begin{array}{l l} \mathrm{ES}_{ij}&\quad \text{if}\quad \mathrm{ES}_{ij}> T^{0.95}, \\ 0 &\quad \text{else}.\\ \end{array} \right. }$$
(15.6)

Note that the values of ES have been assigned to the links as weights. Of course, one can also set the corresponding entries of A to 1 in order to obtain an unweighted network. In case of \(\mathrm{ES}^{\mathrm{sym}}\), the corresponding network will be undirected, while for \(\mathrm{ES}^{\mathrm{dir}}\), it will be directed.
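The thresholding step of Eq. (15.6) can be sketched in Python as follows. The sketch assumes that the ES values of the surrogate pairs are already available as a flat list (the block-preserving surrogate generation itself is not shown), uses a nearest-rank percentile, and excludes self-links; all three choices, as well as the function names, are illustrative assumptions.

```python
import math


def significance_threshold(null_values, percentile=95):
    """Nearest-rank percentile of ES values obtained from surrogate pairs."""
    ordered = sorted(null_values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def adjacency(es, threshold, weighted=True):
    """Keep entries of the ES matrix above the threshold (cf. Eq. 15.6).

    Assumption: the diagonal is set to zero, i.e. no self-links.
    """
    n = len(es)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and es[i][j] > threshold:
                adj[i][j] = es[i][j] if weighted else 1
    return adj


# Toy null distribution and ES matrix; values are made up for illustration.
toy_null = list(range(1, 21))
toy_threshold = significance_threshold(toy_null)
toy_es = [[0, 5, 1], [2, 0, 7], [1, 3, 0]]
toy_weighted = adjacency(toy_es, threshold=2)
toy_unweighted = adjacency(toy_es, threshold=2, weighted=False)
```

For a directed ES matrix the result is in general not symmetric, which is what distinguishes the directed from the undirected network.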

Network Measures

On undirected and unweighted networks, we will consider four different network measures: First, we consider betweenness centrality (BC), which is defined on the basis of shortest network paths, i.e., the shortest sequences of links connecting two nodes:

$$\displaystyle{ \mathrm{BC}_{i}:= \frac{\sum _{l<k\neq i}\sigma _{kl}(i)} {\sum _{l<k\neq i}\sigma _{kl}} \;, }$$
(15.7)

where \(\sigma _{kl}\) denotes the total number of shortest network paths between nodes k and l and \(\sigma _{kl}(i)\) the number of shortest network paths between k and l which pass through node i. Since BC is a nonlocal centrality measure, we expect BC to exhibit high values in regions which are important for the long-ranged, directed propagation of extreme events.

Second, we are interested in the mean geographical distance (MD, Boers et al. 2013) of links at each node:

$$\displaystyle{ \mathrm{MD}_{i}:= \frac{1} {DG_{i}}\sum _{j=1}^{N}A_{ ij}\mathrm{dist}(i,j) }$$
(15.8)

where \(DG_{i}\) denotes the degree of node i, i.e., the number of links attached to it, and dist(i, j) denotes the great-circle distance between the grid points corresponding to the nodes i and j. MD should show high values in regions where extreme events occur synchronously with extreme events at remote locations and thus quantifies similar aspects of the topology as BC, although not based on network paths. Therefore, to confirm our interpretation of BC, we would expect this measure to have a spatial distribution similar to that of BC.
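As an illustration of Eq. (15.8), the following Python sketch computes MD for an unweighted adjacency matrix given as nested lists. The great-circle distance is evaluated with the haversine formula on a sphere of radius 6371 km; the formula, the radius, the convention that isolated nodes receive MD = 0, and the function names are assumptions of this sketch and are not taken from the source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(p, q):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1 to guard against rounding for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def mean_distance(adj, coords):
    """Mean great-circle length of the links at each node (cf. Eq. 15.8)."""
    md = []
    for i, row in enumerate(adj):
        neighbours = [j for j, a in enumerate(row) if a and j != i]
        if not neighbours:
            md.append(0.0)  # assumption: isolated nodes get MD = 0
            continue
        total = sum(great_circle_km(coords[i], coords[j]) for j in neighbours)
        md.append(total / len(neighbours))  # len(neighbours) is the degree
    return md


# Toy network: node 0 linked to two nodes a quarter circumference away.
toy_coords = [(0.0, 0.0), (0.0, 90.0), (0.0, -90.0), (10.0, 10.0)]
toy_adj = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
toy_md = mean_distance(toy_adj, toy_coords)
```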

Third, we employ the clustering coefficient (CC), defined as the fraction of pairs of neighbors of a given node that are themselves connected:

$$\displaystyle{ \mathrm{CC}_{i}:= \frac{\sum _{j<k}A_{ij}A_{jk}A_{ik}} {\sum _{j<k}A_{ij}A_{ik}} }$$
(15.9)

CC measures complementary aspects of the topology as compared to the previous two measures and should be high in regions where extreme events exhibit large spatial coherence as, for example, due to large thunderstorms.
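Eq. (15.9) can be evaluated directly by counting, for each node, how many pairs of its neighbours are linked. The Python sketch below assumes an undirected, unweighted adjacency matrix given as nested lists and assigns CC = 0 to nodes with fewer than two neighbours; that convention and the function name are assumptions of this sketch.

```python
def clustering_coefficient(adj):
    """Fraction of neighbour pairs of each node that are linked (cf. Eq. 15.9)."""
    n = len(adj)
    cc = []
    for i in range(n):
        nb = [j for j in range(n) if j != i and adj[i][j]]
        pairs = len(nb) * (len(nb) - 1) // 2
        if pairs == 0:
            cc.append(0.0)  # assumption: undefined cases are set to 0
            continue
        closed = sum(
            1
            for a in range(len(nb))
            for b in range(a + 1, len(nb))
            if adj[nb[a]][nb[b]]
        )
        cc.append(closed / pairs)
    return cc


# Toy network: a triangle (nodes 0, 1, 2) with a pendant node 3 attached to node 0.
toy_graph = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
toy_cc = clustering_coefficient(toy_graph)
```

In the toy network, node 0 has three neighbours of which only one pair is linked, so its coefficient is 1/3, whereas nodes 1 and 2 sit in a closed triangle and reach 1.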

Furthermore, we introduce a combination of these measures, called long-ranged directedness (LD, Boers et al. 2013). For this purpose, we calculate the normalized ranks of BC, CC, and MD, denoted by NRBC, NRCC, and NRMD, respectively, and put

$$\displaystyle{ \mathrm{LD}_{i}:= \frac{1} {2}\mathrm{NRBC}_{i} + \frac{1} {2}\mathrm{NRMD}_{i} -\mathrm{ NRCC}_{i}. }$$
(15.10)

The prefactors in this definition are motivated by the fact that BC and MD are expected to quantify similar aspects of the network topology, while CC was introduced to estimate complementary properties of the network. We thus take the mean of the normalized ranks of BC and MD and subtract the normalized rank of CC. High values of LD should indicate regions which are important for the long-ranged propagation of extreme events, while low values should indicate regions where extreme events strongly cluster, but do not propagate over long spatial distances.
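The combination in Eq. (15.10) can be sketched in Python as follows. The source does not spell out how the normalized ranks are computed; this sketch assumes ascending ranks starting at 1, averaged over ties and divided by the number of nodes, so that values lie in (0, 1]. The function names are hypothetical.

```python
def normalized_ranks(values):
    """Ascending ranks divided by the number of values; ties share the mean rank.

    Assumption: this particular normalization is illustrative only.
    """
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank / n
        pos = end + 1
    return ranks


def long_ranged_directedness(bc, md, cc):
    """Mean normalized rank of BC and MD minus that of CC (cf. Eq. 15.10)."""
    nrbc, nrmd, nrcc = (normalized_ranks(v) for v in (bc, md, cc))
    return [0.5 * b + 0.5 * m - c for b, m, c in zip(nrbc, nrmd, nrcc)]


# Toy values for three nodes; numbers are made up for illustration.
toy_ld = long_ranged_directedness(bc=[1, 2, 3], md=[1, 2, 3], cc=[3, 2, 1])
```

With this normalization LD ranges between -1 and 1: the toy node with the highest BC and MD but lowest CC obtains the largest value, and the node with the opposite ordering the smallest.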

On directed and weighted networks, we will consider the well-known in- and out-strength, defined as

$$\displaystyle{ \mathcal{S}_{i}^{\mathrm{in}}:=\sum _{ j=1}^{N}A_{ ij}\quad \text{and}\quad \mathcal{S}_{i}^{\mathrm{out}}:=\sum _{ j=1}^{N}A_{ ji} }$$
(15.11)

On the basis of these measures, we define the measure network divergence (\(\varDelta \mathcal{S}\), Boers et al. 2014a) as the difference of in-strength and out-strength at each grid cell:

$$\displaystyle{ \varDelta \mathcal{S}_{i}:= \mathcal{S}_{i}^{\mathrm{in}} -\mathcal{S}_{ i}^{\mathrm{out}}. }$$
(15.12)

This measure can be used to identify source and sink regions of extreme events on a continental scale. In order to investigate where extreme events originating from a given source region go to, we define the strength out of a geographical region R into a node i as

$$\displaystyle{ \mathcal{S}_{i}^{\mathrm{in}}(R):= \frac{1} {\vert R\vert }\sum _{j\in R}A_{ij}\;, }$$
(15.13)

where \(\vert R\vert\) denotes the number of grid cells contained in R.
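Eqs. (15.11) to (15.13) reduce to row and column sums of the weighted adjacency matrix. The Python sketch below follows the convention of the text that the entry \(A_{ij}\) carries the weight of the link from j to i; the nested-list representation and the function names are assumptions of this sketch.

```python
def in_strength(adj):
    """Row sums: total weight arriving at each node (cf. Eq. 15.11)."""
    return [sum(row) for row in adj]


def out_strength(adj):
    """Column sums: total weight leaving each node (cf. Eq. 15.11)."""
    n = len(adj)
    return [sum(adj[j][i] for j in range(n)) for i in range(n)]


def network_divergence(adj):
    """In-strength minus out-strength; negative values mark sources (cf. Eq. 15.12)."""
    return [s_in - s_out for s_in, s_out in zip(in_strength(adj), out_strength(adj))]


def strength_from_region(adj, region):
    """Mean weight arriving at each node from the nodes in `region` (cf. Eq. 15.13)."""
    return [sum(adj[i][j] for j in region) / len(region) for i in range(len(adj))]


# Toy directed network: adj[i][j] is the weight of the link from j to i.
# Node 0 sends weight 4 to node 1 and weight 2 to node 2; node 1 sends weight 1 to node 2.
toy_directed = [
    [0, 0, 0],
    [4, 0, 0],
    [2, 1, 0],
]
toy_divergence = network_divergence(toy_directed)
toy_from_region = strength_from_region(toy_directed, region=[0, 1])
```

Node 0 only sends and therefore has a negative divergence (a source), while nodes 1 and 2 receive more than they send (sinks).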

4 Results and Discussion

We will first use undirected and unweighted networks to show that the methodology introduced above reveals climatic features which are consistent with the scientific understanding of the South American monsoon system. This is mainly intended as a proof of concept. Thereafter we will show that, using directed and weighted networks, the approach can in certain situations be used to predict extreme events.

Climatic Analysis of Extreme Rainfall

We compute the measures BC, MD, CC, and LD for undirected and unweighted networks with a prescribed link density of 2 %. These networks are derived from \(\mathrm{ES}^{\mathrm{sym}}\) computed for daily events above the 90th percentile.

BC and MD show a very similar spatial distribution, with high values over the ITCZ, the Amazon Basin, as well as at the eastern slopes of the Andes along the entire mountain range (Fig. 15.2a, b). These regions are in fact crucial for the large-scale distribution of extreme events over the South American continent: The low-level trade winds drive them from the tropical Atlantic toward the continent (Zhou and Lau 1998), and upon a cascade of rainfall and evapotranspiration over the Amazon Basin (Eltahir and Bras 1993), the winds force the moist air against the Andean slopes, leading to so-called orographic rainfall (Bookhagen and Strecker 2008). The positioning of the branch of high BC and MD values from the western Amazon Basin along the Andean slopes toward the subtropics corresponds to the climatological location of the SALLJ, which provides the moisture necessary for extreme rainfall events (Marengo et al. 2004).

Fig. 15.2 Network measures for undirected and unweighted networks encoding the synchronization structure of daily rainfall events above the 90th percentile of the monsoon season (DJF). (a) Betweenness centrality (BC). (b) Mean geographical distance (MD). (c) Clustering coefficient (CC). (d) Long-ranged directedness (LD).

In contrast, the only regions over the mainland that exhibit high values of CC (Fig. 15.2c) are SESA, where some of the largest thunderstorms on Earth occur (Zipser et al. 2006), and the eastern coastal regions of the continent, which are exposed to the landfall of the so-called squall lines (Cohen et al. 1995).

By construction, LD shows high values where BC and MD both show high values and particularly low values in most parts of SESA, where CC is high. However, LD is also relatively high in SEBRA, closely corresponding to the climatological position of the SACZ (Carvalho et al. 2002, 2004). These high LD values indicate the highly dynamical character of extreme events associated with this convergence zone.

The spatial distributions of the four measures BC, MD, CC, and LD hence reveal these important climatological features, and our interpretation of these network measures is thus consistent with the understanding of the South American monsoon system (Boers et al. 2013).

Prediction of Extreme Rainfall

We construct directed and weighted networks on the basis of \(\mathrm{ES}^{\mathrm{dir}}\) (cf. Eq. 15.6), computed for 3-hourly events above the 99th percentile. Network divergence \(\varDelta \mathcal{S}\) of the resulting network exhibits negative values (i.e., source regions for extreme events) over the ITCZ and the Amazon Basin, followed by pronounced positive values (i.e., sinks of extreme events) at the eastern slopes of the Andes (Fig. 15.3a). Surprisingly, SESA, which was described as one of the exit regions of the low-level flow from the tropics, is a pronounced source region of extreme rainfall. In order to reveal where these events subsequently propagate, we compute the strength out of the spatial box denoted by SESA in Fig. 15.3 and infer that while some extreme events propagate northeastward, there also exists a distinct signature of targets extending from SESA to the eastern slopes of the Central Andes in Bolivia. Thus, extreme rainfall in the Bolivian Andes should be predictable from preceding events in SESA. In Boers et al. (2014a), the authors identified the interplay of frontal systems approaching from the south, the Andean orography, and the low-level moisture flow from the tropics as the responsible climatic mechanism. This interplay leads to the opening of a wind channel conveying warm and moist air from the western Amazon Basin to SESA. These air masses collide with cold air in the aftermath of the frontal system, leading to abundant precipitation. The typical propagation trajectory of the associated rainfall clusters is dictated by the northward movement of the frontal system and its alignment with respect to the Andean mountain range. Based on these insights, a simple forecast rule is formulated in Boers et al. (2014a), which predicts 60 % (90 % during positive phases of the El Niño Southern Oscillation) of extreme rainfall events at the eastern slopes of the Central Andes.

Fig. 15.3 Network measures for directed and weighted networks encoding the temporally resolved synchronization structure of 3-hourly rainfall events above the 99th percentile of the monsoon season (DJF). (a) Network divergence (\(\varDelta \mathcal{S}\)). (b) Strength out of SESA (\(\mathcal{S}^{\mathrm{in}}(\mathrm{SESA})\)), where SESA is defined as the spatial box extending from 35°S to 30°S and from 60°W to 53°W.

5 Conclusion

In this chapter, we showed how complex networks can be employed to reveal spatial patterns encoding the dynamical synchronization of extreme rainfall events and how this can be used for climatic analysis as well as to estimate the predictability of extreme rainfall. We constructed networks on the basis of synchronization of extreme rainfall events in South America and showed that combining the network measures betweenness centrality, mean geographical distance, and clustering coefficient allowed us to identify the main features of the South American monsoon system. Furthermore, we showed that a directed network approach can be applied to reveal typical propagation patterns of extreme rainfall events. Specifically, a pathway from southeastern South America to the Central Andes was revealed, which provides the basis for predicting extreme events in the Central Andes.

Further Reading

Similar approaches to the techniques described in this chapter have been taken to study spatial patterns of extreme rainfall in the Indian monsoon system (Malik et al. 2012; Stolbova et al. 2014). The methodology introduced here has also been applied to reveal the specific synchronization pathways associated with the two main circulation regimes of the South American monsoon described in Sect. 15.2, indicating that the Rossby waves responsible for frontal systems in fact control extreme event synchronization over the entire South American continent (Boers et al. 2014c). Directed networks have in addition been used to identify the geographical origins of extreme rainfall events in the main hydrological catchments along the Andean mountain range in view of their potential predictability (Boers et al. 2015b). Furthermore, the techniques presented here can be employed to compare different datasets and in particular to evaluate the dynamical implementation of extreme events in global and regional climate models (Boers et al. 2015a). While all these approaches are static in the sense that networks are constructed for the entire time frame available, in Boers et al. (2014b) it is shown how this can be generalized to a dynamical analysis using sliding windows. In that study, it was revealed that the network clustering of strong evapotranspiration events strongly depends on the phase of the El Niño Southern Oscillation.