1 Introduction

Extreme high-temperature events (EHTEs) have a major impact on society, socioeconomic development, and ecological systems (Craven et al. 2007; Mohammed and Tarpley 2009; Barlow et al. 2015; Schaller et al. 2018). Long-lived EHTEs, also called heatwaves, have occurred frequently in recent decades with destructive consequences (García-León et al. 2021). The European heatwave in the summer of 2003 broke temperature records in numerous cities and led to the deaths of > 35,000 people, large economic losses, and forest fires over large regions (Gruber et al. 2003; Poumadère et al. 2005). In the summer of 2010, Eastern Europe and western Russia experienced another extreme heatwave, which lasted for more than one month and resulted in > 15,000 deaths and large economic losses (Gilbert 2010; Barriopedro et al. 2011; Katsafados et al. 2014). Eastern China experienced a record-breaking heatwave during the summer of 2013, which led to an economic loss of 59 billion Chinese yuan (Hou et al. 2014; Sun et al. 2014). Northeast Asia was affected by a destructive extreme heat event during the summer of 2018, which disrupted the electricity supply in many cities (Lu et al. 2020). Numerous studies have noted that the frequency of EHTEs will increase due to global warming (Ding et al. 2010; Zhou and Wang 2016). As such, it is necessary to further understand the physical mechanisms that cause EHTEs. Anomalous circulation is considered to be a major factor in the occurrence of EHTEs (Jézéquel et al. 2018; Yeo et al. 2019). Descending air warms through adiabatic compression and also suppresses cloud cover, which allows more solar radiation to reach the surface (Gershunov et al. 2009). However, Chen and Lu (2015) noted that the atmospheric circulation associated with EHTEs is location-dependent. Nicolas et al. (2017) investigated the mechanisms that cause heatwaves over eastern China, and highlighted the importance of the increase in the mean temperature. In addition, numerous studies have shown that remote meteorological factors are also responsible for EHTEs (Xie et al. 2019). For example, Hu et al. (2012) suggested that warmer Indian Ocean conditions favoured the occurrence of EHTEs across the southern Yangtze River valley in China. Muyuan et al. (2018) proposed that the heatwaves in Europe during the summer of 2018 were related to downstream European blocking and a positive North Atlantic Oscillation.

Multiple factors are responsible for EHTEs, but the precise mechanisms involved remain unclear. Therefore, forecasting of EHTEs is challenging. Slight errors in the initial conditions can grow rapidly and degrade the forecasts, because of the chaotic nature of the atmosphere (Lorenz 1963). The limitations of forecast models also lead to forecast uncertainties. As a result, the predictability of EHTEs is a complex issue. Since the studies of atmospheric predictability by Thompson (1957) and Lorenz (1963), the predictability of extreme events has been an important field of research. Several methods have been proposed to study the predictability of extreme events. Mu et al. (2003) proposed the conditional nonlinear optimal perturbation (CNOP) method to study the predictability of dynamical systems. The CNOP represents the initial perturbation that has the maximum nonlinear evolution over a prescribed time interval. It has been widely used in investigations of the predictability of the El Niño–Southern Oscillation (ENSO) and tropical cyclones (Duan and Mu 2009; Mu et al. 2009; Xu et al. 2021). Statistical information about nonlinear dynamical systems has also been used to predict extreme events. Mohamad and Sapsis (2018) developed a sequential method to constrain the statistics of extreme events. For two cases (i.e., a nonlinear oscillator and a high-dimensional system), this method can accurately determine the statistics of extreme events, indicating that it is effective for predicting extreme events. Kaveh and Salarieh (2020) studied chaotic systems and proposed that a statistical representation of the system could predict extreme events. This method was shown to be effective for ideal models, such as the logistic and Hénon maps, and a physiological control system. Apart from the aforementioned methods, machine learning methods have been widely used in studies of the predictability of extreme events. Ham et al. (2019) applied machine learning, based on a convolutional neural network, to forecast ENSO over multiple years, which outperformed dynamical forecast models. Ratnam et al. (2020) used machine learning, based on artificial neural networks (ANNs), to predict the Indian Ocean Dipole. The ANN models were also more skilful than the dynamical forecast models.

The Lyapunov exponent (LE) is a classic index used to quantify rates of error growth, which are linked to predictability. Early studies used the maximal LE to quantify atmospheric predictability (Fraedrich et al. 1986; Wolf et al. 1985). A larger maximal LE indicates a lower predictability. However, the maximal LE characterises the rates of error growth over the whole attractor, whereas the predictability reflects the local atmospheric properties (Eckmann and Ruelle 1985). Therefore, the local or finite-time LE was proposed to estimate atmospheric predictability (Nese 1989; Yoden and Nomura 1993). Although the local LE can quantify the predictability, it also has limitations. Firstly, the initial error needs to be infinitesimally small. Secondly, the local LE only characterises the evolution of error growth in the linear regime, and does not account for nonlinear error growth. To overcome these limitations, Ding and Li (2007) proposed the nonlinear local LE (NLLE) method to study predictability. This method accounts for nonlinear error growth of a finite size, and has been widely used in studies of atmospheric predictability (Li and Ding 2011; Feng et al. 2014, 2018; Ding et al. 2015; Hou et al. 2018, 2022; Li et al. 2020a; He et al. 2021). However, the NLLE method is not applicable to studies of the predictability of extreme events. To overcome this problem, Li et al. (2019) proposed the backward NLLE (BNLLE) method, based on the NLLE method. The BNLLE method studies the dynamics of error growth before a given state. After obtaining the dynamical information regarding error growth before the extreme state, the predictability of the state can be determined. The BNLLE method has been used to study the local predictability of specific states in theoretical models (Li et al. 2020b, 2021, 2022). However, the BNLLE method has not been applied to actual examples of extreme weather and climate events.

Two EHTEs occurred in Europe during late June and July of 2019, which broke all-time high-temperature records in some European countries (Vautard et al. 2020). In total, 1435 deaths were caused by these EHTEs. The objective of this study was to use the BNLLE method to estimate the local predictability of these two EHTEs. In addition, the dynamical information regarding the error growth was investigated.

The remainder of this paper is organised as follows. In Sect. 2, we describe the BNLLE method, the data used in this study, and the two EHTEs. The local predictability of these two EHTEs, based on the BNLLE method, is assessed in Sect. 3. Finally, a discussion and conclusions are presented in Sect. 4.

2 Data and methodology

2.1 Data

The forecasts of the surface (2 m) temperature were taken from the THORPEX Interactive Grand Global Ensemble (TIGGE) dataset (https://apps.ecmwf.int/datasets/data/tigge/levtype=sfc/type=fc/), which consists of ensemble forecast data from 13 global numerical weather prediction centres. The model of the European Centre for Medium-Range Weather Forecasts (ECMWF) outperforms those of the other centres. Therefore, in this study, we used the daily ensemble forecast data from the ECMWF, with a horizontal resolution of 0.5° × 0.5° (Bougeault et al. 2010; Swinbank et al. 2016). The forecasts are made four times a day (00/06/12/18Z), and the forecast range is up to 15 days. There are 51 ensemble forecast members in total, comprising 1 control forecast and 50 perturbed forecasts. The study period is June and July 2019 in Europe, when two EHTEs occurred. The daily observed surface (2 m) temperatures used in this study are from the ERA-interim analysis dataset (https://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/), with a 0.5° grid (Dee et al. 2011). The ERA-interim data are also available four times a day (00/06/12/18Z).

2.2 BNLLE method

The BNLLE method was developed on the basis of the NLLE method. To estimate the predictability of an extreme state, the BNLLE method first takes the extreme state as the target state, and then investigates the dynamical information regarding error growth during the period before the target state. From this dynamical information, the prediction lead time of the extreme state can be determined. The BNLLE method is as follows.

For a dynamical system, the growth of initial errors \({\varvec{\delta}}\left( {t_{0} } \right)\) perturbed on an initial state \({\varvec{x}}\left( {t_{0} } \right)\) is governed by

$${\varvec{\delta}}\left( {t_{0} + \tau } \right) = {\varvec{\eta}}\left( {{\varvec{x}}\left( {t_{0} } \right),{\varvec{\delta}}\left( {t_{0} } \right),\tau } \right){\varvec{\delta}}\left( {t_{0} } \right),$$
(1)

where \({\varvec{\eta}}\left( {{\varvec{x}}\left( {t_{0} } \right),{\varvec{\delta}}\left( {t_{0} } \right),\tau } \right)\) is the nonlinear error propagation term that propagates the initial errors \({\varvec{\delta}}\left( {t_{0} } \right)\) forward to \({\varvec{\delta}}\left( {t_{0} + \tau } \right)\). By this time, the dynamical point on the trajectory that originated from the initial state \({\varvec{x}}\left( {t_{0} } \right)\) has evolved to a future state \({\varvec{x}}\left( {t_{1} } \right)\). Given an extreme state \({\varvec{x}}\left( {t_{ex} } \right)\) in phase space, Eq. (1) allows the growth of initial errors during the period before the extreme state to be expressed as follows:

$${\varvec{\delta}}\left( {t_{ex} } \right) = {\varvec{\eta}}\left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right){\varvec{\delta}}\left( {t_{ex - \tau } } \right),$$
(2)

where \({\varvec{x}}\left( {t_{ex - \tau } } \right)\) is the state at time \(t_{ex - \tau }\) prior to the extreme state \({\varvec{x}}\left( {t_{ex} } \right)\) and \({\varvec{\delta}}\left( {t_{ex - \tau } } \right)\) represents the errors at time \(t_{ex - \tau }\). The index \(ex - \tau\) should be > 0. The average nonlinear rate of error growth from time \(t_{ex - \tau }\) to time \(t_{ex}\) is

$$\lambda \left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right) = \frac{1}{\tau }\ln \frac{{\left\| {{\varvec{\delta}}\left( {t_{ex} } \right)} \right\|}}{{\left\| {{\varvec{\delta}}\left( {t_{ex - \tau } } \right)} \right\|}},$$
(3)

where \(\lambda \left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right)\) is the NLLE (Ding and Li 2007), which depends on the state \({\varvec{x}}\left( {t_{ex - \tau } } \right)\), the initial errors \({\varvec{\delta}}\left( {t_{ex - \tau } } \right)\), and the integration time \(\tau\). For a number of initial errors superimposed on the state \({\varvec{x}}\left( {t_{ex - \tau } } \right)\), the mean NLLE is given as follows:

$${\overline{\lambda }}\left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right) = \left\langle {\lambda_{i} \left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right)} \right\rangle_{n} ,$$
(4)

where \(\langle \cdot \rangle_{n}\) denotes the average over a large number n of samples (n \(\to \infty\)). The mean relative growth of the initial error (RGIE) can be obtained as follows:

$${\overline{\text{E}}}\left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right) = \exp \left( {{\overline{\lambda }}\left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right),\tau } \right)\tau } \right).$$
(5)
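
Equations (3)–(5) can be illustrated with a minimal Python sketch. It assumes that the error norms at the start and end of the interval are already available for each perturbation; the function names and the sample values are hypothetical and are not taken from this study.

```python
import math

def nlle(err_start, err_end, tau):
    """Eq. (3): average nonlinear growth rate of one error over tau."""
    return math.log(err_end / err_start) / tau

def mean_nlle(start_norms, end_norms, tau):
    """Eq. (4): sample mean of the NLLE over all perturbations."""
    rates = [nlle(s, e, tau) for s, e in zip(start_norms, end_norms)]
    return sum(rates) / len(rates)

def rgie(start_norms, end_norms, tau):
    """Eq. (5): mean relative growth of the initial error."""
    return math.exp(mean_nlle(start_norms, end_norms, tau) * tau)

# Hypothetical error norms for three perturbations of equal initial size.
start = [0.1, 0.1, 0.1]
end = [0.2, 0.4, 0.8]
tau = 2.0

lam = mean_nlle(start, end, tau)
growth = rgie(start, end, tau)
```

Because Eq. (4) averages logarithms, the RGIE in Eq. (5) equals the geometric mean of the individual growth ratios, so a few fast-growing perturbations influence it less than they would an arithmetic mean.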

Ding and Li (2007) noted that the RGIE \({\overline{\text{E}}}\left( {{\varvec{x}}\left( {t_{ex - \tau } } \right),{\varvec{\delta}}\left( {t_{ex - \tau } } \right), \tau} \right)\) would eventually reach saturation. At saturation, a large ensemble of initial states has evolved to the climatological distribution, information regarding the initial states is completely lost, and the forecast is no longer credible. Therefore, to determine the prediction lead time of the extreme state \({\varvec{x}}\left( {t_{ex} } \right)\), we need to find its corresponding initial state \({\varvec{x}}\left( {t_{ex - \tau } } \right)\), which is the state whose RGIE reaches saturation at time \(t_{ex}\). Figure 1 shows the procedure used to search for the corresponding initial state. The extreme state \({\varvec{x}}\left( {t_{ex} } \right)\) is in the time-series data [\(x\left( {t_{1} } \right)\), \(x\left( {t_{2} } \right)\), …, \(x\left( {t_{n} } \right)\), …]. To determine its prediction lead time, it is necessary to examine the dynamical information regarding error growth during the period before the extreme state. Based on this information, the corresponding initial state of the extreme state is determined, which in turn gives the predictability of the extreme state. Consider first the state \({\varvec{x}}\left( {t_{ex - 1} } \right)\) immediately before the extreme state. A large number of initial errors with the same magnitude, but in different directions, are superimposed on the state \({\varvec{x}}\left( {t_{ex - 1} } \right)\). The RGIE \({\overline{\text{E}}}\left( {{\varvec{x}}\left( {t_{ex - 1} } \right),{\varvec{\delta}}\left( {t_{ex - 1} } \right), \tau} \right)\) is then calculated.
If it reaches the saturation level at time \(t_{ex}\), the state \({\varvec{x}}\left( {t_{ex - 1} } \right)\) is the corresponding initial state, and the prediction lead time of the extreme state \({\varvec{x}}\left( {t_{ex} } \right)\) is determined to be one time unit. If it does not reach the saturation level at time \(t_{ex}\), we then choose the state \({\varvec{x}}\left( {t_{ex - 2} } \right)\) and repeat this process. Based on this procedure, the corresponding initial state \({\varvec{x}}\left( {t_{ex - \tau } } \right)\) is determined, and the prediction lead time of the extreme state \({\varvec{x}}\left( {t_{ex} } \right)\) is \(\tau\). It should be noted that when several candidate initial states are found, the one that maximises the local predictability limit is taken as the corresponding initial state (Li et al. 2019).
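
The backward search described above can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' implementation: it assumes that an RGIE curve has already been computed for each candidate start, that "reaching saturation" means coming within a relative tolerance `tol` of a known saturation level, and that a start qualifies only if saturation is first reached exactly at the target time. The names `first_saturation_step`, `backward_lead_time`, and `tol` are hypothetical.

```python
def first_saturation_step(curve, saturation, tol=0.05):
    """Return the first step (1-based) at which the RGIE curve comes
    within a relative tolerance of the saturation level, or None."""
    for step, value in enumerate(curve, start=1):
        if value >= (1.0 - tol) * saturation:
            return step
    return None

def backward_lead_time(rgie_curves, saturation, tol=0.05):
    """rgie_curves[k] holds the RGIE curve for errors superimposed on the
    state k steps before the extreme state. A start qualifies when its
    curve first saturates exactly k steps later (at the extreme state).
    If several starts qualify, the most distant one is returned."""
    candidates = []
    for k in sorted(rgie_curves):
        if first_saturation_step(rgie_curves[k], saturation, tol) == k:
            candidates.append(k)
    return max(candidates) if candidates else None

# Synthetic curves: errors double each step and saturate at 8.0.
curves = {k: [min(8.0, 2.0 ** t) for t in range(1, k + 3)] for k in range(1, 6)}
lead = backward_lead_time(curves, saturation=8.0)
```

In this synthetic case every start shares the same growth curve, so only the start three steps before the target saturates exactly at the target time.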

Fig. 1

Schematic diagram of the procedure used to determine the corresponding initial state of the extreme state. Red and blue solid circles represent the extreme state and the corresponding initial state, respectively. The black solid circle represents the state \({\varvec{x}}\left( {t_{ex - 1} } \right)\) at time \(t_{ex - 1}\)

3 Results

3.1 Two extreme high-temperature events in Europe

The two EHTEs in Europe during late June and July 2019 were considered in this study. Figure 2a shows the surface air temperature (SAT) anomalies in Europe during the summer (June and July) of 2019. Northeastern Europe had negative SAT anomalies, whereas most other regions had anomalously high SATs. The SAT anomalies were therefore not uniform throughout Europe. For the region 36–55° N/0–60° E, however, the anomalies had a relatively uniform spatial distribution. Therefore, we adopted this area as the study region. The time-series of daily mean SATs averaged over this region is shown in Fig. 2b. There were two obvious periods of higher SATs during late June and July. For the first period, the average SATs began to increase on June 17 and peaked on June 25. For the second period, the average SATs began to increase on July 15 and peaked on July 25. These two periods correspond to the two EHTEs.
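
A regional average of this kind can be sketched in Python as shown below. The text does not state how grid points were weighted, so the cosine-of-latitude weighting used here is an assumption, as are the function name `box_mean` and the synthetic fields.

```python
import math

def box_mean(field, lats, lons, lat_range=(36.0, 55.0), lon_range=(0.0, 60.0)):
    """Cosine-latitude-weighted mean of field[i][j] inside a lat/lon box.
    The weighting is an assumption; the paper does not specify it."""
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not lat_range[0] <= lat <= lat_range[1]:
            continue
        w = math.cos(math.radians(lat))  # grid cells shrink towards the pole
        for j, lon in enumerate(lons):
            if lon_range[0] <= lon <= lon_range[1]:
                total += w * field[i][j]
                weight_sum += w
    return total / weight_sum

# Synthetic 0.5-degree grid covering 30-60N, 10W-70E.
lats = [30.0 + 0.5 * i for i in range(61)]
lons = [-10.0 + 0.5 * j for j in range(161)]

# 25 degC inside the study region, 100 degC outside, to check the masking.
field = [[25.0 if 36.0 <= lat <= 55.0 and 0.0 <= lon <= 60.0 else 100.0
          for lon in lons] for lat in lats]
regional = box_mean(field, lats, lons)

# A field equal to latitude shows the effect of the weighting.
lat_field = [[lat for _ in lons] for lat in lats]
weighted_lat = box_mean(lat_field, lats, lons)
```

With this weighting the southern rows of the box count slightly more than the northern rows, so the weighted mean latitude falls below the unweighted midpoint of 45.5° N.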

Fig. 2

a Average surface air temperature anomalies (relative to 1981–2010, units: °C) during June and July of 2019. b Time-series of daily mean surface air temperature (units: °C) averaged over the study region (36–55° N/0–60° E), denoted by the black box in a, in summer (June and July) of 2019. The shading in b shows the periods of the two EHTEs

3.2 Local predictability of the two extreme high-temperature events

We used the BNLLE method to quantify the local predictability of the two EHTEs. With the BNLLE method, an extreme condition first needs to be set as the target condition. The maximum temperatures during the two EHTEs occurred on June 25 and July 25, and thus the temperature conditions on these days were the target conditions. Figure 3 shows the spatial distributions of maximum SATs on the highest-temperature days of the two EHTEs. The maximum average temperatures on these days exceeded 40 °C in some regions. On June 25, the hottest regions were mainly located east of 45°E. On July 25, the highest temperatures were mainly in the western and eastern regions.

Fig. 3

Spatial distributions of daily mean surface air temperatures (shading, ℃) on a June 25 and b July 25

To determine the predictability of the extreme conditions, the corresponding initial conditions should be first found. Based on the BNLLE method, the RGIEs averaged over the region can be calculated from the 51 ensemble forecasts and the ERA-interim analysis dataset. When the average starting RGIEs of the initial condition reach saturation on the day of the extreme high temperatures, this initial condition is the corresponding initial condition. For June 25, we first verified whether the temperature condition on the previous day (June 24) was the corresponding initial condition. If it was not the corresponding initial condition, then we repeated this process for previous days until we determined which previous day was the corresponding initial condition. We conducted the same procedure for July 25.

The TIGGE data provide forecasts from 00Z on each day, and the ERA-interim analysis data are available four times each day (00Z, 06Z, 12Z, 18Z). Therefore, 00Z is the initial time, and the initial errors can be calculated from the difference between the TIGGE and ERA-interim analysis data. Errors at other forecast times can also be obtained from the two datasets. Based on the BNLLE method, we found that the SAT conditions on June 14 were the initial conditions corresponding to the extreme high temperatures on June 25. Figure 4a shows the temporal evolution of the average RGIEs at 12Z for the first EHTE. The initial error size was 0.13 on June 14, and the error then increased steadily in the early stages. On June 26, the average RGIE reached a peak and then decreased. On June 27 and 28, the average RGIEs remained largely unchanged. The average of the RGIEs in the last five days was taken to be the saturation level. Given that the average RGIE reached the saturation level on the day of the highest temperature (June 25), the prediction lead time of the first EHTE is 11 days. For the second EHTE, the SAT conditions on July 16 were the initial conditions corresponding to the highest temperatures on July 25. Figure 4b shows that the initial error size was 0.07 (i.e., less than that of the first EHTE), and that the error increased steadily before July 25. After July 25, the RGIE stopped increasing and remained broadly constant. The saturation level was also obtained by calculating the average of the RGIEs for the last five days. The average RGIE reached the saturation level on the day of the highest temperature (July 25). Therefore, the prediction lead time of the second EHTE is nine days. For these two EHTEs, we also examined whether these events could have been predicted if the forecasts had been made on June 13 and July 15. Our results show that the RGIEs reached saturation levels before the two highest-temperature days (not shown).
As such, the forecasts made on June 13 and July 15 could not have predicted the two EHTEs. However, the temperature conditions on June 14 and July 16 would have allowed the prediction of the two EHTEs, and therefore represent the predictability limits of these two events.
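
The saturation test used in this paragraph can be sketched in Python as follows. The saturation level is taken as the mean of the last five values, as described above; the relative tolerance `tol` that decides when the level counts as "reached" is an assumption, and the series is synthetic rather than the data of Fig. 4.

```python
def saturation_level(rgie_series, n_last=5):
    """Mean of the last n_last RGIE values, used as the saturation level."""
    tail = rgie_series[-n_last:]
    return sum(tail) / len(tail)

def lead_time_days(rgie_series, n_last=5, tol=0.05):
    """Number of days after the start at which the RGIE first comes within
    a relative tolerance of the saturation level (tol is an assumption)."""
    level = saturation_level(rgie_series, n_last)
    for day, value in enumerate(rgie_series):
        if value >= (1.0 - tol) * level:
            return day
    return None

# Synthetic daily RGIE series; index 0 is the initial day.
series = [1.0, 1.5, 2.1, 2.8, 3.6, 4.5, 5.0, 5.1, 4.9, 5.0]
level = saturation_level(series)
lead = lead_time_days(series)
```

For this synthetic series the last five values average to 4.9, and the curve first comes within 5% of that level six days after the start.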

Fig. 4

RGIEs averaged over the region at 12Z on each day for a the first and b the second EHTEs. Black dashed lines denote where the RGIEs reach the saturation level

Figure 5 shows the corresponding initial conditions of the two EHTEs. For the corresponding initial condition of June 14 (Fig. 5a), the southeastern region was the hottest (up to 28 °C), whereas other regions, such as the northwestern and northeastern areas, had lower temperatures (< 20 °C). For the corresponding initial condition of July 16, the hottest region was also located in the southeast (up to 30 °C; Fig. 5b), and most northern regions had temperatures of < 20 °C. The two corresponding initial conditions are important for forecasting the two EHTEs. The two EHTEs can be predicted in advance from the two corresponding initial conditions, which are the earliest conditions from which the two extreme conditions can be predicted. In addition, the initial conditions evolve into the extreme conditions in 11 and 9 days, respectively. Therefore, the two corresponding initial conditions are closely related to the predictability of the two EHTEs and may be associated with the precursor signals of the EHTEs.

Fig. 5

Same as Fig. 3, but for the corresponding initial conditions on a June 14 and b July 16

The average RGIEs shown in Fig. 4 describe the error growth averaged over the whole study area, and do not reflect regional characteristics. The regional dynamical characteristics are also an essential feature of the predictability and require further investigation. Figure 6 shows the spatial variations of the NLLEs at 12Z from June 14 to 28, which represent the average nonlinear growth rates of errors up to those days. Positive NLLEs indicate that the errors had increased relative to the initial errors at the initial time (00Z on June 14). Negative NLLEs indicate that the errors had decreased relative to the initial time. At 12Z on the first day (Fig. 6a), the positive NLLEs are located in the southern regions, whereas the negative NLLEs are mainly located in the northern regions, particularly north of the Caspian Sea. This indicates that the errors were increasing in the southern regions and decreasing in the northern regions from 00Z to 12Z on June 14. In addition, the NLLEs on the first day have the largest magnitudes. With increasing forecast time, the positive NLLEs tend to decrease and the negative NLLEs tend towards zero. Lorenz (1969) noted that smaller initial errors have a higher growth rate than larger initial errors. As such, the errors become larger in scale with increasing forecast time, which leads to a decline in growth rates. On June 15 and 16, the NLLEs in most regions decreased, but were still not zero. Most regions had relatively large NLLEs, indicative of high rates of error growth. The highest rates are mainly located in the southwestern regions and the Black Sea. The negative NLLEs are mainly distributed in the northwestern regions and to the north and east of the Caspian Sea. This indicates that the errors in these regions were still smaller than those at the initial time (00Z on June 14).
From June 17 to 19, the NLLEs continued to decrease and approach zero, demonstrating that the errors stopped increasing. From June 20 to 28, the NLLEs are close to zero in most regions. Only a few regions have positive NLLEs, which are slightly above zero. These regions have low rates of error growth. Based on the variations of NLLEs from June 14 to 28, the different geographical regions played different roles in the error growth rate over the region. The southwestern regions and the Black Sea caused rapid error growth, whereas the northern regions reduced the error growth rates during the early forecasting periods.
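
The sign convention for the grid-point NLLEs discussed above can be illustrated with a short Python sketch. It assumes that the error at each grid cell is the absolute forecast-minus-analysis difference and that every initial error is non-zero; the function name `nlle_map` and the 2 × 2 sample fields are hypothetical.

```python
import math

def nlle_map(initial_err, later_err, tau_days):
    """Grid-point NLLE, (1/tau) * ln(|e(tau)| / |e(0)|), for each cell.
    Assumes every initial error is non-zero."""
    return [
        [math.log(abs(e1) / abs(e0)) / tau_days for e0, e1 in zip(row0, row1)]
        for row0, row1 in zip(initial_err, later_err)
    ]

# Hypothetical errors (degC) at 00Z and at 12Z on the same day.
err_00z = [[0.2, 0.4], [0.5, 0.1]]
err_12z = [[0.4, 0.2], [0.5, 0.8]]

rates = nlle_map(err_00z, err_12z, tau_days=0.5)
grown = [[r > 0 for r in row] for row in rates]  # True where errors grew
```

A cell whose error doubled has a positive rate, a cell whose error halved has a rate of equal magnitude and opposite sign, and a cell with unchanged error has a rate of zero.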

Fig. 6

Spatial variations of NLLEs at 12Z from June 14 to June 28 for the first EHTE. a–o denote the dates from June 14 to June 28, respectively

Figure 7 shows the spatial variations in the corresponding RGIEs at 12Z on each day from June 14 to 28. The error growth exhibits different spatial distributions during the forecast periods, similar to those of the NLLEs. On the first seven days, from June 14 to 20, the larger positive RGIEs are mainly located in the southwestern regions and Black Sea, indicating larger error growth. Most northern regions have negative RGIEs, especially north of the Caspian Sea. For these regions, the negative RGIEs demonstrate that the errors are smaller than those at the initial time. However, the errors kept increasing, as reflected by the error growth rates (Fig. 6). As the forecast time increases, the errors for most northern regions exhibit rapid growth. The largest error growth is mainly located in the northwestern area and to the north of the Caspian Sea. On June 25, the larger positive RGIEs are mainly located over the entire region, and only a few areas have negative RGIEs. From June 25 to 28, the RGIEs over the entire area remain unchanged, demonstrating the errors have reached the saturation level. In general, the spatial variation in the RGIEs corresponds to that in the NLLEs, and it is clear that the geographical regions have affected the error growth.

Fig. 7

Same as Fig. 6, but for the RGIEs

Figure 8 shows the spatial variations in NLLEs at 06Z on each day from July 16 to 30 for the second EHTE. On the first three days, larger NLLEs occur in all regions, indicating the high rates of error growth at this stage. The first day has the highest rate of error growth. The positive NLLEs are mainly distributed over southern regions, while the negative NLLEs are mainly distributed over northern regions. Unlike the first EHTE, the Black Sea region has a lower rate of error growth, which is close to zero. From July 19 to 24, the NLLEs across the whole region kept decreasing and NLLEs for some regions are close to zero. On the last six days, more regions have NLLEs close to zero, and only a few regions have NLLEs above zero.

Fig. 8

Same as Fig. 6, but for the second EHTE

Figure 9 shows the spatial distributions of the corresponding RGIEs at 12Z from July 16 to 30 for the second EHTE. Like the first EHTE, the RGIEs have a non-uniform spatial distribution, indicating a dependence on the geographical region. Northern regions have positive RGIEs, whereas southern regions have negative RGIEs. From July 16 to 22, the errors increased for most regions and some northern regions have RGIEs greater than zero. From July 23 to 25, most regions have larger RGIEs, except the Black Sea, where the RGIEs are still negative. From July 26 to 30, the RGIEs are largely unchanged, indicating that the errors had reached the saturation level. Compared with the first EHTE, the second EHTE has larger errors east of the Caspian Sea, and smaller errors in the southwestern region and the Black Sea during the saturation periods. Therefore, the geographical regions played different roles in the error growth for the two EHTEs.

Fig. 9

Same as Fig. 7, but for the second EHTE

From the dynamics of the error growth, the local predictability limits of the two EHTEs were determined to be 11 and 9 days by the BNLLE method. We further verified whether these local predictability limits are reasonable. The root-mean-square error (RMSE; Fortin et al. 2014) was used to measure the magnitude of forecast errors and the forecast skills between the ensemble mean and observed data. The RMSE is different from the RGIE, which records the relatively exponential nature of growth of the initial errors. Figure 10 shows the variations in daily mean RMSEs from June 14 to 28 for the first EHTE. During the first six days, the RMSEs are relatively small in all regions. The northern regions have smaller RMSEs and the southern regions have larger RMSEs. From June 20 to 22, the RMSEs for the northeastern region became larger. Some other regions also exhibited increases in RMSEs. On June 23, the larger RMSEs in the northeastern regions extended to the southern regions, and the northwestern regions had large increases in RMSEs. On June 24 and 25, most regions have larger RMSEs, and the RMSEs in the northwestern regions underwent a marked increase. During the last three days from June 26 to 28, larger RMSEs characterised a wider region as compared with the previous 12 days, especially in the western and northeastern regions (i.e., > 7 °C). Figure 10p shows the average RMSEs over the region from June 14 to 28. The RMSEs increased steadily during the first six days, and then increased rapidly from June 20 to 26. On June 26, the RMSEs reached a peak and then started to decrease, albeit at high values. Therefore, the ECMWF ensemble prediction system had a higher forecasting skill during the first six days. However, the forecasting skill decreased gradually as the forecast time increased. On June 26 to 28, the forecast errors were large over most regions, indicating degradation of the forecast skill. 
Therefore, if a forecast had been made on June 14, the extreme high temperatures on June 25 could have been predicted and, as such, the predictability lead time of 11 days is robust.
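
The RMSE between the ensemble mean and the observations can be sketched in Python as follows. This is the standard definition of the RMSE applied to hypothetical values; the function names and the sample temperatures are illustrative and do not reproduce the data in Fig. 10.

```python
import math

def ensemble_mean(members):
    """Average the member forecasts point by point."""
    n = len(members)
    return [sum(values) / n for values in zip(*members)]

def rmse(forecast, observed):
    """Root-mean-square error between two equally long sequences."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Two hypothetical ensemble members at three grid points (degC).
members = [[20.0, 22.0, 24.0], [22.0, 24.0, 26.0]]
observed = [20.0, 23.0, 28.0]

mean_forecast = ensemble_mean(members)
score = rmse(mean_forecast, observed)
```

Unlike the RGIE, which is a ratio relative to the initial error, the RMSE carries the units of the variable (here °C), which is why the two measures are complementary.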

Fig. 10

Variations of daily mean root-mean-squared error over the regions as a function of time for the first EHTE

Figure 11 shows the variations in daily mean RMSEs from July 16 to 30. Like the first EHTE, the RMSEs on the first six days have smaller values, and increase steadily. The RMSEs for the northwestern and southeastern regions then increase rapidly. On July 25, the RMSEs of > 7 °C in these two regions increased more rapidly than in the other regions. For the last six days, most regions have larger RMSEs, indicating a low forecasting skill. The average RMSEs over all regions are relatively small in the first six days and increase steadily (Fig. 11p). From July 22, the RMSEs increase markedly. From July 25 to 26, the RMSEs reached the saturation level. Subsequently, the RMSEs continued to increase from July 27 to 30. Therefore, the ECMWF ensemble prediction system had a higher forecasting skill during the first six days, and the forecasting skill clearly decreased in some of the following days. On July 26 to 30, the RMSEs became larger for most regions, indicating the forecasting skill was negligible. Therefore, the predictability lead time of nine days is also robust.

Fig. 11

Same as Fig. 10, but for the second EHTE

4 Discussion and conclusions

Extreme high-temperature events are destructive to society, socioeconomic development, and ecological systems, and are occurring more frequently. Accurate forecasts and an enhanced knowledge of the predictability of EHTEs are essential. In this paper, a new method (the BNLLE method) was used to estimate the local predictability of two EHTEs that occurred in Europe during the summer of 2019. The BNLLE method was developed based on the NLLE method, and it considers the nonlinear characteristics of chaotic systems. To quantify the local predictability of extreme conditions, the BNLLE method focuses on the dynamical characteristics of error growth preceding the extreme condition. From the dynamical characteristics of the error growth, the BNLLE method searches backward for the corresponding initial state of the extreme condition. Once the corresponding initial state is found, the local predictability of the extreme condition can be determined.

Based on the BNLLE method, the temperature conditions on June 14 and July 16 were determined to be the corresponding initial conditions for the two EHTEs. The temporal variations in the average RGIEs over the region show that the errors underwent steady growth during the early period. This demonstrates that the errors evolved in a linear regime, and that the forecasting skill was reliable. For the two EHTEs, the errors took 11 and 9 days, respectively, to reach the saturation level, which is when the two highest-temperature days occurred. During the saturation periods, the errors evolve in a nonlinear regime, and the forecasts are no longer accurate (Ding and Li 2007). Therefore, the corresponding initial conditions were correctly determined. We also assessed whether the two EHTEs could be predicted if the forecasts were made before June 14 and July 16. The results show that the RGIEs reached saturation levels before the two highest-temperature days (not shown), indicating a loss of predictability. As such, the temperature conditions on June 14 and July 16 are the most distant conditions from the extreme conditions that still allow prediction of the two EHTEs. Given that the corresponding initial conditions eventually evolved into the extreme conditions, they are closely related to the predictability of the two EHTEs. In fact, the corresponding initial conditions can be hypothesised to be precursory signals for the forecasts of EHTEs. Future research will test this hypothesis.
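The selection logic described in this paragraph can be sketched as follows. This is a simplified illustration and not the BNLLE implementation: the plateau estimate (mean of the last three values), the tolerance `tol`, the integer day indices, and the function names `saturation_day` and `most_distant_start` are all assumptions introduced for the example, and the error curves are assumed to grow towards a plateau.

```python
import numpy as np

def saturation_day(rgie, tol=0.05):
    """Lead day from which the curve stays within `tol` of its plateau.

    The plateau is estimated as the mean of the last three values
    (an assumption; the curve is expected to have levelled off by then).
    """
    rgie = np.asarray(rgie, dtype=float)
    plateau = rgie[-3:].mean()
    below = np.nonzero(rgie < (1.0 - tol) * plateau)[0]
    # The day after the last sub-plateau value marks saturation.
    return 0 if below.size == 0 else int(below[-1]) + 1

def most_distant_start(curves, event_day):
    """Earliest start day whose errors saturate no earlier than the event.

    curves: dict mapping start day (integer day index) to the RGIE curve
            of a forecast initialised on that day.
    Returns None if no start day satisfies the condition.
    """
    valid = [start for start, curve in curves.items()
             if start + saturation_day(curve) >= event_day]
    return min(valid) if valid else None
```

Under these assumptions, a start day is rejected when its error curve saturates before the highest-temperature day, which mirrors the check on forecasts made before June 14 and July 16; the earliest start day that passes plays the role of the corresponding initial condition.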

In addition to the analysis of the error growth averaged over the region, the regional dynamics of the error growth were investigated. The NLLEs represent the error growth rates and have a heterogeneous spatial distribution during the forecasts. For the two EHTEs, positive NLLEs are mainly located in the southern regions, whereas negative NLLEs characterise the northern regions. The first day has the largest error growth rates. As the forecast time increases, the error growth rates decrease gradually and eventually approach zero, but different regions still have different error growth rates. For the first EHTE, during the first six days (June 14 to 19), the southwestern regions and Black Sea had higher rates of error growth than other regions. For the last nine days (June 20 to 28), all regions had similar rates of error growth, close to zero. For the second EHTE, the error growth rates were also heterogeneously distributed and were highest on the first day. In addition, the error growth rates decreased with time. Like the first EHTE, positive and negative error growth rates characterised the southern and northern regions, respectively. Unlike the first EHTE, the Black Sea region had a lower rate of error growth, demonstrating that the Black Sea region played a different role in the error growth of the two EHTEs. We also examined the spatial variation in RGIEs over the entire region. In general, the spatial variations in the RGIEs correspond to those of the NLLEs. For the first EHTE, during the early period the southwestern region and Black Sea had higher rates of error growth, and the area to the north of the Caspian Sea had a lower rate. With increasing forecast time, the errors increased for most regions until the highest-temperature day (June 25). Over the next few days, the RGIEs remained largely unchanged as the errors reached the saturation level. For the second EHTE, during the early period the southwestern region had relatively high rates of error growth compared with the other regions. During the later period, the northern region had a higher rate of error growth than the southwestern and Black Sea regions. The errors remained largely unchanged after July 25. Therefore, the error growth is dependent on the geographical region. Moreover, some geographical regions, such as the Black Sea, had different roles in the error growth for the two EHTEs.
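The relationship between the error growth rate and the relative error growth discussed above can be made concrete with a short sketch. It assumes the usual finite-time form, in which the growth rate over a lead time τ is ln(|δ(τ)| / |δ(0)|) / τ and the relative growth is the ratio |δ(τ)| / |δ(0)| itself; the array layout and the name `nlle_and_rgie` are hypothetical and chosen only for this example.

```python
import numpy as np

def nlle_and_rgie(err, dt=1.0):
    """Finite-time growth rate and relative growth of error magnitudes.

    err: array of shape (lead_times, ...) holding error magnitudes;
         err[0] is the (strictly positive) initial error.
    dt:  time between consecutive entries, e.g. 1 day (assumed).
    Returns (nlle, rgie), each of shape (lead_times - 1, ...).
    """
    err = np.asarray(err, dtype=float)
    rgie = err[1:] / err[0]  # relative growth of the initial error
    # Lead times tau = dt, 2*dt, ..., shaped to broadcast over regions.
    tau = dt * np.arange(1, err.shape[0])
    tau = tau.reshape((-1,) + (1,) * (err.ndim - 1))
    nlle = np.log(rgie) / tau  # positive: growth, negative: decay
    return nlle, rgie
```

With this form, a region where errors shrink has a negative rate, and once the error ratio levels off the rate ln(ratio) / τ necessarily tends towards zero as τ increases, which is consistent with the behaviour described for the later forecast days.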

We then used RMSEs to verify whether the local predictability limits estimated by the BNLLE method are reasonable and consistent with the actual forecasts. The RMSEs reflect the growth of forecast errors and hence the forecast skill. For both EHTEs, the RMSEs increased steadily but remained relatively small during the early periods. Subsequently, they increased markedly in some northern regions. The larger RMSEs then gradually extended to a wider region, indicating a loss of predictability. The increase in the RMSEs confirmed that the predictability limits of 11 and 9 days for the two EHTEs are robust. These results demonstrate that the BNLLE method is an effective technique for quantitatively studying the predictability of EHTEs. In addition, compared with other methods, the BNLLE method has lower computational costs and takes less time to calculate the predictability of EHTEs. It is expected that the BNLLE method can be effectively used for investigating the predictability of future extreme weather and climate events.