1 Introduction

South America (SA) has numerous types of climates, with tropical, subtropical, and extratropical features. The SA climate is influenced by the presence of the Andes Cordillera (Garreaud et al. 2009; Insel et al. 2010; Viale et al. 2019; Espinoza et al. 2020) and by the role of Amazonia in humidity supply and vapor transport in the region (Swann et al. 2015; Ruiz-Vásquez et al. 2020). During the austral summer, high temperatures and the low-level jet circulation support convective precipitation in central and southern SA (Romatschke and Houze 2010; Rasmussen et al. 2014; Mulholland et al. 2018; Poveda et al. 2020), which enhances the occurrence of flooding, landslides, and casualties in numerous countries (e.g., Giraldo-Osorio et al. 2019; Ehrlich et al. 2021). Heatwaves also persist during this season, with an increase in extreme temperatures observed within the region (Geirinhas et al. 2018; Chesini et al. 2019a; Olmo et al. 2020), which favors the development of forest fires (e.g., Urrutia-Jalabert et al. 2018; Dos Reis et al. 2021). During the austral winter, the SA climate has more impacts in southern South America (south of 30° S) than in summer, with an intensification of baroclinic instabilities and a weakening of the South Pacific High (Barrett and Hameed 2017), generating fronts from the Southern Hemisphere storm track that reach more northern latitudes (Rudeva et al. 2019; Espinoza et al. 2020; Aceituno et al. 2021). During this season, cold nights also impact numerous regions (Chesini et al. 2019b; Bitencourt et al. 2020). Consequently, studying climate variability and extremes is necessary for assessing vulnerability, adaptation, and impacts on the continent in a climate change context.

Surface meteorological observations in SA are sparse, not homogeneous across the continent, and not always available to the community (Skansi et al. 2013; Condom et al. 2020), yet they are essential to quantify changes in climate extremes and to evaluate biases in different gridded products (e.g., Rivera et al. 2018; Schumacher et al. 2020) and, consequently, in climate simulations. In this regard, Regional Climate Models (RCMs) are powerful tools developed to dynamically simulate meteorological conditions, especially for downscaling Global Climate Models (GCMs) to finer resolutions in order to study climate change impacts and assess adaptation under different scenarios. RCMs improve the comprehension of local climate phenomena through similar model configuration strategies, forcing GCMs and climate scenarios (Giorgi 2019). In SA, the first available RCM simulations covered only a few months at low resolution due to the computational limitations of their time, for example, two season months for SA (Eta model in Chou et al. 2000; DARLAM model in Nicolini et al. 2002) and two 5-month-long simulations with MM5 in southern SA (Rojas 2006). Since the launch of the Coordinated Regional Climate Downscaling Experiment (CORDEX, Giorgi et al. 2009; Giorgi and Gutowski 2015), coordinated and consistent ensembles of dynamical downscaling over different regions have been available worldwide. These common experimental frameworks, such as CORDEX-Phase I (Giorgi et al. 2009; Giorgi et al. 2022) and CLARIS-LPB (Europe-South America Network for Climate Change Assessment and Impact Studies in La Plata Basin, Boulanger et al. 2016), enabled the growing availability of RCM simulations over SA.

Several RCMs evaluations have been performed, but with a focus on specific regions of SA (e.g., Llopart et al. 2014; Carril et al. 2012; Carril et al. 2016; Bozkurt et al. 2018; Heredia et al. 2018; Olmo and Bettolli 2021; Bettolli et al. 2021), on specific extremes (Chou et al. 2014; Carril et al. 2016; Solman and Blázquez 2019; Blázquez and Silvina 2020; Olmo and Bettolli 2021), or on mean values (Falco et al. 2019; Teichmann et al. 2021). These earlier simulations (at nearly 50 km spatial resolution) allowed the understanding of (i) the RCM strengths and weaknesses (Solman 2013), (ii) the improved representation of regional processes adding value to their driving GCM (e.g., Rojas 2006; Giorgi et al. 2014; Solman and Blázquez 2019; Falco et al. 2019; Bozkurt et al. 2019), and (iii) the identification of systematic biases in their climate simulations (Solman et al. 2008; Chou et al. 2014; Sánchez et al. 2015; Solman 2016). In recent years, a new set of high-resolution RCMs simulations for the complete SA domain has become available through the CORDEX-CORE initiative (Gutowski et al. 2016), improving the horizontal resolution from ~ 50 km to ~ 25 km. The Brazilian Eta model (Chou et al. 2014) also provides simulations complementary to the CORDEX ones, at ~ 20 km horizontal resolution for almost all of South America. Even though some features of the CORDEX (e.g., Solman and Blázquez 2019; Olmo and Bettolli 2021) and Eta model (e.g., Dereczynski et al. 2020; Reboita et al. 2022) simulations have been addressed, an intercomparison of them has not yet been carried out in terms of their representation of extreme climate indices (climatology and trends) and their future projections over the complete SA domain.

In light of this, this work aims to assess a set of RCMs simulations in representing climate extreme indices over the different South American regions of the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6, Zhongming et al. 2021). For this purpose, RCMs simulations within the CORDEX-CORE framework and Eta model simulations, driven by different Coupled Model Intercomparison Project Phase 5 (CMIP5) GCMs, were analyzed for the historical period and for a future period under the Representative Concentration Pathway 8.5 (RCP8.5) emission scenario. This work contributes to an integrative approach across the different RCMs and to assessing the reliability of expected changes in extremes in the CORDEX-SA domain. This document has the following structure: Sect. 2 describes the methodology and datasets used. Section 3 shows the spatial and interannual distributions of extreme indices over the AR6 regions in SA and historical trends. This section also presents and discusses the climate change projections for the indices in the RCP8.5 scenario. Finally, we discuss and summarize the main findings of our study in Sects. 4 and 5, respectively.

2 Methodology and datasets

2.1 Data

Daily accumulated precipitation (Pr), maximum temperature (Tx), and minimum temperature (Tn) from three different RCMs (Table 1) over SA contributing to the CORDEX initiative were considered for this study (Fig. 1). The RCMs simulations evaluated here correspond to the finest spatial resolution available for the SA domain: the REgional MOdel (REMO version 2015) and the REGional Climate Model system (RegCM version 4.7) simulations from the CORDEX-CORE framework (Gutowski et al. 2016; ~ 25 km) and the Eta model simulations (Chou et al. 2014; ~ 20 km). We used simulations driven by three different GCMs from the Coupled Model Intercomparison Project Phase 5 (CMIP5, Taylor et al. 2012) (Table 1), covering the historical period (1981–2005) and climate change projections for the 2071–2099 period in the RCP8.5 scenario.

The evaluation of the RCMs was carried out in the historical period, considering two different sources of information: the NOAA Climate Prediction Center gridded product (CPC) (Xi et al. 2010) and the new high-resolution agrometeorological dataset retrieved from the fifth global reanalysis of the European Centre for Medium-Range Weather Forecasts (ERA5), hereafter AgERA5 (Boogaard et al. 2020; Hersbach et al. 2020). The CPC gridded data set has a ~ 0.5° grid resolution and is one of the few observational gridded data sets available at a daily scale for SA that includes the three variables of interest (Pr, Tn, and Tx). AgERA5 is a meteorological product that provides data at ~ 0.1° spatial resolution with fine topography, land-use patterns, and land-sea delineation of the high-resolution operational reanalysis ERA5. Despite the limitations of reanalyses in reproducing precipitation, especially inland (convective) and orographic precipitation, when compared with observations (e.g., Qin et al. 2021; Terblanche et al. 2021), they have been shown to improve over time (e.g., Schumacher et al. 2020; Alexander et al. 2020; Gleixner et al. 2020; Crossett et al. 2020; Nogueira 2020; Chen et al. 2021) and provide a valuable, consistent dataset in sparsely gauged regions. These limitations and improvements are expected to be inherited by the AgERA5 product.

To better understand the limitations of gridded meteorological products, we included meteorological stations in the analysis to compare climatologies and the representativeness of extreme indices over SA. We retrieved daily Pr, Tx, and Tn weather station records from the National Weather Services of Argentina, Uruguay, and Paraguay; the Brazilian National Institute of Meteorology (INMET); the Brazilian National Water Agency (ANA, only for Pr); the Peruvian Meteorological Agency (SENAMHI); the Chilean Water Agency (DGA) and the Chilean Meteorological Agency (DMC); and the Global Historical Climatology Network (GHCN), available from different sources as detailed in the Supplementary Material. Quality control was applied to the observations regarding outliers (removing records that deviated from the mean by more than five standard deviations), physically plausible values (such as Pr ≥ 0 and Tx > Tn), and missing data, with a year considered complete when it had fewer than 15 missing days. Despite the large number of initial observations, especially from the GHCN, the final sample of stations was significantly reduced due to data availability during the historical reference period of the RCMs simulations (Table 1, Figure S1).
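The quality-control rules above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' actual procedure: the function names `quality_control` and `is_complete_year` are hypothetical, the outlier rule is read as "more than five standard deviations away from the mean", the same rule is applied to all three variables, and non-negative precipitation is taken as the plausibility criterion.

```python
import numpy as np

def quality_control(pr, tx, tn, n_sigma=5.0):
    """Return copies of daily Pr, Tx, Tn with suspicious values set to NaN.

    Hypothetical helper: the thresholds follow the text, the details are assumed.
    """
    pr, tx, tn = (np.array(a, dtype=float) for a in (pr, tx, tn))

    def outlier(x):
        # Flag values farther than n_sigma standard deviations from the mean.
        return np.abs(x - np.nanmean(x)) > n_sigma * np.nanstd(x)

    # Physical plausibility: no negative precipitation, and Tx must exceed Tn.
    bad_pr = (pr < 0) | outlier(pr)
    bad_pair = tx <= tn
    bad_tx = bad_pair | outlier(tx)
    bad_tn = bad_pair | outlier(tn)

    pr[bad_pr] = np.nan
    tx[bad_tx] = np.nan
    tn[bad_tn] = np.nan
    return pr, tx, tn

def is_complete_year(values, max_missing=15):
    """A year counts as complete when it has fewer than `max_missing` missing days."""
    return int(np.isnan(np.asarray(values, dtype=float)).sum()) < max_missing
```

Computing all masks before writing any NaN keeps a flagged value in one variable from contaminating the statistics or plausibility check of another.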

Table 1 Regional climate models used in this study

Finally, considering the different climatic characteristics of SA, the total domain was divided into seven subregions following the IPCC climatic regions presented in the recent AR6 report (Iturbide et al. 2020) (Fig. 1). The RCMs domains are represented in Fig. 1 with SAM-20 for the Eta model domain and SAM-22 for REMO and RegCM models. Note that the SA domain from the Eta model did not cover the entire SA region (SAM-20). In this particular case, the common area among all simulations was considered for the analysis. The number of meteorological stations considered for each subregion is presented in Table 2.

Fig. 1

South America domains and ETOPO1 (NOAA 2009) terrain elevation in meters. RCMs’ domains are shown in orange (SAM-22) and blue (SAM-20). Regions used for the assessment (AR6 regions) are enclosed by solid black lines: North-West South America (NWS); North–South America (NSA); South-America-Monsoon (SAM); North–East South America (NES); South–East South America (SES); South–West South America (SWS); South–South America (SSA). This figure uses SAM-22 and SAM-20 as the CORDEX abbreviations to identify the domains

Table 2 Number of stations (STN) considered per variable within South America AR6 regions

2.2 Methodology

The evaluation of the RCMs focused on extremes. With this aim, four extreme climate indices from the Expert Team on Climate Change Detection and Indices (ETCCDI, Klein Tank et al. 2009) were analyzed: the maximum consecutive 5-day precipitation (Rx5day), the maximum number of consecutive days with Pr < 1 mm (CDD), the maximum value of daily maximum temperature (TXx), and the minimum value of daily minimum temperature (TNn). The indices were computed annually for each grid cell over SA (in the case of RCMs and gridded products) and for each meteorological station. The indices were calculated with the climdex R package (Bronaugh 2014) in the case of stations and with its Fortran version (FClimdex) for the gridded datasets. The native resolutions of the gridded products were preserved for this step.
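As an illustration of the four index definitions, the following minimal sketch computes them for one year of daily values with NumPy. It is not the climdex or FClimdex implementation used in the study: the function names are hypothetical, missing data are not handled, and dry spells that span two calendar years are not joined.

```python
import numpy as np

def rx5day(pr):
    """Largest precipitation total over any 5 consecutive days (mm)."""
    pr = np.asarray(pr, dtype=float)
    # Rolling 5-day sums obtained by convolving with a window of ones.
    return float(np.convolve(pr, np.ones(5), mode="valid").max())

def cdd(pr, threshold=1.0):
    """Longest run of consecutive days with precipitation below `threshold` (mm)."""
    longest = current = 0
    for dry in np.asarray(pr, dtype=float) < threshold:
        current = current + 1 if dry else 0
        longest = max(longest, current)
    return longest

def txx(tx):
    """Highest daily maximum temperature in the period."""
    return float(np.max(tx))

def tnn(tn):
    """Lowest daily minimum temperature in the period."""
    return float(np.min(tn))
```

Applying these functions year by year to a station record, or along the time axis of a gridded field, gives the annual series analyzed in the following sections.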

The analysis was divided into three stages: (1) Comparison of gridded products and surface observations, (2) Spatial distribution, interannual variability, and trends; and (3) Climate change projections. The detailed methods and objectives for each step are presented in the following paragraphs.

(1) Comparing observed and gridded-product climatologies makes it possible to quantify how well the extreme indices are represented in each AR6 region. Only stations with more than 15 years of records were considered, because of the low time coverage of the initial stations (Figure S1). Although the evaluation of reference products was not within the scope of this article, we assessed the Pearson correlation coefficient, the Root Mean Square Error (RMSE), and the Kling-Gupta efficiency score (KGE, Gupta et al. 2009, Eq. 2.1) between CPC, AgERA5, and surface observations (see Figure S2).

$$KGE=1-\sqrt{{(r-1)}^{2}+{\left(\frac{{\mu }_{s}}{{\mu }_{o}}-1\right)}^{2}+{\left(\frac{{\sigma }_{s}}{{\sigma }_{o}}-1\right)}^{2}}$$
(2.1)

Where \(r\) is the Pearson correlation coefficient, \(\mu\) is the mean and \(\sigma\) the standard deviation of the gridded products \({\left(-\right)}_{s}\), and surface observations \({\left(-\right)}_{o}\).
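Eq. (2.1) translates directly into a few lines of NumPy. The sketch below is illustrative only; the function name `kge` and the use of the population standard deviation are assumptions, and the inputs are taken to be paired series without missing values.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency of a gridded product (sim) against observations (obs)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]      # Pearson correlation coefficient
    mean_ratio = sim.mean() / obs.mean()  # mu_s / mu_o
    std_ratio = sim.std() / obs.std()     # sigma_s / sigma_o
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (mean_ratio - 1.0) ** 2 + (std_ratio - 1.0) ** 2)
```

A perfect match gives KGE = 1, and the score decreases as correlation, mean bias, or variability depart from the observed values.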

(2) The spatial distribution simulated by each RCM was computed by averaging, in time, each grid cell within a specific AR6 region (Eq. 2.2). The area-averaged time series was calculated by averaging all grid cells in the same region by year (Eq. 2.3); this series represents the interannual variability of each region. The magnitude of the trends of the climate extreme indices was also calculated using Sen’s slope method (Sen 1968), and the Mann-Kendall test (Mann 1945; Kendall 1948) was used for statistical significance, considering a confidence level of 95%.
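For readers unfamiliar with the trend estimators named above, a minimal sketch is given here. It assumes an evenly spaced annual series, uses the normal approximation of the Mann-Kendall statistic without a correction for ties, and ignores serial correlation; the function names are hypothetical and this is not necessarily the exact variant used in the study.

```python
import math
import numpy as np

def sens_slope(y):
    """Sen's slope: median of all pairwise slopes, in units of y per time step."""
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(y), k=1)  # all index pairs with i < j
    return float(np.median((y[j] - y[i]) / (j - i)))

def mann_kendall(y, alpha=0.05):
    """Two-sided Mann-Kendall test; returns (z, p_value, is_significant)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    i, j = np.triu_indices(n, k=1)
    s = float(np.sign(y[j] - y[i]).sum())
    # Variance of S assuming no tied values (simplification).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1.0) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1.0) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p_value, p_value < alpha
```

With annual values, multiplying the slope by 10 expresses the trend per decade, and `alpha=0.05` corresponds to the 95% confidence level.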

(3) Lastly, we computed the mean projected changes for the extreme climate indices in the RCP8.5 scenario for the 2071–2099 period, considering 1981–2005 as a base period. Additionally, the inter-model agreement in the signal of change was computed. To this end, all models were regridded to a common ~ 25 km grid through bilinear interpolation.

$${\langle x\rangle }_{i,j}^{AR}=\frac{\sum _{y}{x(i,j)}_{y}}{{n}_{y}}$$
(2.2)
$${\langle x\rangle }_{y}^{AR}={\left(\frac{\sum _{i}\sum _{j}x\left(i,j\right)}{{n}_{ij}}\right)}_{y}$$
(2.3)

Where \({\langle x\rangle }_{i,j}^{AR}\) represents the time-averaged value for the \((i,j)\) grid cell in a region over \({n}_{y}\) years, and \({\langle x\rangle }_{y}^{AR}\) stands for the space-averaged series in a specific AR6 region. Eq. (2.2) yields \({n}_{ij}\) points for each region, whereas \({\langle x\rangle }_{y}^{AR}\) has the same length as the reference period (\({n}_{y}\)), i.e., 1981–2005.
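The two averages in Eqs. (2.2) and (2.3) can be sketched with NumPy as follows, assuming the annual index is stored as an array of shape (years, lat, lon) and the AR6 region is given as a boolean mask on the same grid. The function names and the mask representation are assumptions, and the averages are unweighted, as in the equations as written (no latitude-dependent area weighting).

```python
import numpy as np

def time_mean_map(index, region_mask):
    """Eq. (2.2): average over years for each grid cell; cells outside the region are NaN."""
    index = np.asarray(index, dtype=float)
    climatology = index.mean(axis=0)
    return np.where(region_mask, climatology, np.nan)

def regional_series(index, region_mask):
    """Eq. (2.3): average over all grid cells of the region for each year."""
    index = np.asarray(index, dtype=float)
    # Boolean indexing keeps the year axis and flattens the selected cells.
    return index[:, np.asarray(region_mask, dtype=bool)].mean(axis=1)
```

The first function returns one value per grid cell in the region (the spatial distribution), and the second returns one value per year (the interannual series).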

3 Results

3.1 Gridded datasets vs. in-situ observations

SA is a vast region with diverse climates, in which in-situ observations are sparse, not homogeneous across the continent, and not always available to the community (Skansi et al. 2013; Condom et al. 2020). In this sense, it is useful to consider gridded data sets when evaluating climate models (e.g., Solman and Blázquez 2019; Bozkurt et al. 2018). Therefore, identifying their limitations and differences across SA is crucial to provide insight into the observational uncertainty and to set a reference for comparisons. A brief analysis of the index climatologies is presented in this work. Figures 2 and 3 display climatologies for the precipitation and temperature-based indices over SA for the meteorological stations and the two gridded products in the 1981–2005 period. The Pearson correlation coefficient, KGE, and RMSE for single stations in each AR6 region (Figure S2) reflected the difficulty of the gridded products in representing precipitation, with no overall dominant behavior across the regions, except for Tn, for which CPC performed better in all regions.

When comparing observed climatologies with reference products, all datasets depicted a similar spatial distribution of the climatological patterns of SA but exhibited differences in intensity for precipitation-based indices, mainly in some sparsely gauged regions (NSA, NWS, and SAM, Fig. 2). The largest disagreements among products were detected for Rx5day, especially in the northeastern portions of SA (NES, SAM, NSA) and southwestern Patagonia (SWS), where gridded products generally exhibited lower values of Rx5day than the stations (Fig. 2). In particular, in SAM and NSA, CPC presented considerably higher values for Rx5day than AgERA5, doubling in magnitude in some portions. In contrast, in northern NWS, along the Colombian Andes, AgERA5 exhibited higher values of Rx5day than CPC and seemed to capture the observed magnitudes from this region better. However, there is insufficient data to verify this pattern across the region. The most significant discrepancies were found in northern SA, in agreement with Almazroui et al. (2021), who analyzed precipitation totals in several datasets.

Fig. 2

Climatologies for Rx5day (mm/5 days, upper panel) and CDD (days, bottom panel) over SA for the weather stations (left), AgERA5 (middle), and CPC (right) in the historical period (1981–2005)

For the consecutive dry days (CDD), the main differences were found over the western part of SES and in SSA, where AgERA5 differed from CPC and the stations, with lower values of CDD. In particular, these regions include complex topography due to the Andes mountain range (Fig. 1), where the ERA5 reanalysis usually overestimates precipitation, leading to low values of CDD (Balmaceda-Huarte et al. 2021). Regarding the gridded data, considerable differences were observed in SAM and northern SA (NSA and NWS); commonly, CPC exhibited an extended pattern of more consecutive dry days over these regions than AgERA5. These discrepancies could be associated with the different data sources used to construct the gridded products in these regions, which are affected by the quality and time coverage of the meteorological stations (see Figure S1) and satellite data.

In southern Chile (south of SWS), orographic precipitation was well represented by AgERA5, which exhibited a strong gradient of Rx5day similar to that of the stations, whereas CPC underestimated the index and misrepresented its spatial behavior. The NWS region showed the highest magnitudes, with up to 400 mm/5 days (out of scale), and the lowest values were observed over the northwest part of the SWS region, with a mean of 0.02 mm/5 days (− 18°S, Quillagua, Chile) north of the Atacama desert (~ 28 °S). Overall, both the spatial patterns and the magnitudes present differences across SA, which may be related to the lack of stations in some areas, especially in high-altitude regions.

Regarding extreme temperature indices (Fig. 3), the main characteristics of TNn and TXx were well reproduced by AgERA5 and CPC, although some differences with observations were exhibited in specific regions. In the case of TXx, CPC and AgERA5 generally presented colder temperatures than observations along the subtropical Andes (SWS), southeast SA (SES), and south SA (SSA). These TXx results for AgERA5 were consistent with the ERA5 reanalysis results shown by Balmaceda-Huarte et al. (2021) and could be associated with differences in the elevation representativeness of the topography, and in local processes, at their different horizontal resolutions.

Fig. 3

Climatologies for TXx (°C, upper panel) and TNn (°C, bottom panel) over SA for the weather stations (left), AgERA5 (middle), and CPC (right) in the historical period (1981–2005)

For the coldest temperatures (TNn), disagreements among data sets were observed in southeastern SA (SES), where CPC and AgERA5 showed warmer temperatures than the stations (STN). Similarly, in South-South America (SSA), higher values of TNn were observed in the gridded data sets, particularly on the coastline of the continent (< 5 °C). This common feature of overestimating TNn observed in both regions seems to be more pronounced in AgERA5. Moreover, in the northwest regions of SA (NSA and NWS), where station data are scarce, AgERA5 exhibited warmer temperatures than CPC.

3.2 Evaluation of RCMs extreme indices

The differences between the gridded products became evident in the previous section, particularly over regions where in-situ data is sparse, and the topography is complex. Therefore, only the gridded products were considered for comparisons to ease the following analyses.

Figure 4 exhibits the spatial variability of the extreme indices for each RCM forced by the different GCMs and for the gridded observational products. In particular, Rx5day results were more dependent on the RCM than on the driving GCM. REMO and RegCM simulations often overestimated Rx5day in almost all regions, whereas Eta tended to underestimate this index. In the case of CDD, regional differences could be observed in the RCMs performances. Over SWS, the large dispersion of CDD in the boxes detected in CPC and AgERA5 was well captured by RegCM and Eta simulations but not by REMO simulations that underestimated the index and the regional spread. In NWS, REMO presented closer values and similar dispersion to CPC. At the same time, RegCM and Eta simulations overestimated the regional distribution and the mean value of this index in this region. In NES, Eta simulations—except the one driven by CanESM2—presented the best performances, whereas RegCM showed higher CDD values than CPC and AgERA5. The REMO simulations overestimated the spatial variability in this region, although they better represented the regional mean of CDD than RegCM.

Fig. 4

Box and whisker plots of mean values of the ETCCDI indices analyzed in this study at every grid point over the different SA regions for the observational datasets CPC (gray), AgERA5 (dark gray), and the RCMs driven by different GCMs: Eta (green), RegCM (red) and REMO (blue) for the 1981–2005 period. Solid central mark: mean, dashed line: median; bottom and top of the box: 25th and 75th percentiles

For both precipitation indices, differences among GCMs within each RCM were more distinguishable in some regions (NSA, NES, SAM, and SES). In particular, for Rx5day, these discrepancies were more notable in the CORDEX-CORE RCMs than in Eta, except in NSA and NES. In these regions, Eta-MIROC5 simulations highly overestimated Rx5day, differing from the rest of the Eta simulations. In the case of REMO and RegCM, simulations driven by MPI-ESM tended to perform differently than those driven by NorESM1 and HadGEM2-ES, particularly for REMO.

Regarding temperature indices, RCMs simulations tended to underestimate (overestimate) TXx (TNn) when compared to CPC in almost all regions (Fig. 4), except for some simulations of TXx in the NSA, SAM, and NES regions. For both temperature-based indices, more significant discrepancies in terms of spatial variability were observed in NSA and SAM. In these regions, higher values of TXx were exhibited by REMO and RegCM models compared to CPC and AgERA5, more noticeable in REMO. In the case of Eta, results were more dependent on the driving GCM: Eta nested by CanESM2 performed differently from HadGEM2-ES and MIROC5 and exhibited higher values of TXx in both regions. All RCMs generally overestimated the spatial dispersion of TXx in every region, displaying larger boxes than CPC and AgERA5, especially in the SES, SAM, NSA, and NES regions. In the case of the minimum temperatures (TNn), models performed very similarly, typically exhibiting higher values of TNn than CPC and AgERA5, more intensified in SSA and SWS.

When comparing interannual variability, i.e., averaging all grid cells within each AR6 region annually, all analyzed indices show differences in the mean values and in the interquartile range (IQR), which are generally lower than in the spatial-averaged series, with little overlap between simulations (Fig. 5). The Rx5day index does not present concordant distributions between different GCMs forcing a single RCM (e.g., Eta in SWS, RegCM in NWS, and REMO in SAM). Similar results occur for the CDD index, with slightly drier conditions in the mean but a smaller IQR. Except for NWS and SWS, the interannual distributions of simulations from the same RCM do not overlap, reflecting the inherent variability within global models. The temperature-based indices represent the areal-interannual variability better than the precipitation ones (similar IQR within each region and model).

Fig. 5

Same as Fig. 4 but for the regional averaged interannual distribution of the ETCCDI indices analyzed in this study

3.3 Evaluation of RCM trends for extreme indices

Trends of the regional-averaged series of the extreme indices are displayed in Figs. 6 and 7. The trends are shown by region for CPC, AgERA5, and each of the RCMs simulations. Precipitation-based indices showed less consistent results among RCMs simulations and little agreement between AgERA5 and CPC (Fig. 6). On the other hand, all data sets agreed on warming trends for TXx and TNn in almost all regions, though some were not statistically significant (Fig. 7). For both extreme temperature indices, RCMs simulations captured the sign of the trend in each region well, although, in general, they differed in intensity.

Fig. 6

Trends estimated using Sen’s slope analysis in each AR6 region (columns) and model (rows) for (a) Rx5day and (b) CDD over the period 1981–2005. Brown (green) indicates dry (wet) conditions. Trends significant at the 95% confidence level are marked with an asterisk

Fig. 7

Same as Fig. 6 but for (a) TXx and (b) TNn. Blue (red) indicates cold (warm) conditions. Units are °C dec⁻¹

In the case of Rx5day, CPC presented significant positive trends in northwest SA (NWS) and South-SA (SSA) (Fig. 6). Notably, in NWS, the gridded products presented opposite, statistically significant trends for the Rx5day index; this is a region where both products had difficulty representing precipitation (see Figure SM2). In the same area, Eta-MIROC5 simulations also showed negative and significant trends, while, in agreement with CPC, Eta-HadGEM2-ES runs presented wetter trends (p value < 0.05). The rest of the RCMs simulations did not show significant trends for this region, nor did they agree on the sign of the regional trend. In SSA, the wet signal exhibited by CPC and AgERA5 was well reproduced by most RCMs simulations. Moreover, Eta-MIROC5 and RegCM4-HadGEM2-ES simulations captured the significance of this trend, similar to CPC. In NSA, AgERA5 presented significant upward trends for Rx5day, and the majority of the RCMs simulations agreed with the sign of this trend.

Fewer significant long-term changes were observed in the case of CDD. CPC only presented a significant upward trend in southeast SA (SES) for this index. Almost all RCMs simulations agreed on the sign of this trend, except the REMO-NorESM1 and Eta-CanESM2 runs. Furthermore, in the SWS and SAM regions, where CDD typically exceeds 100 days (Fig. 2), CPC, AgERA5, and most RCMs simulations agreed on positive trends. In particular, some Eta simulations also exhibited significant upward trends in NSA and NES, with larger Sen’s slope values than the rest of the data sets, especially in NES. Moreover, in this last region, RCMs forced by HadGEM2-ES and MIROC5 tended to simulate drier trends than those forced by the rest of the GCMs.

Warmer trends were generally observed for temperature-based indices in almost all regions and data sets (Fig. 7). In the case of TXx, consistent results among data sets were observed in north-east SA (NES), where CPC, AgERA5, and most of the RCMs simulations, agreed on a significant upward trend. In this region, the exceptions were the Eta-MIROC and CanESM2, Eta-REMO, MIROC-HadGEM2-ES, and MIROC-NorESM1 simulations, which did not present a significant trend for this index. In SAM and NSA, many RCMs simulations and AgERA5 showed significant upward trends for TXx, although no significant trend was detected in CPC in these regions. In addition, CPC exhibited negative trends in northwest SA (NSA), but this was not identified in other data sets.

For the TNn index, CPC generally exhibited higher values of Sen’s slope than for TXx (Fig. 7). Strong significant trends in NES and NSA were observed in this data set (> 1.5 °C dec⁻¹). Some RCMs simulations from RegCM and REMO in these regions also presented statistically significant trends but with smaller magnitudes. In particular, the RegCM-MPI-ESM run exhibited significant positive trends in almost all areas and stronger trends than the rest of the RCMs simulations. On the contrary, Eta-CanESM2 simulated a cooling tendency for most regions. AgERA5 did not present significant trends in any region and exhibited weaker trends for TNn than CPC. Similar to TXx, CPC showed a negative signal of change for TNn in NSA, and only Eta driven by CanESM2 agreed on the sign of this trend.

A closer inspection of the spatial distribution of the trends is provided in Figure SM1 to Figure SM4 of the Supplementary Material. For precipitation-based indices (Figures SM1 and SM2), particularities inside each region could be detected, especially in NWS and SES for Rx5day, where trends with opposite signs were exhibited in the same region. This feature translated into differences in the signal of change observed in Fig. 6 for CPC, AgERA5, and some RCMs. In contrast, in the spatial patterns of the TXx and TNn trends, more homogeneous results were observed within each region, and the average regional trends displayed in Fig. 7 summarized this information well.

These results highlight that trends in extreme precipitation indices were not reproduced by all regional simulations in magnitude and direction of change. In the case of temperature indices, the agreement was higher for TXx and TNn; nevertheless, CPC showed stronger warming signals for the SAM, NSA, and NES regions, and even a strong cooling signal (NWS, Figure S5 and Figure S6), that were not represented in the models. The misrepresentation of trends may not necessarily be a consequence of the regional modeling itself but of the aggregation of different climates within the AR6 regions, and should be analyzed in more depth in future studies.

3.4 Climate change projections

Figures 8, 9, 10 and 11 display the changes in the precipitation- and temperature-based indices for the late 21st century (2071–2099) with respect to the 1981–2005 reference period for the different RCMs and driving GCMs under the RCP8.5 scenario. Model agreement is also included for Rx5day and CDD, depicted as the percentage of simulations that agree on the sign of the change.
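The projected change and the sign-agreement measure can be sketched as below, assuming all simulations have already been regridded to a common grid. The function names are hypothetical, and the convention of returning the agreement as a signed percentage (positive where most simulations project an increase, negative where most project a decrease) is an assumption made for illustration; the exact definition used for the figures may differ.

```python
import numpy as np

def projected_change(hist, fut, relative=False):
    """Mean future index minus mean historical index, per grid cell.

    hist, fut: arrays of shape (years, ...). With relative=True the change is in %.
    """
    base = np.asarray(hist, dtype=float).mean(axis=0)
    delta = np.asarray(fut, dtype=float).mean(axis=0) - base
    return 100.0 * delta / base if relative else delta

def sign_agreement(changes):
    """Signed percentage of simulations sharing the majority sign of change.

    changes: array of shape (simulations, lat, lon) of projected changes.
    """
    changes = np.asarray(changes, dtype=float)
    pos = 100.0 * (changes > 0).mean(axis=0)  # % of simulations with an increase
    neg = 100.0 * (changes < 0).mean(axis=0)  # % of simulations with a decrease
    return np.where(pos >= neg, pos, -neg)
```

A relative change suits Rx5day (shown in %), while the absolute difference suits CDD (days) and the temperature indices (°C).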

Regarding Rx5day (Fig. 8), wetting signals were generally found over SES, NES, NWS, and some areas of SA. Decreases in this index were primarily located in NSA but with different rates of change among simulations. Model agreement was largest over SES (positive changes) and over NSA and SWS (negative changes). In this sense, the REMO and RegCM models tended to present congruent changes over the different regions of SA. In contrast, the Eta simulations showed lower positive changes in Rx5day and important drying signals, particularly in Brazil. Furthermore, opposite changes were mainly found in the northern and southern parts of SWS, with positive changes in the north and negative ones in the south.

Fig. 8

Mean projected changes in Rx5day [%] for the period 2071–2099 compared to the base period 1981–2005 for each RCM (columns) driven by the different GCM (rows) under the RCP8.5 scenario. Agreement among models (expressed as the percentage of RCMs) is shown in the bottom-right corner, colored by the projected change signal (from negative to positive), in which models agree

In the case of CDD (Fig. 9), consistent positive changes, i.e., longer dry spells, were found over NES and some portions of NSA. On the contrary, some areas of NWS, SES, and the Andes cordillera show projected decreases in CDD. Nevertheless, the rates of change varied among RCM simulations, especially in the Eta-CanESM2 simulation, which depicted the largest changes over different portions of SA (> 80 days).

Fig. 9

Same as Fig. 8 but for CDD [days]
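For readers less familiar with the two precipitation indices, the following sketch shows how they could be derived from a daily precipitation series. It assumes the widely used ETCCDI definitions (Rx5day as the maximum 5-day precipitation total, CDD as the longest run of days with precipitation below 1 mm); the function names and the sample series are hypothetical and do not reproduce the processing chain of this study.

```python
def rx5day(daily_pr):
    """Maximum precipitation total over any 5 consecutive days [mm]."""
    if len(daily_pr) < 5:
        raise ValueError("need at least 5 daily values")
    window = sum(daily_pr[:5])
    best = window
    # Slide the 5-day window one day at a time.
    for i in range(5, len(daily_pr)):
        window += daily_pr[i] - daily_pr[i - 5]
        best = max(best, window)
    return best


def cdd(daily_pr, wet_threshold=1.0):
    """Longest run of consecutive days with precipitation below the threshold."""
    longest = current = 0
    for pr in daily_pr:
        current = current + 1 if pr < wet_threshold else 0
        longest = max(longest, current)
    return longest


# Hypothetical daily precipitation series [mm/day].
pr = [0.0, 0.2, 12.0, 30.5, 8.0, 0.0, 0.0, 0.5, 0.0, 4.0, 22.0, 0.0]

print(f"Rx5day = {rx5day(pr):.1f} mm, CDD = {cdd(pr)} days")
```

In practice both indices are computed per year and per grid cell before averaging over the reference and future periods.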

In the case of maximum temperatures, TXx projects general warming over SA, with the largest rates of change mainly identified over some portions of NSA, NWS, and SAM (Fig. 10). The continental regions with the largest changes often vary among RCMs and driving GCMs. However, consistent results were found for the simulations driven by the HadGEM2-ES model, the only GCM in common for the three RCMs included in this study. It is interesting to highlight that the Eta-CanESM2 run shows the warmest extreme temperatures in northwestern SA (NWS), with changes of up to 12 °C in some portions, which differs from the other simulations. On the other hand, the smallest changes are found in SES and some parts of SSA, although the changes for TXx reach 5 °C in some cases.

Fig. 10

Mean projected changes in TXx [°C] for the period 2071–2099 compared to the base period 1981–2005 for each RCM (columns) driven by the different GCMs (rows)

When considering the coldest temperatures (TNn), the rates of change over SA are more homogeneous, despite the notable increases over the Andes mountain range and in the southern tip of the continent (parts of SWS and SSA), where most of the simulations depict the strongest warming signals. Unlike the CORDEX-CORE RCMs, the Eta simulations do not show enhanced warming over the Andes cordillera relative to the magnitude of the TNn changes in adjacent regions. As observed in TXx, the smallest change was found in the SES region, which presents values close to 0 in RCM simulations driven by NorESM1 and MIROC5 (Fig. 11).

Fig. 11

Same as Fig. 8 but for TNn [°C]

4 Discussion

This work evaluated two RCMs from the CORDEX initiative and the Eta Model simulations over the AR6 regions of South America. The assessment focused on four extreme climate indices (Rx5day, CDD, TXx, and TNn) and considered the historical simulations and the future simulations (2071–2099) under the RCP8.5 emission scenario. Two gridded products were used as reference: an observational-based data set from the NOAA Climate Prediction Center (CPC) and a new high-resolution data set retrieved from ERA5 reanalysis (AgERA5), both previously compared with surface observations.

Differences between gridded products and observations across SA were briefly addressed by comparing AgERA5 and CPC with in-situ observed climatologies. In this analysis, larger differences among data sets were observed in NWS, NSA, SAM, and SSA, coinciding with the most sparsely gauged regions. Even though in-situ records exist in those regions (Condom et al. 2020), they were not available for the reference period used in this study. CPC generally showed drier conditions, i.e., higher CDD and lower Rx5day values, south of 20°S and on the Pacific coast. Nevertheless, both products presented generally coherent spatial behavior across SA compared with the observations.

In the case of extreme temperature indices, TNn tended to be overestimated in the SES region, while TXx had a cold bias in SES and SSA in both reference products. The most remarkable differences were found in the Andes, presumably due to topography differences and, in the case of AgERA5, the poor representation of typical mountain meteorology processes (e.g., thermal circulations, Whiteman 2000), among other effects. Regional models may be evaluated more robustly for more recent periods, e.g., when driven by CMIP6 GCMs, owing to the increasing number of observations in SA (Condom et al. 2020).

Spatially averaged distributions in the historical period (1979–2005) showed a better representation than previous regional simulations (e.g., Carril et al. 2012; Carril et al. 2016). The Rx5day index was better represented in SWS and SSA, with a high agreement between reference products and regional simulations; nevertheless, large differences in interquartile range (IQR) and median arise for the other regions, especially in the NES region. REMO simulations tended to overestimate extreme precipitation (Fig. 4), and Eta simulated an IQR and median closer to those of the reference products in SA. Consequently, challenges remain in modeling precipitation extremes, in agreement with the literature (e.g., Solman and Blázquez 2019; Olmo and Bettolli 2021; Reboita et al. 2022).
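The comparison of medians and interquartile ranges can be sketched as follows for one region and one index. The series below are invented for illustration only, and the helper name `median_and_iqr` is hypothetical; the quartiles use linear interpolation between data points, which is one of several common conventions and is not necessarily the one used in the study.

```python
from statistics import median, quantiles


def median_and_iqr(values):
    """Median and interquartile range (Q3 - Q1) of an annual index series."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


# Hypothetical region-averaged annual Rx5day values [mm].
reference = [60.0, 72.0, 65.0, 80.0, 70.0, 75.0, 68.0]
simulated = [78.0, 95.0, 70.0, 110.0, 88.0, 84.0, 101.0]

ref_median, ref_iqr = median_and_iqr(reference)
sim_median, sim_iqr = median_and_iqr(simulated)

print(f"median bias: {sim_median - ref_median:+.1f} mm")
print(f"IQR ratio (simulated / reference): {sim_iqr / ref_iqr:.2f}")
```

A positive median bias together with an IQR ratio above 1 would correspond to a simulation that overestimates both the typical magnitude and the year-to-year spread of the index.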

Significant differences between RCMs and gridded products arose for the CDD index in the mean climatology and the interannual variability in all regions. The CDD climatology was well represented in SES by all RCMs. Generally, the Eta simulations were more similar to the reference products, and the CORDEX models behaved similarly to each other in the region (Figs. 4 and 5).

When comparing space-averaged simulations, the regional models could reproduce observed climatologies, as discussed in previous studies (Teichmann et al. 2021; Reboita et al. 2022). Nevertheless, the interannual distribution of the precipitation-based indices remained a significant challenge for the climate simulations, with large differences arising between GCMs forcing the same RCM and also between regional models (Fig. 5). This could be associated with the driving global climate models and their difficulty in simulating global teleconnections (e.g., Endris et al. 2016; Ratna et al. 2019; Kristóf et al. 2020; Hewitt et al. 2020) and/or the misrepresentation of local processes such as convection (e.g., Haberlie and Ashley 2019; Solman and Blázquez 2019; Bettolli et al. 2021), aspects that should be addressed in future climate model assessments.

Regarding temperature indices, the reference products exhibited different climatologies, with AgERA5 generally warmer in TNn and cooler in TXx than CPC, whereas the RCMs tended to agree more with AgERA5. Regional model simulations showed a high agreement in both temperature indices, with larger differences in SSA, SES, and NWS. TNn simulations presented a warm bias of ~ 10 °C in the southern part of SA. These biases were not present in the CPC and AgERA5 products and may be inherent to the regional simulations rather than caused by differences in topography. For example, the energy balance was shown not always to be better represented at finer resolution (Bozkurt et al. 2019). Simulated TXx presented both warm and cold biases relative to the reference products, as well as substantial variation in the IQR, especially in SES, NSA, and NES (Fig. 4). This feature could be associated with the GCM forcings for the Southeastern SA region, as previously documented in Vaurolo-Clarke et al. (2021).

Simulated trends in Rx5day differed significantly in magnitude within the region (Figure S3), with positive and negative signs of change over SA depending on the RCM, driving GCM, and reference product. This heterogeneity was previously documented with in-situ observations: e.g., Regoto et al. (2021) for Brazil (− 30 to 30 mm/decade); Cerón et al. (2021) in Colombia, Cerón et al. (2021) in La Plata Basin using CHIRPS (− 20 to 20 mm/decade). Other studies provided evidence of seasonal changes in the north of Chile, up to 10 mm/decade (Souvignet et al. 2012), and both positive and negative trends were reported by Schumacher et al. (2020) in Chile. Olmo et al. (2020) found similar spatial behavior for compound events of daily heavy precipitation in SSA by analyzing CPC and stations. The regional models did not reproduce the spatial heterogeneity of the Rx5day trends shown by the gridded products.
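As an illustration of how a trend in mm/decade and its sign agreement with a reference product could be obtained, the sketch below fits an ordinary least-squares line to an annual series. This estimator is an assumption made here for simplicity (this section does not state the trend method used, and non-parametric estimators such as Sen's slope are a common alternative); the function names and series are hypothetical.

```python
def decadal_trend(years, values):
    """Ordinary least-squares slope of `values` against `years`, per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return 10.0 * sxy / sxx


def same_sign(a, b):
    """True when two trends are both positive or both negative."""
    return a * b > 0


# Hypothetical annual Rx5day series [mm] for one region.
years = list(range(1981, 1991))
reference = [50.0 + 1.5 * (y - 1981) for y in years]  # steadily wetter
simulated = [70.0 - 0.8 * (y - 1981) for y in years]  # steadily drier

ref_trend = decadal_trend(years, reference)
sim_trend = decadal_trend(years, simulated)

print(f"reference: {ref_trend:+.1f} mm/decade, simulated: {sim_trend:+.1f} mm/decade")
print("sign agreement:", same_sign(ref_trend, sim_trend))
```

Here the simulated series has the opposite sign to the reference, which is the kind of disagreement discussed above for precipitation indices.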

The most reliable simulated trends in the Rx5day index, i.e., those with a high agreement between CPC, AgERA5, and most of the RCMs, were found in SSA, with a positive trend (Fig. 6, Table S1). All other regions presented diverging signs among the regional models and/or within the reference products. In the case of the CDD index, the SES and NES regions were consistent with the observed trends of longer maximum dry spells (Fig. 6, Table S2); only the RegCM runs reproduced the sign of the historical trends in SSA. The CDD trends had a different spatial pattern over SA between gridded products, RCMs, and driving GCMs. The reference products coincided the most in the trends over eastern NES (−) and western SES (+) (Figure S3). When averaging by region, this heterogeneity was smoothed into a high agreement on drying (Table S2); nevertheless, the spatial extent of the pattern differed widely among driving GCMs (opposite trends in the same RCM) and between the gridded precipitation products.

Although agreement between RCMs in precipitation indices is high, the Eta model shows more agreement in the sign of the trends over the averaged regions in South America (additional information in Table S1 and Table S2), albeit with biases in magnitude and spatial distribution (Figure S2 and Figure S3), as previously reported (Dereczynski et al. 2020; Reboita et al. 2022), in response to high internal variability across the different regions.

Regarding temperature indices, there was generally more agreement with observed trends, although with large differences in the sign of the trend in NSA, SAM, and the center of SWS for TXx, and multiple cooling cells in CPC not shown by AgERA5 (Figure S5 and Figure S6). CPC, AgERA5, and the RCMs showed strong agreement in TXx trends. Except for NWS and the eastern part of SES and SSA, all AR6 regions indicated warming trends. This warming trend pattern was well reproduced by most of the RCMs (Table S3 and Figure S5). For this index, REMO exhibits perfect agreement among its driving models all over the area. These findings add to previous results showing low biases in mean temperature simulated by this model (Solman 2016; Remedio et al. 2019). Similar results prevailed for TNn trends, where historical trends are generally well represented by the RCMs, except for the SAM and NSA regions, where four models simulated cooling trends. In particular, TNn trends in the RegCM model presented perfect region-averaged agreement all over South America (Table S4).

Concerning climate change projections, our analysis projected wetter conditions over large areas of South America, particularly over SES, NES, and NWS, where positive changes of Rx5day were detected in most RCMs, most robustly over SES. The latter was in agreement with slightly negative changes in CDD. The positive changes shown in this work are consistent with previous findings in the literature using different model experiments (Ortega et al. 2019; Donat et al. 2019; Blázquez and Silvina 2020; Olmo et al. 2022; de Medeiros et al. 2022). In contrast, the Eta simulations depicted important drying signals concentrated mainly over Brazil, in line with the results of Reboita et al. (2022) using an ensemble mean. On the other hand, negative precipitation changes were identified primarily in NSA but with different rates of change among simulations. In particular, longer dry spells (CDD) are projected over NES according to all RCM simulations, as already documented by Coppola et al. (2021), even though increases in Rx5day are expected in this region, indicating changes in precipitation distribution.

There is more confidence in the projections of the temperature indices than in those of the precipitation indices. For the annual maximum temperatures, strong (weak) signals of change were projected over the northern regions of SA (SES) by all RCM simulations, although the magnitude of change varied depending on the RCM and the driving GCM. The larger changes over the northern regions coincide with the results of Reboita et al. (2022) at a seasonal scale, which were based on the same regional simulations used in this study.

For TNn, the enhanced warming along the Andes mountain range projected by most RCMs is consistent with observed and modeled tendencies at other high-elevation sites (e.g., Niu et al. 2021; Pepin et al. 2022). However, in this work, this was not detected in the Eta simulations.

Although there is general agreement between models, the magnitudes of the change may be biased due to differences in climatological aspects previously documented in the literature (e.g., Solman 2016; Dereczynski et al. 2020; Teichmann et al. 2021). Statistical downscaling methods can serve as a complementary tool for obtaining more reliable magnitudes of change, helping to understand the projected signals in a climate change scenario (e.g., Xu et al. 2021).

Note that the changes found in this paper may differ locally in intensity from those illustrated in the recent IPCC Atlas (Gutiérrez et al. 2021), such as in the number of dry spells for the late 21st century. This might be due to the use of different RCM simulations in the IPCC model ensemble mean (including previous CORDEX runs but not the Eta simulations), whereas our results were analyzed for each RCM individually and focused only on CORDEX-CORE and Eta simulations.

5 Conclusion

The largest differences we found between RCMs and the observational products CPC and AgERA5 were in the areas with prominent surface observation scarcity: NWS, SAM, and NSA. Overall, RCMs were capable of representing climatologies of extremes and spatial variability. However, RCMs presented difficulties in representing the temporal aspects of the extremes described by the area-averaged time series, particularly for precipitation indices.

When analyzing trend agreement, RCMs frequently differed from the CPC and AgERA5 products in the sign of the trend of the precipitation indices. Consequently, significant uncertainties remain in the regional models for precipitation in South America, probably associated with a misrepresentation of mesoscale processes such as convection and circulation. A critical task for a better understanding is to improve the density of the in-situ observational network. In contrast, the temperature indices reproduced the observed trends across the region better than the precipitation indices, including positive and negative signals, except for SSA (for TXx and TNn) and SAM (for TNn).

Climate change projections for the 2071–2099 period presented different spatial behavior for Rx5day, with differences among simulations in the magnitude and direction of the expected change. An increase in extreme precipitation is expected in future projections over several regions: NES, NWS, SES, and some portions of SAM (southern Bolivia), NSA (west of Brazil), and SWS (northern Chile and the southern coast of Peru). Consecutive dry days (CDD) are expected to increase in SA, except for the southwest of SES, the south of SSA, the north of Chile and the coast of Peru, and the extratropical Andes in SWS (25–35° S). Temperature indices projected warmer conditions for TXx and TNn all over the area, especially along the Andes mountain range, with high agreement between the models.

Although the analyzed period is relatively short due to the historical RCM simulations evaluated, our results were congruent with other studies that examined longer periods. A remaining challenge in SA is evaluating extreme events using different regional model configurations, case study analyses at finer resolutions, and complementary observational products (e.g., satellite imagery).