1 Introduction

Hydrological disasters, such as floods, are the natural hazards that most affect people worldwide, causing major social and economic losses (CRED/UNDRR 2020). Climate change and the impact of human activities on land use may be changing the pattern of intense rainfall and consequent flooding, creating more variability in these phenomena (Blöschl et al. 2017). Flash floods, i.e., sudden local floods typically due to heavy and local rain, often occur in small mountainous rivers, where rainfall and river level monitoring may be sparse. The lack of monitoring occurs especially in developing countries and challenges disaster risk reduction. Even when an event is monitored, uncertainties in data measurement arise for several reasons: the small spatial and temporal scale of the rainfall event, the inability of the instrument to measure such an extreme event, the possibility that the event itself destroys the measuring instrument, and a rating curve (discharge versus stage) that may not be calibrated for such high discharge values (Benito et al. 2004; Brázdil et al. 2006; Borga et al. 2008; Gaume and Borga 2008; Corato et al. 2014; McMillan et al. 2018). To assist studies in unmonitored catchments, the International Association of Hydrological Sciences (IAHS) encouraged a decade of PUB (Predictions in Ungauged Basins), which started in 2003 (Sivapalan et al. 2003). The recommendations of the studies were to understand the landscape of the catchment, to analyse nearby catchments with similarities, to use models to describe the hydrological and hydraulic processes, and, later, to analyse the uncertainties involved in the simulation (Blöschl 2016).

Studies that perform hydrological and hydrodynamic reconstructions of past extreme hydrological events, such as historical floods, are useful to analyse and understand how hydrological and hydrodynamic processes behave under these extreme conditions (e.g. Tegos et al. 2022). Typically, historical flood reconstruction deals with a lack of data and uses different types of recorded data and evidence, such as flood marks, sediment deposits, documentary sources, newspaper reports, and data from rainfall and stream gauges (Benito et al. 2004; Himmelsbach et al. 2015; Zhang et al. 2018; Bomers et al. 2019). Since reconstructions of past events usually contain great uncertainties, studies often test scenarios and hypotheses of rainfall events (Balasch et al. 2011; Dimitriadis et al. 2016; Bomers et al. 2019).

Hydrologic and hydrodynamic models driven by rainfall data can estimate fundamental characteristics of a past flood, such as the flood extent, time to peak and maximum discharge. Hydrological models, i.e., rainfall-runoff models, are applied on a catchment scale, and hydrodynamic models are usually applied to simulate flood wave propagation on a river scale. Some reconstructions of past floods use only hydrologic modelling (e.g. Bürger et al. 2006), others use only hydrodynamic modelling (e.g. Remo and Pinter 2007; Klimes et al. 2014; Bomers et al. 2019; Bellos et al. 2020), and others use both types of modelling (e.g. Balasch et al. 2010, 2011; Velásquez et al. 2018; Vanelli et al. 2020; Zhang et al. 2021). Frequently, the only observed data available from past floods is the maximum flow depth, with no information on the river discharge or the event precipitation, which can be estimated using models (Bomers et al. 2019). HEC-HMS (USACE 2000) is a hydrologic model widely used for rainfall and flood event reconstruction (e.g. Balasch et al. 2010; Walega 2013; Zhao et al. 2018; Zhang et al. 2021). Examples of hydrodynamic models used in studies are HEC-RAS (USACE 2016) (e.g. Remo and Pinter 2007; Balasch et al. 2011; Klimes et al. 2014; Bomers et al. 2019), FLO-2D (O’Brien et al. 1993) (Haltas et al. 2016; Zhang et al. 2021) and iRIC (Nelson et al. 2016) (e.g. Irie et al. 2015; Shokory et al. 2016; Rai et al. 2018). Dimitriadis et al. (2016) compared three widely used hydraulic models with Monte Carlo simulation and found that the inflow discharge, channel roughness and floodplain roughness are the three input variables associated with the highest uncertainty in the results.

Hydrological and hydraulic models are commonly used in the assessment of flood risk, but the lack of input data in unmonitored areas increases the uncertainty of the results. Remote sensing can be useful to estimate input parameters, such as rainfall and soil moisture data (Poortinga et al. 2017). Probabilistic models can reduce uncertainty by estimating the probability of a flood occurring under different conditions. Common probabilistic methods used in flood risk assessment are Monte Carlo simulation (e.g. Apel et al. 2006; Dimitriadis et al. 2016) and Bayesian models (e.g. Gaume et al. 2010; Roslin et al. 2018).

On 5 January 2017, an extreme hydrometeorological event occurred in the region of the municipality of Rolante, southern Brazil, which triggered landslides and a flood (SEMA/GPDEN 2017; Zanandrea et al. 2019; Cardozo et al. 2021). Heavy rainfall occurred in the mountainous region of the basin, where landslides were triggered, and floods occurred in the downstream plain, where the urban area of Rolante is located. The municipality of Rolante suffered economic losses of around 22 million US dollars due to a flash flood event, which was a historical phenomenon of atypical magnitude for the region, suggesting that a landslide dam break could have occurred (SEMA/GPDEN 2017). Understanding what led to this flood of great relevance to the region is a challenge because this basin was poorly monitored. The Rolante basin presented no official and reliable quantitative data on the event precipitation and river flow (SEMA/GPDEN 2017).

This study aimed to build a hydrological and hydrodynamic reconstruction of the flood event of January 2017 in the region of Rolante, a town in southern Brazil, using modelling and unofficially collected data to test different rainfall scenarios and understand what contributed to this flood. This study applied an innovative approach to the hydrological and hydrodynamic reconstruction and understanding of a flash flood, based on data collected by residents, since there was no official monitoring of the flood event. The goal is to discover whether heavy rainfall alone could generate a flood of this magnitude or whether another phenomenon is needed to explain it. Some residents hypothesised that the landslides triggered by the rainfall dammed the river and that the dam later collapsed, causing a flash flood, but this rumour was never confirmed. If the rainfall alone cannot explain this quick flood event, a study of landslide dam breaks should be carried out to test this hypothesis. This rainfall and flood reconstruction can contribute to risk mitigation and disaster management in other poorly monitored areas. Learning from the past is essential to preventing future disasters.

2 Methods

2.1 Study area

The rainfall and flood event happened in the basin of the Rolante River, with an area of 771 km2, located between the municipalities of Rolante, Riozinho and São Francisco de Paula, in southern Brazil (Fig. 1). The confluence of the Mascarada River and Riozinho stream forms the Rolante River. After heavy rainfall in January 2017, several landslides occurred in the northeast region of the drainage basin, in the Mascarada sub-basin (Cardozo et al. 2021). Downstream of the landslide region, the urban area of Rolante, with around 16,000 inhabitants, flooded.

Fig. 1 Location map of the study area: Rolante River basin, Brazil

The Rolante basin covers 771 km2, with elevations from 25 to 997 m. The land use is 53% Atlantic Forest, 20% agriculture, 15% silviculture and 2% urban area. The geology is mainly basalt in the medium and high lands, and sandstone in the lowlands. The climate is humid subtropical. The annual rainfall is well distributed throughout the year, with an average of 1500 mm/year. January has an average rainfall of 180 mm. The rainfall event that happened on 5 January 2017 lasted approximately 4 h and presented accumulated rainfall similar to that accumulated throughout the month (SEMA/GPDEN 2017). In this basin, precipitation can be concentrated on the hillslope due to the orographic effect caused by the large difference in elevation between the top of the basin and the outlet.

2.2 Rainfall event and flood of January 5th, 2017

The intense rainfall event on 5 January 2017 triggered more than 400 landslides (Cardozo et al. 2021) in the study basin. In the zone where the landslides occurred, the valley is confined, with elevations ranging from 200 up to almost 1000 m and slopes up to 85° (Zanandrea et al. 2020). The local community reports that there was temporally and spatially concentrated rainfall of high volume in the mountain area, but there were no official rainfall gauges to measure the precipitation where the landslides occurred (SEMA/GPDEN 2017). Around 23 km downstream from the landslide region, a flood occurred in the town of Rolante, a floodplain area.

The basin was ungauged, so there was no direct measurement of the rainfall and discharge of the event. Censi and Michel (2018) estimated the rainfall data with the remote sensing product GPM IMERG by comparing precipitation to the closest gauged basins. GPM IMERG did not represent the spatial distribution of the rainfall well and underestimated the precipitation volume. According to residents’ reports, the intense rainfall occurred during the afternoon of 5 January 2017, from 2 to 6 p.m., and the flood started around 7 p.m. in the town of Rolante. The peak discharge occurred around 11 p.m. and the flood started to recede at around 12 a.m. in the town (SEMA/GPDEN 2017) (Fig. 2). Images of the event are presented in Fig. 3, showing the flooded town of Rolante, located in the lower part of the basin (Fig. 3a), and the upstream mountainous region with more than 400 landslides (Fig. 3b).

Fig. 2 Timeline of the rainfall event and the Rolante River flood

Fig. 3 Consequences of the rainfall event: a flood in the town of Rolante (photo from the civil defense); b landslides

Such a flood event was of great magnitude compared to the history of the region. Residents suggested that a dam break could have caused such a flood, but no artificial dam breaks were reported, so a landslide dam break was suggested. In this study, a computational reconstruction of the rainfall event was conducted to verify whether this flood could happen only with intense rainfall or if another natural phenomenon is needed to explain the flood.

2.3 Simulation: hydrologic and hydrodynamic modelling

The reconstruction of the rainfall and flood event was conducted in two stages, with hydrologic and hydrodynamic modelling (Fig. 4).

Fig. 4 Steps of hydrologic and hydrodynamic modelling in the rainfall event reconstruction

2.3.1 Rainfall-runoff simulation

To determine the river discharge caused by the rainfall event, the Hydrologic Modelling System (HEC-HMS), version 4.2.1, (USACE 2000) was applied in the Rolante drainage basin. Hydrologic modelling first went through the calibration process using data of four past rainfall events, followed by validation, and subsequent simulation of the rainfall event under study. The goal was to define the hydrograph at different locations throughout the basin. The furthest monitored downstream location is at the stream gauge Mascarada, which is upstream from the town of Rolante. Consequently, only the locations upstream from the stream gauge could be calibrated and validated. This stream gauge started monitoring the region in 2017, after the rainfall event.

HEC-GeoHMS, a GIS-integrated tool available in ArcGIS, was used to define the physical characteristics of the basin. The characteristics were derived from a DEM with 12.5-m resolution. Subsequently, these data were entered into HEC-HMS to describe the basin. The main input to the simulations was the rainfall that generates runoff in the basin. The methods chosen to represent the sub-basins and rivers are shown in Table 1, defining initial conditions such as initial soil moisture. All five parameters shown in Table 1 were calibrated. The Thiessen Polygon method was used to determine the meteorological model. The flow from the mountainous region where the landslides occurred is estimated to reach the centre of the town after around 4.6 h.

Table 1 Hydrologic methods chosen and the five parameters calibrated in the HEC-HMS model

The only station with an available rating curve (discharge vs. river stage) adjusted in the region is the Mascarada gauge station, located in the largest sub-basin of the Rolante basin. Therefore, only the sub-basins located upstream from this gauge station could be properly calibrated, representing an area of 311 km2. Two other large sub-basins located in Rolante that could not be calibrated are the Riozinho sub-basin and the Areia sub-basin, shown in Fig. 1, with respective areas of 125 and 151 km2. To deal with the lack of data for calibration, their parameters were defined according to the validated values of the Mascarada sub-basin, using correlations found in the calibrated parameters. Therefore, the Mascarada gauge station was used to calibrate and validate parameters in the Mascarada sub-basin and to approximate the parameters for other sub-basins.

The HEC-HMS model was used to simulate discrete rainfall events. The Nash–Sutcliffe efficiency coefficient (NSE) (Nash and Sutcliffe 1970) and the percentage bias (PBIAS) (Gupta et al. 1999) were used to evaluate the calibration and validation of the model. For discharge, NSE values greater than or equal to 0.5 and PBIAS values within ±25% are considered satisfactory (Moriasi et al. 2007).
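
The two goodness-of-fit metrics above can be illustrated with a minimal sketch. It assumes the PBIAS sign convention of Moriasi et al. (2007), in which positive values indicate underestimation; the function names and the sample hydrograph values are hypothetical and are not taken from the study.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pbias(observed, simulated):
    """Percentage bias; positive means the model underestimates (assumed convention)."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)

def is_satisfactory(nse_value, pbias_value):
    """Thresholds quoted in the text: NSE >= 0.5 and |PBIAS| <= 25%."""
    return nse_value >= 0.5 and abs(pbias_value) <= 25.0

# Hypothetical observed and simulated discharges (m3/s) for one event
q_obs = [10.0, 40.0, 120.0, 80.0, 30.0]
q_sim = [12.0, 35.0, 110.0, 85.0, 28.0]
print(nse(q_obs, q_sim), pbias(q_obs, q_sim), is_satisfactory(nse(q_obs, q_sim), pbias(q_obs, q_sim)))
```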

The model was validated with the average of the calibrated parameters for each sub-basin. A challenge was to determine the initial abstraction parameter of the sub-basins for validation, as it varies greatly according to the soil moisture at the beginning of the rainfall event. Thus, the initial abstraction parameter was defined from an analysis of the rainfall preceding the calibrated events. A statistical analysis was performed to identify which Antecedent Precipitation Index (Kohler and Linsley 1951) (for 12 h, 1 day, 3 days, 5 days, 7 days, or 10 days) showed the best correlation with the calibrated initial abstraction. The 7-day Antecedent Precipitation Index had the best correlation and was used to define the initial abstraction for each event used for validation. Finally, after three validation events with satisfactory error values (NSE and PBIAS), it was considered that the rainfall-runoff simulation of the Mascarada sub-basin had satisfactory parameter values to simulate the rainfall event under study.
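
The window selection described above can be sketched as follows. This is only an illustration under stated assumptions: the index is written in a common decay-weighted form, API = Σ k^i P_i over the n preceding days, with a hypothetical decay constant k = 0.9; the exact formulation, the 12-h window and the data used in the study are not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def antecedent_precipitation_index(daily_rain_mm, n_days, k=0.9):
    """Decay-weighted sum of the last n_days of rainfall (most recent day last).

    Assumed form: API = sum_{i=1..n} k**i * P_i, where P_1 is the day just
    before the event. k = 0.9 is a hypothetical decay constant.
    """
    recent = np.asarray(daily_rain_mm, dtype=float)[-n_days:][::-1]
    weights = k ** np.arange(1, len(recent) + 1)
    return float(np.sum(weights * recent))

def best_window(rain_histories, initial_abstractions, windows=(1, 3, 5, 7, 10), k=0.9):
    """Return the window whose API correlates best (in absolute value) with
    the calibrated initial abstraction, plus all scores."""
    scores = {}
    for n in windows:
        api = [antecedent_precipitation_index(h, n, k) for h in rain_histories]
        if np.std(api) == 0.0:  # constant index: correlation is undefined
            scores[n] = 0.0
            continue
        scores[n] = abs(np.corrcoef(api, initial_abstractions)[0, 1])
    return max(scores, key=scores.get), scores
```

With only a handful of calibrated events, as here, such correlations rest on very few points, so the selected window is best read as indicative.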

The unofficial precipitation data from farmers were entered to determine the hydrographs of the event of 5 January 2017 at different sites in the basin. Unofficial data were the best viable alternative, given that the Rolante basin had no monitoring and remote sensing imagery also did not adequately estimate the rainfall in this event. Defining the spatial and temporal resolution of the rainfall was a challenge, as there was little official data measured in the region. The closest official gauge was in another basin near the Rolante basin and measured about 50 mm of precipitation. SEMA/GPDEN (2017) collected unofficial rainfall data from seven rural rainfall gauges kept by local farmers, in which the highest value found was 272 mm accumulated on the day of the event. For this reason, these unofficial data from farmers were used to determine the spatial distribution of rainfall. The Thiessen polygons for the unofficial farmers’ rainfall gauges were determined, and these unofficial polygons were merged with the Thiessen polygons previously used in the calibration and validation of the model (Fig. 5). Afterwards, weights were defined according to the areas of the polygons.
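
As a rough illustration of the area weighting, the sketch below computes Thiessen weights and the resulting areal rainfall for one sub-basin. The polygon areas are hypothetical; only the 50 mm and 272 mm gauge totals come from the text, and pairing them in a single sub-basin is an assumption made for the example.

```python
def thiessen_areal_rainfall(polygon_areas_km2, gauge_rain_mm):
    """Area-weighted mean rainfall from Thiessen polygons.

    polygon_areas_km2: area of each gauge's polygon inside the sub-basin.
    gauge_rain_mm: rainfall recorded at the corresponding gauge.
    """
    total_area = sum(polygon_areas_km2)
    weights = [area / total_area for area in polygon_areas_km2]
    areal_rain = sum(w * p for w, p in zip(weights, gauge_rain_mm))
    return areal_rain, weights

# Hypothetical polygon areas; gauge totals quoted in the text
rain, weights = thiessen_areal_rainfall([100.0, 300.0], [50.0, 272.0])
print(weights, rain)
```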

Fig. 5 Official and unofficial gauges were used to delimit Thiessen polygons and the spatial distribution of rainfall

The temporal distribution of the farmers’ data was not available, since they recorded only the precipitation total for the entire day. Therefore, hypothetical rainfalls were created according to the Huff distribution (Huff 1990), which provides four hypothetical temporal distributions of rainfall, each with the highest rainfall intensity in a different quartile of the event duration. For instance, a rainfall whose intensity peaks in the last quarter of its duration follows the 4th-quartile Huff distribution. Figure 6 shows scenarios of the rainfall event under study in the mountainous region, where the rainfall was the most intense.
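
The disaggregation of a daily total into a hyetograph can be sketched with a dimensionless cumulative curve. The curve below is purely illustrative, chosen so that most rain falls in the last quarter of the event; it is not the tabulated Huff (1990) 4th-quartile curve, and the function name and time step are hypothetical. The 272 mm total and 4 h duration are the values quoted in the text.

```python
import numpy as np

# Illustrative cumulative curve (fraction of duration, fraction of rainfall).
# NOT the tabulated Huff (1990) values; it only mimics a late-peaking storm.
ILLUSTRATIVE_4Q_TIME = [0.0, 0.25, 0.50, 0.75, 1.0]
ILLUSTRATIVE_4Q_RAIN = [0.0, 0.10, 0.25, 0.45, 1.0]

def disaggregate_daily_total(total_mm, duration_h, cum_time_frac, cum_rain_frac, n_steps=8):
    """Split an event total into n_steps increments following a cumulative curve.

    Returns the end time of each step (h) and the rainfall in each step (mm).
    """
    edges_h = np.linspace(0.0, duration_h, n_steps + 1)
    cumulative_mm = total_mm * np.interp(edges_h / duration_h, cum_time_frac, cum_rain_frac)
    return edges_h[1:], np.diff(cumulative_mm)

times_h, rain_mm = disaggregate_daily_total(272.0, 4.0, ILLUSTRATIVE_4Q_TIME, ILLUSTRATIVE_4Q_RAIN)
for t, p in zip(times_h, rain_mm):
    print(f"{t:4.1f} h  {p:6.1f} mm")
```

Swapping in a curve that rises fastest in the third quarter would give the corresponding 3rd-quartile scenario with the same total.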

Fig. 6 Four scenarios of rainfall temporal distribution based on Huff (1990) for the mountainous region. Rainfalls concentrated in the 3rd and 4th quartile are closer to the report of residents

According to the reports of the local community, the rainfall lasted approximately 4 h and slowly became heavier. Therefore, the scenarios closest to the observations were taken to be those with rainfall concentrated in the 3rd quartile and the 4th quartile of the Huff distribution, i.e., with high intensity near the end of the event. Three rainfall event scenarios were thus simulated: one with the few official data available, called the “Official” scenario, and two with the farmers’ data and different temporal distributions (“Unofficial—3Q” and “Unofficial—4Q” scenarios) (Table 2). The hydrographs resulting from the three HEC-HMS scenarios were used as inputs in the hydrodynamic simulation to analyse the extent of the flood caused by this rainfall event.

Table 2 Scenarios of simulated rainfall events

2.3.2 Hydrodynamic simulation

The Nays2D Flood model, from the International River Interface Cooperative (iRIC) (Nelson et al. 2016), was used to propagate the hydrograph flow downstream of the Mascarada gauge. The main input data in the model were the topobathymetric information, Manning roughness coefficient, and the hydrograph (Table 3).

Table 3 Input data used in the Nays2D flood model

The topography data entered is a 5-m resolution DEM in TIFF format. The DEM was created by joining the DEM of the Mascarada River sub-basin (acquired by the DigitalGlobe—WorldView satellite from AW3D, with a resolution of 1 m), with the DEM that covers the entire basin area (generated by the ALOS—PALSAR satellite, with 12.5-m resolution). Therefore, a manipulation of the images was conducted to combine them, with the resulting DEM having a pixel representing an area of 5 × 5 m.

The grid x and y sizes were defined as 5 m. The input hydrograph and the slope at the location where the simulation started were entered in the model. Finally, the calculation conditions were defined. The constrained interpolation profile (CIP) finite difference method was chosen for calculations, as it simulates steady flows with great precision, stability and with relatively long time steps. As boundary conditions, the downstream water surface was set as free flow and the initial water surface was set to zero depth. The computational time step was set to 0.05 s, a value at which the calculation converged.

Simulations were performed for the three rainfall intensity scenarios (Table 2). The simulations of each scenario took between 7 and 10 days to generate the results, depending on the data processing capacity of the computer used. Finally, the simulation results were compared with the flood data recorded in the field. The parameters analysed were the maximum water depth and the flood extent regardless of time, to evaluate the maximum flood reached by this event in different locations.

SEMA/GPDEN (2017) collected data on the flood a few days after the event by recording the GPS location of 61 flood marks and defining the boundary of the flood extent along a 20 km river path. The discharge was not measured during the event and the only posterior evidence was flood marks, which indicated the boundaries of the flood extent. Estimating the observed flood extent was challenging because transforming points into an area involves uncertainties due to topography, hydraulics and DEM resolution. Consequently, the flood extent was estimated with two approaches. The first approach estimated the flood extent using the HAND model (Rennó et al. 2008) with the points collected in the field. The second approach interpolated the points collected in the field with the ArcGIS “Topo to Raster” tool, which estimated the flood extent area, and then used the “Raster Calculator” tool to estimate the depth in each pixel by subtracting the DEM from the interpolated surface. These two approaches were called “HAND flood extent” and “point-interpolation flood extent”, respectively. These flood extents are considered the observed flood extent of the event and were used to evaluate the performance of the simulation results in two types of analyses: maximum flood depth and maximum flood extent.
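
The raster subtraction in the second approach can be sketched with NumPy arrays standing in for the rasters. This is only an analogue of the ArcGIS “Raster Calculator” step, not the tool itself; the function names, the treatment of cells outside the interpolated surface as NaN, and the sample values are assumptions.

```python
import numpy as np

def flood_depth(water_surface_m, dem_m):
    """Depth per cell = interpolated water surface minus terrain elevation.

    Cells where the surface lies below the terrain, or where the surface is
    undefined (NaN), are treated as dry (depth 0).
    """
    depth = np.asarray(water_surface_m, dtype=float) - np.asarray(dem_m, dtype=float)
    return np.where(depth > 0.0, depth, 0.0)

def flooded_area_m2(depth_m, cell_size_m=5.0):
    """Flooded area as the number of wet cells times the cell area."""
    return float(np.count_nonzero(depth_m > 0.0)) * cell_size_m ** 2

# Hypothetical 2 x 2 rasters (elevations in m)
surface = np.array([[10.0, 10.0], [10.0, np.nan]])
terrain = np.array([[8.0, 11.0], [10.0, 5.0]])
depth = flood_depth(surface, terrain)
print(depth, flooded_area_m2(depth))
```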

Percentage bias (PBIAS) and the coefficient of determination (R2) were used to assess the maximum depth at 1000 random points. The closer to PBIAS = 0 and R2 = 1, the better the performance. An adapted ROC curve (Fawcett 2006) and Cohen’s Kappa Index (Cohen 1960; Guzzetti et al. 2006) were used to evaluate the flood extent. For the adapted ROC curve, the closer to 100% POD (Probability of Detection) and 0% POFD (Probability of False Detection), the better the performance. For Cohen’s Kappa Index, the closer to the value 1, the better the performance.
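
The flood extent metrics can be sketched from a cell-by-cell comparison of two wet/dry masks. The sketch uses the standard confusion-matrix definitions of POD, POFD and Cohen’s Kappa; how the study rasterised and masked the extents is not reproduced, and the function name and sample masks are hypothetical.

```python
import numpy as np

def extent_metrics(simulated_wet, observed_wet):
    """POD, POFD and Cohen's Kappa from boolean wet/dry masks of equal shape."""
    sim = np.asarray(simulated_wet, dtype=bool)
    obs = np.asarray(observed_wet, dtype=bool)
    tp = int(np.sum(sim & obs))    # wet in both
    fp = int(np.sum(sim & ~obs))   # simulated wet, observed dry
    fn = int(np.sum(~sim & obs))   # simulated dry, observed wet
    tn = int(np.sum(~sim & ~obs))  # dry in both
    n = tp + fp + fn + tn
    pod = tp / (tp + fn)           # Probability of Detection
    pofd = fp / (fp + tn)          # Probability of False Detection
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_chance) / (1.0 - p_chance)
    return pod, pofd, kappa

# Hypothetical masks for eight cells
sim_mask = [1, 1, 0, 0, 1, 0, 0, 0]
obs_mask = [1, 1, 1, 0, 0, 0, 0, 0]
print(extent_metrics(sim_mask, obs_mask))
```

Because POFD is normalised by the number of dry cells, a large dry domain keeps it low for almost any simulation, which is worth bearing in mind when comparing scenarios.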

3 Results

3.1 Rainfall-runoff simulation

The model was automatically calibrated against the discharges at the Mascarada gauge for four other precipitation events. All four calibration events and the three validation events presented hydrographs with satisfactory Nash–Sutcliffe (NSE) and percentage bias (PBIAS) values according to Moriasi et al. (2007). The mean of these simulation performances was NSE = 0.91 and PBIAS = 5.70% for calibration and NSE = 0.76 and PBIAS = 2.27% for validation events. It was considered that the basin was properly calibrated and validated up to the point of the Mascarada gauge station.

The estimated hydrograph at the Mascarada gauge for the January 2017 event is shown in Fig. 7. This hydrograph presents three scenarios: the scenario with the official rainfall, the scenario with unofficial rainfall concentrated in the 3rd quartile of the Huff distribution (3Q rainfall distribution scenario) and the scenario with unofficial rainfall concentrated in the 4th quartile of the Huff distribution (4Q rainfall distribution scenario). At the Mascarada gauge, the maximum discharge was around 900 m3/s at 6:45 p.m. for the unofficial rainfall scenarios and around 250 m3/s at 10:30 p.m. for the official rainfall scenario.

Fig. 7 Hydrographs resulting from the HEC-HMS model of the 2017 event at the Mascarada gauge for three different scenarios

The estimated hydrographs for the non-validated locations downstream of the Mascarada gauge (Riozinho sub-basin and Areia sub-basin) are shown in Fig. 8. The urban area of the town of Rolante lies in the region where the Areia stream converges into the Rolante River. At this confluence, the main river presented a maximum discharge of around 1300 m3/s at 9:20 p.m. for the unofficial rainfall scenarios, and around 500 m3/s at 10:30 p.m. for the official rainfall scenario. The main values of peak discharge and time to peak are presented in Table 4.

Fig. 8 Hydrographs resulting from the HEC-HMS model of the 2017 flood event for three different scenarios: a just after the Riozinho stream converges into the Rolante River; b just after the Areia stream converges into the Rolante River, where the town is located

Table 4 Time and peak discharge of the three simulated scenarios and the time reported by residents

3.2 Hydrodynamic simulation

The maximum flood depth resulting from the Nays2D Flood simulation for both unofficial rainfall scenarios was overestimated, whereas it was underestimated for the official rainfall scenario compared to data collected in the field. The PBIAS and R2 depth metrics for the different scenarios are shown in Table 5.

Table 5 Maximum simulated flood depth for three scenarios and their performance

The second analysis of the simulated scenarios was the flood extent. In this case, the results of the simulations were compared with the two areas considered as observed in the field: “HAND flood extent” and “point interpolation flood extent”. Therefore, two adapted ROC curves were analysed, as shown in Fig. 9.

Fig. 9 Adapted ROC curve for simulated flood extent of three scenarios compared to two approaches to estimate observed flood extent: a “point interpolation flood extent”; b “HAND flood extent”

Another metric used to analyse the performance of the simulations and to compare with the adapted ROC curves was Cohen’s Kappa index. Table 6 shows both metrics for evaluating the different simulated scenarios against the “point interpolation flood extent” and the “HAND flood extent”. For the ROC analysis, the shorter the Euclidean distance from the point of perfect performance, the better the performance.
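
The distance criterion for the ROC analysis can be written out explicitly. The expression below is a sketch that assumes the perfect-performance point is POD = 1 and POFD = 0, with both quantities expressed as fractions rather than percentages; the symbol d is introduced here for illustration only.

```latex
d = \sqrt{\left(1 - \mathrm{POD}\right)^{2} + \mathrm{POFD}^{2}}
```

Under this assumption, d = 0 corresponds to a simulation that detects every observed flooded cell without any false detection, and larger values indicate poorer agreement.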

Table 6 Metrics of ROC and Kappa Index to evaluate the simulation performance of flood extent

As an example of a good performance scenario, the result of the simulation of the 4Q rainfall distribution scenario is shown in Fig. 10.

Fig. 10 Result of the simulated flood extent of the “4Q Rainfall distribution” scenario, with details at the beginning of the town of Rolante

The simulation results are subject to modelling constraints related to the assumptions made to reconstruct the rainfall event. Examples of modelling constraints include data availability and the chosen boundary conditions. A parameter that greatly modifies the results of the hydrodynamic simulations is the channel and floodplain roughness (Dimitriadis et al. 2016). This reconstruction aimed to reduce uncertainties associated with modelling constraints through model calibration and validation of several parameters, including terrain factors such as Manning roughness. The modelling results of the unofficial scenarios are supported by the analysis of a video recorded during the peak flood at the Mascarada gauge. Further simulation uncertainties are discussed in Sect. 4.3.

4 Discussion

4.1 Rainfall-runoff simulation

The hydrograph analysis of the unofficial rainfall scenarios at the Mascarada station shows that the “3Q rainfall distribution” scenario presented a peak of 907 m3/s at 6:30 p.m. and the “4Q rainfall distribution” scenario presented a peak of 950 m3/s at 7:00 p.m. (Fig. 7). A flow of this order of magnitude also matches the discharge estimated from a video recorded during the event by a resident. As expected, the scenario with the later rainfall intensity peak (“4Q rainfall distribution”) generated a later and higher discharge peak, since the soil was wetter when the rain was heaviest. The hydrograph of the official rainfall scenario presented a much lower discharge, with a maximum value of 251 m3/s at 10:20 p.m., which is roughly a quarter of the value of the “4Q rainfall distribution” scenario and more than 3 h later, demonstrating a significant difference in time and magnitude. The analysis of these hydrographs is for the Mascarada station, where the rainfall-runoff simulation was calibrated and validated and, therefore, the result is more reliable than for the downstream hydrographs that are discussed below.

The Riozinho sub-basin contributed to an increase in the discharge of the Rolante River by 14% in the unofficial rainfall scenarios and by 40% in the official rainfall scenario. Similarly, the Areia sub-basin contributed to an increase in the discharge of the Rolante River by 22% in unofficial scenarios and 35% in the official scenario. This demonstrates how the rainfall was concentrated in regions where there was no official measurement of rainfall. This shows the contribution of the unofficial scenarios created according to farmers and residents. The lack of official measurements in this basin challenged the understanding of this precipitation event since the rainfall was concentrated in the ungauged Mascarada sub-basin.

The residents reported that the flood peaked in the town of Rolante around 11 p.m. to 12 a.m. (SEMA/GPDEN 2017). In the simulated unofficial scenarios, the maximum river stage occurred just after 9 p.m. and the level started to decrease around 10 p.m. In the official scenario, the flood peaked in the town at 10:30 p.m., which is closer to the time reported by the local community. The unofficial rainfall scenarios represented the maximum flow volume better, although the time of peak flood of the official rainfall scenario is more consistent with what was observed by residents in the town. However, a video recorded by the residents indicated that the river reached its maximum level at the Mascarada gauge around 5 p.m. The unofficial rainfall scenarios indicated that the peak discharge at the Mascarada gauge would have occurred around 7 p.m., whereas in the official rainfall scenario it would have occurred after 10 p.m. This indicates that the unofficial scenarios could better represent the spatial and temporal distributions of rainfall.

According to the Kirpich equation, the time of concentration from the mountainous region to the town centre is 4.6 h. This parameter was used in the simulation, which resulted in a time to peak of around 5.3 h, still 2 h earlier than reported. The 2-h discrepancy in the timing of the flood peak in the town of Rolante in the unofficial rainfall scenarios may be due to the simplifications of the rainfall-runoff model and the estimation of input data and initial conditions. It could also be explained by a possible landslide dam holding water for a while before failure, which would delay the flood. Locals raised this hypothesis, but it was not confirmed. Finally, another possibility is that approximating the temporal distribution of rainfall from the reports of residents generated these uncertainties.
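
For reference, the Kirpich time of concentration can be sketched as below, using the commonly quoted SI form of the equation (flow length in metres, slope in m/m, result in minutes). The flow length and slope in the example are hypothetical placeholders, not the values used in the study, so the result is not expected to reproduce the 4.6 h quoted above.

```python
def kirpich_time_of_concentration_h(flow_length_m, slope_m_per_m):
    """Kirpich time of concentration in hours.

    Commonly quoted SI form: tc [min] = 0.0195 * L**0.77 * S**-0.385,
    with L the flow length in metres and S the mean slope in m/m.
    """
    tc_min = 0.0195 * flow_length_m ** 0.77 * slope_m_per_m ** -0.385
    return tc_min / 60.0

# Hypothetical channel: 23 km long with a 3% mean slope
print(kirpich_time_of_concentration_h(23_000.0, 0.03))
```

Longer flow paths and gentler slopes both increase the estimate, which is why the lowland reach near the town is expected to dominate the travel time.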

4.2 Hydrodynamic simulation

The hydrodynamic simulation was conducted by propagating the hydrograph from the Mascarada gauge to the town of Rolante, over a distance of 9 km. The scenarios showed similar results for the maximum flood depth resulting from the hydrodynamic simulations, but the best correlation was found for the unofficial rainfall scenarios. The values of the coefficient of determination (R2) were close to 0.4 for all scenarios. According to PBIAS, the unofficial scenarios overestimated the depths, whereas the official scenario underestimated the depth. The low coefficient of determination (R2) may be related to the fact that the only available DEM for the urban area of Rolante has a resolution of 12.5 m. This resolution hides the details of the topography, representing a flatter terrain by misrepresenting occasional depressions and elevations. Consequently, this is a source of errors when analysing the flood depth. In summary, a DEM with 12.5-m resolution represented 83% of the study area, and a DEM with 1-m resolution, which could appropriately describe the terrain for this depth analysis, represented only 17% of the basin.

Regarding the simulated flood extent and the ROC curve analyses, the unofficial “3Q rainfall distribution” scenario performed slightly better than the others. The Kappa Index indicated that the official scenario presented a slightly better performance, but all scenarios showed a similar performance for the flood extent. This happened because the false positive result (Probability of False Detection) was low for all scenarios and pulled the metric towards a similar result in all of them. A low false positive value is not conclusive on its own, since the flood extent grows outwards from the river channel. For example, a tiny flood extent results in a low false positive value, but it does not mean that this was a good scenario performance. The true positive result (Probability of Detection) was better for.

In the analysis of the adapted ROC curves based on two estimations of the observed flood extent, the simulations presented a better correlation with the “point interpolation flood extent” estimate because it presented higher values of correct detection (Probability of Detection) (Fig. 9a). The observed “point interpolation flood extent”—in red in Fig. 10—presented more terrain details than the observed “HAND flood extent”—in black. This means that the “point interpolation flood extent” represented small areas that were not flooded due to high elevation (islands). The simulations made with the Nays2D Flood model also presented these details of locations that did not flood because they were elevated, forming small islands within the flood. This precision in the details may explain the better performance of the adapted ROC curve of the “point interpolation flood extent” compared to the performance of the “HAND flood extent”. Nevertheless, both approaches used to calculate the observed flood extent are estimations and cannot be considered as measured in the field. Both estimations are discussed in this study because there were no observed flood extent data collected from the field.
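The behaviour of the flood-extent metrics discussed above can be sketched with a small contingency-table example in Python. This is a minimal illustration, not the procedure applied in this study: the function name and the ten-cell maps are hypothetical, and the sketch assumes that both maps contain flooded and dry cells so that no denominator is zero.

```python
def contingency_metrics(observed, simulated):
    """Return (POD, POFD, Kappa) for two equal-length sequences of 0/1 flags.

    A value of 1 marks a flooded cell and 0 marks a dry cell.
    """
    pairs = list(zip(observed, simulated))
    hits = sum(1 for o, s in pairs if o and s)
    misses = sum(1 for o, s in pairs if o and not s)
    false_alarms = sum(1 for o, s in pairs if not o and s)
    correct_negatives = sum(1 for o, s in pairs if not o and not s)
    n = len(pairs)

    pod = hits / (hits + misses)                              # true positive rate
    pofd = false_alarms / (false_alarms + correct_negatives)  # false positive rate

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_observed = (hits + correct_negatives) / n
    p_chance = ((hits + misses) * (hits + false_alarms)
                + (correct_negatives + false_alarms) * (correct_negatives + misses)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return pod, pofd, kappa


# Hypothetical maps: four of ten cells flooded in the observed extent,
# but the simulation floods only one cell next to the channel.
observed_map = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
tiny_simulated_map = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

pod, pofd, kappa = contingency_metrics(observed_map, tiny_simulated_map)
print(pod, pofd, kappa)
```

In this hypothetical case the Probability of False Detection is zero even though the simulation misses three quarters of the flooded cells, which is why the false positive rate alone cannot rank the scenarios and the Probability of Detection has to be examined as well.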

The results of the unofficial scenarios showed that the ordinary response of this basin to intense rainfall in the mountainous region can cause a flood of great magnitude downstream in the floodplain area, where the town of Rolante is located. Therefore, intense rainfall alone may have caused the flood that occurred in January 2017 in the town of Rolante. The reports of the local community were essential to reconstruct the rainfall and flood. The rumours that a landslide dam break caused the flood indicated that the rainfall was concentrated only in the mountainous region, where only a few people lived and reported heavy rain. This rainfall and flooding were unusual for the local population, which may indicate a change in the seasonal pattern of the local rainfall.

4.3 Uncertainties and limitations of the event reconstruction

This study used a two-step modelling methodology to reconstruct the flood caused by a rainfall event. The hydraulic methods were adapted to deal with the lack of observed data. There are many uncertainties and limitations associated with each method applied to reconstruct the event. The rainfall-runoff simulations went through calibration and validation before simulating scenarios of the rainfall event under study. However, the only sub-basin with a gauge and available discharge data was the Mascarada sub-basin, which is the largest sub-basin of the Rolante basin. Studies suggest relying on data from neighbouring gauged basins for the model setup of ungauged basins (Blöschl 2016). Therefore, the parameters for the other sub-basins were estimated based on the parameters calibrated for the Mascarada sub-basin. Although this estimation was made from a neighbouring sub-basin that presents similar environmental characteristics, the lack of monitored data and the impossibility of calibrating and validating the other sub-basins brought uncertainty to the study.

Additionally, the lack of rainfall data created uncertainties in the temporal and spatial distribution of the rainfall event under study. In data-scarce areas, rainfall can be estimated by remote sensing to reconstruct an event. However, remote sensing did not estimate the rainfall event in Rolante properly (Censi and Michel 2018). Unofficial data collected by the farmers were therefore used to define the spatial distribution of rainfall. These data were of great value given the lack of official data, but they carry uncertainties and do not capture the detailed temporal resolution of precipitation, since they consist of accumulated daily totals. To use the farmers’ data, two sets of Thiessen polygons were merged: those derived from the unofficial gauges and those derived from the official gauges. This methodology proved to be useful, being easily applied and presenting good results. Therefore, unofficial data provided by residents can assist studies of disasters, especially flash floods, as in this case study, since such studies usually require a detailed temporal and spatial distribution of rainfall and flow data. Flood studies using data provided by the population in real time are becoming more common due to easy access to smartphones and the internet (e.g. Brouwer et al. 2017).
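The idea of combining official and unofficial gauges in a Thiessen approach can be sketched in Python as follows. This is only an illustration under stated assumptions, not the GIS procedure applied in this study: the merging is approximated here by pooling both gauge sets and assigning each equal-area grid cell to its nearest gauge, and all gauge names, coordinates and rainfall totals are hypothetical.

```python
def thiessen_weights(gauges, cells):
    """Fraction of equal-area cells closest to each gauge.

    gauges: dict mapping gauge name to (x, y) coordinates.
    cells: list of (x, y) cell centres covering the basin.
    """
    counts = {name: 0 for name in gauges}
    for cx, cy in cells:
        # A Thiessen polygon is the set of points nearest to a given gauge.
        nearest = min(
            gauges,
            key=lambda g: (gauges[g][0] - cx) ** 2 + (gauges[g][1] - cy) ** 2,
        )
        counts[nearest] += 1
    return {name: count / len(cells) for name, count in counts.items()}


def areal_rainfall(weights, totals):
    """Area-weighted mean rainfall from per-gauge totals (same units as totals)."""
    return sum(weights[name] * totals[name] for name in weights)


# Hypothetical setup: one official gauge and one farmer's gauge, pooled together.
official = {"official_A": (0.0, 0.0)}
unofficial = {"farmer_B": (10.0, 0.0)}
pooled = {**official, **unofficial}

# Hypothetical basin: a single row of ten cells with centres at x = 0.5, ..., 9.5.
cells = [(i + 0.5, 0.0) for i in range(10)]
daily_totals_mm = {"official_A": 40.0, "farmer_B": 200.0}

weights = thiessen_weights(pooled, cells)
print(weights)
print(areal_rainfall(weights, daily_totals_mm))
```

In this hypothetical example, using the official gauge alone would give an areal rainfall of 40 mm, whereas including the farmer's gauge raises it to 120 mm, which illustrates how unofficial records can change the estimated rainfall volume when the storm is concentrated away from the official network.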

This study had access to an appropriate DEM with a 1-m resolution for the Mascarada River sub-basin. For the other regions of the basin, including where the town is located, only a DEM with 12.5-m resolution was available. This resolution limited the representation of the terrain and the adequate simulation of the flood wave. An adjustment was required to join the two DEMs with different resolutions, by excavating a slot where the river flows. The 1-m resolution DEM was satisfactory, but the 12.5-m resolution DEM degraded the flow propagation, as it did not represent the local topography in sufficient detail. Terrain factors, such as the channel and floodplain roughness, as well as the input hydrograph, greatly contribute to uncertainties in the reconstruction of rainfall and flood events using modelling (Dimitriadis et al. 2016).

The hydrodynamic simulation propagated the flow as clear water, and the influence of the fluid viscosity was not verified. This is a simplification because the rivers transported a large amount of soil and vegetation due to the landslides, so the fluid had a high concentration of sediments. Such a fluid should be considered non-Newtonian, which modifies its movement. Rheological analysis of the fluid should be conducted in future studies.

The performance evaluation of the simulated flood extent contains uncertainties, as the simulation is compared with two possible observed flood extents: “point interpolation flood extent” and “HAND flood extent”. In other words, the model's performance was not verified against the raw data (points of the extent boundary collected with GPS), but against estimates of the flood extent area. Extreme flood events, as in this case study, often lack data on both the flood depth and the flood extent, especially in poorly monitored regions of developing countries, such as the Rolante basin.

Studies of past rainfall events often use hydrologic and hydrodynamic modelling (e.g. Bellos et al. 2020; Vanelli et al. 2020; Zhang et al. 2021; Tegos et al. 2022), but they are associated with inherent uncertainties related to model structure and input data (Dimitriadis et al. 2016). Collecting data from different sources such as remote sensing (e.g. Dhara et al. 2020) or from the local population (e.g. Brouwer et al. 2017) might reduce uncertainties in poorly monitored areas. Other approaches to reconstructing past rainfall events that might reduce uncertainties are the use of probabilistic models (e.g. Gaume et al. 2010) and artificial neural networks (e.g. Bomers et al. 2019).

5 Conclusions

This study reconstructed and analysed the extreme rainfall and flood event that occurred in January 2017 in Rolante, southern Brazil, which is a region with poor monitoring of rainfall and river flow. The methods included both hydrologic and hydrodynamic models, in addition to the reports of residents. The rainfall-runoff model was calibrated and validated with official data from the Rolante basin. Hydrologic methods were adapted to simulate scenarios of the rainfall event, as it occurred in a data-scarce area. The main adaptations were to estimate the spatial and temporal distribution of rainfall as well as the observed flood extent. Thiessen polygons were adapted using both official and unofficial rain gauges. The temporal distribution of rainfall was analysed with different scenarios based on reports of the local community. The observed flood extent was estimated from points on the flood extent boundary using two approaches: GIS and a hydrological terrain model. Then, the simulated flood extents were compared to the observed flood extents using the ROC curve and the Kappa index. The adapted methods proved to be useful for understanding past rainfall events in data-scarce areas.

The farmers' reports and the adapted hydrologic methods were essential for understanding and reconstructing the rainfall event. The spatial analysis of the flood extent for different scenarios produced contradictory results depending on the metrics analysed. This shows the importance of analysing the simulation results with several indices and coefficients. Although no single scenario stood out from the others in performance, it was possible to draw some conclusions about the rainfall event. First, the ordinary hydrological response of the Rolante drainage basin to intense rainfall can cause large-scale, quick, and intense floods, such as the one observed in January 2017. The hypothesis raised by locals that there was a rupture of a landslide dam could still be true, but there is no need for such a phenomenon to explain the magnitude of this flash flood event. Second, the rainfall was very concentrated in the mountain area, where landslides occurred. This may be an indication of changes in the rainfall patterns due to climate change.

The disaster in Rolante demonstrated how multiple hazards involving intense and concentrated rainfall, landslides and flash flooding made it difficult to prevent and mitigate the impacts of the disaster. This study can contribute to other computational reconstructions and analyses of extreme rainfall and flood events, especially in poorly monitored regions with low accessibility. Learning from past rainfall events that triggered disasters is fundamental for reducing disaster risks by predicting floods and taking preventive measures.