1 Introduction

Flood hazard mapping has become an integral part of landuse planning and one of the most revisited research objectives throughout the world (Alfieri et al. 2014; Chen et al. 2009; Masood and Takeuchi 2012). In Canada alone, floods have caused over $9 billion in damages since 2000 (Jakob and Church 2011; http://www.emdat.be/database), and flooding in the Canadian prairies has been a major concern following the major losses caused by the floods of 2011 and 2014 (Ahmari et al. 2016; Szeto et al. 2015). The Canadian prairies, spread over the provinces of Alberta, Saskatchewan, and Manitoba, have a unique landscape characterized by potholes or sloughs that are disconnected during the dry summers but contribute during floods through a fill and spill mechanism (Shook et al. 2013). Due to the flat topography of the prairies, flooding extents vary widely, from a few hundred meters to tens of kilometers in the plains. For example, flood extents estimated from satellite images of the 2009 Red River flood were as wide as 20 km at a few locations along the river (http://130.179.67.140/dataset/nrcan-flood-maps).

Flood hazard is typically quantified by estimating flows corresponding to different return periods and determining their flooding extent. Peak flows are estimated by frequency analysis of floods (Büchele et al. 2006; Merz and Thieken 2009) or by using a hydrological model to generate streamflow for extreme rainfall (Sarhadi et al. 2012). Such methods rely on appropriate hydraulic models to translate these flows into flood extents, depths, and velocities. The commonly used approach to simulate flows in river channels is one-dimensional (1D) hydraulic modeling. In the recent past, two-dimensional (2D) hydrodynamic models have also gained prominence due to improvements in computational time, model structure, and parameterization (Falter et al. 2013; Tsakiris 2014; Shen et al. 2015). It is also common practice to develop coupled 1D/2D models for flood hazard mapping (Bates and De Roo 2000; Werner 2004; Apel et al. 2009; Liu et al. 2015), where the channel is modeled in a 1D framework and the adjoining floodplain in 2D. Although hydraulic models were initially developed to be applied over small river reaches, advances in their computational efficiency have enabled the generation of flood hazard maps at national (Falter et al. 2013; Hall et al. 2003; FLORIS 2005) and global scales (Winsemius et al. 2013; Wu et al. 2014; Trigg et al. 2016).

A general practice is to develop a hydraulic model by calibrating it to an observed flood event and using the model to delineate flooding extents corresponding to different return periods (Mosquera-Machado and Ahmad 2007; Crispino et al. 2015). This is a deterministic approach wherein the floodplain is classified as either flooded or non-flooded, and it neglects various uncertainties that might affect the model output, i.e., the hazard map. A flood inundation model that was calibrated on a historical event may give poor predictions of a synthetic design event if the flood magnitude differs (Di Baldassarre et al. 2010). Recent studies have encouraged the representation of hazard maps as probabilistic flood hazard maps (PFHM) that reflect a degree of assurance with respect to inundated areas instead of a crisp delineation between inundated and dry areas (Aronica et al. 2002; Domeneghetti et al. 2013; Romanowicz and Beven 2003; Hall et al. 2005). Uncertainties in topography (Jung and Merwade 2015; Mukolwe et al. 2016; Schumann et al. 2008; Cook and Merwade 2009), boundary conditions (Smemoe et al. 2007; Sarhadi et al. 2012; Pedrozo-Acuña et al. 2015), or model parameters (Aronica et al. 1998; Di Baldassarre and Claps 2011; Hall et al. 2005) have been shown to affect hazard mapping. Studies on uncertainty in flooding extent have not been restricted to overbank flooding but have also been extended to flooding caused by the failure of flood protection structures. Flood hazard assessment due to dike breaches has been carried out by various researchers in both deterministic and probabilistic contexts (Vorogushyn et al. 2010; Viero et al. 2013; Mazzoleni et al. 2014).

Although interest in probabilistic methods for flood mapping is increasing worldwide, few such studies have been carried out in Canada. For example, Laforce et al. (2011) downscaled future projections generated by generalized climate models for use as input to hydrologic and hydraulic models to determine flooding extents under different climate scenarios for two small watersheds in Quebec, without accounting for uncertainties in the development of the hydraulic model. Even in studies carried out outside Canada, smaller river reaches that can be modeled using 1D or 2D models were usually considered to demonstrate the benefits of probabilistic flood mapping, as the computational time required to run numerous simulations is short for smaller reaches. The efficacy of probabilistic methods has not been tested where flood mapping is needed for longer reaches (> 100 km), where the river shows distinctly varying hydraulic characteristics along its length (changes in slope and floodplain width), and where complex hydraulic models must be set up. In addition, flooded and non-flooded locations in the floodplain are usually identified by assigning a minimum depth of inundation below which a location is considered non-flooded. Such thresholds are subjective, and their effects on the extent of flood maps should also be evaluated.

The objective of the present study is to adapt existing probabilistic flood mapping (PFM) approaches and apply them to a prairie river with limited data as a case study, and subsequently, evaluate the PFM’s utility. The presented study is unique as there have been no previous attempts to apply the probabilistic flood modeling technique to arrive at flood extent maps in the challenging prairies. Results of the probabilistic flood modeling are analyzed in the context of: (a) the probability of flooding of any location in the floodplain of the study reach; (b) the sensitivity of flooding extents to various uncertainties using an all-at-a-time sensitivity analysis approach; (c) subjectivity in defining a criterion to distinguish between flooded and non-flooded areas, and the variation of the criterion and its effects on flooding extents; and (d) evaluating a classified flood hazard map (CFHM) for Canada, which was developed in a previous study (Elshorbagy et al. 2017), against the developed PFHM to determine the applicability of large extent (area) flood maps developed using only topographical information to local scales, when the flooding extent from a hydrodynamic model is available.

2 Study area and data products

The Qu’Appelle River basin is an important river basin located within the province of Saskatchewan, and the Qu’Appelle River is a significant tributary of the Assiniboine River. In 2011, the Assiniboine River and its tributaries, the Qu’Appelle and Souris Rivers, experienced floods that led to damages estimated at over $1 billion in Saskatchewan and Manitoba, with flood estimates in parts of the river basin having return periods of 500 years (Blais et al. 2016). The same river was subjected to floods again in 2014 due to a series of significant rainfall events over the entire basin (Ahmari et al. 2016). The Qu’Appelle River recorded a flow of 345 m³/s during the 2011 flood at Welby, located just above its confluence with the Assiniboine, and the return period of the flow at that location was estimated to be 140 years (Blais et al. 2016). The Qu’Appelle River has not previously been studied in the context of floods, although it contributes significant flow to the Assiniboine. Flood mapping in this river basin is particularly challenging because fine resolution terrain data are not available for most of the basin.

The river is also the source of water for major cities, such as Moose Jaw and Regina, Saskatchewan. Owing to its importance, the Qu’Appelle River basin is considered as a key study basin under Floodnet—an NSERC funded project for a Canada-wide strategic research network for flood forecasting and impact assessment. The river basin is dominated by prairie topography, which includes flat terrains and dynamic non-contributing areas. Floods in the prairies are usually driven by snowmelt with peak flows usually occurring during the months of April and May (Gray et al. 1985). The case study considered is located within the Qu’Appelle River basin and consists of reaches from two rivers, Moose Jaw and Qu’Appelle as shown in Fig. 1a. The reach under study is 113 km in length and begins at a streamflow gauging station called Moose Jaw above Thunder Creek, located just above the city of Moose Jaw, and ends just below the city of Lumsden. The Moose Jaw River is a major tributary, contributing almost 50% of the flow to the Qu’Appelle River at Lumsden. Another major tributary is the Wascana Creek that flows into the Qu’Appelle just above the city of Lumsden. The reach is selected for the following reasons: (1) The Moose Jaw River, flowing up to the confluence with the Qu’Appelle River (shown as “confluence MJ” in Fig. 1a) flows within a defined valley with a limited floodplain (~ 300 m) and hence, can be easily modeled with a 1D hydraulic model (reach length 55 km). Figure 1b shows a typical cross section along the Moose Jaw River; (2) the Qu’Appelle River reach from the confluence of Moose Jaw up to the city of Lumsden has an average channel width of 30 m whereas the floodplain extends beyond 1 km at a few locations. The floodplains along the Qu’Appelle also consist of wetlands that retain water and have a fill and spill mechanism. These features are well represented in the hydraulic model in a 2D framework. 
Hence, the reach of the Qu’Appelle River (shown as a red solid line in Fig. 1a) was modeled in 2D. The total length of the reach modeled in 2D is 53 km. A typical cross section in this sub-reach is shown in Fig. 1c; and (3) the Qu’Appelle River flows through a levee at the city of Lumsden constructed to protect the city from a 1-in-500-year flood, so the flow is mainly restricted within the levee with no meandering. The length of this reach is 7 km, and the conditions are ideal for the reach to be modeled in 1D.

Fig. 1 (a) Location of the Moose Jaw and the Qu’Appelle Rivers considered in the study, (b) typical terrain profile along the Moose Jaw River, and (c) typical terrain profile along the Qu’Appelle River

The Canada digital elevation model (CDEM) was used to extract terrain information for the study reach. The CDEM is derived from the Canadian Digital Elevation Data, which were extracted from the National Topographic Data Base (NTDB), the Geospatial Database (GDB), various scaled positional data acquired by the provinces and territories, and remotely sensed imagery. The CDEM is available for download at various resolutions ranging from 0.75 arc second (~ 20 m at the equator) to 12 arc seconds (~ 326 m at the equator) as tiles that are consistent with the National Topographic System (NTS; Official division and identification system for the base topographic maps of Canada). The finest resolution CDEM was extracted for the study area using the Geospatial Data Extraction option in http://maps.canada.ca/czs/index-en.html, and projected using a projected coordinate system (NAD 1983 UTM Zone 13N), resulting in the resolution of the map to be around 18 m. Bathymetry data from surveys were available for a small reach of the Moose Jaw River flowing through the city of Moose Jaw (located in the upstream end) and along the Qu’Appelle River flowing through the town of Lumsden (located in the downstream end). The data were mosaicked with the extracted DEM to improve the accuracy in representation of the channel bed. An average difference in elevation between the DEM and the surveyed points was used to reduce the elevation of the DEM along the stream centerline to reflect the actual channel bed in the DEM. Although hydraulic studies at smaller extent are carried out using very fine resolution LiDAR data, the non-availability of such data over the study reach forced the use of a coarser national dataset as terrain data. Landuse data were obtained from Agriculture and Agri-Food Canada through the government of Canada Web site (http://open.canada.ca/data/en/dataset/), and roughness coefficients were assigned to each landuse based on Chow et al. 
(1988) to update the landuse map with the roughness values.
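The channel-lowering step described above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the function name `lower_centerline` and all elevation values are hypothetical, and the study applied the adjustment to DEM cells along the stream centerline rather than to short lists of numbers.

```python
def lower_centerline(dem_at_survey, surveyed_bed, centerline_dem):
    """Lower centerline DEM elevations by the mean DEM-minus-survey difference."""
    diffs = [z_dem - z_bed for z_dem, z_bed in zip(dem_at_survey, surveyed_bed)]
    mean_offset = sum(diffs) / len(diffs)
    # Apply the same average offset to every unsurveyed centerline cell.
    return [z - mean_offset for z in centerline_dem], mean_offset


# Hypothetical elevations (m): three surveyed points and two unsurveyed cells.
adjusted, offset = lower_centerline(
    dem_at_survey=[512.0, 511.5, 510.0],
    surveyed_bed=[510.5, 509.5, 508.5],
    centerline_dem=[515.0, 514.0],
)
```

Because a single average offset is applied everywhere, the sketch reproduces the limitation of the approach: local variations in channel depth between surveyed sections are not represented.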

Streamflow data are available at three locations along the study reach in the HyDAT database provided by Environment and Climate Change Canada. The gauging station 05JE001 (Moose Jaw above Thunder Creek) was used as the upstream boundary condition for the Moose Jaw River (bottom left corner of Fig. 1a). The study reach also includes a confluence between a tributary (Wascana Creek) and the Qu’Appelle River, with a stream gauge located just above the confluence (05JF005; Wascana Creek above Lumsden, the top right corner of Fig. 1a). The third stream gauge is located on the Qu’Appelle River within the city of Lumsden, and the discharge at this location is the accumulated flow from both upstream gauged tributaries (05JE001 and 05JF005) and any ungauged flow from the lateral creeks flowing into the Qu’Appelle.

3 Model development

For the present study, HEC-RAS 5.0.3, developed by the United States Army Corps of Engineers and capable of performing 1D and 2D hydraulic calculations for a full network of natural and constructed channels, floodplains, and flood protected areas, was used to simulate both steady and unsteady state flow. A GIS extension for HEC-RAS, called HEC-GeoRAS (http://www.hec.usace.army.mil/software/hec-georas/), was used to delineate cross sections, assign topographic data and the roughness coefficient for each cross section, define hydraulic structures (bridges, weirs etc.), and export the model to HEC-RAS. For the reach of the Qu’Appelle River that was modeled in 2D, a computational mesh containing cells of predefined sizes, whose extent encompasses the entire floodplain along the reach, was manually delineated in HEC-RAS. The default size of the computational cells within the mesh was set to 50 m. A mixture of irregular-shaped cells (3-sided to 8-sided polygons) is generated at the edges of the mesh, or at locations where a square cell cannot be generated. HEC-RAS automatically generates the hydraulic properties, such as elevation profile, volume, and roughness, for each computational cell within the 2D mesh utilizing the DEM and landuse data. A detailed description of the definition of 2D hydraulic properties can be found in Brunner (2016).

The hydraulic model was calibrated for the 2011 flood in the Qu’Appelle valley. Daily streamflow measurements at gauging stations 05JE001 and 05JF005 were used to extract streamflow hydrographs for a period of 35 days that contained the peak flood event (April 1, 2011–May 04, 2011) at those locations. The hydrographs were used as the upstream boundary conditions. The observed peak flows at Moose Jaw River and Wascana Creek were 197 and 95 m³/s, respectively, and the peak flow observed at Lumsden was 300 m³/s. The roughness parameters were manually adjusted to calibrate the model. Although the roughness values in the landuse maps were distributed and cross sections were assigned more than one type of landuse, previous studies have shown that model results are not sensitive to spatially varied roughness values in the floodplain (Werner et al. 2005). For the present study, roughness parameters were therefore lumped into two values within HEC-RAS, one for the channel and one for the floodplain. Because no observed flooding extent was available along the reach considered in the present study, the flooding extents obtained from the calibrated model could not be compared with the observed extent of the 2011 flood. The only available data were water surface elevations observed at a few locations in the city of Moose Jaw and at the gauging station at Lumsden. Hence, the water surface elevations obtained from the model were compared with the observed water surface elevations at these locations. The difference in elevation between the simulated and the observed water surface was found to be < 0.4 m at all the locations. The accuracy of the model would have been further affected if the bathymetry data had not been incorporated at locations where they were available. At intermediate locations, the procedure used to estimate the channel depth in the present study could affect the calibrated flooding depth along the reach, and this effect cannot be verified.
This is a drawback of most floodplain studies that are carried out using limited bathymetric data. The maximum water surface elevations obtained from the model were translated to a map showing depth of inundation and extent of flooding along the entire reach. The calibrated roughness values were found to be 0.032 and 0.041 for the channel and the floodplain, respectively. Model verification was carried out by comparing the flooding extents obtained from the model for another high flow event in the Moose Jaw River (2013) with high-resolution aerial imagery of the Moose Jaw River taken during that event. The observed peak was 195 m³/s at the upstream gauging station 05JE001 and was recorded on 03/05/2013, and the aerial image of the flooded areas was captured on 02/05/2013. The modeled inundation extent was overlaid on the aerial image to visually compare the extents, and the comparison is presented in Fig. 2 for two enlarged portions along the river. The modeled extents are in good agreement with the actual extents at most places, although a few locations identified as non-flooded in the image are shown as flooded in the model. This overestimation of the flooding extents could be attributed to the vertical accuracy of the DEM.

Fig. 2 Modeled flooding extents for the 2013 high flow (shown as red lines) overlaid on high-resolution aerial images of flooded areas for the same event along the Moose Jaw River

4 Probabilistic modeling framework

4.1 Uncertainty in model parameters and boundary conditions

In the present study, two major components that influence the flooding extent—roughness parameters and flood hydrographs—were considered to be uncertain and were varied within plausible ranges to assess their effect on flooding extent and depth. The roughness parameters mainly represent the state of vegetation or landuse in the floodplain that can change over time. Some studies show that the effective roughness coefficients may vary for different flood magnitudes (Romanowicz and Beven 2003; Horritt et al. 2007; Di Baldassarre et al. 2010). Therefore, considering the roughness parameters as uncertain is valid. Roughness values of the channel and floodplain were sampled in the range of 0.01–0.05. Similarly, uncertainty in measured or estimated flood magnitudes may also exist for various reasons, such as erroneous measurement of flood stages and flow magnitudes during a flood event, and extrapolation of stage–discharge relationships. Di Baldassarre and Montanari (2009) show that the discharge uncertainty for gauged flows can be as high as 40% and can significantly affect model calibration and validation. Hence, the flow hydrographs were varied to account for uncertainty in the measured inflow. Some previous studies considered perturbing the inflow hydrographs as incremental percentages at each ordinate to create multiple hydrographs and determine flooding extents (Savage et al. 2016a, b; Pedrozo-Acuña et al. 2015). When hydrologic models are used to determine floods of larger return periods, the resulting hydrograph may already contain errors accumulated during the hydrograph prediction. In addition, using multiple hydrological models may result in significant variation in the peak flow estimations and that would also have an effect on the shape of the hydrograph. A more realistic approach to perturb the hydrograph would be to generate realizations wherein the shape of the hydrograph and the time of peak are also varied. 
The 2011 flood in the Qu’Appelle valley was estimated to have a return period of 140 years (Blais et al. 2016), and although such estimates of return periods are not available for the Moose Jaw and Wascana Creek tributaries, it can be assumed that they were also large. To account for uncertainty in flows with such large return periods, the hydrographs have to be perturbed such that each realization results in a hydrograph that is realistic in shape but varies in the peak value and time. For the present study, the observed hydrographs at the two gauging stations were perturbed using a method proposed by Savage et al. (2016a, b) that uses an additive residual model to perturb a base hydrograph \(Q_{t}^{b}\) at each time step t as

$$Q_{t}^{\text{Up}} = Q_{t}^{b} + \rho_{t}$$
(1)

where \(Q_{t}^{\text{Up}}\) is the updated discharge at time t and \(\rho_{t}\) is a residual term estimated as,

$$\rho_{t} = \alpha \rho_{t - 1} + \varepsilon_{t}$$
(2)

where α is a coefficient that weights the residual carried over from the previous time step and \(\varepsilon_{t}\) is the error term at the current time step. At t = 0, the value of \(\rho_{t}\) is equal to the error term \(\varepsilon_{t}\). The error term \(\varepsilon_{t}\) is randomly sampled from a normal distribution with mean 0 and standard deviation equal to a fractional error term defined as,

$$\sigma_{t} = \beta Q_{t}^{b}$$
(3)

The parameters α and β control the amount of error introduced into the hydrographs. Savage et al. (2016a, b) suggested that assigning α a value of 0.3 and β a value of 0.127 allowed the realizations to be within 40% of the base hydrograph level. The same values were used for the present study to maintain the variation in the hydrograph magnitudes within 40%. By adopting this approach for perturbing the hydrographs, we intend to create numerous scenarios of flood hydrographs that could enable us to identify locations that would likely be inundated during the best case (flood magnitudes 40% lower than the 2011 flood) and worst case scenarios (flood magnitudes 40% higher than the 2011 flood).
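For illustration, Eqs. (1)–(3) can be sketched in a few lines of Python. This is a minimal sketch and not the code used in the study: the function name `perturb_hydrograph`, the base ordinates, the seeds, and the clipping of negative discharges to zero are all assumptions introduced here.

```python
import random


def perturb_hydrograph(base_flows, alpha=0.3, beta=0.127, rng=None):
    """Return one realization Q_up[t] = Q_b[t] + rho[t] following Eqs. (1)-(3)."""
    if rng is None:
        rng = random.Random()
    perturbed = []
    rho_prev = 0.0
    for t, q_base in enumerate(base_flows):
        sigma = beta * q_base                              # Eq. (3)
        eps = rng.gauss(0.0, sigma)                        # error term at step t
        rho = eps if t == 0 else alpha * rho_prev + eps    # Eq. (2)
        # Assumption of this sketch: negative discharges are clipped to zero.
        perturbed.append(max(q_base + rho, 0.0))           # Eq. (1)
        rho_prev = rho
    return perturbed


# Hypothetical daily ordinates (m3/s); only the 197 m3/s peak comes from the text.
base = [20.0, 60.0, 150.0, 197.0, 160.0, 90.0, 40.0]
realizations = [perturb_hydrograph(base, rng=random.Random(seed)) for seed in range(5)]
```

Because the residual at each step carries over a fraction α of the previous residual, consecutive ordinates are perturbed in a correlated way, which is why the realizations depart from the shape of the base hydrograph rather than being simple rescalings of it.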

A total of 5000 samples of the roughness parameters and hydrographs were generated by Latin hypercube sampling with the SAFE Toolbox for MATLAB (Pianosi et al. 2016). Figure 3 shows the range of the sampled roughness parameters and the inflow hydrographs. The base hydrograph is highlighted to show the original shape of the hydrograph, and it can be observed from the figure that the random realizations do not follow the shape of the base hydrograph owing to the method used to generate the hydrographs. A MATLAB script was developed to replace the roughness coefficients in the geometry file and the inflow hydrographs in the inflow file that HEC-RAS uses to run the model. For each run, a single value of channel roughness and a single value of floodplain roughness were assigned in the geometry file to all the cross sections in the 1D reaches and to the cells representing the channel and floodplain in the 2D area. Similarly, the inflow hydrographs at both locations were replaced.
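The sampling itself was done with the SAFE Toolbox in MATLAB. Purely as an illustration of the idea behind Latin hypercube sampling, the following Python sketch draws channel and floodplain roughness pairs in the range 0.01–0.05. The function `latin_hypercube` and the seed are hypothetical, and the sketch covers only the two roughness parameters, not the hydrograph realizations.

```python
import random


def latin_hypercube(n_samples, bounds, rng=None):
    """Draw n_samples points with each dimension stratified into n_samples equal bins."""
    if rng is None:
        rng = random.Random()
    columns = []
    for low, high in bounds:
        # One uniformly placed point inside each of the n equal-width strata ...
        strata = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # ... shuffled so that the pairing between dimensions is random.
        rng.shuffle(strata)
        columns.append([low + u * (high - low) for u in strata])
    return list(zip(*columns))


# (channel roughness, floodplain roughness) pairs, both sampled in 0.01-0.05.
samples = latin_hypercube(5000, [(0.01, 0.05), (0.01, 0.05)], rng=random.Random(42))
```

Compared with plain random sampling, the stratification guarantees that every part of each parameter range is represented exactly once per stratum, which is useful when each sample requires a full hydraulic model run.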

Fig. 3 Realizations of (a) Manning’s roughness parameters, (b) hydrograph of the 2011 flood at gauging station 05JE006, and (c) hydrograph of the 2011 flood at gauging station 05JE005. The solid black line in (b) and (c) indicates the actual observed 2011 flood hydrograph at the gauges

For each combination of channel roughness, floodplain roughness, and inflows, the model was run to obtain an inundation map, showing depth of inundation at all locations in the floodplain corresponding to that combination. Using the maps from all successful realizations, the probability of inundation at each pixel within the floodplain was determined. The probability of flooding of a particular location along the reach was determined as,

$$P_{i} = \frac{{N_{i} }}{{N_{T} }}$$
(4)

where \(P_{i}\) is the probability of inundation (PoI) of the i-th pixel, \(N_{i}\) is the number of times the i-th pixel was inundated (depth ≥ 0.1 m), and \(N_{T}\) is the total number of successful simulations.
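Equation (4) amounts to counting, pixel by pixel, how often the depth threshold is met across the successful runs. A minimal Python sketch is given below; the function name `probability_of_inundation` and the three small depth grids are hypothetical and stand in for the raster outputs of the hydraulic model.

```python
def probability_of_inundation(depth_maps, min_depth=0.1):
    """Eq. (4): fraction of successful runs in which each pixel is inundated."""
    n_total = len(depth_maps)
    n_rows, n_cols = len(depth_maps[0]), len(depth_maps[0][0])
    counts = [[0] * n_cols for _ in range(n_rows)]
    for depth in depth_maps:
        for r in range(n_rows):
            for c in range(n_cols):
                if depth[r][c] >= min_depth:   # pixel counted as flooded in this run
                    counts[r][c] += 1
    return [[n / n_total for n in row] for row in counts]


# Three hypothetical 2 x 2 depth maps (m), one per successful realization.
maps = [
    [[0.50, 0.05], [0.00, 1.20]],
    [[0.30, 0.15], [0.00, 0.90]],
    [[0.12, 0.00], [0.02, 1.50]],
]
poi = probability_of_inundation(maps)
```

In this toy example, the top-left and bottom-right pixels are wet in all three runs (PoI of 1), the top-right pixel in one run (PoI of 1/3), and the bottom-left pixel in none (PoI of 0).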

4.2 Sensitivity analysis

As multiple realizations of inflows and parameters were used for uncertainty assessment of the hazard map, the results could also be used to carry out a sensitivity analysis. Global sensitivity analysis is closely related to uncertainty analysis wherein the uncertainty analysis focusses on quantifying uncertainty in model outputs whereas sensitivity analysis focusses on attributing the uncertainty to their sources (input or model parameters) (Pianosi et al. 2016). For the present study, the regional sensitivity analysis (RSA) method, also known as Monte-Carlo filtering (Spear and Hornberger 1980) was chosen to assess the sensitivity of model outputs (flood extents) to roughness and inflow. The RSA method is an all-at-a-time sensitivity analysis approach where all the input factors (inflows and parameters in the present case) are varied simultaneously to induce variations in the output.

The method involves running the model for each perturbation of the model inputs (roughness and inflow in the present case) and splitting the resulting model outputs into subsets that are above or below a particular threshold. The empirical cumulative distribution functions (CDFs) of each perturbed parameter for both subsets are then plotted together, and a Kolmogorov–Smirnov (KS) statistic is calculated and used as a sensitivity measure. When the model outputs are split into two subsets, the KS test statistic, called the maximum vertical distance (mvd), is calculated as,

$${{mvd}} = \hbox{max} \left| {F_{{x_{i} |y_{b} }} (x_{i} |y \in Y_{b} ) - F_{{x_{i} |y_{a} }} (x_{i} |y \in Y_{a} )} \right|$$
(5)

where mvd is the maximum vertical distance between two CDFs, \(F_{{x_{i} |y_a}}\) and \(F_{{x_{i} |y_b}}\) are the CDFs of the variable xi when considering input samples associated with two subsets a and b. Larger values of mvd indicate higher sensitivity and vice versa. Approaches that suggest dividing the model outputs into multiple equal parts (percentiles of equal intervals) and analyzing the marginal CDFs for each part to avoid a subjective bifurcation have also been presented in the past (Wagener et al. 2001; Pianosi et al. 2016). For the present study, the model outputs were split into two subsets and the mvd was estimated.
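The calculation in Eq. (5) can be sketched as follows. This Python example is an illustration only: the function names, the six-run ensemble, the output threshold of 15 km², and the use of a peak-flow scaling factor to summarize each inflow realization are assumptions made here, and the study itself relied on the SAFE Toolbox.

```python
def empirical_cdf(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def max_vertical_distance(inputs, outputs, threshold):
    """Eq. (5): KS distance between the input CDFs of the two output subsets."""
    below = [x for x, y in zip(inputs, outputs) if y <= threshold]
    above = [x for x, y in zip(inputs, outputs) if y > threshold]
    if not below or not above:
        raise ValueError("threshold must leave both subsets non-empty")
    # Both CDFs are step functions, so the maximum occurs at a sampled value.
    return max(abs(empirical_cdf(below, x) - empirical_cdf(above, x)) for x in inputs)


# Hypothetical ensemble of six runs.
roughness = [0.010, 0.018, 0.026, 0.034, 0.042, 0.050]
peak_scale = [1.0, 0.7, 1.2, 0.8, 1.3, 0.9]
extent_km2 = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

mvd_roughness = max_vertical_distance(roughness, extent_km2, threshold=15.0)
mvd_inflow = max_vertical_distance(peak_scale, extent_km2, threshold=15.0)
```

In this constructed example the extent increases monotonically with roughness, so the two roughness CDFs are fully separated (mvd of 1), whereas the inflow factor is spread evenly over both subsets and gives a much smaller mvd (1/3).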

4.3 Criterion to identify flooded/non-flooded areas

To determine flooding extents from each realization, the depth map must be converted to a binary map indicating wet areas (1) and dry areas (0). To do so, a minimum depth of inundation must be defined to distinguish between a dry cell and a wet cell. Theoretically, any location with a depth of inundation greater than zero can be considered inundated. However, this would include areas that have very low depths (< 0.001 m), because most inundation maps are prepared by overlaying water surface elevations on a terrain dataset such as a DEM and taking the difference as the depth of inundation. A few studies in the past have designated a location as flooded when the depth is greater than 0.1 m (Savage et al. 2016a, b; Candela and Aronica 2016). The chosen value is subjective and can affect the boundaries of the flooding extent. Hence, the flooding extents should be estimated using different minimum depths, and the effects on the flooding extent should be evaluated. For the present study, the minimum depth of inundation was varied from 0.1 to 0.3 m in increments of 0.1 m.
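The effect of the minimum depth on the mapped extent can be illustrated with a small Python sketch. The depth grid and the function `flooded_area` are hypothetical; the cell area of 324 m² is an assumption based on the approximately 18 m resolution of the projected DEM.

```python
def flooded_area(depth_map, min_depth, cell_area):
    """Total area of cells whose depth is at least min_depth."""
    wet_cells = sum(1 for row in depth_map for d in row if d >= min_depth)
    return wet_cells * cell_area


# Hypothetical 3 x 3 depth map (m) from a single realization.
depth_map = [
    [0.05, 0.12, 0.25],
    [0.31, 0.18, 0.00],
    [0.09, 0.22, 0.40],
]

# Flooded area (m2) for the three minimum depths considered in the study.
areas = {d: flooded_area(depth_map, d, cell_area=324.0) for d in (0.1, 0.2, 0.3)}
```

Raising the threshold from 0.1 to 0.3 m removes the shallow cells first, which in practice lie mostly along the outer boundary of the inundated area.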

5 Comparing PFHMs with topography-based large area flood maps

Various strategies have been developed to delineate hazard maps without the use of hydraulic models. Researchers have used only topographic data and characteristics derived from them to identify locations that are most likely to be inundated in the event of a flood. Such methods include the analytic hierarchy process (Wu et al. 2015; Papaioannou et al. 2015), a modified topographic index (Manfreda et al. 2011; 2014), regression analysis (Jafarzadegan and Merwade 2017), or reclassifying digital elevation models based on distance and height from the nearest stream (Elshorbagy et al. 2017). These methods do not require a flow value but still result in maps that could broadly be classified as flood prone or hazard maps. However, evaluating such maps against observed or modeled extents is essential.

Maps produced using such approaches are not strictly hazard maps but could still be considered as hazard maps owing to the fact that they consider height of a location from the nearest stream, which would translate as stage of flow at that location. One such study was carried out for Canada by Elshorbagy et al. (2017). In their work, a nation-wide classified flood hazard map (CFHM) was generated for Canada and the various hazard levels were classified from “severe” to “very low” depending on the proximity and height of a location to the nearest stream. An area was considered located in a “severe” hazard zone if it was low-lying (~ 2 m above bank elevation) and close to the stream (less than 1 km from the stream centerline). Other categories were similarly defined based on such criteria. The resulting map was deterministic in nature, but provided an indication of places that are most likely to be flooded. These maps can be validated using extents derived from hydraulic models to further justify the use of these maps for initial assessment of flood sensitive locations. In order to do so, we compare the CFHM to the extents of the PFHM over the reach considered in the present study. In this study, we use indices that quantify the probability of correctly classifying a pixel within the inundation extent as flooded or non-flooded, called sensitivity and specificity (Altman and Bland 1994). These indices have been used in the past in classification studies (e.g., Murtaugh 1996; Cutler et al. 2007) as well as in comparing two flood maps for the same location (Elshorbagy et al. 2017). Since “sensitivity” and “sensitivity analysis” are both used in this paper, the term “sensitivity” used in the context of comparison will be replaced by “Degree of Agreement” for the remainder of the paper. Degree of Agreement (DoA) is defined as,

$${{DoA}} = \frac{{F_{c} }}{{F_{c} + F_{oc} }}$$
(6)

where numerator Fc denotes the total number of pixels predicted as flooded in the CFHM and the PFHM (true positives). The denominator in the equation denotes the sum of the numerator, and \(F_{oc}\), which denotes pixels that were predicted as flooded in PFHM, but classified as non-flooded in CFHM (false negatives). Here, “flooded pixels in PFHM” refers to all points within the boundary of the PFHM, whereas flooded pixels in CFHM refers to pixels falling within the hazard level “severe.” DoA ranges from 0.0 to 1.0, with values closer to 1.0 indicating higher agreement of flooded pixels between the two maps. The second index, specificity, is defined as

$$S_{p} = \frac{{NF_{c} }}{{NF_{c} + NF_{oc} }}$$
(7)

where the numerator \(NF_{c}\) denotes the total number of pixels predicted as non-flooded in both the CFHM and the PFHM (true negatives), and the denominator is the sum of the numerator and \(NF_{oc}\), which denotes non-flooded pixels in the PFHM that are falsely classified as flooded in the CFHM (false positives). Here, “non-flooded pixels in PFHM” refers to all points outside the PFHM boundary but within the extent of the CFHM, whereas “non-flooded pixels in CFHM” refers to pixels that are outside the boundary of the PFHM and do not belong to the hazard level “severe.” \(S_{p}\) also ranges from 0.0 to 1.0, with values closer to 1.0 implying higher agreement of non-flooded pixels between the two maps and lower values implying that the flooded area is overestimated by the CFHM. The criteria for defining “flooded” and “non-flooded” pixels in both the CFHM and the PFHM can be modified and the measures DoA and \(S_{p}\) recalculated. For example, if locations having a PoI of at least 80% are considered flooded, then the pixels with a PoI less than 80% are considered non-flooded, and the CFHM can be evaluated by calculating DoA and \(S_{p}\) for those pixels. The CFHM is classified into different hazard levels based on criteria that are independent of actual flood discharges, so it may not be a true representation of flooding. Therefore, more than one hazard level in the CFHM can be combined to classify a location as flooded (e.g., “severe” and “high” can be combined). The aim of this analysis is to quantify the overestimation or underestimation of the flooded extents given by the CFHM and to provide guidelines to modify and interpret the maps based on the requirements of decision makers.
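Equations (6) and (7) reduce to counting the four combinations of flooded and non-flooded pixels in the two maps. The Python sketch below illustrates this under stated assumptions: the function `agreement_indices`, the eight-pixel example, and the way the hazard levels are stored as strings are hypothetical, and the PoI cut-off of 0.8 is taken from the example in the text.

```python
def agreement_indices(pfhm_flooded, cfhm_flooded):
    """Return (DoA, Sp) of Eqs. (6)-(7) from two per-pixel boolean sequences.

    Assumes the PFHM contains at least one flooded and one non-flooded pixel.
    """
    f_c = f_oc = nf_c = nf_oc = 0
    for in_pfhm, in_cfhm in zip(pfhm_flooded, cfhm_flooded):
        if in_pfhm and in_cfhm:
            f_c += 1       # flooded in both maps (true positive)
        elif in_pfhm:
            f_oc += 1      # flooded in PFHM only (false negative)
        elif in_cfhm:
            nf_oc += 1     # flooded in CFHM only (false positive)
        else:
            nf_c += 1      # non-flooded in both maps (true negative)
    return f_c / (f_c + f_oc), nf_c / (nf_c + nf_oc)


# Hypothetical PoI values and CFHM hazard levels for eight pixels.
poi = [0.95, 0.85, 0.60, 0.10, 0.00, 0.90, 0.00, 0.00]
levels = ["severe", "high", "severe", "very low", "very low", "severe", "severe", "high"]

pfhm = [p >= 0.8 for p in poi]
doa, sp = agreement_indices(pfhm, [lvl == "severe" for lvl in levels])
doa2, sp2 = agreement_indices(pfhm, [lvl in ("severe", "high") for lvl in levels])
```

In this toy case, adding the "high" level to the flooded class raises DoA from 2/3 to 1.0 but lowers Sp from 0.6 to 0.4, which shows the trade-off between capturing all flooded pixels and overestimating the flooded area.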

6 Results and analysis

6.1 Probabilistic modeling

Although the model was run for 5000 realizations, some parameter combinations resulted in model instability. The 854 realizations for which the model became unstable and did not produce results were omitted from the analysis, and the water surface profiles of the remaining 4146 successful realizations were used to determine the flooding extent in raster form. In each realization, all locations (pixels) with a depth of inundation greater than or equal to 0.1 m were considered flooded, and the probability of inundation (PoI) of each point in the floodplain was determined using Eq. (4). The probabilistic flood hazard map (PFHM) showing the PoI of each flooded pixel in the floodplain is shown in Fig. 4. The PoI is high at most locations along the reach, and a few locations along the study reach are enlarged in the figure to show the variations in the PoI clearly. Locations with high PoI (> 0.9) are those found to be inundated in most of the realizations, which indicates that these locations can be expected to be inundated even when the peak flow is 40% lower than that of the 2011 flood. Such high PoIs at most locations can be attributed to the topography of the floodplain along the study reach. The Moose Jaw River is characterized by narrower valleys, with the floodplain extending to only a few hundred meters along the river until the confluence with the Qu’Appelle River (Fig. 1b). In the Qu’Appelle River, the floodplain extends for over a kilometer at places and is very flat; thus, water leaving the channel and flowing onto the floodplain is likely to inundate the entire valley (Fig. 1c).

However, even in locations with high PoI, the depth of inundation varies, and this variation is shown in Fig. 5. The PoI and the average depth of inundation (DoI) are plotted for two small stretches, one in the Moose Jaw River that was modeled in 1D (top left and bottom left panels in Fig. 5) and the second in the Qu’Appelle River that was modeled in 2D (top right and bottom right panels in Fig. 5). In the stretches shown in the figure, low PoI is associated with locations where the average DoI is small, and vice versa. High DoI is restricted to the channel in the Moose Jaw River, whereas multiple locations in the floodplain have a high DoI in the Qu’Appelle reach. This is due to the presence of “potholes” in the floodplain that retain water during high flows.

The DoI obtained from the ensemble of realizations can also be used to assess the uncertainty in the inundation extents and depths. Figure 6 shows the coefficient of variation (CoV) of the DoI plotted against the mean depth of inundation as scatter plots, together with its spatial variation. The CoV is largely dependent on the sample size, and large values of CoV indicate a larger variation in DoI across realizations and hence increased uncertainty in inundation. It can be observed from the spatial distribution of the CoV values that most locations with high CoV (CoV > 0.5) are located at the outer edges of the PFHM, which are characterized by low DoI (Fig. 6a). Similarly, the spatial distribution of CoV for the second stretch (Fig. 6b) shows low CoV (CoV < 0.5) at most locations, indicating less uncertainty in their inundation depths. The scatter plot indicates the presence of locations with CoV greater than 1, but these are few and visually indistinguishable on the map. In the present case study, high CoV values were restricted to locations with low DoI. However, locations with both high CoV (> 0.5) and high DoI can also be expected, depending on the profile of the floodplains. This approach provides the depth of inundation at every pixel in the floodplain, along with two measures of uncertainty: the probability of inundation (PoI) and the variability in the depth of inundation (DoI) as represented by the CoV.
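The per-pixel quantities discussed above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: Eq. (4) is taken here to be the fraction of successful realizations in which a pixel is flooded, the mean DoI and CoV are computed over all successful realizations using the population standard deviation, and the function name `poi_and_cov` and the toy depths are hypothetical.

```python
from statistics import mean, pstdev


def poi_and_cov(depths_by_realization, min_depth=0.1):
    """Return per-pixel (PoI, mean DoI, CoV) from an ensemble of depth grids.

    depths_by_realization: list of realizations, each a list of water depths
    (m) with one entry per pixel. Only successful realizations are passed in.
    """
    n_real = len(depths_by_realization)
    n_pix = len(depths_by_realization[0])
    poi, mean_doi, cov = [], [], []
    for j in range(n_pix):
        depths = [realization[j] for realization in depths_by_realization]
        # Assumed form of Eq. (4): share of realizations with depth >= threshold.
        poi.append(sum(1 for d in depths if d >= min_depth) / n_real)
        m = mean(depths)
        mean_doi.append(m)
        # CoV = standard deviation / mean; undefined where the mean depth is zero.
        cov.append(pstdev(depths) / m if m > 0 else float("nan"))
    return poi, mean_doi, cov


# Toy ensemble: four realizations, three pixels (channel, floodplain, edge).
ensemble = [
    [1.0, 0.2, 0.00],
    [1.2, 0.0, 0.00],
    [0.8, 0.4, 0.00],
    [1.0, 0.2, 0.05],
]
poi, mean_doi, cov = poi_and_cov(ensemble)
```

In this toy ensemble the deep channel pixel has a PoI of 1.0 and the lowest CoV, whereas the shallow edge pixel never reaches the 0.1 m threshold and has the highest CoV, which mirrors the pattern described for the outer edges of the PFHM.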

Fig. 4 Probabilistic flood hazard map for the entire reach (top) and enlarged portions at a location along the Moose Jaw River (bottom left) and along the Qu’Appelle River (bottom right)

Fig. 5 (a) Probability of inundation for a smaller stretch of the Moose Jaw River (left panel) and the Qu’Appelle River (right panel), and (b) average depth of inundation (DoI) along the same reaches

Fig. 6 Scatter plot of coefficient of variation versus mean depth of inundation, and the spatial distribution of the CoV values for (a) a stretch along the Moose Jaw River and (b) a stretch along the Qu’Appelle River

Sensitivity analysis using the RSA method was carried out to assess the sensitivity of the total inundated area to the inflow and roughness parameters. As the study reach was long and consisted of floodplains of different topography, the variation in the sensitivity analysis results between locations was also determined. The two smaller reaches used for the previous analysis were considered, and the RSA method was applied to each. As the RSA requires the model outputs (total inundated area) to be divided into two subsets, the median of the total inundated areas obtained from all realizations was taken as the dividing point. The empirical CDFs of the inflows and roughness parameters in both subsets were plotted, and the mvd measure was estimated. Figure 7 shows the mvd estimated at the two smaller stretches. The mvd estimate for the stretch located on the Moose Jaw River (Fig. 7a) indicates that the channel roughness (nch) has the greatest influence on the maximum flooded area, followed by the floodplain roughness (nfp) and the hydrograph (hyd). The channel roughness governs how quickly water is routed into the floodplain once the bankfull condition is reached and affects the rate of flood inundation; it has therefore been found to be the most sensitive parameter in hydraulic models in multiple studies (e.g., Aronica et al. 1998; Hall et al. 2005; Savage et al. 2016a, b). It can be observed that the boundary condition has a lower effect even though the peak flows are varied by ~ 40%. Previous studies have reported instances where the influence of upstream boundary conditions (inflow hydrographs) on the model outputs is high at the upstream end of the modeled reaches and low in the middle reaches (Hall et al. 2005; Pappenberger et al. 2006). The stretch of the Moose Jaw River for which the RSA was performed is located roughly at the center of the reach modeled in 1D, and therefore the roughness parameters could be more influential on the inundated area for the stretch considered.
The mvd estimate for the stretch of the Qu’Appelle River (Fig. 7b) indicates that the total inundated area is largely insensitive to nch, nfp, and hyd. This could be because (a) the flooding extent covers the entire floodplain in all realizations, causing little to no change in the flooding extents, and (b) the stretch corresponds to a flatter floodplain that contains “potholes” and has milder slopes along the river, where damping of the parameter effects could be expected. The robustness of the mvd was evaluated by estimating its value over 100 bootstrapped samples, each having the same size as the number of successful realizations (4146). The mvd estimated for each bootstrapped sample is shown as black dots in Fig. 7. It can be observed that the value of mvd does not change significantly across the bootstrapped samples and can be considered robust for both stretches for which the sensitivity analysis was carried out. For the present case study, the influence of channel and floodplain roughness on the total inundated area is found to be greater where the river is characterized by steeper slopes than at locations with milder slopes and flatter floodplains.
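The RSA procedure described above (median split of the output, empirical CDFs of each input in the two subsets, and a bootstrap check) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that mvd denotes the maximum vertical distance between the two empirical CDFs, as is common in RSA, and it replaces the hydraulic model with a synthetic response in which the inundated area depends on channel roughness only. The function names, parameter ranges, and sample size of 500 are all hypothetical.

```python
import random
from bisect import bisect_right
from statistics import median


def ecdf(sample):
    """Return the empirical CDF of `sample` as a callable."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def mvd(param_values, outputs):
    """Maximum vertical distance between the parameter CDFs of two subsets.

    The subsets are formed by splitting the realizations at the median of
    `outputs` (here, total inundated area).
    """
    split = median(outputs)
    below = [p for p, y in zip(param_values, outputs) if y <= split]
    above = [p for p, y in zip(param_values, outputs) if y > split]
    if not above:
        # Output never exceeds its median, so no split is possible
        # (assumed convention: report zero sensitivity).
        return 0.0
    cdf_below, cdf_above = ecdf(below), ecdf(above)
    # Both CDFs are step functions, so the largest gap occurs at a sample value.
    return max(abs(cdf_below(x) - cdf_above(x)) for x in param_values)


def bootstrap_mvd(param_values, outputs, n_boot=100, seed=0):
    """Re-estimate mvd on `n_boot` resamples (with replacement) of equal size."""
    rng = random.Random(seed)
    n = len(outputs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(mvd([param_values[i] for i in idx],
                             [outputs[i] for i in idx]))
    return estimates


# Synthetic stand-in for the hydraulic model: area responds to n_ch only.
rng = random.Random(42)
n_ch = [rng.uniform(0.02, 0.06) for _ in range(500)]   # made-up range
n_fp = [rng.uniform(0.04, 0.12) for _ in range(500)]   # made-up range
area = [10.0 * a + rng.gauss(0.0, 0.01) for a in n_ch]

mvd_nch = mvd(n_ch, area)        # large: the CDFs of the subsets separate
mvd_nfp = mvd(n_fp, area)        # small: the CDFs nearly coincide
boot_nch = bootstrap_mvd(n_ch, area)
```

A narrow spread of the bootstrapped values around the full-sample estimate is what the text refers to as a robust mvd. A parameter whose two subset CDFs nearly coincide, as for `n_fp` in this synthetic case, would be read as having little influence on the inundated area.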

Fig. 7 The estimated mvd of inflow (hyd), channel roughness (nch) and floodplain roughness (nfp) for (a) a stretch of the Moose Jaw River (modeled in 1D), and (b) a stretch of the Qu’Appelle River (modeled in 2D). The dots indicate the mvd obtained from each bootstrapped sample

To determine the effect of subjectivity in distinguishing between wet and dry areas in the floodplain through the choice of a minimum depth of inundation, three different values were considered, as mentioned in the section detailing the probabilistic mapping framework. For each realization, the minimum depth of inundation was increased from 0.1 to 0.3 m, and the PoI was determined for each of these depths using Eq. (4). PFHMs corresponding to each of the minimum depths were obtained. For clarity, the results showing the effects of minimum depth are presented for two small stretches in Figs. 8 and 9. Figure 8 shows the variation in the probability of inundation as well as the areal extents of inundation when different minimum depths are considered along a smaller stretch of the Moose Jaw River. The results indicate little to no change in locations having a high PoI, whereas variation is noticeable in the middle and low ranges of PoI. The variation in the inundated area when the minimum depth is varied is shown as boxplots, and it can be observed that the median inundated area decreases from 0.54 to 0.5 km² (~ 10%) as the minimum depth increases. Variation in the PoI along the stretch of the Qu’Appelle River is shown in Fig. 9. The reduction in the PoI along the boundaries of the floodplain with the increase in minimum depth is clear at a few locations in this stretch. The boxplots of the inundated area show a marked difference in the flooding extents as the minimum depth is increased, evident from the reduction in the inter-quartile ranges as well as in the overall inundated area. The median inundated area decreases from 7.8 to 6.8 km² (~ 13%). Although the percentage changes in the two stretches are comparable, the effect on the overall flooding extent is more significant in the Qu’Appelle River, as the floodplain is mainly flat and the depth of inundation is low at many places.
Choosing the minimum depth at which a location is considered flooded is a crucial step, as it not only helps determine the extent of the flooding but can also help decision makers take necessary precautions during development along the floodplains (e.g., ensuring that the plinth level of structures is higher than the minimum depth of inundation).
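The effect of the wet/dry threshold on the inundated area can be illustrated with a short Python sketch. The depths, the pixel area, and the intermediate threshold of 0.2 m are assumptions chosen for this example (the text states only that three values between 0.1 and 0.3 m were used), and `median_inundated_area` is a hypothetical helper name.

```python
from statistics import median


def median_inundated_area(depths_by_realization, min_depth, pixel_area_km2):
    """Median over realizations of the area with depth >= min_depth (km^2)."""
    areas = [
        sum(1 for d in depths if d >= min_depth) * pixel_area_km2
        for depths in depths_by_realization
    ]
    return median(areas)


# Made-up depths (m) for three realizations of a six-pixel transect.
depths_by_realization = [
    [0.05, 0.15, 0.25, 0.35, 1.0, 2.0],
    [0.00, 0.12, 0.22, 0.32, 0.9, 1.8],
    [0.10, 0.20, 0.30, 0.40, 1.1, 2.2],
]
PIXEL_AREA_KM2 = 0.01  # hypothetical pixel size, not the study's DEM resolution

medians = {
    h: median_inundated_area(depths_by_realization, h, PIXEL_AREA_KM2)
    for h in (0.1, 0.2, 0.3)  # 0.2 m is an assumed middle value
}
# Relative reduction in the median area between the lowest and highest threshold.
relative_drop = (medians[0.1] - medians[0.3]) / medians[0.1]
```

Because the toy transect is dominated by shallow pixels, raising the threshold removes a large share of the flooded area. This is the same mechanism that makes the flat Qu’Appelle floodplain more sensitive to the threshold than the narrower Moose Jaw valley.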

Fig. 8 Variation in the probability of inundation and inundation extent (km²) due to the variation in the minimum depth of inundation for a small stretch of the Moose Jaw River

Fig. 9 Variation in the probability of inundation and inundation extent (km²) due to the variation in the minimum depth of inundation for a small stretch of the Qu’Appelle River

6.2 Evaluation of CFHM against PFHM

To evaluate the large-area flood hazard maps against local-scale hazard maps obtained using detailed hydraulic modeling, a qualitative and quantitative comparison was carried out between the CFHM and the PFHM. The portion of the CFHM containing the study reach was extracted from the larger map covering all of Canada, and the PFHM and the CFHM for a small stretch along the Qu’Appelle River are presented in Fig. 10. A visual comparison between the two maps indicates that the hazard level “severe” has a slightly wider extent than the PFHM. As the CFHM was developed based on distance and elevation and does not reflect the extent of an observed flood, overestimation/underestimation of the extents can be expected. It can also be noted that the PFHM shows the extent of flooding on the Qu’Appelle River alone, whereas the CFHM takes into account other streams that flow into the Qu’Appelle, so that locations close to such streams would also be classified as “severe.” It can also be seen from the figure that there are small areas within the flooding extents of the PFHM that are dry. The CFHM classifies these locations into a lower hazard level, indicating the ability of the CFHM to provide a fairly good representation of the flooding extent using topographic information only.

Fig. 10 (a) The classified flood hazard map for a stretch along the Qu’Appelle River, and (b) the probabilistic flood hazard map showing the probability of inundation (PoI) for the same stretch

Quantitatively, the comparison between the CFHM and PFHM was carried out using Eqs. (6) and (7) by varying the criteria used to define “flooded” and “non-flooded” pixels in both maps. The CFHM contains locations that are classified as hazard level “very low” (shown in blue in Fig. 10a). These locations are highly unlikely to be flooded and were omitted from the comparison, as the large number of pixels belonging to this hazard level would significantly increase the number of non-flooded pixels in the floodplain and thereby affect the calculation of the measures used for comparison. First, the comparison was carried out by considering the hazard level “severe” as flooded areas in the CFHM and the remaining hazard levels as non-flooded, while the entire extent of the PFHM was considered flooded. The DoA and Sp values for the entire reach were found to be 0.91 and 0.75, respectively. This indicates that the CFHM identifies a flooded pixel in the PFHM as flooded with an accuracy of 91%, whereas its accuracy in identifying a non-flooded pixel is 75%. The lower accuracy in identifying non-flooded pixels is due to the presence of hazard level “severe” outside the boundary of the PFHM. The measures were also calculated by dividing the study reach into two sections to determine the effect of topography on these measures. The DoA and Sp were calculated for two stretches: the Moose Jaw River up to its confluence with the Qu’Appelle River, characterized by a narrower floodplain (shown in Fig. 1b), and a portion of the Qu’Appelle River below the confluence, characterized by a flatter and wider floodplain (shown in Fig. 1c). For the Moose Jaw River, the DoA and Sp values were found to be 0.81 and 0.86, respectively, whereas the values were 0.95 and 0.69 for the stretch on the Qu’Appelle River.
The identification of flooded areas is better (DoA is higher) in the flatter Qu’Appelle River, whereas Sp is lower (identification of non-flooded areas is less accurate) than the values estimated for the Moose Jaw River. The reduction in Sp over the Qu’Appelle River indicates that the hazard level “severe” extends significantly beyond the boundary of the PFHM. The lower accuracy of the CFHM in identifying “non-flooded” areas (as shown by lower values of Sp) in flatter and wider floodplains could be attributed to the way the CFHM is prepared. The hazard levels in the CFHM are defined by user-defined criteria that consider the distance and elevation of a location relative to the nearest stream. In flatter floodplains, small variations in elevation over large areas could result in a particular hazard level covering a wider area than other hazard levels. At such locations, the accuracy of CFHMs in representing flooding extents could be improved by utilizing additional information in the form of observed historical flood extents or hydraulic model outputs.

Second, the effect of combining more than one hazard level in the CFHM was evaluated by combining the “severe” and “high” levels and considering them as flooded, while maintaining the entire extent of the PFHM as flooded. The DoA and Sp were recalculated for the new classification and were found to be 0.94 and 0.61 for the Moose Jaw River, and 0.99 and 0.35 for the stretch of the Qu’Appelle River. The improvement in correctly identifying flooded areas for both stretches indicates that a significant number of pixels classified as “high” lie within the boundary of the PFHM; however, combining the two levels results in a significant overestimation of the total flooded area. The reduction in Sp is greater in the flatter floodplains of the Qu’Appelle River (~ 35%), as locations classified as “high” in the CFHM can extend over larger areas in flat floodplains, whereas the extent of the same hazard level is limited in narrower floodplains. From this analysis, it seems that the “severe” flood hazard class in the CFHM is a reasonable compromise for preliminary estimation of flooded and non-flooded areas.

Third, the criterion for flooded area in the PFHM was modified based on the probability of inundation, and its effect on DoA and Sp was assessed in order to evaluate the accuracy of the CFHM at locations with different PoI values. The delineation of flooding extent in the PFHM was modified by varying the minimum PoI from 0 to 0.99 in increments of 0.2, while considering the hazard level “severe” as flooded in the CFHM. Table 1 shows the variation in the DoA and Sp with an increase in the minimum PoI used to define the extent of the PFHM as “flooded” for the Moose Jaw River. A marginal increase of ~ 6% is observed in DoA, whereas a reduction of ~ 10% is observed in Sp. The results indicate that the agreement between the CFHM and the PFHM in the flooded pixels is higher at locations with a very high probability of inundation. The results also highlight the complementary nature of the two measures, wherein modifying the CFHM to improve the representation of flooded areas (high DoA) could reduce the accuracy in the representation of non-flooded areas (low Sp). Therefore, modifying the CFHM to match actual inundation extents can improve the representation of flooded areas at the cost of non-flooded areas being incorrectly classified as flooded.
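The three comparisons (the “severe” level alone, “severe” combined with “high”, and a varying minimum PoI) differ only in how each pixel is labeled as flooded before Eqs. (6) and (7) are applied. The sketch below makes that explicit. It is a hedged illustration: the record format, the function name `doa_sp`, the hazard level names “moderate” and “low”, and all pixel values are assumptions for this example, and only “severe”, “high”, and “very low” appear in the text.

```python
def doa_sp(pixels, flooded_levels, min_poi=0.0, ignored_levels=("very low",)):
    """Return (DoA, Sp) for a list of (hazard_level, poi) pixel records.

    flooded_levels: CFHM hazard levels treated as flooded.
    min_poi: minimum PoI for a pixel to count as flooded in the PFHM;
             with min_poi = 0 the entire PFHM extent (PoI > 0) is flooded.
    """
    f_c = f_oc = nf_c = nf_oc = 0
    for level, poi in pixels:
        if level in ignored_levels:
            continue  # "very low" pixels are omitted from the comparison
        in_cfhm = level in flooded_levels
        in_pfhm = poi > 0.0 and poi >= min_poi
        if in_pfhm and in_cfhm:
            f_c += 1       # true positive
        elif in_pfhm:
            f_oc += 1      # false negative
        elif in_cfhm:
            nf_oc += 1     # false positive
        else:
            nf_c += 1      # true negative
    doa = f_c / (f_c + f_oc) if f_c + f_oc else float("nan")
    sp = nf_c / (nf_c + nf_oc) if nf_c + nf_oc else float("nan")
    return doa, sp


# Made-up pixel records: (CFHM hazard level, PFHM probability of inundation).
pixels = [
    ("severe", 1.0), ("severe", 0.95), ("severe", 0.5), ("severe", 0.0),
    ("high", 0.9), ("high", 0.3), ("high", 0.0),
    ("moderate", 0.1), ("moderate", 0.0),
    ("low", 0.0), ("very low", 0.0), ("very low", 0.0),
]

severe_only = doa_sp(pixels, {"severe"})
severe_high = doa_sp(pixels, {"severe", "high"})
# Minimum PoI thresholds following the text: 0 to 0.99 in steps of 0.2.
sweep = {t: doa_sp(pixels, {"severe"}, min_poi=t)
         for t in (0.0, 0.2, 0.4, 0.6, 0.8, 0.99)}
```

With these toy records, adding the “high” level raises DoA and lowers Sp, and raising the minimum PoI to 0.8 has the same qualitative effect. This reproduces the trade-off between the two measures described above, though not the reported magnitudes.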

Table 1 Variation in DoA and Sp with an increase in minimum PoI considered to define extent of flooding in the PFHM

7 Conclusions

A probabilistic framework for flood mapping that accounts for uncertainties in inflow hydrographs and roughness parameters is applied to a prairie river in the present study. The Qu’Appelle River, which is prone to frequent floods in the Canadian prairies, was selected, and a hydrodynamic model was developed for a reach of the river. The model parameters, namely the channel and floodplain roughness and the upstream boundary conditions, were perturbed within a plausible range, and a probabilistic flood hazard map (PFHM) was generated that assigns a probability of inundation to each location within the floodplain. The PoI is high for most locations along the Qu’Appelle River, as the floodplain is relatively flat and tends to be inundated even for flows 40% lower than the base flood hydrograph. The variability in the depth of inundation (DoI), represented by the coefficient of variation, was found to be low at locations with high DoI and vice versa, indicating less uncertainty at locations where the DoI is high. The sensitivity analysis results indicated that the influence of channel roughness on flooding extents is greater at locations characterized by steep slopes and unidirectional flow, whereas the flood extents were insensitive to roughness parameters and boundary conditions at locations where the channel had a mild slope and a flat floodplain characterized by small depressions that retain water during high flows. The study was extended to determine the effect of choosing a threshold to distinguish between flooded and non-flooded areas. It was observed that the reduction in inundated area and PoI with an increase in the minimum flooding depth is significant in locations with flatter and wider floodplains. Evaluating the CFHM by comparing it with the PFHM indicates that the CFHM is fairly accurate in representing flooded areas for the present case study, even though it was prepared using only topographical data. The CFHM overestimates flooded areas where the floodplains are characterized by flat low-lying areas, and using the CFHM to identify flood-prone areas at such locations would require additional information.

This study is limited by the lack of a finer-resolution digital elevation model and the non-availability of observed flooding extents for the modeled event. Most locations in Canada have not been covered by detailed LiDAR surveys that could be used to determine flood extents accurately. Hence, determining an appropriate DEM and resolution would be of importance. Future work can include assessing the variation of flooding extents by considering DEMs from different sources and of different resolutions to determine the error in flood mapping in Canada. High-resolution satellite images could also be identified and used to determine flooding extents. This study considers an approach to generate random realizations of hydrographs that is best suited for reconstructed hydrographs or hydrographs obtained from hydrological models. If hydrological models are used to generate boundary conditions for hydraulic models, the uncertainty in flooding extent due to such estimated hydrographs can also be assessed.