1 Introduction

Precipitation sustains human society by providing essential water resources, yet it also threatens society with floods and droughts through its uneven and extreme distribution in time and space. The features of precipitation are therefore frequently studied from various perspectives (e.g. Trenberth et al. 2003). Western Canada (Fig. 1) holds more than 80 percent of Canada’s farmable area (Veeman and Veeman 2015). Since water supply is fundamental to farming, especially during the growing season, and precipitation is the most direct and economical source of water, a comprehensive understanding of precipitation features in western Canada is required (Bonsal et al. 1999).

The object-based approach known as the Method for Object-based Diagnostic Evaluation (MODE) is a promising technique that provides an advanced understanding of precipitation (e.g. Prein et al. 2017b). It was originally proposed to evaluate the skill of numerical weather prediction models. Ebert and McBride (2000) proposed an object-oriented verification procedure within the framework of “contiguous rain areas”. By identifying rain fields bounded by a user-specified threshold, their approach decomposed the total error of a precipitation forecast into three parts: location, rain volume, and pattern. Davis et al. (2006a) developed an object-based method of defining rain areas, which is complementary to the approach of Ebert and McBride (2000). Specifically, they were able to measure more object attributes, such as area and intensity, and to match forecast and observed objects. Using this matching algorithm, Davis et al. (2006b) grouped spatially and temporally coherent areas into rain systems, and compared the properties of these rain systems simulated by the Weather Research and Forecasting (WRF) model with stage-IV precipitation observations from the National Centers for Environmental Prediction (NCEP).

Recently, Clark et al. (2014) incorporated the time dimension into MODE (MODE-TD, or MTD), making it possible to track an object and provide information over its entire life cycle, including longevity, timing of initiation and dissipation, propagation speed, and evolution. Using this tracking algorithm, Prein et al. (2017a) detected hourly precipitation from mesoscale convective systems (MCSs) in the WRF model and compared it with the stage-IV analysis. They found that the WRF model is able to produce realistically propagating MCSs, which had been a long-standing challenge in climate modelling. Prein et al. (2017b) further applied this tracking algorithm to identify and analyze future convective storms in the US. These studies show that the tracking algorithm is promising for advanced analysis of various climate phenomena, especially in precipitation-related research. Although a growing number of studies use MTD, all of them are limited to regions where precipitation datasets with high spatiotemporal resolution (grid spacing no larger than 5 km and time step no longer than 1 h) are available.

However, it remains an open question whether this state-of-the-art technique can be appropriately applied in regions without high spatiotemporal-resolution precipitation datasets. While MTD is receiving increasing interest (e.g. Prein et al. 2017b), its applicability to regions with different data quality has never been systematically addressed. Western Canada (Fig. 1) is a region with extreme topographic gradients, land-sea contrast, and strong precipitation gradients. It therefore receives considerable interest in the climate community (e.g. Erler et al. 2015; Erler and Peltier 2016; Li et al. 2019). Until now, however, the MTD approach has never been applied in this region, possibly because of the absence of high-resolution precipitation datasets. Since there is no fundamental reason to suggest that MTD is inappropriate for regions without high-resolution precipitation datasets, we investigate the performance of MTD in western Canada, using precipitation datasets with different spatial and temporal resolutions.

Dynamical downscaling using regional climate models with horizontal grid spacing of less than 4 km can resolve deep convection (Prein et al. 2013). These models are therefore referred to as convection-permitting climate models (CPCMs). It is generally accepted that CPCMs provide more reliable climate information than the traditionally used large-scale models (Weisman et al. 1997; Prein et al. 2015). CPCMs use non-hydrostatic governing equations and cloud microphysical processes to resolve deep convection (Dudhia 1993). They are able to discriminate between small scattered showers and large organized structures with different convection intensities (Westra et al. 2014), and to better reproduce the diurnal cycle of precipitation (Scaff et al. 2019; Woodhams et al. 2018). In addition, increasing resolution leads to a more realistic representation of the orography and land surface (Prein et al. 2013). The considerably improved precipitation forecasts by CPCMs provide an opportunity to further investigate precipitation features in regions with insufficient observations (Li et al. 2017).

The current study applies the new tracking algorithm to investigate precipitation systems in western Canada, using three near-real-time precipitation datasets as references and a CPCM-simulated precipitation dataset with finer spatiotemporal resolution, which can complement the relatively coarser reference datasets. The purpose of this study is twofold: (1) to investigate advanced features of the simulated and observed precipitation that go beyond simply comparing amount and intensity; and (2) to assess the sensitivity of MTD results to datasets with different spatiotemporal resolutions, and to provide suggestions for the application of MTD in data-scarce regions. The remainder of this study is organized as follows. Section 2 describes the three near-real-time precipitation datasets, the convection-permitting climate model, and the MTD software. Section 3 describes the MTD results and precipitation features in western Canada, Sect. 4 examines how the spatial and temporal resolution of the source data influence the MTD results, using two control experiments, and Sect. 5 contains conclusions and discussion.

2 Data and methodology

2.1 Gridded precipitation estimates

For the present study, three near-real-time quantitative precipitation estimates with different spatiotemporal resolutions have been used: Canadian Precipitation Analysis [CaPA, Fortin et al. (2018)], North American Regional Reanalysis [NARR, Mesinger et al. (2006)], and Multi-Source Weighted-Ensemble Precipitation [MSWEP, Beck et al. (2019)] Version 2 (V2). Table 1 presents the spatiotemporal resolutions of the three precipitation estimates and of the CPCM-simulated precipitation (Sect. 2.2). To ensure the same MTD settings, the three reference datasets have been regridded to a common 4-km Lambert Conformal projection, which is the same grid as the CPCM output (see more details in Sect. 2.4). All four datasets have been documented as performing well from various traditional statistical perspectives (e.g. Fortin et al. 2018; Mesinger et al. 2006; Beck et al. 2019; Li et al. 2019).

Table 1 Overview of the spatiotemporal resolution for the 3 precipitation estimates, and for the CPCM simulated precipitation

CaPA is a gridded quantitative precipitation estimate product based on the optimal combination of near-real-time precipitation observations from both in situ and radar networks with a first guess provided by the Global Environmental Multiscale (GEM) model (Fortin et al. 2018). Precipitation accumulations over 6-h periods are estimated four times per day at 0000, 0600, 1200, and 1800 UTC. For each period, a short-term forecast with a lead time of 6 to 12 h is generated by the GEM model (Mailhot et al. 2006). This forecast is known as the background field, or first guess, of the CaPA product. Differences between the cube root of observed precipitation and the cube root of forecast precipitation at neighbouring locations are then calculated. These differences are interpolated by residual simple kriging and the resulting precipitation increment is added to the background field at the analysis location. A back-transformation is performed and a bias correction is applied to the final product. During the past decade, CaPA has been used extensively in research in Canada, and has been reported as “a good alternative source of precipitation data for regions with a sparse observational network” (Eum et al. 2014). For more information on the CaPA analysis, please refer to Fortin et al. (2015), Hanes et al. (2017), and Fortin et al. (2018).

NARR is an atmospheric and land surface hydrology dataset for North America, available from 1979 to 2003 and continued in near-real time as the Regional Climate Data Assimilation System (R-CDAS). It is essentially based on the April 2003 frozen version of the NCEP mesoscale Eta forecast model and its data assimilation system (EDAS) (Mesinger et al. 2006; Bukovsky and Karoly 2007). NARR assimilates precipitation observations into the atmospheric analysis, which is reported to be successful and to include a two-way interaction between precipitation and the improved land surface model (Ruiz-Barradas and Nigam 2006). The precipitation estimate in NARR is available at 3-h temporal and 32-km spatial resolution. For more information on NARR, please refer to Mesinger et al. (2006).

MSWEP is a gridded precipitation dataset developed at Princeton University. It takes advantage of the complementary strengths of gauge-, satellite-, and reanalysis-based precipitation datasets and therefore provides precipitation estimates covering the entire globe at high spatial (0.1\(^{\circ }\)) and temporal (3-h) resolution. For more information on MSWEP, please refer to Beck et al. (2019).

The present study is not intended to compare the quality of the datasets, because they are designed for different applications and have different spatiotemporal resolutions, which can influence MTD results. Instead, we are more interested in the spatial and temporal features of precipitation revealed by the MTD method, and thus in gaining a more comprehensive understanding of the precipitation systems in western Canada. In addition, the sensitivity of MTD results to source data with different spatial and temporal resolutions, and the corresponding suggestions on choosing source datasets, will be addressed. The two observation-based datasets, CaPA and MSWEP, together with NARR will be used as reference datasets. The CPCM-simulated precipitation has finer spatiotemporal resolution and is therefore expected to complement the relatively coarser reference datasets.

2.2 CPCM setup

The Weather Research and Forecasting (WRF) model version 3.6.1 was used to downscale the 6-h 0.7\(^{\circ }\)\(\times \) 0.7\(^{\circ }\) resolution European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-Interim; Dee et al. 2011) to 4-km horizontal grid spacing over western Canada (Fig. 1). The total model domain size is 2560 km in the east–west direction and 2800 km in the north–south direction. The model contains 639 \(\times \) 699 horizontal grid points and 37 vertical eta levels with top level at 50 hPa. The WRF simulation extends from 1 October 2000 to 30 September 2015 (Li et al. 2019; Kurkute et al. 2019).

The New Thompson microphysics scheme (Thompson et al. 2008) and the Yonsei University (YSU) planetary boundary layer scheme (Hong et al. 2006) are used as the main physical packages in our WRF simulation. For shortwave and longwave radiation, the Community Atmosphere Model (CAM) scheme from the CAM 3 climate model is used (Collins et al. 2004). The land surface component is the Noah land-surface model (Chen and Dudhia 2001). With 4-km horizontal grid spacing, the model resolves deep convection explicitly and the deep cumulus parameterization is turned off. No sub-grid cloud cover or shallow cumulus parameterizations have been used in the present study (Li et al. 2019).

Fig. 1

WRF simulation domain (2560 km x 2800 km), showing topographic height in meters above mean sea level (MSL). The two red lines divide the research area into three parts. From left to right, these are the region on the west side of the Rockies (i.e. the coastal region), the Rockies (i.e. the mountainous region), and the region on the east side of the Rockies (i.e. the prairies)

2.3 MODE time domain (MTD)

MTD is part of the Developmental Testbed Center’s (DTC) Model Evaluation Tools (MET; the current version is available online at http://www.dtcenter.org/met/users/downloads/). In MTD, contiguous regions of grid points that exceed a specified threshold in both space and time are identified as 3-dimensional (3D) time-domain objects. Information about each identified object, including its size, location, propagation speed, lifetime, and track length, can then be analyzed.

Fig. 2

Schematic diagram showing the relationships between speed, size, and temporal resolution of mask fields. See text for details

The main idea behind MTD (as well as MODE) is a convolution-thresholding approach, which breaks down into four major steps. The first step is smoothing, which involves a convolution of two functions: the raw data field and the convolution filter. The smoothing process is governed by the filter’s tunable parameter, the radius. The second step is thresholding, which applies a user-defined threshold to the smoothed field. Thresholding produces a mask field (i.e. the identified 2D precipitation object) in which grid cells with values equal to or larger than the threshold are assigned one, and all other grid cells are assigned zero. The third step is isolating. A unique number is assigned to each connected piece of the mask field, which forms the identified 3D precipitation system. Note that the term “connected” requires the mask fields to be connected in both space and time (plus or minus one time step). From this perspective, MTD analysis may favour source data with high temporal resolution.

Figure 2 is a schematic diagram illustrating the minimum requirement on temporal resolution. Mask field A represents an MTD-identified 2D precipitation object at \( \displaystyle t_1\), and mask field B represents a 2D precipitation object at the next time step \( \displaystyle t_2\) (i.e. \( \displaystyle t_1\) + \(\delta \displaystyle t\), where \(\delta \displaystyle t\) is the temporal resolution of the source data). We draw a line segment between the centroids of A and B. The length of the part of this segment located within mask field A is \( \displaystyle l_1\), which is positively related to the size of A; the length of the part located within mask field B is \( \displaystyle l_2\), which is positively related to the size of B. The total length of the line segment between the centroids of A and B is \( \displaystyle l\), which represents the distance between A and B. Mask field A at \( \displaystyle t_1\) is connected with mask field B at \( \displaystyle t_2\) when \( \displaystyle l < l_1 + l_2\). Since \( \displaystyle l = {\overline{v}} * (t_2 - t_1)\), where \({\overline{v}}\) is the average speed between \(t_1\) and \( \displaystyle t_2\), when A and B are connected we have:

$$\begin{aligned} {\overline{v}} * (t_2 - t_1) < l_1 + l_2 \end{aligned}$$
(1)

The requirement on temporal resolution \(\delta \displaystyle t\) is:

$$\begin{aligned} \delta t < \frac{ l_1 + l_2}{{\overline{v}}} \end{aligned}$$
(2)
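
As a quick numerical illustration of inequalities (1) and (2), the following minimal Python sketch evaluates the bound on the time step. The function names and the input numbers (two objects with \(l_1 = l_2 = 60\) km moving at 40 km h\(^{-1}\)) are hypothetical and chosen only for illustration; they are not taken from the datasets used in this study.

```python
def max_time_step_hours(l1_km, l2_km, mean_speed_km_h):
    """Upper bound on the time step from inequality (2): dt < (l1 + l2) / v."""
    if mean_speed_km_h <= 0:
        raise ValueError("mean speed must be positive")
    return (l1_km + l2_km) / mean_speed_km_h


def objects_connect(l1_km, l2_km, mean_speed_km_h, delta_t_h):
    """Inequality (1): A and B are connected when v * dt < l1 + l2."""
    return mean_speed_km_h * delta_t_h < l1_km + l2_km


# Hypothetical example values (assumptions, not results from the paper).
bound = max_time_step_hours(60.0, 60.0, 40.0)
for dt in (1.0, 3.0, 6.0):  # hourly, 3-h, and 6-h source data
    print(dt, objects_connect(60.0, 60.0, 40.0, dt))
```

With these assumed numbers the bound is 3 h, so the objects are linked in hourly data but not in 3-h or 6-h data, because the inequality is strict.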

Inequality (1) indicates that, when the size of the mask fields is nearly constant (similar \(l_1 + l_2\) values), MTD driven by data with coarse temporal resolution (large \((t_2 - t_1)\) value) tends to identify only slow-moving (small \( \displaystyle {\overline{v}}\) value) precipitation systems. A fast-moving precipitation system will either be split into several small subsystems (of which only the slow parts can be identified), or be (partly) removed when a subsystem (3D object) has fewer than 2000 grid cells. All four datasets have been regridded to a common 4-km grid with a Lambert Conformal projection (see more discussion in Sect. 2.4). Note that, in the third step, an additional constraint was used in this study: the isolated 3D precipitation systems are required to have a minimum volume of 2000 grid cells. This constraint was added for two reasons. First, a precipitation system with fewer than 2000 grid cells has a small precipitation area and a short duration. Such precipitation can hardly cause any hazard and is therefore not considered in this study. Second, when 3D systems with fewer than 2000 grid cells are considered, MTD generates a large number of 3D objects. In this case, MTD consumes a large amount of computational resources and is not suitable for long-term climate studies. A similar computational constraint has been addressed by Prein et al. (2017a).

The fourth step is restoring, which restores the original data to the object interiors. This makes further analyses of the objects possible, such as intensity-related calculations. For more information on MTD, please refer to the MODE Technical Notes (Bullock et al. 2016) and the MET Users Guide (Gotway et al. 2018).
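
The four steps described above (smoothing, thresholding, isolating, restoring) can be sketched in a few lines of pure Python. This is a toy illustration under stated assumptions, not the MET implementation: the function names are hypothetical, the filter is assumed to be a flat circular average, cells outside the domain are treated as zero, and connectivity is assumed to be face-to-face in space and time (the same cell at the previous or next time step, or one of the four lateral neighbours).

```python
from collections import deque


def smooth(field, radius):
    """Step 1: average each cell over a circular window (radius in grid cells)."""
    ny, nx = len(field), len(field[0])
    window = [(dy, dx)
              for dy in range(-radius, radius + 1)
              for dx in range(-radius, radius + 1)
              if dy * dy + dx * dx <= radius * radius]
    out = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            total = sum(field[y + dy][x + dx] for dy, dx in window
                        if 0 <= y + dy < ny and 0 <= x + dx < nx)
            out[y][x] = total / len(window)  # cells outside the domain count as 0
    return out


def threshold(field, thresh):
    """Step 2: mask is 1 where the smoothed value is >= thresh, else 0."""
    return [[1 if v >= thresh else 0 for v in row] for row in field]


def isolate(masks, min_volume):
    """Step 3: label space-time connected pieces, dropping small ones."""
    nt, ny, nx = len(masks), len(masks[0]), len(masks[0][0])
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nt)]
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    next_id = 1
    for t in range(nt):
        for y in range(ny):
            for x in range(nx):
                if not masks[t][y][x] or labels[t][y][x]:
                    continue
                labels[t][y][x] = next_id
                cells, queue = [(t, y, x)], deque([(t, y, x)])
                while queue:  # breadth-first flood fill in (time, y, x)
                    ct, cy, cx = queue.popleft()
                    for dt, dy, dx in steps:
                        p = (ct + dt, cy + dy, cx + dx)
                        if (0 <= p[0] < nt and 0 <= p[1] < ny and 0 <= p[2] < nx
                                and masks[p[0]][p[1]][p[2]]
                                and not labels[p[0]][p[1]][p[2]]):
                            labels[p[0]][p[1]][p[2]] = next_id
                            cells.append(p)
                            queue.append(p)
                if len(cells) < min_volume:
                    for ct, cy, cx in cells:
                        labels[ct][cy][cx] = -1  # visited, but too small to keep
                else:
                    next_id += 1
    return [[[max(v, 0) for v in row] for row in plane] for plane in labels]


def restore(raw, labels):
    """Step 4: put the original values back inside the kept objects."""
    return [[[raw[t][y][x] if labels[t][y][x] else 0.0
              for x in range(len(raw[t][y]))]
             for y in range(len(raw[t]))]
            for t in range(len(raw))]


def mtd_sketch(raw, radius, thresh, min_volume):
    """Run the four steps on raw[t][y][x]; returns (labels, restored)."""
    masks = [threshold(smooth(plane, radius), thresh) for plane in raw]
    labels = isolate(masks, min_volume)
    return labels, restore(raw, labels)
```

In the notation of this study, a call such as `mtd_sketch(raw, 8, 2.0, 2000)` would mimic the structure of the MTD8_2 settings on hourly data. The pure-Python loops are meant to show the logic only and would be far too slow for the full 639 \(\times \) 699 grid.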

We apply the threshold to the smoothed data rather than to the original data, because the latter yields a large number of objects, most of which are quite small. When a larger radius is used in smoothing, the identified objects normally tend to be larger and fewer, because small objects tend to be smoothed out. Likewise, a larger threshold leads to fewer and smaller objects.

To test the sensitivity of MTD results to different settings, we repeated all analyses using threshold values of 0.1, 2, and 4 mm h\(^{-1}\), and smoothing radii of 8, 16, and 32 grid cells. Unless otherwise noted, the results presented in this study are derived using either a smoothing radius of eight grid cells (32 km) with a threshold of 2 mm h\(^{-1}\) on the source data (hereinafter: MTD8\(\_\)2), or a smoothing radius of 32 grid cells (128 km) with a threshold of 0.1 mm h\(^{-1}\) (hereinafter: MTD32\(\_\)0.1). We are interested in the MTD8\(\_\)2 results because, in this case, moderate to heavy precipitation [ > 2 mm h\(^{-1}\); the description of precipitation intensity in this study is based on Table 7.2 of Ahrens (2009)] can be identified properly. Specifically, thunderstorms on the Prairies, which will be investigated in more detail in our future studies of extremes, are identified quite well by the MTD8_2 settings. We show results of MTD32_0.1 because it maximizes the size of the identified precipitation systems, which can reduce the negative influence of coarse temporal resolution (Sect. 3.2).

Since NARR and MSWEP are 3-h datasets, thresholds of 0.3 and 6 mm (per 3 h) were used for MTD32\(\_\)0.1 and MTD8\(\_\)2, respectively. Likewise, thresholds of 0.6 and 12 mm (per 6 h) were used for CaPA, and thresholds of 0.1 and 2 mm (per hour) were used for CPCM. Note that the temporal resolution of the MTD output is the same as that of the corresponding source data. In this study, the MTD output frequency is 6-h for CaPA, 3-h for NARR and MSWEP, and hourly for CPCM.
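
The accumulation thresholds quoted above follow from multiplying the hourly rate threshold by the accumulation period of each dataset. The short Python sketch below reproduces that arithmetic; the dictionary and function names are hypothetical, while the rates and periods are those stated in the text.

```python
# Accumulation period of each dataset, in hours (from the text).
PERIOD_H = {"CPCM": 1, "NARR": 3, "MSWEP": 3, "CaPA": 6}

# Rate thresholds of the two MTD settings, in mm per hour (from the text).
RATE_MM_PER_H = {"MTD32_0.1": 0.1, "MTD8_2": 2.0}


def accumulation_threshold(rate_mm_per_h, period_h):
    """Convert a rate threshold (mm/h) to a threshold on the accumulated amount (mm)."""
    return round(rate_mm_per_h * period_h, 6)  # rounding hides float noise such as 0.1 * 3


thresholds = {
    setting: {name: accumulation_threshold(rate, hours)
              for name, hours in PERIOD_H.items()}
    for setting, rate in RATE_MM_PER_H.items()
}

for setting, per_dataset in thresholds.items():
    print(setting, per_dataset)
```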

In this study, the MTD simulations were run from 1 September 2004 to 30 September 2015, driven by precipitation datasets of CaPA, NARR, MSWEP, and precipitation derived from the CPCM simulation, respectively.

Fig. 3

Track density, shown as the total number of identified precipitation systems within each 100 \(\times \) 100 km grid box from 1 September 2004 to 30 September 2015, for MTD8_2 (a–d) and MTD32_0.1 (e–h). The black contours show topographic height equal to 1300 m above mean sea level, indicating the location of the Rockies

2.4 Regrid the source data

To ensure that the same MTD settings have the same smoothing and thresholding effects for different source data, all the source data have to be collocated on the same grid. The regrid_data_plane tool, which is part of the Model Evaluation Tools provided by the Developmental Testbed Center, was used to regrid all the precipitation datasets onto the same grid as the WRF output: a 4-km grid with a Lambert Conformal projection. The choice of common grid largely depends on the research question being addressed. In this study, we are interested in detecting the influence of the spatial and temporal resolution of the source data on the MTD results. Therefore, it is desirable to preserve the higher-resolution source data. Note that interpolating coarser-resolution data to a higher-resolution grid will not artificially produce fine-scale structure (Wolff et al. 2014). This approach allows each source dataset to retain its original quality, without being degraded to a coarser grid. In Sect. 4, the regrid_data_plane tool was used to regrid the original CPCM output to a coarser spatial resolution, the same as that of NARR (i.e. from the 4-km to the 32-km grid). The coarsened CPCM output was then regridded back to the 4-km grid to drive MTD. In this case, the regridded CPCM output has lost its original fine-scale structure and is named “CPCM_spatial” in Sect. 4, indicating its coarser spatial features.
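
The coarsen-then-regrid-back procedure used for CPCM_spatial can be illustrated with a toy example. The sketch below is not regrid_data_plane and does not reproduce its interpolation options, which are not specified here: it assumes simple block averaging for coarsening and nearest-neighbour replication for returning to the fine grid, and the function names are hypothetical. A factor of 8 would correspond to going from the 4-km to the 32-km grid.

```python
def coarsen(field, factor):
    """Block-average a 2D field; each factor x factor block becomes one cell."""
    ny, nx = len(field), len(field[0])
    if ny % factor or nx % factor:
        raise ValueError("grid size must be a multiple of the factor")
    out = []
    for by in range(0, ny, factor):
        row = []
        for bx in range(0, nx, factor):
            block = [field[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def refine(field, factor):
    """Copy each coarse cell onto a factor x factor block of fine cells."""
    out = []
    for row in field:
        fine_row = [v for v in row for _ in range(factor)]
        out.extend(list(fine_row) for _ in range(factor))
    return out


# Toy 4 x 4 field with a single intense cell (assumed values).
fine = [[0.0] * 4 for _ in range(4)]
fine[0][0] = 16.0
round_trip = refine(coarsen(fine, 2), 2)
print(round_trip[0])
```

In this toy case the round trip keeps the grid size and the domain total but spreads the single 16.0 peak into a 2 \(\times \) 2 patch of 4.0, which is the kind of lost fine-scale structure the paragraph above refers to.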

3 Spatiotemporal feature of precipitation systems

3.1 Spatial feature of precipitation systems

The observed and simulated track densities of the precipitation systems between 2004 and 2015 are shown in Fig. 3, which is plotted based on the centroids of the identified 2D objects. For the MTD8_2 settings, the coastal region stands out for its high track density, indicating a high frequency of precipitation above 2 mm h\(^{-1}\). This feature can be seen in all four datasets, although the density values vary among them. Recall that MTD generates 2D objects hourly for CPCM, 3-hourly for NARR and MSWEP, and 6-hourly for CaPA. This difference in output frequency makes the track densities of CPCM generally higher than those of NARR and MSWEP, and gives CaPA the lowest density values. Therefore, in the present study, only the distribution pattern is analyzed rather than the specific density values.
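
The counting behind a track density map of this kind can be sketched as follows. This is an illustrative assumption about the bookkeeping, not the plotting code used for Fig. 3: centroid positions are taken to be given in kilometres from the south-west corner of the domain, and the function name is hypothetical.

```python
from collections import Counter


def track_density(centroids_km, box_km=100.0):
    """Count 2D-object centroids falling in each box_km x box_km grid box.

    centroids_km: iterable of (x_km, y_km) positions, one per 2D object.
    Returns a Counter keyed by (column index, row index) of the box.
    """
    counts = Counter()
    for x_km, y_km in centroids_km:
        counts[(int(x_km // box_km), int(y_km // box_km))] += 1
    return counts


# Three hypothetical centroids: two in box (0, 0), one in box (1, 0).
density = track_density([(50.0, 50.0), (99.0, 10.0), (150.0, 50.0)])
print(density)
```

Because one centroid is counted per 2D object, a dataset with hourly output contributes more centroids for the same storm than a 3-h or 6-h dataset, which is why only the spatial pattern is compared across datasets.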

Comparing Fig. 3a–d with Fig. 1 reveals that the density values are highest along the windward side of the Rockies and much lower along the interior valleys. Further east, the density values increase from the interior Rockies to the Prairies. This sandwich pattern can be seen in all four datasets. From this perspective, our CPCM simulation captures the spatial distribution of moderate to heavy precipitation ( > 2 mm h\(^{-1}\)) quite well.

Fig. 4

Probability density functions (PDFs) for the settings of MTD32_0.1 (columns 1, 3) and MTD8_2 (columns 2, 4), and for the regions located on the west side of the Rocky Mountains (columns 1, 2) and the east side of the Rocky Mountains (columns 3, 4). The axis ranges vary between plots. The shadings show estimates of the 5–95 percentile sampling uncertainty based on 100 bootstrap samples. Green lines show MTD results derived from CaPA, and yellow lines show MTD results from CPCM simulations with the same spatiotemporal resolution as CaPA

For MTD32_0.1 (Fig. 3e–h), the track density values are generally higher than for MTD8_2. This is partly because very light precipitation can be identified when using 0.1 mm h\(^{-1}\) as the threshold, and partly because using 32 grid cells as the smoothing radius makes it easier for the 2D objects to connect and thus for the 3D objects to reach 2000 grid cells. The density pattern of MTD32_0.1 differs from that of MTD8_2. Specifically, high density values can be seen in the mountainous region near (55\(^{\circ }\) N, 125\(^{\circ }\) W) and the adjacent prairies. Comparing Fig. 3a–d with e–h indicates that light precipitation (0.1 \(\sim \) 2 mm h\(^{-1}\)) occurs frequently over the interior Rockies and the adjacent prairies, while moderate to heavy (> 2 mm h\(^{-1}\)) precipitation is relatively rare there. The former can be attributed to the continuous orographic lifting of the prevailing westerlies, and the latter to the limited water vapour supply over the mountainous area (Rohli and Vega 2017; Clair 2009). Again, this feature can be seen in all four datasets, which gives more confidence in using the CPCM-simulated precipitation for further study.

The probability density functions (PDFs) derived from the four precipitation datasets are used to obtain a general understanding of the spatial and temporal features of precipitation systems in western Canada. Since the west side of the Rockies has a maritime climate and the east side a continental climate, for all the analyses below, precipitation systems are divided into two groups based on their location relative to the Rockies (west and east). As shown in Fig. 1, the two red lines are boundaries that divide western Canada into three parts: the coastal region, the mountainous region, and the prairies. Precipitation systems spending most (> 70%) of their lifetime on the west side of the Rockies are assigned to the coastal region, and precipitation systems spending most of their lifetime on the east side of the Rockies are assigned to the prairies. Mountainous precipitation systems are not considered below since they are mostly light precipitation caused by orographic lifting (see above), which can hardly cause any hazard.
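
The grouping rule described above can be written as a small function. The sketch assumes that each tracked system comes with one region label per time step (for example, from the position of its 2D centroid relative to the two red lines in Fig. 1); how that label is obtained, the label strings, and the function name are all assumptions made for illustration.

```python
def classify_system(region_per_step, min_fraction=0.7):
    """Assign a system to 'coastal' or 'prairies', or return None.

    region_per_step: one label per time step of the system's lifetime,
    each being 'west', 'mountain', or 'east' (hypothetical labels).
    A system is assigned to a group only if it spends more than
    min_fraction of its lifetime on that side of the Rockies.
    """
    n_steps = len(region_per_step)
    if n_steps == 0:
        raise ValueError("a system needs at least one time step")
    for region, group in (("west", "coastal"), ("east", "prairies")):
        if region_per_step.count(region) / n_steps > min_fraction:
            return group
    return None  # mountainous or mixed systems are not analysed further


print(classify_system(["west"] * 8 + ["mountain"] * 2))
print(classify_system(["east"] * 9 + ["mountain"]))
print(classify_system(["west"] * 5 + ["east"] * 5))
```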

To further verify the validity of comparing the high-resolution CPCM-simulated precipitation with the lower-resolution CaPA, an extra MTD experiment was performed with the CPCM precipitation first upscaled to the coarser resolution of CaPA, both spatially and temporally. Specifically, the CPCM-simulated hourly precipitation was first accumulated to 6-hourly totals at 0000, 0600, 1200, and 1800 UTC. The CPCM-simulated precipitation was then regridded to a 15-km spatial resolution, which is very close to the spatial resolution of CaPA. In this way, the CPCM-simulated precipitation has nearly the same spatial and temporal resolution as CaPA. Both CPCM and CaPA were then regridded back to 4 km and used to drive MTD with the same thresholds and the same smoothing radii. Figure 4 shows PDFs for the settings of MTD8_2 and MTD32_0.1. The typical values of size, track length, lifetime, and speed are all remarkably well simulated by the CPCM when compared with CaPA. Although the spatial and temporal resolution of the source data can influence the MTD results systematically, as will be discussed in Sect. 4, Fig. 4 shows the high reliability of the CPCM-simulated precipitation. The logic is that if two samples are affected in the same way and the two affected samples have the same distribution, then we can state that the two unaffected samples have the same distribution as well. In this sense, we are confident in performing MTD analyses using the CPCM-simulated precipitation.
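
The temporal part of this upscaling, accumulating hourly values into 6-hourly totals, can be sketched for a single grid cell as follows. The function name is hypothetical, and the sketch assumes that the hourly series is already aligned so that each group of six values ends at one of the synoptic hours (0000, 0600, 1200, or 1800 UTC).

```python
def accumulate(hourly_mm, period_h=6):
    """Sum consecutive hourly amounts (mm) into period_h-hour totals (mm)."""
    if len(hourly_mm) % period_h:
        raise ValueError("series length must be a multiple of the period")
    return [sum(hourly_mm[i:i + period_h])
            for i in range(0, len(hourly_mm), period_h)]


# Twelve hypothetical hourly amounts give two 6-hourly totals.
six_hourly = accumulate([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
print(six_hourly)
```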

Fig. 5

Probability density functions (PDFs) for the size (a–d) and track length (e–h) of precipitation systems. Panels (a, b, e, f) are for precipitation objects identified on the west side of the Rocky Mountains and (c, d, g, h) for the east side. The axis ranges vary between plots. The shadings show estimates of the 5–95 percentile sampling uncertainty based on 100 bootstrap samples. The unit of size is 10\(^4\) km\(^2\), and the unit of track length is 100 km

Fig. 6

Precipitation provided by a CaPA, b NARR, c MSWEP, and d our CPCM simulation at 0600 UTC 20 June 2013, and the corresponding 2D objects identified by MTD32_0.1 (e–h) and MTD8_2 (i–l). For a–d, the unit of the colorbar is mm h\(^{-1}\), and the colorbar range for NARR and MSWEP is three times that of CPCM because the former two are 3-h accumulations and the latter is a 1-h accumulation (the same reasoning applies to the colorbar range of CaPA). For e–l, the color of each object represents the unique number assigned by MTD in the isolating step (Sect. 2.3). The unique number only identifies each isolated object and does not indicate precipitation intensity (therefore not shown here)

Figure 5 shows PDFs of the size and track length derived from the four precipitation datasets listed in Table 1. The MTD simulations are driven by the original datasets, with only regridding (to 4 km) applied for the reason discussed in Sect. 2.4. The first impression from Fig. 5a–d is that the size of precipitation systems is sensitive to the MTD settings. The object size for MTD32_0.1 is generally larger than for MTD8_2. Specifically, the typical (i.e. with the largest probability, here and below) precipitation size for MTD8_2 is approximately 0.7 \(\sim \) 2 \(\times \) 10\(^{4}\) km\(^{2}\) for precipitation systems on both the west and east sides of the Rockies (Fig. 5b, d). The typical precipitation size for MTD32_0.1 is approximately 2 \(\sim \) 4 \(\times \) 10\(^{4}\) km\(^{2}\) on the west side of the Rockies and 3 \(\sim \) 5 \(\times \) 10\(^{4}\) km\(^{2}\) on the east side. For moderate to heavy precipitation (> 2 mm h\(^{-1}\)), the size of maritime precipitation systems near the coast is approximately 1.8 \(\times \) 10\(^{4}\) km\(^{2}\), which is similar to the size of continental precipitation systems on the Prairies (Fig. 5b, d). When light precipitation (> 0.1 mm h\(^{-1}\)) is considered, however, the typical size of maritime precipitation systems is smaller than that of the continental ones (Fig. 5a, c). This indicates that light precipitation (0.1 \(\sim \) 2 mm h\(^{-1}\)) systems on the Prairies are normally larger than those near the western coast. This feature can be seen in all four datasets. It may therefore be an intrinsic characteristic of precipitation in western Canada, although, to our knowledge, no previous study has documented the typical size of precipitation systems in western Canada.

Figure 5a–d shows that the typical size of precipitation systems differs between source datasets. It appears that, besides the influence of the MTD settings, the identified typical size of precipitation systems can be influenced by the inherent properties of the source dataset as well. To understand how the source data influence the size of the identified precipitation systems, the June 2013 extreme rainstorm event over southern Alberta (Li et al. 2017) is used as an example. Precipitation provided by the four source datasets at 0600 UTC 20 June 2013, and the corresponding 2D objects identified by MTD32_0.1 and MTD8_2, are shown in Fig. 6. The minor precipitation area located in the southeast part of the domain has not been captured by the CPCM, while it can be seen in the other three products, i.e. the observation-based analyses and the reanalysis (Fig. 6a–d). As a result, for MTD32_0.1, the 2D object identified in the CPCM is smaller than that in CaPA, NARR, and MSWEP (Fig. 6e–h). For MTD8_2, only the main precipitation area is identified in the CPCM, while the minor precipitation area is also identified in CaPA and MSWEP. The smaller object has not been identified in NARR because the 3D object has fewer than 2000 grid cells (Sect. 2.3). Note that the lack of the precipitation feature in the southeast part of the domain in the CPCM does not erode our confidence in using this product for climatological purposes. Since the CPCM performs a climate simulation in which only the boundary forcing is constantly updated by reanalysis, it is not expected to fully capture every individual observed precipitation event (Prein et al. 2017a). Instead, the track density patterns in Fig. 3 suggest that the CPCM does a good job of representing the spatial pattern of variability. The comparison experiment shown in Fig. 4 and the discussion below show its capability to represent the spatiotemporal features of precipitation in western Canada.

For the case shown in Fig. 6, the smaller CPCM object identified by MTD32_0.1 (Fig. 6h) is due to the absence of precipitation east of the main precipitation area when compared with CaPA, NARR, and MSWEP. In NARR, the 2D object identified by MTD8_2 (Fig. 6j) is smaller than that in CaPA, MSWEP, and CPCM. This means that, after smoothing and thresholding, the area of moderate to heavy precipitation (> 2 mm h\(^{-1}\)) in NARR is smaller, which can be attributed to a smaller precipitation area in the original dataset, to lower intensity values, or to both. Therefore, Fig. 6 confirms that the inherent properties of the source data can influence MTD results. In Sect. 4, we will show that the temporal resolution of the source data can have a significant influence on MTD results as well.

Figure 5a–d shows that the identified precipitation systems in CPCM are normally smaller than those in CaPA, NARR, and MSWEP, for both light (Fig. 5a, c) and moderate to heavy precipitation (Fig. 5b, d). Note that the smaller precipitation area does not necessarily mean that the CPCM underestimates precipitation relative to CaPA, NARR, and MSWEP. Precipitation amount also depends on the intensity and the number of identified 2D objects, as implied by the density figures (Fig. 3). This systematic difference in size is discussed further in Sect. 4.

Figure 5e–h shows PDFs of track length for the identified precipitation systems. Overall, the CPCM-simulated track length agrees well with CaPA, NARR, and MSWEP, especially for precipitation systems on the west side of the Rockies (Fig. 5e, f). For precipitation systems on the east side of the Rockies (Fig. 5g, h), the CPCM-simulated track length is slightly longer for moderate to heavy precipitation. The typical track length differs slightly between MTD settings. For MTD32_0.1, the typical track length is approximately 200 km for all four datasets. For MTD8_2, however, the typical track lengths for CaPA, NARR, and MSWEP are slightly shorter than that for CPCM, which is approximately 200 km.

3.2 Temporal features of precipitation systems

One major advantage of MTD is that the time dimension is incorporated into the identified objects, so time-related features can be examined. Figure 7a–d shows PDFs of lifetime for the observed and simulated precipitation systems. Precipitation systems on the two sides of the Rockies have similar lifetime distributions. Overall, the typical lifetime of precipitation systems is slightly shorter than 10 h in CPCM and slightly longer than 10 h in MSWEP and NARR. Precipitation systems in CaPA generally have the longest typical lifetimes, approximately 25 h on the west side of the Rockies and 20 h on the east side. Because precipitation systems in CPCM are generally smaller than those in the other three datasets, their 2D objects are less likely to be connected across time steps and can therefore have shorter lifetimes. In addition, the longer lifetimes in MSWEP, NARR, and CaPA are related to the coarse temporal resolution of these datasets, which is clarified using a control experiment in Sect. 4.

Fig. 7 As in Fig. 5, but for object lifetime and propagation speed. Lifetime is in hours and speed is in km h\(^{-1}\)

Using GOES-7 ISCCP-B3 satellite data for 1987–88, Machado et al. (1998) found that, for convective systems in the Americas in all seasons, about 60\(\%\) have lifetimes shorter than 9 h and 5\(\%\) have lifetimes longer than 24 h. Although the precipitation systems defined in the present study are not exactly equivalent to the convective systems of Machado et al. (1998), our CPCM-simulated lifetime distribution broadly agrees with their results. More recently, Prein et al. (2017a) examined mesoscale convective systems in the United States; their lifetime distribution agrees well with ours, especially with the CPCM-simulated distributions shown in Fig. 7b, d.

Comparing Fig. 7a with Fig. 7b indicates that moderate to heavy precipitation lasts longer than light precipitation on the west side of the Rockies. For MTD32_0.1, the typical lifetime of precipitation systems is around 7 h for CPCM, 12 h for MSWEP, 14 h for NARR, and 26 h for CaPA (Fig. 7a). For MTD8_2, which excludes light precipitation, the typical lifetime is 9 h for CPCM, 15 h for both MSWEP and NARR, and 26 h for CaPA (Fig. 7b). On the east side of the Rockies, the CPCM lifetime distributions derived from MTD32_0.1 and MTD8_2 are quite similar, with a typical lifetime of 8 h in both cases. For MSWEP, NARR, and CaPA, however, moderate to heavy precipitation is shorter lived than light precipitation (Fig. 7c, d).

Figure 7e–h shows PDFs of the propagation speed of precipitation systems derived from the different datasets. Overall, the propagation speed derived from MTD8_2 is faster than that from MTD32_0.1. For the same MTD setting, precipitation systems on the east side of the Rockies generally move faster than those on the west side. Indeed, owing to the position of the jet streams and the large open topography, the Canadian Prairies are well known for their fast-moving low-pressure storms, as indicated by their names: Alberta Clippers, Saskatchewan Screamers, or Manitoba Maulers, depending on the province in which the storms originate (Rohli and Vega 2017; Clair 2009).

Comparing the PDFs of propagation speed derived from the different datasets shows that the typical propagation speed derived from CaPA is approximately 6 km h\(^{-1}\) for all four cases (Fig. 7e–h). The propagation speed derived from the CPCM simulation is the fastest: for MTD32_0.1, the typical speed is slightly slower than 20 km h\(^{-1}\) on the west side of the Rockies and slightly faster than 20 km h\(^{-1}\) on the east side; for MTD8_2, the typical speed is 20–30 km h\(^{-1}\) on the west side of the Rockies and approximately 30 km h\(^{-1}\) on the east side. For NARR and MSWEP, the typical propagation speeds fall between those of CaPA and the CPCM simulation, and are generally slower than 20 km h\(^{-1}\).

It appears that the CPCM captures the propagation speed of precipitation in western Canada better than the three reference datasets, for three reasons. First, our experience living in Saskatchewan is that Saskatoon usually receives precipitation about half a day later than Alberta, which gives us the subjective sense that precipitation in western Canada propagates at about 30 km h\(^{-1}\). Second, previous studies found that the propagation speed of precipitation in North America is generally faster than 25 km h\(^{-1}\), more in line with the CPCM-simulated results. Based on Hovmöller diagrams of 12 years of hourly summer precipitation observations from the Automated Surface Observing System, Li and Smith (2010) found that the propagation speed of precipitation on the east side of the Rockies is about 50 km h\(^{-1}\) in June. Using rain streak span versus duration from radar-derived precipitation data, Carbone et al. (2002) found that mesoscale convective systems in the lee of the Rockies travel eastward at speeds of 25.2–108 km h\(^{-1}\), with a median phase speed of 51.5 km h\(^{-1}\). Although the methodology, precipitation datasets, and research domains are not exactly the same as in the present study, all these studies conclude that precipitation generally propagates faster than 25 km h\(^{-1}\). More recently, using the same methodology as the present study, Prein et al. (2017a) showed that the typical speeds of mesoscale convective systems from stage-IV are approximately 10 km h\(^{-1}\), 20–40 km h\(^{-1}\), 20 km h\(^{-1}\), and 20 km h\(^{-1}\) for the southeast, midwest, mid-Atlantic, and northeast parts of the United States, respectively. Third, in theory, the propagation speed of precipitation is largely determined by the phase speed of large-scale forcing or by advection by low- to midlevel “steering” winds (Carbone et al. 2002). According to Sienkiewicz and Chesneau (1995), surface storm centers move at approximately 1/3 to 1/2 of the 500-hPa wind speed. Since the annual mean “steering” wind speed at 500 hPa in western Canada is approximately 54 km h\(^{-1}\) (figure not shown), surface storm centers are expected to move at 18–27 km h\(^{-1}\), closer to the CPCM-simulated results.
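
The arithmetic behind this third argument can be written out as a short worked check. It uses only the 54 km h\(^{-1}\) annual mean 500-hPa wind speed and the 1/3 to 1/2 ratio quoted above; the symbols \(v_{s}\) (surface storm speed) and \(V_{500}\) (500-hPa wind speed) are introduced here purely for illustration.

\[
\tfrac{1}{3}\,V_{500} \le v_{s} \le \tfrac{1}{2}\,V_{500}
\quad\Rightarrow\quad
\tfrac{1}{3}\times 54 = 18\ \mathrm{km\,h^{-1}} \;\le\; v_{s} \;\le\; \tfrac{1}{2}\times 54 = 27\ \mathrm{km\,h^{-1}}
\]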

Fig. 8 Annual cycle of precipitation systems, showing monthly average number of the MTD identified precipitation systems (3D objects) for the period from 1 September 2004 to 30 September 2015. The axis-ranges vary between different plots. The shadings show estimates of the 5–95 percentile sampling uncertainty based on 100 bootstrap samples

Comparing Fig. 2 with Fig. 5 explains the speed differences among the datasets. In Fig. 5, the typical size of precipitation systems for MTD8_2 is approximately 0.7–2 \(\times \) 10\(^{4}\) km\(^{2}\), corresponding to a typical radius of approximately 55 km. Based on this typical radius and inequality (1), the maximum speed that MTD can identify is around 25 km h\(^{-1}\) for a 6-h dataset, 33 km h\(^{-1}\) for a 3-h dataset, and 100 km h\(^{-1}\) for an hourly dataset. Figure 7f, h shows that the propagation speed derived from CaPA is generally less than 25 km h\(^{-1}\), the propagation speeds derived from NARR and MSWEP are generally less than 33 km h\(^{-1}\), and the propagation speed derived from the CPCM-simulated precipitation is generally less than 60 km h\(^{-1}\). These results are consistent with the maximum values derived from Fig. 5 and inequality (1). In the very rare cases in which the propagation speed exceeds the corresponding maximum value, the radius of the precipitation system is larger than the typical radius, so such systems can occasionally be identified. In conclusion, datasets with higher temporal resolution are better able to capture fast-moving precipitation systems.
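
The step from typical size to typical radius can be sketched as follows. This is a minimal illustration that assumes, since the text does not say so explicitly, that the radius is that of a circle with the same area as the object; the function name `equivalent_radius_km` is hypothetical.

```python
import math


def equivalent_radius_km(area_km2: float) -> float:
    """Radius of a circle with the same area as the object.

    The equal-area-circle definition is an assumption made for this sketch.
    """
    return math.sqrt(area_km2 / math.pi)


# Typical MTD8_2 object sizes quoted in the text: 0.7e4 to 2e4 km^2.
for area in (0.7e4, 1.0e4, 2.0e4):
    print(f"{area:8.0f} km^2 -> r = {equivalent_radius_km(area):5.1f} km")
```

Under this assumption the quoted size range corresponds to radii of roughly 47 to 80 km, which brackets the typical radius of approximately 55 km used in the text.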

It should be noted that each dataset was developed for specific applications, and all the datasets used in the present study have been documented as appropriate for their designated uses (e.g. Fortin et al. 2018; Mesinger et al. 2006; Beck et al. 2019; Li et al. 2019). One purpose of this study is to show that the temporal resolution of the source data can strongly influence advanced analyses of the spatiotemporal features of precipitation. In Sect. 4, we show that hourly or higher temporal resolution is required for understanding precipitation features in North America using MTD.

Figure 8 shows the annual cycle of precipitation systems in western Canada. Maritime precipitation near the coast has a quite different annual distribution from continental precipitation on the Prairies. For moderate to heavy precipitation (Fig. 8b, d), the coastal region receives more precipitation processes (i.e. the MTD identified precipitation systems) in winter (November to March) than in summer (June to August). On the Prairies, however, the pattern is reversed: precipitation processes are concentrated in summer (June to August) and rarely occur in winter (November to March). When light precipitation (0.1–2 mm h\(^{-1}\)) is considered (Fig. 8a, c), the coastal region receives more precipitation processes in spring and early summer (April to June), while the Prairies still receive most precipitation processes in summer (June to August).

The annual cycle of the MTD identified precipitation systems agrees well with the current understanding of precipitation in western Canada. In winter, the Great Basin region (southwestern United States) is most frequently dominated by highs, while the Gulf of Alaska has the maximum frequency of lows. As a result, the Pacific coast of North America has the most cyclonic activity in winter (Barry and Chorley 2009), in good agreement with the MTD results shown in Fig. 8b. Comparing Fig. 8a with Fig. 8b indicates that light precipitation (0.1–2 mm h\(^{-1}\)) occurs frequently in spring and early summer (April to June). This may be due to the oceanic water vapour supply and orographic lifting (Clair 2009). In summer, precipitation on the Prairies is mainly caused by convection. Figure 8d shows that the CPCM, with its high temporal resolution, picks up many more short-lived convective storms than the other datasets on the Prairies during summer. The convective weather season runs from May to early September, with July the most active month for convection, followed by June and August (Vickers et al. 2001). As a result, more precipitation falls on the Prairies during the summer growing months than at any other time of the year, as documented by Hare and Thomas (1979).

Although the monthly averaged numbers of precipitation systems differ among the datasets, all four datasets have quite similar annual distribution patterns, indicating the high reliability of the annual cycle described above. This further indicates the capability of our CPCM to capture the annual cycle of precipitation processes in western Canada. Annual cycles similar to those shown in Fig. 8 have been documented by Clair (2009), Bailey et al. (1997), and Hare and Thomas (1979).

4 Influence of spatiotemporal resolution of source data

In Sect. 3, we showed that the MTD identified precipitation features can be influenced by the temporal resolution of the source data. To further identify how the spatial and temporal resolution of the source data influence the MTD results, two control experiments were conducted. Both control experiments are driven by the CPCM-simulated precipitation because of its fine spatial and temporal resolution. All the MTD settings are the same as in the MTD experiments shown in Sect. 3. Specifically, a smoothing radius of 8 grid cells and a threshold of 2 mm h\(^{-1}\) were used to identify moderate to heavy precipitation, and a smoothing radius of 32 grid cells and a threshold of 0.1 mm h\(^{-1}\) were used to identify light precipitation. In the first experiment, 6-h accumulations at 0000, 0600, 1200, and 1800 UTC were used to drive MTD. The influence of temporal resolution is obtained by comparing the results of this experiment (CPCM_temporal_8 and CPCM_temporal_32) with those of the experiment driven by the original CPCM outputs (CPCM_8 and CPCM_32). In the second experiment, the original CPCM outputs were regridded to the NARR grid, which has a spatial resolution of 32 km, and then regridded back to 4 km to drive MTD. In this case, the CPCM-simulated precipitation loses its original fine-scale structure. The influence of spatial resolution is therefore obtained by comparing the results of this experiment (CPCM_spatial_8 and CPCM_spatial_32) with those of the experiment driven by the original CPCM outputs (CPCM_8 and CPCM_32).
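
The two degradations of the source data can be sketched with NumPy as below. This is only an illustration under stated assumptions, not the processing used in the study: the actual regridding method is not described in the text, so simple block averaging followed by nearest-neighbour refilling stands in for the regridding to the NARR grid and back, and the function names `accumulate_6h` and `coarsen_and_restore` are hypothetical.

```python
import numpy as np


def accumulate_6h(hourly: np.ndarray) -> np.ndarray:
    """Sum hourly fields of shape (time, y, x) into 6-h accumulations.

    Assumes the series starts right after a synoptic hour, so that each block
    of six hourly fields ends at 0000, 0600, 1200, or 1800 UTC. Trailing hours
    that do not fill a complete block are dropped.
    """
    n_blocks = hourly.shape[0] // 6
    trimmed = hourly[: n_blocks * 6]
    return trimmed.reshape(n_blocks, 6, *hourly.shape[1:]).sum(axis=1)


def coarsen_and_restore(field: np.ndarray, factor: int = 8) -> np.ndarray:
    """Block-average a 2D field by `factor`, then copy each coarse value back
    onto the fine grid (4 km -> 32 km -> 4 km for factor 8).

    Block averaging and nearest-neighbour refilling are assumptions of this
    sketch; both grid dimensions must be divisible by `factor`.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


# Synthetic example: 12 hourly fields on a 16 x 16 grid (not real data).
rng = np.random.default_rng(0)
hourly = rng.random((12, 16, 16))
six_hourly = accumulate_6h(hourly)          # shape (2, 16, 16)
smoothed = coarsen_and_restore(hourly[0])   # shape (16, 16), fine structure removed
```

In this sketch the temporal degradation conserves the total precipitation and the spatial degradation conserves the domain mean, so any change in the identified objects comes from the loss of temporal or spatial detail rather than from a change in precipitation amount.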

Fig. 9 Probability density functions (PDFs) for the settings of MTD32_0.1 (columns 1, 3) and MTD8_2 (columns 2, 4), and for the regions located on the west side of the Rocky Mountains (columns 1, 2) and the east side of the Rocky Mountains (columns 3, 4). The axis-ranges vary between different plots. The shadings show estimates of the 5–95 percentile sampling uncertainty based on 100 bootstrap samples. Yellow lines show MTD results derived from the original CPCM simulations with high spatiotemporal resolution, green lines show MTD results from CPCM simulations with coarser spatial resolution, and blue lines show MTD results from CPCM simulations with coarser temporal resolution

Figure 9 shows features of precipitation systems identified from the original CPCM simulations with high spatiotemporal resolution (yellow lines), the CPCM simulations with coarser spatial resolution (green lines), and the CPCM simulations with coarser temporal resolution (blue lines). Overall, coarser spatial resolution has little influence on MTD results. Results derived from datasets with high and coarser spatial resolution are very similar for all the cases shown in Fig. 9. The reason is that MTD smooths the source data to remove the small objects (Sect. 2.3). The present study used smoothing radii of 8 and 32 grid cells (32 and 128 km) for moderate to heavy precipitation and light precipitation, respectively. Therefore, source data with 4-km or 32-km spatial resolution have little difference after smoothing. Consequently, the MTD results are very similar.

Figure 9 shows that coarser temporal resolution has a noticeable influence on all four precipitation features (i.e. size, track length, lifetime, and speed), highlighting the importance of temporal resolution to MTD analyses. Interestingly, source data with coarser temporal resolution lead to larger identified precipitation systems in all the cases shown in Fig. 9a–d. The reason is that, in MTD analysis, small objects tend to be smoothed out, much as a human would pick out the fairly large and typical objects in the raw field (Bullock et al. 2016). When the temporal resolution of the source data becomes coarser (i.e. larger \(\delta \)t in inequality (2)), only relatively large precipitation systems can be connected and thus identified by MTD.

MTD driven by source data with coarse temporal resolution tends to identify precipitation systems with short track lengths and slow propagation speeds (Fig. 9e–h, m–p). As explained in Sect. 3.2, for a 6-h dataset MTD can normally identify only precipitation systems propagating slower than 25 km h\(^{-1}\). Fast-moving precipitation systems tend to be split into several small subsystems, and only the slow part can be identified (Sect. 2.3). The slower propagation speed in turn explains the shorter track length. Indeed, multiplying the typical speed by the typical lifetime yields approximately the typical track length for all four cases shown in Fig. 9.
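
The consistency between speed, lifetime, and track length can be checked with a rough calculation. The sketch below uses rounded typical values for the original CPCM output as quoted in Sect. 3.2 (for example, 25 km h\(^{-1}\) stands in for the quoted 20–30 km h\(^{-1}\) range); the rounding and the constant-speed assumption are ours, and the helper name `track_length_km` is hypothetical.

```python
# Rounded typical values for the original CPCM output, taken from Sect. 3.2
# as (speed in km/h, lifetime in h). These are approximations for illustration.
cpcm_typical = {
    "MTD32_0.1 west": (20.0, 7.0),
    "MTD32_0.1 east": (20.0, 8.0),
    "MTD8_2 west": (25.0, 9.0),
    "MTD8_2 east": (30.0, 8.0),
}


def track_length_km(speed_kmh: float, lifetime_h: float) -> float:
    """Distance covered at constant speed over the object lifetime."""
    return speed_kmh * lifetime_h


estimates = {case: track_length_km(v, t) for case, (v, t) in cpcm_typical.items()}
for case, length in estimates.items():
    print(f"{case}: about {length:.0f} km")
```

With these rounded inputs the estimates range from 140 to 240 km, the same order as the typical CPCM track length of approximately 200 km reported in Sect. 3.1.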

Fig. 10 Joint probability density functions for precipitation systems identified by MTD8_2, driven by the CPCM original outputs. The black lines are linear regressions, and the shadings show estimates of the 5–95 percentile sampling uncertainty based on 100 bootstrap samples

The lifetime of precipitation systems identified from 6-h source data is noticeably longer than that of their counterparts identified from hourly data, for all four cases shown in Fig. 9i–l. Since precipitation systems identified from a 6-h dataset can be split into several small subsystems (see above), one might expect the corresponding lifetime to be shorter than that from the hourly dataset, which appears to contradict the results shown in Fig. 9. One possible explanation is that, from a statistical perspective, MTD driven by 6-h source data mainly selects precipitation systems with longer lifetimes. To test this hypothesis, joint probability density functions of precipitation system size and lifetime were plotted for the MTD8_2 case driven by the original CPCM outputs (Fig. 10). On both the west and east sides of the Rockies, lifetime is positively related to the size of the precipitation system. This positive relationship is statistically significant, as the correlation coefficients are larger than 0.6 and the p-values are smaller than 0.01 (Kutner et al. 2005). Because MTD driven by 6-h data preferentially identifies large systems, the systems it retains also tend to be long lived.

In conclusion, the two control experiments show that the spatial resolution of the source data has little influence on MTD results from a climate perspective, whereas the temporal resolution has a noticeable influence. Specifically, MTD driven by a dataset with coarse temporal resolution can only identify precipitation systems that are relatively large and slow moving. Such precipitation systems normally have short track lengths and relatively long lifetimes.

Based on the new knowledge from the two control experiments, we can now reexamine the results shown in Figs. 5–7. Precipitation systems identified from NARR and MSWEP generally have larger size, shorter track length, longer lifetime, and slower propagation speed than those identified from the CPCM outputs. Results derived from CaPA go slightly further in the same direction than those from NARR and MSWEP. These features agree well with the results of the first control experiment, which supports the reliability of its conclusions. This agreement also indicates that, besides the influence of the MTD settings and the inherent properties of the source data shown in Fig. 6, the temporal resolution of the source data has a significant influence on the MTD results.

5 Summary and conclusion

Using a novel object-based algorithm, advanced spatial and temporal features of precipitation over western Canada were analyzed. A sandwich pattern of density distribution can be seen in all four datasets for moderate to heavy precipitation (Fig. 3a–d). Although no previous study has documented precipitation frequencies for different intensities in western Canada, all four datasets show that light precipitation (0.1–2 mm h\(^{-1}\)) occurs relatively more frequently in the mountainous area, while moderate to heavy precipitation (> 2 mm h\(^{-1}\)) is rare there. Intense precipitation is restricted by the availability of water vapour in the mountainous region, while light precipitation occurs frequently because of orographic lifting (Rohli and Vega 2017; Clair 2009).

For moderate to heavy precipitation (> 2 mm h\(^{-1}\)), maritime precipitation systems near the coast are similar in size to continental precipitation systems on the Prairies. When light precipitation (> 0.1 mm h\(^{-1}\)) is considered, the typical size of maritime precipitation systems is smaller than that of their continental counterparts (Fig. 5). This indicates that light precipitation systems on the Prairies are larger than those near the coast. Near the west coast, moderate to heavy precipitation lasts longer than light precipitation, while on the Prairies the lifetime distributions of MTD32_0.1 and MTD8_2 are quite similar. Precipitation systems on the Prairies generally move faster than coastal precipitation systems, because the former can move relatively freely over the large open area. Precipitation on the Prairies mainly occurs during summer, while the coastal region receives most of its moderate to heavy precipitation in winter and most of its light precipitation in spring and early summer.

MTD results derived from different datasets can differ, reflecting the intrinsic differences between the datasets (Fig. 6). In addition, the temporal resolution of the source data can be critical to time-related MTD research. Specifically, we show that for a typical precipitation system in western Canada, with a typical size of 0.7–2 \(\times \) 10\(^{4}\) km\(^{2}\), the maximum propagation speed that can be identified is approximately 25 km h\(^{-1}\) for a 6-h dataset, 33 km h\(^{-1}\) for a 3-h dataset, and 100 km h\(^{-1}\) for an hourly dataset. Since the propagation speed of precipitation systems in North America is generally between 0 and 80 km h\(^{-1}\) [Fig. 7 of this study, Fig. 10 of Prein et al. (2017a)], we argue that hourly or higher temporal resolution is required for time-related precipitation studies.

Besides temporal features, the two control experiments further show that even the spatial features of the identified precipitation systems can be influenced by the temporal resolution of the source data. MTD driven by a dataset with coarse temporal resolution tends to identify large and slow-moving precipitation systems, because the 2D objects of such systems are more easily connected across time steps, as shown in inequality (2). Large precipitation systems normally live longer, and slow-moving precipitation systems normally have short track lengths.

Most spatiotemporal features discussed in Sect. 3 are supported by all four datasets used in this study, indicating the high reliability of our results. However, two details merit specific attention. First, in Sect. 3.1, all four datasets show that light precipitation systems on the Prairies are larger than those near the coast. Since the coastal region shown in Fig. 1 is limited in size, it is possible that only part of a large precipitation system was located within the coastal domain and was thus only partly identified by MTD. In this extreme case, light precipitation systems in the coastal region can actually be larger than those identified by MTD. Second, the choice of smoothing radius for MTD analysis depends largely on the scale of the target precipitation system, and partly on the spatial resolution of the source data. Normally, small smoothing radii are preferred for meteorological research, because they can provide details even on small storms (e.g. Cai and Dumais Jr 2015). Relatively larger smoothing radii are preferred for climate research for computational efficiency (e.g. Prein et al. 2017a). According to Davis et al. (2006a), the smoothing radius should be less than 150–200 km in order to resolve gaps between multiple nearby precipitation areas. The MTD32_0.1 setting in this study used 128 km (i.e. 4 \(\times \) 32) as the smoothing radius to identify large precipitation systems. This is a relatively large smoothing radius. However, it makes the identified 2D objects easier to connect, especially for the 3-h and 6-h datasets.

The present study aimed to improve understanding of the spatiotemporal features of precipitation systems in western Canada and to share our experience in using MTD. Unfortunately, gridded precipitation datasets with high temporal resolution are currently not available for western Canada. However, the spatiotemporal features derived from all four datasets agree well with each other, although the specific values differ. For example, all four datasets have similar precipitation distributions, as shown in Fig. 3, and similar annual distributions, as shown in Fig. 8. Comparing Fig. 9 with Figs. 5–7, one can imagine that if NARR and MSWEP were hourly datasets, their distributions of precipitation size, track length, lifetime, and speed would be quite similar to our CPCM results. The overall good performance of the CPCM outputs shows that, for regions where precipitation datasets with high temporal resolution do not exist, carefully designed and quality-assured CPCM simulations can provide an alternative basis for advanced object-based research.