1 Introduction

Precipitation is the key forcing variable to estimate land surface hydrological fluxes and states (Nijssen and Lettenmaier 2004). The spatial and temporal pattern, intensity and duration of precipitation have significant effects on hydrological cycles (Sorooshian et al. 2011). Floods, droughts and some other natural hazards are usually caused by anomalies in precipitation (Hong et al. 2004). Consequently, the accuracy of a hydrological simulation is heavily dependent upon the accuracy of precipitation data.

Precipitation data can be measured at the local scale by rain gauges, simulated by atmospheric models, or retrieved from remote sensing data at the global scale. Traditionally, rain gauge observations are treated as the ground truth for precipitation. However, the density of available gauges varies significantly around the world, and less developed regions such as Africa, Southeast Asia and South America lack rain gauges. Model simulation can provide rainfall data with global coverage, but the performance of simulated precipitation still needs improvement, especially in areas with complex and heterogeneous land cover.

Thanks to the rapid development of remote sensing techniques, many satellite-based precipitation products have been developed and released, such as the Tropical Rainfall Measuring Mission (TRMM), the Climate Prediction Center MORPHing technique (CMORPH), and the newest Global Precipitation Measurement (GPM). These products provide unprecedented opportunities for hydrological modeling and prediction, especially in areas where meteorological stations are scarce (Li et al. 2013). Because they offer high spatiotemporal resolution, spatially distributed coverage, ready accessibility and long records, satellite-based precipitation products have been applied in a number of hydrological studies (Li et al. 2015). However, cloud conditions, retrieval algorithms, and land surface properties can all induce errors in the final precipitation estimation. Consequently, the errors associated with satellite-based precipitation products should be characterized before the data are used in models (Turk et al. 2008). Several studies have evaluated and compared GPM and/or TRMM; for example, Li et al. compared multiple sets of precipitation data for the Yangtze River, and Xu et al. examined data for the Yalung Zangbo River. Recently, He et al. (MRC 2010) compared GPM and TRMM data in the upper Mekong River and found that both products give reliable rainfall estimates in the study region, while GPM captures precipitation events better than TRMM. To our knowledge, there has been no comprehensive evaluation of GPM and TRMM over the Lower Mekong River Basin (LMRB). This is partly because there are only a few rain gauges in this region, so it is difficult to collect ground rainfall observations promptly.

In this study, we evaluated the performance of the GPM and TRMM in the LMRB in two ways: first, by a direct comparison between satellite rainfall data and ground observations, and second, by an indirect method in which a distributed hydrological model is driven separately by GPM and TRMM data and the simulated discharges are compared with observations. Since discharge is a temporal and spatial integration of rainfall, a physically based hydrological model allows us to evaluate the reliability of the input rainfall by evaluating the simulated discharge. The results of our study will demonstrate the accuracy of the IMERG and 3B42V7 in this region, which may help potential users to select data appropriately, and will also provide information to the data producers to help them improve their products. Moreover, the hydrological simulation results will promote the application of remote sensing in hydrology and related communities.

2 Study Area and Data

2.1 Study Area

This study focused on the LMRB, i.e. the lower reaches of the Mekong River Basin (MRB). Figure 5.1 shows the geographic information and elevation of the MRB. The Mekong River is the most important transboundary river in Southeast Asia, rising on the Tibetan Plateau in China and flowing through six countries in all (China's Yunnan Province, Myanmar, Lao People's Democratic Republic (PDR), Thailand, Cambodia and Vietnam) before finally discharging into the South China Sea (MRC 2010). It is the tenth largest river in the world, with a length of almost 4900 km, a total catchment area of 795,000 km² and an average discharge of 14,500 m³/s (MRC 2010). The Mekong River flows for about 2200 km, with a decrease of almost 4500 m in elevation, from the Tibetan Plateau to the Golden Triangle, where the borders of Thailand, Lao PDR and Myanmar come together and where the river becomes very steep and narrow. As the river reaches the lower basin region, major tributary systems develop, which differ markedly between the left bank and the right bank. The left bank tributaries drain the high-rainfall areas of Lao PDR, while the right bank tributaries drain a large part of Northeast Thailand, which has a lower rainfall rate. The Mekong River then joins the largest freshwater lake in Southeast Asia, Tonle Sap Lake, at Phnom Penh, where the main stream breaks into a number of branches and finally flows out into the ocean.

Fig. 5.1 The study area, including elevation and main stream information

2.2 Satellite Precipitation Products

The TRMM 3B42V7 and GPM IMERG precipitation data are selected in this paper as the precipitation forcing data. TRMM 3B42V7 data are generated by combining information from both passive microwave and infrared observations at high spatial (0.25°) and temporal (3 h) resolutions, and have been calibrated against ground rain gauge observations at the monthly scale to remove the bias of satellite retrievals (Huffman et al. 2007). The GPM mission is composed of an international network of satellites that provide the next generation of global rain and snow observations (Guo et al. 2016). The IMERG precipitation estimates come from the various precipitation-relevant satellite passive microwave sensors comprising the GPM constellation. They are computed using the 2017 version of the Goddard Profiling Algorithm (GPROF2017), gridded, inter-calibrated to the GPM Combined Instrument product, and then combined into half-hourly 0.1° × 0.1° fields (Huffman et al. 2012).
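Because the two products have different time steps (3 h and 0.5 h), they must be brought to a common time scale before they can be compared. The following is a minimal sketch of one way to do this, assuming that both series are stored as precipitation rates in mm/h and start at 00:00; the function name `daily_totals` and the sample values are illustrative and not taken from this study.

```python
import numpy as np

def daily_totals(rates_mm_per_h, step_hours):
    """Accumulate precipitation rates (mm/h) into daily totals (mm).

    rates_mm_per_h: 1-D series at a fixed time step, starting at 00:00.
    step_hours: length of one time step, e.g. 0.5 (half-hourly) or 3.0 (3-hourly).
    """
    rates = np.asarray(rates_mm_per_h, dtype=float)
    steps_per_day = int(round(24.0 / step_hours))
    if rates.size % steps_per_day != 0:
        raise ValueError("series must cover a whole number of days")
    # Depth in one step = rate * duration; then sum the steps of each day.
    return (rates * step_hours).reshape(-1, steps_per_day).sum(axis=1)

# Hypothetical example: two days of constant 2 mm/h rainfall.
half_hourly = np.full(2 * 48, 2.0)   # half-hourly series (assumed mm/h)
three_hourly = np.full(2 * 8, 2.0)   # 3-hourly series (assumed mm/h)
imerg_daily = daily_totals(half_hourly, 0.5)
trmm_daily = daily_totals(three_hourly, 3.0)
```

With the same underlying rate, both series give the same daily depth, which is the property needed for a like-for-like comparison.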

2.3 Land Surface Data

In order to develop a distributed hydrological model for the MRB, several land surface data are needed, including digital elevation model (DEM) data, soil data, land cover data and vegetation data.

DEM data, provided by the USGS at 90 m resolution, are used to generate a digital river channel and network. The river network was generated using ArcGIS functions, and statistical values related to river properties, such as river length, slope and density, were also calculated. To be consistent with the resolution of the hydrological model, the river network at 90 m resolution was re-projected onto the calculation grid of the hydrological model at a 10 km × 10 km resolution.

Soil properties are very important parameters for runoff generation, evapotranspiration estimation, and routing processes. Soil type and soil depth data were obtained from the FAO digital soil map of the world and used to derive soil properties, such as hydraulic and thermal conductivity.

Land use and land cover data were downloaded from the USGS 1-km Global Land Cover Characteristics Database version 2.0. Vegetation dynamics are represented using NDVI data provided by MODIS.

2.4 Meteorological Forcing Data and Discharge Data

The other climate forcing data, including air temperature (mean, maximum and minimum), wind speed and sunshine duration, are derived from the ECMWF ERA-Interim dataset. The mean air temperature is calculated as the average of the maximum and minimum ERA-Interim 2-m temperature. The relative humidity is calculated from the specific humidity and mean temperature data of the ERA-Interim dataset. All of the ERA-Interim data are resampled to the scale used by the model.
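The two derived variables can be sketched as follows. The text does not state which formula was used for relative humidity, so this example assumes a common approach: vapour pressure is obtained from specific humidity and air pressure, and saturation vapour pressure from the Bolton (1980) form of the Magnus equation. The air pressure argument and its default value are assumptions introduced here, since pressure is not listed among the forcing data.

```python
import numpy as np

def mean_temperature(t_max, t_min):
    """Mean air temperature as the average of maximum and minimum."""
    return 0.5 * (np.asarray(t_max, dtype=float) + np.asarray(t_min, dtype=float))

def relative_humidity(q, t_celsius, pressure_hpa=1013.25):
    """Relative humidity (%) from specific humidity q (kg/kg) and temperature (deg C).

    pressure_hpa is an assumed input (default: standard sea-level pressure).
    """
    q = np.asarray(q, dtype=float)
    t = np.asarray(t_celsius, dtype=float)
    # Vapour pressure (hPa) from specific humidity; 0.622 is the ratio of the
    # molar masses of water vapour and dry air.
    e = q * pressure_hpa / (0.622 + 0.378 * q)
    # Saturation vapour pressure (hPa), Bolton (1980) approximation.
    e_sat = 6.112 * np.exp(17.67 * t / (t + 243.5))
    return np.clip(100.0 * e / e_sat, 0.0, 100.0)

# Hypothetical values for one grid cell and one day.
t_mean = mean_temperature(30.0, 20.0)    # deg C
rh = relative_humidity(0.010, t_mean)    # %
```

Clipping to the range 0 to 100% guards against small inconsistencies between the humidity and temperature fields.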

The in situ discharge observations for the MRB were obtained from the Mekong River Commission (MRC). The upper part of the Mekong River, also known as the Lancang River, has been developed over the last several decades. More than ten dams and reservoirs have been constructed and are currently operating along the Lancang River. Consequently, the flow regime of the Lancang River has been highly impacted. Conversely, there are almost no large reservoirs along the main stream of the LMRB, and the flow regime there is largely natural. In order to remove the impacts of upstream dam operation on the LMRB hydrological simulation, the observed streamflow at Chiang Saen station, which is the uppermost station in the LMRB, was input into the hydrological model as the upstream discharge. Discharges observed at Luang Prabang, Mukdahan, Pakse and Stung Treng are used as validation data to compare the 3B42V7 simulation and the IMERG simulation.

3 Model Description and Set Up

The hydrological model used in this work is the Geomorphology Based Hydrological Model (GBHM) (Yang et al. 2002a, b). The model and its methods have been successfully applied to many different types of rivers, ranging from catchment scale (Cong et al. 2009; Gao et al. 2008; Jijun et al. 2008; Yang et al. 2004) to continental scale [16]. The GBHM has four key components: a gridded discretization scheme, a sub-grid parameterization scheme, a hillslope-based hydrological modeling module, and a kinematic wave flow routing module (Yang et al. 2002a, b).

The digital basin is defined using the DEM (Yang et al. 1997). The digital basin is sub-divided into a number of cascade-connected flow intervals following the flow distance from the outlet to the upper source, and the area and width functions are used to group the topography and divide the catchments into a series of flow-interval hillslopes (Yang et al. 2002a, b). The hillslope is the fundamental computation unit in the model, providing lateral inflow estimates to the same portion of the main stream (see Yang et al. 2002a, b for further details). The catchment parameters related to topography, land use and soil are then calculated for each simulation unit. By establishing a digital basin, the study basin can be divided into a discrete grid system. Each grid is represented by a number of geometrically symmetrical hillslopes. The complex, two-dimensional water kinematics can be simplified to a single dimension by applying this flow-interval and hillslope-river based scheme of sub-grid parameterization. A physically based model is used to simulate the hydrological processes of snowmelt, canopy interception, evapotranspiration, infiltration, surface flow, subsurface flow and the exchange between the groundwater and the river for each hillslope. Finally, a nonlinear, numerical river routing scheme is used to calculate the catchment runoff. The model structure is shown in Fig. 5.2 (Wang et al. 2016).

Fig. 5.2 The structure of the Geomorphology Based Hydrological Model

In order to stay consistent with the period covered by the IMERG, the study period is 2014/4/1–2015/12/31, i.e. 21 months. The 3B42V7 and IMERG precipitation data were interpolated to 10 km resolution during GBHM preprocessing, and the discharge observed at the Chiang Saen gauge was input at the hourly scale to control the upstream discharge. Due to the lack of observation data in the Mekong River and the short study period, it is very difficult to calibrate the model within this study. The parameter set and calibration were therefore taken from a former Mekong-focused study (Wang et al. 2016). The discharge station information is shown in Table 5.1.

Table 5.1 Discharge station information

4 Evaluation Metrics

Three indices were used to measure the model performance for the two precipitation data simulations: the modified Nash-Sutcliffe efficiency coefficient (NASH), the relative error (RE), i.e. the ratio of the mean error to the observed mean, and the relative root-mean-square error (RRMSE). These indices were used to evaluate the agreement between the simulated and observed hydrographs at monthly and daily scales. The equations used to calculate these indices are as follows:

$$ NASH = 1 - \frac{\sum_{i=1}^{n} \left( obs_i - sim_i \right)^2}{\sum_{i=1}^{n} \left( obs_i - \overline{obs} \right)^2} $$
(5.1)

Equation 5.1 is the NASH efficiency coefficient formula. NASH efficiency coefficients range from negative infinity to 1. The closer to 1, the more accurate the discharge hydrograph simulation.

$$ RE=\frac{\overline{sim}-\overline{obs}}{\overline{obs}}\times 100\% $$
(5.2)

Equation 5.2 is the formula for relative error. The closer the RE is to zero, the more accurate the discharge simulation; positive values indicate overestimation and negative values indicate underestimation.

$$ RRMSE = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( obs_i - sim_i \right)^2}}{\overline{obs}} $$
(5.3)

Equation 5.3 is the relative root-mean-square error (RRMSE). A smaller RRMSE indicates a more accurate simulated discharge result. In these equations, \( \overline{sim} \) and \( \overline{obs} \) represent the mean simulated and observed discharges, respectively; \( obs_i \) and \( sim_i \) are the observed and simulated discharges at time step i (day or month); and n is the total number of days or months.
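The three indices defined in Eqs. 5.1 to 5.3 can be computed directly from paired observed and simulated series. The sketch below is illustrative only: the function names and the sample discharge values are hypothetical and are not results from this study.

```python
import numpy as np

def nash(obs, sim):
    """Nash-Sutcliffe efficiency as written in Eq. 5.1."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def relative_error(obs, sim):
    """Relative error in percent as written in Eq. 5.2."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return (sim.mean() - obs.mean()) / obs.mean() * 100.0

def rrmse(obs, sim):
    """Relative root-mean-square error as written in Eq. 5.3."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return np.sqrt(np.mean((obs - sim) ** 2)) / obs.mean()

# Hypothetical discharges (m3/s) at four time steps.
obs = np.array([100.0, 200.0, 300.0, 400.0])
sim = np.array([110.0, 190.0, 310.0, 420.0])

print(f"NASH  = {nash(obs, sim):.3f}")             # 0.986
print(f"RE    = {relative_error(obs, sim):.1f} %")  # 3.0 %
print(f"RRMSE = {rrmse(obs, sim):.4f}")            # 0.0529
```

The same functions apply at either time scale; only the input series (daily or monthly means) changes.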

5 Results

5.1 Direct Evaluation of the IMERG and 3B42V7

There are around 50 rain gauges located in the LMRB. However, during our study period, only 34 gauges provided rainfall observations. Figure 5.3 shows the distribution of these 34 stations. It is clear that most stations are located around the LMRB. These stations are operated by Thailand and Vietnam. There are few stations in Lao PDR, with almost no stations in Cambodia. Such a sparse distribution of rain gauges fails to provide reliable rainfall information for regional application. However, data from these stations can be used to evaluate the IMERG and 3B42V7 through the point-pixel comparison.
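A point-pixel comparison pairs each gauge with the satellite grid cell that contains it. The text does not describe the exact matching procedure, so the sketch below simply assumes nearest-cell-centre matching on a regular latitude-longitude grid; the grid extent, the gauge coordinates and all function names are hypothetical.

```python
import numpy as np

def nearest_pixel_index(grid_lats, grid_lons, lat, lon):
    """Return (row, col) of the grid cell whose centre is nearest to a gauge."""
    i = int(np.argmin(np.abs(np.asarray(grid_lats) - lat)))
    j = int(np.argmin(np.abs(np.asarray(grid_lons) - lon)))
    return i, j

def extract_at_gauges(field, grid_lats, grid_lons, gauges):
    """Extract pixel time series for each gauge.

    field: array of shape (time, lat, lon); gauges: list of (lat, lon).
    Returns an array of shape (time, n_gauges).
    """
    columns = []
    for lat, lon in gauges:
        i, j = nearest_pixel_index(grid_lats, grid_lons, lat, lon)
        columns.append(field[:, i, j])
    return np.stack(columns, axis=1)

# Hypothetical 0.25-degree grid of cell centres and one illustrative gauge.
grid_lats = np.arange(10.125, 20.0, 0.25)
grid_lons = np.arange(100.125, 108.0, 0.25)
field = np.zeros((3, grid_lats.size, grid_lons.size))
field[:, 31, 10] = [1.0, 2.0, 3.0]   # rainfall placed in a single pixel
gauges = [(17.88, 102.72)]           # made-up coordinates
series = extract_at_gauges(field, grid_lats, grid_lons, gauges)
```

The extracted pixel series can then be paired with the gauge records and scored with the indices from Sect. 4.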

Fig. 5.3 Distribution of rain gauges with observation in the study period

Figure 5.4 shows a scatter plot for the IMERG and 3B42V7 against ground rain gauge observations for the study period as a whole. Both products perform similarly in the LMRB, i.e. they both provide good estimates for low rainfall occurrences but underestimate heavy rainfall occurrences. Quantitatively, the IMERG has a slightly lower RRMSE and a higher correlation coefficient than the 3B42V7 (Table 5.2).

Fig. 5.4 Performance of the IMERG and 3B42V7 at 34 gauges during the study period

Table 5.2 Statistical metrics for IMERG and 3B42V7 evaluation

5.2 Evaluation of Discharge Simulation

Figure 5.5 shows the comparison of the IMERG and 3B42V7 simulations against observations at the five discharge stations, while Table 5.3 summarises the model's performance when driven by the different precipitation data. From Fig. 5.5 it can be seen that (1) both the IMERG and 3B42V7 are able to drive the GBHM and produce reliable discharge simulations at the monthly scale; and (2) at most discharge stations, the discharge from the IMERG simulation is closer to the observed discharge than that from the 3B42V7 simulation. Such findings are as expected, since both the TRMM and GPM are calibrated with ground observations at the monthly scale, and the GPM is developed on the basis of the TRMM with more accurate sensors and greater accumulated experience.

Fig. 5.5 Comparison of monthly discharge at the five discharge stations

Table 5.3 Discharge simulation results of the two data force experiments

More specifically, from the evaluation indexes shown in Table 5.3, it is clear that both the IMERG and 3B42V7 simulations have monthly NASH coefficients larger than 0.8, except for the 3B42V7 simulation at Nong Khai station. From the view of RRMSE, both simulations are performing well, with RRMSE less than 0.3 for most stations. Moreover, for all five stations, the IMERG simulation shows higher NASH coefficients and lower RRMSEs than the 3B42V7 simulation. This indicates that the new generation satellite rainfall product performs as well as expected.

Table 5.3 also shows the statistical results at the daily scale. In general, the daily discharges were also well simulated by both the IMERG and 3B42V7. Compared with those at the monthly scale, the NASH coefficients of the daily simulations decreased slightly (by 0.05 to 0.17) and the RRMSEs increased from 0.3 to 0.4, with only one exception, the Nong Khai station. Figure 5.6 shows the comparison of simulated and observed discharges at the daily scale. Both the IMERG and 3B42V7 simulation results are generally in agreement with observations. This means that the satellite precipitation data, which are calibrated at the monthly scale, are able to drive hydrological models to produce reliable discharge simulations in the LMRB even at the daily scale.

Fig. 5.6 Comparison of daily discharge at the five discharge stations

From both Table 5.3 and Fig. 5.6, we find that the 3B42V7 fails to simulate discharge at Nong Khai station, while the IMERG provides a modest simulation. In order to identify the reasons for this difference, we compared the original 3B42V7 precipitation data with those of the IMERG. Figures 5.7 and 5.8 show the monthly average rainfall of the 3B42V7 and IMERG and the differences between them in July and August of 2015, respectively. From these two sets of maps it is clear that there are only small differences in most regions of the MRB, except in the sub-basins around Nong Khai station. Both satellite-based products suggest that the rainfall center is around Nong Khai during July and August, but the rainfall center of the 3B42V7 lies upstream of Nong Khai, whereas that of the IMERG lies downstream of Nong Khai. According to the difference maps in Figs. 5.7 and 5.8, the 3B42V7 shows more rainfall than the IMERG in the region from Luang Prabang to Nong Khai, and less rainfall than the IMERG in the region from Nong Khai to Mukdahan. However, there are almost no rain gauges in this region, so it is hard to say directly which product is more accurate. Consequently, we can only evaluate the performance of the rainfall products indirectly by comparing the discharge simulations.

Fig. 5.7 Averaged rainfall of the 3B42V7 (left), IMERG (middle) simulations, and the differences between them (right) in July 2015

Fig. 5.8 Averaged rainfall of the 3B42V7 (left), IMERG (middle) simulations, and the differences between them (right) in August 2015

From Table 5.3, it is clear that the 3B42V7 simulation shows a much larger RE (31.4%) at Nong Khai station than the IMERG simulation does (12.4%). Figure 5.6 suggests that the 3B42V7 simulation overestimated discharge mainly during the rainy season. However, both Fig. 5.6 and Table 5.3 demonstrate that the 3B42V7 simulation has an accuracy comparable to that of the IMERG simulation at Mukdahan station. From Figs. 5.7 and 5.8 we can see that the 3B42V7 clearly underestimates rainfall in the region from Nong Khai to Mukdahan. It is speculated that the overestimation and underestimation compensate for each other, thus making the 3B42V7 simulation performance comparable to the IMERG simulation at Mukdahan station. By combining the above-mentioned findings, we conclude that (1) the IMERG gives a good rainfall estimation in the region from Luang Prabang to Mukdahan, and (2) the 3B42V7 overestimates rainfall in the region from Luang Prabang to Nong Khai and underestimates rainfall in the region from Nong Khai to Mukdahan.

6 Conclusions

Precipitation is one of the most important inputs of land surface models, hydrological models and ecological models. In regions like the LMRB, where rain gauges are sparsely and unevenly distributed, satellite-based precipitation data can provide a valuable and unique data source for hydrological research.

This study presents a first evaluation of the TRMM 3B42V7 and GPM IMERG in the LMRB, both through a direct point-pixel comparison and an indirect discharge comparison. The point-pixel comparison, conducted over 34 rain gauges, shows that the 3B42V7 and IMERG perform similarly in rainfall estimations, while statistically, the IMERG is slightly better than the 3B42V7, with smaller RRMSEs and bigger correlation coefficients.

For discharges simulated at most stations, both the IMERG and 3B42V7 generally demonstrated good results at the monthly scale, with a NASH higher than 0.8 and RRMSE less than 0.3. The results at the daily scale were less accurate than those at the monthly scale, but still acceptable with a NASH around 0.6.

The 3B42V7 simulation showed obvious overestimation of discharge at Nong Khai station, with a 0.01 daily NASH and 0.68 RRMSE. At the same time, the IMERG simulation performed well, with a daily NASH of 0.62 and RRMSE of 0.412. This difference indicates that the IMERG outperforms the 3B42V7 in this region. A more detailed analysis found that the 3B42V7 overestimated rainfall in the region from Luang Prabang to Nong Khai and underestimated rainfall in the region from Nong Khai to Mukdahan.

Some limitations exist in this study. Although our study shows that the IMERG provides a better input rainfall product than the 3B42V7 for the GBHM, this finding should be tested in other cases, such as using different remote sensing-based rainfall data for different distributed hydrological models in different basins. Another uncertainty lies in the short period of this study. To achieve more stable and reliable evaluation results, a longer simulation period is preferred. This will become possible as GPM data continue to accumulate.