Performance assessment of gridded climatic data and modeling spatial accuracy ranking matrix for gridded precipitation using a new pixel-based approach: a district-wise case study of Punjab

Saleem, Muhammad Usman; Akram, Mian Sohail

doi:10.1007/s12517-023-11442-w

Performance assessment of gridded climatic data and modeling spatial accuracy ranking matrix for gridded precipitation using a new pixel-based approach: a district-wise case study of Punjab

Original Paper
Published: 03 May 2023

Volume 16, article number 347, (2023)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Arabian Journal of Geosciences Aims and scope Submit manuscript

Performance assessment of gridded climatic data and modeling spatial accuracy ranking matrix for gridded precipitation using a new pixel-based approach: a district-wise case study of Punjab

Download PDF

193 Accesses
Explore all metrics

Abstract

In this study, the performance of several daily gridded climatic variables, e.g., maximum and minimum temperatures “Tmax (CPC) and Tmin (CPC),” relative humidity “RH (CDC),” wind speed “WS (CPC),” and precipitation collected from National Oceanic and Atmospheric Administration, Physical Science Laboratory “NOAA (PSL),” was evaluated with ground observed data in 9 different districts of Punjab. It was concluded, after in-depth investigations, that Tmax (CPC), Tmin (CPC), and RH (CDC) follow the same quality and performance as ground observations. The performance of gridded precipitation in different geospatial approaches, e.g., point-to-point and polygon-to-point, was also evaluated. Results revealed that the performance of daily mean gridded precipitation of district “RF (CPC DM)” is most reliable and accurate with gauge-based observation. The accuracy of monthly long-termed mean precipitation RFM (CPC) in the remaining 27 districts of Punjab was predicted by using techniques of statistical and geostatistical interpolations. The model was optimized based on minimum root mean square error (RMSE), and a monthly spatial accuracy ranking matrix was developed using a new pixel-based approach. This research formulated the district-wise accuracy of RFM (CPC) by using the first rank in the spatial accuracy ranking matrix. Accuracy mapping revealed that RFM (CPC) is most accurate in South Punjab districts (Bahawalpur, Bahawalnagar, RY Khan, Rajanpur, Muzaffargarh, DG Khan, Layyah, Multan, Lodhran, Khanewal, and Vehari) throughout the year. It performed well in Central Punjab districts (Lahore, Sheikhupura, Kasur, Nankana, Sahiwal, Pakpattan, Okara, Faisalabad, Chiniot, T.T. Singh, Narowal, Gujrat, Gujranwala, and Sialkot) during winter, spring, and autumn seasons, while in Monsoon season, its accuracy is moderate. RFM (CPC) poorly performed in Northern Punjab districts (Rawalpindi, Jhelum, Chakwal, Attock, Mianwali, Khushab, Bhakkar, Jhang, Sargodha, and Hafizabad) throughout the time series of data.

Development of high resolution daily gridded precipitation and temperature dataset for potohar plateau of indus basin

Article 04 September 2023

Evaluation of gridded precipitation products in the selected sub-basins of Lower Mekong River Basin

Article 15 November 2022

Monthly gridded precipitation databases performance evaluation in North Patagonia, Argentina

Article 27 August 2024

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

The climate of a region consists of rainfall, daily maximum and minimum temperatures, relative humidity, and wind speed. Such variables are of great concern in different areas of earth sciences, e.g., agro-meteorology, agro-hydrology, and agro-ecology (Tapiador et al. 2012; Beck et al. 2019). The particular elements of a climate, temperature, and precipitation control the phenology, productivity, abundance, interaction, and geographical distribution of biodiversity and the biotic ecosystem of a region (Early and Keith 2019). However, the unavailability of long-term quality-controlled and spatially continuous climatic data is a significant obstacle to observing a region’s climate (Prein & Gobiet 2017; Nashwan et al. 2019a, b). In precipitation, if rain gauge records are accurate and available for long time series, then less dense spatial distribution hinders their use (Bell et al. 2015; Kidd et al. 2017; Nashwan et al. 2018). To solve this problem, extensive range of global and regional gridded datasets have been developed, which are of high resolution and provide spatiotemporal coverage globally and regionally in ungauged or sparse gauged regions (Gyalistras 2003; Haylock et al. 2008; Yatagai et al. 2009; Schiemann et al. 2010; Belo‐Pereira et al. 2011; Herrera et al. 2012). The applications of high-resolution gridded datasets have been overgrown recently, mainly due to spatial–temporal resolutions. Despite that, there is a significant concern about accuracy in gridded data, which led to a large number of studies related to performance evaluation of gridded data at the regional level (Dinku et al. 2008; Ma et al. 2008; Kyselý and Plavcová 2010; Sylla et al. 2013; Eum et al. 2014; Prakash et al. 2015a, b; Prein and Gobiet 2017; Nashwan et al. 2019a). The performance evaluation of gridded data is performed mainly by comparing it with reliable ground data. Several climatic datasets have developed based on different data sources, e.g., ground observations, satellite, radar, re-analysis, and emerging techniques. These datasets are very useful for finding spatial–temporal trends (Trenberth et al. 2003; Ghosh et al. 2012; Fischer and Knutti 2014) and the reliability of climatic extremes (Trenberth et al. 2003; Ghosh et al. 2012; Fischer and Knutti 2014). In particular to precipitation, literature shows a substantial difference between different global datasets at daily (Wong et al. 2017) to yearly (Sun et al. 2018) time scales, which limits the understanding of regional to global precipitation. There is a need to develop standard accuracy thresholds for gridded climatic datasets, particularly rainfall, on a local to regional scale in Punjab, a province of Pakistan.

The accuracy and performance of the gridded climatic dataset over Pakistan were conducted by Karori and Zhang (2008); Khan et al. (2014, 2018); Bui et al. (2019); Ullah et al. (2019); Adnan et al. 2020; Baudouin et al. (2020); Ougahi and Mahmood (2022); Usman et al. (2022). However, these studies are on particular climatic regions, specific river catchments, and broken past time series datasets. This research evaluates the performance of gridded climatic variables over Punjab over the regional (district) scale. Such performance evaluation and accuracy threshold criteria were never tested before on local to regional scales along such a continuous time series dataset.

Study area

The Punjab (meaning land of five rivers) is the largest province of Pakistan, and this province comprises 36 districts of different geographical extent. It has the largest population and shares in the economy of the country. Agriculture is the major contributor to the province’s economy, which is sensitive to climatic extremes and abnormalities. This province has faced a series of natural disasters, e.g., floods, climatic extremes, and earthquakes, in the past decade and still facing such disasters regularly. The historical earthquake of 2005 and flood of 2010 nearly shuffled its economic conditions thoroughly. The policymakers require accurate and timely information on climatic variables to mitigate regional disasters. Most districts of Punjab face extreme weather with foggy winters, sometimes accompanied by precipitation. The temperature begins to rise from mid-February till April in the springtime. By May, monsoon onsets and brings intensive rainfall, which causes flash floods in the province. June and July are the hottest months of the year. Climatic extremes are notable from hot, barren, and flat terrain in the south to the cool and mountainous Pothohar plateau in the north (Table 1; Fig. 1).

Table 1 Overall climatological view of the study area

Full size table

Dataset and methodology

Datasets

Different gridded datasets of the National Oceanic and Atmospheric Administration, Physical Science Laboratory (NOAA PSL), mainly climate prediction center (CPC) Global Unified Temperature (0.50 — degree latitude × 0.50 — degree longitude grid) and CPC Global Unified Gauge-Based Analysis of daily precipitations (0.50 — degree latitude × 0.50 — degree longitude grid), climate diagnostics center (CDC) NCEP-NCAR Reanalysis-1 surface level products (2.5-degree × 2.5-degree global grids), e.g., wind speed and relative humidity, were tested over nine different districts of Punjab. The following daily variables: maximum temperature “Tmax (CPC),” minimum temperature “Tmin (CPC),” wind speed “WS (CDC),” relative humidity “RH (CDC),” and rainfall “RF (CPC)” and monthly long-term average rainfall “RFM (CPC)” were selected to check their accuracy during the period 1991–2020. NOAA PSL provides such historic datasets globally, four times a day on monthly bases till the present time.

Pakistan Meteorological Department (PMD) is an official department in Pakistan that observes climatic variables on the ground based on meteorological observatories (Fig. 1). PMD (established in 1947) monitors and records maximum and minimum temperature “Tmax (PMD)” and “Tmin (PMD)” in a day, while it records relative humidity “RH (PMD 8 AM) and RH (PMD 5 PM)” and wind speed “WS (PMD 8 AM) and WS (PMD 5 PM)” two times in a day, e.g., 8 A.M. and 5 P.M. (Pakistan Standard Time). PMD’s rain gauges collect total rainfall at a specific location (Fig. 1) and announce it as daily district total rainfall “RF (PMD DT)” and monthly long-termed average rainfall as “RFM (PMD).” PMD observatories are spatially sparse in Punjab, and a gridded dataset may be required to mitigate climatic disasters in the province.

Methodology

Daily datasets from NOAA PSL and PMD were obtained from 1991 to 2020 to check the accuracy of gridded datasets over nine districts of Punjab. Intensive care was taken to select the pixels of NOAA PSL datasets falling within the districts to calculate the districts’ mean of Tmax (CPC), Tmin (CPC), WS (CDC), and RH (CDC). However, different geospatial approaches were used in precipitation, i.e., district average, point-sum, point-average, and district sum, which were compared with “RF (PMD DT).” In the first case of precipitation, the average of all gridded rainfall pixels falling within the district domain was taken, hence written as RF (CPC DM). In the second and third cases, the sum and average of all gridded rainfall pixels were taken over the PMD observatory, which is written as RF (CPC PS) and RF (CPC PM), respectively. In the last case of rainfall, a sum of all gridded rainfall pixels was taken over district “RF (CPC DS).” The climatic datasets were taken in standard units, i.e., precipitation in millimeters (mm), the temperature in degrees centigrade (°C), relative humidity in percentage (%), and wind speed in meters per second (m/s) throughout this research. Climatic data were preprocessed to remove null values and gaps in the time series (Fig. 2).

A new Spatial Accuracy Ranking Matrix using a Pixel-based Approach was introduced in this research to check the accuracy of the gridded precipitation in real-time applications and to produce accuracy maps over all of Punjab (A total of 36 districts). A pixel-wise spatial accuracy approach was adopted, i.e., we selected the pixels of gridded precipitation within each district boundary (Fig. 3A). RMSE was calculated for each pixel with the PMD rainfall of the selected district (Fig. 3B); hence, a spatial accuracy grid was formed, which contains pixel-by-pixel RMSE of gridded precipitation. Similarly, a spatial accuracy grid was developed for each district, representing the spatial accuracy of each gridded precipitation pixel (based on RMSE) to PMD rainfall (Fig. 3B). As this spatial accuracy grid contains the RMSE values of each pixel falling inside the district, a modeling approach of pixel-wise accuracy was utilized to predict the gridded rainfall accuracy over the whole province of Punjab (Fig. 3).

This spatial accuracy grid is divided into training and testing grids of 80% by 20% to formulate a model. Due to the different geospatial extent of districts (Fig. 1), the testing pixels were selected with spatial randomness, consisting of the best visual and scatter geographic locations, so that interpolation and extrapolation in the rest of the districts have sufficient training pixels (Fig. 1). Nine different interpolation techniques broadly consisting of purely statistical and geostatistical approaches were utilized to develop pixel-based interpolated spatial accuracy grid. The interpolated spatial accuracy grid was validated against the testing grid, and the model was optimized to get minimum RMSE (Fig. 4). The ranking of the optimized interpolated spatial accuracy grid was declared based on ascending order of minimum RMSE with a testing grid; hence, the best interpolation technique was ranked for each month of the year, and it was named spatial accuracy ranking matrix. This approach formulated a matrix of order 12 by 9, representing month-wise spatial accuracy of different interpolation techniques in a month by accuracy rank all over Punjab (Fig. 4). Below are the details of statistical and geostatistical interpolation techniques used in the spatial accuracy ranking matrix using a pixel-based approach.

Inverse distance weighting (IDW) with variable search radius is based on the Tobler’s Law of geography (Saleem et al. 2018). Kumar et al. (2022) and Philip and Watson (1982) used the following equation for IDW:

$${RF}_{idw}=\frac{{\sum }_{i=1}^{n}{w}_{i}\times {RF (CPC)}_{i}}{{\sum }_{i=1}^{n}{w}_{i}} , {w}_{i}=\frac{1}{{dis}_{ji}^{p}}$$

(1)

The geostatistical technique of Kriging was used in research with two modes, e.g., ordinary and universal. McBRATNEY and WEBSTER (1986), Oliver and Webster (2007), and Kumar et al. (2022) mentioned the following formula for ordinary Kriging:

$${RF\left(\mu \right)}_{o}={\sum }_{i=1}^{n(u)}{\gamma }_{i}\times RF({\mu }_{i})$$

(2)

The mathematical formula for Universal Kriging (Gundogdu and Guney 2007) is given below:

$${RF\left(u\right)}_{u}=d\left(u\right)+ \delta (u)$$

(3)

where in (1), ${RF}_{idw}$ is interpolated spatial accuracy grid of rainfall pixels through the IDW, ${RF (CPC)}_{i}$ is the known value of rainfall pixel at an ith location, W_i is the weight,${dis}_{ji}^{p}$ is the displacement between pixels at ith, jth, and $\rho$ is the degree of distance. In (2), ${RF\left(\mu \right)}_{o}$ is the interpolated spatial accuracy grid through ordinary Kriging at a given location of $\mu$, $RF({\mu }_{i})$ are given a spatial accuracy grid, ${\gamma }_{i}$ is kriging weight for minimizing the variance, and $n(u)$ is the total surrounding samples in predicting the value of ${RF\left(\mu \right)}_{o}$. In (3), ${RF\left(u\right)}_{u}$ is the predicted rainfall through universal Kriging, $d\left(u\right)$ is the deterministic function, and $\delta (u)$ is macroscale or random variations. The following semi-variance models of Kriging were incorporated in this research: ordinary Kriging with a circular semi-variance model (KOC), ordinary Kriging with an exponential semi-variance model (KOE), ordinary Kriging with Gaussian semi-variance model (KOG), ordinary Kriging with a linear semi-variance model (KOL), and ordinary Kriging with a spherical semi-variance model (KOS). The other modes of Kriging were universal Kriging with linear drift written as “KU (LL)” and universal Kriging with quadratic drift written as “KU (LQ).” Spline interpolation predicts values using a mathematical function that minimizes the overall surface curvature, resulting in a smooth surface that passes perfectly through training pixels (Shekhar and Xiong 2008).

The following performance and quality evaluation parameters were adopted to check NOAA PSL quality and accuracy with the real-time PMD dataset:

$$RMSE=\frac{\sqrt{({x}_{PMD}-{x}_{Gridded})}}{n}$$

(4)

$$SKWN=\frac{\sum_{i}^{n}(x-{x}_{mean})}{\left(n-1\right)\times {\sigma }^{3}}$$

(5)

where in (4), RMSE is the root mean square error, and its minimum value indicates more accuracy in dataset. SKWN is the skewness of the dataset, and ${x}_{PMD}$ is the PMD dataset; ${x}_{Gridded}$ is the NOAA PSL datasets, n is the total number of samples, and $\sigma$ is the standard deviation of datasets. The second standard error of SKWN was suggested by Brown (1997) and Doane and Seward (2011) and given as follows:

$$2\mathrm{nd}\;\mathrm{standard}\;\mathrm{error}\;\left(SKWN\right)=2\times\sqrt{\frac6n}$$

(6)

$$KRTS=\frac{\sum_{i}^{n}{\left(x-{x}_{mean}\right)}^{4}}{n\times {\sigma }^{4}}$$

(7)

where in (7) KRTS is the kurtosis of datasets and its 2nd standard error suggested by Brown (1997) and Doane and Seward (2011) as follows:

$$2\mathrm{nd}\;\mathrm{standard}\;\mathrm{error}\;\left(KRTS\right)=2\times\sqrt{\frac{24}n}$$

(8)

The correlation coefficient (CORREL) was calculated by using the following formula:

$$CORREL=\frac{\sum ({x}_{PMD}-{x}_{PMDmean})({x}_{Gridded}-{x}_{Griddedmean})}{\sqrt{\sum {\left({x}_{PMD}-{x}_{PMDmean}\right)}^{2}{({x}_{Gridded}-{x}_{Griddedmean})}^{2}}}$$

(9)

where in (9), ${x}_{PMDmean}$ and ${x}_{Griddedmean}$ are the means of PMD and NOAA PSL datasets respectively. The following classifications of CORREL were adopted in research, e.g., 0.2 to 0.39 a weak, 0.4 to 0.59 a moderate, 0.6 to 0.79 a strong, and 0.8 to 1.0 a very strong correlation between variables.

In order to check relative deviations in gridded and ground observed datasets, the standard deviation ratio (RSTD) was calculated using (10):

$$RSTD=\frac{{x}_{Gridded}}{{x}_{PMD}}$$

(10)

The following possibilities of RSTD were used, e.g., RSTD > 1.0, RSTD = 1.0, and RSTD < 1.0

Results and discussions

Brown (1997) and Doane and Seward (2011) made an interpretation of skewness and kurtosis, which was used in this research. The value of 2nd standard error (SE) of skewness for the dataset used was $\pm 0.258$. Tmax (CPC) and Tmax (PMD) remain negatively skewed in all districts of Punjab, and Tmax (CPC) follows the same pattern of skewness as Tmax (PMD) in 9 districts of Punjab (Table 2). However, such negative values were beyond the standard error (SE) of skewness, so a non-symmetric curve was declared. Tmin (CPC) was symmetric and negatively skewed and followed the same skewness trend as that of Tmin (PMD). The results of WS (CPC) reveal a non-symmetric nature of data in all districts, and a similar direction of positive skewness was observed for WS (PMD 8 AM) in 9 districts of Punjab (Table 2). WS (PMD 5 PM) followed symmetric nature of data in Lahore, Multan, and Sargodha districts (Table 2). In contrast, the rest of the districts follow the pattern of non-symmetric distributions like WS (CPC). WS (PMD Mean) was found normal in Lahore and Sialkot districts, while in the rest, it remains non-symmetric with positive skewness like WS (CPC). RH (CDC) follows a symmetric nature in Jhelum and Sargodha districts. In the rest of the districts, the non-symmetric nature of the dataset was found, i.e., negatively skewed in Sialkot and positively skewed in the rest of the districts (Table 2). RH (PMD 8 AM) was symmetric in the Jhang, and the rest of the districts were revealed non-symmetric with negative skewness (Table 2). RH (PMD 5 PM) was also symmetric with slightly negative skewness in Bahawalnagar, Faisalabad, Lahore, and Multan, while non-symmetric in the rest of the districts (Table 2). RH (PMD Mean) was symmetric in the Jhelum district; in the rest, it was non-symmetric with negative and positive skewness. There was no symmetry in RF (CPC DM), RF (CPC PM), RF (CPC PS), RF (CPC DS), and RF (PMD DT) because skewness was positive and much beyond the SE of skewness (Table 2).

Table 2 Results of skewness of climatic datasets used in research

Full size table

The value of the 2nd standard error (SE) of kurtosis for the dataset used was $\pm$ 0.516. Tmax (CPC), Tmax (PMD), Tmin (CPC), and Tmin (PMD) were fairly platykurtic (that is, ordinarily high) with slightly negative kurtosis values in 9 districts of Punjab. WS (CPC) was mesokurtic in Bahawalnagar, Bahawalpur, Multan, and Sialkot, while slightly negative mesokurtic in Jhelum and the rest of the districts; WS (CPC) was leptokurtic.WS (PMD 8 AM) was approximately mesokurtic in Jhang and Multan, and it remained slightly negative mesokurtic in Lahore, Sargodha, and in rest, had leptokurtic distributions. WS (PMD 5 PM) was slightly negatively mesokurtic in Bahawalnagar, Bahawalpur, Jhang, Jhelum, and had a flat platykurtic distribution in Lahore, Multan, and Sargodha, and in Sialkot, it had a flatten leptokurtic kurtosis (Table 3). WS (PMD Mean) had mesokurtic distributions in Jhang, Jhelum, Multan, Sialkot, Bhawalnagar; platykurtic in Lahore, Sargodha; and leptokurtic in Bhawalpur district. RH (CDC) had normal distributions in Jhelum, Sialkot; platykurtic in Sargodha; and in the rest of the districts, it was leptokurtic. RH (PMD 8 AM) was mesokurtic in Jhelum, Bhawalpur, and slightly negative mesokurtic in Bhawalnagar, Lahore, Jhang, and Multan. It was platykurtic in the rest of the districts. RH (PMD 5 PM) was slightly negative mesokurtic in Bahawalnagar, Bahawalpur, Jhang, Sialkot, Sargodha, and fairly mesokurtic in Faisalabad, and at rest, it had platykurtic distributions. RH (PMD Mean) followed negative mesokurtic distributions in all districts except Sialkot, which had much leptokurtic, and Jhelum, which had platykurtic distributions. RF (CPC DM), RF (CPC PM), RF (CPC PS), RF (PMD DT), and RF (CPC DS) had very much tall leptokurtic distributions in all selected districts of Punjab (Table 3).

Table 3 Results of kurtosis for the climatic datasets used in this research

Full size table

The correlation coefficient was calculated with the corresponding climatic variable of the PMD observatory in the district to check the linear relation between the variables. Tmax (CPC) showed a very strong correlation with Tmax (PMD) in each district except Sargodha, which had a moderate correlation. Similarly, a strong correlation was found between Tmin (CPC) and Tmin (PMD), except in Sargodha, where a weak correlation was found. WS (CPC) showed a strong to moderately strong correlation with WS (PMD 8 AM), WS (PMD 5 PM), and WS (PMD mean) in Bahawalnagar, Bahawalpur, and Multan, respectively. It had a weak correlation in Jhang and Jhelum and no correlation in the rest of the districts (Table 4). RH (CDC) showed a moderate to fair strong correlation with RH (PMD 8 AM) in Faisalabad, Sargodha, Lahore, Sialkot, Jhelum, and Jhang, respectively, and a weak correlation in Multan. RH (CDC) showed no correlation with RH (PMD 8 AM) in the rest of the districts. RH (CDC) revealed a moderate to strong correlation with RH (PMD 5 PM) in Bahawalpur, Bahawalnagar, Sargodha, Multan, Faisalabad, Jhelum, Sialkot, and Lahore, respectively (Table 4). A very strong to solid uphill was found between RH (CDC) and RH (PMD Mean) in Jhang, Sialkot, Lahore, Sargodha, and Faisalabad districts, respectively. A moderately strong to weak correlation comes from these climatic variables in Bahawalnagar, Bahawalpur, and Jhang, respectively. RF (CPC DM) revealed a relatively fair strong correlation with RF (PMD DT) in Bahawalnagar, Sialkot, Jhang, Lahore, Faisalabad, Bahawalpur, and Multan, respectively. RF (CPC PM) revealed a moderately strong correlation with RF (PMD DT) in Sialkot, Jhelum, Jhang, and Faisalabad districts, respectively, and a robust correlation was observed in Bahawalnagar and Sargodha, and a weak correlation in Multan (Table 4). A solid correlation of RF (CPC PS) observed in the Bahawalnagar, Lahore, and Sargodha districts showed a moderate to strong correlation in Sialkot, Bahawalpur, and Jhang, respectively. This variable showed a weak correlation in the rest of the districts (Table 4).

Table 4 Results of the correlation coefficient for the climatic dataset used in this research

Full size table

The standard deviations ratio (RSTD) was calculated by dividing the standard deviations of gridded data by the ground observed variable (Table 5).

Table 5 Results of RSTD of the climatic dataset used in this research

Full size table

There are three possibilities, e.g., RSTD > 1, RSTD = 1, RSTD < 1.0. The first case means the standard deviation (STD) of the gridded dataset is more than the standard deviation of the corresponding variable observed on the ground. The second case (RSTD = 1.0) means the STD of both gridded and ground observed datasets are of the same spread, and the last case (RSTD < 1.0) means the STD of the gridded climatic variable is less than the STD of the ground observed one.

With this interpretation, Tmax and Tmin have RSTD near 1.0, indicating the same spread of both datasets in districts of Punjab (Table 5). WS (8 AM), WS (5 PM), and WS (Mean) have RSTD nearly equal to 1.0 in Bahawalnagar and Lahore districts, while it has RSTD < 1.0 in Multan and the rest of the districts have RSTD > 1.0. RH (8 AM), RH (5 PM), and RH (mean) have RSTD nearly equal to 1.0 in all districts except in Jhang, where RSTD < 1.0, revealing the spread in WS (CPC) to WS (PMD 8 AM, 5 PM and mean). RF (CPC DM) has RSTD, nearly 1.0 in Bahawalpur and Faisalabad; less than 1.0 in Bahawalnagar, Sargodha, and Sialkot; and greater than 1.0 in the rest of the districts (Table 4). RSTD of RF (CPC PM) was near 1.0 in Bahawalnagar, Bahawalpur, Faisalabad, and Lahore, and in the rest of the districts, RSTD < 1. RSTD of RF (CPC PS) has < 1.0 in Jhelum, Sargodha, Lahore, Sialkot, and it has RSTD > 1.0 in the rest of the districts (Table 5).

RMSE was the last and most crucial parameter deployed to find the accuracy of NOAA PSL datasets. Tmax (CPC) and Tmin (CPC) found an accurate representation of Tmax (PMD) and Tmin (PMD) as their RMSE remains very low (Table 6).

Table 6 Root mean square error of gridded climatic dataset with ground observed data

Full size table

Similarly, the RMSE of WS (CPC) with WS (PMD 8 AM), WS (PMD 5 PM), and WS (PMD mean) showed excellent accuracy (RMSE $\cong 0)$ in all of the districts except Sargodha, where accuracy slightly varied (RMSE $\cong 3-5)$. RMSE of RH (CPC) with RH (PMD 8 AM) was marginally more than RH (PMD 5 PM) and RH (PMD Mean) in most of the districts (Table 6). Maximum RMSE of 3.88 in RH (CDC) with RH (PMD 8 AM) describing 3.88% relative humidity error in RH (CDC) from PMD observatory. RF (CPC DT), RF (CPC DM), and RF (CPC PM) were very accurate (RMSE $\cong 0-2)$ in Punjab, except in Sargodha, where RF (CPC DT) was poor (RMSE > 6) with the ground observatory. RF (CPC PS) was also very accurate (RMSE $\cong 0-2).$

Spatial accuracy ranking matrix of gridded rainfall by using a pixel-based approach

This approach was developed on monthly long-term mean gridded precipitation RFM (CPC) and long-term mean ground rainfall collected by PMD RFM(PMD). According to the methodology described above, the spatial accuracy grid was divided into training and testing grids of 80% by 20% (Fig. 1). The training grids were given as input to the model, and the spatial accuracy grid was interpolated up to Punjab province using statistical and geostatistical interpolation techniques, and an interpolated spatial accuracy grid was formed. RMSE was calculated of interpolated spatial accuracy grid with the testing grid, and results are formulated in Table 7.

Table 7 Optimized RMSE for interpolated spatial accuracy grid with the RFM (CPC) testing grid by nine different interpolation techniques

Full size table

A spatial accuracy ranking matrix (Table 8) is developed from Table 7 by rearranging it in the ascending order of minimum RMSE to months. This spatial accuracy ranking matrix concluded the month-by-best ranking matrix for gridded rainfall with the sparse rain gauges in the Punjab province of Pakistan. The nine ranks (columns) in this matrix represent the order of the best interpolation technique in space by time; hence, the best interpolation is placed at rank first and the least at the end. The 1st rank in the matrix was declared the optimum, in which the best interpolation techniques among the nine were placed (Table 8).

Table 8 Spatial accuracy ranking matrix using the pixel-based approach for RFM (CPC)

Full size table

In the first rank of the spatial accuracy ranking matrix (Table 8), IDW produced RMSE of 2.55, 2.86, 6.71, 0.86, and 0.36 in January, May, September, October, and November, and KOE produced RMSE of 3.18, 3.56, 3.15, 10.72, 19.92, and 1.32 in February, March, April, June, August, and December (Table 9). In the 2nd rank, KOE, KOL, KOG, KOS, and KOC produced the least RMSE with testing pixels. KOE produces RMSE of 2.76, 22.35, 9.83, 1.13, and 0.9 in January, July, September, October, and November, and KOL produces 3.21, 3.58, and 3.16 in February, March, and April. In the 3rd rank, KOC, KOS, and KOL remain the least RMSE-producing interpolation techniques. RMSE from KOC was 2.78, 3.21, 3.58, 3.16, 10.73, 20.03, 9.85, and 1.01 in January, February, March, April, June, August, September, and November. The RMSE of KOC, KOL, and KOS was the same in February, March, April, June, and September. In rank 4 in the spatial accuracy ranking matrix (Table 8), RMSE produced through KOC was 3.21, 3.58, 3.16,10.73, 22.44, 9.85, 1.35, and 1.01 in February, March, April, June, July, September, October, and November. RMSE of KOC and KOS was the same in February, March, and April, and it was also the same for KOL in September. KOL, KOC, KOS, KU (LQ), and spline interpolations remain optimum in rank 5 in the spatial accuracy ranking matrix. In this accuracy rank, RMSE from KOL was 2.90, 4.42, 22.45, 9.85, 1.45, and 1.13 in January, May, September, October, and November, besides KOL, KOC and KOS produced the same RMSE in May and September. In this rank, KU (LQ) and spline interpolations performed well in February, March, April, and December. KU (LQ), spline dominated most of the months in rank 6 in the spatial accuracy ranking matrix. RMSE from KU (LQ) was 4.08, 5.67, 12.01, and 2.09 in January, March, June, and November. It was 5.66, 25.58, 27.58, 9.92, 2.09, and 1.93 from spline interpolation in February, July, August, September, October, and December (Table 9). KOC, KOE, KOL, and KOS produced the same RMSE in May, ranking 6th in the spatial accuracy ranking matrix. Similarly, in the 7th rank, KU (LL), KU (LQ), KOG, and spline, in 8th rank KU (LL), and in 9th rank KOG, IDW, and KU (LL) remain the maximum-producing RMSE with testing pixels (Table 9).

Table 9 RMSE values of spatial accuracy ranking matrix using the pixel-based approach

Full size table

These nine ranked spatial accuracies of gridded rainfall can be classified into three categories: good, moderate, and poor. This classification was based on two parameters, i.e., raining and less rainy months. Raining months are monsoon months, e.g., June to September, and the rest are less rainy in Punjab. The spatial accuracy ranking matrix is classified as good if RMSE values from different interpolation techniques remain less than 5 in less rainy months, and for rainy months, it remains less than 20. Similarly, the spatial accuracy ranking matrix is classified as moderate if RMSE remains less than 6.5 in less rainy months, and rainy months remain less than 30. Class poor is defined as if RMSE remains less than 9 in less rainy months and greater than 30 in rainy months of the year. According to this classification, ranks 1 to 4 in the spatial accuracy ranking matrix fall under class good, 5 to 6 fall under class moderate, and the rest fall under poor. Statistical and geostatistical interpolations performed very well in each rank, e.g., produced RMSE < 10. However, in the rainy months of the year, the accuracy of interpolations (RMSE values) was found directly varying with ranks in the spatial accuracy ranking matrix (Fig. 5).

This accuracy ranking matrix helps the user of the RFM (CPC) interpolation technique to be feasible in space and time when less dense rainfall gauges are available.

Accuracy mapping of gridded rainfall

Using the first accurate rank in the spatial ranking matrix of RFM (CPC) from Table 8, accuracy mapping (in mm) was processed over Punjab. The accuracy mapping was downscaled to 0.25 by 0.25 grid so that good accuracy may be achieved in visualization. The accuracy mapping of the Rank-1 in the spatial accuracy ranking matrix of RFM (CPC) was mapped for each month of the year (Fig. 6). This mapping revealed good RFM (CPC) accuracy in the South Punjab region (Bahawalpur, Bahawalnagar, Rahim Yar Khan, Rajanpur, Muzaffargarh, DG Khan, Layyah, Multan, Lodhran, Khanewal, Vehari districts) throughout the year. The central Punjab region (Lahore, Sheikhupura, Kasur, Nankana, Sahiwal, Pakpattan, Okara, Faisalabad, Chiniot, T.T. Singh, Narowal, Gujrat, Gujranwala, and Sialkot) reveals well to moderate accuracy throughout the year. In the central Punjab region, the results were good in winter, spring, and autumn seasons; however, its accuracy went from average to poor in the monsoon season (May to September). In the Northern Punjab (Rawalpindi, Jhelum, Chakwal, Attock, Mianwali, Khushab, Bhakkar, Jhang, Sargodha, and Hafizabad districts), mapping of rank-1 in spatial accuracy ranking matrix remains the poorest throughout the year. The Northern Punjab region consists of a high terrain area in the province (Fig. 1); the Central Punjab region has moderate to flat terrain, and the South Punjab region mainly consists of flat terrain. Hence, Rank-1 in the spatial accuracy ranking matrix enormously varied with the districts’ elevations and the monsoon period (Table 10).

Table 10 District-wise RMSE (in millimeters) for each month by using the Rank-1 in the spatial accuracy ranking matrix

Full size table

Conclusions

The following are the outcomes of the research work:

•Tmax (CPC) and Tmin (CPC) followed the same data quality as Tmax (PMD) and Tmin (PMD), and the RMSE error remains < 0.5 °C in nine districts of Punjab. WS (CPC) also followed the same pattern of accuracy with ground observed wind speed; the results reveal its accuracy is very high (RMSE < 1 m/s), and RH (CDC) was also very accurate and had RMSE < 3%.

•The performance of daily gridded precipitation evaluated in different spatial domains, i.e., point-to-point and polygon-to-point, reveals that its district average precipitation produced high accuracy (RMSE < 3 mm) over sparse rainfall gauges.

•The spatial accuracy of RFM (CPC) using statistical and geostatistical interpolations techniques was predicted for the remaining 27 districts of Punjab. The model was optimized based on the least RMSE, and a spatial accuracy ranking matrix (of order 12 by 9) was produced. This matrix provides 9 different accuracy ranks of statistical and geostatistical interpolations in space and time. IDW and KOE remain the best in most elements of Rank-1 in the spatial accuracy ranking matrix throughout the year.

•The nine accuracy ranks in the matrix were classified as good, moderate, and poor. This classification of nine accuracy ranks in the spatial accuracy ranking matrix was based on two parameters, i.e., rainy and less rainy months, with certain thresholds of RMSE. Ranks 1 to 4 in the spatial accuracy ranking matrix were declared good, ranks 5 to 6 fall under moderate, and the rest fall under poor classification.

•The accuracy mapping of RFM (CPC) revealed that this dataset is perfect and accurate in South Punjab (flat terrain region). It is also good in central Punjab (flat-to-terrain region) during winter, spring, and autumn, while it has moderate accuracy in the monsoon season. Cross-validation concludes the abysmal performance of RFM (CPC) in the northern region of Punjab (Highly terrain region) in each month of the year. RFM (CPC) accuracy is directly related to terrain and the monsoon period.

The spatial accuracy ranking matrix was developed on monthly long-term average gridded precipitation using ground-based long-term average rainfall data; however, investigations are suggested on a daily or hourly dataset to understand more precision and accuracy of precipitation.

References

Adnan S, Ullah K, Ahmed R (2020) Variability in meteorological parameters and their impact on evapotranspiration in a humid zone of Pakistan. Meteorol Appl 27:e1859
Article Google Scholar
Baudouin J-P, Herzog M, Petrie CA (2020) Cross-validating precipitation datasets in the Indus River basin. Hydrol Earth Syst Sci 24:427–450
Article Google Scholar
Beck HE, Pan M, Roy T et al (2019) Daily evaluation of 26 precipitation datasets using Stage-IV gauge-radar data for the CONUS. Hydrol Earth Syst Sci 23:207–224
Article Google Scholar
Bell S, Cornford D, Bastin L (2015) How good are citizen weather stations? Addressing a biased opinion. Weather 70:75–84. https://doi.org/10.1002/wea.2316
Article Google Scholar
Belo‐Pereira M, Dutra E, Viterbo P (2011) Evaluation of global precipitation data sets over the Iberian Peninsula. J Geophys Res Atmos 116. https://doi.org/10.1029/2010JD015481
Brown JD (1997) Statistics corner questions and answers about language testing statistics: skewness and kurtosis. Shiken JALT Test Eval SIG Newsl 1:20–23
Bui HT, Ishidaira H, Shaowei N (2019) Evaluation of the use of global satellite–gauge and satellite-only precipitation products in stream flow simulations. Appl Water Sci 9:1–15
Article Google Scholar
Dinku T, Connor SJ, Ceccato P, Ropelewski CF (2008) Comparison of global gridded precipitation products over a mountainous region of Africa. Int J Climatol A J R Meteorol Soc 28:1627–1638
Article Google Scholar
Doane DP, Seward LE (2011) Measuring skewness: A forgotten statistic? J Stat Educ 19. https://doi.org/10.1080/10691898.2011.11889611
Early R, Keith SA (2019) Geographically variable biotic interactions and implications for species ranges. Glob Ecol Biogeogr 28:42–53
Article Google Scholar
Eum H, Dibike Y, Prowse T, Bonsal B (2014) Inter-comparison of high-resolution gridded climate data sets and their implication on hydrological model simulation over the Athabasca Watershed, Canada. Hydrol Process 28:4250–4271
Article Google Scholar
Fischer EM, Knutti R (2014) Detection of spatially aggregated changes in temperature and precipitation extremes. Geophys Res Lett 41:547–554
Article Google Scholar
Ghosh S, Das D, Kao S-C, Ganguly AR (2012) Lack of uniform trends but increasing spatial variability in observed Indian rainfall extremes. Nat Clim Chang 2:86–91
Article Google Scholar
Gundogdu KS, Guney I (2007) Spatial analyses of groundwater levels using universal Kriging. J Earth Syst Sci 116:49–55. https://doi.org/10.1007/s12040-007-0006-6
Article Google Scholar
Gyalistras D (2003) Development and validation of a high-resolution monthly gridded temperature and precipitation data set for Switzerland (1951–2000). Clim Res 25:55–83
Article Google Scholar
Haylock MR, Hofstra N, Klein Tank AMG, et al (2008) A European daily high‐resolution gridded data set of surface temperature and precipitation for 1950–2006. J Geophys Res Atmos 113. https://doi.org/10.1029/2008JD010201
Herrera S, Gutiérrez JM, Ancell R et al (2012) Development and analysis of a 50-year high-resolution daily gridded precipitation dataset over Spain (Spain02). Int J Climatol 32:74–85
Article Google Scholar
Karori MA, Zhang P (2008) Downscaling NCC CGCM output for seasonal precipitation prediction over Islamabad–Pakistan. Pakistan J Meteorol 4:59–72
Khan SI, Hong Y, Gourley JJ et al (2014) Evaluation of three high-resolution satellite precipitation estimates: potential for monsoon monitoring over Pakistan. Adv Sp Res 54:670–684. https://doi.org/10.1016/j.asr.2014.04.017
Article Google Scholar
Khan N, Shahid S, Ahmed K et al (2018) Performance assessment of general circulation model in simulating daily precipitation and temperature using multiple gridded datasets. Water 10:1793
Article Google Scholar
Kidd C, Becker A, Huffman GJ et al (2017) So, how much of the Earth’s surface is covered by rain gauges? Bull Am Meteorol Soc 98:69–78
Article Google Scholar
Kumar A, Dhakhwa S, Dikshit AK (2022) Comparative evaluation of fitness of interpolation techniques of ArcGIS using leave-one-out scheme for air quality mapping. J Geovisualization Spat Anal 6:4–5
Google Scholar
Kyselý J, Plavcová E (2010) A critical remark on the applicability of E‐OBS European gridded temperature data set for validating control climate simulations. J Geophys Res Atmos 115. https://doi.org/10.1029/2010JD014123
Ma L, Zhang T, Li Q et al (2008) Evaluation of ERA‐40, NCEP‐1, and NCEP‐2 reanalysis air temperatures with ground‐based measurements in China. J Geophys Res Atmos 113. https://doi.org/10.1029/2007JD009549
McBratney AB, Webster R (1986) Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates. J Soil Sci 37:617–639. https://doi.org/10.1111/j.1365-2389.1986.tb00392.x
Article Google Scholar
Nashwan MS, Shahid S, Chung E-S et al (2018) Development of climate-based index for hydrologic hazard susceptibility. Sustainability 10:2182
Article Google Scholar
Nashwan MS, Shahid S, Abd Rahim N (2019) Unidirectional trends in annual and seasonal climate and extremes in Egypt. Theor Appl Climatol 136:457–473. https://doi.org/10.1007/s00704-018-2498-1
Article Google Scholar
Nashwan MS, Shahid S, Wang X (2019) Uncertainty in estimated trends using gridded rainfall data: a case study of Bangladesh. Water 11:349
Article Google Scholar
Oliver MA, Webster R (2007) International journal of geographical information systems Kriging : a method of interpolation for geographical information systems. Geographical 37–41. https://doi.org/10.1080/02693799008941549
Ougahi JH, Mahmood SA (2022) Evaluation of satellite-based and reanalysis precipitation datasets by hydrologic simulation in the Chenab river basin. J Water Clim Chang 13:1563–1582
Article Google Scholar
Philip GM, Watson DF (1982) A precise method for determining contoured surfaces. APPEA J 22:205. https://doi.org/10.1071/aj81016
Article Google Scholar
Prakash S, Gairola RM, Mitra AK (2015) Comparison of large-scale global land precipitation from multisatellite and reanalysis products with gauge-based GPCC data sets. Theor Appl Climatol 121:303–317
Article Google Scholar
Prakash S, Mitra AK, Momin IM et al (2015) Seasonal intercomparison of observational rainfall datasets over India during the southwest monsoon season. Int J Climatol 35:2326–2338
Article Google Scholar
Prein AF, Gobiet A (2017) Impacts of uncertainties in European gridded precipitation observations on regional climate analysis. Int J Climatol 37:305–327
Article Google Scholar
Saleem U, Akram MS, Ullah MF, Rehman F (2018) Accurate imputation for relative humidity over Pakistan gathered from AQUA Satellite. Open J Geol 8:987
Article Google Scholar
Schiemann R, Liniger MA, Frei C (2010) Reduced space optimal interpolation of daily rain gauge precipitation in Switzerland. J Geophys Res Atmos 115. https://doi.org/10.1029/2009JD013047
Shekhar S, Xiong H (2008) Esri. Springer International Publishing, Boston
Sun Q, Miao C, Duan Q et al (2018) A review of global precipitation data sets: data sources, estimation, and intercomparisons. Rev Geophys 56:79–107. https://doi.org/10.1002/2017RG000574
Sylla MB, Giorgi F, Coppola E, Mariotti L (2013) Uncertainties in daily rainfall over Africa: assessment of gridded observation products and evaluation of a regional climate model simulation. Int J Climatol 33:1805–1817
Article Google Scholar
Tapiador FJ, Turk FJ, Petersen W et al (2012) Global precipitation measurement: Methods, datasets and applications. Atmos Res 104:70–97
Article Google Scholar
Trenberth KE, Dai A, Rasmussen RM, Parsons DB (2003) The changing character of precipitation. Bull Am Meteorol Soc 84:1205–1218
Article Google Scholar
Ullah W, Wang G, Ali G et al (2019) Comparing multiple precipitation products against in-situ observations over different climate regions of Pakistan. Remote Sens 11:628
Article Google Scholar
Usman M, Ndehedehe CE, Ahmad B et al (2022) Modeling streamflow using multiple precipitation products in a topographically complex catchment. Model Earth Syst Environ 8:1875–1885
Article Google Scholar
Wong JS, Razavi S, Bonsal BR et al (2017) Inter-comparison of daily precipitation products for large-scale hydro-climatic applications over Canada. Hydrol Earth Syst Sci 21:2163–2185. https://doi.org/10.5194/hess-21-2163-2017
Yatagai A, Arakawa O, Kamiguchi K et al (2009) A 44-year daily gridded precipitation dataset for Asia based on a dense network of rain gauges. Sola 5:137–140
Article Google Scholar

Download references

Acknowledgements

Pakistan Meteorological Department Government of Pakistan, the Board of Revenue Government of Punjab, and the National Oceanic and Atmospheric Administration, Physical Science Laboratory are acknowledged for providing datasets for this research.

Author information

Authors and Affiliations

Institute of Geology, University of the Punjab, P.O. Box No. 54590, Lahore, Pakistan
Muhammad Usman Saleem & Mian Sohail Akram

Authors

Muhammad Usman Saleem
View author publications
You can also search for this author in PubMed Google Scholar
Mian Sohail Akram
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Muhammad Usman Saleem.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Responsible Editor: Zhihua Zhang

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Saleem, M.U., Akram, M.S. Performance assessment of gridded climatic data and modeling spatial accuracy ranking matrix for gridded precipitation using a new pixel-based approach: a district-wise case study of Punjab. Arab J Geosci 16, 347 (2023). https://doi.org/10.1007/s12517-023-11442-w

Download citation

Received: 24 September 2022
Accepted: 22 April 2023
Published: 03 May 2023
DOI: https://doi.org/10.1007/s12517-023-11442-w

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Performance assessment of gridded climatic data and modeling spatial accuracy ranking matrix for gridded precipitation using a new pixel-based approach: a district-wise case study of Punjab

Abstract

Similar content being viewed by others

Development of high resolution daily gridded precipitation and temperature dataset for potohar plateau of indus basin

Evaluation of gridded precipitation products in the selected sub-basins of Lower Mekong River Basin

Monthly gridded precipitation databases performance evaluation in North Patagonia, Argentina

Introduction

Study area