1 Introduction

Accurate rainfall estimation is crucial for many applications, including water resources management, flood monitoring, drought analysis, rainfall climatology, forecasting, and agricultural planning (Onarun et al. 2023; Siamaki 2024). Hydrological projects in arid regions are nevertheless difficult to execute because rainfall there is sporadic and unpredictable, with irregular temporal and spatial patterns (Chitrakar et al. 2023). Extreme weather events can cause large fluctuations in precipitation in water-scarce arid countries. These events often manifest as prolonged droughts followed by sudden, destructive flash floods, and they threaten coastal regions, residential areas, agriculture, and the delicate natural ecosystems of arid regions (Gunawardhan and Al-Rawas 2016; Mahmoudi et al. 2023; Sathish et al. 2022). Recent climate change is recognized as a significant challenge for both water supply systems and flood risk analysis (Yamoat et al. 2023; Mohamadi and Bagheri 2022). For design purposes, rainfall is commonly quantified using intensity-duration-frequency (IDF) curves (Chow et al. 1988). IDF curves are crucial tools for ensuring the safety and cost-effectiveness of hydrologic and engineering projects designed for specified return periods (Alzahrani et al. 2023; Yamoat et al. 2022).

The development of IDF curves relies on statistical analysis of historical annual maximum rainfall data. Trends in annual rainfall maxima alter the shape and characteristics of the IDF curves (Cook et al. 2020). Assuming a stationary climate, in which climatic conditions remain relatively constant over time, can therefore lead to significant underestimation of the design intensities read from IDF curves. This underestimation, in turn, elevates the risk of flooding and of failure of urban drainage systems (Cheng and AghaKouchak 2014). Similarly, one study introduced a method for constructing IDF curves at ungauged sites while accounting for the associated uncertainties. The approach involved bias correction of satellite rainfall data, and the resulting curves were compared with the observed IDF curve (Noor et al. 2022). Given the climate change-induced alterations in extreme rainfall patterns, current design practice needs revision worldwide. Understanding shifts in extreme rainfall intensity is important for climate change adaptation and for building resilience against its impacts (Willems 2013). Another study examined rainfall frequency and intensity in the Itacaiúnas River watershed, eastern Brazilian Amazon, using daily and hourly data from 1988 to 2018. The Gumbel and gamma distributions best fitted the annual maximum daily rainfall, while the GEV and Weibull 3 distributions were optimal for maximum annual rainfall, with estimates of up to 115 mm/h for a 100-year return period (de Bodas 2023).

The IDF relationship is a fundamental tool in flood prevention and other water resource management activities. Its primary purpose is to determine the maximum rainfall intensity that can be expected for a given duration and return period. Typically, the relationship is represented as an IDF curve, where duration is plotted on the X-axis, rainfall intensity on the Y-axis, and each return period is drawn as a separate curve (Yamoata et al. 2023). The foundation of the IDF relationship can be traced back to Bernard (1932), followed by IDF formulas tailored to specific regions in the United States (Bell 1969).

Numerous researchers have developed IDF curves for different areas. Rainfall IDF curves have been established using the Gumbel and Log Pearson type III distributions for Abha, Al-Baha, Bisha, Makkah Al Mukarramah, Gizan, and Khamis Mushait in Saudi Arabia (AL-Areeq et al. 2021). Another study examined IDF curves and illustrated the relationship between average rainfall intensity and event duration for different return periods. That work describes four case studies located at different latitudes across Europe: Helsingborg (Sweden), Frauenwald (Germany), and Pavia and Erice (Italy). The derived IDF structures fit closely the quantile predictions of extreme rainfall data for specific durations (Creaco 2024). Further research constructed IDF curves and empirical formulas for the Weta Traditional area, situated in the southeastern portion of the Volta Region of Ghana, using data from rain gauges and remote sensing sources such as TRMM. The IDF curves and empirical equations serve as crucial hydrological tools, especially in the planning and design phases of water resources projects (Agakpe et al. 2024). IDF curves have also been constructed from 1-hour annual maximum rainfall series in urban basins, using two statistical distributions: the Generalized Extreme Value (GEV) distribution and the Gumbel distribution (Kourtis et al. 2023).

In this study, the Gumbel and log-Pearson type III distributions were chosen over other methods for several reasons. These distributions are well established and widely used in hydrological studies for their robustness and reliability in modeling extreme events. The Gumbel distribution is particularly effective for predicting the recurrence interval and magnitude of extreme rainfall events and floods from limited data, providing consistent estimates even with small sample sizes (Mirhosseini et al. 2016). The log-Pearson type III distribution is recognized for its flexibility in handling skewed data, which are often encountered in hydrological records (Haddad et al. 2011). While more advanced approaches such as the Generalized Extreme Value (GEV) distribution or Bayesian methods could offer additional benefits, they also introduce increased complexity and computational demands (Yan et al. 2017).

Limited research has been conducted on rainfall analysis and climate change projections in Pakistan. Notably, in a study conducted in Karachi, Pakistan, machine learning (ML) models were employed to predict rainfall amounts and, subsequently, to identify flood-prone areas, while for Lahore City, the Gumbel extreme value type 1 and log Pearson distributions were employed to derive IDF curves. Consequently, comprehensive IDF relationships remain unavailable for many regions within Pakistan (Ahmed and Ali 2016; Rasool et al. 2023).

The absence of well-defined IDF relationships poses challenges for the planning and execution of water resource systems, including projects such as recharge dams, flood protection structures, and storm-water collection networks. Insufficient data can lead to suboptimal designs. Moreover, well-constructed IDF curves are invaluable for quantifying rainfall rates and predicting potential flooding events in any given region. In an effort to bridge this knowledge gap, our research aims to construct rainfall intensity-duration-frequency (IDF) curves for Pakistan. This is achieved through the analysis of rainfall data collected from multiple meteorological stations situated across various elevations and regions throughout the country. The Gumbel distribution is employed in this analysis. Additionally, empirical formulas are developed to estimate rainfall intensity for various durations and return periods.

The scientific novelty of the present research lies in its comprehensive analysis of rainfall intensity patterns across diverse elevations and regions in Pakistan, which were previously underexplored. By developing detailed IDF curves and empirical parameters with both the Gumbel and Log Pearson Type III distributions from long-term historical rainfall data (2001–2022), this study reveals significant geographical and elevation-based differences in rainfall intensities. This understanding provides region-specific insights for flood management and water resource planning.

The limitations of this study are the coarse spatial resolution of the data (0.5°×0.5°), which may not capture finer-scale rainfall variations within individual cities, and the limited regional representation. To address these limitations, future research should incorporate higher-resolution datasets and expand the study to include more regions across Pakistan. These steps will increase the accuracy and applicability of IDF curves for hydrological and water resource management.

2 Study area and data collection

Pakistan is situated in South Asia. It is bordered by China, India, and Afghanistan to the north, east, and west, respectively, and has a coastline along the Arabian Sea. The study area encompasses four major cities of Pakistan: Lahore, Quetta, Karachi, and Peshawar, shown as red circles in Fig. 1. The selected cities are the capitals of their respective provinces, Lahore (Punjab), Karachi (Sindh), Quetta (Balochistan), and Peshawar (Khyber Pakhtunkhwa), and are among the most populous and economically significant in Pakistan. Each city represents a different elevation and geographical region of Pakistan, providing broad spatial coverage. Moreover, the history of extreme rainfall events in these cities and their importance in terms of population and infrastructure were crucial factors in their selection.

2.1 Lahore

Lahore, located at 31.5497° N, 74.3436° E, is one of the most culturally and historically significant cities in Pakistan. It experiences a monsoon-influenced climate, with the majority of its annual rainfall occurring during the monsoon season from July to September.

2.2 Karachi

Karachi is Pakistan’s largest city and has coordinates 24.8607° N, 67.0011° E. Its port is important for trade, and the city is a melting pot of cultures due to its historical role as a trading center. It has a relatively low average annual rainfall and its main source of rainfall is the monsoon season, which occurs from July to September.

2.3 Quetta

Quetta is located at 30.1798° N, 66.9750° E, strategically close to the borders of Afghanistan and Iran. It has a semi-arid climate with low annual rainfall, and snowfall is common in winter. Moreover, Quetta’s unique climate and landscape attract researchers interested in studying arid environments.

2.4 Peshawar

Peshawar, located at 34.0150° N, 71.5249° E, holds historical significance as a gateway to the Indian subcontinent. Its history, along with its role in geopolitics and regional studies, makes Peshawar a vital academic and research center. It receives most of its rainfall during the monsoon season, primarily in July and August.

Fig. 1 Study area of Pakistan (studied cities: Lahore, Karachi, Quetta, and Peshawar). The X-axis denotes longitude and the Y-axis denotes latitude of the respective cities

Table 1 summarizes the elevation and average annual precipitation of the four cities. Lahore, at an elevation of 209.4 m, receives an average annual precipitation of 610.29 mm. Karachi, situated at 53.0 m, receives 230.90 mm annually. Quetta, at a significantly higher elevation of 1803.8 m, receives 222.21 mm. Peshawar, at 986.5 m, registers the largest average annual precipitation, 672.48 mm. Figure 2 presents the spatial distribution of average annual rainfall in the four cities; the size of each circle represents the magnitude of average annual rainfall, which peaks at 672.48 mm.

Cities were selected only if at least 22 years of data were available. The cities chosen for the trend analysis are shown in Fig. 1 and listed in Table 1. Rainfall data for 2001–2022 were obtained from https://power.larc.nasa.gov/data-access-viewer/. While many satellite-based products are available globally, this study used MERRA-2 (the Modern-Era Retrospective analysis for Research and Applications, Version 2). MERRA-2 is designed primarily for historical climate analysis across a range of weather and climate timescales. The dataset is built on the NASA-developed atmospheric model known as the Goddard Earth Observing System Model, Version 5 (GEOS-5). It offers a spatial resolution of 0.5°×0.5° and a temporal resolution of 24 h.

3 Methodology and data analysis

The process of estimating Intensity-Duration-Frequency (IDF) curves encompassed several sequential stages. The initial phase involved the examination and deconstruction of rainfall data, focusing on shorter time spans. The data gathered from monitoring stations was initially categorized based on criteria such as years, rainfall magnitude, and duration. The highest recorded rainfall depth was obtained for each observed year across various durations. Statistical evaluations, including calculations for the mean and standard deviation of these maximum rainfall depths, were also conducted for different durations.
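The annual-maximum extraction described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes an hourly rainfall series per year held in a plain dictionary, and the function name `annual_max_depths` and the data layout are hypothetical rather than taken from the study's own workflow.

```python
def annual_max_depths(hourly, durations=(1, 6, 12, 24)):
    """Return {duration_h: {year: max depth (mm)}}.

    hourly: dict mapping year -> list of hourly rainfall depths (mm),
            in chronological order (assumed layout, for illustration).
    For each year and duration, a sliding window of that length is moved
    over the series and the largest accumulated depth is kept.
    """
    result = {d: {} for d in durations}
    for year, series in hourly.items():
        for d in durations:
            if len(series) < d:
                continue  # not enough data for this window length
            window = sum(series[:d])
            best = window
            for i in range(d, len(series)):
                # slide the window forward by one hour
                window += series[i] - series[i - d]
                best = max(best, window)
            result[d][year] = best
    return result


if __name__ == "__main__":
    # Tiny synthetic record, not real data
    demo = {2001: [0, 2, 5, 1, 0, 0, 3, 0], 2002: [1, 1, 1, 1, 1, 1, 1, 1]}
    print(annual_max_depths(demo, durations=(1, 6)))
```

The mean and standard deviation of each per-duration series of annual maxima then feed the frequency analysis.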

Table 1 Characteristics of four cities used in study
Fig. 2 Average annual rainfall distribution across the study site in Pakistan. The X-axis denotes longitude and the Y-axis denotes latitude of the respective cities

The creation of Intensity-Duration-Frequency (IDF) curves involves the process of fitting a probability distribution function to the extreme rainfall data related to specific time intervals. Utilizing the established relationship from this fitting, the rainfall intensity for particular durations and return periods is determined for the given recorded years. The Gumbel distribution is employed to conduct frequency analysis on the annual maximum rainfall data, facilitating the computation of rainfall depths for varying return periods. The study focuses on design durations such as 1 h, 6 h, 12 h, and 24 h along with return periods of 2, 5, 10, and 25 years. Moreover, the parameters for the Intensity-Duration-Frequency (IDF) relationship suggested by Bernard (1932) are derived using regression techniques.

3.1 Gumbel distribution

The Type I Extreme Value distribution, also known as the Gumbel distribution, was introduced by the German mathematician Emil Gumbel (Gumbel 1958). This distribution is widely used to model extreme events in water resources management. According to the Gumbel method, the rainfall corresponding to a return period can be computed for any chosen duration. The precipitation depth P_T (in millimeters) for a given duration and return period T (in years) is determined using the following equations:

$$P_T = P_{avg} + K S$$
(1)

The Gumbel frequency factor denoted as K (Chow 1953) is determined using Eq. (2).

$$K=-\frac{\sqrt{6}}{\pi}\left[0.5772+\ln\left(\ln\left(\frac{T}{T-1}\right)\right)\right]$$
(2)

P_avg and S represent the mean and standard deviation of the annual maximum rainfall for a given duration, computed using Eq. (3) and Eq. (4), respectively. Here, P_i is the individual extreme rainfall value, and n is the number of events or years under observation.

$$P_{avg}=\frac{1}{n}\sum_{i=1}^{n}P_i$$
(3)
$$S=\left[\frac{1}{n-1}\sum_{i=1}^{n}\left(P_i-P_{avg}\right)^{2}\right]^{1/2}$$
(4)

The rainfall intensity I_T (in mm/h) for return period T is obtained by dividing the precipitation depth P_T by the rainfall duration T_d (in hours), as in Eq. (5):

$$I_T=\frac{P_T}{T_d}$$
(5)
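Equations (1) to (5) can be chained as in the following minimal Python sketch. The function names and the sample depths are illustrative assumptions, not part of the study; the frequency factor uses the standard Gumbel form with the √6/π coefficient.

```python
import math
import statistics


def gumbel_frequency_factor(T):
    """Gumbel frequency factor K for return period T (years, T > 1), Eq. (2)."""
    return -(math.sqrt(6) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1))))


def gumbel_intensity(annual_maxima_mm, duration_h, T):
    """Rainfall intensity I_T (mm/h) for one duration and return period.

    annual_maxima_mm: annual maximum depths (mm) for this duration.
    """
    p_avg = statistics.mean(annual_maxima_mm)   # Eq. (3)
    s = statistics.stdev(annual_maxima_mm)      # Eq. (4), n - 1 denominator
    p_t = p_avg + gumbel_frequency_factor(T) * s  # Eq. (1)
    return p_t / duration_h                      # Eq. (5)


if __name__ == "__main__":
    # Hypothetical 6-h annual maxima (mm), for illustration only
    depths = [22.0, 35.5, 18.2, 41.0, 27.3, 30.8]
    for T in (2, 5, 10, 25):
        print(T, round(gumbel_intensity(depths, 6, T), 2))
```

For a fixed duration, the computed intensity grows with T because K increases monotonically with the return period.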

3.2 Log Pearson type III distribution

The Log Pearson type III model is employed to compute the rainfall intensity for various rainfall durations and return periods for each monitored city. The simplified representation of this distribution is given below:

$$P_i^{*}=\log\left(P_i\right)$$
(6)
$$P_T^{*}=P_{avg}^{*}+K_T S^{*}$$
(7)
$$P_{avg}^{*}=\frac{1}{n}\sum_{i=1}^{n}P_i^{*}$$
(8)
$$S^{*}=\left[\frac{1}{n-1}\sum_{i=1}^{n}\left(P_i^{*}-P_{avg}^{*}\right)^{2}\right]^{1/2}$$
(9)
$$C_s=\frac{n\sum_{i=1}^{n}\left(P_i^{*}-P_{avg}^{*}\right)^{3}}{(n-1)(n-2)\left(S^{*}\right)^{3}}$$
(10)

Here, P*_T, P*_avg, and S* have the same definitions as before, but they are computed from the logarithmically transformed P_i values of Eq. (6). K_T is the Pearson frequency factor, which depends on the skewness coefficient (C_s) and the return period (T). C_s is computed using Eq. (10), while K_T is obtained from hydrological references such as the tables provided by Chow et al. (1988). The antilog of the solution of Eq. (7) yields the estimated extreme value for the specified return period.
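The Log Pearson type III steps can be sketched as follows. Note one deliberate substitution: instead of looking K_T up in published tables, as the text describes, this sketch approximates it with the Wilson-Hilferty transformation, which is an assumption made only to keep the example self-contained. Function names and sample values are hypothetical.

```python
import math
import statistics


def lp3_frequency_factor(T, cs):
    """Approximate Pearson frequency factor K_T (Wilson-Hilferty).

    This replaces the table lookup mentioned in the text; it is an
    approximation and is most reliable for moderate skewness.
    """
    z = statistics.NormalDist().inv_cdf(1 - 1 / T)  # standard normal quantile
    if abs(cs) < 1e-9:
        return z  # zero skew reduces to the normal quantile
    k = cs / 6
    return (2 / cs) * ((1 + k * z - k * k) ** 3 - 1)


def lp3_depth(annual_maxima_mm, T):
    """Rainfall depth (mm) for return period T; needs n >= 3 positive values
    that are not all equal (otherwise S* is zero)."""
    logs = [math.log10(p) for p in annual_maxima_mm]            # Eq. (6)
    n = len(logs)
    mean = statistics.mean(logs)                                 # Eq. (8)
    s = statistics.stdev(logs)                                   # Eq. (9)
    cs = n * sum((x - mean) ** 3 for x in logs) / ((n - 1) * (n - 2) * s ** 3)  # Eq. (10)
    return 10 ** (mean + lp3_frequency_factor(T, cs) * s)        # antilog of Eq. (7)


if __name__ == "__main__":
    depths = [22.0, 35.5, 18.2, 41.0, 27.3, 30.8]  # illustrative values only
    for T in (2, 5, 10, 25):
        print(T, round(lp3_depth(depths, T), 2))
```

Dividing the returned depth by the duration, as in Eq. (5), gives the corresponding intensity.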

3.3 The IDF empirical formula

The relation between rainfall intensity (I), rainfall duration (d), and return period (TR) is delineated through the IDF empirical formula. A series of steps are undertaken to formulate an equation for computing rainfall intensity within a given rainfall duration and recurrence interval, primarily depending on the findings of IDF curves. The widely recognized Bernard equation (Bernard 1932) is chosen to establish the IDF relationship in the study. The subsequent steps and equations are utilized in the IDF relationship.

$$I=\frac{C\,T_R^{m}}{d^{e}}$$
(11)

Here, I represents the rainfall intensity (in millimeters per hour), d is the duration of rainfall (in minutes), and T_R is the return period (in years). The empirical parameters are denoted by C, m, and e. Taking logarithms, Eq. (11) can be written as follows.

$$\log I=\log\left(C\,T_R^{m}\right)-e\log\left(d\right)$$
(12)

Furthermore, for a specific value of T_R, the term C T_R^m is a constant, denoted K:

$$K=C\,T_R^{m}$$
(13)
$$\log I=\log\left(K\right)-e\log\left(d\right)$$
(14)

Plotting the logarithm of rainfall intensity (log I) against the logarithm of duration (log d) for a given return period gives the linear relationship of Eq. (14). From this linear relation, log K (the intercept) and e (the negative of the slope) are obtained from each plot, one per return period. The average of the e values is taken as the empirical parameter e. Similarly, the parameters C and m are determined by logarithmically transforming Eq. (13), as shown in Eq. (15):

$$\log K=\log C+m\log T_R$$
(15)

Fitting a straight line to log K versus log T_R yields the slope (m) and the intercept (log C). Subsequently, the values of C, m, and e are inserted into Eq. (11) to obtain the complete IDF equation.
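The two-stage regression described in this section can be sketched in Python as below. The helper names `fit_line` and `fit_bernard`, the nested-dictionary input, and the synthetic parameter values are assumptions for illustration; ordinary least squares stands in for whatever regression tool the study actually used.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_bernard(intensities):
    """Estimate (C, m, e) of I = C * T_R**m / d**e.

    intensities: {T_R (years): {d (minutes): I (mm/h)}} (assumed layout).
    """
    log_T, log_K, e_values = [], [], []
    for T, by_duration in sorted(intensities.items()):
        ds = sorted(by_duration)
        # Stage 1, Eq. (14): log I = log K - e log d for each return period
        slope, intercept = fit_line(
            [math.log10(d) for d in ds],
            [math.log10(by_duration[d]) for d in ds],
        )
        e_values.append(-slope)
        log_K.append(intercept)
        log_T.append(math.log10(T))
    e = sum(e_values) / len(e_values)  # average e over return periods
    # Stage 2, Eq. (15): log K = log C + m log T_R
    m, log_C = fit_line(log_T, log_K)
    return 10 ** log_C, m, e


if __name__ == "__main__":
    # Synthetic intensities generated from known parameters (not study data)
    data = {T: {d: 100.0 * T ** 0.2 / d ** 0.7 for d in (60, 360, 720, 1440)}
            for T in (2, 5, 10, 25)}
    print(fit_bernard(data))
```

Running the fit on synthetic intensities generated from known parameters, as in the usage block, is a quick check that the two stages recover C, m, and e.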

3.4 Goodness of fit test

The least squares goodness of fit approach was additionally employed to assess the agreement between the observed and computed rainfall intensities for the chosen distribution. The goodness of fit is evaluated through the coefficient of determination (R²), computed using Eq. (16).

$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(X_i-Y_i\right)^{2}}{\sum_{i=1}^{n}\left(\bar{Y}-Y_i\right)^{2}}$$
(16)

In this equation, Xi and Yi represent the observed and estimated data at a specific instance “i,” respectively. Ȳ denotes the mean of the estimated data, and “n” signifies the total number of data points.
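A direct transcription of Eq. (16) into Python is sketched below. It follows the definition as stated here, with the mean taken over the estimated values; many references instead use the mean of the observed values in the denominator, so results may differ slightly from other R² implementations. The function name is hypothetical.

```python
def r_squared(observed, estimated):
    """Coefficient of determination as written in Eq. (16).

    observed:  X_i values; estimated: Y_i values (same length).
    The denominator uses the mean of the estimated values, as in the text.
    """
    y_bar = sum(estimated) / len(estimated)
    ss_res = sum((x - y) ** 2 for x, y in zip(observed, estimated))
    ss_tot = sum((y_bar - y) ** 2 for y in estimated)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative numbers only
    print(r_squared([1.0, 2.0, 4.0], [1.0, 2.0, 3.0]))
```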

4 Results and discussions

4.1 Results

4.1.1 Sub-daily extreme precipitation

Table 2 displays the annual maximum precipitation values observed in Pakistan’s cities for various high temporal resolutions ranging from 1 h to 24 h. The maximum value of rainfall for 1 h was observed in Quetta on August 31, 2021 at 6 P.M. Interestingly, the second highest peak value of rainfall was also registered in Quetta on the same day at 8 P.M.

Table 2 Maximum precipitation for different time resolutions from 1 to 24 h
Fig. 3 Box and whisker plot of annual rainfall across the study site in Pakistan during the study period. The X-axis denotes the cities and the Y-axis denotes average annual rainfall (mm)

The cumulative occurrence frequency of accumulated precipitation depth in annual maximum 60-minute rainfall events was categorized into bins based on thresholds of 6, 12, 18, 24, and 30 mm, as presented in Table 3. The data show that the intensity of extreme rainfall is comparatively lower in the less-elevated cities (Lahore, Karachi) and notably higher in the elevated cities (Quetta, Peshawar). Cumulative frequency occurrence is higher with higher precipitation values for all cities.

Table 3 Cumulative frequency occurrence of 60-minute rainfall events with precipitation values above 6, 12, 18, 24, and 30 mm

Figure 3 displays the box and whisker plot of the annual rainfall distribution across the examined cities throughout the monitoring period. The data are positively skewed in all cities, with the average rainfall exceeding the median rainfall.

Fig. 4 Average annual rainfall. The X-axis denotes years and the Y-axis denotes annual rainfall (mm) of the respective cities

Figure 4 illustrates the fluctuation of average annual rainfall (total rainfall) across different cities of Pakistan spanning the years 2001 to 2022. The peak average, amounting to 63.3 mm, was documented in Peshawar during the year 2020. The minimum average, measuring 1.02 mm, was noted during the year 2014 in Karachi.

4.2 Rainfall intensity, its duration, and the probability of occurrence over a defined return period

The Gumbel and Log Pearson Type III distributions are widely employed in arid regions for IDF calculations. Initially, both distributions were applied to Lahore’s rainfall records to determine the optimal choice. Rainfall intensities for all cities were estimated across various rainfall durations (1, 6, 12, and 24 h) and return periods (2, 5, 10, and 25 years). Figure 5 displays the modeled intensities for both distributions, while Table 4 summarizes the best-fit results using coefficients of determination (R²) from Eq. (16) at different return periods.

Table 4 Best fit distribution summary at various return periods for Lahore city

Both models exhibited strong correlations, with R² values higher than 0.9 across all return periods. Given the similarity in best-fit results between the two distributions, the Gumbel distribution was selected and applied to all cities throughout this study.

Fig. 5 Modeled rainfall intensity of Lahore using the Gumbel (left panel) and Log Pearson type III (right panel) distributions. The X-axis denotes duration (h) and the Y-axis denotes rainfall intensity (mm/h) at various return periods

Table 5 and Fig. 6 present the rainfall intensities computed with the Gumbel distribution for all cities at return periods of 2, 5, 10, and 25 years. The estimated rainfall intensities increase with return period and decrease with rainfall duration in all cities. Furthermore, Peshawar, at higher elevation, consistently registers higher rainfall intensities for the various return periods than the lower-elevation cities (Lahore and Karachi). The results suggest that the mountainous region exhibits the highest rainfall intensities, followed by hilly areas, while plain regions experience the lowest rainfall intensities for the various return periods. Quetta’s lower rainfall intensity despite its higher elevation can be attributed primarily to its location in a rain shadow area, its distance from moisture sources such as the Arabian Sea, the effects of elevation on temperature and moisture content, and the local weather patterns that prevail in the region. Thus, mountainous areas tend to have higher estimated rainfall intensities for different return periods than arid or inland regions (Lahore) and coastal areas (Karachi).

Fig. 6 Rainfall intensity at different durations and return periods across the study site in Pakistan. The X-axis denotes cities, the Y-axis denotes rainfall intensity (mm/h), and the Z-axis denotes the return periods at different durations

The estimated empirical parameters (C, m, and e) of the IDF relationship in Eq. (11) are shown in Table 6, together with the corresponding IDF equations and R² values. The IDF curves are parallel to one another, as seen in Fig. 7. Across the studied cities, C ranged from 1.9 × 10⁶ to 1.1 × 10⁵, m from 0.55 to 0.27, and e from 1.04 to 1.07 (see also Fig. 8). The high R² values, between 0.994 and 0.999 (Table 6), indicate a strong correlation between observed and estimated rainfall intensities. Consequently, the IDF curves generated for these cities can be used in rainfall estimation and in the planning of water-related projects in Pakistan.

Table 5 Rainfall intensity at various durations and return periods using the Gumbel distribution

The spatial distribution maps of the IDF parameters C, m, and e are given in Fig. 9(a), (b), and (c), respectively. Figure 9(a) shows the contour map of parameter C, which represents the scale of rainfall intensity across the four cities. The widely spaced contour lines at Lahore and Karachi correspond to lower values of C and hence lower rainfall intensities; Lahore and Karachi therefore experience less intense rainfall events. In contrast, the closely spaced contour lines at Peshawar and Quetta correspond to the highest values of C, indicating higher rainfall intensities and more intense rainfall events. However, Quetta’s location in a valley surrounded by mountains can create a rain shadow effect, resulting in lower rainfall intensity. Figure 9(b) and (c) show the m and e values with their corresponding contour patterns. In Fig. 9(b), the contour lines around Peshawar are wider than those around Lahore and Karachi but similar to those around Quetta. Since m is the return-period exponent in Eq. (11), this indicates that rainfall intensity in Peshawar and Quetta varies gradually with return period, which is reflected in the smoother contours. In Fig. 9(c), the contour lines around Lahore and Karachi are circular, indicating a more localized and consistent pattern of the e parameter across the area. Since e is the duration exponent in Eq. (11), circular contours suggest that the relationship between rainfall intensity and duration is more stable and uniform around Lahore and Karachi.

The creation of IDF curves and their associated parameters for various cities across Pakistan has not been thoroughly documented in the existing literature. This gap underlines the significance of created contour maps, which offer a valuable resource for estimating empirical parameters, developing IDF formulas and curves, and evaluating rainfall intensities for different durations and return periods.

Table 6 IDF parameter and equation with coefficient of determination for studied cities
Fig. 7 IDF curves for different return periods across the study site in Pakistan from observed data. The X-axis denotes duration (h) and the Y-axis denotes rainfall intensity (mm/h) at various return periods

4.3 Discussion

The analysis of rainfall intensities is vital for maintaining, optimizing, and enhancing the resilience of hydrological systems against storms and floods. This is especially significant in arid and semi-arid regions like Pakistan, where long-term rainfall data with short aggregation periods are often limited. To meet this need, the present study develops IDF curves from rainfall data for four cities representing various elevations and geographical areas in Pakistan. By applying the Gumbel distribution to observed rainfall data at different durations (1 h, 6 h, 12 h, and 24 h), the study determines rainfall intensities for specific return periods.

The cumulative frequency occurrence increases with higher rainfall values in all cities. This shows that rainfall is markedly greater in mountainous regions than in plain and coastal areas. The box and whisker plot of the annual rainfall distribution across the studied cities during the monitoring period shows positive skewness in all locations, with average rainfall exceeding the median rainfall. Figure 4 shows the peak average rainfall, 63.3 mm, in Peshawar in 2020 and the minimum average rainfall, 1.02 mm, in Karachi in 2014.

Initially, the Gumbel and Log Pearson Type III distributions were applied to Lahore’s rainfall records. Both models showed strong correlations, with R² values higher than 0.9 across all return periods. Given the similarity in best-fit results between the two distributions, the Gumbel distribution was selected and applied to all cities throughout this study. A study of IDF curves in the Ghaap plateau (Tfwala et al. 2017) revealed that precipitation intensity decreases with increasing rainfall duration and increases with longer return periods, a finding supported by the current study. The objective of the present study, to develop IDF curves for four cities in Pakistan using the Gumbel distribution, aligns closely with that of Ashok et al. (2023), who developed IDF curves for the Ananthapuramu district in southern India, also with the Gumbel distribution. Both studies derived IDF parameters and correlation coefficients for multiple return periods, indicating the effectiveness of the Gumbel distribution in modeling extreme rainfall events. Moreover, Noor et al. (2021) evaluated IDF curves using satellite-based precipitation data in Malaysia and concluded that the Gumbel distribution provided the best fit for rainfall intensity across all durations. Basumatary et al. (2018) observed that rainfall intensity increases with longer return periods in the Barak River Basin, which aligns with the findings of the present study. This explains why larger structures such as bridges and dams are designed for higher return periods, while smaller structures, such as irrigation tanks and culverts, are planned for lower return periods (Agakpe et al. 2024). Furthermore, rainfall intensities tend to decrease as the rainfall duration increases for a given return period, underscoring the potentially catastrophic impact of short-duration, intense rainfall events on the environment.

This distribution reveals that rainfall in mountainous regions is considerably higher when contrasted with the plain and coastal regions of the country at different durations and return periods. Additionally, (Chitrakar et al. 2023) found that rainfall intensities are higher in mountainous regions compared to desert or coastal regions, a conclusion that reflects present findings in Pakistan, where mountainous areas exhibited higher rainfall intensities compared to plain and coastal regions. This consistency across studies highlights the geographical influence on rainfall patterns and the applicability of the Gumbel distribution for accurate IDF curve development.

The estimated empirical parameters (C, m, and e) of the IDF relationship are presented in Table 6. The results show a strong correlation between observed and estimated rainfall intensities, with high R² values ranging from 0.994 to 0.999. The spatial distribution of the IDF parameters C, m, and e is also mapped. These parameters depend on the location, shape, and scale of the area and are crucial for understanding the overall magnitude of rainfall intensity, frequency, and duration in different regions (AlHassoun et al. 2011; Takeleb et al. 2022). The contours illustrate the gradual variation of the parameters across the entire study area. Both the IDF parameter C and the rainfall intensity are lower in flat areas (Lahore and Karachi) than in higher-elevation areas. This result is similar to that of Chitrakar et al. (2023), in which higher values of the parameter were observed along the mountain range than along the flat range.
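
The estimation of C, m, and e can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: it assumes the common form of Bernard's equation, I = C·T^m / t^e (T is the return period in years, t the duration), and fits it by ordinary least squares on the log-transformed relation ln I = ln C + m ln T − e ln t. The function name `fit_bernard`, the "true" parameter values, the noise level, and the use of hours for t are all assumptions made for the example.

```python
import numpy as np


def fit_bernard(durations, return_periods, intensities):
    """Fit I = C * T**m / t**e via least squares on ln I = ln C + m ln T - e ln t.

    Returns (C, m, e, r_squared), with R^2 computed on the original scale.
    """
    t = np.asarray(durations, dtype=float)
    T = np.asarray(return_periods, dtype=float)
    I = np.asarray(intensities, dtype=float)
    # Columns: intercept (ln C), ln T (coefficient m), -ln t (coefficient e).
    design = np.column_stack([np.ones_like(t), np.log(T), -np.log(t)])
    coeffs, *_ = np.linalg.lstsq(design, np.log(I), rcond=None)
    ln_c, m, e = coeffs
    predicted = np.exp(design @ coeffs)
    ss_res = np.sum((I - predicted) ** 2)
    ss_tot = np.sum((I - I.mean()) ** 2)
    return float(np.exp(ln_c)), float(m), float(e), float(1.0 - ss_res / ss_tot)


# Durations and return periods considered in the study.
durations_h = np.array([1.0, 6.0, 12.0, 24.0])
return_periods_yr = np.array([2.0, 5.0, 10.0, 25.0])
t_grid, T_grid = np.meshgrid(durations_h, return_periods_yr)
t_flat, T_flat = t_grid.ravel(), T_grid.ravel()

# Synthetic intensities from hypothetical parameters (not values from Table 6),
# perturbed with about 2 % multiplicative noise.
true_c, true_m, true_e = 40.0, 0.3, 0.8
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.02, size=t_flat.size)
intensity = true_c * T_flat**true_m / t_flat**true_e * np.exp(noise)

c_hat, m_hat, e_hat, r2 = fit_bernard(t_flat, T_flat, intensity)
print(f"C = {c_hat:.2f}, m = {m_hat:.3f}, e = {e_hat:.3f}, R^2 = {r2:.4f}")
```

Fitting in log space keeps the problem linear in ln C, m, and e. A nonlinear least-squares fit on the original scale is an alternative that weights the high-intensity, short-duration points more heavily.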

Fig. 8 IDF parameters across the study sites in Pakistan. The x-axis denotes cities and the y-axis denotes the IDF parameters (C, m, and e)

The outcomes of this study have substantial implications for water resource management in Pakistan. The finding that rainfall intensity is higher in mountainous regions and lower in plain and coastal areas suggests that flood management strategies should be region-specific. Moreover, the study’s insights into the frequency of occurrence of extreme rainfall events can guide water harvesting practices and agricultural planning.

Fig. 9 Spatial distribution contour maps of the IDF parameters: (a) parameter C, (b) parameter m, and (c) parameter e. The x-axis denotes longitude and the y-axis denotes latitude of the respective cities

5 Limitations and future work

The study relies on data from four capital cities and is constrained by the temporal coverage of the dataset (2001–2022). Future work could expand the geographic coverage to include more regions across Pakistan, incorporate climate change projections to account for future shifts in rainfall patterns, and explore advanced statistical or machine learning techniques to enhance the accuracy of IDF curves. Additionally, only the Log Pearson Type III and Gumbel distributions were implemented here; it would be worthwhile to explore other advanced models. Integrating these curves with comprehensive hydrological models would offer a more holistic understanding of flood risks.

6 Conclusion

Intensity-duration-frequency (IDF) curves play a significant role in hydrologic and water resources development, especially in the strategic planning and design of water resource systems. These include storm drainage systems, flood prediction measures, the construction of bridges and ducts, and the development of water-impounding facilities. In this study, IDF curves were developed using Bernard’s equation, with rainfall intensities for different durations and return periods obtained by applying the Gumbel and Log Pearson Type III distributions. The research used historical rainfall data gathered from cities spanning diverse elevations and regions across Pakistan. IDF curves and empirical formulas were developed for multiple rainfall durations (1 h, 6 h, 12 h, and 24 h) and return periods (2, 5, 10, and 25 years). The study revealed significant geographical differences, showing that mountainous regions consistently experience higher rainfall than plain and coastal areas. Furthermore, cities at higher elevations consistently register more intense rainfall across the return periods than their lower-elevation counterparts. Through curve fitting and the estimation of the IDF empirical parameters, the study obtained parameter values for the studied cities, with C ranging from 1.9 × 10⁶ to 1.1 × 10⁵, m from 0.55 to 0.27, and e from 1.04 to 1.07. The contours illustrate the gradual variation of the parameters across the entire study area. Both the IDF parameter C and the rainfall intensity are lower in flat areas (Lahore and Karachi) than in higher-elevation areas. Peshawar and Quetta experience a moderate to gradual decrease in rainfall intensity with duration, which is reflected in the smoother contours. The contour lines around Lahore and Karachi are circular, indicating a more localized and consistent pattern of the e parameter across the area.

The findings of this study offer valuable insights for enhancing the resilience and sustainability of water resource systems in Pakistan. This knowledge is necessary for supporting informed decision-making in the face of the country’s diverse climatic and geographical challenges. The study not only advances our understanding of rainfall patterns but also gives practical guidance for effective water resource management in a region characterized by varying elevations and climatic conditions. It also emphasizes the need for adaptive flood management strategies, agricultural planning, and water harvesting practices, thus contributing to the broader goals of sustainable development and climate adaptation in Pakistan. Moreover, the methodological approach adopted in this study, including the use of the Gumbel and Log Pearson Type III distributions, provides a robust framework for developing IDF curves. Future research can build on these findings to further refine IDF models and explore their applications in different climatic settings.