1 Introduction

The analysis of climatic data series to monitor climate change has become common practice among researchers in recent years (Shi et al. 2013). In hydrology and climatology, precipitation is usually regarded as one of the climatic variables of greatest interest because of the large variability it shows at both temporal and spatial scales (IPCC 2007). Furthermore, as noted by Trenberth et al. (2007) on the basis of the updated results of the IPCC Fourth Assessment Report, heavy precipitation events increased during the last century, with precipitation becoming heavier than average.

Because of the important influence of precipitation extremes on the hydrological cycle, many studies of changes in precipitation extremes have been conducted in various regions of the world.

According to Brooks and Carruthers (1953) and Martin-Vide (2004), precipitation can be considered a discrete process that can be modeled by negative exponential distributions. This means that a few rainy days account for most of the total precipitation. For Cortesi et al. (2012), this makes precipitation a highly time-compressed process. The concentration of precipitation can have significant impacts on natural, human, and social systems, as Martin-Vide (2004), Zhang et al. (2009), and Li et al. (2011) have pointed out.

Li et al. (2011) consider that, compared with studies of precipitation frequency and intensity, the concentration or dispersion of precipitation has rarely been investigated. For Brugnara et al. (2012), this may be due to the lack of high-quality daily precipitation datasets. Studies of the concentration of precipitation are mostly based on the precipitation concentration degree (Zhang and Qian 2003; Liu et al. 2016), the precipitation concentration index (PCI) (Oliver 1980), and the concentration index (CI) (Martin-Vide 2004).

Martin-Vide (2004) proposed the concentration index to explore the contribution of the days of the greatest rainfall to the total amount. This index is based on the Lorenz curve, which was first used by economists to assess income/wealth distributions. Shi et al. (2013) pointed out that the Lorenz curve is a function that relates the accumulation of a variable and its cumulative frequency. In this manner, when the distribution is equitable, the Lorenz curve can be expressed by the function y = x.

The concentration index has been widely used in recent years to analyze the statistical characteristics of daily precipitation, e.g., by Martin-Vide (2004) in Spain, Alijani et al. (2008) in Iran, Zhang et al. (2009) in the Pearl River basin of China, Benhamrouche and Martín-Vide (2012) in mainland Spain, Cortesi et al. (2012) in Europe, Suhaila and Jemain (2012) in Peninsular Malaysia, Shi et al. (2013) in the Lancang River basin of China, Shi et al. (2014) in the upper reaches of the Huai River in China, Benhamrouche et al. (2015) in Algeria, Patel and Shete (2015) in India, Yesilırmak and Atatanır (2016) in western Turkey, Mayer et al. (2017) in the Canary Islands of Spain, Serrano-Notivoli et al. (2017) in Spain, Zubieta et al. (2017) in Peru, Llano (2018) in Argentina, and Vyshkvarkova et al. (2018) in the south of Russia.

Most of these research works focus on the spatial and temporal behavior of the concentration index within particular geographical regions. The papers cited above mainly report the range within which the concentration index fluctuates and, on this basis, propose a zoning into areas of low, medium, and high concentration.

However, the interpretation of the results can be confusing because the categorization of concentration levels differs from one research work to another. For example, Zhang et al. (2009) consider CI ≤ 0.75 low, values between 0.75 and 0.78 medium, and CI ≥ 0.78 high. Vyshkvarkova et al. (2018) state that, according to previous studies, CI > 0.61 is high, 0.55 < CI < 0.61 is moderate, and CI < 0.55 is low. Llano (2018), in turn, classifies CI values for Argentina greater than 0.61 as high, as in Vyshkvarkova et al. (2018), but values lower than 0.58 as low.

The differences among these classifications are due to the distribution of the CI values within each of the regions under study; that is, each classification is local. Under these circumstances, it is important to differentiate between the global and local behaviors of this index in order to make the results of the aforementioned research works comparable. In addition, none of these works describes the methodology followed to classify the concentration index.

The objective of this work was therefore to analyze the behavior of the concentration index at the global scale, and then to analyze its variations at the local scale, with the purpose of defining a classification for the CI at both scales.

2 Data and methods

To analyze the global behavior of the concentration index, data were obtained from the results of previous research works. The data retrieved from these papers are shown in Table 1. Additionally, the CI values calculated for the stations shown in Table 2 were used. The total amount of data available, including our own data, amounts to 539 CI values.

Table 1 Research works used for the analysis of the global behavior of the concentration index
Table 2 Meteorological stations used for the calculation of the concentration index for central-western Mexico

Our own data correspond to daily precipitation records from thirty-seven meteorological stations obtained from the National Meteorological Service of Mexico (NMS 2018). The data series were tested for homogeneity problems using the methodology proposed by Wang et al. (2010). The precipitation series cover the period 1970–2010, as in most of the works consulted. The location and main precipitation characteristics of these meteorological stations are shown in Table 2.

The calculation of the concentration index was based on the methodology proposed by Martin-Vide (2004). The first step consisted of classifying the precipitation values into classes of 1 mm in ascending order. Then, the cumulative sums of the number of days and of the precipitation amounts were calculated. Finally, for the cumulative frequency of rainy days (x) and the associated cumulative precipitation amounts (y), an equation of the form (1) was fitted:

$$ y= ax{e}^{bx} $$
(1)

where a and b are regression constants estimated by the least squares method.
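The steps above can be sketched in Python as follows. This is an illustrative sketch rather than the code used in this work: the function names, the 0.1 mm wet-day threshold, and the choice of a log-linearised fit (ordinary least squares of ln(y/x) on x, which minimizes errors in log space rather than in y) are assumptions made for the example.

```python
import numpy as np


def lorenz_points(daily_precip, class_width=1.0, wet_threshold=0.1):
    """Cumulative % of rainy days (x) and cumulative % of precipitation (y).

    class_width (1 mm) follows the text; wet_threshold is an assumed value.
    """
    p = np.asarray(daily_precip, dtype=float)
    p = p[p >= wet_threshold]                      # keep rainy days only
    classes = np.floor(p / class_width).astype(int)
    n_classes = classes.max() + 1
    days = np.bincount(classes, minlength=n_classes)               # days per class
    amounts = np.bincount(classes, weights=p, minlength=n_classes)  # mm per class
    keep = days > 0                                # drop empty classes
    x = 100.0 * np.cumsum(days[keep]) / days.sum()
    y = 100.0 * np.cumsum(amounts[keep]) / amounts.sum()
    return x, y


def fit_exponential(x, y):
    """Estimate a and b in y = a * x * exp(b * x).

    Assumption: the fit is done on ln(y / x) = ln(a) + b * x.
    """
    slope, intercept = np.polyfit(x, np.log(y / x), 1)
    return float(np.exp(intercept)), float(slope)
```

Given a daily series `p`, `a, b = fit_exponential(*lorenz_points(p))` would return the two constants of Eq. (1) under these assumptions.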

By definition, the concentration index is the ratio between the area (A) enclosed between the Lorenz curve and the equidistribution line, and the area of the triangle delimited by the equidistribution line.

$$ \mathrm{CI}=\frac{A}{5000} $$
(2)

The area A was evaluated as the difference between the area under the equidistribution line (5000) and the area under the fitted Lorenz curve:

$$ A=5000-{\int}_0^{100} ax{e}^{bx} dx $$
(3)
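Because the integral in Eq. (3) has a closed form, namely (a/b²)[e^{100b}(100b − 1) + 1] for b ≠ 0, Eqs. (2) and (3) can be evaluated directly from the fitted constants. The following minimal Python sketch illustrates this; the function name is hypothetical and the closed form is derived here by integration by parts rather than taken from the source.

```python
import math


def concentration_index(a, b):
    """CI from the constants of y = a * x * exp(b * x), with x and y in percent.

    Uses the closed form of the integral over [0, 100]; requires b != 0.
    """
    area_under_curve = a / b**2 * (math.exp(100.0 * b) * (100.0 * b - 1.0) + 1.0)
    area_between = 5000.0 - area_under_curve       # Eq. (3)
    return area_between / 5000.0                   # Eq. (2)
```

For instance, constants close to a = 0.0255 and b = 0.0367 give a CI of approximately 0.6.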

To define a classification into areas of low, medium, and high concentration at the local scale, two methodologies were used. The first is based on the standardized anomaly, which is computed by subtracting the sample mean of the concentration index values and dividing by the corresponding sample standard deviation (Wilks 2006):

$$ z=\frac{\mathrm{CI}-\overline{\mathrm{CI}}}{S_{\mathrm{CI}}} $$
(4)

Once the anomalies were calculated, the classification was made taking as reference a distance of one standard deviation from the sample mean; that is, the CI value was considered high when the anomaly was equal to or greater than 1, and low when the anomaly was equal to or less than − 1. The CI values with anomalies between − 1 and 1 were considered moderate.
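The anomaly-based rule can be sketched as follows. The function name and the use of NumPy are assumptions for illustration; the sample standard deviation is computed with `ddof=1`, which is one reading of "sample standard deviation" in Eq. (4).

```python
import numpy as np


def classify_by_anomaly(ci_values):
    """Return standardized anomalies (Eq. 4) and a low/moderate/high label per value."""
    ci = np.asarray(ci_values, dtype=float)
    z = (ci - ci.mean()) / ci.std(ddof=1)          # sample standard deviation
    labels = np.where(z >= 1.0, "high",
                      np.where(z <= -1.0, "low", "moderate"))
    return z, labels
```

In this form, the local thresholds are simply the sample mean minus and plus one sample standard deviation of the CI values of the region.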

The second methodology consisted of calculating the first and third quartiles. The CI values were considered high when they were equal to or greater than the third quartile, and low when they were equal to or less than the first quartile. The CI values within the interquartile range were considered moderate.
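A corresponding sketch of the quartile-based rule is given below. The function name is hypothetical, and the quartiles are computed with NumPy's default linear interpolation; the text does not state which quantile definition was used, so other definitions could shift the thresholds slightly.

```python
import numpy as np


def classify_by_quartiles(ci_values):
    """Return (Q1, Q3) and a low/moderate/high label per value."""
    ci = np.asarray(ci_values, dtype=float)
    q1, q3 = np.percentile(ci, [25, 75])           # default: linear interpolation
    labels = np.where(ci >= q3, "high",
                      np.where(ci <= q1, "low", "moderate"))
    return (float(q1), float(q3)), labels
```

By construction, roughly a quarter of the values fall in each extreme category, whatever the shape of the distribution.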

3 Results and discussion

3.1 Concentration index of central-western Mexico

The concentration index, the regression constants a and b, and the percentage of precipitation contributed by 25% of the rainiest days for the 37 weather stations located in central-western Mexico are shown in Table 3. The CI varies between 0.49 and 0.68. The lowest values are related to areas where 25% of the rainiest days contribute, on average, 59% of the annual precipitation, whereas the highest values are associated with places where 25% of the rainiest days contribute 77% of the annual precipitation. This difference of 18 percentage points suggests that a more regular daily precipitation pattern could be expected in San Gregorio (station 14125) than in Cihuatlán (station 14028). This is because, according to the observed rainfall regime of the study area, low values of the concentration index can be related to convective precipitation patterns, while high values are associated with the tropical cyclonic activity of the Pacific Ocean (García 2003).

Table 3 Regression constants a, b, concentration index, and percentage of precipitation contributed by 25% of the rainiest days

Compared with other studies, the values of the concentration index observed in central-western Mexico are higher than those observed by Suhaila and Jemain (2012) in Peninsular Malaysia and Zubieta et al. (2017) in the central Andes of Peru, and lower than those of all the other studies cited in Table 1. This results from the different climatic systems of these countries.

Regarding the regression constants, the results show that as the CI value increases, the constant b has the greatest influence on the behavior of the concentration curves. This suggests that, for high values of the CI, the concentration curves could be modeled acceptably with an equation that depends on only one constant.

3.2 Behavior of the concentration index at a global scale

The concentration index ranges between 0 and 1 (Vyshkvarkova et al. 2018), where 0 corresponds to a very low concentration (complete uniformity) and 1 to a very high concentration of the precipitation (Shi et al. 2014). On this basis, the concentration index could be intuitively classified at the global scale into five categories: (1) 0–0.2 as very low concentration, (2) 0.2–0.4 as low concentration, (3) 0.4–0.6 as moderate concentration, (4) 0.6–0.8 as high concentration, and (5) 0.8–1 as very high concentration.
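The five global categories can be expressed as a simple lookup, sketched below. Since the stated ranges share their end points, the sketch assumes half-open intervals in which a boundary value is assigned to the upper category (for example, 0.6 is labeled high); this convention and the names used are assumptions, not part of the proposed classification.

```python
from bisect import bisect_right

GLOBAL_BOUNDS = [0.2, 0.4, 0.6, 0.8]
GLOBAL_LABELS = ["very low", "low", "moderate", "high", "very high"]


def global_category(ci):
    """Map a CI value in [0, 1] to one of the five global categories."""
    if not 0.0 <= ci <= 1.0:
        raise ValueError("CI must lie between 0 and 1")
    # bisect_right counts how many boundaries are <= ci (assumed convention)
    return GLOBAL_LABELS[bisect_right(GLOBAL_BOUNDS, ci)]
```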

However, the results suggest that precipitation is unlikely to cover all of these ranges, because the observed CI values are concentrated mainly in the moderate and high categories. This is supported by Monjo and Martin-Vide (2016), who found in their study of daily precipitation concentration around the world, based on several indices, that the Gini index, which is very similar to the CI, varies from 0.38 to 0.87.

Figure 1 shows some simulated concentration curves for the main categories according to empirical data, as well as an extrapolated concentration curve used to identify the behavior of the CI for lower categories. The figure shows that, on average, for a concentration index of 0.42, 25% of the rainiest days account for around 50% of the precipitation. For concentration index values above 0.5, the curves show a very slow initial increase, which means that a large percentage of rainy days contributes only a small percentage of the precipitation. For example, for CI = 0.5, 25% of the rainiest days account, on average, for 60.5% of the precipitation, while for CI values of 0.6, 0.7, and 0.8, 25% of the rainiest days account, on average, for 70%, 79%, and 90.5% of the precipitation, respectively.

Fig. 1 Simulated concentration curves for the main categories of the observed concentration index

In contrast, the extrapolation for a CI of 0.2 shows that the distribution of precipitation is more uniform, since 25% of the rainiest days account, on average, for 38% of the total precipitation.
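The link between a CI value and the share of precipitation contributed by 25% of the rainiest days can be approximated with a one-parameter version of Eq. (1). The sketch below is not the procedure used to draw Fig. 1: it assumes that the fitted curve passes exactly through the point (100, 100), so that a = e^{−100b}, and then finds b for a target CI by bisection. Under this assumption, the share is 100 − y(75) = 100 − 75e^{−25b}. Because real fitted curves need not satisfy this constraint, the results only approximate the values quoted above. All function names are hypothetical.

```python
import math


def ci_from_b(b):
    """CI of y = a * x * exp(b * x) when a = exp(-100 * b) (assumed normalization)."""
    u = 100.0 * b
    # Closed-form area under the curve on [0, 100]; expm1 avoids cancellation
    area_under_curve = 10000.0 * (u + math.expm1(-u)) / u**2
    return 1.0 - area_under_curve / 5000.0


def b_for_ci(target_ci, lo=1e-6, hi=1.0, tol=1e-10):
    """Bisection on b; ci_from_b increases with b. Valid for 0 < target_ci < 0.99."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ci_from_b(mid) < target_ci:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def share_of_rainiest_quarter(target_ci):
    """Percentage of precipitation contributed by 25% of the rainiest days."""
    b = b_for_ci(target_ci)
    return 100.0 - 75.0 * math.exp(-25.0 * b)
```

With these assumptions, a CI of 0.6 yields a share of about 70%, in line with the value read from the simulated curves; for other CI values the approximation may deviate by a few percentage points.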

The frequency distribution of all CI values presented in the available research works shows that this index varies within a compact and well-defined range of 0.42 to 0.81 (Fig. 2). This range is similar to, but wider than, that found by Cortesi et al. (2012) across Europe (0.51–0.72), and slightly narrower than that observed by Monjo and Martin-Vide (2016) around the world. The lowest values were observed in Peru (Zubieta et al. 2017) and the highest in China (Zhang et al. 2009). According to the previous categorization, the observed range of CI values indicates that the concentration of precipitation in the countries considered in this work is between moderate and high. In fact, the percentage of CI values greater than 0.5, which was 94%, suggests that precipitation is effectively a process that tends toward high concentration. Figure 2 also shows that values greater than 0.8 are very scarce (< 0.2%), which indicates that the probability of finding places where the concentration of precipitation is very high is low. This is because a CI greater than 0.8 implies that 25% of the rainiest days account for more than 90% of the total precipitation. Among the papers consulted in this work, only Zhang et al. (2009) reported such a value, a CI of 0.805, which was attributed mainly to precipitation from typhoons, which usually come from the northwestern Pacific.

Fig. 2 Frequency distribution of the concentration index observed in the countries used in this work

3.3 Behavior of the concentration index at a local scale

Figure 3 shows the distribution of the concentration index by country. The lowest values of the concentration index are located in the Mantaro River basin in the central Andes of Peru (Zubieta et al. 2017) and in Peninsular Malaysia (Suhaila and Jemain 2012), and the highest values in the Pearl River basin of China (Zhang et al. 2009). The lowest values observed by Zubieta et al. (2017) are related to semiarid conditions with low precipitation, while the values observed by Suhaila and Jemain (2012) are related to monsoons. Figure 3 also shows that in some countries the distribution of the concentration index is asymmetric, with either negative or positive skewness.

Fig. 3 Behavior of the concentration index between countries

The ranges of variation of the concentration index at the local scale were found to lie between 0.07 and 0.19, with a mean value of 0.12 ± 0.041. On average, the range within which the CI oscillates locally is practically a quarter of that observed on a global basis. The smallest range of variation by country was observed in the Pearl River basin of China (Zhang et al. 2009), followed by southern Russia (Vyshkvarkova et al. 2018), the Canary Islands (Mayer and Marzol 2014), and the upper reaches of the Huai River of China (Shi et al. 2014), while the widest ranges were found in Peninsular Malaysia (Suhaila and Jemain 2012) and in central-western Mexico (own data).

Despite the differences observed in the ranges of variation, low, medium, and high concentrations are recognized in all countries, although the thresholds that define the categories differ from one country to another. The results obtained for the classification of the concentration index with the two methodologies are presented in Table 4. Although the differences in this table seem small, they are important, since in some cases a difference of one hundredth can represent up to 10% of the variation in the concentration of precipitation at the local scale.

Table 4 Classification of concentration index according to the proposed methodologies

When the two classifications are compared, the one based on standardized anomalies could be considered more conservative because the ranges of extreme values (low and high) are narrower than those found with the quartile-based methodology. This is because the quartiles divide a data set into four equal parts, which makes the extreme ranges wider. The criterion based on standardized anomalies can therefore be considered a stricter condition. One advantage of using the standardized anomaly is that it is not necessary to assume that the data follow any particular distribution in order to express them in terms of standardized anomalies (Wilks 2006).

According to the proposed classifications for the concentration index, the classification presented in Zhang et al. (2009) is very similar to that obtained with the standardized anomaly methodology. In the case of the classification proposed by Vyshkvarkova et al. (2018), both methodologies generate the same results and match the threshold for the highest category of concentration. However, for the lower category, the proposed methodologies result in a value greater than that proposed by these researchers.

Finally, in the case of the classification proposed by Llano (2018) for the lower category, there is a total coincidence with the methodology based on standardized anomalies; however, in the case of the higher category, the coincidence is better with the methodology based on the quartiles. Thus, the results of the comparisons made show that the methodology based on standardized anomalies could become a good reference since it shows good performance even in regions where the concentration of precipitation is very high as in the case of the Pearl River basin of China.

3.4 Comparison of the behavior of the concentration index at global and local scales

The classifications presented above show that the concentration of daily precipitation can be analyzed from two perspectives, global and local. The global classification gives the general level of concentration of daily precipitation of a country or region according to the mathematical definition of the CI, while the local classification describes the conditions of concentration of precipitation within the range of local variation. For example, in central-western Mexico, the concentration of precipitation at the global level lies within the moderate and high categories, with an average close to 0.6, which indicates that in this region 25% of the rainiest days could account for up to 70% of the total rainfall. At the local scale, on the other hand, according to the observed values and the classification based on standardized anomalies, the concentration of precipitation can be considered locally low for values lower than 0.52, and high when the CI value is equal to or greater than 0.6.

In the same manner, if we take as a reference the results of Zhang et al. (2009), it can be appreciated on a global scale that the region where the Pearl River basin is located is the one which has shown the highest CI values and therefore, the highest concentration of the precipitation in the case of the research works consulted. However, despite the observed high concentrations, Zhang et al. (2009) recognize that on a local scale, the CI values lower than 0.75 are related to a low concentration of the precipitation while the values higher than 0.78 are related locally to high concentrations of the precipitation.

In general, by analyzing Fig. 3 in conjunction with Table 4, the behavior of the CI can be known at both the global and the local scale for each of the geographical areas considered in this study. According to Fig. 3, two main groups of precipitation concentration are evident: one group falls within the categories of moderate to high concentration, and the other lies between the categories of high and very high concentration. Although two main groups of precipitation concentration were identified at the global scale, significant differences were observed at the local scale between the analyzed geographical areas. An example of these differences can be observed between the Lancang and the Pearl River basins, both located in China. In the first basin, high concentrations of precipitation occurred for CI values higher than 0.67, while in the second basin, the highest concentration conditions were observed for CI values greater than 0.79. This indicates that in the Lancang River basin, 25% of the rainiest days could account for up to 76% of the total rainfall, while in the Pearl River basin, 25% of the rainiest days could account for up to 90% of the total precipitation.

4 Conclusions

The concentration index (CI) has been widely used in recent years to analyze the statistical structure of daily precipitation. This index was proposed by Martin-Vide (2004) to explore the contribution of the days of greatest rainfall to the total amount. However, the interpretation of the concentration index is sometimes confusing because most research works have placed more emphasis on the behavior of this index at the local scale, which makes it difficult to compare results between different research projects. The results of this work show that the concentration of daily precipitation, according to the concentration index, can be analyzed from two perspectives, global and local. The global analysis makes it possible to know the general level of concentration of daily precipitation of a country or region, while the local analysis makes it possible to visualize the conditions of concentration of precipitation within the range of local variation.

At the global scale, according to the analyzed data, it was found that the concentration of precipitation is between moderate and high, since the index varies within a compact and well-defined range of 0.42 to 0.81. The lowest values were observed in Peru, where 25% of the rainiest days account for around 50% of the precipitation, and the highest values were found in China, where 25% of the rainiest days can account for up to 90.5% of the total precipitation.

At the local scale, the methodologies proposed for the classification of the concentration index show good performance, although the classification based on standardized anomalies could be considered more conservative because the ranges of extreme values (low and high) are narrower than those found with the quartile-based methodology.

The comparison of the concentration index on the global and local scale allows knowing the general level of concentration of daily precipitation of one country or region according to the mathematical definition of the CI in the first case, while the local classification allows knowing within the range of local variation the conditions of the concentration of precipitation at this scale.