1 Introduction

In recent years, precipitation has behaved atypically, often with excessive rainfall in some areas and very little rain in others, perhaps as a result of the climate changes observed on the planet. In Brazil, this atypical behavior has been observed in the southeast region, which has experienced drought events rarely seen there before, together with water shortages in the reservoirs that supply the big cities and agriculture. Long drought events are common in the northeastern region of Brazil, but they are rare in the southeast, the region with the largest population, the highest level of industrialization and significant agricultural production.

A drought event results from precipitation levels that are lower than the normal precipitation observed over a long period of time. To monitor drought events, the literature presents different methodologies, such as the percentage of normal precipitation, precipitation percentiles, or the Palmer Drought Severity Index (PDSI), a measure of dryness based on recent precipitation and temperature (Palmer 1965).

A popular index used by many countries around the world is the Standardized Precipitation Index (SPI) introduced by McKee et al. (1993, 1995), which is simpler, easy to calculate, statistically relevant and effective in analyzing wet or dry periods and cycles. This index has been extensively used by climatologists around the world (Karavitis 1998, 1999; Hayes et al. 1999; Lana et al. 2001; Sönmez et al. 2005; Livada and Assimakopoulos 2007; Chortaria et al. 2010; Karavitis et al. 2011).

Calculating the SPI requires at least 20–30 years of monthly values; optimal results require 50–60 years (Guttman 1994). The SPI is based on the probability of precipitation for any time scale. The probability of the observed precipitation is then transformed into an index.

The SPI quantifies the precipitation deficit for different timescales. These timescales reflect the impact of a drought event on the availability of the different water resources. Soil moisture conditions respond to precipitation anomalies on a relatively short scale; groundwater, streamflow and reservoir storage reflect the longer-term precipitation anomalies. For these reasons, McKee et al. (1993) originally calculated the SPI for 3, 6, 12, 24 and 48-month timescales.

The SPI calculation for any location is based on the long-term precipitation record for a specified period, and the index is derived from the cumulative probability of a rainfall amount occurring at a climate station. The historic rainfall data of the station are fitted to an asymmetrical distribution, usually a gamma distribution (which generally gives a good fit to rainfall data), with the parameters estimated by maximum likelihood. In this way, based on the historic rainfall data, an analyst can find the probability of the rainfall being less than or equal to a certain amount. Thus, the probability of rainfall being less than or equal to the median rainfall for a specified area is about 0.5, while the probability of rainfall being less than or equal to an amount much smaller than the median is lower. Therefore, a rainfall event with a low cumulative probability is indicative of a likely drought event, whereas a rainfall event with a high cumulative probability is an anomalously wet event.

The fitted distribution has a mean and a standard deviation that depend on the rainfall characteristics of the specified area. It is therefore difficult to compare rainfall events for two or more different areas in terms of drought, since drought is really a “below-normal” rainfall event: taking into account only rainfall amounts, what is “normal rainfall” in one area can be surplus rainfall in another. The SPI is obtained by transforming the fitted cumulative gamma probability into a standard normal random variable Z with mean zero and standard deviation one. A new variate is formed, and the transformation is done in such a way that each rainfall amount in the old (gamma) function has a corresponding value in the new (transformed) Z function. The probability that the rainfall is less than or equal to any rainfall amount is the same as the probability that the new variate is less than or equal to the corresponding value of that rainfall amount. The resulting transformed variate is always in the same units (Edwards 1997). Positive SPI values indicate greater than median precipitation and negative values indicate less than median precipitation. Because the SPI is normalized, wetter and drier climates can be represented in the same way; thus, wet periods can also be monitored using the SPI.
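The transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the gamma parameters are obtained with Thom's closed-form approximation to the maximum likelihood estimates rather than an exact optimizer, all precipitation totals are assumed strictly positive (the handling of zero totals is omitted), and the rainfall values are made up.

```python
import math
from statistics import NormalDist


def fit_gamma_thom(sample):
    """Approximate ML fit of a gamma distribution (Thom's approximation)."""
    n = len(sample)
    mean = sum(sample) / n
    # a_stat = ln(mean) - mean(ln x); positive unless all values are equal.
    a_stat = math.log(mean) - sum(math.log(x) for x in sample) / n
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a_stat / 3.0)) / (4.0 * a_stat)
    scale = mean / shape
    return shape, scale


def gamma_cdf(x, shape, scale, terms=500):
    """Gamma cumulative probability via the power series of the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for k in range(1, terms):
        term *= z / (shape + k)
        total += term
        if term < 1e-15 * total:
            break
    log_front = -z + shape * math.log(z) - math.lgamma(shape)
    return min(1.0, total * math.exp(log_front))


def spi(values):
    """Map each precipitation total to a standard normal quantile."""
    shape, scale = fit_gamma_thom(values)
    std_normal = NormalDist()
    result = []
    for x in values:
        p = gamma_cdf(x, shape, scale)
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the quantile finite
        result.append(std_normal.inv_cdf(p))
    return result


# Hypothetical monthly totals (mm), for illustration only.
rain = [50.0, 80.0, 120.0, 30.0, 200.0, 95.0, 60.0, 150.0, 110.0, 70.0]
for amount, z in zip(rain, spi(rain)):
    print(f"{amount:6.1f} mm -> SPI {z:+.2f}")
```

Because both the gamma cumulative probability and the normal quantile function are increasing, the ranking of the rainfall totals is preserved by the index.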

A classification system introduced by McKee et al. (1993) (see Table 1) is used to define the drought intensities obtained from the SPI. They also defined the criteria for a drought event at any timescale: a drought event occurs any time the SPI is continuously negative and reaches an intensity of \(-1.0\) or less, and the event ends when the SPI becomes positive. Each drought event, therefore, has a duration defined by its beginning and end, and an intensity for each month that the event continues. The positive sum of the SPI over all the months within a drought event (that is, the sum of the absolute SPI values) can be termed the drought’s “magnitude”.
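The event definition above can be made concrete with a small Python sketch. The function name and the toy SPI series are hypothetical, and one boundary choice is an assumption: an SPI value of exactly zero is treated as ending a run of negative values.

```python
def drought_events(spi_series, threshold=-1.0):
    """Find runs of negative SPI that reach the threshold at least once.

    Returns one dict per event with 0-based start/end indices, the
    duration in months and the magnitude (positive sum of the SPI).
    """
    values = list(spi_series)
    events = []
    start = None
    # A trailing 0.0 sentinel closes a run that lasts until the last month.
    for i, value in enumerate(values + [0.0]):
        if value < 0:
            if start is None:
                start = i
        elif start is not None:
            run = values[start:i]
            if min(run) <= threshold:
                events.append({
                    "start": start,
                    "end": i - 1,
                    "duration": i - start,
                    "magnitude": -sum(run),
                })
            start = None
    return events


# Toy series: the run at indices 5-6 never reaches -1.0, so it is not an event.
series = [0.3, -0.4, -1.2, -0.8, 0.5, -0.6, -0.2, 0.1, -1.5]
for event in drought_events(series):
    print(event)
```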

Table 1 SPI values

In this paper, our main goal is to study the SPI measures (1, 3, 6 and 12-month timescales) for Campinas, a large industrial city with more than 1 million people located in São Paulo State, in the southeast region of Brazil, for the period from January 01, 1947 to May 01, 2011. The data set is available on the website of CIIAGRO (Índice Padronizado de Precipitação, SPI), a site related to climate in São Paulo State, Brazil: http://www.ciiagro.sp.gov.br/ciiagroonline/Listagens/SPI/LspiLocal.asp.

In Fig. 1, we have the time series plots for the \(SPI-1\), \(SPI-3\), \(SPI-6\) and \(SPI-12\) in Campinas for the period ranging from January 01, 1947 to May 01, 2011.

Fig. 1 Campinas SPI measures. a \(SPI-1\), b \(SPI-3\), c \(SPI-6\) and d \(SPI-12\)

From the plots of Fig. 1, we observe very atypical SPI values (\({<}-2.0\)), indicating a severely dry period, from the 600th to the 620th month, which corresponds to the period from December 1996 to August 1998. That is, we observe great variability of the SPI measures in this period of about 2 years, with alarming consequences. It is important to point out that this period coincides with a record-breaking El Niño event (see the NOAA technical report on the El Niño effects in North America at http://www1.ncdc.noaa.gov/pub/data/techrpts/tr9802/tr9802), which could be connected to this drought in Brazil.

To model these SPI time series, we could consider a time series modeling approach, for example stochastic volatility (SV) models (Ghysels 1996; Kim et al. 1998; Meyer and Yu 2000) under a Bayesian hierarchical approach. Alternatively, instead of modeling the SPI time series directly, and since a drought event occurs any time the SPI is continuously negative and reaches an intensity of \(-1.0\) or less, we consider the calendar times (epochs) at which the SPI is less than or equal to the threshold \(L=-1.0\) in Campinas during the period from January 01, 1947 to May 01, 2011, which corresponds to \(T=770\) months.
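As a minimal sketch of this data reduction, the following Python functions extract the epochs at which a monthly SPI series is at or below the threshold and build the accumulated count used in the point-process analysis. The function names and the toy series are hypothetical; months are numbered from 1.

```python
def exceedance_epochs(spi_series, threshold=-1.0):
    """Return the 1-based month indices t_i at which SPI <= threshold."""
    return [t for t, value in enumerate(spi_series, start=1) if value <= threshold]


def cumulative_counts(epochs, total_months):
    """Return N(t) for t = 1, ..., total_months."""
    marked = set(epochs)
    counts = []
    n = 0
    for t in range(1, total_months + 1):
        if t in marked:
            n += 1
        counts.append(n)
    return counts


# Toy series of six months, for illustration only.
toy_spi = [0.3, -1.2, -0.4, -1.0, 0.8, -2.1]
epochs = exceedance_epochs(toy_spi)
print(epochs)
print(cumulative_counts(epochs, len(toy_spi)))
```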

Fig. 2 shows the accumulated numbers of drought occurrences (\(SPI\le -1.0\)) for the \(SPI-1\), \(SPI-3\), \(SPI-6\) and \(SPI-12\) series, plotted against time (month of occurrence).

Fig. 2 Accumulated numbers of \(SPI\le -1.0\) in each time versus time (month of occurrence). a \(SPI-1\), b \(SPI-3\), c \(SPI-6\) and d \(SPI-12\)

From Fig. 2, we observe the existence of one or two change-points close to the time period ranging from 600 to 620, which corresponds to approximately the period of time between the years 1996 and 1998.

To model the occurrences of drought events in Campinas, we consider a point process that counts the violations of the threshold. Let \(N\left( t\right) \) be the cumulative number of violations observed in the interval \(\left( 0,t\right] \), with \(t\le T\), and assume that \(N\left( t\right) \) is modeled by a non-homogeneous Poisson process (NHPP). The intensity function is \(\lambda \left( t\right) =\frac{\partial m\left( t\right) }{\partial t}=\frac{\partial E\left[ N\left( t\right) \right] }{\partial t}\), where \(m\left( t\right) \) is the mean value function, a monotonic function of t, and different parametric forms for it have been introduced in the literature (Cox and Lewis 1966; Musa and Okumoto 1984; Musa et al. 1987). Bayesian inference for non-homogeneous Poisson processes has been discussed by many authors (Kuo and Yang 1996; Pievatolo and Ruggeri 2004; Cid and Achcar 1999).

In many applications, the intensity function of a non-homogeneous Poisson process may also have change-points. In this situation, we are interested in inference on the change-points at which the process changes. Bayesian methods have been used by different authors for inference in change-point models (Matthews and Farewell 1982; Achcar and Bolfarine 1989; Carlin et al. 1992; Dey and Purkayastha 1997; Achcar and Loibel 1998). Raftery and Akman (1986) considered a Bayesian analysis for Homogeneous Poisson Processes (HPP) in the presence of a change-point. Ruggeri and Sivaganesan (2005) introduced a Bayesian analysis for change-points in non-homogeneous Poisson processes considering Power Law Processes (PLP). Applications of non-homogeneous Poisson processes with or without change-points were introduced by Achcar et al. (2008, 2010) to analyse ozone pollution data.

In this paper, we use Markov Chain Monte Carlo (MCMC) methods (Gelfand and Smith 1990; Smith and Roberts 1993; Chib and Greenberg 1995) to develop a Bayesian analysis for change-points in an NHPP, assuming a popular class of models commonly used in software reliability or operating safety policy: the power law processes (PLP).

The paper is organized as follows: in Sect. 2, we introduce the likelihood function with and without change-points; in Sect. 3, we present a Bayesian formulation of the model; in Sect. 4, we present a Bayesian analysis of the SPI data from Campinas; finally, in Sect. 5, we discuss the obtained results.

2 The likelihood function

Let \(\left\{ N_{t}\right\} \), \(t \ge 0\), be a non-homogeneous Poisson process with mean value function \(m\left( t;\varvec{\theta }\right) \), where \(\varvec{\theta }\) is a vector of parameters, representing the expected number of events (Barlow and Hunter 1960) (here, the number of drought months up to time t). The full characterization of a process of this type is achieved by specification of the functional form of \(m\left( t;\varvec{\theta }\right) \), or equivalently, of its intensity function \(\lambda \left( t\right) = \frac{\partial m\left( t;\varvec{\theta }\right) }{\partial t}\).

We consider a special case of the NHPP: the power law process (PLP), defined by the mean value function (Crow 1974),

$$\begin{aligned} m_{PLP} \left( t;\varvec{\theta }\right) = \left( \frac{t}{\sigma }\right) ^{\alpha }, \quad \alpha ,\sigma >0, \end{aligned}$$
(1)

where \(\varvec{\theta }=\left( \alpha ,\sigma \right) \). The intensity function of this process is given by,

$$\begin{aligned} \lambda _{PLP}\left( t;\varvec{\theta }\right) = \left( \frac{\alpha }{\sigma }\right) \left( \frac{t}{\sigma }\right) ^{\alpha -1}. \end{aligned}$$
(2)

From (2), we observe that the intensity function \(\lambda _{PLP}\left( t;\varvec{\theta }\right) \) of the PLP has a flexible behavior that depends on the value of \(\alpha \). The intensity function is increasing for \(\alpha >1\), decreasing for \(\alpha <1\) and constant (i.e. the NHPP is actually a Homogeneous Poisson Process) for \(\alpha =1\).
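The role of \(\alpha \) in (1) and (2) can be checked numerically. The short Python sketch below evaluates both functions; the function names and the parameter values are illustrative assumptions, not estimates from the Campinas data.

```python
def plp_mean(t, alpha, sigma):
    """Mean value function (1): expected number of events up to time t."""
    return (t / sigma) ** alpha


def plp_intensity(t, alpha, sigma):
    """Intensity function (2): derivative of the mean value function."""
    return (alpha / sigma) * (t / sigma) ** (alpha - 1.0)


# Illustrative values: sigma = 5 months and three shapes of the intensity.
for alpha in (0.5, 1.0, 2.0):
    rates = [plp_intensity(t, alpha, 5.0) for t in (10.0, 100.0, 500.0)]
    print(alpha, [round(r, 4) for r in rates])
```

With \(\alpha =1\) the printed rates are all equal to \(1/\sigma \), which is the homogeneous case.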

This process is used to model the epochs of occurrence of drought violations up to time T. The data set is denoted by \(D_{T}=\left\{ n;t_{1},\ldots ,t_{n};T\right\} \), where n is the number of observed occurrence times \(t_{i}\), which we suppose to be ordered, that is, \(0<t_{1}<\cdots <t_{n}<T\). Letting \(N\left( t\right) \) denote the observed number of violation occurrences in \(\left( 0,t\right] \), modeled as an NHPP, the increment \(N\left( s+t\right) -N\left( s\right) \) given \(\varvec{\theta }\) has a Poisson distribution \(P\left[ m\left( s+t;\varvec{\theta }\right) -m\left( s;\varvec{\theta }\right) \right] \) for \(t>0\), and the process has independent increments.

The likelihood function for \(\varvec{\theta }\) considering the time truncated model is given by (Cox and Lewis 1966),

$$\begin{aligned} L\left( \varvec{\theta };D_{T}\right) =\left[ \prod _{i=1}^{n} \lambda \left( t_{i}\right) \right] \exp \left[ -m\left( T\right) \right] . \end{aligned}$$
(3)
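For the PLP, taking logarithms in (3) with the intensity (2) and the mean value function (1) gives \(\log L = n\log \alpha - n\alpha \log \sigma + \left( \alpha -1\right) \sum _{i}\log t_{i} - \left( T/\sigma \right) ^{\alpha }\). A minimal Python sketch of this expression follows; the function name and the toy occurrence times are hypothetical.

```python
import math


def plp_log_likelihood(times, total_time, alpha, sigma):
    """Log of the time-truncated likelihood (3) for the power law process."""
    n = len(times)
    sum_log_t = sum(math.log(t) for t in times)
    return (n * math.log(alpha)
            - n * alpha * math.log(sigma)
            + (alpha - 1.0) * sum_log_t
            - (total_time / sigma) ** alpha)


# Toy occurrence times observed up to T = 10, for illustration only.
toy_times = [2.0, 5.0, 9.0]
print(plp_log_likelihood(toy_times, 10.0, alpha=1.5, sigma=4.0))
```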

When the counting process undergoes a change over the time range \(\left( 0,T\right) \), we consider a single change-point \(\tau \) marking the transition between two NHPPs. In this way, the intensity function of the overall process is given by,

$$\begin{aligned} \lambda \left( t;\varvec{\theta }\right) =\left\{ \begin{array}{l} \lambda _{1}\left( t\right) ,\quad 0\le t \le \tau \\ \lambda _{2}\left( t\right) ,\quad t>\tau \end{array} \right. . \end{aligned}$$
(4)

where \(\lambda _{j}\left( t\right) = \lambda \left( t;\varvec{\theta }_{j}\right) \), \(j=1,2\), are intensity functions of the form (2) with \(\varvec{\theta }_{j}=\left( \alpha _{j},\sigma _{j}\right) \), and \(\varvec{\theta }=\left( \alpha _{1},\sigma _{1},\alpha _{2},\sigma _{2},\tau \right) \). Equivalently, letting \(m_{j}\left( t\right) =m\left( t;\varvec{\theta }_{j}\right) \), the corresponding mean value function is given by,

$$\begin{aligned} m\left( t;\varvec{\theta }\right) =\left\{ \begin{array}{ll} m_{1}\left( t\right) ,&{}\quad 0\le t \le \tau \\ m_{1}\left( \tau \right) +m_{2}\left( t\right) -m_{2}\left( \tau \right) , &{}\quad t>\tau \end{array} \right. . \end{aligned}$$
(5)

Observe that considering PLP models in the presence of a change-point, the intensity function (4) is given by,

$$\begin{aligned} \lambda \left( t;\varvec{\theta }\right) =\left\{ \begin{array}{ll} \left( \frac{\alpha _{1}}{\sigma _{1}}\right) \left( \frac{t}{\sigma _{1}}\right) ^{\alpha _{1}-1},&{}\quad 0\le t \le \tau \\ \\ \left( \frac{\alpha _{2}}{\sigma _{2}}\right) \left( \frac{t}{\sigma _{2}}\right) ^{\alpha _{2}-1},&{}\quad t>\tau \end{array} \right. . \end{aligned}$$
(6)

with corresponding mean value function (5) given by,

$$\begin{aligned} m\left( t;\varvec{\theta }\right) =\left\{ \begin{array}{ll} \left( \frac{t}{\sigma _{1}}\right) ^{\alpha _{1}},&{}\quad 0\le t \le \tau \\ \\ \left( \frac{\tau }{\sigma _{1}}\right) ^{\alpha _{1}}+\left( \frac{t}{\sigma _{2}}\right) ^{\alpha _{2}}-\left( \frac{\tau }{\sigma _{2}} \right) ^{\alpha _{2}},&{}\quad t>\tau \end{array} \right. . \end{aligned}$$
(7)

Considering the data \(D_{T}=\left\{ n;t_{1},\ldots ,t_{N\left( \tau \right) },t_{N\left( \tau \right) +1},\ldots ,t_{n};T\right\} \) for model (4)–(5), the likelihood function for \(\varvec{\theta }\) is given by,

$$\begin{aligned} L\left( \varvec{\theta };D_{T}\right)= & {} \left[ \prod _{i=1}^{N\left( \tau \right) }\lambda _{1}\left( t_{i}\right) \right] \exp \left[ -m_{1} \left( \tau \right) \right] \left[ \prod _{i=N\left( \tau \right) +1}^{N\left( T\right) } \lambda _{2}\left( t_{i}\right) \right] \nonumber \\&\times \, \exp \left[ -m_{2} \left( T\right) +m_{2}\left( \tau \right) \right] , \end{aligned}$$
(8)

where \(N\left( s\right) \) stands for the number of occurrences in the interval \(\left( 0,s\right] \). Observe that (3) is a special case of (8) when there is no change-point (i.e., \(\tau =T\)).
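The structure of (8) can be sketched in Python as follows. The function names and toy data are hypothetical; the final lines check numerically that setting \(\tau =T\) recovers the likelihood (3) of the first regime.

```python
import math


def log_intensity(t, alpha, sigma):
    """Logarithm of the PLP intensity (2)."""
    return math.log(alpha) - alpha * math.log(sigma) + (alpha - 1.0) * math.log(t)


def mean_value(t, alpha, sigma):
    """PLP mean value function (1)."""
    return (t / sigma) ** alpha


def log_lik_one_change_point(times, total_time, theta1, theta2, tau):
    """Log of the likelihood (8) with regimes theta1 = (alpha1, sigma1)
    up to tau and theta2 = (alpha2, sigma2) after tau."""
    (a1, s1), (a2, s2) = theta1, theta2
    before = [t for t in times if t <= tau]
    after = [t for t in times if t > tau]
    ll = sum(log_intensity(t, a1, s1) for t in before) - mean_value(tau, a1, s1)
    ll += sum(log_intensity(t, a2, s2) for t in after)
    ll -= mean_value(total_time, a2, s2) - mean_value(tau, a2, s2)
    return ll


# Toy occurrence times observed up to T = 10, for illustration only.
toy_times = [1.0, 3.0, 4.0, 7.0, 8.0, 9.0]
no_change = sum(log_intensity(t, 1.5, 2.0) for t in toy_times) - mean_value(10.0, 1.5, 2.0)
at_end = log_lik_one_change_point(toy_times, 10.0, (1.5, 2.0), (0.7, 3.0), tau=10.0)
print(no_change, at_end)
```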

In the presence of two change-points, \(\tau _{1}\) and \(\tau _{2}\), marking the transitions between three NHPPs, the intensity function of the overall process is given by,

$$\begin{aligned} \lambda \left( t;\varvec{\theta }\right) =\left\{ \begin{array}{ll} \left( \frac{\alpha _{1}}{\sigma _{1}}\right) \left( \frac{t}{\sigma _{1}} \right) ^{\alpha _{1}-1},&{}\quad 0\le t \le \tau _{1} \\ \\ \left( \frac{\alpha _{2}}{\sigma _{2}}\right) \left( \frac{t}{\sigma _{2}} \right) ^{\alpha _{2}-1},&{}\quad \tau _{1}< t \le \tau _{2} \\ \\ \left( \frac{\alpha _{3}}{\sigma _{3}}\right) \left( \frac{t}{\sigma _{3}} \right) ^{\alpha _{3}-1},&{}\quad t>\tau _{2} \end{array} \right. . \end{aligned}$$
(9)

where \(\varvec{\theta }=\left( \alpha _{1},\sigma _{1},\alpha _{2},\sigma _{2},\alpha _{3},\sigma _{3},\tau _{1},\tau _{2}\right) \), with corresponding mean value function given by,

$$\begin{aligned} m\left( t;\varvec{\theta }\right) =\left\{ \begin{array}{ll} \left( \frac{t}{\sigma _{1}}\right) ^{\alpha _{1}},&{}\quad 0\le t \le \tau _{1} \\ \\ \left( \frac{\tau _{1}}{\sigma _{1}}\right) ^{\alpha _{1}}+ \left( \frac{t}{\sigma _{2}}\right) ^{\alpha _{2}}-\left( \frac{\tau _{1}}{\sigma _{2}}\right) ^{\alpha _{2}},&{}\quad \tau _{1}< t \le \tau _{2} \\ \\ \left( \frac{\tau _{1}}{\sigma _{1}}\right) ^{\alpha _{1}}+ \left( \frac{t}{\sigma _{3}}\right) ^{\alpha _{3}}-\left( \frac{\tau _{2}}{\sigma _{3}}\right) ^{\alpha _{3}}+ \left( \frac{\tau _{2}}{\sigma _{2}}\right) ^{\alpha _{2}}- \left( \frac{\tau _{1}}{\sigma _{2}}\right) ^{\alpha _{2}},&{}\quad t>\tau _{2} \end{array} \right. . \end{aligned}$$
(10)
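The piecewise mean value functions (7) and (10) accumulate, regime by regime, the expected number of events contributed by each PLP. A compact Python sketch that covers any number of change-points is given below; the function name and the parameter values are illustrative assumptions only.

```python
import math


def piecewise_plp_mean(t, params, change_points):
    """Mean value function with len(change_points) + 1 PLP regimes.

    params is a list of (alpha_j, sigma_j) pairs and change_points is the
    increasing list of tau values; with two change-points this is (10).
    """
    total = 0.0
    lower = 0.0
    for (alpha, sigma), upper in zip(params, list(change_points) + [math.inf]):
        end = min(t, upper)
        if end > lower:
            # Expected events of regime j between its start and min(t, tau_j).
            total += (end / sigma) ** alpha - (lower / sigma) ** alpha
        if t <= upper:
            break
        lower = upper
    return total


# Illustrative parameters: three regimes with change-points at t = 3 and t = 6.
regimes = [(1.0, 2.0), (2.0, 3.0), (1.0, 4.0)]
taus = [3.0, 6.0]
for t in (3.0, 6.0, 10.0):
    print(t, piecewise_plp_mean(t, regimes, taus))
```

Evaluating such a function on a grid of months at the posterior estimates is one way to draw fitted curves against the accumulated counts.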

Assuming that the data is observed up to a total time T, where the epochs of occurrence of cases are denoted by \(t_{i}\), \(i=1,\ldots ,n\), \(0<t_{1}<t_{2}<\cdots <t_{N\left( \tau _{1}\right) }<t_{N\left( \tau _{1}\right) +1}<\cdots <t_{N\left( \tau _{2}\right) }< t_{N\left( \tau _{2}\right) +1}<\cdots <t_{n}<T\), the likelihood function for \(\varvec{\theta }\) in the presence of two change-points \(\tau _{1}\) and \(\tau _{2}\) is given by,

$$\begin{aligned} L\left( \varvec{\theta }\right)= & {} \left[ \prod _{i=1}^{N\left( \tau _{1} \right) }\lambda _{1}\left( t_{i}\right) \right] e^{-m_{1}\left( \tau _{1}\right) } \left[ \prod _{i=N\left( \tau _{1}\right) +1}^{N\left( \tau _{2}\right) } \lambda _{2}\left( t_{i}\right) \right] \nonumber \\&\times \, e^{-m_{2}\left( \tau _{2}\right) +m_{2}\left( \tau _{1}\right) } \left[ \prod _{i=N\left( \tau _{2}\right) +1}^{N\left( T\right) } \lambda _{3}\left( t_{i}\right) \right] e^{-m_{3}\left( T\right) +m_{3}\left( \tau _{2}\right) }. \end{aligned}$$
(11)
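Likelihoods (3), (8) and (11) share the same structure: each regime contributes the log-intensities of its own events minus the increment of its mean value function over its own interval. The Python sketch below exploits this to handle any number of change-points; the function names and toy data are hypothetical.

```python
import math


def log_intensity(t, alpha, sigma):
    """Logarithm of the PLP intensity (2)."""
    return math.log(alpha) - alpha * math.log(sigma) + (alpha - 1.0) * math.log(t)


def mean_value(t, alpha, sigma):
    """PLP mean value function (1)."""
    return (t / sigma) ** alpha


def log_lik_segments(times, total_time, params, change_points):
    """Log-likelihood with one PLP regime per segment.

    With two change-points and three (alpha, sigma) pairs this is the
    logarithm of (11); with an empty list of change-points it is (3).
    """
    bounds = [0.0] + list(change_points) + [total_time]
    ll = 0.0
    for (alpha, sigma), lo, hi in zip(params, bounds[:-1], bounds[1:]):
        ll += sum(log_intensity(t, alpha, sigma) for t in times if lo < t <= hi)
        ll -= mean_value(hi, alpha, sigma) - mean_value(lo, alpha, sigma)
    return ll


# Toy data: six events up to T = 10 and change-points at 3.5 and 7.5.
toy_times = [1.0, 3.0, 4.0, 7.0, 8.0, 9.0]
value = log_lik_segments(toy_times, 10.0,
                         [(1.0, 2.0), (1.0, 1.0), (1.0, 4.0)], [3.5, 7.5])
print(value)
```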

3 A Bayesian analysis

In this section, we introduce a Bayesian analysis for the PLP model. Posterior summaries of interest are obtained using Markov Chain Monte Carlo (MCMC) methods. We first assume an NHPP without change-points. We assume uniform prior distributions \(U\left[ a,b\right] \) with \(a=0\) and \(b=100\) for the parameters \(\alpha \) and \(\sigma \), so as to have approximately non-informative priors. We further assume prior independence among the parameters.

The joint posterior distribution for \(\varvec{\theta }\) given the data \(D_{T}\) is,

$$\begin{aligned} p\left( \varvec{\theta }\mid D_{T}\right) \propto p\left( \varvec{\theta }\right) L\left( \varvec{\theta }\mid D_{T}\right) \end{aligned}$$
(12)

where \(p\left( \varvec{\theta }\right) \) denotes the joint prior distribution and \(L\left( \varvec{\theta }\mid D_{T}\right) \) is the likelihood function given in (3). Simulated samples from the joint posterior distribution for \(\varvec{\theta }\) are obtained using standard MCMC methods, that is, from the full conditional posterior distributions \(p\left( \theta _{i}\mid \theta _{1},\theta _{2},\ldots ,\theta _{i-1},\theta _{i+1},\ldots ,\theta _{k},D_{T}\right) \) for \(i=1,\ldots ,k\), where k is the number of parameters (Gelfand and Smith 1990). A great simplification is obtained using the R2jags package (Su and Yajima 2015) in R (R Core Team 2015), where we only need to specify the distribution of the data and the prior distributions of the parameters.

In a second stage of the Bayesian analysis, we assume the PLP model in the presence of a change-point \(\tau \). In this case, we assume a uniform prior distribution for the change-point in the interval \(\left( 0,T\right) \), and uniform non-informative prior distributions for the parameters \(\alpha _{1}\), \(\alpha _{2}\), \(\sigma _{1}\) and \(\sigma _{2}\), that is, \(U\left[ 0,100\right] \). We further assume prior independence among the parameters.

In the presence of two change-points, we assume uniform priors for the change-point parameters, \(\tau _{1}\sim U\left[ 400,600\right] \) and \(\tau _{2}\sim U\left[ 600,T\right] \), and uniform non-informative priors \(U\left[ 0,100\right] \) for the parameters \(\alpha _{1}\), \(\alpha _{2}\), \(\alpha _{3}\), \(\sigma _{1}\), \(\sigma _{2}\) and \(\sigma _{3}\).
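To make the posterior (12) concrete for the model without change-points, the following Python sketch draws from it with a plain random-walk Metropolis sampler. This is an illustrative stand-in and not the JAGS sampler called through R2jags in this paper; the function names, proposal step sizes, starting point, seed and toy data are all assumptions chosen for the example.

```python
import math
import random


def make_log_posterior(times, total_time, upper=100.0):
    """Log posterior of (alpha, sigma): likelihood (3) with U(0, upper) priors."""
    n = len(times)
    sum_log_t = sum(math.log(t) for t in times)

    def log_post(alpha, sigma):
        # Uniform priors: zero density outside (0, upper), constant inside.
        if not (0.0 < alpha < upper and 0.0 < sigma < upper):
            return -math.inf
        try:
            return (n * math.log(alpha) - n * alpha * math.log(sigma)
                    + (alpha - 1.0) * sum_log_t
                    - (total_time / sigma) ** alpha)
        except OverflowError:
            return -math.inf

    return log_post


def metropolis_plp(times, total_time, n_iter=30000, burn_in=10000, thin=10,
                   steps=(0.1, 0.5), seed=0):
    """Random-walk Metropolis draws of (alpha, sigma) after burn-in and thinning."""
    rng = random.Random(seed)
    log_post = make_log_posterior(times, total_time)
    # Start at a homogeneous process whose rate matches the observed count.
    alpha, sigma = 1.0, min(total_time / len(times), 50.0)
    current = log_post(alpha, sigma)
    draws = []
    for it in range(n_iter):
        cand_alpha = alpha + rng.gauss(0.0, steps[0])
        cand_sigma = sigma + rng.gauss(0.0, steps[1])
        candidate = log_post(cand_alpha, cand_sigma)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < candidate - current:
            alpha, sigma, current = cand_alpha, cand_sigma, candidate
        if it >= burn_in and (it - burn_in) % thin == 0:
            draws.append((alpha, sigma))
    return draws


# Toy data: one event every 5 time units up to T = 100, for illustration only.
toy_times = [float(t) for t in range(5, 100, 5)]
sample = metropolis_plp(toy_times, 100.0, n_iter=6000, burn_in=1000, thin=5)
print(sum(a for a, _ in sample) / len(sample))
```

The same scheme extends to the change-point models by adding the change-points to the parameter vector and replacing the log-likelihood by that of (8) or (11), although a general-purpose sampler is usually more convenient in practice.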

4 Analysis of the SPI data from Campinas

Consider the precipitation data of Campinas corresponding to the drought occurrences for \(SPI-1\), \(SPI-3\), \(SPI-6\) and \(SPI-12\) \(\left( SPI \le -1.0\right) \) from January 01, 1947 to May 01, 2011. This corresponds to a total of \(T = 770\) months. The Bayesian analysis considers three cases. In the first case, we consider an NHPP with PLP intensity function and no change-points. In the second case, we consider the same NHPP, now assuming the presence of one change-point. In the third case, we consider the same NHPP, now assuming the presence of two change-points.

The selection of the best model is based on the Deviance Information Criterion (DIC) (Spiegelhalter et al. 2002), a widely used Bayesian measure of model adequacy. Smaller values of DIC indicate better models. We also discriminate between the models by comparing plots of the accumulated numbers of \(SPI\le -1.0\) with the estimated mean value functions versus time of occurrence. The Bayesian analysis for all models was carried out using the R2jags package (Su and Yajima 2015) in R (R Core Team 2015). Convergence of the Gibbs sampling algorithm was monitored using the usual trace plots of the simulated samples and also using existing Bayesian convergence diagnostics based on different initial values (Gelman and Rubin 1992).
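For reference, the DIC of Spiegelhalter et al. (2002) can be computed from posterior draws as \(DIC=\bar{D}+p_{D}\), where \(D\left( \varvec{\theta }\right) =-2\log L\left( \varvec{\theta }\right) \), \(\bar{D}\) is the posterior mean of the deviance and \(p_{D}=\bar{D}-D\left( \bar{\varvec{\theta }}\right) \). The Python sketch below follows this definition; the function name and the toy log-likelihood are hypothetical, and software packages may estimate \(p_{D}\) in a different way.

```python
def dic_from_draws(log_lik, draws):
    """DIC from posterior draws, using pD = mean deviance - deviance at the mean.

    log_lik takes the parameters as positional arguments and draws is a
    list of parameter tuples sampled from the posterior distribution.
    """
    deviances = [-2.0 * log_lik(*draw) for draw in draws]
    mean_deviance = sum(deviances) / len(deviances)
    n_params = len(draws[0])
    posterior_mean = [sum(draw[j] for draw in draws) / len(draws)
                      for j in range(n_params)]
    deviance_at_mean = -2.0 * log_lik(*posterior_mean)
    p_d = mean_deviance - deviance_at_mean
    return {"Dbar": mean_deviance, "pD": p_d, "DIC": mean_deviance + p_d}


# Toy check with a quadratic log-likelihood and two symmetric draws.
toy = dic_from_draws(lambda x: -0.5 * x * x, [(-1.0,), (1.0,)])
print(toy)
```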

4.1 NHPP model not in the presence of change-points

Assuming the PLP model, we consider uniform non-informative prior distributions for the parameters \(\alpha \) and \(\sigma \) for the four SPI series (\(SPI-1\) with \(n=137\) observations, \(SPI-3\) with \(n=124\) observations, \(SPI-6\) with \(n=116\) observations and \(SPI-12\) with \(n=111\) observations). A burn-in sample of size 10,000 was discarded to eliminate the effect of the initial values. The posterior summaries of interest and the Monte Carlo estimate for DIC based on 2000 simulated Gibbs samples (taking every 100th simulated value) are given in Table 2.
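The burn-in and thinning scheme implies a chain of 10,000 + 2000 × 100 = 210,000 iterations per model. A minimal Python sketch of the bookkeeping, with a hypothetical function name and a list of iteration indices standing in for the simulated chain:

```python
def thin_chain(chain, burn_in, thin):
    """Drop the first burn_in values, then keep every thin-th value."""
    return chain[burn_in::thin]


# Stand-in chain: iteration indices 0, 1, ..., 209999.
chain = list(range(10_000 + 2_000 * 100))
kept = thin_chain(chain, burn_in=10_000, thin=100)
print(len(kept), kept[0], kept[1])
```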

Table 2 Posterior means (standard deviation) for \(\alpha \) and \(\sigma \) (PLP–NHPP without change-points)

4.2 NHPP model in the presence of one change-point

Assuming the PLP model in the presence of one change-point, we also consider uniform non-informative prior distributions for the parameters \(\alpha _{1}\), \(\alpha _{2}\), \(\sigma _{1}\), \(\sigma _{2}\) and \(\tau \). A burn-in sample of size 10,000 was discarded to eliminate the effect of the initial values. The posterior summaries of interest and the Monte Carlo estimate for DIC based on 2000 simulated Gibbs samples (taking every 100th simulated value) are given in Table 3.

Table 3 Posterior means (standard deviation) for \(\alpha _{1}\), \(\alpha _{2}\), \(\sigma _{1}\), \(\sigma _{2}\) and \(\tau \) (PLP–NHPP with one change-point)

4.3 NHPP model in the presence of two change-points

Assuming the PLP model in the presence of two change-points, we also consider uniform non-informative prior distributions for the parameters \(\alpha _{1}\), \(\alpha _{2}\), \(\alpha _{3}\), \(\sigma _{1}\), \(\sigma _{2}\), \(\sigma _{3}\), \(\tau _{1}\) and \(\tau _{2}\). A burn-in sample of size 10,000 was discarded to eliminate the effect of the initial values. The posterior summaries of interest and the Monte Carlo estimate for DIC based on 2000 simulated Gibbs samples (taking every 100th simulated value) are given in Table 4.

Table 4 Posterior means (standard deviation) for \(\alpha _{1}\), \(\alpha _{2}\), \(\alpha _{3}\), \(\sigma _{1}\), \(\sigma _{2}\), \(\sigma _{3}\), \(\tau _{1}\) and \(\tau _{2}\) (PLP–NHPP with two change-points)

From the Monte Carlo estimates of the Deviance Information Criterion (DIC) given in Tables 2, 3 and 4, we observe smaller values of DIC for the PLP model in the presence of two change-points, which indicates a better fit to the SPI data set. We also observe the better fit of the PLP model with two change-points by comparing plots of the accumulated number of \(SPI\le -1.0\) with the estimated mean value functions versus time of occurrence (see Fig. 3).

Fig. 3 Plots of the accumulated number of \(SPI\le -1.0\) in each time with the estimated mean value functions versus time of occurrence. a \(SPI-1\), b \(SPI-3\), c \(SPI-6\) and d \(SPI-12\)

5 Discussion of the results

The use of non-homogeneous Poisson processes with a specified parametric form for the intensity function could be very useful to analyze precipitation data of cities throughout the world and to explain some atypical drought periods. Such data provide information about the epochs at which a drought threshold is violated during a specific period of time, as is the case of the SPI measures (1, 3, 6 and 12-month timescales) for Campinas, Brazil.

With the use of Markov Chain Monte Carlo methods, we obtain in a simple way the posterior summaries of the quantities of interest. The R2jags package in R greatly simplifies the computational work of simulating samples from the posterior distributions of interest.

A common problem with precipitation or other environmental data is the presence of one or more change-points, possibly due to climate changes in recent years. In this case, NHPP models that allow for change-points give a better fit and could be of great use. It is interesting to point out that other parametric forms for the intensity function of the NHPP could be used in the same way as the PLP intensity considered here.

Important interpretations can be drawn from the obtained results. Considering the PLP–NHPP in the presence of two change-points (the best fitted model), we observe from the results of Table 4 that: \(\left( i\right) \) the credible intervals for \(\alpha _{j}\), \(j=1,3\), in the four SPI measures (\(SPI-1\), \(SPI-3\), \(SPI-6\) and \(SPI-12\)) cover the value 1, so the process behaves similarly to a homogeneous Poisson process in these periods, that is, with a constant rate and exponentially distributed inter-arrival times for violations of the SPI threshold \(L=-1.0\) (a normal rainfall pattern), whereas the credible intervals for \(\alpha _{2}\) do not cover the value 1, with Bayesian estimates close to the value 2 for all SPI measures; this gives an increasing intensity in (2), which indicates an atypical behavior of rainfall in the period between the two change-points (a longer period of droughts). This means that in the period between the two estimated change-points, approximately the 560th and 620th months for the four SPI measures, corresponding to the period ranging from February, 1996 to August, 1998, we have many occurrences of SPI measures below the threshold \(L=-1.0\) (similarly to a reliability deterioration model used in reliability studies). \(\left( ii\right) \) It is interesting to observe that the Bayesian estimate for \(\sigma _{1}\) is close to 6.4 and that the Bayesian estimates for \(\sigma _{2}\) and \(\sigma _{3}\) are close to 56. Although this parameter is hard to interpret, we can find estimates of the mean value function (10) for each specified value of time t.

It is important to point out that other modeling approaches could be considered for the climate data set, such as the spline models introduced in Ramsay and Silverman’s (2006) Functional Data Analysis, but this modeling and the comparison of results could be the goal of a future paper. Finally, the class of models used for the analysis of the rainfall data set of Campinas could be used in many other applications with ecological or environmental data, such as air or water pollution data, temperature changes, sea level changes and many others.