1 Introduction

Currently, there is great concern that climate change, coupled with human-induced environmental degradation, is a major threat to contemporary water resources management worldwide (Githui et al. 2009; Huang et al. 2012). Many studies have indicated that global warming alters rainfall patterns, resulting in more frequent extreme weather events such as droughts and floods (Zhang et al. 2009). Climate models project that climate change will increase the risk of drought in some areas of the world and the risk of extreme precipitation and flooding in others (IPCC 2007).

Kenya has frequently witnessed prolonged and severe droughts leading to electric power and water rationing, with negative impacts on the economy (The World Bank 2011). Power rationing and reduced water supplies (especially to Nairobi and its environs) have been attributed to a reduction in available surface water resources in the Tana River Basin (TRB) (Nakaegawa and Wachana 2012). Moreover, increasing water demands lead to conflicts among competing water users that are most pronounced during drought periods (Hisdal and Tallaksen 2003; Santos et al. 2010). Studies have shown that drought in the country affects more than 3.7 million people and that its combined economic impact and related shocks run into hundreds of millions of dollars (The World Bank 2011). Recently, there have been debates on the apparent increase in the intensity and frequency of drought and its possible causes. Even though projections by climate models depict an increase in rainfall in the East African region (IPCC 2007), recent studies (e.g., Rowell et al. 2015; Lyon and Dewitt 2012; Shongwe et al. 2011; Liebmann et al. 2014; Lyon 2014; Ongoma and Chen 2017) have shown a decline in the temporal and spatial distribution of rainfall in the region, which has led to increased intensity and frequency of drought. Some authors have attributed the drought conditions to climate change effects caused by the emission of anthropogenic greenhouse gases into the atmosphere (Williams and Funk 2011). This may increase the uncertainty in the availability of surface water in Kenya, especially given the existing water scarcity problem coupled with the high demand from agriculture, industry, and the burgeoning population (Nakaegawa and Wachana 2012). Hence, the assessment of climate variability in terms of dry and wet events may contribute to more prudent monitoring of climate-related risks and to the development of appropriate adaptation and mitigation strategies (Hayes et al. 2005).

The TRB is a vital resource for socio-economic development of Kenya and for the sustenance and preservation of ecological systems in the basin. It supplies about 95% of the water needs of the capital city of Nairobi and contributes 65–75% of the hydropower production in the country (Baker et al. 2015; Oludhe et al. 2013; Jacobs et al. 2007; Gichuki and Vigerstol 2014). Recently, the government has put in place significant development targets for hydropower, domestic water provision, and irrigated agriculture in the basin as part of the Vision 2030 development plan (Baker et al. 2015). However, the basin has been experiencing frequent hydrological extremes in terms of droughts and floods which have been attributed to climate change and environmental degradation (Nakaegawa and Wachana 2012). A recent study by Kerandi et al. (2016) concluded that precipitation over the TRB has generally been decreasing since the 1997/1998 El Niño rains in Kenya, with 2011 to 2014 recording below normal mean annual precipitation. This means that the increasing water demand in the TRB coupled with climate variability is exerting a lot of pressure on the water resources of the basin. Therefore, it is crucial for water resource managers and developers to understand how climate variability, on both long and short timescales, will affect water availability. This further reinforces the need for studies that will attempt to understand the possible consequences of climate change processes on the future availability of water resources in the basin and to determine the current relationship between climate variability and water resources. In this regard, the significance of a drought index such as the SPEI would be its applicability in drought monitoring, risk assessment and planning, and management of water resources of the TRB.

A number of quantitative drought indices have been proposed for assessing the severity of droughts in water resources planning and risk assessment (Burke et al. 2006; Gao et al. 2017). Among these is the recently developed Standardized Precipitation Evapotranspiration Index (SPEI), an improved drought index well suited to studies of the effect of global warming on drought severity (Vicente-Serrano et al. 2010). The SPEI is good at detecting, monitoring, and exploring the consequences of global warming on drought conditions (Dubrovsky et al. 2009). It has been reported to be superior to other widely used drought indices and is capable of identifying the role of evapotranspiration and temperature variability in drought assessment in the context of global warming (Vicente-Serrano et al. 2010; Potop et al. 2012; Wu et al. 2016). Furthermore, it is now well known that although the primary cause of drought is rainfall deficiency, temperature plays an important role in initiating drought (Shi et al. 2017; Chen and Sun 2015). For this reason, the SPEI has been recommended as an alternative to other drought indices for quantifying anomalies in the accumulated climatic water balance, incorporating potential evapotranspiration (PET) (Stagge and Tallaksen 2014). A number of studies have investigated the occurrence of drought using the SPEI. Yu et al. (2014) calculated the SPEI based on monthly precipitation and air temperature values and found an increase in severe and extreme droughts for the whole of China. Potop et al. (2012) used the Standardized Precipitation Index (SPI) and the SPEI to study the evolution of secular drought from 1901 to 2010 in the lowland regions of the Czech Republic. Their study explored the relationship between extreme dry and extreme wet episodes and vegetable yields in these regions and found that more than 40% of the months during the vegetable-growing season can be affected by moderate and severe drought. Lorenzo-Lacruz et al. (2010) applied the SPI and SPEI to analyze the influence of climate variation on the availability of water resources in the headwaters of the Tagus River basin. Among their findings was that the responses in river discharge and reservoir storage were slightly higher when based on the SPEI rather than the SPI, indicating that although precipitation had a major role in explaining temporal variability in the analyzed parameters, the influence of temperature was not negligible. Chen and Sun (2015), in their study of drought characteristics over China, found differences in the estimates of SPEI calculated using different evapotranspiration parameterizations. Vicente-Serrano et al. (2012) compared the performance of several drought indices for ecological, agricultural, and hydrologic studies and concluded that the SPEI was the best index to capture the effects of summer droughts. Moreover, a recent study on temperature and precipitation variability over the East African region by Ongoma and Chen (2017) showed that there has been an increase in temperature in the region, with significant positive changes observed from the year 1992.

The goal of this study is to determine the historical and recent climatic variability in the TRB based on an assessment of the spatial and temporal variability of dry and wet events in the basin. Specifically, the study will (1) analyze the temporal and spatial evolution and variation of the SPEI over the basin, (2) use principal component analysis (PCA) to identify the spatial and temporal patterns of dry and wet events, and (3) apply the continuous wavelet transform to identify cycles of dry and wet events in the temporal patterns. To the best of our knowledge, no other study of this kind has been conducted in the region, and it therefore contributes to the body of literature on the region. Further, the study is intended to be used for drought monitoring, risk assessment, and water resources planning. The major challenge to prudent risk management approaches is the lack of reliable and updated climatological records that are suitable for risk analysis (Raziei et al. 2011; Hayes et al. 2005). Data availability is the most important limitation that hinders the establishment of drought monitoring and early warning systems in developing countries (Worqlul et al. 2014). To overcome this obstacle, this study used high spatial resolution gridded datasets from the GPCC (rainfall) and the CRU (temperature) for the analysis of dry and wet events in the TRB.

2 Data and methodology

2.1 Study area

The TRB lies between latitudes 0° 0′ 53″ S and 3° 0′ 00″ S and between longitudes 37° 00′ 00″ E and 41° 00′ 00″ E. The basin is marked by complex terrain with very steep topography, which rises from sea level in the coastal plains along the Indian Ocean to over 5000 m above sea level on Mount Kenya (Fig. 1). Estimates of the size of the basin range between 100,000 and 126,000 km² (Baker et al. 2015). The Tana River, which forms the basin, is the longest river in Kenya at approximately 1000 km. Its headwaters are on Mt. Kenya and the Aberdare Range, and the river winds through a densely forested ecosystem to agricultural and rangeland areas before ultimately discharging into the Indian Ocean (Okazawa et al. 2009). The river is a vital resource for the socio-economic development of the country and is important for the sustenance and preservation of ecological systems in the basin. The basin is the source of about 95% of the water needs of the capital city of Nairobi and is also a major source of hydropower and irrigation-fed agricultural production in Kenya (Baker et al. 2015). The basin has a varied climate ranging from humid in the highlands to arid and semi-arid in the lowlands, with a close correlation between elevation and climatic zones. Rainfall characteristics are primarily influenced by topography and proximity to the Indian Ocean (Kerandi et al. 2016). The long-term annual average and long-term monthly average distribution of rainfall and the long-term monthly average minimum and maximum temperatures in the basin are given in Fig. 2.

Fig. 1 The map of the study area showing the topography, the location of the meteorological stations, and the sampled grid cells in the TRB

Fig. 2 The spatial and temporal patterns of annual mean rainfall, minimum, and maximum temperature over the basin (1960–2013)

2.2 Data

This study utilized monthly gridded rainfall data at 0.5° × 0.5° resolution for 54 years (1960–2013) covering the entire TRB, acquired from the Global Precipitation Climatology Centre (GPCC) version 7 (http://gpcc.dwd.de). The temperature data used for calculating potential evapotranspiration were acquired from the Climatic Research Unit (CRU) dataset at the same resolution as the rainfall data. The GPCC provides an updated and globally gridded precipitation estimate extracted from surface rain gauge observations with a minimum of 90% data availability over the years 1951–2000. The GPCC Reanalysis (V7) for the period 1901 to 2013 is based on quality-controlled data from all stations in GPCC’s database. This product is optimized for best spatial coverage and is recommended for water-budget studies (Schneider et al. 2014). The gridded reanalysis data were adopted as a substitute for the scarce rain gauge climatological data. A simple evaluation of the GPCC and CRU datasets against the observed rainfall data measured at six World Meteorological Organization synoptic stations within the basin showed that the GPCC had the higher positive correlation (above 0.9 at all the stations). The GPCC data were therefore used for rainfall, and the CRU data were used solely for the temperature-based derivation of potential evapotranspiration. Gridded rainfall data have been widely used in various hydro-climatological analyses in different parts of the world (Rajeevan et al. 2006; Caramelo and Orgaz 2007; Jury 2010; Raziei et al. 2011; Mahfouz et al. 2016). Jury (2010) used 0.5° × 0.5° gridded precipitation data for the period between 1901 and 2007 from the GPCC dataset to study decadal climate variability in Ethiopia. Likewise, Wagesho et al. (2013) successfully used the GPCC data to determine the temporal and spatial variability of annual and seasonal rainfall over Ethiopia. In this study, the gridded data were extracted from 43 grid cells within the basin as shown in Fig. 1.

2.3 Methodologies

2.3.1 Calculation of the SPEI

The SPEI is a drought index based on precipitation and PET, which describes the degree of deviation of dry and wet conditions by standardizing the difference between precipitation and PET. It can describe water deficit effectively at multiple timescales, reflecting the lag relation between different water resources, precipitation, and evapotranspiration (Liu et al. 2015). The SPEI is calculated from the difference between monthly (or weekly) precipitation and potential evapotranspiration, which is aggregated over the chosen time period and fitted to a probability distribution function (Vicente-Serrano et al. 2010). The difference between precipitation and evapotranspiration (i.e., the moisture deficit) can be negative, and is commonly so in semi-arid and arid regions; therefore, a three-parameter distribution is needed to model the deficit values (Hernandez and Uddameri 2014). The three-parameter log-logistic distribution is commonly applied as it fits the extreme values better, and the fitted cumulative distribution function is transformed to the standard normal distribution, the resulting standardized value being the SPEI (Vicente-Serrano et al. 2010; Zambreski 2016). Positive values of the SPEI indicate above-average moisture conditions while negative values indicate below-normal (dry) conditions.

In this study, the SPEI values were calculated at 6- and 12-month timescales for each grid cell. Calculations were performed using the “SPEI” package available in R (Beguería and Vicente-Serrano 2013). The calculation of the SPEI is briefly described as follows:

(1) Calculate the difference between precipitation and PET on a monthly basis (Eq. 1):

$$ {D}_j={P}_j- PE{T}_j $$
(1)

The PET was calculated using the Hargreaves equation, which has limited data requirements, does not suffer the inherent limitations of the Thornthwaite equation, and performs relatively close to the standard FAO Penman-Monteith (FAO PM) equation (Beguería et al. 2014; Droogers and Allen 2002; Hargreaves and Allen 2003).
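As an illustration of step (1), the following is a minimal Python sketch of a Hargreaves-type PET estimate and the monthly water balance of Eq. 1. It is not the authors' code (the study used the R SPEI package). The function names are hypothetical, the radiation term follows the standard FAO-56 formulas for extraterrestrial radiation, and the sketch assumes low latitudes such as those of the TRB, where the sunset hour angle is always defined.

```python
import math


def extraterrestrial_radiation_mm(lat_deg, day_of_year):
    """Daily extraterrestrial radiation as equivalent evaporation (mm/day).

    Uses the FAO-56 expressions; valid away from polar latitudes.
    """
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Sunset hour angle
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))
    ra_mj = (24.0 * 60.0 / math.pi) * 0.0820 * dr * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )
    return 0.408 * ra_mj  # MJ m-2 day-1 converted to mm/day


def hargreaves_pet(tmin, tmax, ra_mm, days=1):
    """Hargreaves PET (mm) over `days` days from min/max temperature (deg C)."""
    tmean = (tmin + tmax) / 2.0
    trange = max(tmax - tmin, 0.0)  # guard against inconsistent inputs
    return 0.0023 * ra_mm * (tmean + 17.8) * math.sqrt(trange) * days


def water_balance(precip, pet):
    """Eq. 1: D_j = P_j - PET_j for each month j."""
    return [p - e for p, e in zip(precip, pet)]
```

For example, `hargreaves_pet(16.0, 28.0, extraterrestrial_radiation_mm(-1.0, 80), days=31)` gives a monthly PET for a hypothetical cell near 1° S in March, which can then be passed with the monthly rainfall to `water_balance`.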

(2) Calculate the accumulated difference between precipitation and PET at different timescales. The accumulated difference \( \left({X}_{i,j}^k\right) \) at the k-month timescale is calculated using Eq. 2:

$$ {X}_{i,j}^k=\begin{cases}\sum \limits_{l=13-k+j}^{12}{D}_{i-1,l}+\sum \limits_{l=1}^j{D}_{i,l}, & \text{if } j<k\\ {}\sum \limits_{l=j-k+1}^j{D}_{i,l}, & \text{if } j\ge k\end{cases} $$
(2)

where \( {X}_{i,j}^k \) is the accumulated difference between precipitation and the PET at the k-month timescale in the j-th month of the i-th year, and \( {D}_{i,l} \) is the monthly difference between the precipitation and the PET in the l-th month of the i-th year.
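Both cases of Eq. 2 amount to a trailing k-month sum once the monthly differences are laid out as one chronological series, for k up to 12 as used in this study. A minimal Python sketch under that reading is given below (the function name is hypothetical; the first k − 1 months have no complete window and are returned as `None`).

```python
def accumulate(d, k):
    """Trailing k-month sums of the monthly water balance series d.

    d is a chronological list of D values (January of year 1 first).
    Month j of year i sits at index 12 * (i - 1) + (j - 1), so a window
    that starts in the previous year covers the j < k case of Eq. 2.
    """
    out = [None] * len(d)
    for t in range(k - 1, len(d)):
        out[t] = sum(d[t - k + 1 : t + 1])
    return out
```

For instance, with k = 6 the value for January of the second year (j = 1 < k) sums August to December of the first year plus that January, exactly as in the first case of Eq. 2.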

(3) Normalize the \( {X}_{i,j}^k \) data sequence. Because the original data sequence \( {X}_{i,j}^k \) may contain negative values, the SPEI uses the three-parameter log-logistic probability distribution (Vicente-Serrano et al. 2010). For the data sequences at all timescales, the cumulative distribution function of the log-logistic distribution, F(X), is given in Eq. 3:

$$ F(X)={\left[1+{\left(\frac{\alpha }{X-\gamma}\right)}^{\beta}\right]}^{-1} $$
(3)

where α, β, and γ are the scale, shape, and origin parameters, respectively, which can be calculated using the equations proposed by Vicente-Serrano et al. (2010).

The probability p of exceeding a given \( {X}_{i,j}^k \) value is:

$$ p=1-F(X) $$
(4)

If p ≤ 0.5,

$$ w=\sqrt{-2\ln p} $$
(5)
$$ SPEI=w-\frac{C_0+{C}_1w+{C}_2{w}^2}{1+{d}_1w+{d}_2{w}^2+{d}_3{w}^3} $$
(6)

If p > 0.5,

$$ w=\sqrt{-2\ln \left(1-p\right)} $$
(7)
$$ SPEI=-\left(w-\frac{C_0+{C}_1w+{C}_2{w}^2}{1+{d}_1w+{d}_2{w}^2+{d}_3{w}^3}\right) $$
(8)

where C0 = 2.515517, C1 = 0.802853, C2 = 0.010328, d1 = 1.432788, d2 = 0.189269, and d3 = 0.001308 (Gao et al. 2017).
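The standardization step can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: it assumes the distribution parameters α, β, and γ have already been estimated, and it follows the convention of Vicente-Serrano et al. (2010) that for p > 0.5 the probability is replaced by 1 − p and the sign of the result is reversed. The function names are hypothetical.

```python
import math

# Constants of the rational approximation quoted in the text
C0, C1, C2 = 2.515517, 0.802853, 0.010328
D1, D2, D3 = 1.432788, 0.189269, 0.001308


def log_logistic_cdf(x, alpha, beta, gamma):
    """Eq. 3: three-parameter log-logistic CDF, defined for x > gamma."""
    return 1.0 / (1.0 + (alpha / (x - gamma)) ** beta)


def spei_from_cdf(f):
    """Map a cumulative probability 0 < f < 1 to a standard normal score."""
    p = 1.0 - f  # exceedance probability (Eq. 4)
    if p <= 0.5:
        w = math.sqrt(-2.0 * math.log(p))
        sign = 1.0
    else:
        w = math.sqrt(-2.0 * math.log(1.0 - p))
        sign = -1.0  # sign reversed in the lower half of the distribution
    ratio = (C0 + C1 * w + C2 * w * w) / (1.0 + D1 * w + D2 * w * w + D3 * w ** 3)
    return sign * (w - ratio)
```

The result approximates the inverse of the standard normal CDF evaluated at F(X), so a wet month with F(X) = 0.95 maps to an SPEI near 1.64 and a dry month with F(X) = 0.05 to an SPEI near −1.64.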

The calculated values of the SPEI are classified as shown in Table 1 and are used to analyze the characteristics of dry and wet events in the basin in terms of their duration, severity, intensity, and frequency of occurrence. The duration of an event is the length of time (months) that the SPEI is consecutively at or beyond a truncation level (at or below it for dry events, at or above it for wet events). In this study, the threshold used was SPEI ≤ − 1 for a dry event and SPEI ≥ 1 for a wet event. The frequency is the number of months in which the SPEI value meets a set value (Table 1) divided by the number of months in the entire series. The severity and intensity were calculated for all the sampled grid cells according to Zambreski (2016) as shown below:

Table 1 Classification of the severity of dry and wet events based on the calculated SPEI

(i) Severity is the cumulative sum of the index value over the duration of the event:

$$ S=\sum \limits_{i=1}^{Duration}{Index}_i $$
(9)

(ii) Intensity of an event is the severity divided by the duration. Events that have shorter durations and higher severities will have larger intensities.

$$ I=\frac{Severity}{Duration} $$
(10)
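The event characteristics defined above can be extracted from an SPEI series with a simple run-length scan. The Python sketch below is illustrative only: the function names are hypothetical, and the severity is returned as the raw signed sum of Eq. 9, so it is negative for dry events.

```python
def find_events(spei, threshold=-1.0):
    """Return (start, duration, severity, intensity) for each event.

    A negative threshold selects dry events (SPEI <= threshold); a
    positive threshold selects wet events (SPEI >= threshold).
    """
    dry = threshold < 0
    events, start, total = [], None, 0.0
    # A trailing 0.0 never meets either threshold, so it closes a final run
    for t, value in enumerate(list(spei) + [0.0]):
        hit = value <= threshold if dry else value >= threshold
        if hit:
            if start is None:
                start, total = t, 0.0
            total += value  # Eq. 9: cumulative sum of the index
        elif start is not None:
            duration = t - start
            events.append((start, duration, total, total / duration))  # Eq. 10
            start = None
    return events


def frequency(spei, threshold=-1.0):
    """Share of months in the series that meet the threshold."""
    dry = threshold < 0
    hits = sum(1 for v in spei if (v <= threshold if dry else v >= threshold))
    return hits / len(spei)
```

Calling `find_events(series, threshold=1.0)` on the same series returns the wet events, and multiplying `frequency` by 100 gives the percentage frequency.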

2.3.2 Principal component analysis

Principal component analysis (PCA) is a common way of identifying patterns in climatic data and expressing the data in such a way as to highlight their similarities and differences (Santos et al. 2010; Zhao et al. 2012). It is basically a data reduction method, which explains the correlation among several correlated variables in terms of a small number of underlying uncorrelated factors, or principal components, without extreme loss of information. This study uses the PCA to capture the spatial patterns of co-variability of dryness/wetness based on the SPEI series at each grid cell. The original inter-correlated SPEI variables at the different grid cells are \( {X}_{i,1},{X}_{i,2},\dots, {X}_{i,k} \), where k is the number of grid cells in the basin (= 43) and i indexes the time steps of the SPEI series at each grid cell. The principal components (PCs) \( {Y}_{i,1},{Y}_{i,2},\dots, {Y}_{i,k} \) are produced for the same time steps using linear combinations of the original variables according to Eq. 11:

$$ \left\{\begin{array}{c}{Y}_{i,1}={a}_{11}{X}_{i,1}+{a}_{12}{X}_{i,2}+\dots +{a}_{1k}{X}_{i,k}\\ {}{Y}_{i,2}={a}_{21}{X}_{i,1}+{a}_{22}{X}_{i,2}+\dots +{a}_{2k}{X}_{i,k}\\ {}\vdots \\ {}{Y}_{i,k}={a}_{k1}{X}_{i,1}+{a}_{k2}{X}_{i,2}+\dots +{a}_{kk}{X}_{i,k}\end{array}\right. $$
(11)

In this combination, the Y variables are mutually orthogonal and uncorrelated, such that \( {Y}_{i,1} \) explains the largest share of the variance, \( {Y}_{i,2} \) explains the largest share of the remaining variance, and so on. The coefficients of the linear combinations are called “loadings” and represent the weights of the original variables in the PCs (Santos et al. 2010). Detailed descriptions of the PCA procedure are abundant in the literature (e.g., Santos et al. 2010; Bordi et al. 2004).
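To make the linear combination of Eq. 11 concrete, the following Python sketch extracts only the first principal component from a set of SPEI series using power iteration on the correlation matrix. This is a swapped-in, dependency-free technique chosen for illustration, not the procedure used in the study; a full eigen-decomposition would normally be used to obtain all components. The function names are hypothetical, and the sketch assumes the leading eigenvalue is well separated, as is typical for positively correlated grid cells.

```python
import math


def standardize(columns):
    """Centre and scale each series (one list per grid cell)."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(z):
    """Correlation matrix of already standardized series."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def leading_component(r, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of r."""
    k = len(r)
    vec = [1.0 + 0.1 * a for a in range(k)]  # arbitrary non-uniform start
    for _ in range(iterations):
        new = [sum(r[a][b] * vec[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    rv = [sum(r[a][b] * vec[b] for b in range(k)) for a in range(k)]
    return sum(v * w for v, w in zip(vec, rv)), vec


def first_pc(columns):
    """Weights a_1j, scores Y_i1, and explained variance share of PC1."""
    z = standardize(columns)
    eigenvalue, weights = leading_component(correlation_matrix(z))
    scores = [sum(w * z[c][t] for c, w in enumerate(weights))
              for t in range(len(z[0]))]
    return weights, scores, eigenvalue / len(columns)
```

The returned weights correspond to the coefficients of the first row of Eq. 11, and the explained share is the leading eigenvalue divided by the number of grid cells.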

2.3.3 Wavelet transform analysis

Wavelet transform is a powerful method to characterize the frequency, intensity, time position, and duration of the variations in a climate data series by revealing the localized time and frequency information (Zhao et al. 2012). The wavelet transform can be used to analyze time series that contain non-stationary power at many different frequencies (Torrence and Compo 1998; Santos and Ideião 2005). The wavelet transform has been widely applied to the fields of climatic and hydrological changes. The wavelet transform was computed using the “WaveletComp” package available in R-program (Rosch and Schmidbauer 2014).

In the continuous wavelet transform, it is assumed that \( {X}_n \) is a time series with equal time interval ∆t (1 month in this study). One particular wavelet, the Morlet, is defined in Eq. 12:

$$ {\psi}_0\left(\eta \right)={\pi}^{-1/4}{e}^{i{\omega}_0\eta }{e}^{-{\eta}^2/2} $$
(12)

where ω0 is the dimensionless frequency and η is the non-dimensional time parameter (Torrence and Compo 1998; Zhao et al. 2012). The continuous wavelet transform of a discrete sequence \( {X}_n \) is defined as the convolution of \( {X}_n \) with a scaled and translated version of ψ0(η):

$$ {W}_n(s)=\sum \limits_{n^{\prime }=0}^{N-1}{X}_{n^{\prime }}{\psi}^{\ast}\left[\frac{\left({n}^{\prime }-n\right)\Delta t}{s}\right] $$
(13)

where the asterisk represents the complex conjugate, s is the dilation parameter used to change the scale, and n is the translation parameter. By varying the wavelet scale s and translating along the localized time index n, one can construct a picture showing both the amplitude of any features versus the scale and how this amplitude varies with time (Torrence and Compo 1998).
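The transform can be evaluated directly from its definition for a single time index and scale, as in the Python sketch below. It is a didactic sketch, not the WaveletComp implementation used in the study: it assumes ω0 = 6, applies the scale normalization and the scale-to-period conversion of Torrence and Compo (1998), and ignores edge effects and significance testing. The function names are hypothetical.

```python
import cmath
import math

OMEGA0 = 6.0  # assumed dimensionless frequency


def morlet(eta, omega0=OMEGA0):
    """Morlet mother wavelet of Eq. 12."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * eta) * math.exp(-eta * eta / 2.0)


def cwt_at(x, n, scale, dt=1.0):
    """Wavelet coefficient W_n(s) of series x at time index n and scale s."""
    total = 0j
    for n_prime, value in enumerate(x):
        eta = (n_prime - n) * dt / scale
        total += value * morlet(eta).conjugate()
    return total * math.sqrt(dt / scale)  # normalization to unit energy


def scale_for_period(period, omega0=OMEGA0):
    """Morlet scale whose equivalent Fourier period equals `period`."""
    return period * (omega0 + math.sqrt(2.0 + omega0 ** 2)) / (4.0 * math.pi)
```

For a monthly series, `abs(cwt_at(x, n, scale_for_period(24.0))) ** 2` gives the wavelet power at month n for a 2-year period; repeating this over all n and a range of periods produces the time-period picture described above.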

The significance of wavelet power can be assessed relative to the null hypothesis that the signal is generated by a stationary process with a given background power spectrum. The distinctive red noise characteristics of the time series are modeled by a first-order auto-regressive process. Torrence and Compo (1998) and Lau and Weng (1995) give a detailed description of wavelet transform analysis. Significance testing of the wavelet transform is based on the assumption that the time series has a mean power spectrum; the significance of a peak in the wavelet power spectrum relative to this background spectrum is used to determine the confidence level for potential periodicities (Hartmann et al. 2012). In this study, the periodicities were analyzed at a confidence level of 90%.

3 Results and discussion

3.1 Temporal evolution and frequency of occurrence of dry and wet events in the basin

The time series of the SPEI at 6- and 12-month timescales were calculated using a 54-year-long series of rainfall and temperature data (1960–2013) over 43 grid cells in the basin. Figure 3 shows the evolution of the SPEI for the 6- and 12-month timescales averaged over all the grid cells in the TRB. The evolution of the mean SPEI over different parts of the TRB with similar climatic and physiographic features was computed and is shown in Fig. 4. It is evident that the shorter timescale (SPEI6) showed a higher temporal frequency of dry and wet events, whereas the temporal frequency stabilizes for the 12-month timescale. This shows that the SPEI at a longer timescale responds more slowly and coherently to changes in monthly rainfall and temperature, revealing clear periods of annual and multi-year dry and wet events. Longer timescales are therefore better suited to the detection of historically significant events, while shorter timescales show the frequent seasonal and inter-annual variations (Łabedzki 2007). There are three contrasting periods in the evolution of dry and wet events for both the 6- and 12-month SPEI. Dry events dominated the period between 1960 and 1980, wet events were dominant from 1980 to the later part of the 1990s, and the period between the late 1990s and the 2000s had a mix of both dry and wet events, although dry events were evidently dominant. Both timescales showed that the 1970s had the longest duration of dry events (consecutive negative SPEI values), implying that the dry events were dominant. The SPEI was able to identify some of the documented major drought and flood episodes in Kenya, e.g., 1964–1965, 1973–1974, 1983–1984, 1999–2000, and 2009–2011 for drought, and 1978–1979 and 1997–1998 for floods (Wambua et al. 2015; Mwale and Gan 2004). The driest month for both timescales of the SPEI occurred in August 1971 in grid cell 31, while the wettest month occurred in March 1998 in grid cell 7.
In general, the wettest period in the entire basin was between 1997 and 1998, clearly showing the effects of the El Niño rainfall experienced during that period. It can be seen from the variations in the occurrence of dry and wet events that the basin was prone to hydrological extremes in terms of both droughts and floods during the study period. The SPEI is able to clearly indicate the onset and cessation of a dry and/or wet event, and these are found to vary from one grid cell to another and from one timescale to the other. Tables 2 and 3 show the duration, severity, and intensity of some of the major dry and wet events. The duration (persistence) of a dry/wet event is given by the cumulative time that the SPEI is consecutively less than or greater than a designated truncation value. In this study, the threshold value is SPEI ≤ − 1 for a dry event and SPEI ≥ 1 for a wet event. The severity of an event is the cumulative sum of the index value over its duration, while the intensity of an event is the event severity divided by the event duration (Zambreski 2016), as given in Eqs. 9 and 10. The longest duration of dry events for SPEI6 was 25 months, observed in grid cells 18 and 19, while that for SPEI12 was 52 months, observed in grid cell 19. The longest duration of wet events for SPEI6 was 23 months, observed in grid cells 13 (May 1982–March 1984) and 39 (Dec 1988–Oct 1990), while for SPEI12 the longest duration was observed in grid cells 21 and 39 between November 1992 and January 1996. It can be seen in Table 2 that the most severe dry event for SPEI6 was observed in grid cell 31, while for SPEI12 it occurred in grid cell 19. The most intense event was observed in grid cell 31 for both SPEI6 and SPEI12. As shown in Table 3, the most severe wet event for SPEI6 occurred in grid cell 39 between December 1988 and October 1990 and had a magnitude of 40.03. For SPEI12, the most severe wet event occurred in grid cell 21 (Nov 1992–Jan 1996) with a magnitude of 57.22.

Fig. 3 The evolution of the mean SPEI for 6- and 12-month timescale over the TRB showing the variation in the duration, severity and intensity of dry and wet events

Fig. 4 The evolution of the dry events (red color) and wet events (blue color) for SPEI6 (left panels) and SPEI12 (right panels) over (I) the highlands, (II) coastal, (III) northern and (IV) southern parts of the TRB

Table 2 The duration, severity and intensity of occurrence of some of the major dry events (SPEI ≤ − 1) in the basin
Table 3 The duration, severity, and intensity of occurrence of some of the major wet events (SPEI ≥ 1) in the basin

Usually, extreme dry and wet events are representative indicators of a changing climate and are the dominant factors that affect socio-economic and ecological development. The number of months in which the various categories of events (Table 1) occurred during the study period is given in Table 4. It can be seen that, on a given timescale, near normal and moderate events occur most frequently and extreme events occur least frequently. The number of extreme dry events ranged from 3 to 16 for SPEI6 and from 0 to 19 for SPEI12. The number of severe dry events ranged from 18 to 49 for SPEI6 and from 23 to 48 for SPEI12. The number of extreme wet events ranged from 6 to 22 for SPEI6 and from 7 to 19 for SPEI12. The prevalence of the dry and wet events was investigated for each timescale based on the percentage occurrence of each event (within each category) for all the grid cells with respect to the total number of months in the same category and timescale. The aim is to identify areas that frequently encounter extreme and severe weather events at comparable timescales, based on their percentage occurrence. The frequency of occurrence is the number of months in which the SPEI value attains a set threshold value (see Table 1) divided by the number of months in the entire series. The spatial patterns of the frequency of extreme dry and wet events and of severe dry and wet events for the 6- and 12-month timescales are shown in Figs. 5 and 6. The frequency of extreme dry events ranged from 0.16% (in grid cells 5 and 17) to 2.49% (in grid cells 11, 19, and 31) for SPEI6 and from 0 to 2.98% for SPEI12. As shown in Fig. 5, the extreme dry events for SPEI6 were most frequent in a zone extending from the southwestern through the middle to the northern part of the basin. The extreme wet events were mainly found in the southeastern part (stretching from the coast towards the interior) and the northeastern part of the basin.
The extreme dry events for SPEI12 showed a high frequency in the northern part of the basin, while the extreme wet events had a higher frequency in the northeastern and southeastern parts of the basin. The frequency of occurrence of severe dry events ranged from 2.8 to 7.62% for SPEI6 and from 3.61 to 7.54% for SPEI12, as shown in Fig. 6. For SPEI6, the severe dry events had a high frequency of occurrence in the eastern part of the basin, while the severe wet events showed a higher frequency of occurrence in the southwestern part and a few isolated grid cells in the eastern part of the basin. The severe dry events for SPEI12 also affected the eastern part of the basin, albeit on a larger scale than the SPEI6 events. The severe wet events showed a high frequency of occurrence in the southwestern and, to a lesser extent, the southern parts of the basin.

Table 4 The frequency (number of months in which the different classifications occur) of dry/wet events of SPEI6 and SPEI12 for all the grid cells in the TRB
Fig. 5 The spatial pattern of the frequency of extreme dry events (top panel) and extreme wet events (bottom panel) computed for the 6- and 12-month SPEI

Fig. 6 The spatial pattern of the frequency of severe dry events (top panel) and severe wet events (bottom panel) computed for the 6- and 12-month SPEI

3.2 Spatial patterns of SPEI in the TRB

The SPEI series calculated for the 43 grid cells in the basin between 1960 and 2013 were used to analyze the spatial patterns of the distribution of dry and wet events based on the principal component analysis (PCA). From the results of the PCA, the first two leading components, which together accounted for 84% (SPEI6) and 87% (SPEI12) of the total variance, were retained. The first principal component (PC1) explains 73% and 79% of the variance for SPEI6 and SPEI12, respectively, and its loading has spatially homogeneous negative values over the whole basin, while the second loading shows both positive and negative values across the basin. The second principal component explained a smaller percentage of the variance for both timescales (10.3% and 8.2% for SPEI6 and SPEI12, respectively), suggesting that this component represents more localized spatial patterns of the SPEI (Santos et al. 2010). This means that it is possible to have extremely dry areas during specific periods while other areas are wet. The spatial patterns of the loadings obtained from the two PCAs, one for each timescale, are presented in Fig. 7.

Fig. 7 The spatial pattern of the first two dominant PCA loadings for SPEI6 (top) and SPEI12 (bottom) in the TRB

The first loading for SPEI6 characterizes the southeastern and the western part of the basin whereas the loading for SPEI12 highlights the middle part. The southeastern part is the area adjoining the Indian Ocean and the western part of the basin encompasses regions of high elevation. These two parts of the basin receive relatively higher rainfall compared with other parts of the basin. The PC1 loading for SPEI12 shows high values in the middle part of the basin. This region is mainly arid and semi-arid with low rainfall and high temperatures. The corresponding PC scores of the loadings for the two timescales are given in Fig. 8. The associated PC1 scores for both timescales describe the temporal behavior of the SPEI in the basin. The scores show a pattern of downward (negative) trend from 1960 to 1995 and thereafter the series stabilizes. The negative values of the PC1 loadings may mean that these regions of the basin have been affected by more frequent dry events. The second PC loading for both timescales is mainly representative of the eastern part of the basin. The second PC scores for SPEI6 show high frequency oscillation without any noticeable trend from 1960 to around 1992 followed by a noticeable upward trend from 1992 to 2000 and henceforth a downward trend to the end of the period. The second PC scores for SPEI12 similarly did not feature any noticeable pattern from 1960 to 1992. Thereafter, there was a noticeable downward trend from 1992 to 2002 and finally a sharp upward trend.

Fig. 8

The temporal pattern of PCA scores corresponding to the first two dominant loadings for SPEI6 and SPEI12 in the TRB

3.3 Wavelet transform of SPEI in TRB

Wavelet transform analysis of the PC scores was performed to show the inter-annual variability of the SPEI in the basin. The wavelet power spectra of the first two PC scores for the two timescales (SPEI6 and SPEI12) are shown in Fig. 9. In this study, the confidence level was set to 90%. The wavelet power spectra of the PC1 scores show similar features for both timescales, with significant periodicity of 1 to 2 years in 1960–1968, 1970–1973, and 1980–1985. There is also a stable higher-frequency oscillation at about 2–7 years between 1963 and 1980, and a further stable oscillation of about 2 to 7 years between 1995 and 2010. Significant periodicities of about 2 to 3 years are distributed in the spectrum of the PC2 scores and are found between 1960 and 1985 and between 2000 and 2005. A 2–3 year period means that a year with positive SPEI is followed by a year with negative SPEI and then either immediately by a year with positive SPEI or by another year with negative SPEI and then a year with positive SPEI (Hartmann et al. 2012). Similar results have been reported by studies that undertook spectral analysis of annual and seasonal rainfall variability over East Africa. A study of trend and periodicity in annual rainfall over East Africa by Rodhe and Virji (1976) revealed major peaks centered on 2–2.5, 3.5, and 5.6 years. Ogallo (1982) showed the existence of three major peaks, centered on the Quasi-Biennial Oscillation (QBO) of 2.5–3.7 years, ENSO of 4.8–6 years, and the sunspot cycle of 10–12.5 years. Nicholson and Nyenzi (1990) observed a strong quasi-periodic fluctuation in East African rainfall with a timescale of 5–6 years, corresponding to ENSO and to sea surface temperature (SST) fluctuations in the equatorial Indian and Atlantic Oceans. In addition, relatively high wavelet power for PC2 of SPEI12 occurred from about 1980 to 1990.
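As an illustration of how a wavelet power spectrum of a PC score series can be obtained, the sketch below computes a continuous wavelet transform with a Morlet mother wavelet in Fourier space. The choice of the Morlet wavelet, the nondimensional frequency `omega0 = 6`, and the function name `morlet_power` are assumptions, since the wavelet used in the study is not stated here. The input is a synthetic monthly series with a built-in 3-year cycle and is not the study's data. The sketch omits zero padding, the cone of influence, and the 90% significance test.

```python
import numpy as np


def morlet_power(series, dt, periods, omega0=6.0):
    """Morlet wavelet power |W|^2, shape (n_periods, n_times).

    series  : 1-D array sampled every `dt` years
    periods : Fourier periods (years) at which to evaluate the transform
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = x.size
    omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)  # angular frequencies
    x_hat = np.fft.fft(x)
    # Relation between Fourier period and wavelet scale for the Morlet wavelet.
    fourier_factor = 4 * np.pi / (omega0 + np.sqrt(2 + omega0**2))
    scales = np.asarray(periods, dtype=float) / fourier_factor
    power = np.empty((scales.size, n))
    for i, s in enumerate(scales):
        # Normalized Morlet daughter wavelet in Fourier space
        # (nonzero only for positive frequencies).
        psi_hat = (np.pi**-0.25 * np.sqrt(2 * np.pi * s / dt)
                   * np.exp(-0.5 * (s * omega - omega0) ** 2) * (omega > 0))
        # Circular convolution via FFT; no padding, so edges wrap around.
        power[i] = np.abs(np.fft.ifft(x_hat * psi_hat)) ** 2
    return power


# Synthetic monthly series, 1960-2013, with a 3-year cycle plus noise.
rng = np.random.default_rng(1)
dt = 1.0 / 12.0
t = 1960 + dt * np.arange(648)
pc_score = np.sin(2 * np.pi * t / 3.0) + 0.5 * rng.standard_normal(t.size)

periods = np.arange(1.0, 8.5, 0.5)  # 1 to 8 years
power = morlet_power(pc_score, dt, periods)
peak_period = periods[np.argmax(power.mean(axis=1))]
print("period with highest time-averaged power (years):", peak_period)
```

Averaging the power over time gives a global spectrum whose peak recovers the period built into the synthetic series. Applied to a real PC score series, the time-resolved `power` array would correspond to the kind of spectrum plotted in Fig. 9.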

Fig. 9

The continuous wavelet transform of the PC1 and PC2 scores for SPEI6 (top) and SPEI12 (bottom)

4 Discussion and conclusion

In this study, the characteristics of extreme weather events over the TRB were investigated based on the 6- and 12-month SPEI for the period 1960 to 2013. It is now accepted that global warming has altered the patterns of rainfall, resulting in more frequent extreme weather events such as droughts and floods. Given that the TRB ecosystem is highly vulnerable to climate variability and environmental degradation, the SPEI was considered the most appropriate index because of its ability to account for potential impacts of climate change in the basin. The spatial and temporal evolution of dry and wet events is captured by both the 6- and 12-month SPEI. Dry conditions were predominant from the 1960s to the 1980s and from 2000 to 2013, while wet conditions were experienced between 1980 and 1998. The temporal evolution of dry/wet events in the basin may be due to the high variability of seasonal and annual rainfall in the East African region. Some studies (e.g., Lyon and Dewitt 2012; Shongwe et al. 2011; Lyon 2014) have suggested that the increase in the frequency of drought conditions after 1998 in the East African region is due to multi-decadal variability of SSTs in the tropical Indian and Pacific Oceans. Lyon and Dewitt (2012) observed that the short rains (October–December) in the region showed a robust relationship with the El Niño-Southern Oscillation (ENSO) on seasonal to interannual timescales, while the long rains are mostly linked to sea surface temperature (SST) anomalies. Therefore, the high variability of rainfall attributed to La Niña, El Niño, and SSTs could cause rainfall anomalies leading to a decline (or increase) in total seasonal and/or annual rainfall in the basin. For example, La Niña events contributed significantly to the persistent dry events in Kenya in 2010/2011, while the El Niño events of 1997 and 1998 caused extreme wet events during that period (Kisaka et al. 2015). This could explain the spatial heterogeneity and non-synchronous occurrence of dry/wet episodes over the basin, where some extreme episodes occurred in specific periods in parts of the basin with no corresponding dry/wet events in other regions of the study domain. The occurrence of long episodes of dry and wet events in the basin indicates its susceptibility to climate variability and change.

The PCA revealed spatial heterogeneity in the 6- and 12-month SPEI variations over the investigated period, with the first two loadings explaining 84% and 87% of the variability across the basin for SPEI6 and SPEI12, respectively. In particular, the first loading of the 6-month timescale characterized the southeastern coastal lowlands and the western region of the basin, while that of the 12-month timescale highlighted the middle part of the basin. These patterns are related to distinct geographical areas and are associated with distinct temporal variations. The leading component of the 12-month SPEI shows a strong relationship between rainfall distribution and dry/wet events in the region: the most significant patterns of dry/wet events were found over the coastal and highland regions, while less significant patterns were observed over the arid lowlands of the basin. This spatial inhomogeneity of dry/wet events in the basin can be viewed in the context of climate diversity, strong rainfall seasonality, complex topography, and the effects of the Indian Ocean. The long rains (March–May) season is more pronounced in the highlands, while the coastal region usually experiences more enhanced short rains (October–December). The spectral analysis of the PC scores shows the influence of these effects on the periodicity of dry/wet events. A periodicity of 2.3 years was reported by Kabanda and Jury (1999) to dominate the interannual cycle of rainfall variability over northern Tanzania.

The results show that the SPEI is a useful tool for analyzing the spatial and temporal pattern of dry and wet conditions in the TRB. Several studies have shown that the index is valuable in planning and decision-making processes as well as in drought and flood mitigation. The capability of the SPEI to identify the beginning and end of dry/wet episodes and their spatial variability, together with the advantage that it can be calculated at multiple timescales, makes it a potential tool for monitoring hydrological conditions and drought/flood risks. Although gridded datasets are useful for climate variability studies in data-scarce regions like Kenya and can complement the scarce observation data, there is a need for further studies that use high-quality observation data so as to minimize uncertainty in the analysis. Furthermore, the availability of credible high-quality data will enable exhaustive studies of the effect of temperature change and atmospheric circulation on the spatial and temporal variability of dry and wet events in the basin. This is because empirical studies (e.g., Vicente-Serrano et al. 2010) have shown that although rainfall is the main variable determining drought/flood conditions, the rise in temperature has important effects on the severity of dry/wet events.