1 Introduction

The amount of soil water available to crops depends on rainfall onset, length and cessation, which influence the success or failure of a cropping season (Ngetich et al. 2014). Understanding climatic parameters, rainfall in particular, is a critical step towards improving the socioeconomic well-being of smallholder farmers and optimizing agricultural productivity. This is particularly important in Sub-Saharan Africa (SSA), where agriculture is principally rain-fed and rainfall is highly variable (Jury 2002). The drier parts of Kenya’s Central Highlands in eastern Kenya continue to experience highly unpredictable rainfall patterns and persistent dry spells/droughts coupled with high evapotranspiration (2000–2300 mm year⁻¹) (Micheni et al. 2004). Generally, the annual rainfall total is sufficient; however, it has been reported to be poorly distributed over time (Kimani et al. 2003), with 25 % of the annual rain often falling within a couple of rainstorms. Consequently, crops suffer from water stress, often leading to complete crop failure (Meehl et al. 2007). Recha et al. (2011) noted that most studies do not provide information on the much-needed character of within-season variability despite its critical influence on soil-water distribution and productivity.

There has been continued interest in understanding seasonal rainfall patterns by evaluating variables such as rainfall amount, rainy days, lengths of growing seasons and dry spell frequencies (e.g. Mugalavai et al. 2008; Ngetich et al. 2014). Studies by Sivakumar (1991), Seleshi and Zanke (2004) and Tilahun (2006) noted high variations in annual and seasonal rainfall totals and rainy days in Ethiopia and the Sudano-Sahelian regions. Studies on rainfall patterns in the region have been based principally on annual averages, thus missing within-season rainfall characteristics (Barron et al. 2003). However, understanding the average amount of rain per rainy day and the mean duration between successive rain events aids in understanding long-term variability and patterns (Akponikpè et al. 2008). Nonetheless, most meteorological stations in Kenya’s Central Highlands, which are the sole sources of climatic data, each represent only a single location. In Sub-Saharan Africa, the predominant setbacks in analysing hydro-meteorological events are missing, inadequate or inconsistent meteorological data. As in most other places, rainfall data for the drier parts of Embu County and the neighbouring stations are scarce and contain gaps, which complicates their use.

Geographic information systems (GIS) and modelling have become critical tools in agricultural research and natural resource management (NRM), yet their use in the study area is minimal. GIS spatial-interpolation techniques such as inverse distance weighted (IDW), spline and kriging, all available in ArcGIS, are essential for data reconstruction. Most data on climatic variables (rainfall, temperature) are collected from point sources. However, the spatial arrangement of these point data permits a more precise estimation of the value and properties of events at ungauged sites through interpolation. The value between two gauged points is interpolated by fitting an appropriate model to account for the anticipated variation. The principal issue is the selection of the interpolation approach for a given set of input data (Burroughs and McDonald 1998), since this determines the accuracy of the output. This is especially true for areas where data collection is sparse and the measured variables differ extensively even at relatively small spatial scales. Kriging is a flexible geostatistical gridding technique that has proven useful and popular in many fields and is supported by the ArcGIS software. It generates visually appealing maps from irregularly spaced data. Kriging attempts to express the trends suggested by the data so that, for instance, high points are connected along a ridge rather than isolated by bull’s-eye contours. The kriging defaults can be accepted to produce an accurate grid of the data, or kriging can be custom-fitted to a data set by specifying the appropriate variogram model. Kriging can be either an exact or a smoothing interpolator, depending on the user-specified parameters. It incorporates anisotropy as well as underlying trends in an efficient and natural way (Yan et al. 2005). Unlike the other interpolation techniques supported by the ArcGIS Spatial Analyst, kriging involves an interactive investigation of the spatial behaviour of the phenomenon represented by the z-values before the most suitable estimation method for generating the output surface is selected.

IDW interpolation explicitly implements the premise that things that are close to each other are more alike than those that are farther apart. Thus, values close to a gauged point are influenced predominantly by it, on the assumption that each gauged value has a local influence that diminishes with distance. Philip and Watson (1987) note that the technique gives greater weight to points close to the estimation point than to those farther away, hence the name IDW. The spline technique estimates values using a mathematical function that minimizes overall surface curvature, resulting in a smooth surface that passes through all the input points. Conceptually, the gauged points are extruded to the height of their magnitude, and the technique bends a surface through them while minimizing its total curvature.
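The distance-weighting idea behind IDW can be illustrated with a minimal sketch. This is not the ArcGIS implementation; the function name `idw`, the default exponent of 2 and the planar coordinates are assumptions made purely for illustration.

```python
import math


def idw(stations, target, power=2.0):
    """Estimate a value at `target` (x, y) from gauged (x, y, value) stations.

    Each station is weighted by 1 / distance**power, so nearby gauges
    dominate the estimate. `power` = 2 is an assumed, commonly used default.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a gauge: return the gauged value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical gauges on a line, 2 distance units apart.
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(gauges, (1.0, 0.0))  # equal weights at the midpoint
```

Because the weights depend only on distance, the estimate always lies between the smallest and largest gauged values, which is one reason IDW surfaces tend to show bull’s-eye patterns around isolated gauges.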

To aid in understanding the spatio-temporal occurrence and patterns of agro-climatic variables (e.g. rainfall), accurate and inexpensive quantitative approaches such as GIS modelling, together with long-term data, are essential. Most meteorological data in the study area are inconsistent, unrecorded or missing, which leaves discontinuous and unreliable data for analysis; in addition, the main stations are several kilometres from the target area. This calls for data reconstruction through interpolation.

On the other hand, the much-needed information on inter-/intra-seasonal variability of rainfall in the region is still inadequate despite its critical implications for soil-water distribution, water use efficiency (WUE), nutrient use efficiency (NUE) and final crop yield. To optimize agricultural productivity in the region, there was a need to quantify rainfall variability at a local and seasonal level as a first step in combating the extreme effects of persistent dry spells/droughts and crop failure. Since rainfall, which is heterogeneous, is the most critical factor determining rain-fed agriculture, knowledge of its statistical properties derived from long-term observation could be used in developing optimal mitigation strategies in the area. To address the problems of inadequate, missing and inconsistent point data, especially for ungauged areas within the study area, this study also sought to evaluate the efficacy of geostatistical and deterministic interpolation techniques in daily rainfall data reconstruction.

2 Materials and methods

2.1 The study area

The study was conducted in the drier parts of Kenya’s Central Highlands, in Embu County. This region lies in the lower midland 3, 4 and 5 (LM 3, LM 4 and LM 5), upper midland 1, 2, 3 and 4 (UM 1, UM 2, UM 3 and UM 4) and inner lowland 5 (IL 5) (Jaetzold et al. 2007) at an altitude of approximately 500 to 1800 m above sea level (a.s.l) (Fig. 1).

Fig. 1 Map showing the study area and its elevation with studied point gauged rainfall data: Machang’a, Embu and Kiritiri

It has an annual mean temperature ranging from 14.4 to 27.5 °C (increasing from Embu station towards the Mbeere stations, with a range of 12.1 to 33.3 °C), average annual rainfall of 700 to 900 mm and a range of 500 to 1400 mm. It has a population density of 82 persons per km² with an average farm size of less than 5.0 ha per household. Embu represents a densely populated, high-potential humid area with humic nitosols and annual rainfall generally above 800 mm. Conversely, areas of the sub-humid Mbeere sub-county are emblematic of low agricultural potential, with less fertile ferralsols of low water-holding capacity, frequent droughts and annual rainfall of less than 600 mm (Jaetzold et al. 2007). However, Mbeere sub-county continues to experience population pressure occasioned by the influx of immigrants from the over-populated high-potential areas such as Embu. These areas are representative of Kenya’s Central Highlands and of East Africa, dominated by smallholder rain-fed, non-mechanized agriculture with minimal use of external inputs. Generally, the rainfall is bimodal, with long rains (LR) from March to May and short rains (SR) from mid-October to December, hence two potential cropping seasons per year. Various agricultural studies have been carried out in the region, hence the rationale behind its selection. According to Mugwe et al. (2009), the region has experienced a drastic decline in its productivity potential, rendering most farmers resource poor. The prime cropping activity is maize intercropped with beans, though livestock keeping is equally dominant. Mbeere sub-county represents a sub-humid climate region with annual average rainfall of 781 mm, while Embu is more humid with annual average rainfall above 1210 mm (Table 1).

Table 1 Selected metadata of the meteorological stations used in the study

This region is a strategic production region, producing about 20 % of the country’s maize cover (Ngetich et al. 2014). The inherently fertile nitosols in Embu account for its high productivity potential, while lower and erratic rainfall, less fertile, shallow and sandy ferralsols and high drought frequency in Mbeere region explain the predominant crop failures (Jaetzold et al. 2007).

2.2 Rainfall data

The rainfall data were from five rainfall stations: Machang’a, Kiritiri, Kiambere and Kindaruma (herein collectively referred to as Mbeere region) and Embu (Embu). Secondary daily rainfall data were sourced from both the Kenya Meteorology Department (KMD) and research sites with primary recording stations within the study area. Primary dailies have been recorded at Machang’a station since 2000, without any gaps, owing to ongoing experimental trials in the area. Thus, Machang’a was treated as the reference station for the selection of other stations in the region. The other stations had data sets of over 15 years with less than 10 % missing data. In addition, agro-ecological zoning of the stations was considered during selection. The KMD regularly sends the raw data to the Centre for Climate Systems Modeling (C2SM) through MeteoSwiss and EMPA for quality checking, control and assurance before the data are forwarded to the World Data Centre for archiving and availability to the scientific community (KMD 2015). During this study, the dailies were further subjected to homogeneity testing to evaluate whether they came from the same population. In summary, the choice of rainfall stations depended on availability of the station, the agro-ecological zones and the percentage of missing data (less than 10 % for a given year, as required by the World Meteorological Organization (WMO)).

2.3 Data analyses

Daily primary and secondary rainfall time series were entered into an MS Excel spreadsheet, where seasonal rainfall totals for short rains (SR) and long rains (LR), annual averages and the number of rainy days were computed. Where there were large data gaps (unrecorded or missing values), multiple imputation was used to fill in missing daily data by creating several copies of the data set with different possible estimates. This method was preferred to single imputation and regression imputation as it appropriately adjusted the standard error for missing data, yielding complete data sets for analysis (Enders 2010). Being a season-based analysis, the cumulative impact of rainfall amount was emphasized. A rainy day was considered to be any day that received more than 0.2 mm of rainfall, as reported by the WMO. Daily rainfall data were entered into the RAINBOW software (Raes et al. 2006) for homogeneity testing based on cumulative deviations from the mean, to check whether the values came from the same population. The cumulative deviations (Eq. 1) were then rescaled by dividing them by the sample standard deviation.

$$ S_k=\sum_{i=1}^{k}\left(X_i-\overline{X}\right),\qquad k=1,\dots,n $$
(1)

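where S_k is the cumulative deviation from the mean up to record k (dividing it by the sample standard deviation s gives the rescaled cumulative deviation, RCD), X_i is the i-th value of the series, X̄ is the mean of the series and n is the length of record (here n = 14).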

The maximum (Q) and the range (R) of the rescaled cumulative deviations from the mean were evaluated based on number of nil values, non-nil values, mean and standard deviations as well as K-S values (Eqs. 2 and 3) to test homogeneity. Low values of Q and R would indicate that the data were homogeneous.

$$ Q=\max\left[\frac{S_k}{s}\right] $$
(2)
$$ R=\max\left[\frac{S_k}{s}\right]-\min\left[\frac{S_k}{s}\right] $$
(3)

where Q is the maximum (max) and R the range of the rescaled cumulative deviations S_k/s, s is the sample standard deviation and min is the minimum.
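The homogeneity statistics of Eqs. 1 to 3 can be sketched in a few lines. This is only an illustration and not the RAINBOW implementation; the function name `homogeneity_stats` is hypothetical, and the use of the n − 1 sample standard deviation is an assumption (some formulations divide by n).

```python
import statistics


def homogeneity_stats(values):
    """Return (rescaled cumulative deviations, Q, R) for a rainfall series.

    Follows Eqs. 1-3 as written: S_k is the running sum of deviations
    from the mean, each S_k is divided by the standard deviation s,
    Q is the maximum and R the range of the rescaled values.
    """
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # assumption: n - 1 denominator
    rescaled = []
    running = 0.0
    for x in values:
        running += x - mean          # S_k, Eq. 1
        rescaled.append(running / s)
    q = max(rescaled)                # Eq. 2
    r = max(rescaled) - min(rescaled)  # Eq. 3
    return rescaled, q, r
```

The resulting Q and R would then be compared with tabulated critical values for the record length to accept or reject homogeneity at a chosen probability level.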

The frequency analyses were based on the lognormal probability distribution with log10 transformation, using the cumulative distribution function (CDF) for both LR and SR rainfall amounts. The Weibull method was used to estimate probabilities while the maximum likelihood method (MOM) was utilized as a parameter estimation statistic. Homogeneous seasonal rainfall totals for both seasons were then subjected to trend and variability analyses based on the rainfall anomaly index (RAI) as described by Tilahun (2006).
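As a rough illustration of how exceedance probabilities and return periods relate, the sketch below ranks seasonal totals and applies the Weibull plotting position. It assumes that "the Weibull method" refers to the plotting position P = m/(n + 1), with m the rank of a value sorted from largest to smallest, and that the return period is T = 1/P; the function name `weibull_exceedance` is hypothetical, and the lognormal fitting step is not reproduced here.

```python
def weibull_exceedance(totals):
    """Return (value, exceedance probability, return period) tuples.

    Values are ranked from largest (rank 1) to smallest (rank n).
    Assumed plotting position: P = rank / (n + 1); return period T = 1 / P.
    """
    ranked = sorted(totals, reverse=True)
    n = len(ranked)
    results = []
    for rank, value in enumerate(ranked, start=1):
        probability = rank / (n + 1)
        results.append((value, probability, 1.0 / probability))
    return results
```

Under this convention a 10 % exceedance probability corresponds to a 10-year return period and a 30 % probability to about 3.33 years.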

Seasonal variability was computed in tandem with annual averages for both positive (Eq. 5) and negative (Eq. 6) anomalies using RAI.

$$ RAI=+3\left(\frac{RF-M_{RF}}{M_{H10}-M_{RF}}\right) $$
(5)
$$ RAI=-3\left(\frac{RF-M_{RF}}{M_{L10}-M_{RF}}\right) $$
(6)

where RF is the rainfall for the season or year in question, M_RF is the mean rainfall over the total length of record, M_H10 is the mean of the 10 highest rainfall values of the period of record and M_L10 is the mean of the 10 lowest rainfall values of the period of record.
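Equations 5 and 6 can be sketched as follows. The function name `rainfall_anomaly_index` and the `extremes` parameter are hypothetical; the choice to apply Eq. 5 when RF equals the mean (giving RAI = 0) is an assumption, since both equations give zero at that point.

```python
def rainfall_anomaly_index(series, extremes=10):
    """Return one RAI value per entry of `series` (seasonal or annual totals, mm).

    Positive anomalies use Eq. 5 (scaled by the mean of the highest values),
    negative anomalies use Eq. 6 (scaled by the mean of the lowest values).
    """
    if extremes >= len(series):
        raise ValueError("need more records than the number of extremes averaged")
    mean = sum(series) / len(series)                 # M_RF
    ranked = sorted(series)
    m_low = sum(ranked[:extremes]) / extremes        # M_L10
    m_high = sum(ranked[-extremes:]) / extremes      # M_H10
    if m_high == mean or m_low == mean:
        raise ValueError("series has no spread; RAI is undefined")
    rai = []
    for rf in series:
        if rf >= mean:
            rai.append(3.0 * (rf - mean) / (m_high - mean))   # Eq. 5
        else:
            rai.append(-3.0 * (rf - mean) / (m_low - mean))   # Eq. 6
    return rai
```

With a short record the number of extremes averaged has to be smaller than the record length, which is why it is exposed as a parameter in this sketch.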

The coefficient of variation (CV) was used to assess the level of variation in LR and SR seasonal rainfall amounts (RAs) and number of rainy days (RDs), and an independent t test was used to evaluate the significance of the variation.
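For reference, the CV used throughout is the standard deviation divided by the mean. The sketch below assumes the n − 1 sample standard deviation; the function name is hypothetical and the t test is not reproduced.

```python
import statistics


def coefficient_of_variation(values):
    """Return CV = sample standard deviation / mean, as a fraction."""
    return statistics.stdev(values) / statistics.fmean(values)


# Hypothetical seasonal totals (mm), for illustration only.
example_cv = coefficient_of_variation([10.0, 20.0, 30.0])
```

A result expressed as a fraction can be multiplied by 100 when a percentage is needed.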

A dry day was taken as a day that received either less than 0.2 mm or no rainfall at all. A dry spell was considered as a sequence of dry days bracketed by wet days on both sides (Kumar and Rao 2005). The method for frequency analysis of dry spells was adapted from Belachew (2000) as follows: in the Y years of record, the number of times (i) that a dry spell of duration t days occurred was counted on a monthly basis. Then, the number of times (I) that a dry spell of duration longer than or equal to t occurred was computed through accumulation. The consecutive dry days (1, 2, 3 days …) were prepared from historical data. The probabilities of occurrence of consecutive dry days were estimated by taking into account the number of days, n, in a given month. The total possible number of days, N, for that month over the analysis period was computed as N = n × Y. Subsequently, the probability p that a dry spell may be equal to or longer than t days was given by Eq. 7; the probability q that a dry spell longer than t days does not occur on a certain day in a growing season was computed by Eq. 8; the probability Q that a dry spell longer than t days will not occur in a growing season was calculated by Eq. 9; and the probability p that a dry spell exceeding t days would occur within a growing season was computed by Eq. 10, as shown below:

$$ p=\frac{I}{N} $$
(7)
$$ q=\left(1-p\right)=\left[1-\frac{I}{N}\right] $$
(8)
$$ Q={\left[1-\frac{I}{N}\right]}^n $$
(9)
$$ p=\left(1-Q\right)=1-{\left[1-\frac{I}{N}\right]}^n $$
(10)
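The counting procedure and Eqs. 7 to 10 can be sketched as below. Both function names are hypothetical. The sketch treats a day with exactly 0.2 mm as wet (an assumption, since the text leaves that boundary case open) and, for simplicity, expects the caller to pass the spell lengths already grouped for the month of interest.

```python
def dry_spell_lengths(daily_mm, threshold=0.2):
    """Return lengths of dry spells bracketed by wet days on both sides.

    A day is dry when rainfall < threshold (mm). Dry runs at the very
    start or end of the record are ignored because they are not bracketed.
    """
    lengths = []
    run = 0
    seen_wet = False
    for rain in daily_mm:
        if rain < threshold:
            run += 1
        else:
            if seen_wet and run > 0:
                lengths.append(run)
            seen_wet = True
            run = 0
    return lengths


def dry_spell_probabilities(lengths, t, days_in_month, years):
    """Return (p, q, Q, p_season) following Eqs. 7-10.

    I = number of spells lasting >= t days, N = days_in_month * years.
    """
    count_i = sum(1 for length in lengths if length >= t)
    total_n = days_in_month * years
    p = count_i / total_n            # Eq. 7
    q = 1.0 - p                      # Eq. 8
    big_q = q ** days_in_month       # Eq. 9
    p_season = 1.0 - big_q           # Eq. 10
    return p, q, big_q, p_season
```

Separating the spell extraction from the probability step keeps the threshold and the month grouping easy to change independently.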

ArcGIS software, combined with a digital elevation model (DEM), was used to generate average spatial rainfall surfaces and maps with various interpolation techniques for data reconstruction purposes. The stepwise methodology is summarized in Fig. 2.

Fig. 2 Flow chart showing stepwise interpolation and data reconstruction analyses (Adopted from ESRI 2010)

The efficacy of interpolation techniques was assessed using mean absolute error (MAE) (Eq. 11), root mean square error (RMSE) (Eq. 12), prediction error (P_e) (Eq. 13) and coefficient of determination (R²) statistics, plus validation using gauged rainfall data.

$$ MAE=\frac{1}{n}\sum_{i=1}^{n}\left|P_i-O_i\right| $$
(11)
$$ RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left(P_i-O_i\right)}^2} $$
(12)
$$ P_e=\frac{\left(P_i-O_i\right)}{O_i}\times 100 $$
(13)
$$ R^2=\frac{{\left[\sum_{i=1}^{n}\left(O_i-\overline{O}\right)\left(P_i-\overline{P}\right)\right]}^2}{\sum_{i=1}^{n}{\left(O_i-\overline{O}\right)}^2\sum_{i=1}^{n}{\left(P_i-\overline{P}\right)}^2} $$
(14)

where P_i and O_i are the predicted and observed (measured) rainfall values, P̄ and Ō are their respective means, and n is the number of observations.
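A minimal sketch of the four validation statistics is given below, assuming paired lists of predicted and observed values and non-zero observations (the prediction error divides by the observed value). The function name `error_metrics` is hypothetical and this is not the ArcGIS validation output.

```python
import math


def error_metrics(predicted, observed):
    """Return (MAE, RMSE, per-point prediction error in %, R^2)."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    n = len(observed)
    diffs = [p - o for p, o in zip(predicted, observed)]
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Assumption: every observed value is non-zero.
    pe = [d / o * 100.0 for d, o in zip(diffs, observed)]
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cross = sum((o - mean_o) * (p - mean_p) for p, o in zip(predicted, observed))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cross ** 2 / (ss_o * ss_p)
    return mae, rmse, pe, r2
```

MAE and RMSE are in the units of the data (mm), whereas the prediction error and R² are dimensionless, so the four statistics complement each other when ranking interpolation techniques.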

3 Results and discussion

3.1 Homogeneity testing

Homogeneity analyses had no nil values (values below threshold) but 100 % non-nil values (above threshold), showing high homogeneity. The standard deviations (SDs) of the normalized means for both LR and SR rainfall amounts were low, ranging from SD = 0.1 (in Embu and Kiritiri during SRs) to SD = 0.9 (in Embu during LRs). The low SD values indicated that the variations (RCD) were confined around the mean rainfall amounts, indicating high homogeneity (Table 2).

Table 2 Mean, standard deviation and R² values for the rainfall dailies from study stations for the period between 2001 and 2013

The Kolmogorov-Smirnov (K-S) test values, R² for the seasonal rainfall and the average rainfall means are summarized in Table 3.

Table 3 Homogeneity test for the rainfall dailies from study stations for the period between 2000 and 2013

A plot of homogeneity of the average seasonal rainfall dailies for the stations studied showed that the deviations of the RCDs from the zero mark did not cross the probability lines. Homogeneity was therefore accepted at the 99 % probability level (Fig. 3).

Fig. 3 Rescaled cumulative deviations for seasonal months and studied rainfall stations for the period between 2000 and 2013

The sampled temporal rainfall data were normally distributed with high goodness of fit (R² = 92 to 96 %). This showed continuity of the data with the original primary data, indicating high homogeneity (Raes et al. 2006). The one-sided, one-sample Kolmogorov-Smirnov test gave K-S values (0.15 to 0.23) consistently lower than the K-S table value (0.302) for n = 14 at α = 0.005, indicating that an exponential, continuous distribution of the studied data sets was statistically acceptable. The K-S value is the largest vertical difference between the empirical cumulative distribution function (ECDF) and the reference distribution, and it is compared against the table value (Botha et al. 2007; Mzezewa et al. 2010; MATLAB Central 2013). Frequency analyses of meteorological data require that the time series be homogeneous in order to gain an in-depth and representative understanding of the trends over time (Raes et al. 2006). Often, non-homogeneity and lack of exponential distributions between data sets indicate gradual changes in the natural environment (which trigger variability) that correspond to changes in agricultural production (Huff and Changnon 1973; Bayazit 1981).

3.2 Probabilities of rainfall exceedance, return periods and amounts

Results showed that there was at least 90 % chance of rainfall exceeding 141.5 mm (lowest) and 258.1 mm (highest) during LRs in Machang’a and Embu, respectively, within a return period of about 1 year (Tables 4 and 5). Nonetheless, there were observably low probabilities (10 %) that rains would exceed 449.8 and 763.0 mm during LR seasons in Machang’a and Embu, respectively, for a 10-year return period (Table 4).

Table 4 Probability of rainfall exceedance and return periods for the LRs and SRs in the study area
Table 5 Probability of average seasonal months’ rainfall exceedance and return periods for the LRs and SRs in Mbeere sub-county

Conversely, probabilities of monthly rainfall during cropping seasons exceeding cropping threshold were equally low, e.g. 5 % probability to exceed 419 mm in April and 331 mm in November (Table 5).

A study by Mzezewa et al. (2010) established that a seasonal rainfall amount greater than 450 mm is indicative of a successful growing season and described it as a threshold rainfall amount. During this study, the probabilities that seasonal rainfall would exceed this threshold were quite low (at most 30 % for a return period of 3.33 years). Embu, being much wetter, would have a 50 % probability of receiving above-threshold rainfall (506.8 mm), that is, once every 2 years (Tables 4 and 5). Mzezewa et al. (2010) observed a 47 % chance of seasonal rainfall exceeding 580 mm but 0 % (no increase) of exceeding total annual rainfall for a 5-year return period in the semi-arid ecotope of Limpopo, South Africa.

3.3 Variability and anomalies in seasonal rainfall amount

There was notably high inter-seasonal variability and temporal anomalies in rainfall between 2001 and 2013. No station or season showed persistently near-average (RAI = 0) rainfall, especially among stations in the sub-humid region. For instance, in Machang’a, the wettest LRs were recorded in 2010 (RAI = +4) while the wettest SRs were recorded in 2001 (RAI = +4), 2006 (RAI = +3.8) and 2011 (RAI = +4) (Fig. 4). In Embu, the highest positive anomalies (+5.0) were recorded in 2002, 2005 and 2007 during LRs (Fig. 4). Noticeably, Embu appeared to receive more near-average rainfall during SRs (2002, 2003, 2007 and 2011), contrary to the trends observed in Mbeere region (Fig. 4). Variability in rainfall was generally low in Kiritiri.

Fig. 4 Decadal rainfall anomaly index for both LR_MAM and SR_OND in Embu, Machang’a and Kiritiri; RAI rainfall anomaly index

Generally, stations in the sub-humid areas of Mbeere sub-county recorded more negative anomalies in rainfall amount than Embu. An intra-station seasonal comparison showed that SRs in Embu were less variable but drier than LR seasons. Conversely, SRs in Mbeere region were wetter than SRs in Embu but more variable. Several studies have cited the unpredictability of LR seasonal rainfall patterns and farmers’ reliance on SRs (e.g. Cohen 1987; Shisanya 1990; Hutchinson 1996; Recha et al. 2011). According to Shisanya (1990), the failure of the LRs in 1984 across the whole of Kenya prompted the Kenyan government to launch a national relief fund among other responses. Declining LRs were also reported by Recha et al. (2011) while studying rainfall variability in the upper eastern dry areas (Tunyai and Chiakariga). Akponikpè et al. (2008) also reported similar trends of high variability (coefficient of variation (CV) = 57 %) in temporal annual rainfall (mono-modal rainfall between February and September) in the Sahel region. The present study showed that the period between 2000 and 2013 experienced marked increases in SRs and a decrease in LRs. Nicholson (2001) and Hulme (2001) attributed the decrease in LRs to the desiccation (drying out) of the March-to-August rains in SSA. A study by Tilahun (2006) based on the cumulative departure index established that parts of northern and central Ethiopia had persistently received below-average rainfall between February and August since 1970. While studying vegetation dynamics based on the normalized difference vegetation index (NDVI), Tucker and Anyamba (2005) noted persistent droughts and unpredictable rainfall patterns marked by reduction in the NDVI values during LRs for periods approaching the twenty-first century. On the other hand, it was apparent that SRs recorded consistently above-average rainfall during this study, indicating the possibility of a reliable growing season, especially for the drier Machang’a region. In line with this observation, Hansen and Indeje (2004) and Amissah-Arthur et al. (2002) observed that SRs constituted the main growing season in the drier parts of SSA and the Greater Horn of Africa for crops such as maize, sorghum, green grams and finger millet. Ovuka and Lindqvist (2000) further observed increasing SR amounts for the period 1963–1976 in Murang’a sub-county of central Kenya. Generally, high variability (often attributed to La Niña, El Niño and sea surface temperatures) could occasion rainfall failures leading to declines in total seasonal rainfall in the study area. According to Shisanya (1990), La Niña events significantly contributed to the occurrence of persistent droughts and unpredictable weather patterns during LRs in Kenya. In contrast, El Niño events (of 1997 and 1998) have been cited as the key drivers of the positive anomalies in SR seasonal rainfall in the ASALs of Eastern Kenya (Anyamba et al. 2001; Amissah-Arthur et al. 2002).

3.4 Variations in rainfall amounts and number of rainy days

On average, the total annual rainfall received was below 900 mm in the sub-humid stations and below 1400 mm in the humid station. LRs contributed 314.9 and 586.3 mm while SRs contributed 438.7 and 479.1 mm, translating to totals of 754 and 1084 mm of seasonal rainfall in the respective stations (Table 6).

Table 6 Variability analyses: coefficient of variations in seasonal rainfall amounts and number of rainy days in the study stations for the period between 2000 and 2013

These seasonal totals account for close to 90 % of the total rainfall received annually, implying that a small proportion of rainy days supplied much of the total rainfall received in the region. Evaluation of variability based on the CV of RA and number of RDs showed that most stations received highly variable rainfall. It has been shown that a CV greater than 30 % in a rainfall data series indicates massive variability in rainfall amounts and distributional patterns (Araya and Stroosnijder 2011). In Machang’a and Kiritiri, rainfall amounts during LRs were more variable (CV = 0.41 and 0.39, respectively) than those in Embu (CV = 0.36). Variability was equally high in the number of RDs, e.g. CV = 0.51 in Kiritiri. Results also showed that LR and SR amounts were not significantly different from each other in most stations of Mbeere region but were different in Embu (Table 6). A lack of significant difference between seasonal rainfall amounts in the drier parts of Kenya (represented by Machang’a in this study) was also reported by Recha et al. (2011) while studying rainfall variability in the upper eastern dry areas (Tunyai and Chiakariga). These results indicate high variability of rainfall received across all agro-ecological zones (AEZs) in the study area, further evidenced by the large rainfall anomalies reported earlier in this study. Regionally, Seleshi and Zanke (2004) showed that annual and seasonal rainfall (Kiremt and Belg seasons) in Ethiopia was highly variable, with CV values ranging between 0.10 and 0.50.

3.5 Monthly variations in seasonal rainfall amounts and number of rainy days

Results showed that rainfall amounts received within seasonal months (March-April-May LRs and October-November-December SRs) were highly variable (all with CV > 0.3).

Notably, CV-RA was quite high during the months of March (CV-RA = 0.98) and December (CV-RA = 0.86) in Machang’a, and in March (CV-RA = 0.61) and December (CV-RA = 0.97) in Embu (Table 7). CV-RD for each seasonal month was equally high in the two study stations. For instance, March (CV-RD = 0.61 and CV-RD = 0.47) and December (CV-RD = 0.34 and CV-RD = 0.83) had the highest variability in the number of rainy days in Machang’a and Embu, respectively (Table 7).

Table 7 Variability in rainfall amounts and number of rainy days during seasonal months for studied stations for the period between 2000 and 2013

Generally, onset months (March and October) and cessation months (May and December) received highly variable rainfall amounts compared to mid-seasonal months. Notably, Machang’a, though a more arid region, generally recorded lower variability in the number of rainy days during SR seasonal months than Embu during the same season, evidence of reduced variability and wetting of SRs in the region. In addition, it was evident that the amount of rainfall and number of rainy days received in the past decade in most stations were more consistent (temporally) in April and November but highly unpredictable in March (onset) and December (cessation). This significantly affects the cropping calendar for rain-fed agriculture in the region. Nonetheless, lower values of CV-RD indicated that the number of rainy days was more consistent than the rainfall amounts received. It would also appear that most stations in Mbeere region received more rainfall during the SR season, with November alone accounting for about 60 % of the total seasonal rainfall amount received, while April accounted for 51 % of the LR rainfall in the case of Machang’a. Conversely, Embu received more rainfall during LRs, with April accounting for about 52 % of total rainfall received. These trends indicate that SR seasons would be receiving more rainfall than LRs in the region, a trend acknowledged by most (67.3 %) smallholder farmers in SSA (Amissah-Arthur et al. 2002; Barron et al. 2003). Trends of high variability in seasonal monthly rainfall reported by this study have also been cited by Mzezewa et al. (2010), who reported high coefficients of variation for seasonal (315 %) and annual (50–114 %) rainfall in a semi-arid ecotope in the north-east of South Africa. Mutai et al. (1998) observed high SR variability in Machakos, Kenya, while Phillips and McIntyre (2000) reported low inter-annual variability during LRs, attributing this to its insignificant relationship with ENSO. The ENSO is the most dominant perturbation responsible for inter-annual climate variability, especially of SRs, over eastern and southern Africa. Additionally, Sivakumar (1991) found that annual rainfall in the Sudano-Sahelian zone of West Africa was less variable (0.36) than monthly (0.54) rainfall.

3.6 Droughts and dry spell characterization

Results showed that the probability of occurrence of dry spells of various durations varied from month to month of the growing season. High probabilities of dry spells were recorded in March (0.72 and 0.55) and December (0.8 and 0.6) in the sub-humid stations (average of Machang’a and Kiritiri) and the humid station (Embu), respectively (Fig. 5). The probability of a dry spell was higher for shorter durations (for instance, a 3-day dry spell was more likely than a 10- or 21-day dry spell) (Fig. 5).

Fig. 5 Probability of a dry spell of length ≥ n days, for n = 3, 5, 7, 15 and 21, in each seasonal-cropping month, based on raw rainfall data from 2000 to 2013 for studied humid and sub-humid stations

The probabilities that dry spells would exceed these day durations were equally high (Fig. 6). There was 70 % chance that dry spells would exceed 15 days in average Mbeere stations and 50 % in Embu (Fig. 6).

Fig. 6
Probability of dry spells exceeding the n (3, 5, 7, 10, 15 and 21) days for each seasonal month calculated using the raw rainfall data from 2000 to 2013 for studied humid and sub-humid stations

Dry spells during cropping months are quite common in the study region and often trigger reduced harvests or even complete crop failures. Because rainfall is the prime input and requirement for plant life in rain-fed agriculture, the occurrence of dry spells has particular relevance to rain-fed agricultural productivity (Belachew 2000; Rockstrom and De Rouw 1997). The lowest probabilities of occurrence of dry spells of all durations were recorded in April (during LRs) and November (during SRs). The occurrence of dry spells of all durations increased from April towards May (LRs) and from November towards December (SRs). Indeed, April and November coincide with the peak of rainfall amounts for the LR and SR growing seasons in the region, respectively (Kosgei 2008; Recha et al. 2011). This trend is in line with findings reported by several studies in SSA and elsewhere, including Kosgei (2008), Aghajani (2007) in Iran and Sivakumar (1991) in East Africa. Dry spells during the SR season at Makindu and Katumani stations in Kenya’s lower eastern parts showed similar trends of high probabilities (averaging 88 %) in October (Mutai et al. 1998). The high probabilities of dry spells occurring and exceeding the same durations show the high risks and vulnerability to which rain-fed smallholder farmers are exposed in the study area. Often, prolonged dry spells are accompanied by poor rainfall distribution and low soil moisture for plant growth during the growing season. Generally high probabilities of persistent dry spells in SSA have been reported by Hulme (2001), Dai et al. (2004) and Mzezewa et al. (2010). This could be attributed to the persistence of intermediate warming scenarios in parts of equatorial East Africa (Hulme 2001; Mzezewa et al. 2010). Prolonged dry spells during cropping seasons directly impact crop performance. For instance, the high evaporative demand indicated by a high aridity index (P > 0.52) in the drier parts of Eastern Kenya implies that rainwater is not available for crop use and cannot meet the evaporative demands (Kimani et al. 2003). Thus, a deficit is likely to prevail throughout the rainy seasons, as observed in other SSA regions (Li et al. 2003). Run-off collection and general confinement of rainwater within the crop’s rooting zone could enhance rainwater use efficiency, as demonstrated by Botha et al. (2007).

3.7 Spatial average rainfall interpolations (ArcGIS spatial analyst application)

The performance of the interpolation techniques varied. Kriging and spline techniques produced values more representative of observed rainfall than the IDW method. Generally, the spatial interpolation capability of kriging for rainfall amounts was found to be high (predicting 670–742 mm for an observed 800 mm) (Fig. 7). Evidently, lower eastern parts of the region received low rainfall amounts as interpolated across all the test methods (ranging from 229 to 397 mm), adequately replicating trends of the actual observed rainfall. The high rainfall received at Siakago (1200 mm p.a.) was adequately predicted by kriging and IDW when compared to the spline prediction (Fig. 7).

Fig. 7
Annual rainfall maps of observed rainfall and of rainfall reconstructed using IDW, kriging and spline interpolation techniques

Evaluation of the mean absolute error (MAE) and root mean square error (RMSE) between reconstructed (interpolated) and observed rainfall data further showed that the kriging method (MAE = 147 mm and RMSE = 176.5 mm) would be the best-bet technique to adopt for rainfall interpolation for the region (Table 8).

Table 8 Mean absolute error, RMSE and R² values for the interpolation produced from validation of IDW, kriging and spline methods

Interpolation under the IDW method was generally unsatisfactory (R² = 0.04) when compared to the spline (R² = 0.23) and kriging (R² = 0.67) interpolation methods.
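
The three validation statistics can be computed from paired observed and interpolated values as sketched below in Python. The station values are hypothetical, the function names are illustrative, and R² is taken here as the squared Pearson correlation between observed and predicted rainfall, which is an assumption because the exact definition used for Table 8 is not stated in this section.

```python
import math


def _check(observed, predicted):
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")


def mae(observed, predicted):
    """Mean absolute error, in the units of the data (mm)."""
    _check(observed, predicted)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (mm)."""
    _check(observed, predicted)
    total = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(total / len(observed))


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R²)."""
    _check(observed, predicted)
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical annual rainfall (mm) at four validation stations.
observed = [800.0, 1200.0, 400.0, 600.0]
predicted = [700.0, 1100.0, 500.0, 650.0]

print(mae(observed, predicted), rmse(observed, predicted), r_squared(observed, predicted))
```

MAE and RMSE are in millimetres, and RMSE penalises large individual errors more heavily, so reporting both alongside R² gives a fuller picture of interpolation skill.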

Figure 8 shows the scatter plots of recorded versus predicted (interpolated) decadal average rainfall across the study stations based on the kriging interpolation technique.

Fig. 8
Comparison between recorded and ArcGIS kriging-predicted average decadal rainfall amounts across study stations. Error bars denote standard deviation of observed means, n = 13

A comparison of the predicted and recorded rainfall amounts further confirmed the good fit of the kriging interpolation technique in ArcGIS. Predictions in Machang’a recorded a higher goodness of fit (R² = 0.92) than those in Embu (R² = 0.76), which could be attributed to the large amount of missing data in the raw daily rainfall records of the latter station (Fig. 8).

Several factors could explain the varied performance of the interpolation techniques in this study. Both the IDW and spline methods are deterministic, since their predictions are directly based on the surrounding measured values or on specified mathematical formulas (Burroughs and McDonald 1998). On the other hand, kriging is a geostatistical method based on statistical models that include autocorrelation, which underpins the statistical relationships among the measured and predicted data points (Heine 1986). The better prediction by kriging established in this study could be attributed to its capability of producing a prediction surface together with a measure of the certainty or accuracy of the predictions. In this study, the spatial distribution pattern in each map resulted from mapping the index value (the mean annual precipitation), as influenced by local spatial conditions (elevation), including the non-existence of altitudinal variability of the parameters of the distribution function, and by the interpolation method used. Statistically, the spatial distribution of quantiles is theoretically better underpinned in the kriging method than in the other methods tested. For this study, kriging was extended by regional regression for each index value in areas whose terrain or other controls could have contributed to the spatial variability of the trends, which explains its better predictability.
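
To make the deterministic case concrete, the Python sketch below implements a basic inverse distance weighted (IDW) estimate, in which each prediction is a weighted mean of the surrounding measured values with weights of 1/d^p. The coordinates, rainfall values, function name and the power p = 2 are hypothetical choices for illustration and do not reproduce the ArcGIS settings used in this study. Kriging would additionally require fitting a semivariogram model to the data and is not shown.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Inverse distance weighted estimate at target = (x, y).

    stations is a list of (x, y, value) tuples; weights are 1 / distance**power.
    """
    if not stations:
        raise ValueError("need at least one station")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station, so return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (easting km, northing km, annual rainfall mm).
stations = [
    (0.0, 0.0, 800.0),
    (10.0, 0.0, 1200.0),
    (0.0, 10.0, 400.0),
]

estimate = idw_estimate(stations, (5.0, 0.0))
print(round(estimate, 1))
```

Because the estimate is a weighted average, it always lies between the smallest and largest measured values, and it carries no accompanying error estimate. This is one practical difference from a geostatistical method such as kriging.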

4 Conclusion and recommendations

Results showed that the available rainfall data series from the study stations are homogeneous and are thus records of the same population. Before frequency analysis of rainfall data is done, various transformations are essential for the data to follow particular probability distribution patterns. The Weibull method for estimating probabilities and the MOM parameter estimation method proved sufficient for evaluating data series homogeneity and frequency. Decadal rainfall trends showed that both LR and average annual rainfall have decreased in the past 13 years in the region. Mbeere region appeared to have experienced pronounced declines in rainfall amounts, especially those received during LRs. Nonetheless, rainfall amounts during SRs markedly increased in most study stations, with high gains established in the Mbeere stations. Evidently, probabilities that seasonal rainfall amounts would exceed the threshold for cropping (500–800 mm) were quite low (10 %) in all stations. The amount of rainfall received during LRs and SRs varied significantly in Embu but not in Machang’a. There was evidence of increasing rainfall variability from Embu station towards the Mbeere stations, to as high as CV = 0.88 in Machang’a. Probabilities that the region would experience dry spells exceeding 15 days during a cropping season were equally high, e.g. 46 % in Embu and 87 % in Machang’a. This implies a high chance that soil moisture could be lost by evaporation, bearing in mind the high chance (81 %) that dry spells exceeding 15 days could recur during the cropping season. On the other hand, kriging was identified as the most appropriate (R² = 0.67) geostatistical interpolation technique for spatial and temporal rainfall data reconstruction in the region. Based on these findings, farmers in the lower eastern Mbeere region should be encouraged to intensify cropping during SRs as compared to LRs. 
It is equally important that they schedule supplementary irrigation based only on timely, regular and accurate dissemination of daily, monthly and seasonal forecasts by the Kenya Meteorological Department. The high rainfall variability and chances of prolonged dry spells established in this study also demand that farmers carefully select crop varieties and types that are more drought resistant (sorghum and millet) rather than the common maize crop. For instance, the probability of having dry spells exceeding 15 days is relatively high (63, 80 and 57 % for Machang’a, Kiritiri and Embu, respectively) during both SR and LR seasons. In this regard, the choice of crop variety and type should be based on the degree of its tolerance to drought. These decisions can be optimized if the probability of dry spells is computed after successful (effective) planting dates. There is a need to establish more precise and timely weather forecasting mechanisms and communication systems to guide seasonal farming. In most arid and semi-arid regions, soil moisture availability is primarily dictated by the extent and persistence of dry spells. It is thus essential to match crop phenology with dry spell lengths based on days after sowing to meet crop water demands during the sensitive stages of crop growth. Knowledge of lengths of dry spells and the probability of their occurrence can also aid in planning for supplementary risk aversion strategies through prediction of high water demand spells.