1 Introduction

Among natural disasters caused by meteorological events, drought has the most extensive impact. Drought is a natural phenomenon that occurs when precipitation falls significantly below normal, adversely affecting water resources and disturbing the hydrological balance. Meteorological drought has serious effects on agriculture, water resources and socio-economic conditions, so accurate estimation of drought is important for reducing environmental damage. The standardized precipitation index (SPI), a meteorological drought index developed by McKee et al. (1993), is commonly used because it needs only precipitation data (Wang et al. 2015; Zarch et al. 2015; Paulo et al. 2016). In addition, artificial intelligence methods, which have been frequently used to estimate hydrologic parameters in recent years, have also given successful results in drought studies (Tripathi et al. 2006; Bacanli et al. 2009; Chen et al. 2010; Gocić et al. 2015; Tigkas et al. 2015). Mokhtarzad et al. (2017) developed artificial neural networks (ANNs), adaptive neural-based fuzzy inference system (ANFIS) and support vector machine (SVM) models to predict drought for the Bojnourd meteorological station. They used temperature, humidity and seasonal precipitation values as input parameters and 3-month SPI values as the output parameter. They reported that the SVM model gave better results than the ANN and ANFIS models. Choubin et al. (2014) used eight climate indices, a neuro-fuzzy model and SPI to forecast drought conditions in Iran. They used principal component analysis to identify these indices, which account for 81% of the variance, and suggested that the neuro-fuzzy model could be used for drought forecasts. Jalalkamali et al. (2015) sought an optimum model to forecast meteorological drought using SPI for Yazd Province in Iran. For this purpose, they used multilayer perceptron ANN, ANFIS, SVM and ARIMAX multivariate time series models. The ARIMAX models performed better, especially for a 9-month period. Nguyen et al. (2017) investigated the applicability of ANFIS to forecast drought using SPI and the standardized precipitation evapotranspiration index (SPEI) in Vietnam. They stated that the efficiency of the ANFIS models developed for SPI and SPEI was similar, and that the ANFIS model developed for SPEI performed better than the one developed for SPI, particularly for a 12-month period.

In addition, the use of wavelet transform (W), one of the data pre-processing techniques, has increased in cases where artificial intelligence methods alone are insufficient in estimation studies (Nourani et al. 2009; Singh 2011; Belayneh and Adamowski 2012; Kisi and Cimen 2012; Mehr et al. 2014; Nourani et al. 2015). Santos and Silva (2014) proposed wavelet-neural network (W-ANN) models to forecast daily streamflow 1, 3, 5 and 7 days ahead. They reported that the proposed models gave significantly better results than the ANN model. Khan et al. (2018) used the standard index of annual precipitation (SIAP) for meteorological drought and the standardized water storage index (SWSI) for hydrological drought. They developed and compared ANN and W-ANN models to estimate monthly drought. The comparisons showed that the W-ANN models gave higher correlation coefficients than the ANN models. Shirmohammadi et al. (2013) investigated the ability of ANN, ANFIS, W-ANN and W-ANFIS models to forecast meteorological drought in Iran. They concluded that wavelet transform could improve the modelling of meteorological drought and that ANFIS models performed better than ANN models. Fung et al. (2018) developed support vector regression (SVR), boosting-support vector regression (BS-SVR) and wavelet-boosting-support vector regression (W-BS-SVR) models for drought prediction in Malaysia. They calculated standardized precipitation evapotranspiration index (SPEI) values for 1, 3 and 6 months using monthly precipitation, average temperature and evapotranspiration. They used the calculated SPEI values as the input parameter for the SVR, BS-SVR and W-BS-SVR models. Comparing the results, they found that the W-BS-SVR models performed better than the other two models.

The aim of this study is to investigate the effects of wavelet transform on the success of ANFIS, SVM and ANNs models in meteorological drought estimation. For this purpose, the W-ANFIS, W-SVM and W-ANN models were developed and compared to the ANFIS, SVM and ANNs models in Çanakkale Province, Turkey.

2 Study region and data

Located in the northwestern part of Turkey, the study area covers the Çanakkale province, Gökçeada and Bozcaada islands and also the Dardanelles, which is an internationally important strait connecting Europe with Asia and also connecting the Sea of Marmara with the Aegean Sea.

The Çanakkale province is located between 25°35′–27°45′E and 39°30′–40°45′N and has an area of 9,737 km² (figure 1). Çanakkale has a semi-humid climate, with a total annual rainfall of 591.5 mm. The maximum rainfall measured in 24 h was 137.8 mm (November 5th, 1956). The long-term average temperature is 15.0°C, and the temperature shows an increasing trend. To date, the highest daily maximum temperature measured is 39.0°C (July 30th, 2007), whereas the lowest daily minimum temperature measured is –11.8°C (February 14th, 2004). The average wind speed in Çanakkale is 3.9 m/s, and the maximum wind speed measured to date is 139.3 km/h (February 15th, 1991) (GDTM 2017).

Figure 1 The study area map.

Besides the city of Çanakkale, the study area also covers two substantial islands of Turkey, Gökçeada and Bozcaada. Gökçeada, the largest island of Turkey, has a surface area of 289 km² and a rough terrain. The island has a dam and three ponds. Bozcaada is the third largest island of Turkey. It has an area of 40 km² and lies 6 km from the mainland. Bozcaada has a Mediterranean climate (GMKA 2017).

In this study, the precipitation data were used for three stations located in the Çanakkale province. These are Çanakkale, Gökçeada and Bozcaada stations. The precipitation data were obtained from the Turkish State Meteorological Service. Some statistical parameters of the precipitation data of these stations are given in table 1.

Table 1 The statistical parameters of the precipitation data.

A homogeneity test was applied to the precipitation data of the Çanakkale, Bozcaada and Gökçeada stations using the double-mass curve method. The cumulative graphs were drawn using annual precipitation averages. When the changes in slope were examined, no breaks were observed in the slopes for the three stations (figure 2).

Figure 2 The double-mass curves for Çanakkale, Gökçeada and Bozcaada stations.

3 Methods

3.1 Standardized precipitation index (SPI)

The standardized precipitation index (SPI), a meteorological drought index based only on precipitation data, was developed by McKee et al. (1993). SPI, which is obtained by dividing the difference between the precipitation values and their mean by the standard deviation, has two advantages. Firstly, it is relatively easy to evaluate because it uses only precipitation data (Cacciamani et al. 2007; Belayneh et al. 2014). Secondly, SPI makes it possible to define drought on multiple time scales (Tsakiris and Vangelis 2004; Mishra and Desai 2006; Cacciamani et al. 2007). For drought analysis with SPI, at least 30 years of continuous data are required (Cacciamani et al. 2007; Belayneh et al. 2014). In the calculation of SPI, the long-term precipitation record for the desired period is fitted to a probability distribution and then transformed to the normal distribution; therefore, the mean SPI value is zero (McKee et al. 1993; Edwards and McKee 1997). SPI is positive if the precipitation is greater than the mean value (the median for the normal distribution) and negative if it is lower. As SPI is normalized, wet and dry climates can be represented in the same way.
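The standardization idea described above can be illustrated with a deliberately simplified sketch. It accumulates precipitation over k months and standardizes the totals by their mean and standard deviation. It does not fit a probability distribution before the normal transformation and does not treat calendar months separately, so it is not the full SPI procedure; the function names and the sample data are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def rolling_sums(monthly_precip, k):
    """Return the k-month accumulated precipitation ending at each month."""
    return [
        sum(monthly_precip[i - k + 1:i + 1])
        for i in range(k - 1, len(monthly_precip))
    ]


def standardized_index(monthly_precip, k):
    """Simplified SPI-like index: z-scores of the k-month totals.

    Assumption: plain standardization replaces the distribution fitting
    and equiprobability transformation used in the actual SPI.
    """
    totals = rolling_sums(monthly_precip, k)
    mu = mean(totals)
    sigma = pstdev(totals)
    if sigma == 0:
        raise ValueError("precipitation totals have zero variance")
    return [(total - mu) / sigma for total in totals]


# Hypothetical monthly precipitation (mm), illustration only.
precip = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
index_3 = standardized_index(precip, 3)
```

By construction the index has zero mean, so negative values mark periods drier than average and positive values mark wetter ones.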

According to the SPI, a drought event occurs when the index is continuously negative and reaches a value of –1.0 or less. The drought continues until the SPI value returns to zero (Tigkas et al. 2015). The SPI drought categories are given in table 2.
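One reading of this definition, following McKee et al. (1993), treats a drought event as a run of continuously negative SPI values that reaches –1.0 or less at least once and ends when the SPI is no longer negative. The sketch below applies that reading; the function name and the sample series are hypothetical.

```python
def drought_events(spi, threshold=-1.0):
    """Return (start, end) index pairs of drought events in an SPI series.

    An event is a maximal run of negative values that reaches
    `threshold` or less at least once.
    """
    events = []
    start = None
    reached = False
    for i, value in enumerate(spi):
        if value < 0:
            if start is None:
                # A new run of negative values begins here.
                start = i
                reached = False
            if value <= threshold:
                reached = True
        else:
            if start is not None and reached:
                events.append((start, i - 1))
            start = None
    # Close a run that lasts until the end of the series.
    if start is not None and reached:
        events.append((start, len(spi) - 1))
    return events


# Hypothetical SPI values, illustration only.
sample_spi = [0.5, -0.3, -1.2, -0.4, 0.1, -0.5, 0.2, -1.5]
events = drought_events(sample_spi)
```

In the sample series the short dry spell at index 5 is not counted, because it never reaches the –1.0 threshold.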

Table 2 SPI categories (Tigkas et al. 2015).

3.2 Discrete wavelet transform

The wavelet function ψ(t), called the mother wavelet, has shock characteristics and decays quickly to zero; mathematically, it satisfies \( \int_{ - \infty }^{ + \infty } \varPsi \left( t \right)dt = 0 \). The wavelets \( \varPsi_{a,b} \left( t \right) \) are obtained by compressing and expanding ψ(t):

$$ \varPsi_{a,b} \left( t \right) = \left| a \right|^{ - 1/2} \varPsi \left( {\frac{t - b}{a}} \right), $$
(1)

where a and b are real numbers, \( \varPsi_{a,b} \left( t \right) \) is the successive wavelet, a is the periodicity (scale) factor and b is the time (translation) factor (Wang and Ding 2003; Tiwari and Chatterjee 2011).

In the wavelet transform, which is a function of the variables a and b, parameter a acts as the expansion (a > 1) or contraction (a < 1) factor of the wavelet function for different scales, and parameter b acts as a shift of ψ(t). The continuous wavelet transform of f(t) ∈ L²(R) can be described as:

$$ w_{f} \left( {a,b} \right) = \left| a \right|^{ - 1/2} \int_{ - \infty }^{ + \infty } f \left( t \right)\varPsi^{*} \left( {\frac{t - b}{a}} \right)dt .$$
(2)

The wavelet transform investigates the relationship between the signal and the wavelet function for different values of a and b, which yields a map of the wavelet coefficients \( w_{f} \left( {a,b} \right) \). Selecting the scales and positions as powers of two makes the analysis more accurate. The discrete wavelet \( \varPsi_{m,n} \) can thus be obtained from equation (3) (Mallat 1989).

$$ \varPsi_{m,n} \left( {\frac{t - b}{a}} \right) = a_{0}^{{ -\, \frac{m}{2}}} \varPsi^{*} \left( {\frac{{t - nb_{0} a_{0}^{m} }}{{a_{0}^{m} }}} \right), $$
(3)

where m and n are integers that control the wavelet expansion and the wavelet translation, respectively. Parameters a0 and b0 are constants, which are generally assumed to be 2 and 1, respectively (figure 3) (Tiwari and Chatterjee 2011).

Figure 3 DWT decomposition of a time series (Tiwari and Chatterjee 2011).

The power-of-two logarithmic scaling of the translations and expansions is called a dyadic grid arrangement (Mallat 1989; Tiwari and Chatterjee 2011). For a time series f(t) observed at discrete times t, the wavelet coefficient \( w_{f} \left( {m,n} \right) \) is given by the following equation:

$$ w_{f} \left( {m,n} \right) = 2^{ - m/2} \sum\limits_{t = 0}^{N - 1} f \left( t \right)\varPsi^{*} \left( {2^{ - m} t - n} \right) ,$$
(4)

where \( w_{f} \left( {m,n} \right) \) is the wavelet coefficient for the discrete wavelet with scale \( a = 2^{m} \) and location \( b = 2^{m} n \); \( f\left( t \right) \) is a finite time series (t = 0, 1, …, N – 1); n is the time translation parameter, which is less than \( 2^{M - m} - 1 \); and N is an integer power of two, \( N = 2^{M} \). The index m varies between 1 and M, and the largest wavelet scale, \( 2^{m} \), is reached when m = M.

Here, \( \bar{W} \) denotes the mean signal. The inverse discrete wavelet transform can thus be calculated as follows:

$$ f\left( t \right) = \bar{W}\left( t \right) + \sum\limits_{m = 1}^{M} {W_{m} } \left( t \right),$$
(5)

where \( \bar{W}\left( t \right) \) is the approximation sub-signal and \( W_{m} \left( t \right) \) are detailed sub-signals (Tiwari and Chatterjee 2011).

The discrete wavelet transform operates with a scaling function (low-pass filter) and a wavelet function (high-pass filter). Passing the original data through these filters divides them into approximation and detail components. This procedure is then repeated, with each successive approximation being decomposed in turn (Tiwari and Chatterjee 2011).
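The repeated approximation/detail split and the additive reconstruction of equation (5) can be sketched with the Haar wavelet, the simplest of the wavelet families. For the Haar wavelet, the level-m approximation expressed on the original time axis is the mean of each block of 2^m values, and the level-m detail sub-signal is the difference between two successive approximations. The function name and sample data below are assumptions for illustration, not the implementation used in the study.

```python
def haar_additive_decomposition(series, levels):
    """Split a series into one approximation and `levels` detail sub-signals.

    Every returned sub-signal has the same length as `series`, so that
    series[t] == approximation[t] + sum(detail[t] for each detail).
    """
    n = len(series)
    if n % (2 ** levels) != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    approx = list(series)
    details = []
    for m in range(1, levels + 1):
        block = 2 ** m
        smoother = []
        for start in range(0, n, block):
            # Low-pass step: Haar approximation is the block mean.
            block_mean = sum(series[start:start + block]) / block
            smoother.extend([block_mean] * block)
        # High-pass step: what the coarser approximation leaves out.
        details.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    return approx, details


# Hypothetical series of length 8 = 2**3, illustration only.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approximation, detail_signals = haar_additive_decomposition(signal, 3)
rebuilt = [
    approximation[t] + sum(detail[t] for detail in detail_signals)
    for t in range(len(signal))
]
```

With three levels on eight points, the approximation collapses to the overall mean of the series, and adding the three detail sub-signals back recovers the original values.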

3.3 Adaptive neural-based fuzzy inference system (ANFIS)

There are two critical processes in ANFIS proposed by Jang (1993). These are: (1) the determination of the rules and (2) the estimation of the parameters of the input and output membership functions by the backpropagation learning algorithm.

In the ANFIS structure, the fuzzy rule base is combined with artificial neural networks. Thus, the new fuzzy neural network is more advantageous in that organization of membership functions and the identification of fuzzy rules can be done automatically for a problem.

ANFIS aims to determine the parameters of a Sugeno-type fuzzy inference system using a hybrid learning model. The Sugeno fuzzy model was proposed by Takagi, Sugeno and Kang as a systematic approach to producing fuzzy rules from an input–output dataset. A typical rule structure in a Sugeno fuzzy model with two inputs x, y and one output z has the following form:

$$ \text{R}^{1}\!\! :{\text{If}}\; {x}\; {\text {is}}\; A_{1} \;{\text{AND}} \; {y}\;{\text {is}} \; {B_{1}} ,\;{\text{THEN}} \; {z} = f_{1} = p_{1} x + q_{1} y + r_{1} $$
(6)
$$ \text{R}^{2}\!\! :{\text{If}} \; {x} \; {\text {is}}\; {A_{2}} \;{\text{AND}}\; {y} \; {\text {is}}\; {B_{2}} ,\;{\text{THEN}}\; {z} = f_{2} = p_{2} x + q_{2} y + r_{2} $$
(7)
$$ \text{R}^{n}\!\! :{\text{If}} \; {x}\; {\text {is}}\; {A_{n}} \;{\text{AND}}\; {y}\; {\text {is}} \; {B_{n}} ,\;{\text{THEN}} \; {z} = f_{n} = p_{n} x + q_{n} y + r_{n} $$
(8)

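where \( A_{i} \) and \( B_{i} \) are the fuzzy sets that give the membership degrees of the inputs x and y, and \( p_{i} \), \( q_{i} \) and \( r_{i} \) are the parameters of the rule output. The output level of each rule is weighted by \( w_{i} \), the firing strength of the rule, which is the product of the membership degrees \( \mu_{{A_{i} }} \left( x \right) \) and \( \mu_{{B_{i} }} \left( y \right) \) of the inputs (equation 9):

$$ w_{i} = \mu_{{A_{i} }} \left( x \right) \cdot \mu_{{B_{i} }} \left( y \right). $$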
(9)

Here, the output of each node gives the firing level of the rule to which it belongs. The firing strengths are then normalized, as shown below for the two-rule case:

$$ \overline{{w_{i} }} = \frac{{w_{i} }}{{w_{1} + w_{2} }}. $$
(10)

The final overall output is calculated by equation (11):

$$ f = \sum\limits_{i} \overline{{w_{i} }} f_{i} = \frac{{\sum\nolimits_{i} w_{i} f_{i} }}{{\sum\nolimits_{i} w_{i} }} . $$
(11)

More detailed information can be obtained from Chang and Chang (2001).
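The inference steps in equations (6)–(11) can be traced with a small sketch of a first-order Sugeno system. Gaussian membership functions and all numerical parameters below are assumptions chosen for illustration; in ANFIS these parameters would be learned from data rather than fixed by hand.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership degree (assumed membership function shape)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def sugeno_output(x, y, rules):
    """Evaluate a first-order Sugeno system with two inputs.

    Each rule is a dict with membership parameters "A" and "B"
    (center, sigma) and linear "consequent" parameters (p, q, r).
    Returns the overall output and the normalized firing strengths.
    """
    strengths = []
    rule_outputs = []
    for rule in rules:
        # Firing strength: product of the input membership degrees.
        w = gaussian_mf(x, *rule["A"]) * gaussian_mf(y, *rule["B"])
        p, q, r = rule["consequent"]
        strengths.append(w)
        rule_outputs.append(p * x + q * y + r)
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fires for this input")
    normalized = [w / total for w in strengths]
    output = sum(nw * f for nw, f in zip(normalized, rule_outputs))
    return output, normalized


# Two hypothetical rules, symmetric about the origin.
example_rules = [
    {"A": (-1.0, 1.0), "B": (-1.0, 1.0), "consequent": (1.0, 1.0, 0.0)},
    {"A": (1.0, 1.0), "B": (1.0, 1.0), "consequent": (0.0, 0.0, 2.0)},
]
z, weights = sugeno_output(0.0, 0.0, example_rules)
```

At the origin both rules fire equally, so the output is the plain average of the two rule outputs (0 and 2).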

3.4 Support vector machines (SVM)

The support vector machine (SVM) is a non-parametric classification method based on statistical learning theory (Vapnik 1995). SVM is widely used in different classification and regression problems with high-dimensional data sets and for the linearly separable case (Mountrakis et al. 2011; Kavzoglu et al. 2014). Conventional artificial intelligence methods, which can suffer from overfitting, employ empirical risk minimization (ERM) theory. SVM, which is based on structural risk minimization (SRM) theory, can perform better than these methods. Unlike ERM, which minimizes the error on the training data, SRM minimizes the upper limit of the expected risk. Another feature of SVM is that it transforms the original data from the input space to a new feature space by means of a kernel function (Shahbazi et al. 2011). The linear function \( f\left( {x_{i} } \right) \) is therefore sought in the following form:

$$ y_{i} = f\left( {x_{i} } \right) = w\phi_{i} \left( {x_{i} } \right) + b, $$
(12)

where \( \phi_{i} \) is a non-linear transformation function mapping the input space into a higher-dimensional feature space, w is a weight vector and b is the bias. The function, which is linear in the feature space, represents a non-linear relation between the inputs \( (x_{i} ) \) and the outputs \( (y_{i} ) \).

SVM was first developed for classification problems and was later extended to regression. SVM performs regression by using an ε-insensitive loss function \( ||y - f\left( x \right)||_{\varepsilon } = \max \left\{ {0,||y - f\left( x \right)|| - \varepsilon } \right\} \). This function penalizes only errors larger than the threshold ε. The SVM regression problem then takes the following form:

$$ {\text{Minimize}}:\,\frac{1}{2}\left| {\left| w \right|} \right|^{2} + C\frac{1}{N}\left( {\sum\limits_{i = 1}^{N} {\left( {\xi_{i} + \xi_{i}^{*} } \right)} } \right) $$
(13)
$$ {\text{Subject to}}\, \left\{ {\begin{array}{*{20}c} {w\phi \left( {x_{i} } \right) + b - y_{i} \le \varepsilon + \xi_{i} } \\ {y_{i} - w\phi \left( {x_{i} } \right) - b \le \varepsilon + \xi_{i}^{*} } \\ {\xi_{i} ,\xi_{i}^{*} \ge 0} \\ {i = 1,2,3, \ldots ,N} \\ \end{array} } \right. $$
(14)

where N is the number of data points in the training dataset; C is a model parameter; \( x_{i} \) are the data points mapped into the feature space; and \( \xi_{i} \) and \( \xi_{i}^{*} \) are positive slack variables (Shahbazi et al. 2011).
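To make the roles of ε, C and the slack variables concrete, the sketch below evaluates the objective of equation (13) for a given linear model, taking each slack variable as the smallest value that satisfies the constraints in equation (14). The identity feature map (φ(x) = x), the function names and the sample data are assumptions for illustration; no optimization is performed.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss that ignores errors not exceeding epsilon."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_primal_objective(weights, bias, inputs, targets, epsilon, c):
    """Value of 0.5*||w||^2 + C/n * sum(xi_i + xi_i*) for a linear model.

    Assumption: phi is the identity map, so f(x) = w . x + b.
    """
    n_points = len(inputs)
    slack_total = 0.0
    for x, y in zip(inputs, targets):
        prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
        # Smallest slack satisfying f(x) - y <= epsilon + xi.
        slack_upper = max(0.0, prediction - y - epsilon)
        # Smallest slack satisfying y - f(x) <= epsilon + xi*.
        slack_lower = max(0.0, y - prediction - epsilon)
        slack_total += slack_upper + slack_lower
    norm_term = 0.5 * sum(w * w for w in weights)
    return norm_term + c * slack_total / n_points


# Hypothetical one-dimensional data, illustration only.
X = [[0.0], [1.0], [2.0]]
y = [0.05, 1.5, 1.0]
objective = svr_primal_objective([1.0], 0.0, X, y, epsilon=0.1, c=1.0)
```

The first point lies inside the ε-tube and contributes nothing, while the other two contribute slacks of 0.4 and 0.9, which shows how only errors beyond the threshold are penalized.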

3.5 Artificial neural networks (ANNs)

Artificial neural networks (ANNs) are complex systems formed by interconnecting artificial neurons, which are inspired by the neurons in the brain, in different connection geometries. ANNs can be described as computing processes and likened to a black box that produces outputs for given inputs (Kohonen 1988).

An artificial neuron consists of five main parts: inputs, weights, summation function, transfer function and output. Inputs are the information entering a neuron from other neurons or from external sources. The weights are values indicating the effect that an input, or another processing element in a previous layer, has on this processing element. As shown in figure 4, the effect of each input on the neuron is determined by its weight. The summation function uses the weights to calculate the combined effect of all inputs on the processing element and thus determines the net input of the neuron. The net input (net) collected in the neuron is obtained as:

$$ net = \sum\limits_{i = 1}^{n} {w_{ij} } x_{i} + b ,$$
(15)

where xi is the input value from the ith neuron; wij are the weight coefficients; n is the total number of inputs to the neuron; b is the threshold value; and \( \sum\) is the summation function. The transfer function determines the output of the neuron by processing the net input obtained from the summation function. In general, the sigmoid function is used as the transfer function (f (.)) in the multilayer perceptron model. The output of the neuron calculated using the sigmoid function is as follows:

$$ y_{i} = f\left( {net} \right) = \frac{1}{{1 + e^{ - net} }}. $$
(16)

The output (yi) obtained from the neuron is transmitted to another neuron or delivered as the output of the neural network (Oztemel 2003).
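Equations (15) and (16) translate directly into a few lines of code. The sketch below computes the output of a single neuron with a sigmoid transfer function; the function name and the sample weights are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias):
    """Output of one artificial neuron with a sigmoid transfer function."""
    # Summation function: weighted sum of the inputs plus the threshold.
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Transfer function: sigmoid maps the net input into (0, 1).
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical inputs and weights, illustration only.
balanced = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)
excited = neuron_output([1.0, 2.0], [0.5, 0.25], 0.0)
```

A net input of zero gives an output of 0.5, and larger net inputs push the output towards 1.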

Figure 4 An artificial neuron.

Artificial neural networks contain many neurons that are connected to each other, and these connections are not random. In general, the neurons form a network with three types of layers. The inputs are located in the input layer, while the outputs are obtained in the output layer. The hidden layers lie between the input and output layers; their outputs cannot be observed directly, and there may be one or more of them (figure 5) (Kartalopoulos 1996). Figure 5 shows the weighted connections between the layers.

Figure 5 An artificial neural network.

4 Results and discussion

Using the precipitation data of the Çanakkale, Bozcaada and Gökçeada stations located in the northwest of Turkey, the drought indices for the years 1975 to 2010 were determined by the standardized precipitation index (SPI) method. These indices were used to develop various models with the wavelet transform (W) and different artificial intelligence (AI) techniques, namely adaptive neural-based fuzzy inference system (ANFIS), support vector machine (SVM) and artificial neural networks (ANNs). The modelling stage consists of two parts: (1) modelling of the SPI series generated from historical records and (2) modelling of the SPI series using SPI subsets obtained by the wavelet transform technique.

In the first part, the ANFIS, SVM and ANNs models were developed to estimate the 3-, 6-, 9- and 12-month SPI series of the Çanakkale station. Various model combinations were tried using the SPI series of the Bozcaada and Gökçeada stations as input parameters. The data from 1975 to 2003 were allocated to the training set, whereas the rest of the data were used as the testing set. The ANFIS, SVM and ANNs models with the optimum performance are given in table 3. The performance of the models was evaluated using the coefficient of determination (R²) and the root mean squared error (RMSE), given in equations (17) and (18).

$$ R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {D_{{i\left( {\text{real}} \right)}} - D_{{i\left( {\text{model}} \right)}} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left( {D_{{i\left( {\text{real}} \right)}} - D_{{i\left( {\text{mean}} \right)}} } \right)^{2} }} $$
(17)
$$ {\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} \left( {D_{{i\left( {\text{real}} \right)}} - D_{{i\left( {\text{model}} \right)}} } \right)^{2} }, $$
(18)

where N is the total data number; Di(real) is the SPI value; Di(model) is the model result; and Di(mean) is the mean SPI value.
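Equations (17) and (18) can be computed as in the following sketch. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def r_squared(real, model):
    """Coefficient of determination as defined in equation (17)."""
    mean_real = sum(real) / len(real)
    ss_residual = sum((r - m) ** 2 for r, m in zip(real, model))
    ss_total = sum((r - mean_real) ** 2 for r in real)
    return 1.0 - ss_residual / ss_total


def rmse(real, model):
    """Root mean squared error as defined in equation (18)."""
    n = len(real)
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(real, model)) / n)


# Hypothetical observed and modelled values, illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
modelled = [1.1, 1.9, 3.2, 3.8]
r2_value = r_squared(observed, modelled)
rmse_value = rmse(observed, modelled)
```

For these sample values the residual sum of squares is 0.10 against a total sum of squares of 5.0, giving an R² of about 0.98 and an RMSE of about 0.158.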

Table 3 The results of ANFIS, SVM and ANNs models.

In addition, the two-sample Kolmogorov–Smirnov (K–S) test was applied to assess the suitability of the models against the calculated 3-, 6-, 9- and 12-month SPI values (tables 3 and 4). The K–S test was performed at the 5% significance level. The null hypothesis H0 was that the two samples follow the same distribution, and the alternative hypothesis Ha was that the distributions of the two samples are different. When the computed p-value is greater than the 5% significance level, the null hypothesis H0 cannot be rejected.
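The test statistic behind the two-sample K–S test is the largest vertical distance between the two empirical cumulative distribution functions. The sketch below computes that statistic and compares it with the common large-sample critical value at the 5% level, 1.358·sqrt((n + m)/(n·m)). This is an approximation used here in place of an exact p-value, and the function names and samples are assumptions for illustration.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical distribution functions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    distance = 0.0
    for value in a + b:
        # Empirical CDFs evaluated at each observed value.
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


def ks_reject_at_5_percent(sample_a, sample_b):
    """True if H0 (same distribution) is rejected at the 5% level.

    Assumption: large-sample critical value with coefficient 1.358.
    """
    n, m = len(sample_a), len(sample_b)
    critical = 1.358 * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample_a, sample_b) > critical


# Hypothetical samples, illustration only.
same = ks_reject_at_5_percent([1, 2, 3, 4], [1, 2, 3, 4])
shifted = ks_reject_at_5_percent([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

Identical samples give a statistic of zero, so H0 is retained, whereas two completely separated samples give a statistic of one and H0 is rejected.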

Table 4 The results of the W-ANFIS, W-SVM and W-ANNs models.

As can be seen from table 3, for these AI techniques the models obtained for the 6- and 9-month periods perform better than those obtained for the 3- and 12-month periods. The SVM model gave the best performance for the 9- and 12-month periods, while the highest R² value, 0.846, was obtained for the 6-month period with the ANNs model.

In the second part, the wavelet transform was used as a pre-processing technique for the ANFIS, SVM and ANNs models, on the premise that increasing the number of inputs (sub-series) could improve model performance. In the pre-processing stage, the 3-, 6-, 9- and 12-month SPI values of the Gökçeada and Bozcaada stations were decomposed into eight detail components (2-4-8-16-32-64-128-256) and one approximation component using the discrete wavelet transform. The Haar, Daubechies (db) and DMeyer (Dmey) wavelets, which are the most commonly used wavelets in the discrete wavelet transform technique, were selected to form the sub-series. Then, the correlation values between the sub-series and the SPI values of the Çanakkale station were calculated. The highest correlation values were found for the sub-series W1, W2 and W3 of the Haar, db and Dmey wavelets. Along with these individual sub-series, the sums W1+W2 and W1+W2+W3 were also tried as inputs to develop various hybrid W-ANFIS, W-SVM and W-ANNs models. Since the hybrid models developed with the Dmey wavelet gave the highest R² values for the W-ANFIS, W-SVM and W-ANNs models, only the results of the models developed with the Dmey wavelet are given in table 4.
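The input selection step described above (correlating each sub-series with the target SPI series and summing the best-correlated ones) could be sketched as follows. The Pearson correlation coefficient, the ranking by signed correlation, the helper names and the sample data are all assumptions for illustration; the study does not state which correlation measure was used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_subseries(subseries, target):
    """Return (name, r) pairs sorted from highest to lowest correlation."""
    scored = [(name, pearson_r(values, target)) for name, values in subseries.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def sum_series(*series):
    """Element-wise sum, e.g. W1+W2 as a single model input."""
    return [sum(values) for values in zip(*series)]


# Hypothetical sub-series and target, illustration only.
target_spi = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "W1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "W2": [1.0, -1.0, 1.0, -1.0, 1.0],
    "W3": [5.0, 4.0, 3.0, 2.0, 2.0],
}
ranking = rank_subseries(candidates, target_spi)
combined_input = sum_series(candidates["W1"], candidates["W2"])
```

The ranked list can then guide which sub-series, or sums of sub-series, are passed to the models as inputs.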

According to table 4, the models improved with the wavelet transform technique perform better than the models in table 3. For the ANFIS models, the R² values increased and the error values decreased for the 3-, 6- and 9-month periods, but not for the 12-month period. Similarly, the performance of the ANNs and SVM models increased slightly after the application of the wavelet transform technique. On the other hand, the W-SVM model for the 9-month period performed worse than the SVM model. Across all of the models, the greatest performance improvement with the wavelet transform technique was observed for the 9-month ANFIS model, but the highest R² value for the testing set (0.874) was obtained from the 6-month W-ANFIS model. The inputs of the 6-month W-ANFIS model were the W1+W2 subsets of Gökçeada and the W1+W2+W3 subsets of Bozcaada. Accordingly, scatter diagrams of the ANFIS and W-ANFIS models for the 6-month period are given in figures 6 and 7. The scatter diagrams show that the SPI values are in line with the results of the ANFIS and W-ANFIS models. In addition, the time series of the SPI, ANFIS and W-ANFIS results are given in figure 8 to show the compatibility of the models.

Figure 6 Scatter diagrams for the ANFIS model.

Figure 7 Scatter diagrams for the W-ANFIS model.

Figure 8 Time series of the SPI, ANFIS and W-ANFIS results.

The cumulative distribution function graphs of the ANFIS and W-ANFIS models for the 6-month SPI values are given in figure 9 for the training and testing sets.

Figure 9 Cumulative distribution functions (a) ANFIS model (b) W-ANFIS model for training and testing sets.

According to the two-sample K–S test, the H0 hypothesis could not be rejected. The p-values of the 6-month ANFIS and W-ANFIS models for the testing set were 0.740 and 0.856, respectively. These values are supported by the high R² values seen in tables 3 and 4. The p-values of the 9-month ANFIS and W-ANFIS models were also high. Although the 9-month and 6-month ANFIS and W-ANFIS models performed similarly, the R² value of the 6-month W-ANFIS model was considerably greater than those of the other models, indicating that the 6-month W-ANFIS model performed best.

5 Conclusions

In recent years, drought has been one of the most important problems in the world for both living beings and the economy. Measures to monitor and reduce the effects of drought, which is defined as the inability to meet water demand owing to the scarcity of water resources, are of great importance. Therefore, in water resources studies, artificial intelligence techniques, which learn from examples to solve a specific problem and do not need expert knowledge, and data pre-processing methods that increase the performance of these techniques have both become crucial in recent years.

In this study, drought estimation was performed for the Çanakkale province. Firstly, drought series were obtained for the 3-, 6-, 9- and 12-month periods with SPI. Secondly, ANN, ANFIS and SVM models were developed to estimate the drought series for the 3-, 6-, 9- and 12-month periods. Then, new input sets were obtained by decomposing all these series into sub-series with the wavelet transform technique, and finally, W-ANN, W-ANFIS and W-SVM models were developed.

The SVM model was more successful in drought forecasting for the 9- and 12-month periods, whereas the ANFIS and ANNs models performed better in the estimation of the 3- and 6-month SPI values, respectively. The W-ANFIS models developed with the wavelet transform technique gave better results in the 3-, 6- and 9-month estimations. For the 12-month SPI estimation, the W-SVM model gave the best result. Overall, the wavelet transform technique improved the performance of the ANFIS, ANN and SVM models in the drought forecasting of the Çanakkale province. When data decomposition was applied to all three artificial intelligence methods, the improvement in model performance was greatest for ANFIS. According to the RMSE, R² and K–S test results, the most appropriate drought forecast was obtained with the 6-month W-ANFIS model. In conclusion, artificial intelligence methods appear to be applicable in drought forecasting, and they perform better when combined with wavelet pre-processing in hybrid models.