1 Introduction

Air temperature is one of the most important meteorological elements influencing hydrologic phenomena. Hourly air temperature data are required in several environmental and agricultural applications, such as crop growth models, simulation of crop photosynthesis, soil N transformation, and root growth, and estimation of hourly crop evapotranspiration. In addition, the prediction of hourly air temperature is needed in soil science for simulating soil temperature at different soil layers. In fact, instantaneous air temperature affects the flux of heat into and out of the soil and the vertical flux of sensible heat from the earth (Rosenberg et al. 1983; Zand-Parsa et al. 2006; Saito et al. 2006; Majnooni-Heris et al. 2011).

In developing countries, the records of meteorological stations cover only daily maximum and minimum air temperatures, and hourly air temperatures are rarely measured. Equipping meteorological stations to measure hourly data is expensive in these countries. Under these conditions, using daily air temperature to estimate hourly air temperature indirectly offers an easy and reliable alternative. One such approach, which has captured researchers' attention in the past decades, is the artificial neural network (ANN), which has been applied in various fields of water resources and hydrology (Kisi 2007; Rezaeian Zadeh et al. 2010; Marofi et al. 2011; Hosseinzadeh Talaee et al. 2011; Tabari et al. 2010a, b, 2011).

The application of ANNs to hydrologic processes, especially rainfall-runoff simulation, goes back to the 1990s (e.g., Halff et al. 1993; Hjemfelt and Wang 1993; Karunanithi et al. 1994; Hsu et al. 1995; Smith and Eli 1995; Minns and Hall 1996). Preliminary concepts and hydrologic applications of ANNs have been detailed by ASCE (2000a, b) and Govindaraju and Rao (2000).

ANNs have been applied to the prediction of time-dependent temperatures less frequently than in other areas of knowledge. Some investigations were conducted for the prediction of daily, half-daily, or maximum air temperatures (e.g., Smith et al. 2006; Moon et al. 2009). Tasadduq et al. (2002) used ANNs for the prediction of hourly mean values of ambient temperature 24 h in advance. The results of their study showed that the ANN can be a valuable tool for hourly temperature prediction in particular, and for other meteorological predictions in general. It must be noted that they used just one temperature value as input to the ANNs in their investigations. Abdel-Aal (2004) applied the 24 hourly temperatures of the previous day as input to the ANNs for forecasting the 24 hourly temperatures of the next day. The performance of the ANNs was significantly superior to naive forecasts based on persistence and climatology. In another recent study, Dombayci and Gölcü (2009) developed an ANN model to predict daily mean ambient temperatures in Denizli, south-western Turkey. They employed monthly and daily temperatures and also the mean temperature value of the previous day as input data for the ANN. Their results showed that the ANN approach is a reliable model for daily mean ambient temperature prediction.

Daily minimum and maximum air temperatures are commonly measured at Iranian weather stations, but hourly air temperature has been recorded electronically by automatic weather stations only since 2000, so hourly values for earlier years were not accessible to us. In addition, in some periods, the recorded data are corrupted and unusable. Hence, the missing data can be derived using ANN models.

To the best of the authors' knowledge, no study has used daily min/max air temperatures together with antecedent daily min/max values as inputs to ANNs for deriving hourly air temperature. In addition, applying radial basis function (RBF) networks with these inputs may provide a constructive assessment of ANNs in data-driven problems (here, hourly air temperature).

To that end, issues such as the choice of transfer function, learning algorithm, and network type, and finding the optimal network structure, need careful consideration. Transfer functions that are most commonly employed in multi-layer perceptrons (MLPs) are sigmoidal-type functions such as the logistic and hyperbolic tangent functions (Maier and Dandy 2000). The objective of this study is to evaluate the applicability and capabilities of ANN models and empirical equations for the prediction of hourly air temperature using data from Fars province, Iran. Hence, two ANN models (MLP and RBF) and one empirical method were used and their performances were compared.

2 Materials and methods

2.1 Multi-layer perceptron

MLP is perhaps the most popular ANN architecture (Dawson and Wilby 1998). It is a network formed by simple neurons called perceptrons. A perceptron computes a single output from multiple real-valued inputs by forming a linear combination according to its input weights and then possibly passing the result through some nonlinear transfer function (see Fig. 1).

Fig. 1 Schematic of a typical MLP

Mathematically this can be represented as:

$$ y = f(\sum\limits_{{i = 1}}^n {{w_i}{p_i}} + b) $$
(1)

where \( w_i \) represents the weight vector, \( p_i \) is the input vector (\( i = 1, 2, \ldots, n \)), b is the bias, f is the transfer function, and y is the output. The transfer function used in this study was the tangent sigmoid function defined for any variable s as:

$$ f(s) = \tfrac{2}{{(1 + {e^{{ - 2s}}})}} - 1 $$
(2)
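As a minimal sketch of Eqs. 1 and 2 (not the authors' MATLAB code), the following Python snippet evaluates a single neuron with the tangent sigmoid transfer function. The function names `tansig` and `perceptron` and the numerical inputs, weights, and bias are illustrative assumptions.

```python
import math


def tansig(s):
    """Tangent sigmoid of Eq. 2: 2 / (1 + exp(-2s)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * s)) - 1.0


def perceptron(inputs, weights, bias, transfer=tansig):
    """Single neuron of Eq. 1: y = f(sum_i w_i * p_i + b)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    s = sum(w * p for w, p in zip(weights, inputs)) + bias
    return transfer(s)


# Hypothetical normalized inputs, weights, and bias, chosen only for illustration.
y = perceptron(inputs=[0.80, 0.30], weights=[0.5, -0.25], bias=0.1)
print(y)
```

Equation 2 is algebraically identical to the hyperbolic tangent, so the output of such a neuron always lies between −1 and +1.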

The MLP is usually trained using the error back-propagation algorithm. This popular algorithm works by iteratively changing a network's interconnecting weights such that the overall error (i.e., between observed values and the outputs predicted by the ANN) is minimized (Sudheer et al. 2002).

2.2 Neural networks training algorithm

In this study, the Levenberg–Marquardt MLP training algorithm was used (More 1977). The Levenberg–Marquardt algorithm approaches a second-order training speed without having to compute the Hessian matrix. This algorithm produced better results for the application under consideration. The objective of the training is to minimize the global error E defined as:

$$ E = \tfrac{1}{P}\sum\limits_{{p = 1}}^P {{E_p}} $$
(3)

where P is the total number of training patterns, and \( E_p \) is the error for training pattern p. \( E_p \) was calculated as:

$$ {E_p} = \tfrac{1}{2}\sum\limits_{{k = 1}}^n {{{({o_k} - {t_k})}^2}} $$
(4)

where n is the total number of output nodes, \( o_k \) is the network output at the kth output node, and \( t_k \) is the target output at the kth output node (Kisi 2007). In the training algorithm, an attempt was made to reduce this global error by adjusting weights and biases.
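The error definitions of Eqs. 3 and 4 can be sketched in a few lines of Python. This is only an illustration of the two formulas, not the Levenberg–Marquardt procedure itself. The function names and the small example arrays are assumptions.

```python
def pattern_error(outputs, targets):
    """Eq. 4: half the sum of squared differences over the output nodes."""
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, targets))


def global_error(all_outputs, all_targets):
    """Eq. 3: mean of the per-pattern errors over all training patterns."""
    errors = [pattern_error(o, t) for o, t in zip(all_outputs, all_targets)]
    return sum(errors) / len(errors)


# Two hypothetical training patterns with three output nodes each
# (the networks in this study have 24 output nodes).
outputs = [[0.5, 0.6, 0.7], [0.2, 0.3, 0.4]]
targets = [[0.5, 0.5, 0.5], [0.2, 0.2, 0.2]]
E = global_error(outputs, targets)
print(E)
```

A training algorithm would recompute this quantity after each weight update and stop once it no longer decreases.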

2.3 Radial basis function

The RBF ANN model, developed by Powell (1987) and Broomhead and Lowe (1988), consists of an input layer, a single hidden layer, and an output layer. Figure 2 shows a typical RBF model. As in MLP networks, the numbers of input and output nodes are determined by the nature of the actual input and output variables. However, RBF networks tend to learn much faster than an MLP. The output of the RBF was calculated as:

$$ Y = \sum\limits_{{p = 1}}^P {{W_p}} \theta \left( {\left\| {X - {X_p}} \right\|} \right) $$
(5)

where X is the input value, Y is the output value, θ( ) is the radial basis function, \( W_p \) is the weight connecting the pth hidden node to the output node, \( X_p \) represents the center of the pth hidden node (which depends on the observed input data), P is the number of hidden nodes, and \( \left\| {X - {X_p}} \right\| \) is the Euclidean distance between the input and the center of the hidden node. Each hidden node represents a group of input nodes that have similar information from the input data. The transformation associated with each node of the hidden layer is a Gaussian function (Sudheer et al. 2002).
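The following Python sketch evaluates Eq. 5 for a single output node with a Gaussian basis function. The exact form of the Gaussian and its spread are not given in the text, so the `width` parameter, the function names, and the example centers and weights are all assumptions.

```python
import math


def gaussian(r, width=1.0):
    """Gaussian radial basis function; `width` is an assumed spread parameter."""
    return math.exp(-(r ** 2) / (2.0 * width ** 2))


def rbf_output(x, centers, weights, width=1.0):
    """Eq. 5: weighted sum of basis functions of the distance to each center."""
    if len(centers) != len(weights):
        raise ValueError("one weight is needed per hidden-node center")
    total = 0.0
    for center, w in zip(centers, weights):
        r = math.dist(x, center)  # Euclidean distance between input and center
        total += w * gaussian(r, width)
    return total


# Hypothetical two-node hidden layer in a two-dimensional input space.
centers = [[0.0, 0.0], [3.0, 4.0]]
weights = [2.0, 1.0]
y = rbf_output([0.0, 0.0], centers, weights)
print(y)
```

Because the input coincides with the first center, that node responds with its full weight, while the distant second node contributes almost nothing. This locality is what distinguishes the RBF hidden layer from the sigmoid hidden layer of an MLP.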

Fig. 2 Schematic of a typical RBF

2.4 Development of ANN models

In general, the development of ANN models involves four stages: architectural design, training, testing, and optimization. The network architectural design refers to assigning the number of processing elements (neurons) that perform calculations and the number of layers that contain these neurons. Training is the stage at which data records are introduced to a preconfigured network to discern relationships between input and output variables. During this stage, data are selected and entered into the network, and the weights are repeatedly adjusted so that the network outputs match the observed values and the errors are minimized. This process is repeated until the network output has converged and the global error has, ideally, reached its minimum. When convergence is achieved, training is stopped, the weights are fixed, and the network is said to be trained. During the testing phase, the trained network is tested using other data that have not been used for training. Optimization involves adjusting the number of neurons, the transfer functions, and their coefficients and fine-tuning other parameters for optimum network performance (Weiss and Kulikowski 1991).

2.5 Analytical method

Hourly air temperature has been estimated from daily maximum and minimum air temperatures by empirical functions, such as linear, exponential, or sinusoidal functions (Baskerville and Emin 1969; Allen 1976; Ephrath et al. 1996; Saito et al. 2006). Saito et al. (2006), Zand-Parsa et al. (2006), and Majnooni-Heris et al. (2011) estimated the instantaneous air temperature using the equation proposed by Kirkham and Powers (1972) as follows:

$$ T = \overline T + {A_T}\cos (2\pi \frac{{t - 13}}{{24}}) $$
(6)

where T and \( \bar{T} \) are the instantaneous and average daily temperatures (degrees Celsius), respectively, \( A_T \) is the amplitude of air temperature (half of the difference between the daily maximum and minimum air temperatures, degrees Celsius), and t is the local time within the day (hours). In Eq. 6, the cosine function reaches its maximum (+1) at 1300 hours and its minimum (−1) at 0100 hours, so T reaches its maximum and minimum at these times, respectively. In this study, Eq. 6 was also used for predicting hourly air temperature.
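Equation 6 can be evaluated directly from one day's maximum and minimum temperatures, as in the Python sketch below. The text does not state how the average daily temperature is obtained, so taking it as the mean of the daily maximum and minimum is an assumption here, as are the function name and the example values.

```python
import math


def hourly_temperature(t_max, t_min, hour):
    """Eq. 6: sinusoidal estimate of air temperature at local time `hour`."""
    mean = (t_max + t_min) / 2.0       # assumed definition of the daily average
    amplitude = (t_max - t_min) / 2.0  # half the daily temperature range
    return mean + amplitude * math.cos(2.0 * math.pi * (hour - 13.0) / 24.0)


# Hypothetical day with a maximum of 30 degrees C and a minimum of 10 degrees C.
curve = [hourly_temperature(30.0, 10.0, h) for h in range(24)]
for h in (1, 7, 13, 19):
    print(h, round(curve[h], 2))
```

Under this assumption the curve passes through the daily minimum at 0100 hours, the daily maximum at 1300 hours, and the daily mean at 0700 and 1900 hours. It is symmetric about the mean, which is one reason a fixed sinusoid may fit observed diurnal cycles less closely than a model trained on data.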

3 Dataset and methodology

Daily maximum and minimum air temperature and hourly air temperature datasets were obtained from the Arsanjan synoptic station (53°16′36″ E, 29°56′36″ N) and the Bajgah (52°37′00″ E, 29°44′00″ N) and Kooshkak (52°36′00″ E, 30°06′00″ N) climatology stations located in Fars province in southwest Iran (Fig. 3). The air temperature data from all stations were for the year 2007. The climate of Bajgah is semi-arid with warm summers, and most of the rain occurs in the winter months (Sepaskhah and Andam 2001). Kooshkak has the same climate as Bajgah (semi-arid). Arsanjan is classified as an arid to semi-arid region, with mean annual precipitation, evaporation, and temperature of 323.8 mm, 989.1 mm, and 18.2°C, respectively (Emadi et al. 2010).

Fig. 3 The location of selected stations in Fars province, Iran

A total of 255 daily maximum and minimum air temperature values and 6,120 hourly air temperature values were used for the training phase (70% of the data), and 110 daily maximum and minimum temperature values and 2,640 hourly air temperature values were used for the testing phase (30% of the data).

As mentioned earlier, the MLP and RBF networks were implemented to predict hourly air temperature. The MLP was trained using the Levenberg–Marquardt algorithm (MLP-LM). The tangent sigmoid transfer function was used for the hidden layer and the linear function for the output layer. Two methods of presenting the data were used for training and testing the networks: in the first, time series (TS) data were used without randomization, and in the second, randomized (RZ) data were used. Accordingly, four input vectors were employed for the MLP and RBF networks, of which the first two are based on the TS data sets and the other two on the RZ data sets. Table 1 summarizes the combinations of input data used in the simulations. The targets (outputs) of the networks were the 24-h time series of air temperature in a day.

Table 1 Combinations of MLPs and RBFs for hourly air temperature derivation
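As a rough illustration of how such input and target pairs might be assembled, the Python sketch below builds one sample per day and splits the samples 70/30, either in chronological order (TS) or after shuffling (RZ). It assumes the two input combinations are the current-day maximum and minimum, with or without the previous day's values. The function names, the seed, and the simple chronological cut are assumptions and do not reproduce the authors' MATLAB procedure.

```python
import random


def build_samples(t_max, t_min, hourly, use_previous_day=True):
    """Pair each day's inputs with its 24 hourly target temperatures.

    With use_previous_day=True the inputs are the current-day maximum and
    minimum followed by the previous-day maximum and minimum; the first day
    is skipped because it has no antecedent values.
    """
    samples = []
    start = 1 if use_previous_day else 0
    for i in range(start, len(t_max)):
        inputs = [t_max[i], t_min[i]]
        if use_previous_day:
            inputs += [t_max[i - 1], t_min[i - 1]]
        samples.append((inputs, hourly[i]))
    return samples


def split(samples, train_fraction=0.7, randomize=False, seed=0):
    """Split into training and testing sets, shuffling first for RZ data."""
    samples = list(samples)
    if randomize:
        random.Random(seed).shuffle(samples)
    n_train = round(train_fraction * len(samples))
    return samples[:n_train], samples[n_train:]


# Synthetic 10-day record, for illustration only.
days = 10
t_max = [20.0 + i for i in range(days)]
t_min = [5.0 + i for i in range(days)]
hourly = [[float(i)] * 24 for i in range(days)]

samples = build_samples(t_max, t_min, hourly)
ts_train, ts_test = split(samples)                  # time series (TS)
rz_train, rz_test = split(samples, randomize=True)  # randomized (RZ)
print(len(ts_train), len(ts_test))
```

Shuffling before the split spreads all seasons across both subsets, whereas a chronological split can leave the test set with conditions that were rarely seen during training.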

Because of the use of sigmoid functions in the ANN model, the hydrologic data must be normalized onto the range [0, 1] before applying the ANN methodology. It was found to be useful to normalize the time series to the range [0.05, 0.95] to avoid the problem of output signal saturation that can sometimes be encountered in ANN applications (Smith 1993). Thus, before applying ANNs, all data were normalized and to that end they were transformed into the range of [0.05, 0.95] as:

$$ {X_n} = 0.05 + 0.9\tfrac{{{X_r} - {X_{{\min }}}}}{{{X_{{\max }}} - {X_{{\min }}}}} $$
(7)

where \( X_n \) and \( X_r \) are the normalized input and the original input, respectively, and \( X_{\min} \) and \( X_{\max} \) are the minimum and maximum of the input range, respectively.
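A short Python sketch of Eq. 7 and its inverse is given here for illustration. The inverse is simply Eq. 7 solved for the original value, and the function names and the example temperature range are assumptions.

```python
def normalize(x, x_min, x_max):
    """Eq. 7: map x from [x_min, x_max] onto [0.05, 0.95]."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return 0.05 + 0.9 * (x - x_min) / (x_max - x_min)


def denormalize(x_n, x_min, x_max):
    """Inverse of Eq. 7: map a normalized value back to the original units."""
    return x_min + (x_n - 0.05) * (x_max - x_min) / 0.9


# Hypothetical temperature range of -5 to 40 degrees C.
x_n = normalize(17.5, -5.0, 40.0)
print(x_n, denormalize(x_n, -5.0, 40.0))
```

The same minimum and maximum must be used for the forward and inverse transforms, otherwise the network outputs cannot be compared with the measured temperatures.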

Normalized data were used to train both the MLP and RBF ANN models. The ANN simulations were programmed in MATLAB using the Neural Network Toolbox. The randomization of the data was also performed in MATLAB.

A three-layer network with a sigmoid transfer function in the hidden layer and a linear transfer function in the output layer can represent any functional relationship between inputs and outputs, provided the sigmoid layer has enough neurons (Hagan et al. 1996). Therefore, three-layer networks (one hidden layer) were used for all MLPs. The ANN results were transformed back to the original domain and the root mean square errors (RMSE) were computed for both the training and testing data for each ANN.
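The error measures used to evaluate the networks (RMSE here, and the coefficient of determination and mean absolute error reported with the results) can be sketched in Python as follows. The text does not give the formula used for the coefficient of determination, so computing it as the squared Pearson correlation is an assumption, as are the function names and example values.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical observed and predicted temperatures (degrees C).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 13.0, 15.0, 17.0]
print(rmse(observed, predicted), mae(observed, predicted), r_squared(observed, predicted))
```

In this example the predictions are offset by a constant 1°C, so the squared correlation equals one even though RMSE and MAE are both 1°C. This is why the correlation-based measure is best read together with the error measures rather than on its own.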

4 Results and discussion

To find the optimum network, various numbers of epochs and neurons were examined. The architecture that produced the smallest error was used for the development of networks for the derivation of hourly temperature. The optimal numbers of neurons for the input, hidden, and output layers of the MLPs were 4, 6, and 24, respectively.

To allow a fair comparison with the MLPs, the RBF models were developed using the same data sets. The optimal number of neurons in the hidden layer of the RBF was 75, and the other layers (input and output) had the same numbers of neurons as the MLPs. This number of hidden neurons was obtained using a simple trial and error procedure. The number of parameters in the RBF models was high compared with the MLPs.
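To give a sense of scale for this difference, the Python sketch below counts the adjustable parameters of a 4-6-24 MLP and a 4-75-24 RBF network. The counting conventions are assumptions: one bias per hidden and output neuron in the MLP, and one center vector and one width per hidden node plus biased linear output weights in the RBF network.

```python
def mlp_parameter_count(n_in, n_hidden, n_out):
    """Weights plus biases of a one-hidden-layer MLP."""
    hidden = (n_in + 1) * n_hidden   # input weights and one bias per hidden neuron
    output = (n_hidden + 1) * n_out  # hidden weights and one bias per output neuron
    return hidden + output


def rbf_parameter_count(n_in, n_hidden, n_out):
    """Centers, widths, and output weights of an RBF network (assumed layout)."""
    centers = n_in * n_hidden        # one center vector per hidden node
    widths = n_hidden                # one Gaussian width per hidden node
    output = (n_hidden + 1) * n_out  # linear output weights and biases
    return centers + widths + output


mlp = mlp_parameter_count(4, 6, 24)
rbf = rbf_parameter_count(4, 75, 24)
print(mlp, rbf)  # 198 2199
```

Under these conventions the RBF network has roughly eleven times as many parameters as the MLP, most of them in the output layer.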

The values of RMSE and the coefficient of determination (\( R^2 \)) for the Arsanjan, Bajgah, and Kooshkak station datasets were computed to evaluate the performances of the ANNs, as shown in Tables 2, 3, and 4, respectively.

Table 2 Performance of MLP and RBF models for Arsanjan station
Table 3 Performance of MLP and RBF models for Bajgah station
Table 4 Performance of MLP and RBF models for Kooshkak station

The values of RMSE and \( R^2 \) between the measured values and the values predicted by the ANN models were close to zero and one, respectively, showing that the ANN models were capable of predicting hourly air temperature from daily maximum and minimum air temperatures. The RMSE values for MLP training were close to each other for both TS and RZ data. The tables show that, based on RMSE, the validated MLP2 with the TS data was more accurate than MLP1. Similar results were obtained for randomized data, but there was no noticeable improvement of MLP2 over MLP1. Hence, the addition of input vectors did not necessarily lead to improved results. The values of \( R^2 \) for the MLPs were high for both time series and randomized data. Comparison of the RMSE and \( R^2 \) values for TS and RZ data showed that, for the MLPs, randomized data resulted in higher estimation (data derivation) efficiency. It should be noted that, when the TS data were imported to the network, the first 2 months of each season were used for the training phase and the remaining month of that season for the testing phase.

For MLP2, the RMSE values against measured data with the TS and RZ data were 1.3°C and 1.2°C for Arsanjan, 2°C and 1.9°C for Bajgah, and 1.9°C and 1.7°C for Kooshkak, respectively. The corresponding values of \( R^2 \) were 0.98 and 0.97, 0.96 and 0.97, and 0.96 and 0.96. The values of RMSE and \( R^2 \) for the RBF models showed that the RBF2 models were not applicable for predicting hourly air temperature, but the RBF1 models gave acceptable results. Clearly, the MLP model performance was superior to that of the RBF model.

Moreover, the mean absolute error (MAE) was calculated for the testing phase at all stations, and the results are presented in Table 5. The MAE values for the RBF2 models were very high, so these models were not appropriate for the derivation of hourly air temperature. Adding the antecedent daily maximum and minimum air temperatures reduced the performance of the RBF networks. Hence, importing additional inputs to the networks to enhance hourly air temperature derivation was not always useful, especially for RBF networks. The results obtained from the other models were acceptable, and MLP1 and MLP2 were more accurate than RBF1. Among these models, the lowest and highest values of MAE were 0.99°C and 3.80°C, respectively.

Table 5 Mean absolute error (MAE) values for MLP and RBF models for all stations in testing phase

The estimated and measured hourly air temperatures of MLP2 with TS data for 5 days of January (winter season) and July (summer season) are shown in Fig. 4. This figure shows that MLP-simulated hourly air temperatures were in close agreement with observed values.

Fig. 4 Simulated and measured hourly air temperature for testing phase of MLP2 with time series data (5 days of January and July)

Comparison of simulated and measured hourly air temperatures of MLP2 using the TS and RZ data is shown in Fig. 5. The MLP2 model for three stations showed the closest matching of estimated and observed data.

Fig. 5 Comparison of simulated and observed hourly air temperatures for three stations for time series (TS) and randomized (RZ) data, respectively

Comparison of the two parts of Fig. 5 showed that testing using the RZ data was more satisfactory than using the TS data. Furthermore, the MLP-LM methods predicted hourly air temperatures close to measured values.

Since both training and validation of the models were done using data for the year 2007, the generalization ability of the ANN models was further evaluated by validation with data for a different year. The data for the year 2008 from Arsanjan station were imported to the models, and the results are presented in Table 6. The proposed models were appropriate for the derivation of hourly air temperature from min/max temperature data, which supports their generalization ability. Since only daily maximum and minimum air temperatures were measured at most meteorological stations in Iran before 2000, hourly air temperature for those years can be estimated using artificial neural networks.

Table 6 Performance of MLP and RBF models for Arsanjan station in 2008

To assess whether the proposed models can be applied to derive hourly air temperature at other stations, MLP2 was trained using data from Arsanjan station and validated using data from Bajgah station. For this purpose, the 365 daily (time series) values of max/min air temperature and the related 8,760 hourly values of air temperature for the year 2007 from Arsanjan station were imported to MLP2 (the best constructed model), which was trained on these data. Then, the 365 daily (time series) values of max/min air temperature from Bajgah in 2007 and the related 8,760 hourly values were used to validate the trained MLP2. The comparison of simulated and measured data is depicted in Fig. 6. The values of \( R^2 \) and RMSE were 0.938 and 2.430°C, respectively. These results showed that the model constructed for one station can be used for other stations with similar climatic conditions.

Fig. 6 Comparison of simulated and observed hourly air temperatures for validation of Bajgah station using trained MLP2 by Arsanjan station

5 Comparison with analytical method

Kirkham and Powers (1972) proposed an analytical equation (Eq. 6) for estimation of hourly air temperature and it has been applied by different authors. The values of RMSE and \( R^2 \) between the measured values and the values predicted by this analytical equation were 3.1°C and 0.84, respectively. Also, the MAE value related to this equation was equal to 4.1°C.

We applied Eq. 6 to the selected stations for 5 days of January (winter season) and July (summer season) and the results are shown in Fig. 7. In all cases, the errors in the derived hourly air temperature values were greater and the values of \( R^2 \) were lower than those of the ANN methods.

Fig. 7 Comparison of the observed and predicted hourly air temperatures by the Kirkham and Powers equation at the study stations (5 days of January and July)

6 Conclusions

As a case study, hourly air temperature was predicted from daily min/max air temperature data for three stations located in various geographic and elevation zones in Fars province, Iran. The results of this study show that hourly air temperature derivation using ANNs (the proposed models) has less error than the empirical equation that is used worldwide. Moreover, the multi-layer perceptron ANN with a tangent sigmoid transfer function performs better than the radial basis function network in predicting hourly air temperature. Testing the proposed models with randomized data shows higher estimation (data derivation) efficiency. A comparison of the two input models shows that using \( T_{\max,i} \), \( T_{\min,i} \), \( T_{\max,i-1} \), and \( T_{\min,i-1} \) is better than using only the daily values \( T_{\max,i} \) and \( T_{\min,i} \), but adding these antecedent values does not considerably improve the estimation efficiency. On the other hand, RBF models have more error with the addition of inputs. To test generalization, data for another year from one station were imported to the models. It is found that the proposed models are appropriate for the derivation of hourly air temperature from min/max temperature data, which supports the generalization ability of the proposed models. The model of Kirkham and Powers (1972) has been applied successfully by different authors for simulation of hourly crop evapotranspiration, hourly soil temperature at different layers, and crop growth. Because the ANN models gave lower errors than this equation, the ANN method is an attractive option for the prediction of hourly air temperature in plant and soil models. The present study shows that hourly air temperature can be successfully constructed from records of daily maximum and minimum air temperatures. The generalization ability of the proposed models for application in different time periods and at different stations was also assessed.
The ANN models developed here provide a simple and accurate means to predict hourly air temperature for the years before 2000 in the study region, and to fill in data missing because of faulty operation of the measuring device, both in the region and under similar climatic conditions.