1 Introduction

To keep the ecological system in balance and secure clean air, water and food, less-polluting renewable energy sources should be used to meet energy demand in developed, developing and underdeveloped countries alike. Renewable energies are energy sources that are sustainable (non-depletable), ubiquitous (found everywhere) and essentially non-polluting (Nelson 2011). Among them, solar energy ranks first because of its abundance and its more uniform distribution in nature.

Consequently, estimating the solar radiation reaching the earth’s surface is of paramount importance for applications such as photovoltaics, cooling and heating, agriculture, medical studies and seawater desalination. Such estimates are usually obtained with solar measuring equipment, which requires daily maintenance and data recording and therefore increases the cost of data collection. Furthermore, some remote and rural areas are suitable for solar energy installations but lack the necessary measuring devices (Azadeh et al. 2009a).

In the literature, different approaches have been used to predict solar radiation, for example: estimating global solar radiation from meteorological observations (Ångstrom 1924; Bristow and Campbell 1984; Cengiz et al. 1981; Hargreaves et al. 1985; Liu and Scott 2001; Mahmood and Hubbard 2002; Thornton and Running 1999; Kambezidis et al. 1997; Psiloglou and Kambezidis 2007), substitution of data from nearby stations (Trnka et al. 2005; Rivington et al. 2006), linear interpolation (Soltani et al. 2004), interpolation in neural networks (Elizondo et al. 1994; Reddy and Ranjan 2003), satellite-based methods (Pinker et al. 1995) and generation from stochastic weather models (Richardson et al. 1984; Hansen 1999). Most of these models lack detailed knowledge of the relevant geographical and meteorological parameters, and they underperform when used to model nonlinear systems. Overcoming these problems calls for more accurate modelling.

Recently, the artificial neural network (ANN) technique has received much attention as a computational approach that offers an alternative and complementary way of modelling, owing to its ability to cope with complex and ill-defined problems in many scientific fields. In meteorology, the modelling of solar radiation variables with ANN models has seen very promising development, improving on the performance of the existing statistical approaches (Lopez et al. 2000a, b). Several studies have been carried out in this context (Azadeh et al. 2009b; Elizondo et al. 1996; Al-Alawi and Al-Hinai 1998; Togrul and Onat 1999; Sozen et al. 2004, 2005; Yang and Koike 2002; Mohandes et al. 1998, 2000; Hontoria et al. 2001, 2002; Tasadduq et al. 2002; Tymvios et al. 2005; Kalogirou et al. 2002; Bosch et al. 2008; Mubiru and Banda 2008), although only a few have addressed the methodology for obtaining an optimal model for the estimation of solar radiation.

In solar radiation estimation, the most commonly used ANNs are the multi-layer perceptrons (MLP) that use back-propagation training.

This paper proposes an optimisation methodology for finding the best MLP network, covering almost all aspects of ANN modelling: the activation function, the training data, the training algorithm, pre- and post-processing, and the number and size of the hidden layers.

The main objective of this research is to develop an accurate model for predicting 5-min solar radiation data in the region of Ghardaïa, Algeria, based on the number of days, time, temperature and humidity. Furthermore, the novelty of our study compared with others is that it predicts four parameters at once: direct normal radiation, diffuse radiation (90°), global radiation (90°) and global radiation (30°).

2 Artificial neural networks

A neural network is a massively parallel distributed processor made up of simple processing units, known as nodes, that perform certain mathematical functions, usually nonlinear. This kind of non-algorithmic computation is characterised by a system that resembles the structure of the human brain. One of the great advantages of these models is their ability to learn (store experimental knowledge), generalise (make the knowledge available) or automatically extract rules from complex data (Haykin 1998).

The architecture of ANN consists of a number of units called nodes. These nodes are arranged in layers and are interconnected by weights and biases between the layers. Usually, three stages are considered in ANN applications, viz., (i) training, (ii) validation and (iii) testing.

Neural networks are characterised by their topology, weight vectors and activation functions. In this work, an MLP model with each layer consisting of a number of computing neurons has been used. A schematic representation of a feed-forward MLP neural network is shown in Fig. 1. In this network, all the information is transferred in the forward direction only and there is no loop or cycle in the network.

Fig. 1 Schematic diagram of a feed-forward MLP network (Bishop 1995)

The procedure of weight adjustment is called back-propagation. A simplified procedure for the learning process of ANNs is summarised below:

  1. Provide the network with training data consisting of input variables and target outputs.

  2. Evaluate the agreement of the network output with the target outputs.

  3. Adapt the connection weights between the neurons so the network produces better approximations of the desired target outputs.

  4. Continue the process of adjusting the weights until some desired level of accuracy is achieved.
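The four steps above can be illustrated with a deliberately small example. The following Python sketch is not the network used in this work: it assumes a single linear neuron trained by batch gradient descent on the mean squared error, and the toy data, learning rate, tolerance and function name are hypothetical choices made only for illustration.

```python
def train_linear_neuron(samples, learning_rate=0.1, tolerance=1e-6, max_epochs=10_000):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    mse, epoch = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        # Step 2: evaluate the agreement between network outputs and targets.
        errors = [(w * x + b) - y for x, y in samples]
        mse = sum(e * e for e in errors) / n
        # Step 4: stop once the desired level of accuracy is reached.
        if mse < tolerance:
            break
        # Step 3: adapt the weights in the direction that reduces the error.
        grad_w = 2.0 * sum(e * x for e, (x, _) in zip(errors, samples)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, mse, epoch


# Step 1: training data (hypothetical), generated from y = 2x + 1.
training_data = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0), (1.5, 4.0)]
w, b, mse, epochs = train_linear_neuron(training_data)
print(f"w = {w:.3f}, b = {b:.3f}, MSE = {mse:.2e}, epochs = {epochs}")
```

In a multi-layer network, Step 3 is where back-propagation comes in: the same kind of gradient is propagated backwards through the hidden layers instead of being computed for a single neuron.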

3 Study area and database

To train the neural network and apply the suggested methodology, we used the 5-min mean values of air temperature, relative humidity, and global, direct and diffuse solar radiation measured for 1 year (2007) by the Unit for Applied Research in Renewable Energy (URAER) in Ghardaïa. Ghardaïa is a city in north-central Algeria, in the Sahara desert (32°38 N, 3°78 E, 468 m above sea level) (Fig. 2). Geographically, it lies in a key region (the Sun Belt) and can play an important strategic role in the implementation of renewable energy technology in Algeria.

Fig. 2 Geographical location of Ghardaïa, Algeria

The climate of Ghardaïa is classified as semi-arid, with minimum and maximum air temperatures ranging from 14 to 47 °C in summer months and from 2 to 37 °C in winter months. The daily global solar radiation measured on the horizontal surface varies between 2.185 and 27.266 MJ/m², with an annual daily mean of about 20.361 MJ/m².

Figure 3 shows the high-precision radiometers, installed on the rooftop of URAER, that record the solar radiation data every 5 min. Air temperature and relative humidity are measured with a thermo-hygrograph.

Fig. 3 Radiometric station. (1) A shading ball for diffuse irradiance measurements. (2) Pyranometer for diffuse irradiance measurements. (3) Pyranometer for horizontal global solar irradiance measurements. (4) Pyranometer for tilted global solar irradiance measurements. (5) Pyrheliometer for direct irradiance measurements

Table 1 lists the technical specifications of these instruments.

Table 1 Technical specifications of used instruments

A quality control procedure was applied to the recorded data:

  • Overcast days were rejected because of the difficulty of checking the pyrheliometer data.

  • Time steps with missing data, data that clearly violate physical limits, or extreme values were omitted.

The database was divided into two different sets:

  • Training and validation set: this data set constitutes the major part of the data, because the connection weights of the neurons are adjusted from it during training, which is how the network acquires its knowledge. The validation part is used to verify the generalisation ability of the network. The training and validation set covers 66 % of the database.

  • Test set: this data set is used to evaluate the ANN performance in real situations. It is formed by the remainder of the database, i.e., 33 %.

4 Building the neural network and optimisation methodology

There is no method to predetermine the best combination of neurons and layers, as this depends on the specific model, the physical process and the simulation data. The literature offers some empirical relationships for this problem, but the best approach so far is for the researcher to build several models and choose the one best suited to the particular application.

In this part, we list all the important steps that led us to the optimised neural network. The following parameters are investigated and optimised during the development of the best network for the prediction of solar radiation: input and output data selection, possible transfer functions, training mode, stopping criteria, training algorithm, normalisation technique, number of hidden layers, number of hidden nodes and performance evaluation measures.

4.1 Selection of input and output parameters

The performance of the final model is heavily dependent on the input variables used to develop the model. The selection of the best and appropriate set of input variables is a vital step and essential to being able to model the system under consideration reliably and to improve computational efficiency. However, the input selection is a difficult task since real systems are generally complex and mostly associated with nonlinear processes. Consequently, the dependencies between output and input variables, as well as conditional dependencies between variables, are difficult to measure.

Given the limited data available and the lack of a large database with a variety of inputs and outputs, we decided to keep the inputs and outputs fixed throughout the study. Air temperature, relative humidity, number of days and local time are the input parameters. They are used to simulate the output parameters, which are direct normal radiation, diffuse radiation (90°), global radiation (90°) and global radiation (30°).

Thirty degrees is the optimal inclination for maximum annual global irradiation, and ninety degrees corresponds to the south-facing vertical plane.

The structure of the neural network is shown in Fig. 4.

Fig. 4 ANN architecture used for estimation in this present work

4.1.1 Sensitivity analysis

To investigate the impact of the selected input parameters on the predicted outputs, a sensitivity study was performed. The model chosen as an application example has 4 inputs, 4 outputs and 10 neurons in its hidden layer; it uses the Min–Max method as the normalisation technique and ‘trainbr’ as the training algorithm. Once the network has been trained and optimised, its weight matrix is generated (Table 2).

Table 2 Statistical parameters obtained for the example ANN model (4-10-4)

This weight matrix is then used in the formula (Eq. (1)) proposed by Garson (1991) to assess the relative importance of the input variables.

$$ I_j=\frac{\displaystyle \sum_{m=1}^{N_h}\left(\left(\frac{\left|W_{jm}^{ih}\right|}{\sum_{K=1}^{N_i}\left|W_{Km}^{ih}\right|}\right)\times \left|W_{mn}^{ho}\right|\right)}{\displaystyle \sum_{k=1}^{N_i}\left\{\sum_{m=1}^{N_h}\left(\left(\frac{\left|W_{km}^{ih}\right|}{\sum_{K=1}^{N_i}\left|W_{Km}^{ih}\right|}\right)\times \left|W_{mn}^{ho}\right|\right)\right\}} $$
(1)

where I_j is the relative importance of the jth input variable on the output variable, N_i and N_h are the numbers of input and hidden neurons, respectively, the W’s are connection weights, the superscripts ‘i’, ‘h’ and ‘o’ refer to the input, hidden and output layers, respectively, and the subscripts ‘k’ (or ‘K’), ‘m’ and ‘n’ refer to input, hidden and output neurons, respectively. The numerator in Eq. (1) is the sum of the absolute products of weights for each input, whereas the denominator represents the sum of all the weights feeding into the hidden units, taken in absolute value (El Hamzaoui et al. 2011). A summary of the results is given in Table 3: all input variables have a considerable influence on the estimation of solar radiation, with a slight advantage for the air temperature and the local time.
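To make Eq. (1) concrete, the following Python sketch evaluates Garson's relative importance for a single output neuron. It is only an illustration: the function name and the tiny 2-input, 2-hidden-neuron weight values are hypothetical and are not the weights of Table 2.

```python
def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance of each input for one output neuron, following Eq. (1).

    w_input_hidden[k][m]: weight from input neuron k to hidden neuron m.
    w_hidden_output[m]:   weight from hidden neuron m to the chosen output neuron n.
    """
    n_inputs = len(w_input_hidden)
    n_hidden = len(w_hidden_output)
    # Sum of absolute input weights feeding each hidden neuron (inner sum over K).
    column_sums = [
        sum(abs(w_input_hidden[k][m]) for k in range(n_inputs))
        for m in range(n_hidden)
    ]
    # Numerator of Eq. (1), computed for every input.
    contributions = [
        sum(
            abs(w_input_hidden[k][m]) / column_sums[m] * abs(w_hidden_output[m])
            for m in range(n_hidden)
        )
        for k in range(n_inputs)
    ]
    # Denominator of Eq. (1): the same quantity summed over all inputs.
    total = sum(contributions)
    return [c / total for c in contributions]


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
w_ih = [[1.0, -3.0],
        [3.0, 1.0]]
w_ho = [1.0, 3.0]
importance = garson_importance(w_ih, w_ho)
print([round(100 * value, 1) for value in importance])  # percentages summing to 100
```

For a network with several outputs, such as the 4-10-4 example model, the function would be called once per output neuron with the corresponding hidden-to-output weights.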

Table 3 Relative importance of input variables on the outputs

4.2 Selection of the ANN structure

The selected ANN structure is a multi-layer feed-forward back-propagation network, consisting of an input layer, an output layer and usually one or more hidden layers. A detailed description of this network may be found in Haykin (1994) and Rumelhart et al. (1986).

4.2.1 Transfer function

The main difference between the various network types lies in the activation function used by the hidden neurons. In MLPs, the hidden neurons commonly use the logistic sigmoid function shown in Eq. (2), and the output neurons use the purelin (linear) function shown in Eq. (3) (Rafiq et al. 2001; Vogl et al. 1988; Hagan et al. 1996; MathWorks 2004):

$$ f(w)=\frac{1}{1+{e}^{-w}} $$
(2)

where w is the weighted sum of the inputs.

$$ f(z)=z $$
(3)

where z is the input to the output layer.
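As a small illustration of Eqs. (2) and (3), the sketch below implements both transfer functions in Python and chains them in a forward pass through one hidden layer. The function names `logsig`, `purelin` and `forward`, and the example weights, are assumptions made for this illustration only.

```python
import math


def logsig(w):
    """Logistic sigmoid of Eq. (2), written to avoid overflow for large |w|."""
    if w >= 0:
        return 1.0 / (1.0 + math.exp(-w))
    e = math.exp(w)
    return e / (1.0 + e)


def purelin(z):
    """Linear transfer function of Eq. (3)."""
    return z


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """One forward pass: sigmoid hidden layer followed by a linear output layer."""
    hidden = [
        logsig(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    return [
        purelin(sum(w * h for w, h in zip(row, hidden)) + bias)
        for row, bias in zip(output_weights, output_biases)
    ]


# Hypothetical 2-2-1 network with zero hidden weights, so both hidden outputs are 0.5.
outputs = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [[1.0, 1.0]], [0.5])
print(outputs)
```

The sigmoid bounds the hidden activations to (0, 1), while the linear output layer lets the network produce values over the full range of the targets.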

4.2.2 Effect of learning rate

An important parameter in the back-propagation algorithm is the learning rate, which moderates the changes made to the weights at the end of each epoch. The learning rate coefficient determines the size of the weight adjustments made at each iteration and hence influences the rate of convergence. A poor choice of this coefficient can prevent convergence.

For best results, the coefficient should be kept constant through all the iterations. If it is too large, the search path will oscillate and converge more slowly than a direct descent. If it is too small, the descent will progress in small steps, significantly increasing the time to convergence. In the present work, a learning rate of 0.1 is selected.

4.2.3 Number of epochs, goal and momentum

A very significant parameter in ANN work is the number of training epochs. Each epoch comprises all the calculations made by the network to produce its response at the output neurons. The training time was therefore determined individually for each set of input data. The goal determines the desired accuracy of the output result. Adding some inertia, or momentum, to the gradient expression is another way to improve network performance: a fraction of the previous weight change is added to the current weight change. Such a term helps to smooth the descent path by preventing extreme changes in the gradients due to local anomalies. The values of these parameters assumed in this study are as follows:

  • The number of epochs is 1000.

  • The goal is 0.1.

  • The momentum is 0.5.

All these values can be readjusted during real-time operation to improve the prediction performance.
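The role of the learning rate, momentum, goal and epoch limit can be sketched on a one-parameter toy problem. The Python example below uses the classical momentum update (new change = −learning rate × gradient + momentum × previous change) with the values quoted above; the quadratic error function is hypothetical, and toolbox implementations may define the momentum update slightly differently.

```python
def momentum_step(weight, gradient, previous_change, learning_rate=0.1, momentum=0.5):
    """One weight update: gradient step plus a fraction of the previous change."""
    change = -learning_rate * gradient + momentum * previous_change
    return weight + change, change


# Hypothetical error surface E(w) = (w - 3)^2 with gradient 2 * (w - 3).
MAX_EPOCHS = 1000  # number of epochs used in this study
GOAL = 0.1         # error goal used in this study

weight, change = 0.0, 0.0
for epoch in range(1, MAX_EPOCHS + 1):
    error = (weight - 3.0) ** 2
    if error <= GOAL:
        break  # stop as soon as the goal is met
    weight, change = momentum_step(weight, 2.0 * (weight - 3.0), change)

print(f"stopped at epoch {epoch} with weight {weight:.4f} and error {error:.6f}")
```

Training therefore ends on whichever criterion is met first: the error goal or the maximum number of epochs.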

4.2.4 Training mode

There are two different modes for training ANNs: batch mode and pattern mode. In a batch mode, when an epoch is completed, a single average error is calculated and the network weights are adjusted according to the error. In a pattern mode, the error is calculated after each pattern is presented to the network, and then the network weights are adjusted. Choosing between the two modes is generally problem-specific. Swingler (1996) has indicated that the following points should be considered before making the choice:

  • The batch mode requires fewer weight updates and is hence faster to train.

  • The batch mode provides a more accurate measurement of the required weight changes.

  • The batch mode is more likely than the pattern mode to become trapped in local optima.

It would be advisable to train the network using batch mode to start with and test and analyse the network output. If the level of the error after testing the network with unseen data, i.e., data that was not used in training, is not satisfactory then a pattern mode should be used.

4.2.5 Performance evaluation measures

In this article, we present only the mean of the results obtained after about 20 trials with the same input–output data for each model. An example of calculating the mean is illustrated in Table 4 (the example model uses ‘trainbr’ as the training algorithm, has 10 neurons in its hidden layer and uses the Min–Max method as the normalisation technique).

Table 4 Example of ANN performance with 20 trials

Among the various statistical measures for assessing the prediction performance of models, the mean absolute percentage error (MAPE) proposed by Lewis (1982) is used to compare the final performance of the different networks, as it is one of the most stringent criteria owing to its relative values. The mean squared error (MSE) is used as the criterion for the minimisation algorithms during the optimisation of the network.

The MAPE and the MSE are defined as follows:

$$ \mathrm{MAPE}\left(\%\right)=\frac{1}{n}\sum_{n}\left|\frac{\mathrm{experimental}\ \mathrm{value}-\mathrm{predicted}\ \mathrm{value}}{\mathrm{experimental}\ \mathrm{value}}\right|\times 100 $$
(4)

$$ \mathrm{MSE}=\frac{1}{n}\sum_{n}{\left(\mathrm{experimental}\ \mathrm{value}-\mathrm{predicted}\ \mathrm{value}\right)}^2 $$
(5)

where n is the number of data points.
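Equations (4) and (5) translate directly into code. The following Python sketch is a minimal illustration; the function names and the three sample values are hypothetical and do not come from the measured database.

```python
def mape(experimental, predicted):
    """Mean absolute percentage error of Eq. (4), in percent."""
    n = len(experimental)
    return 100.0 / n * sum(abs((e - p) / e) for e, p in zip(experimental, predicted))


def mse(experimental, predicted):
    """Mean squared error of Eq. (5)."""
    n = len(experimental)
    return sum((e - p) ** 2 for e, p in zip(experimental, predicted)) / n


# Hypothetical irradiance values (W/m^2), for illustration only.
measured = [100.0, 200.0, 400.0]
estimated = [110.0, 190.0, 400.0]
print(f"MAPE = {mape(measured, estimated):.2f} %")
print(f"MSE  = {mse(measured, estimated):.2f}")
```

Because Eq. (4) divides by the experimental value, the MAPE is undefined for samples whose measured value is zero, so such samples have to be excluded before it is evaluated.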

4.2.6 Training algorithm

An algorithm is a procedure for solving a problem, in which a list of well-defined instructions for completing a task will proceed through a well-defined series of successive states, eventually terminating in an end-state.

Selecting an appropriate ANN training algorithm has always been a difficult task. It is as important as the network architecture and geometry.

We performed thorough experimental tests to determine the best training algorithm. Altogether, 13 different learning algorithms for MLP networks available in Matlab were used.

Only one hidden layer was adopted, with the number of neurons varied from 1 to 30 for each tested algorithm. As indicated previously, 20 trials were carried out for each new network configuration because the responses of the networks were not stable: the performance of the same network varied from one training session to another.

The detailed comparison of the performance of the MLP neural networks trained by the different algorithms mentioned above is shown in Fig. 5, where MAPE is plotted against the number of neurons in the hidden layer.

Fig. 5 Performance of MLP networks in terms of MAPE with various training algorithms

Figure 5 shows that the performance results are consistent, with fairly minor changes as the number of neurons increases. However, among all algorithms, trainlm, trainrp and trainbr outperform the others in terms of error reduction. To show the performance of these three algorithms more clearly, Fig. 6 focuses on them only.

Fig. 6 As in Fig. 5, but for the best three performing algorithms

The range of variation of the MAPE values for all algorithms is shown in Table 5.

Table 5 Range of MAPE (%) with different training algorithms

Overall, these results are not good enough, so no accurate estimations can be expected. The networks therefore need to improve in prediction quality and to reduce their error values. According to previous studies, data normalisation is considered necessary for neural network training; training can be made more efficient by applying certain pre-processing steps to the network inputs and targets.

4.2.7 Normalisation technique

The data normalisation refers to the analysis and transformation of the input and output variables in order to minimise noise, highlight important relationships, detect trends and flatten the distribution of the variable to assist the neural network in learning the relevant patterns.

ANN-based forecasting models require consistent treatment of the data to guarantee reasonably good performance and effective application. In general, normalisation ensures that all variables used as model inputs have equal importance during training; the normalisation should therefore scale the data between the lower and upper limits of the activation function. Finally, the outputs of the neural network are denormalised before being presented.

To compare the performance of the ANN with and without data normalisation (Fig. 7), three pre-processing techniques were applied to the three best networks from the previous stage. These techniques are:

  • Minimum and maximum method (MMM);

  • Mean and standard deviation method (MSDM);

  • Mean and standard deviation method (MSDM) with the input variables scaled to [0.2, 0.8].

Fig. 7 Performance of networks in terms of MAPE with all the pre-processing methods
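The three pre-processing techniques listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target range [0, 1] for the MMM, the use of the sample standard deviation for the MSDM, and the interpretation of the third technique as standardisation followed by a linear rescaling to [0.2, 0.8] are our assumptions, and the temperature values are hypothetical.

```python
import statistics


def minmax_scale(values, lo=0.0, hi=1.0):
    """MMM: map values linearly so that min -> lo and max -> hi."""
    vmin, vmax = min(values), max(values)
    scaled = [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]
    return scaled, (vmin, vmax)


def minmax_restore(scaled, limits, lo=0.0, hi=1.0):
    """Denormalise MMM-scaled values back to physical units."""
    vmin, vmax = limits
    return [vmin + (s - lo) * (vmax - vmin) / (hi - lo) for s in scaled]


def standardise(values):
    """MSDM: subtract the mean and divide by the (sample) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values], (mean, std)


# Hypothetical air temperatures in degrees Celsius.
temperatures = [2.0, 14.0, 37.0, 47.0]

mmm, limits = minmax_scale(temperatures)
msdm, stats = standardise(temperatures)
# Assumed reading of the third technique: standardise, then rescale to [0.2, 0.8].
msdm_scaled, _ = minmax_scale(msdm, lo=0.2, hi=0.8)
restored = minmax_restore(mmm, limits)

print([round(v, 3) for v in mmm])
print([round(v, 3) for v in msdm])
print([round(v, 3) for v in msdm_scaled])
```

The scaling parameters (`limits`, `stats`) must be computed on the training data and reused unchanged to denormalise the network outputs and to normalise any new inputs.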

A coding system (Table 6) is then created to facilitate the separation of the different networks in terms of the pre-processing methods.

Table 6 Network code description

It must be mentioned that the networks preserve their structures and their parameters that were used in the previous stage.

The results show that normalisation makes the networks more stable and reduces their error compared with networks trained without normalisation. Moreover, the networks using MMM as the pre-processing technique performed much better than the others in terms of consistency and achieved the lowest MAPE.

Figure 8 compares the three networks when the MMM is applied. The lowest error value is achieved by Net1_1, which uses the Bayesian regularisation algorithm (trainbr). Its response was also very consistent. This network was therefore chosen for the trials in the following stages.

Fig. 8 Comparison of the error performance between the three networks when using the MMM

At the end of this stage, we can say that the three networks responded very well to the introduction of the pre-processed data, giving excellent performance and fairly accurate predictions. In addition, the MAPE decreases as the number of hidden nodes increases, which indicates a relationship between network performance and the number of hidden nodes. This relationship is examined in the next section with the aim of reducing the network error further.

4.2.8 Network architecture (number of hidden layers and number of neurons)

It is known from the literature that the number of hidden nodes should lie within an optimum range. A value above this range over-specifies the relation between the input layer and the hidden layer, leading to overfitting of the model.

The previous section showed that a single hidden layer with 1 to 30 nodes provides favourable network performance. The lowest error is achieved by the network with 30 neurons, with a MAPE of 4.51 %, a root mean square error (RMSE) of around 25.70 % and a correlation coefficient above 0.969.

We therefore decided to investigate the performance of networks with two hidden layers. It is known that systems with three or more hidden layers cause unnecessary computational overload. To find an optimum number of nodes in the two hidden layers, training was carried out with 1 to 30 hidden nodes in each layer. There is often no better solution than to proceed by successive trials to test the architecture of the network. As a starting point, Net1_1 is used with all possible combinations, i.e., 900.
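The exhaustive search over the 900 two-layer architectures can be organised as in the Python sketch below. It only illustrates the bookkeeping: `grid_search` and `surrogate_mape` are hypothetical names, and the surrogate is a purely synthetic stand-in for the real evaluation, which would train the network for a given pair of layer sizes (about 20 times) and return the mean MAPE.

```python
from itertools import product


def grid_search(evaluate, sizes=range(1, 31)):
    """Evaluate every (first layer, second layer) size pair and rank by error."""
    results = [((n1, n2), evaluate(n1, n2)) for n1, n2 in product(sizes, repeat=2)]
    results.sort(key=lambda item: item[1])  # lowest error first
    return results


def surrogate_mape(n1, n2):
    """Synthetic placeholder, NOT a trained network: error falls as layers grow."""
    return 1.0 + 10.0 / n1 + 10.0 / n2


ranking = grid_search(surrogate_mape)
print(len(ranking), "architectures evaluated")
for (n1, n2), error in ranking[:10]:
    print(f"4-{n1}-{n2}-4: {error:.2f}")
```

Keeping the full ranking, rather than only the best entry, makes it straightforward to report the ten best architectures and to group the remaining ones by error range.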

Figure 9 shows the responses corresponding to all those combinations in three dimensions. Each point in the graph gives the error performance of one combination. Because of the large number of points, they were divided into groups, each belonging to a specific MAPE range, and the optimum point is distinguished from the rest. In fact, there are several near-optimal combinations, so this is truly a multi-objective optimisation problem, and the search should target a set of combinations that gives a superior output. Table 7 summarises the error performance and architecture of the ten best networks.

Fig. 9 Performance of networks in terms of MAPE with two hidden layers

Table 7 Error performance and architecture of the 10 best networks

Figure 9 shows that the MAPE values decrease as the number of hidden nodes increases. Furthermore, a single node in either the first or the second hidden layer gives undesirable results.

In comparison, the performance of a network with two hidden layers is considerably better than with one hidden layer; typically, the network error was 5 times lower on average. The 4-30-30-4 architecture, i.e., 30 hidden nodes in both hidden layers, shows the best overall predictability, with MAPE = 1.17, mean bias error (MBE) = 0.12 and RMSE = 14.06 %. This network is therefore retained for similar tasks in the future.

Figure 10 shows the performance curve produced while training the network, measured in terms of MSE. The curve drops steadily towards the best MSE attained. After 100 iterations, the performance no longer improves and settles at 0.0042251.

Fig. 10 Network’s performance of optimum model

The performance of a trained network can be measured to some extent by the errors on the training, validation and test sets, but it is often useful to investigate the network response in more detail. One option is to perform a regression analysis, which measures how well the variation in the outputs is explained by the targets. The R values obtained for the four estimated outputs are 97.28, 97.56, 99.28 and 99.29 %, as shown in Figs. 11, 12, 13 and 14. The linear fit to the output-target relationship is close to the 1:1 line (output = target), which is a good sign that the network has learned accurately.

Fig. 11 Regression analysis plot for the optimum model between output and target of direct normal radiation

Fig. 12 Regression analysis plot for the optimum model between output and target of diffuse radiation (90°)

Fig. 13 Regression analysis plot for the optimum model between output and target of global radiation (90°)

Fig. 14 Regression analysis plot for the optimum model between output and target of global radiation (30°)

5 Comparison with other models

To evaluate the significance of the results obtained, we compared them with similar studies, especially with models that use the same inputs as ours or most of them. In most of these models, the goal is to determine the global solar radiation on horizontal planes. Rehman and Mohandes (2009) developed an ANN model using three combinations of input parameters (day, maximum air temperature, mean air temperature and relative humidity) to estimate global solar radiation for Abha city in Saudi Arabia. Al-Alawi and Al-Hinai (1998) used location, month, mean pressure, mean temperature, mean vapour pressure, mean relative humidity, mean wind speed and mean sunshine hours as input variables to a multi-layer feed-forward network for predicting global radiation at Seeb locations. Lazzús et al. (2011) applied an ANN model to the estimation of hourly global solar radiation in La Serena (Chile); the inputs of the model are wind speed, relative humidity, air temperature and soil temperature. Linares-Rodríguez et al. (2011) estimated daily global solar radiation in Spain with an ANN model using latitude, longitude, day of the year, daily clear sky global radiation, total cloud cover, skin temperature, total column water vapour and total column ozone as inputs. Elminir et al. (2007) built an ANN model to predict diffuse radiation in Egypt, with global solar radiation, long-wave atmospheric emission, air temperature, relative humidity and atmospheric pressure as inputs. Elminir et al. (2005) also proposed a multi-layer feed-forward network for predicting infrared, ultraviolet and global solar radiation at Helwan city, using wind direction, wind speed, ambient temperature and relative humidity as input parameters. Jiang (2008) developed an ANN model for estimating monthly mean daily diffuse solar radiation; the input data to the network are the monthly mean daily clearness index and the sunshine percentage. Mubiru and Banda (2008) developed a feed-forward back-propagation ANN (FFBPANN) architecture to estimate monthly average daily global solar radiation for Uganda locations; the network uses the annual averages of sunshine hours, cloud cover, relative humidity and rainfall, together with latitude, longitude and altitude, as inputs. Benghanem et al. (2009) used different combinations of inputs (air temperature, relative humidity, sunshine duration and the day of the year) to estimate solar radiation in Al-Madinah (Saudi Arabia) with an ANN model. Fadare (2009) developed ANN models with different architectures for predicting the solar energy potential over 195 cities in Nigeria; the inputs for the network are latitude, longitude, altitude, month, mean sunshine duration, mean temperature and relative humidity. Azadeh et al. (2009a) proposed an ANN approach for predicting global solar radiation using input variables such as the mean values of maximum temperature, minimum temperature, relative humidity, vapour pressure, wind speed, duration of sunshine and total precipitation. The results obtained from these models and our own results are collected in Table 8.

Table 8 Results of our model versus various similar models

6 Conclusions

In this paper, a methodology for choosing an ANN model to predict solar radiation was presented. This type of study has received little serious attention in the literature, and the objective of this paper was to bridge this gap. The methodology starts with an extensive search in order to select the model with minimum complexity and optimal performance, together with the choice of its parameters (inputs and outputs, activation functions, training algorithm, normalisation technique, hidden layers and hidden nodes).

This work dealt only with the optimisation of MLP back-propagation networks. However, the proposed method is general enough to be used with other connectionist models or different training algorithms. The results confirm the hypothesis that MLP neural networks are capable of estimating solar radiation when meteorological parameters are used as inputs.

During the training process, several neural network configurations were studied. We started with 13 training algorithms and observed the effect of each on the network performance; we then chose the top three and attempted to improve their performance by using two normalisation techniques. Among the six networks obtained, we selected the best predicting network, which was then also tested with two hidden layers and several numbers of hidden nodes. Two hidden layers with 30 neurons in each layer were found to provide a better prediction.

The mean MAPE, MBE and RMSE values are found to be 1.17, 0.12 and 14.06 %, respectively, for the optimum model, with an accuracy of at least 97.28 % for the estimation of direct and diffuse radiation and of up to 99.29 % for the global radiation, which shows that the estimated values are in good agreement with the actual values. Comparing these results with other studies in the same context, we can say that our model performs better than many other models.