1 Introduction

Electricity is a fundamental consumption good for economic development and the well-being of nations [1]. It is therefore crucial for a country to guarantee permanent and continuous access to this good. Predicting electricity demand is a major concern for energy producers and plays an important role in identifying optimal operating strategies and planning the required power systems over the medium and long term. Short-term demand forecasts are also of great interest to electricity suppliers, since they help to guarantee the supply/demand equilibrium. Indeed, electricity is one of the very few economic goods that cannot be stored. A suitable forecasting method is one that lowers costs and improves the dispatching of electric power production. In addition, understanding the behavior of electricity demand is imperative for electricity producers, since it allows them to maintain continuous, reliable and secure access to electric power. Forecast errors may be responsible for significant operating costs. In this regard, Bunn and Farmer [2] conclude that a 1% increase in the forecast error led to an increase of 10 million pounds in annual operating costs in the United Kingdom in 1984. The electricity crisis that occurred in California in summer 2000 is another good example, since it led to a blackout because supply fell short of demand. The existing literature has proposed several techniques to forecast electricity demand. Some of them are statistical (such as econometric and time series models), while others are computational (such as neural networks and fuzzy logic) [3].

This paper contributes to the related literature by implementing an artificial neural network (ANN) to forecast half-hour-ahead electricity demand in Tunisia. Compared to prior studies on the subject, it offers three novelties. First, it is based on a rich and unique database provided by the Tunisian Company of Electricity and Gas, consisting of half-hourly electric load data covering the period from 2000 to 2008. Tunisia offers a particularly interesting context for discussing the modeling and forecasting of electricity demand in developing countries for at least two reasons. On the one hand, electricity demand has been consistently increasing in Tunisia over recent decades. For instance, electricity demand climbed by 6.8% per year between 2000 and 2010. It represents 85% of the total installed capacity, against 68% in 2000. This sharp rise in electricity demand is a direct consequence of industrialization and changes in household habits, especially the use of air conditioning. To avoid unexpected blackouts, the Tunisian Company of Electricity and Gas signed agreements with the Algerian and Moroccan electricity companies to acquire electricity for use during periods of peak demand in summer 2018. On the other hand, no prior studies have focused on forecasting electricity demand in Tunisia. Currently, the Tunisian Company of Electricity and Gas relies on a traditional method to forecast electricity demand over the next 24 h, namely the ‘similar days’ method. This method forecasts the daily demand for day j from the demand on day j − 1 for Tuesdays to Fridays and from the demand on day j − 3 for Mondays. The prediction of electricity demand during the weekend is based on the past weekend’s demand, with slight corrections for weather conditions, holidays and exceptional events. Given the limitations of such a method, it is crucial to propose a more accurate forecasting tool that will help the company predict the optimum production and therefore avoid cuts during demand peaks. Second, contrary to statistical methods, which impose restrictive assumptions, the artificial neural network model proposed in this paper imposes no such conditions and instead relies on an intelligent training process. Furthermore, the optimal structure of the model (the number of hidden layers and the number of neurons in each layer) is determined by a specific algorithm, namely the pattern search optimization algorithm, rather than by a trial-and-error process. Third, to check the robustness and accuracy of the obtained forecasts, we compare the performance of three training algorithms used in the literature, namely the Levenberg–Marquardt algorithm, the resilient back-propagation algorithm, and the conjugate gradient algorithm.
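As an illustration of the ‘similar days’ rule described above, the choice of reference day can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name is hypothetical, the 7-day offset for weekends is our reading of "the past weekend’s demand", and the corrections for weather, holidays and exceptional events are left out.

```python
from datetime import date, timedelta


def similar_day(target: date) -> date:
    """Return the reference day whose load curve would be reused for `target`."""
    weekday = target.weekday()  # Monday = 0, ..., Sunday = 6
    if weekday == 0:
        # Monday: use the demand of day j - 3 (the previous Friday).
        return target - timedelta(days=3)
    if 1 <= weekday <= 4:
        # Tuesday to Friday: use the demand of day j - 1.
        return target - timedelta(days=1)
    # Saturday or Sunday: ASSUMPTION, same day of the previous weekend (j - 7).
    return target - timedelta(days=7)


# Example: Monday 7 January 2008 is forecast from Friday 4 January 2008.
reference = similar_day(date(2008, 1, 7))
```

The rule uses no information beyond a single past day, which is the limitation that motivates a model with richer inputs.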

The remainder of this paper is organized as follows. Sect. 2 gives a brief presentation of ANNs and their use in forecasting electricity demand, while Sect. 3 describes the model and data used in the study. Test results and after-the-fact error analysis are presented in Sect. 4. Finally, Sect. 5 concludes the paper.

2 A brief review of electricity demand forecasting techniques

According to Hong [4], three categories of methods have been developed and used for modeling and forecasting energy demand, namely traditional methods (autoregressive integrated moving average model, seasonal autoregressive integrated moving average model, exponential smoothing models, linear regression model, etc.), artificial intelligence approaches (knowledge-based expert system model, artificial neural network model and fuzzy inference system model) and the support vector machine. Ghalehkhondabi et al. [3] provide an excellent survey of energy demand forecasting methods employed in research articles published between 2005 and 2015. The authors conclude that neural networks are the most widely used method for forecasting energy demand.

Previous studies suggest that the demand for electricity depends on many factors, such as population size, economic structure, the season, the month, the day, the time of day and climate change [5,6,7]. Moreover, social factors, such as national events and holidays, may explain demand behavior in some cases. As mentioned above, the empirical literature has provided several methods for modeling and forecasting the demand for electricity. For instance, Hussain et al. [8] employ the ARIMA model to forecast the demand for electricity in Pakistan. Taylor [9] instead uses the double seasonal Holt-Winters exponential smoothing method to forecast electricity demand in England and Wales. Taking the case of the Nova Scotia Power Corporation, Mbamalu and El-Hawary [10] use iteratively reweighted least squares to estimate the parameters of a multiplicative autoregressive model incorporating seasonal factors. Finally, Clements et al. [11] rely on a multiple equation time series approach, estimated by ordinary least squares, to forecast the day-ahead electricity load in the Queensland region of Australia. The use of these so-called traditional methods, such as ARIMA and regression models, for forecasting purposes depends on the validity of some crucial time series assumptions, particularly the linear behavior of the data. However, electric load time series typically exhibit several seasonal features at different time frequencies (daily, weekly, monthly or annual), complex schedule effects and a nonlinear dependence on meteorological variables [12]. The relative empirical failure of conventional statistical methods in forecasting nonlinear time series has led researchers to make use of artificial intelligence methods, particularly the artificial neural network [13]. Indeed, such methods are considered powerful tools for modeling and predicting nonlinear time series.

A neural network is a system composed of interconnected neurons arranged in layers. Artificial neural networks are input–output models whose design is schematically inspired by the functioning of biological neurons. In these models, the direction in which information is transferred through the network is defined by the nature of the connections, which may be direct or recurrent. According to the path followed by the information, neural networks can be classified into two main groups. In ‘feed-forward’ networks, information flows from inputs to outputs without turning back. ‘Feedback’ networks, on the other hand, have a cyclical topology: following the direction of the connections, it is possible to find at least one path that returns to its starting point. In neural network models, the nature of the relationship between the dependent and independent variables is not determined a priori. The optimal structure of the model results from a training process, which reflects the capacity of the artificial neural network to learn from its environment in order to improve its performance. Training is an iterative adjustment process applied to the synaptic weights and thresholds. After each training iteration, the network becomes more informed about its environment, which improves the forecasting performance.

The network structure is composed of three types of layers: an input layer (independent variables), hidden layers (unobservable units or nodes) and an output layer (dependent variables). The outputs of the network may be linear or nonlinear functions of the inputs. The input vector is usually defined according to the degree of knowledge and experience, through assessment criteria of the technology and the size of the artificial neural network. The following three elements characterize an artificial neuron:

  • The connections, also known as synapses, receive input signals and transmit the output signal. Each connection is characterized by a synaptic weight, so that the signal transmitted by a source neuron is multiplied by the weight associated with the connection before being received by the destination neuron.

  • The adder: the status of the neuron is based on an adder that sums the input signals weighted by their synaptic weights.

  • The activation function: the status of the neuron is calculated from an activation function applied to the weighted sum of the input signals. It serves to introduce nonlinearities into the neuron’s operation. The form of this function depends on the nature of the problem studied.
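The three elements listed above can be put together in a short Python sketch of a single artificial neuron. The function name, the example values and the choice of the hyperbolic tangent as activation function are assumptions made for illustration; the text does not state which activation function is used in the study.

```python
import math


def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Compute the output of one artificial neuron.

    Each input signal is multiplied by its synaptic weight, the adder sums
    the weighted signals together with a threshold (bias), and the
    activation function introduces the nonlinearity.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


# Illustrative values only: three input signals and their synaptic weights.
signals = [0.5, -1.0, 2.0]
synaptic_weights = [0.4, 0.3, -0.1]
output = neuron_output(signals, synaptic_weights, bias=0.1)
```

With these values the weighted sum is −0.2, so the neuron returns tanh(−0.2). Passing a different `activation` (for example the identity `lambda s: s`) gives a linear neuron.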

Neural networks were initially used in fields such as biology, physics and industry. They were later employed in statistics, particularly for the prediction of financial and economic variables such as stock market indices [14, 15], exchange rates [16, 17], oil prices [18] and natural gas demand [19]. Recently, several studies have shown the accuracy of ANNs for the prediction of electric power demand [20,21,22,23]. As mentioned by Panapakidis [24], artificial neural networks perform well in forecasting electric power consumption, especially for datasets characterized by a nonlinear relationship between inputs and outputs. Gonzalez-Romera et al. [25] use an artificial neural network to forecast monthly electricity consumption in Spain. Their results show that the artificial neural network performs well and outperforms the ARIMA model. Bakirtzis et al. [26] rely on an artificial neural network model to forecast short-term load demand in Greece. Three types of variables are used as inputs, namely previous loads, the season and the temperature. Park et al. [27] propose a three-layer back-propagation neural network to solve the daily forecasting problem and show that it provides more accurate results than regression models. Darbellay et al. [28] implement an artificial neural network to predict electricity demand in the Czech Republic. The forecasting performance metrics indicate that the proposed model performs better than an ARIMA model. Khotanzad et al. [29] present an approach based on two types of neural networks, one predicting the base load and the other predicting the variation of the load. The final forecast is a combination of the two. Hippert et al. [30] compare the performance of short-term electricity demand forecasting models by studying the case of Rio de Janeiro in Brazil. The authors show the superiority of artificial neural networks over traditional forecasting techniques such as exponential smoothing and regression models.

3 The data and forecasting method

3.1 Data gathering

The output of the ANN is the load forecast for a given half-hour (t) during the day. Inputs to the ANN may be classified into three categories. First, we consider the 48 previous half-hourly loads (t − i, where i = 1, 2, …, 48). Second, in line with many previous studies, such as [5,6,7], we gathered data on climatic conditions, namely the minimum and maximum daily temperatures (Footnote 1). Finally, we use the calendar to obtain information on the type of day, the type of week, the type of month and the type of year. Data on half-hourly electricity demand are provided by the Tunisian Company of Electricity and Gas. The study covers the period 2000–2008 (9 years). Table 1 presents definitions and characteristics of the three categories of variables used in the study.
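To make the first category of inputs concrete, the following Python sketch pairs each half-hourly load with the 48 loads that precede it. It is a minimal illustration, not the authors’ data pipeline: the function name is hypothetical, the synthetic series stands in for the real load data, and the calendar and temperature inputs are omitted.

```python
def lagged_samples(loads, n_lags=48):
    """Pair each load value with the n_lags values that precede it.

    Returns a list of (previous_loads, target) tuples, where previous_loads
    holds the loads at t - n_lags, ..., t - 1 and target is the load at t.
    """
    samples = []
    for t in range(n_lags, len(loads)):
        previous = loads[t - n_lags:t]
        samples.append((previous, loads[t]))
    return samples


# Synthetic stand-in for a half-hourly load series (100 observations).
series = list(range(100))
samples = lagged_samples(series)
```

A series of 100 observations yields 52 samples, since the first 48 observations have no complete history.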

Table 1 Data definition and sources

3.2 Data pre-processing and normalization (Footnote 2)

It is well recognized that the performance of any forecasting tool depends on how the data are prepared. It is therefore crucial to pre-process the data to ensure good network performance. The use of neural networks requires the normalization of the input and output vectors such that the data are scaled to the [0, 1] range. Data normalization aims to avoid high dispersion between errors and weights during the training process. MATLAB provides several normalization approaches. In our case, we use the mapminmax normalization method. The input and output variables are scaled as follows:

$$ \hat{x} = \frac{{x - x_{min} }}{{x_{max} - x_{min} }}, $$

where \( \hat{x} \), \( x \), \( x_{min} \) and \( x_{max} \) are the normalized value, the original value, the minimum and maximum values for a given variable, respectively.
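The scaling formula above can be written as a small Python sketch. The function names are hypothetical, and the inverse mapping is our addition: the text does not describe it, but it is the usual way to express scaled forecasts in the original unit again.

```python
def minmax_scale(values):
    """Scale a list of numbers to the [0, 1] range.

    Returns the scaled values together with the minimum and maximum,
    which are needed later to undo the scaling.
    """
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot scale a constant series")
    scaled = [(x - x_min) / (x_max - x_min) for x in values]
    return scaled, x_min, x_max


def minmax_invert(scaled, x_min, x_max):
    """Map scaled values back to the original range."""
    return [s * (x_max - x_min) + x_min for s in scaled]


# Illustrative loads in MW.
loads_mw = [1000.0, 1200.0, 1400.0]
scaled, lo, hi = minmax_scale(loads_mw)
restored = minmax_invert(scaled, lo, hi)
```

Here `scaled` is `[0.0, 0.5, 1.0]`. In a forecasting setting the minimum and maximum would normally be taken from the training data only, so that the test period does not leak into the scaling; this is a general precaution, not something stated in the text.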

3.3 The network architecture

The structure of a neural network depends on the selection of the input data (in our case, electricity demand in previous periods, meteorological and schedule variables). The electricity demand forecast is the output variable, where each node represents a half-hourly demand. As shown in Fig. 1, the network used in the current study is composed of three types of layers, namely an input layer, several hidden layers, and an output layer.

Fig. 1 The structure of the multilayer neural network used

The structure of the forecasting model takes into account the different factors influencing electricity demand, namely the previous loads, the type of day, the type of week, the type of month, the type of year and the maximum and minimum temperatures. Given these variables, the output layer is represented by 48 vectors (48 half-hours per day) and the input layer by 58 vectors having the following structure:

  • Vector 1 represents the month (1, 2, …, 12);

  • Vector 2 represents the year (1, 2, …, 8);

  • Vector 3 represents the week (1, 2, …, 51);

  • Vectors 4 to 6 represent the kind of day;

  • Vector 7 represents the minimum temperature for the day j;

  • Vector 8 represents the maximum temperature for the day j;

  • Vector 9 represents the minimum temperature for the day j − 7;

  • Vector 10 represents the maximum temperature for the day j − 7;

  • Vectors 11–58 are the previous 48 half-hourly loads.

Figure 2 shows that daily electricity demand depends on the day and the time of day. For instance, electricity demand is lowest between midnight and 6 am, when it ranges between 1000 and 1400 MW. Moreover, demand is lower on Sunday than on the other days, especially between 8 am and 6 pm (in Tunisia, Sunday is the day of rest and Monday is the first day of the working week). Finally, it is worth noting that demand is very similar on Tuesday, Wednesday and Thursday. To save space and avoid repetition, we refer only to Thursday in what follows, rather than to all three days.

Fig. 2 Load curves for a week from 05/01/2008 to 11/01/2008

3.4 The training process

The initial stage consists of dividing the whole dataset into three subsets. The first, reserved for training, is used to calculate the gradient and adjust the network weights. The second, the validation dataset, is used to evaluate the network’s ability to generalize. Finally, the third, the test dataset, is reserved for simulating the output on data not used during the training process and checking the performance of the developed model. It is important to note that there is no fixed rule for determining the sizes of these datasets. In our case, we used the early stopping method during the training process. That is, the training process is stopped when the validation error reaches its minimum. The study period is divided into two sub-periods: data from 2000 to 2007 are used for training and validation, while data from 2008 are used for testing the performance of the network.
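The early stopping rule described above can be sketched as follows. Since the minimum of the validation error is only known in hindsight, implementations usually stop after a fixed number of epochs without improvement; this `patience` parameter and the function name are assumptions for illustration, as the text does not specify the stopping tolerance.

```python
def early_stopping_epoch(validation_errors, patience=6):
    """Return the epoch at which the validation error was lowest.

    Training is considered stopped once `patience` consecutive epochs
    have failed to improve on the best validation error seen so far.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


# Illustrative validation errors recorded after each training epoch.
errors = [0.9, 0.5, 0.3, 0.35, 0.4, 0.45, 0.2]
stop_short = early_stopping_epoch(errors, patience=3)
stop_long = early_stopping_epoch(errors, patience=6)
```

With a patience of 3, training stops after epoch 5 and the weights of epoch 2 are kept; with a patience of 6, the later improvement at epoch 6 is still reached. The example shows why the tolerance matters when the validation error is not monotone.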

The current study uses the Levenberg–Marquardt training algorithm as a benchmark algorithm. This algorithm has the advantage of converging quickly with high precision.

As stated by Madić and Radovanović [34], the Levenberg–Marquardt algorithm is faster and finds better optima than other usual algorithms. Furthermore, Beale et al. [35] point out that this algorithm can converge from ten to one hundred times faster than conventional algorithms. The Levenberg–Marquardt algorithm has notably been used by Amjady and Keynia [36], Tanoto et al. [37] and Rodrigues et al. [38] to learn the input/output mapping function of the forecast process. To check the robustness of forecasts, we also used the resilient back-propagation algorithm and the conjugate gradient algorithm.

Unlike many previous studies, the optimal structure of the model is determined by the pattern search optimization algorithm and not by a trial-and-error process. This algorithm is used to determine the optimal structure of the multilayer neural network model (the number of hidden layers and the number of neurons in each layer). For a neural network composed of 58 input neurons and 48 output neurons (representing the half-hourly loads), the pattern search algorithm yields an optimal structure with three hidden layers: the first includes 18 neurons, the second 35 neurons, and the third 12 neurons.
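To give an idea of how a pattern search explores candidate structures, the sketch below implements a basic compass search over three integer layer sizes. It is a simplified illustration and not the implementation used in the study: the function names, the step schedule, the bounds and the starting point are assumptions, and the toy objective merely stands in for the validation error of a trained network (it is constructed so that its minimum lies at the structure reported above).

```python
def pattern_search(objective, start, step=8, min_step=1, bounds=(1, 64)):
    """Minimize `objective` over integer tuples with a compass search.

    At each iteration, every coordinate is moved by +step and -step.
    Improving moves are accepted; if none improves, the step is halved.
    """
    best = tuple(start)
    best_value = objective(best)
    while step >= min_step:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                candidate = list(best)
                # Keep each layer size inside the allowed range.
                candidate[i] = min(max(candidate[i] + delta, bounds[0]), bounds[1])
                candidate = tuple(candidate)
                value = objective(candidate)
                if value < best_value:
                    best, best_value = candidate, value
                    improved = True
        if not improved:
            step //= 2  # refine the mesh around the current best point
    return best, best_value


def toy_objective(sizes):
    """HYPOTHETICAL stand-in for the validation error of a trained network."""
    target = (18, 35, 12)
    return sum((s - t) ** 2 for s, t in zip(sizes, target))


best_sizes, best_value = pattern_search(toy_objective, start=(32, 32, 32))
```

In a real application, each call to the objective would train a network with the candidate layer sizes and return its validation error, which makes every evaluation expensive. A derivative-free method is a natural fit because the error is not a differentiable function of the layer sizes.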

4 Test results and discussion

4.1 Main results

The accuracy of the developed network in forecasting electricity demand has been tested on the year 2008. Using the Levenberg–Marquardt training algorithm, we obtain five electric load curves (Thursday, Friday, Sunday, Monday and Saturday) (Footnote 3). A comparison between actual and forecasted load profiles for these 5 days is presented in Fig. 3.

Fig. 3 Actual and forecasted load curves using the Levenberg–Marquardt training algorithm (in MW)

In order to evaluate the performance of the proposed ANN model, we use several indicators, namely the mean absolute error (MAE), the mean percentage error (MPE), the mean squared error (MSE), the mean absolute percentage error (MAPE) and the root mean square error (RMSE) (Footnote 4). Values of the five indicators calculated for the 5 days are presented in Table 2.
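For reference, the five indicators can be computed as in the Python sketch below. It uses the standard textbook definitions with the error taken as actual minus forecast and the percentage errors expressed relative to the actual value; the paper’s own formulas are given in its footnote and may differ in sign convention or scaling. The function name and the example values are hypothetical.

```python
import math


def forecast_errors(actual, forecast):
    """Compute MAE, MPE, MSE, MAPE and RMSE for two equally long series.

    MPE and MAPE are returned in percent; actual values must be nonzero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    residuals = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in residuals) / n
    mse = sum(e * e for e in residuals) / n
    mpe = 100.0 * sum(e / a for e, a in zip(residuals, actual)) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(residuals, actual)) / n
    return {"MAE": mae, "MPE": mpe, "MSE": mse, "MAPE": mape, "RMSE": math.sqrt(mse)}


# Illustrative loads in MW, not taken from the study.
metrics = forecast_errors([1000.0, 1200.0, 1500.0], [990.0, 1230.0, 1500.0])
```

The MPE keeps the sign of the errors and therefore reveals systematic over- or under-forecasting, whereas the MAPE measures the average size of the relative error regardless of direction.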

Table 2 Forecasting performance results of the Levenberg–Marquardt learning algorithm

Values of the performance indicators confirm the ability of the Levenberg–Marquardt algorithm to provide good forecasts. In particular, the obtained MAPE values, which range between 1.125% and 3.410%, indicate satisfactory forecasting accuracy.

4.2 Comparison with other training algorithms

In order to check the performance of the Levenberg–Marquardt algorithm, we repeat the same task using two additional algorithms, namely the resilient back-propagation (Footnote 5) and the conjugate gradient (Footnote 6) algorithms. To determine the optimal structure of the multilayer neural network model (the number of hidden layers and the number of neurons in each layer), we employ the pattern search algorithm, whose results are presented in Table 3.

Table 3 The pattern search results

The neural network based on the resilient back-propagation training algorithm is composed of 58 input neurons, 48 output neurons representing the half-hourly loads and three hidden layers composed of 21, 20 and 20 neurons, respectively. The neural network based on the conjugate gradient training algorithm is also composed of 58 input neurons and 48 output neurons, with three hidden layers composed of 28, 5 and 18 neurons, respectively (Footnote 7). Figure 4 plots the evolution of the actual and forecasted loads using the three mentioned algorithms.
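To compare the sizes of the three structures reported in the paper, the following Python sketch counts their trainable parameters. It assumes fully connected layers with one threshold (bias) per neuron, which the text does not state explicitly; the function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feed-forward network.

    `layer_sizes` lists the number of neurons per layer, from input to output.
    """
    return sum(
        n_in * n_out + n_out  # weights plus one bias per destination neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Layer sizes as reported in the paper for each training algorithm.
architectures = {
    "Levenberg-Marquardt": [58, 18, 35, 12, 48],
    "resilient back-propagation": [58, 21, 20, 20, 48],
    "conjugate gradient": [58, 28, 5, 18, 48],
}
parameter_counts = {name: count_parameters(sizes) for name, sizes in architectures.items()}
```

Under these assumptions the three networks have 2783, 3107 and 2817 parameters, respectively, so the structures are of comparable size and differences in accuracy are unlikely to stem from model size alone.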

Fig. 4 Actual and forecasted load curves using the three training algorithms (in MW)

As a preliminary observation, the Levenberg–Marquardt algorithm appears more accurate than the conjugate gradient algorithm and the resilient back-propagation algorithm. The forecasting power of the three algorithms is evaluated based on the five performance indicators previously mentioned and the correlation coefficient (Footnote 8).

The results in Table 4 suggest that the electricity demand forecasts based on the three training algorithms are all acceptable. However, they also indicate that the ANN based on the Levenberg–Marquardt algorithm is the most accurate in forecasting the electric load. The reported MAPE values of the Levenberg–Marquardt training algorithm are always lower than those of the conjugate gradient algorithm and the resilient back-propagation algorithm, regardless of the day considered. Moreover, the graphs presented in Appendix C suggest a good agreement between predicted and observed values for the training, validation, test and overall datasets. Although the overall correlation coefficient is high for all three training algorithms, the Levenberg–Marquardt algorithm performs better than the other two since its correlation coefficient is the highest (0.98245 for the Levenberg–Marquardt algorithm versus 0.98025 for the resilient back-propagation algorithm and 0.97214 for the conjugate gradient algorithm).

Table 4 Comparison of the forecasting performance

5 Concluding remarks

This paper aims to evaluate the performance of artificial neural networks in forecasting electricity demand in Tunisia. The model is a multilayer perceptron (MLP) artificial neural network that forecasts the short-term load curve using half-hourly electricity demand data as inputs. In addition, several meteorological and calendar variables are included in order to improve the quality of forecasts. Contrary to many earlier studies on the subject, the structure of the neural network has not been chosen arbitrarily. Instead, we applied the pattern search optimization algorithm, which determines the number of hidden layers and the number of neurons in each layer. The experimental results suggest that the Levenberg–Marquardt training algorithm performs well in forecasting half-hourly electricity demand. In order to check the performance of this algorithm, we employ two additional algorithms frequently used in the literature, namely the MLP with conjugate gradient and the MLP with resilient back-propagation. Results are compared based on several performance measures. In all cases, results show that the MLP with the Levenberg–Marquardt algorithm is the most accurate tool for forecasting half-hourly electricity demand.

Although the optimization and training algorithms used perform well in forecasting short-term electricity demand, neural networks still suffer from some limitations. Future research on electricity demand forecasting may implement hybrid models combining neural networks with econometric models.