1 Introduction

Crude oil constitutes the largest share of the world’s energy sources and is a widely traded commodity (Yu et al. 2008). The price of crude oil contributes fundamentally to world economic growth and has a direct impact on the economies of oil-exporting and oil-importing countries (Godarzi et al. 2014). Therefore, many businesses and governments are affected by crude oil price fluctuations and seek to analyze and understand the behavior of oil prices in order to make better decisions and select better policies.

There are numerous elements that influence crude oil prices. First, economic factors are the major elements in the rising and falling of crude oil prices, i.e. crude oil supply and demand like any other commodity (Godarzi et al. 2014). Second, crude oil prices are affected by technical aspects of oil industry such as refining capacity and above the ground oil inventories as well as chemical properties of oil such as density or viscosity. In addition, production technologies such as shale oil extraction can considerably affect crude oil supply and subsequently oil prices. Third, geopolitical tensions and crises also play an important role in determination of crude oil prices. Since world’s largest oil reserves are located in the Middle East, a politically unstable region, geopolitical factors can sharply disrupt oil prices (Movagharnejad et al. 2011). Furthermore, economic sanctions imposed on major oil producing countries can impact oil supply and upset global energy markets.

Driven by forces of different nature, the price of crude oil is highly volatile and hard to predict. Numerous efforts have been made to provide insight into the behavior of oil prices. Some researchers have used econometric and statistical methods for forecasting the price of oil. Park and Ratti (2008) used a vector autoregressive (VAR) model to study oil price shocks in relation to stock markets. Similarly, Aloui and Jammazi (2009) used wavelet analysis and a Markov switching vector autoregressive model to analyze the behavior of crude oil prices with stock market returns in France, Japan and the UK. Cheong (2009) developed a flexible autoregressive conditional heteroskedasticity (ARCH) model to forecast the West Texas Intermediate (WTI) and Brent crude oil markets. Wei et al. (2010) used generalized autoregressive conditional heteroskedasticity (GARCH) models to capture the volatility of WTI and Brent crude oil prices.

Moreover, complexity and nonlinear behavior of oil prices have convinced researchers to use artificial intelligence methodologies such as artificial neural networks and machine learning in their forecasting models in order to deal with chaotic movements of oil prices. Mingming and Jinliang (2012) proposed a multiple adaptive wavelet recurrent neural network model using gold prices. Pan et al. (2009) provided short-term forecasts for crude oil price with an artificial neural network using futures and gold prices along with Dollar index as input factors.

Furthermore, there is extensive literature on the relationship between crude oil prices and monetary policy. Amano and van Norden (1998) discuss how oil prices capture exogenous terms-of-trade shocks and emphasize the structural relationship between exchange rates and the oil price in the long run. Leduc and Sill (2004) explore the dynamics of oil-price shocks and monetary policy, highlighting the role of monetary policy in shaping the response of aggregate output to crude oil prices. In a later study, Kilian (2006) categorizes oil-price shocks by origin and quantifies them in comparison with macroeconomic policy. Krichene (2006) further explains the relationship between monetary policy and oil prices and how exchange and interest rates disturb oil market equilibrium. The empirical analysis, based on a world demand and supply model, indicates that oil price stability depends on monetary policies. Yousefi and Wirjanto (2004) formulate the reaction of major oil-producing countries to changes in the exchange rate of the US dollar and illustrate a relationship between crude oil prices and the exchange rate, suggesting the inability of OPEC to act as a uniform determinant of oil prices.

Crude oil has also long been studied as an exhaustible natural resource. In an earlier study, Stiglitz (1976) compares the rate of exploitation of an exhaustible natural resource under the profit maximization objective of a monopolist with that of a competitive market. He indicates that monopoly prices and market equilibrium prices are almost the same under mild conditions. Neumayer (2000) further explores natural resource availability and alternative options for economic growth in light of resource constraints and concludes that elimination of resource scarcity is theoretically impossible. Greiner et al. (2012) develop an empirical analysis of the oil exploitation rate and uncover a U-shaped relationship between oil price and extraction rate. Maslyuk and Smyth (2008) indicate an upward trend in the oil price pattern corresponding to supply constraints, suggesting consistency with the theory of exhaustible natural resources by Hotelling (1931).

Although many econometric and artificial intelligence models have been used to forecast crude oil prices, little attention has been focused on including parameters that take into account the exhaustible nature of crude oil and effects of monetary policy that are recently having a major impact on crude oil prices given the shift in the dynamics of oil markets. As crude oil is being increasingly traded in spot and future markets such as NYMEX, a more detailed analysis of monetary policies is required to be incorporated into forecasting models to capture this recent and essential element of the oil market (Askari and Krichene 2008). In addition, recent developments in the upstream exploration and extraction suggest a new era for crude oil with tighter scarcity of supply which dictates more careful assumptions regarding modeling supply and exhaustibility of crude oil in the forecasting model. Furthermore, some models introduced in the literature have not had a comprehensive understanding of oil market dynamics and drivers of crude oil price and have sometimes incorporated somewhat irrelevant variables such as gold prices or oil futures prices in their forecasting models (Hamilton 2008). While crude oil, like any other fossil fuel, is exhaustible, most models have not taken into consideration the exhaustible nature of crude oil and depletion of oil reserves. Moreover, monetary policy can play a major role in oil price movements and this crucial element has often been missing from forecasting models. This paper aims to provide a thorough modeling of crude oil market elements and key drivers of oil prices that include major aspects of oil price dynamics such as monetary policy and exhaustibility of crude oil. Meanwhile, the effectiveness of artificial neural networks as powerful computational tools for modeling nonlinear and volatile behavior of crude oil prices is tested by comparing its accuracy with results from an estimated VAR model.

In particular, the present paper connects to and builds on research carried out in three related areas: (1) Artificial Neural Network structure determination, (2) feature engineering, and (3) crude oil prices economic modeling.

(1) Artificial Neural Network structure determination. In this paper, rather than depending solely on the minimization of the loss function, a data-driven procedure is implemented to provide more robust weight approximation for the neural net. This is particularly effective in the context of crude oil price forecasting given the intrinsic volatility of the prices. Thus, the robust, data-driven procedure allows for more reliable estimation and weight configuration within the ANN setting. Furthermore, the neural network implemented in this paper includes an internal mechanism that bounds the number of neurons in the hidden layer to avoid overfitting. This feature is especially important because the available data set is sparse and therefore prone to overfitting. A final consideration in designing the ANN structure has been to allow for a correspondence between the interpretability of the results and the research question. The connection between the machine learning structure and the context of crude oil price forecasting from an economic modeling perspective is somewhat absent in the literature, and this paper attempts to bridge this gap by incorporating and discussing both the economic and the analytics aspects of the problem.

(2) Feature engineering. Another contribution of this paper is a discussion of the potential mistakes and dangers of using machine learning approaches in the prediction of commodity prices, and crude oil prices in particular. As discussed in the paper, careful feature engineering helps ensure the validity of the results, whereas many related works in the literature use machine learning approaches with irrelevant features such as futures or gold prices, which may seemingly produce accurate results while in fact having little value in terms of economic interpretability. This paper addresses this issue, which can largely impact future research involving machine learning and economic forecasting.

(3) Crude oil price economic modeling. Following the recent transformations in the global economy and particularly in crude oil markets, this paper takes a novel modeling approach to crude oil price forecasting in which important variables such as monetary policy and the exhaustible nature of crude oil are incorporated in the prediction model. This approach is a continuation of recent work such as Askari and Krichene (2010a) and also reflects the dominant realities of the global crude oil market.

The remainder of this paper is organized as follows. Section 2 briefly reviews the underlying mechanism of artificial neural networks and the process of identifying optimal structure of the proposed MLP neural network as well as the theoretical background for VAR models. Section 3 discusses the selection of key variables that determine the price of crude oil, data collection and results of the proposed ANN model. Moreover, the vector autoregressive estimation for the forecasting model is presented and the results are validated by appropriate statistical tests and then compared with those of the ANN model. Section 4 discusses the significance of the findings and relevant policy issues. Finally, concluding remarks and policy implications are drawn in Sect. 5.

2 Methodology

This paper investigates predictability of crude oil prices using artificial neural networks and vector autoregressive models. This section reviews the theoretical background for the methods used for forecasting of the crude oil prices.

2.1 Artificial Neural Network Models

Artificial Neural Networks (ANNs) are computational tools adopted from network of neurons in human brain and are capable of mapping nonlinear relationships between inputs and outputs. ANNs’ efficiency in learning mechanisms has induced considerable interest in researchers from various fields and artificial neural networks are becoming increasingly used in statistical science, engineering and energy studies (Movagharnejad et al. 2011).

ANNs imitate the learning process in the human brain using inter-connected units named neurons. Connections between neurons are adjusted by weights. A common structure of ANNs is a feed-forward neural network consisting of an input layer, an output layer and one or more hidden layers. This structure is also known as a multi-layer perceptron (MLP). Figure 1 shows a typical MLP neural network with one input layer, one hidden layer and an output layer. Data flow from the input layer to the output layer through the hidden layer(s), and the weights are then determined by the learning process, which is carried out by the back-propagation algorithm. This algorithm optimizes a quadratic cost function. The number of neurons in the input layer is determined by the number of independent variables, and the number of neurons in the output layer represents the number of dependent variables (Boroushaki et al. 2003).

Fig. 1 An MLP neural network (reproduced with permission from Boroushaki et al. 2003)

In this paper, an MLP neural network is developed with an input layer consisting of nine neurons that account for the nine identified variables determining the price of oil, and an output layer with one neuron that represents the target price of crude oil. In theory, there can be one or several hidden layers, but the universal approximation theorem suggests that a network with a single hidden layer and a sufficiently large number of neurons can approximate any continuous input–output mapping (Tambe et al. 1996). Therefore, the proposed neural network has a single hidden layer. In order to determine the optimal number of neurons in the hidden layer, an artificial neural network is trained several times for different numbers of neurons in the hidden layer, and the structure with the least mean square error (MSE) for the test data is chosen as the best possible structure with a sufficient number of neurons in the hidden layer.

Thus, artificial neural networks, as a type of algorithm that learns underlying patterns in data, can help discover patterns embedded in the complex processes of crude oil price determination merely by analyzing the available data, without imposing any structural form on the model. The multi-layer perceptron used in this paper is a class of ANNs that allows for recognition of hidden patterns between the features and the variable of interest (i.e. the outcome). This is done by adjusting the weights connecting the nodes. The data are passed through these nodes and the corresponding weights are then adjusted so as to minimize the loss function. Through a series of linear and nonlinear transformations, the weights are adjusted to provide the best possible outcome for a given set of features.

2.2 Vector Autoregressive Models (VAR)

Vector autoregressive models are a class of econometric techniques used for forecasting and economic analysis. The underlying assumption of VAR models is that present values of variables can be explained by past values of the variables involved (Lütkepohl 2009).

A VAR(p) model (VAR model of order p) is specified in Eq. (1):

$$\begin{aligned} y_t =v+A_1 y_{t-1} +\cdots +A_p y_{t-p} +u_t \end{aligned}$$
(1)

where \(y_{t}=(y_{1t},{\ldots },y_{Kt})^{T}\) is a \((K\times 1)\) random vector, the \(A_{i}\) are fixed \((K\times K)\) coefficient matrices, and \(v=(v_{1},{\ldots },v_{K})^{T}\) is a fixed \((K\times 1)\) vector of intercept terms allowing for the possibility of a nonzero mean \(E(y_{t})\). Finally, \(u_{t}=(u_{1t},{\ldots },u_{Kt})^{T}\) is a K-dimensional white noise process, i.e. a sequence of random vectors with a mean of zero, a constant covariance matrix and zero covariances across time periods.
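As an illustration of Eq. (1), the following minimal Python sketch iterates the VAR recursion forward with the disturbance \(u_{t}\) set to its mean of zero, which yields point forecasts. The function name `var_forecast` and all numerical values are hypothetical; they are not estimates from this paper.

```python
def var_forecast(v, A_list, history, steps):
    """Iterate y_t = v + A_1 y_{t-1} + ... + A_p y_{t-p} with u_t set to zero."""
    p = len(A_list)                      # order of the VAR
    k = len(v)                           # number of variables K
    past = [list(y) for y in history[-p:]]   # last p observations, oldest first
    forecasts = []
    for _ in range(steps):
        y_next = list(v)                 # start from the intercept vector
        for lag, A in enumerate(A_list, start=1):
            y_lag = past[-lag]           # y_{t-lag}
            for i in range(k):
                y_next[i] += sum(A[i][j] * y_lag[j] for j in range(k))
        forecasts.append(y_next)
        past.append(y_next)              # forecasts feed the next step
    return forecasts


# Hypothetical bivariate VAR(1); the numbers are purely illustrative.
v = [1.0, 0.5]
A1 = [[0.5, 0.1],
      [0.0, 0.3]]
path = var_forecast(v, [A1], history=[[2.0, 1.0]], steps=2)
print(path)
```

Because each forecast is fed back into the recursion, multi-step forecasts depend only on the last p observed vectors and the estimated coefficients.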

A major characteristic of the variables estimated by a VAR model is stationarity. If the underlying processes of the time series in question are non-stationary, some variables may be integrated or cointegrated. Accordingly, Vector Error Correction Models (VECM) would be suitable to model the cointegration. The VECM(p) model i.e. a vector error correction model of order p for K variables is illustrated in Eq. (2):

$$\begin{aligned} \Delta y_{t}=C+\Gamma _{0} y_{t-1} +\sum _{i=1}^{p-1} {\Gamma _i \Delta y_{t-i} } +u_t \end{aligned}$$
(2)

where \(\Delta y_{t}=y_{t}- y_{t-1}\), C is the vector of K constants, \({\Gamma }_{i}\) are the coefficient matrices, and \(u_{t}\) is the vector of white noise.
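For reference, Eq. (2) can be obtained from Eq. (1) by subtracting \(y_{t-1}\) from both sides and collecting terms in differences. Under this standard reparameterization, and assuming the same lag order p in both equations, the coefficients are related as follows:

$$\begin{aligned} \Gamma _{0}&=-(I_{K}-A_{1}-\cdots -A_{p}), \\ \Gamma _{i}&=-(A_{i+1}+\cdots +A_{p}),\quad i=1,\ldots ,p-1, \\ C&=v. \end{aligned}$$

For example, with \(p=1\) the sum over lagged differences is empty and the model reduces to \(\Delta y_{t}=v+(A_{1}-I_{K})y_{t-1}+u_{t}\).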

Cointegration of an estimated VAR model indicates long-term relations. If short-term forecasting or analysis is intended, the VEC models can be used to separate the short term dynamics from long-term relations (Lütkepohl 2005).

In simpler terms, the vector auto regressive method assumes dependency between variables denoting the evolution of a feature (or variable) throughout time. This is particularly the case when we are dealing with series of data points correlated over time such as oil prices that indicate the price of the same commodity over different time periods. This sort of formulation enables the capturing of inter-dependent characteristics of the time series as well as exogenous elements affecting the outcome.

This paper develops a VAR model for forecasting the prices of crude oil and compares the performance of the proposed VAR model against the ANN model based on the R-squared index.

3 Model Specification and Empirical Results

Annual data for demand and supply side factors have been used in this work. On the demand side, annual percent change in world GDP has been taken from IMF economic dataset (“IMF Data” n.d.). It reflects world economic activity and growth which is a major factor in oil demand (Kilian and Murphy 2013). This factor is particularly important taking into account the dependency of crude oil prices on world economic activity and responsiveness of crude oil demand to changes in this activity that indicates an increasingly significant pressure on oil demand from the upward trend of economic activity. Monetary policy is another influential demand-side element in crude oil prices. Monetary policy has a tremendous impact on commodity markets and consequently on crude oil prices. Demand price elasticity for crude oil is insignificant which combined with inflexibility of supply can cause sharp surges in crude oil prices. Monetary policy has a considerable impact on crude oil markets through interest rates and dollar exchange rates. Interest rates have a dramatic effect on the demand for crude oil as low interest rates lead to increased economic activity. This means that volatile nature of crude oil prices can then be disrupted by aggressive monetary policies such as low interest rates that combined with natural rigidity of oil supply bring about escalated oil prices (Askari and Krichene 2010b). Furthermore, exchange rate of US dollar, main currency used in quoting oil prices, can have direct impact on the nominal price of oil (Askari and Krichene 2010a). In order to account for the impact of monetary policy on crude oil prices, Federal Reserve Bank interest rate and Major Currencies Exchange Rate for US dollar have been collected (“FRB: Data Releases” n.d.). Meanwhile, since crude oil prices are set in New York Commodities Exchange, crude oil price movements can be seen as part of the larger movement of commodity markets. 
Therefore, data for commodity price index for industrial inputs including agricultural raw materials and metal price indices have been collected from IMF dataset. Commodity markets price index is particularly important for it is another medium of reflecting effects of monetary policies which strongly influence crude oil prices (Askari and Krichene 2010c).

On the supply side, the exhaustibility of oil is becoming a major concern, especially taking into account the depletion of oil reserves in major oil-producing countries. As more and more oil is extracted from a given reservoir over the years, it becomes increasingly harder to remove the remaining oil. Therefore, drilling additional wells are required and this is done by moving to new geographical areas. Yet major oil producing sites such as the ones in Texas and Saudi Arabia are showing substantial decline in production. In fact, some have begun to speculate that Ghawar oil field, world’s largest conventional oil field, is in decline. This can mean that an overall decline in global oil production may not be that distant (Hamilton 2008). This important feature of crude oil consequently affects oil production and above-the-ground oil inventories (Brandt 2010). Apart from the issue of depletion of oil reserves that fundamentally influences crude oil production, other issues such as geopolitical tensions and political instabilities, such as crises in Iraq and Nigeria, play an important role in the production of crude oil. Therefore, it is essential to include production patterns in the forecasting model. This becomes more important as the inability to increase production from stable regions has caused the world supply to further depend on more unstable oil producers such as the ones in the Middle East (Hamilton 2008). This dependency on unreliable oil producers consequently highlights the need for strategic oil inventories. Industrial countries such as China that are major consumers of crude and have little or no oil production usually store large volumes of oil in order to safeguard their economy from potential supply disruptions. On the other hand, major oil-producing countries such as OPEC members seek to keep their inventories on a specific level to control the supply in a way that best serves their need. 
Thus, inclusion of the level of oil inventories in the model is of great importance. To this end, corresponding data for world oil stocks, world proved oil reserves and world oil consumption are extracted from the United States’ Energy Information Administration (EIA) (“U.S. Energy Information Administration (EIA)—Data” n.d.). Natural gas prices also have a strong effect on crude oil prices. The natural gas industry was revived after the 1973 oil shocks, and natural gas subsequently emerged as a domestically and internationally valuable product. The growth in natural gas production considerably eased the existing pressure on crude oil demand and prices, which indirectly affected crude oil supply. Natural gas and crude oil are in many cases substitutes and are often produced by the same companies. Moreover, crude oil and natural gas upstream investments are almost indistinguishable and they are associated products (Askari and Krichene 2010b). Therefore, the price of natural gas, obtained from the EIA database, is incorporated in the forecasting model. Details of the model variables are shown in Table 1. Target crude oil prices are the simple average of three spot prices, Dated Brent, West Texas Intermediate and Dubai Fateh, also taken from the EIA database.

Table 1 Variables used in the forecasting model

Thus the proposed model for crude oil prices is given by:

$$\begin{aligned} P_{t}&=\gamma _{\mathrm{p}} P_{t-1} +\gamma _{\mathrm{GDP}} \mathrm{GDP}_{t} +\gamma _{\mathrm{ER}} \mathrm{ER}_{t} +\gamma _{\mathrm{CI}} \mathrm{CI}_{t} +\gamma _{\mathrm{IR}} \mathrm{IR}_{t} +\gamma _{\mathrm{OR}} \mathrm{OR}_{t} \\&\quad +\gamma _{\mathrm{OS}} \mathrm{OS}_{t} +\gamma _{\mathrm{OP}} \mathrm{OP}_{t} +\gamma _{\mathrm{OC}} \mathrm{OC}_{t} +\gamma _{\mathrm{GP}} \mathrm{GP}_{t} +c+u_{t} \end{aligned}$$
(3)

It should be noted that some variables commonly used in the crude oil price forecasting literature are not included in the forecasting model of the present work, namely crude oil futures prices and gold prices. As for futures prices, it has been shown that crude oil futures prices are only as good a forecast variable as spot prices, and spot prices may even provide better forecasts than futures prices (Alquist and Kilian 2010). Moreover, gold prices move closely in the same direction as crude oil prices, but this is mainly because both are propelled by the same trend as part of a general movement in commodity markets. Therefore, it is difficult to conclude that gold prices actually affect crude oil prices; it is more likely that they move in tandem rather than influencing each other’s movements (Askari and Krichene 2010b).

3.1 MLP Estimation

In order to improve the performance of an ANN, data are usually scaled (Basheer and Hajmeer 2000). Using the following formula, the data used in this paper are scaled to the range [−1, 1].

$$\begin{aligned} x_{scaled}=2\times \frac{x-x_{min} }{x_{max} -x_{min} }-1 \end{aligned}$$
(4)
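Equation (4) can be sketched in a few lines of Python. The function names `scale_to_unit_range` and `unscale` and the sample values are illustrative assumptions; the inverse transform is included because network outputs on the scaled range have to be mapped back to price units.

```python
def scale_to_unit_range(values):
    """Apply Eq. (4): map the minimum to -1 and the maximum to +1."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("a constant series cannot be scaled")
    return [2.0 * (x - x_min) / (x_max - x_min) - 1.0 for x in values]


def unscale(scaled, x_min, x_max):
    """Invert Eq. (4) to recover values in the original units."""
    return [(s + 1.0) / 2.0 * (x_max - x_min) + x_min for s in scaled]


prices = [20.0, 35.0, 50.0, 80.0]        # hypothetical prices, not the paper's data
scaled = scale_to_unit_range(prices)
restored = unscale(scaled, min(prices), max(prices))
print(scaled)
```

In this sketch the minimum and maximum are taken from the series being scaled; whether they should come from the training subset only is a design choice that the formula itself does not settle.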

Out of 35 observations, 25 were used for training, five for validation and five for testing. In order to reduce the randomness of the results, oil prices from 2010 to 2014 were used as fixed test target prices. In this paper, back-propagation is carried out with the Levenberg–Marquardt training algorithm, using a tangent sigmoid transfer function (tansig) for the hidden layer and a linear transfer function (purelin) for the output layer. A major source of variation in the MSE of a neural network with a given number of hidden-layer neurons is the random selection of weights and biases at the initial stage of training, which differs in each separate implementation of the ANN. In order to cope with this issue, different topologies were investigated, with the number of neurons in the hidden layer ranging from two to 20, and the corresponding MSE for the test data was recorded. ANNs with more than 20 neurons in their hidden layer had higher MSE, and therefore testing was limited to a maximum of 20 hidden-layer neurons. Figure 2 shows the least MSE for each number of neurons in the hidden layer. The optimum model is selected based on the least value of MSE for the test data. It can be seen that a neural network with 15 neurons in its hidden layer has the best performance. The best performance of this MLP neural network with a 15-neuron hidden layer has an R-squared of 0.99 and an MSE of 2.8781. For the test data, the R-squared index is 0.99 and the MSE is 2.6398e−06.

Fig. 2 Performance of the proposed MLP neural network based on least MSE for different numbers of neurons in the hidden layer
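The topology search described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the paper: plain stochastic gradient descent replaces the Levenberg–Marquardt algorithm, the data are synthetic with two inputs instead of nine, and only two to five hidden neurons are tried. All function names are hypothetical.

```python
import math
import random


def init_net(n_in, n_hidden, rng):
    # Each hidden row holds n_in weights plus a bias as its last entry.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def predict(net, x):
    w1, w2 = net
    # tanh hidden layer (tansig) followed by a linear output (purelin)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden)) + w2[-1], hidden


def mse(net, data):
    return sum((predict(net, x)[0] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs=200, lr=0.05):
    w1, w2 = net
    for _ in range(epochs):
        for x, y in data:
            out, hidden = predict(net, x)
            err = out - y                               # gradient of 0.5 * err^2
            for j, h in enumerate(hidden):
                grad_h = err * w2[j] * (1.0 - h * h)    # back-propagate through tanh
                w2[j] -= lr * err * h
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * grad_h * xi
                w1[j][-1] -= lr * grad_h
            w2[-1] -= lr * err
    return net


def select_hidden_size(train_set, test_set, sizes, restarts=3, seed=0):
    rng = random.Random(seed)
    n_in = len(train_set[0][0])
    least = {}
    for n_hidden in sizes:
        # Several random initializations per size; keep the least held-out MSE.
        scores = [mse(train(init_net(n_in, n_hidden, rng), train_set), test_set)
                  for _ in range(restarts)]
        least[n_hidden] = min(scores)
    return min(least, key=least.get), least


data_rng = random.Random(1)

def make_rows(n):
    rows = []
    for _ in range(n):
        x = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
        rows.append((x, math.tanh(x[0]) - 0.5 * x[1]))   # synthetic target
    return rows

train_set, test_set = make_rows(25), make_rows(5)
chosen, least_mse = select_hidden_size(train_set, test_set, sizes=range(2, 6))
print(chosen, least_mse[chosen])
```

Repeating the training from several random starting points for each size is what separates the effect of the topology from the effect of the initial weights.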

3.2 VAR Estimation

In order to validate the effectiveness of the proposed ANN model, a vector autoregressive model is implemented to forecast the price of crude oil. Vector autoregressive (VAR) models are a class of econometric models used for forecasting and structural analysis. It is shown that VAR models including key variables that determine the price of crude oil have higher forecast accuracy and lower real-time mean-squared prediction error (MSPE) (Baumeister and Kilian 2012). Therefore, an economic model of crude oil prices including the above-mentioned variables is formed and then estimated using a vector autoregressive model.

Another reason for using a vector autoregressive model is its ability to capture interdependencies among multiple time series. This is particularly important taking into account the dependency of the crude oil price on previous years’ prices, which can be identified and analyzed by a vector autoregressive model. Consequently, we need to determine the appropriate VAR order to develop an effective prediction model. Since forecasting is the objective here, it is best to use the Akaike Information Criterion (AIC) for choosing the optimal lag order of the VAR model (Akaike 1969). Accordingly, different lag orders for the estimated VAR model are compared using AIC as well as other criteria, namely FPE (final prediction error), LR (sequential modified likelihood ratio test), SC (Schwarz criterion) and HQ (Hannan–Quinn criterion). Table 2 indicates the selected lag order for each criterion by an asterisk “*”. It can be concluded that an order of one is the suitable lag order of the proposed VAR model.

Table 2 Lag order selection criteria and selected VAR order

Results of the estimated regression are reported in Table 3.

Table 3 Estimated parameters of the economic model

VAR models need to be tested for a number of properties in order to assure the validity of the estimation output. First, stationarity of the considered system is a crucial issue because it is the underlying assumption of vector autoregressive models. Therefore, it is important to test whether the estimated VAR model is stable (stationary). If all inverse roots of the characteristic AR polynomial have modulus less than one, i.e. lie within the unit circle, the estimated model is stationary (Lütkepohl 2005). The proposed VAR model in this paper has a root of 0.598329, which satisfies the condition for stationarity of the system.
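For a VAR(1), the inverse roots of the characteristic polynomial coincide with the eigenvalues of \(A_{1}\), so the stability condition can be checked directly on the coefficient matrix. The sketch below does this for a hypothetical bivariate case, where the eigenvalues have a closed form; the matrix entries are illustrative and the function names are assumptions.

```python
import cmath


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)   # may be complex
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def is_stable(A):
    """A VAR(1) is stable if every eigenvalue of A_1 has modulus below one."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(A))


A1 = [[0.5, 0.1],
      [0.0, 0.3]]                 # hypothetical coefficient matrix
roots = eigenvalues_2x2(A1)
print([abs(r) for r in roots], is_stable(A1))
```

For higher orders or more variables the same check applies to the eigenvalues of the companion matrix, which would require a numerical eigenvalue routine rather than this closed form.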

Another important test is the normality of residuals. Nonnormal residuals mean that the model is not a good representation of the real system. Therefore, in order to verify that the selected variables are appropriately chosen as key elements determining the price of crude oil, the proposed VAR model should be tested for normality of residuals. Jarque–Bera test statistic is then computed. This statistic can be compared with a chi-square distribution with two degrees of freedom. The null hypothesis of normality of distribution is rejected if the computed statistic exceeds a critical value from chi-square distribution. The Jarque–Bera statistic for the proposed VAR model is computed to be 2.534756 and \(\chi _{0.90} ^{2}(2)= 4.605\). Thus, the null hypothesis of a Gaussian distribution of residuals cannot be rejected. This means that the proposed model is an acceptable representation of the real process generating crude oil prices and consequently the selected variables are indeed key elements in determination of crude oil prices (Lütkepohl 2005).
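The test decision above can be reproduced with a short Python sketch. It assumes the usual univariate form of the Jarque–Bera statistic, \(JB=\frac{n}{6}\left( S^{2}+\frac{(K-3)^{2}}{4}\right) \), with S the sample skewness and K the sample kurtosis, and uses the fact that the chi-square distribution with two degrees of freedom has the closed-form quantile \(-2\ln (1-\alpha )\). The function names are hypothetical, and only the reported statistic 2.534756 is taken from the text.

```python
import math


def jarque_bera(residuals):
    """Univariate Jarque-Bera statistic from sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def chi2_2df_quantile(level):
    """Chi-square(2) has CDF 1 - exp(-x / 2), so the quantile is -2 ln(1 - level)."""
    return -2.0 * math.log(1.0 - level)


critical = chi2_2df_quantile(0.90)
reject = 2.534756 > critical      # statistic reported in the text
print(round(critical, 3), reject)
```

With the reported statistic below the critical value, the null hypothesis of normal residuals is not rejected, in line with the conclusion stated above.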

Another important issue regarding the application of vector autoregressive models is co-integration analysis. Using the Augmented Dickey–Fuller (ADF) test, the time series of the model variables are tested for unit roots in order to determine the order of integration of each series. Results of the unit root test are shown in Table 4. Except for GDP and the exchange rate, the time series of the model variables are integrated of order one, i.e. I(1). Given the stationarity of the estimated VAR model, we can conclude that the proposed vector autoregression is co-integrated. Consequently, the model addresses long-run relations between the variables (Askari and Krichene 2010b). This is consistent with the nature of the proposed model and the variables therein. Most variables included in the model, such as monetary policy, GDP growth and oil reserves, influence crude oil prices on a long-term basis, and therefore the policies drawn from the model also address long-term planning regarding investments in conventional or alternative (clean) energies as well as energy efficiency schemes that mainly involve lengthy socio-technological developments. Moreover, it has been found that unrestricted VAR models show forecast superiority over vector error correction models (VECM), which are used for capturing the short-term dynamics of co-integrated systems (Park and Ratti 2008).

Table 4 ADF unit root test with the null hypothesis of existence of unit roots at the 5% significance level

Based on positive results of testing the vector autoregressive model for stationarity of the system, optimal lag order and normality of residuals, it can be concluded that the estimation output of the proposed VAR model is reliable. Furthermore, it can be argued that based on the result of the Jarque–Bera test, variables used in both VAR and ANN models have correctly been identified as key drivers of crude oil prices.

Since both the VAR and ANN models have successfully represented the process generating crude oil prices, the performance of the two models can be compared in order to determine which model provides more accurate forecasts of the crude oil price. Table 5 compares the performance of the proposed ANN and VAR models based on quality-of-fit measures. It can be concluded that the ANN model is capable of more accurate forecasts than the VAR model. This can be attributed to the ability of ANNs to map nonlinear relationships between inputs and outputs, which is valuable when predicting highly volatile crude oil prices.

Table 5 Comparison of ANN and VAR model accuracy
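The quality-of-fit measures used for such a comparison, MSE and the R-squared index, can be computed as in the following sketch. The series and the two sets of fitted values are hypothetical and do not reproduce the entries of Table 5.

```python
def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """One minus the ratio of residual to total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [10.0, 20.0, 30.0, 40.0]      # hypothetical observed prices
model_a = [11.0, 19.0, 31.0, 39.0]     # hypothetical fitted values, model A
model_b = [14.0, 16.0, 34.0, 36.0]     # hypothetical fitted values, model B

for name, fitted in (("A", model_a), ("B", model_b)):
    print(name, mean_squared_error(actual, fitted), r_squared(actual, fitted))
```

Both measures rank the two hypothetical models in the same order here, which is always the case when they are evaluated on the same observations, since R-squared is a decreasing function of the residual sum of squares.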

Although the R-squared index for test data in the proposed MLP model can, to some extent, reflect the neural network’s out-of-sample predictability power, the general approach in this paper has been intended to test in-sample power of predictability for the proposed models. This is mainly because out-of-sample tests may fail to detect predictability that exists in population while in-sample tests will correctly detect it. This is particularly important given the limited observations available for this study as it has been shown that in small samples, out-of-sample tests may have significantly lower power than in-sample tests of the same size (Kilian and Taylor 2003). Furthermore, despite the general belief that out-of-sample tests of predictability are more reliable than in-sample tests, in most cases in-sample tests have higher power than out-of-sample tests of predictability (Inoue and Kilian 2005).

Furthermore, just as crude oil has an extensive impact on the economy, the list of elements that affect crude oil prices is also extensive. Many elements could be associated with oil prices, and one might want to include variables in the forecasting model other than the ones selected in this paper. While certainly possible, adding new variables to the model may not necessarily improve its performance. There is significant theoretical and practical justification, which can be described as feature engineering, behind the selection of the set of variables introduced in the present paper. This is explained as follows.

First, in a least-squares fit, including a new variable mechanically improves, or at least never lowers, the in-sample R-squared index. This is because each new variable gives the model an additional free parameter with which to fit the same observations. But this does not necessarily translate into more accuracy. In fact, in many cases the contrary holds: as more variables become available, the model is distracted from learning or fitting the true underlying data-generating process, and the estimates become less reliable. Thus, it is important to carefully contemplate the inclusion of any variable in the model.
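
The mechanical behavior of the R-squared index can be illustrated with a small simulation. The sketch below is not the estimation procedure of this paper: it fits ordinary least squares on synthetic data in which only one variable is relevant, then adds pure-noise variables one at a time. The sample size, the coefficient and the helper names are assumptions made for the example. The adjusted R-squared, which penalizes the number of variables, is computed alongside for comparison.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def r_squared(columns, y):
    """In-sample R^2 of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_vars):
    """R^2 penalized for the number of explanatory variables."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_vars - 1)


rng = random.Random(0)
n_obs = 40  # deliberately small, as with annual macroeconomic data
x_true = [rng.gauss(0, 1) for _ in range(n_obs)]
y = [2.0 * x + rng.gauss(0, 1) for x in x_true]  # only x_true matters
noise_vars = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(10)]

r2_path, adj_path = [], []
for extra in range(len(noise_vars) + 1):
    columns = [x_true] + noise_vars[:extra]
    r2 = r_squared(columns, y)
    r2_path.append(r2)
    adj_path.append(adjusted_r_squared(r2, n_obs, len(columns)))

for extra, (r2, adj) in enumerate(zip(r2_path, adj_path)):
    print(f"irrelevant variables added: {extra:2d}  R2={r2:.4f}  adjusted R2={adj:.4f}")
```

Because the fitted models are nested, the plain R-squared values never fall as irrelevant variables are added, although none of those variables carries information about the outcome.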

Second, adding a variable to the model increases the chances of overfitting. It may seem that given the ability of neural networks in learning the hidden patterns in data sets, adding a new variable can enhance predictions. This, however, is not true as adding a variable corresponds to increasing the dimension of the problem and intensifies the sparsity of the feasible set. This is particularly the case when dealing with macroeconomic indicators where data sets are not large enough to deal with the curse of dimensionality. This is a crucial factor to take into account when adding new variables.
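
The sparsity argument can be made concrete with a small experiment. The sketch below is an illustration under assumed settings (100 uniformly distributed points and dimensions 1, 5 and 20, none of which come from this paper): it holds the number of observations fixed and measures how far each point lies from its nearest neighbor as the number of variables grows.

```python
import math
import random


def mean_nearest_neighbor_distance(n_points, dim, rng):
    """Average distance from each point to its closest neighbor in the unit cube."""
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / n_points


rng = random.Random(0)
n_points = 100  # fixed sample size, as when the data set cannot be enlarged
distances = {
    dim: mean_nearest_neighbor_distance(n_points, dim, rng) for dim in (1, 5, 20)
}

for dim, distance in distances.items():
    print(f"variables: {dim:2d}  mean nearest-neighbor distance: {distance:.3f}")
```

With the sample size fixed, each added dimension pushes observations further apart, so a model has fewer nearby examples from which to learn any local pattern.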

Third is the problem of interpretability. As new variables are added, the intervals generated by structural models such as VAR shrink and inference becomes harder. This challenges the interpretability of the model and its effectiveness for policy and decision making. Another requirement of structural models that discourages adding new variables is that the set of variables selected must meet certain criteria, as explained in Sect. 3.2. Stationarity and normal residuals are two important properties that must be satisfied in order to validate the model. In the present study, the model failed to satisfy the stationarity and normal residuals criteria when crude oil futures prices and gold prices were added. Thus, adding these two variables did not improve the model. This is due to the statistical behavior of the variables, which is further explained in the next paragraph.
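
For reference, the normality criterion mentioned above is commonly checked with the Jarque–Bera statistic, JB = n/6 · (S² + (K − 3)²/4), where S and K are the sample skewness and kurtosis of the residuals. The following is a minimal univariate sketch of that formula, not the multivariate routine an econometrics package would apply to VAR residuals; the residual values are invented for illustration. The p-value uses the asymptotic chi-square distribution with two degrees of freedom, which can be a rough approximation in small samples.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical residuals, invented for illustration only.
residuals = [0.3, -1.2, 0.8, -0.4, 1.5, -0.9, 0.1, -0.2, 0.6, -0.6]
statistic, p_value = jarque_bera(residuals)
print(f"JB statistic: {statistic:.3f}, p-value: {p_value:.3f}")
```

A large statistic (small p-value) is evidence against normally distributed residuals, which is the sense in which a candidate variable can fail the criterion.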

Fourth, a major threat when including any variable is the problem of confounding. This affects both structural methods such as VAR and machine learning models like ANNs. This happens when the variable selected is mixed up with a factor associated with the outcome. Confounding causes lack of interpretability in estimates from structural methods and distracts neural networks from learning the true underlying process. In the present paper, adding futures prices to the model results in confounding. This is because crude oil futures prices are associated with the outcome. In fact, crude oil futures prices are determined by actual oil prices and consequently adding futures prices decreases model accuracy.

4 Discussion

The forecasting model presented in this paper incorporates a comprehensive review of the crude oil market and, based on key drivers of crude oil prices, provides accurate and reliable forecasts which can aid policy-makers. In this section we discuss the significance of the results and their associated policy implications.

Given the importance of crude oil prices in the global economy, policy-makers in government and the private sector have always sought methods to model and forecast crude oil prices. Studies modeling and forecasting crude oil prices based on econometric methods often lack the desired prediction accuracy due to their limited capability in capturing volatile and nonlinear characteristics of crude oil prices. On the other hand, studies that use ANNs or other artificial intelligence methodologies as forecasting tools often rest on incomplete modeling of the crude oil market.

Although the significance of econometric approaches should not be understated, highly volatile and erratic movements of crude oil prices are more effectively captured by ANNs since they can map nonlinear relations between inputs and outputs. It is argued that crude oil prices are very hard to predict (Hamilton 2008), yet if we aim to provide projections that are as accurate as possible, ANNs have proved to be more promising forecasting tools than econometric methods. In this paper an MLP neural network is developed and trained, and the proposed ANN model has significantly higher forecasting accuracy than the VAR model.

However, outstanding performance of ANNs in forecasting crude oil prices must not distract the researcher from a comprehensive understanding of the oil market and key drivers of crude oil prices which is often missing from previous studies that model crude oil prices with artificial intelligence methodologies. A careful selection of variables that form crude oil prices is central to developing a comprehensive and reliable model for oil prices. In this paper, key drivers of crude oil prices are identified and then confirmed using the appropriate statistical test. These identified elements account for both economic and technical aspects of crude oil market such as depletion of oil reserves and impact of monetary policy that are playing an increasingly important role in determination of crude oil prices.

Since crude oil is the most important element in the world energy market, an effective energy policy requires comprehensive insight into crude oil price behavior. Such insight needs to account for nonlinear and volatile oil price movements as well as underlying elements in the crude oil market. Results from this study show that the proposed model can provide reliable forecasts of crude oil prices that can aid policy-makers, especially since they incorporate significant policy issues such as the effect of monetary policy and oil reserve depletion trends on crude oil prices.

The crude oil market has undergone significant changes in recent years, with spot and futures prices in commodity markets replacing long-term contracts. As more and more oil is traded in commodity markets, monetary policy indicators play an increasingly significant role in the determination of crude oil prices. While there is extensive literature on the relationship between crude oil prices and macroeconomic variables, little has been done towards incorporating monetary policy indicators in forecasting models. To fill this gap, this paper includes exchange rate and interest rate to capture the effect of monetary policy on crude oil prices. The results indicate that inclusion of monetary policy indicators effectively increases the accuracy of the forecasting model. This is particularly important when using neural network architectures that can learn crude oil price processes. Combined with other economic drivers, the magnitude of monetary policy impact can vary in different time periods (Gürkaynak et al. 2007). Thus, including exchange and interest rates while at the same time utilizing a neural network framework allows us to exploit the learning ability of the ANN to discover the effect of monetary policy on crude oil prices with respect to other features and ultimately to improve prediction accuracy.

Considering the exhaustible nature of crude oil as a natural resource in this model has also enabled a more comprehensive analysis of the oil market. This is due to recent discoveries regarding the physical specification of oil extraction and exploration. Many estimates regarding the capacity of existing and future oil fields have proven to be overly optimistic, and scarcity now plays a far more important role in the determination of prices. This also affects geopolitical considerations that can largely disturb oil prices.

Aside from the econometric and machine learning analyses, clear practical applications exist for this paper. Given the acceptable performance of the forecasting model in terms of accuracy, the model presented in this paper can be incorporated as a component in large-scale, aggregate energy models such as the National Energy Modeling System (Gabriel et al. 2001). The benefits of such an application are twofold: first, accurate predictions of crude oil prices, as a crucial energy component of the national energy system, can be provided; and second, the forecasts can be used to more accurately model substitute energy sources such as natural gas and even renewable energy sources, as this paper provides supply-side and also macroeconomic analyses regarding the energy market.

Another immediate application of the model developed in the paper is its use as a decision support system for traders in commodity markets. As large quantities of oil are traded in spot and futures markets every day, an accurate prediction of oil prices from this model, embedded in a management information system (MIS), can help traders and industry participants make better decisions.

5 Conclusions and policy implications

Crude oil price is a major element in the world economy, and consequently understanding the behavior of crude oil prices has been a subject of interest for many researchers and decision makers in different businesses and government agencies.

Given the volatile and nonlinear nature of crude oil prices, in this paper an artificial neural network, as an efficient tool for mapping nonlinear relations, is developed to predict the price of crude oil. The optimal number of neurons in the hidden layer is identified as the one that yields the least mean squared error (MSE) for test data. An econometric model using the same variables is then developed and estimated as a vector autoregressive (VAR) model, and the estimation output is tested in order to validate the results of the econometric model. It was concluded that, by a careful selection of key drivers of crude oil prices and an MLP neural network with a sufficiently large number of neurons in its hidden layer, more reliable estimates of crude oil price can be obtained as compared to estimates of the proposed VAR model.
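
The hidden-layer selection rule summarized above can be sketched as follows. This is a toy illustration rather than the network used in this paper: it trains a single-input MLP with one tanh hidden layer by plain stochastic gradient descent on synthetic data, and the target function, candidate sizes, learning rate and number of epochs are all assumptions chosen only to keep the example short.

```python
import math
import random


def train_mlp(train, n_hidden, rng, epochs=300, lr=0.05):
    """Fit a 1-input, 1-output MLP with one tanh hidden layer by SGD."""
    w1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, y in train:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
            err = sum(w2[j] * h[j] for j in range(n_hidden)) + b2 - y
            for j in range(n_hidden):
                # Gradient through the hidden unit, using w2[j] before its update.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)) + b2

    return predict


def mse(predict, data):
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)


def make_data(n):
    """Synthetic nonlinear target with noise, invented for illustration."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, math.sin(3 * x) + rng.gauss(0, 0.1)) for x in xs]


train_set, test_set = make_data(40), make_data(20)
candidate_sizes = (1, 2, 4, 8)

# Train one network per candidate size and record its MSE on the test data.
test_mse = {}
for size in candidate_sizes:
    model = train_mlp(train_set, size, rng)
    test_mse[size] = mse(model, test_set)

best_size = min(test_mse, key=test_mse.get)
print("test MSE by hidden-layer size:", test_mse)
print("selected hidden-layer size:", best_size)
```

Selecting the size on test data in this way means the reported test error is no longer a fully independent check, which is one reason a separate out-of-sample evaluation remains of interest.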

This paper has attempted to bridge the gap in several aspects of crude oil price forecasting. First, a robust and data-driven approach is selected for the ANN structure, which allows for reliable approximation and at the same time provides appropriate economic interpretation. Second, feature engineering is analyzed and discussed in order to support the validity of the results given the various caveats in implementing machine learning approaches in economic forecasting. Ultimately, a novel modeling of crude oil prices is presented that is reflective of the realities of the new global crude oil market.

In addition, the following implications can be added to the previous discussion. From the energy point of view, countries of the world can be divided into three different categories as follows: (a) Crude oil producing countries, e.g. OPEC countries like Saudi Arabia, Kuwait and Iran, for which a very high percentage of income comes from exporting crude oil. Accordingly, their fiscal year budgets highly depend on the crude oil price. The Iranian Parliament does not approve next year’s budget until it has a clear prediction of the crude oil price. Also, the government in Iran does not know which national projects should be started or continued in the next year until it has a clear forecast of the crude oil price. The same problem exists in the private sector regarding the selection of development projects for the next year. They all depend on the country’s GDP, and the latter depends completely on crude oil prices. (b) Purely crude oil consuming countries, e.g. China, Japan and South Korea, that import almost all of the crude oil they consume. The growth rates of these countries’ GDP cannot be computed unless they have a reasonable crude oil price forecast. Also, their private businesses cannot compute their annual profit without a clear projection of the crude oil price. (c) Crude oil producing and consuming countries, e.g. the United States of America, which imports crude oil to cover the excess of its consumption over its production. These countries face the problems of both of the above categories, and their energy policy depends on crude oil price forecasts even more heavily than that of the previous categories.

Moreover, as crude oil is the most traded form of energy and provides about two-thirds of world’s energy demand, the price of crude oil significantly impacts medium to long term policies and decisions concerning the use of other forms of energy. For example, high crude oil prices incentivize government and private sector investments in other energy sources such as renewables, especially in countries that have little or no oil production and where importing crude oil puts considerable strain on the economy. The capital intensive nature of alternative energies such as solar or wind power usually discourages efforts made to propagate such technologies, and high crude oil prices may hasten such efforts. Accordingly, upward crude oil price trends would directly influence vital issues such as energy security in such countries and, as a consequence, encourage them to develop and implement policies that target energy efficiency schemes and energy consumption patterns. Therefore, a reliable forecast of crude oil prices can help governments and companies to prioritize their energy policies regarding investments in other forms of energy and the design of energy efficiency initiatives.

Some economic factors mentioned in this paper, such as percent change in GDP or monetary policy, affect investments by governments or companies in the oil industry. Due to the nature of the petroleum industry and the extent of projects carried out in this field, such investments ordinarily take a considerable amount of time to directly impact oil production or oil inventory capacities and consequently the oil market and crude oil prices. Further studies may investigate the forecasting performance of ANN models with memory that account for the time-dependent feature of crude oil prices. Effectiveness of the proposed model can also be tested on other energy carriers such as natural gas. Moreover, despite the significantly high R-squared index for test data in the proposed neural network, future works may empirically investigate the out-of-sample predictability power of VAR and ANN forecasting models.