1 Introduction

Commodities, especially gold and crude oil, and bonds are important instruments for diversifying risk. When optimal portfolio weights are computed, gold and crude oil are among the most attractive commodities to include for hedging portfolio risk. Both have played a vital role in the economy, and over the last decade many economists have investigated the volatility of, and the relationship between, gold and crude oil prices. A tool frequently employed to measure this volatility and dependence is the copula-based GARCH model. Multivariate GARCH models have proved useful and effective for analyzing the pattern of multivariate random series and for estimating the conditional linear dependence of volatility, or co-volatility, across markets.

The volatility of, and the relationship between, gold and crude oil prices have been studied intensively with copula-based GARCH models over the last decade. However, those studies relied on closing price series. If only closing prices are considered, valuable intraday information is lost and the results may be unreliable [11]. Recently, some studies have proposed using interval data, e.g. the lowest and the highest price during each day or period, as an alternative to single-valued data. Ideally, we would predict both the lowest and the highest daily price. In practice this is difficult, so we aim to predict at least some daily price between these bounds. Previous work has predicted a representative of the lowest and highest prices, for example with the Center and MinMax methods of Billard and Diday [3], the Center and Range method of Neto and Carvalho [10], and model M of Blanco-Fernández, Corral, and González-Rodríguez [4]. These methods construct the model without treating the interval as a whole. We expect that they, and the center method in particular, may not be robust enough to explain the real behavior of interval data, which may lead to model misspecification. It is therefore beneficial to relax the mid-point assumption of the center method and to assign appropriate weights to the interval bounds. In this study, the convex combination method of Chanaim et al. [6] is employed to obtain these weights.

This study investigates and compares the dependence structure of crude oil and gold prices using different interval-valued representations for copula-based GARCH model estimation and prediction. The examined methods are the center method, the equal-weighted convex combination, and the unequal-weighted convex combination. The main findings confirm the usefulness of the convex combination in the copula-based GARCH approach for evaluating the relationship, joint distribution and co-movement between crude oil and gold prices, which is of interest to investors in these two commodities.

The remainder of the paper is organized as follows: Sect. 2 describes the methodology of the study. Section 3 presents the empirical results. Section 4 concludes the paper.

2 Methodology

In this section, we briefly describe the convex combination in the GARCH, EGARCH, and GJR-GARCH models, and the copula families used to estimate the joint density of the marginals obtained from these GARCH models.

2.1 Operation with Interval Arithmetic

Let \(p_i=[\underline{p}_i,\overline{p}_i]\) denote the interval observed at time i, with lower bound \(\underline{p}_i\) and upper bound \(\overline{p}_i\). The arithmetic operations on intervals are defined as follows:

1. Addition

$$\begin{aligned} p_i+p_j=[\underline{p}_i+\underline{p}_j,\overline{p}_i+\overline{p}_j] \end{aligned}$$
(1)

2. Subtraction

$$\begin{aligned} p_i-p_j=[\underline{p}_i-\overline{p}_j,\overline{p}_i-\underline{p}_j] \end{aligned}$$
(2)

3. Multiplication

$$\begin{aligned} p_i\cdot p_j=[\min A,\max A],\quad A=\{\underline{p}_i\underline{p}_j,\underline{p}_i\overline{p}_j,\overline{p}_i\underline{p}_j,\overline{p}_i\overline{p}_j\} \end{aligned}$$
(3)

4. Division (reciprocal), \(\underline{p}_j>0\)

$$\begin{aligned} \dfrac{1}{p_j}=[\dfrac{1}{\overline{p}_j},\dfrac{1}{\underline{p}_j}] \end{aligned}$$
(4)

5. Addition and Multiplication by scalar

$$\begin{aligned} p_i+a&=[\underline{p}_i+a,\overline{p}_i+a] \end{aligned}$$
(5)
$$\begin{aligned} a\cdot p_i&=\left\{ \begin{matrix} [a\overline{p}_i,a\underline{p}_i],~&{}a<0\\ 0,~&{}a=0\\ [a\underline{p}_i,a\overline{p}_i],~&{}a>0 \end{matrix}\right. \end{aligned}$$
(6)

6. Logarithm function, \(\underline{p}_i>0\)

$$\begin{aligned} \log ~p_i= \left[ \log ~\underline{p}_i,\log ~\overline{p}_i \right] \end{aligned}$$
(7)
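The operations in Eqs. (1)–(7) can be illustrated with a minimal sketch. The class name `Interval`, its methods, and the price values are hypothetical and chosen only for illustration; forming an interval log return by subtracting two log intervals is one possible use of these rules, not necessarily the construction used for the data in Sect. 3.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float  # lower bound
    hi: float  # upper bound

    def __add__(self, other):
        # Eq. (1): add lower bounds and upper bounds separately
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Eq. (2): the smallest difference uses the other interval's upper bound
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # Eq. (3): take the min and max over all products of the bounds
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def scale(self, a):
        # Eq. (6): a negative scalar swaps the bounds
        lo, hi = a * self.lo, a * self.hi
        return Interval(min(lo, hi), max(lo, hi))

    def log(self):
        # Eq. (7): the logarithm is increasing, so the bounds keep their order
        if self.lo <= 0:
            raise ValueError("log requires a strictly positive interval")
        return Interval(math.log(self.lo), math.log(self.hi))


# Hypothetical weekly low/high prices for two consecutive weeks
week1 = Interval(1200.0, 1250.0)
week2 = Interval(1230.0, 1290.0)
log_return = week2.log() - week1.log()
```

Because subtraction pairs each lower bound with the opposite upper bound, the resulting interval is wider than either input, which is the expected behavior of interval arithmetic.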

2.2 Center Method

This method was proposed by Billard and Diday [3]. The main idea is to use the center of the interval data, \(p^c_t\), which is obtained from the lower and upper values of the interval, \(\underline{p}_t\) and \(\overline{p}_t\), as

$$\begin{aligned} p^c_t = \frac{\underline{p}_t + \overline{p}_t}{2} \end{aligned}$$
(8)

2.3 Autoregressive Moving Average-GARCH Model

Many previous studies have suggested that the volatility of financial return data is not constant over time but is instead clustered. This can be addressed through volatility modeling. Within the class of autoregressive processes whose white noise has conditionally heteroscedastic variance, this paper considers a GARCH(1,1) model to estimate the dynamic volatility. It is the workhorse model, is widely applied to financial data, and is able to reproduce their volatility dynamics. Thus, in this study, we consider the ARMA(p,q)-GARCH(1,1) model, which can be written as

$$\begin{aligned} r_t&=\phi _0+\sum _{i=1}^{p}\phi _ir_{t-i}+\sum _{i=1}^{q}\varphi _i\varepsilon _{t-i}+\varepsilon _t \end{aligned}$$
(9)
$$\begin{aligned} \varepsilon _t&=\sigma _t\eta _t \end{aligned}$$
(10)
$$\begin{aligned} \sigma ^2_t&=\omega _0+\omega _1\sigma ^2_{t-1}+\omega _2\varepsilon ^2_{t-1} \end{aligned}$$
(11)

where \(\eta _t\) is a strong white noise that follows a normal distribution with mean zero and variance one, and \(\sigma ^2_t\) is the conditional variance of the GARCH process of Bollerslev [5]. The standard restrictions on the variance parameters are

$$\begin{aligned} \omega _1,~\omega _2>0,~\omega _1+\omega _2~<1 \end{aligned}$$
(12)
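As a minimal sketch, the variance recursion in Eq. (11) can be written as follows. The function name, the residual values, the parameter values, and the choice to start the recursion at the unconditional variance \(\omega_0/(1-\omega_1-\omega_2)\) are assumptions made for illustration; they are not taken from the estimation in Sect. 3.

```python
def garch11_variance(eps, omega0, omega1, omega2):
    """Conditional variances sigma_t^2 from Eq. (11) for given residuals eps."""
    if not (omega1 > 0 and omega2 > 0 and omega1 + omega2 < 1):
        raise ValueError("parameters violate the restrictions in Eq. (12)")
    # Assumption: initialise at the unconditional variance of the process
    sigma2 = [omega0 / (1.0 - omega1 - omega2)]
    for t in range(1, len(eps)):
        sigma2.append(omega0
                      + omega1 * sigma2[t - 1]     # persistence of past variance
                      + omega2 * eps[t - 1] ** 2)  # reaction to the last shock
    return sigma2


# Hypothetical residuals and parameters
eps = [0.0, 0.02, -0.05, 0.01]
sigma2 = garch11_variance(eps, omega0=1e-5, omega1=0.85, omega2=0.10)
```

In this sketch the large shock at the third observation raises the conditional variance of the fourth, which is the volatility clustering the model is meant to capture.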

Furthermore, Simon [12] presented a family of asymmetric variance models that includes the EGARCH and GJR-GARCH models.

2.4 EGARCH (Exponential GARCH)

Starting from the GARCH model and introducing the parameters \(\lambda \) and \(\nu \) with \(\lambda =\nu =1\), the GARCH(1,1) variance equation, Eq. (11), can be rewritten in the EGARCH(1,1) form

$$\begin{aligned} \log ~\sigma ^2_t=\omega _0+\omega _1\log ~\sigma ^2_{t-1}+\omega _2\left[ \frac{\left| \varepsilon _{t-1} \right| }{\sigma _{t-1}}-E\left\{ \frac{\left| \varepsilon _{t-1} \right| }{\sigma _{t-1}} \right\} \right] +\omega _3\left( \frac{\varepsilon _{t-1}}{\sigma _{t-1}} \right) \end{aligned}$$
(13)

The form of the expected value term associated with the ARCH coefficient in the EGARCH equation depends on the distribution of the innovations \(Z_{t-1}=\varepsilon _{t-1}/\sigma _{t-1}\). If the innovation distribution is Gaussian, then

$$\begin{aligned} E\left\{ \frac{\left| \varepsilon _{t-1} \right| }{\sigma _{t-1}} \right\} =E\left\{ \left| Z_{t-1} \right| \right\} =\sqrt{\frac{2}{\pi }} \end{aligned}$$
(14)

If the innovation distribution is Student's t with \(\nu > 2\) degrees of freedom (not to be confused with the parameter \(\nu \) introduced above), then

$$\begin{aligned} E\left\{ \frac{\left| \varepsilon _{t-1} \right| }{\sigma _{t-1}} \right\} =E\left\{ \left| Z_{t-1} \right| \right\} =\sqrt{\frac{\nu -2}{\pi }}\frac{\varGamma \left( \frac{\nu -1}{2} \right) }{\varGamma \left( \frac{\nu }{2} \right) } \end{aligned}$$
(15)
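The two expected values in Eqs. (14) and (15) can be evaluated directly. The following is a small sketch with hypothetical function names; it uses the log-gamma function only to avoid overflow for large degrees of freedom.

```python
import math


def expected_abs_gaussian():
    """E|Z| for a standard normal innovation, Eq. (14)."""
    return math.sqrt(2.0 / math.pi)


def expected_abs_student_t(nu):
    """E|Z| for a unit-variance Student-t innovation with nu > 2, Eq. (15)."""
    if nu <= 2:
        raise ValueError("Eq. (15) requires nu > 2")
    # Ratio of gamma functions computed on the log scale for stability
    log_ratio = math.lgamma((nu - 1.0) / 2.0) - math.lgamma(nu / 2.0)
    return math.sqrt((nu - 2.0) / math.pi) * math.exp(log_ratio)
```

As the degrees of freedom grow, the Student-t value approaches the Gaussian one, which is a convenient sanity check when implementing the EGARCH recursion.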

2.5 GJR-GARCH

The GJR-GARCH model is a GARCH variant that includes leverage terms for modeling asymmetric volatility clustering. In the GJR formulation, large negative changes are more likely to be clustered than positive changes. The model is named after Glosten, Jagannathan, and Runkle [8]. It is a recursive equation for the variance process, and the simple GJR-GARCH(1,1) can be written as

$$\begin{aligned} \sigma ^2_t=\omega _0+\omega _1\sigma ^2_{t-1}+\omega _2\varepsilon ^2_{t-1}+\omega _3I[\varepsilon _{t-1}<0]\varepsilon ^2_{t-1} \end{aligned}$$
(16)

The indicator function \(I[\varepsilon _{t-1}<0]\) equals 1 if \(\varepsilon _{t-1}<0\), and 0 otherwise. Thus, the leverage coefficients are applied to negative innovations, giving negative changes additional weight. For stationarity and positivity, the GJR model has the following constraints

$$\begin{aligned} \omega _0>0,~ \omega _1\geqslant 0,~\omega _2\geqslant 0,~ \omega _2+\omega _3\geqslant 0,~ \omega _1+\omega _2+\omega _3<1 \end{aligned}$$
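The asymmetry in Eq. (16) can be seen in a short sketch that applies the recursion to a positive and a negative shock of equal size. The function name, the starting variance, and all numerical values are hypothetical.

```python
def gjr_garch11_variance(eps, omega0, omega1, omega2, omega3, sigma2_init):
    """Conditional variances from Eq. (16), started at sigma2_init (assumed)."""
    sigma2 = [sigma2_init]
    for t in range(1, len(eps)):
        prev = eps[t - 1]
        # Indicator I[eps_{t-1} < 0]: the leverage term applies to negative shocks only
        leverage = omega3 * prev ** 2 if prev < 0 else 0.0
        sigma2.append(omega0 + omega1 * sigma2[t - 1] + omega2 * prev ** 2 + leverage)
    return sigma2


# Same shock size, opposite signs (hypothetical values)
params = dict(omega0=1e-5, omega1=0.8, omega2=0.05, omega3=0.1, sigma2_init=2e-4)
after_gain = gjr_garch11_variance([0.03, 0.0], **params)
after_loss = gjr_garch11_variance([-0.03, 0.0], **params)
```

With a positive \(\omega_3\), the variance following the negative shock exceeds the variance following the positive shock by \(\omega_3\varepsilon^2_{t-1}\).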

2.6 Convex Combination Method

The convex combination method is applied to deal with the interval return data, where the appropriate value over the range of interval can be computed by

$$\begin{aligned} r_{t}^{cc}=\alpha \underline{r}_t+(1-\alpha )\overline{r}_t \end{aligned}$$
(17)

where \(\alpha \) is the weight parameter, with value between 0 and 1. In this study, we consider both the equal-weighted (fixed) and the unequal-weighted convex combination methods. We set \(\alpha \) = 0.5 for the equal-weighted convex combination, while \(\alpha \in [0, 1]\) is treated as a parameter to be estimated in the unequal-weighted convex combination. For example, in the case of ARMA(1,1)-GARCH(1,1), we can rewrite Eqs. (9)–(11) as

$$\begin{aligned} \alpha \underline{r}_t+(1-\alpha )\overline{r}_t&=\phi _0+\phi _1(\alpha \underline{r}_{t-1}+(1-\alpha )\overline{r}_{t-1})+\varphi _1\varepsilon _{t-1}+\varepsilon _t \end{aligned}$$
(18)
$$\begin{aligned} r^{cc}_t&=\phi _0+\phi _1r^{cc}_{t-1}+\varphi _1\varepsilon _{t-1}+\varepsilon _t \end{aligned}$$
$$\begin{aligned} \varepsilon _t&=\sigma _t\eta _t, \eta _t \sim N(0,1)\end{aligned}$$
(19)
$$\begin{aligned} \sigma ^2_t&=\omega _0+\omega _1\sigma ^2_{t-1}+\omega _2\varepsilon ^2_{t-1}, \omega _0,\omega _1,\omega _2 > 0 \end{aligned}$$
(20)
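The left-hand side of Eq. (18) is a pointwise weighting of the lower and upper return series. A minimal sketch, with a hypothetical function name and hypothetical return values, is:

```python
def convex_combination(lower, upper, alpha):
    """Return alpha * lower_t + (1 - alpha) * upper_t for each period t."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * lo + (1.0 - alpha) * hi for lo, hi in zip(lower, upper)]


# Hypothetical lower and upper interval returns for two periods
lower = [-0.02, -0.01]
upper = [0.01, 0.03]
center = convex_combination(lower, upper, 0.5)    # coincides with the center method
tilted = convex_combination(lower, upper, 0.25)   # more weight on the upper bound
```

Setting the weight to 0.5 reproduces the center of Eq. (8), so the center method is a special case of the convex combination.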

2.7 Model Selection by Akaike Information Criterion (AIC)

In this study, we compare our models using the Akaike information criterion, which is derived from the Kullback-Leibler information. It is defined as

$$\begin{aligned} AIC=-2\ln (\hat{L})+2K \end{aligned}$$
(21)

where \(\hat{L}\) is the maximized value of the likelihood function and K is the number of parameters in the model.
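Eq. (21) and the selection rule (the lowest AIC is preferred) can be sketched as follows. The model names, log-likelihood values, and parameter counts are hypothetical placeholders, not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion, Eq. (21)."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical (maximized log-likelihood, number of parameters) per candidate
candidates = {"model_a": (512.3, 9), "model_b": (515.1, 11)}
scores = {name: aic(loglik, k) for name, (loglik, k) in candidates.items()}
best = min(scores, key=scores.get)   # the candidate with the lowest AIC
```

The penalty term 2K means that a model with more parameters is selected only if its log-likelihood gain exceeds the number of added parameters.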

2.8 Bivariate Copula Approach

Let X and Y be random variables with continuous marginal distributions F(x) and G(y) and joint distribution H(x, y). A 2-dimensional copula is a function \(C:[0,1]^2\rightarrow [0,1]\) that links the marginals to the joint distribution through

$$\begin{aligned} F(x)=u,~G(y)=v~ \text {and}~ H(x,y) = C(u,v) \end{aligned}$$

and satisfies the following properties:

$$\begin{aligned}&C(0,v)=C(u,0)=0,~~~C(u,1)=u,~ C(1,v)=v,\\&\quad C(u',v')-C(u,v')-C(u',v)+C(u,v)\ge 0~~\text {for}~u<u',~v<v'. \end{aligned}$$

When the marginals are continuous, the copula C is unique. The copula families considered in this study are as follows.

\(\bullet \) Gaussian Copula

$$\begin{aligned} C(u,v)=\varPhi _{\rho }(\varPhi ^{-1}(u),\varPhi ^{-1}(v)) \end{aligned}$$
(22)

where \(\varPhi _{\rho }\) is the bivariate standard normal distribution function with correlation \(\rho \), \(\varPhi ^{-1}\) is the quantile function of the standard normal distribution, \(x=\varPhi ^{-1}(u)\), \(y=\varPhi ^{-1}(v)\), and \(u, v~\in [0,1]\). The lower and upper tail orders of the Gaussian copula are \(\kappa _L=\kappa _U=\frac{2}{1+\rho }\).

\(\bullet \) Student-t Copula

$$\begin{aligned} C(u,v)=\int _{-\infty }^{t^{-1}_{\nu }(u)}\int _{-\infty }^{t^{-1}_{\nu }(v)}f_{t(\nu )}(x,y)\,dy\,dx \end{aligned}$$
(23)

where \(t^{-1}_{\nu }(u)\) and \(t^{-1}_{\nu }(v)\) are quantile functions of the Student-t distribution, \(\nu \) is the degrees of freedom, and \(f_{t(\nu )}(x,y)\) is the joint density function of the bivariate Student-t distribution.

\(\bullet \) Frank Copula

$$\begin{aligned} C(u,v)=-\frac{1}{\theta }\log \left[ 1+\frac{(e^{-\theta u} -1)(e^{-\theta v} -1)}{e^{-\theta } -1}\right] , \theta \in \mathbb {R}\setminus \lbrace 0\rbrace \end{aligned}$$
(24)

\(\bullet \) Clayton Copula

$$\begin{aligned} C(u,v)=(u^{-\theta }+v^{-\theta }-1)^{-\frac{1}{\theta }} , \theta > 0 \end{aligned}$$
(25)

\(\bullet \) Gumbel Copula

$$\begin{aligned} C(u,v)=\exp \left( -[(-\log u)^{\theta }+(-\log v)^{\theta }]^{\frac{1}{\theta }}\right) ,\theta \geqslant 1 \end{aligned}$$
(26)

\(\bullet \) Joe Copula

$$\begin{aligned} C(u,v)=1-\left[ (1-u)^{\theta }+(1-v)^{\theta }-(1-u)^{\theta }(1-v)^{\theta }\right] ^{\frac{1}{\theta }},\theta \ge 1 \end{aligned}$$
(27)
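The boundary property \(C(u,1)=u\) stated above gives a quick numerical check for any implementation of these families. The sketch below codes the Frank, Clayton, and Gumbel copulas from Eqs. (24)–(26); the function names and the evaluation points are hypothetical.

```python
import math


def frank(u, v, theta):
    """Frank copula, Eq. (24); theta must be non-zero."""
    num = (math.exp(-theta * u) - 1.0) * (math.exp(-theta * v) - 1.0)
    return -math.log(1.0 + num / (math.exp(-theta) - 1.0)) / theta


def clayton(u, v, theta):
    """Clayton copula, Eq. (25); theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel(u, v, theta):
    """Gumbel copula, Eq. (26); theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))


# Boundary check at a hypothetical point: each value should be close to u = 0.3
u = 0.3
boundary_values = [copula(u, 1.0, 2.0) for copula in (frank, clayton, gumbel)]
```

For the Gumbel copula, setting \(\theta =1\) reduces the formula to the independence copula \(C(u,v)=uv\), which offers a second simple check.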

Furthermore, this study also uses the two-parameter bivariate copula families presented by Joe [9], which allow for asymmetric lower and upper tail dependence:

\(\bullet \) BB1 copula

$$\begin{aligned} C(u,v; \theta , \delta )= \left( 1+\left[ (u^{-\theta }-1)^{\delta }+(v^{-\theta }-1)^{\delta }\right] ^{\frac{1}{\delta }}\right) ^{-\frac{1}{\theta }},\theta >0,\delta \geqslant 1 \end{aligned}$$
(28)

\(\bullet \) BB2 copula

$$\begin{aligned} C(u,v;\theta ,\delta )=\left[ 1+\delta ^{-1} \log \left( e^{\delta (u^{-\theta }-1)}+e^{\delta (v^{-\theta }-1)}-1\right) \right] ^{-\frac{1}{\theta }} ,\theta ,\delta >0 \end{aligned}$$
(29)

\(\bullet \) BB3 copula

$$\begin{aligned} C(u,v;\theta ,\delta )= \exp \left( -\left[ \delta ^{-1}\log \left( e^{\delta \tilde{u}^{\theta }}+e^{\delta \tilde{v}^{\theta }}-1\right) \right] ^{\frac{1}{\theta }}\right) ,~\tilde{u}=-\log u,~\tilde{v}=-\log v,~ \theta \geqslant 1,\delta > 0 \end{aligned}$$
(30)

\(\bullet \) BB4 copula

$$\begin{aligned} C(u,v; \theta , \delta )= \left( u^{-\theta } +v^{-\theta } -1-\left[ (u^{-\theta } -1)^{-\delta } +(v^{-\theta } -1)^{-\delta } \right] ^{-\frac{1}{\delta }}\right) ^{-\frac{1}{\theta }}, \theta \ge 0 , \delta > 0 \end{aligned}$$
(31)

\(\bullet \) BB5 copula

$$\begin{aligned} C(u,v;\theta ,\delta )=\exp \left( -\left[ x^{\theta }+y^{\theta }-(x^{-\theta \delta }+y^{-\theta \delta })^{-\frac{1}{\delta }}\right] ^{\frac{1}{\theta }}\right) ,~x=-\log u,~y=-\log v,~\theta \geqslant 1,\delta > 0 \end{aligned}$$
(32)

\(\bullet \) BB6 copula

$$\begin{aligned} C(u,v; \theta ,\delta )=1-\left( 1-\exp \left\{ -\left[ (-\log (1-(1-u)^{\theta }))^\delta +(-\log (1-(1-v)^{\theta }))^\delta \right] ^{\frac{1}{\delta }}\right\} \right) ^{\frac{1}{\theta }}, \theta \geqslant 1, \delta \geqslant 1 \end{aligned}$$
(33)

\(\bullet \) BB7 copula

$$\begin{aligned} C(u,v; \theta , \delta )= 1-\left( 1-\left[ (1-(1-u)^{\theta } )^{-\delta }+(1-(1-v)^{\theta } )^{-\delta }-1\right] ^{-\frac{1}{\delta }}\right) ^{\frac{1}{\theta }},\theta \geqslant 1, \delta > 0 \end{aligned}$$
(34)

\(\bullet \) BB8 copula

$$\begin{aligned} C(u,v; \vartheta , \delta )= \delta ^{-1} \left( 1-\left\{ 1-\eta ^{-1}[1-(1-\delta u)^{\vartheta }][1-(1-\delta v)^{\vartheta }]\right\} ^{\frac{1}{\vartheta }}\right) \end{aligned}$$
(35)
$$\begin{aligned} \text {where}\,\,\vartheta \geqslant 1, 0<\delta \leqslant 1, \eta =1-(1-\delta )^{\vartheta } \end{aligned}$$

3 Empirical Result

3.1 Data Description

The data set consists of Comex and Nymex prices for the period from 8 May 2009 to 15 July 2016, covering 376 observations. Interval data are central to examining the interaction between these commodity prices. We therefore use the weekly minimum and maximum of these prices, collected from Thomson Reuters DataStream. The data description is shown in Table 1 and the interval returns are plotted in Figs. 1 and 2.

Table 1. Data description and statistics of Comex and Nymex using interval return data
Fig. 1. Comex interval return

Fig. 2. Nymex interval return

3.2 Results of Optimal Weights for ARMA-GARCH, ARMA-EGARCH and ARMA-GJR GARCH Model Using Convex Combination Method

In this section, we use the Comex and Nymex interval returns to estimate ARMA-GARCH(1,1), ARMA-EGARCH and ARMA-GJR-GARCH models with the convex combination method. The weight is found by a grid search over the interval [0,1], with the AIC used to select the best-fitting weight; the results are shown in Table 2. We then compare the three GARCH models using the Akaike Information Criterion (AIC), where the lowest value is preferred. These results are also provided in Table 2. We find that the GJR-GARCH model with Student-t distribution is appropriate for representing the volatility of Comex, and the EGARCH model with Student-t distribution is appropriate for representing the volatility of Nymex. Therefore, we use these GARCH specifications to obtain our marginals.
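The grid search over the weight can be sketched as a simple loop. To keep the example self-contained, the ARMA-GARCH likelihood is replaced here by a stand-in i.i.d. Gaussian model with a closed-form maximum likelihood; this stand-in, the function names, the grid resolution, and the data are all assumptions for illustration and do not reproduce the estimation behind Table 2.

```python
import math


def gaussian_fit(series):
    """Stand-in model (assumption): i.i.d. Gaussian with MLE mean and variance.

    Returns the maximized log-likelihood and the number of parameters.
    """
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    loglik = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    return loglik, 2


def grid_search_alpha(lower, upper, fit, steps=100):
    """Evaluate alpha on an even grid over [0, 1] and keep the lowest AIC."""
    best_alpha, best_aic = None, math.inf
    for i in range(steps + 1):
        alpha = i / steps
        series = [alpha * lo + (1.0 - alpha) * hi for lo, hi in zip(lower, upper)]
        loglik, k = fit(series)
        aic = -2.0 * loglik + 2.0 * k
        if aic < best_aic:
            best_alpha, best_aic = alpha, aic
    return best_alpha, best_aic


# Hypothetical lower and upper interval returns
lower = [-0.03, -0.01, -0.04, -0.02]
upper = [0.01, 0.02, 0.03, 0.04]
alpha_hat, aic_hat = grid_search_alpha(lower, upper, gaussian_fit)
```

In practice, `fit` would be replaced by a routine that estimates the chosen ARMA-GARCH specification on the combined series and returns its maximized log-likelihood; the grid loop itself stays the same.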

Table 3 presents the estimation results for Comex from the GJR-GARCH model. The results show that \(\omega _1+\omega _2\) = 0.89, which indicates that Comex exhibits highly persistent volatility. Table 4 presents the estimation results for Nymex from the EGARCH model. The results show that \(\omega _1+\omega _2\) = 0.96, which indicates that Nymex also exhibits highly persistent volatility. Moreover, when we compare the models estimated with the convex combination and with the center method, we find that the AIC of the convex combination is lower than that of the center method for both ARMA(3,4)-GJR-GARCH(1,1) and ARMA(3,4)-EGARCH(1,1). This result indicates the superiority of the convex combination method over the center method.

Table 2. Results of AIC value in family of GARCH
Table 3. Estimated results of Comex by ARMA GJR-GARCH
Table 4. Estimated results of Nymex by ARMA-EGARCH

3.3 In-Sample Forecast and Volatility

The best-fitting GARCH models are then used to predict the interval returns and volatility of Comex and Nymex, as shown in Figs. 3 and 4. These figures compare the predicted return with the actual interval return (upper panel) and with the closing price return (bottom panel) to assess the performance of GJR-GARCH and EGARCH with the convex combination. The predicted volatility \(\sigma ^2_t\) is plotted in the middle panel. The graphs suggest that the performance is satisfactory. The predicted volatilities differ between the two series. We observe that the volatility of Comex is high during 2013–2014, corresponding to the Greek crisis. For Nymex, we observe that the volatility is high during 2015–2016. We conjecture that increasing doubts about the success of the oil producers' meeting, rising production, and record US and global crude oil inventories put considerable pressure on crude oil prices.

Fig. 3. Volatilities forecast and interval return of Comex

3.4 Results of Copulas

In this section, the copula model is employed to measure the dependence between Comex and Nymex. The standardized residuals obtained from the best-fitting GARCH processes are used to compute the dependence in the copula model. We first present the scatter plot between Comex and Nymex in Fig. 5. The relationship between the two is not clear from the plot, so various copula families are considered to capture it.

Fig. 4. Volatilities forecast and interval return of Nymex

Fig. 5. Scatter plot between Comex and Nymex prices

Table 5. Estimation results of copulas

Finally, the results of the copula models are presented in Table 5. Among the 14 copula families, the Student-t copula has the lowest AIC (−19.5115), so we select it to explain the dependence between Comex and Nymex. The result indicates a weak positive dependence between Comex and Nymex (\(\rho \) = 0.2479, degrees of freedom = 8.4774). Moreover, we find tail dependence between the two, with upper and lower tail dependence both equal to 0.0392. We conclude that Comex and Nymex are dependent not only in normal events but also in extreme events.
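For a Student-t copula, the upper and lower tail dependence coefficients coincide and are given by the standard formula \(\lambda =2\,t_{\nu +1}\left( -\sqrt{(\nu +1)(1-\rho )/(1+\rho )}\right) \), where \(t_{\nu +1}\) is the Student-t distribution function with \(\nu +1\) degrees of freedom. The sketch below evaluates this formula at the estimates reported above. The function names are hypothetical, and the distribution function is approximated by Simpson's rule so that the example needs only the standard library; a statistical library would normally be used instead.

```python
import math


def student_t_pdf(x, nu):
    """Density of the Student-t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(x * x / nu))


def student_t_cdf(x, nu, n=2000):
    """CDF via composite Simpson's rule on [0, |x|] plus symmetry (n even)."""
    a = abs(x)
    h = a / n
    total = student_t_pdf(0.0, nu) + student_t_pdf(a, nu)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, nu)
    half = total * h / 3.0          # probability mass between 0 and |x|
    return 0.5 + half if x >= 0 else 0.5 - half


def t_copula_tail_dependence(rho, nu):
    """Tail dependence coefficient of the bivariate Student-t copula."""
    arg = -math.sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho))
    return 2.0 * student_t_cdf(arg, nu + 1.0)


# Estimates reported in the text for the Student-t copula
lam = t_copula_tail_dependence(rho=0.2479, nu=8.4774)
```

The resulting value can be compared with the tail dependence reported in Table 5; small differences may arise from rounding of the reported estimates and from the numerical integration.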

4 Conclusion

This study investigates the performance of the convex combination method with various GARCH families and a copula-based approach, using interval data on crude oil and gold as the application. The results confirm that the EGARCH and GJR-GARCH models with the convex combination method improve the estimation. We also used the standardized residuals obtained from the GARCH processes in the copula model to measure the dependence between crude oil and gold. Among the copula families considered, the Student-t copula has the lowest AIC, so we used it to join the marginals of crude oil and gold. The copula results show a positive dependence between the two variables. Moreover, we found positive tail dependence, which indicates that the dependence also holds in extreme events.