Volatility forecasting: a new GARCH-type model for fuzzy sets-valued time series

Dai, Xingyu; Cerqueti, Roy; Wang, Qunwei; Xiao, Ling

doi:10.1007/s10479-023-05746-z

Original Research
Published: 14 December 2023


Annals of Operations Research

Abstract

In recent years, academic attention has gradually shifted toward non-point-valued time series volatility forecasting models in the financial big data environment. This paper uses random set theory to define random fuzzy sets-valued asset returns and proposes a new Generalized Autoregressive Conditional Heteroskedasticity (GARCH)-type model, named the Set-GARCH model, which describes the evolution of the volatility of sets-valued returns time series. We conceptualize the model for both correlated and uncorrelated returns. We discuss the subtraction operation rule, the model specification, and the maximum likelihood estimation method for the Set-GARCH model and its derivative model. We also define how to convert the volatility of fuzzy sets-valued returns into the volatility of real returns. Using long-timespan daily, weekly, and monthly oil, S&P500, and gold returns data, both in-sample and out-of-sample empirical applications demonstrate that the Set-GARCH model and its derivative predict volatility better than the point-valued GARCH-type models, conditional autoregressive range-type models, and two hotly debated interval-valued volatility models.

1 Introduction

Volatility forecasting plays a critical role in derivatives valuation, portfolio management, and risk measurement, and extensive research has sought to improve the forecasting performance of time series volatility models (Barunik et al., 2016; Ma et al., 2019). The development of big data technology and artificial intelligence has significantly changed how econometric volatility estimation models are developed (Papanagnou & Matthews-Amune, 2018; Zhu et al., 2023). Advances in financial data storage technologies enable investors and quantitative traders to use all the available trading information, such as highest, lowest, and closing prices, for risk management or arbitrage purposes (Treleaven et al., 2013; Nuti et al., 2011). This calls for improving on the previous paradigm, which relies solely on closing prices or point-valued trading data for risk management. This study proposes a new Generalized Autoregressive Conditional Heteroskedasticity (GARCH)-type volatility forecasting model based on random set theory and sets-valued time series, namely, the Set-GARCH model.

In the context of estimating volatility, GARCH models remain the most popular tools. Both univariate and multivariate GARCH models employ point-valued data, i.e., each observation of the price time series is a point, most notably the closing price-based returns (Hansen & Lunde, 2005). However, with the development of big data in finance, the structure of asset price data has changed (Treleaven et al., 2013; Nuti et al., 2011). All trading information during day t informs investors’ decision-making, i.e., investors may short (long) assets at any price during the trading day rather than focusing only on the closing price. Conditional autoregressive range (CARR) group models reveal the relationship between return volatility and the highest and lowest prices (Chou, 2005; Parkinson, 1980). The diverse set of price information facilitates volatility forecasting in the context of big data (Lyócsa et al., 2021; Molnár, 2012).

Research on non-point-valued data, particularly the study of interval-valued price characteristics or interval-valued price forecasting, has been prevalent in recent years (Buansing et al., 2020; Joshi & Kumar, 2016; Maia & de Carvalho, 2011). Interval-valued data may provide more information than the data used by traditional point-valued GARCH or CARR group models and could be used to forecast price volatility. The interval-valued models, such as auto-regressive conditional interval (ACI) group models and the GARCH model with interval-valued variables (Int-GARCH) (Han et al., 2016; He et al., 2021; Sun et al., 2018, 2020; Yang et al., 2016), could explain the evolution of an interval-valued price proxy. The use of interval-valued variables and the incorporation of random set theory distinguish these models from the standard multivariate time series model.^{Footnote 1}

However, the interval-valued variables are driven only by information regarding the highest and lowest prices. Could we use a sets-valued variable to represent prices and construct a sets-valued time series volatility model by incorporating additional price information (such as the closing price)? The interval-valued variable contains all possible price points with “equal” weighting, but do these points actually weigh equally? Could we use sets-valued information to achieve our desired point values? Motivated by these considerations, we intend to develop a sets-valued time series model to describe the dynamics of returns and, from this, to forecast volatility.

The key element of our methodology is using random fuzzy sets to characterize the stochastic process of returns and calculate volatility. Numerous studies have characterized prices (or returns) with fuzzy sets-valued data (Atsalakis et al., 2019; Ezbakhe & Pérez-Foguet, 2021; Nowak & Romaniuk, 2010), and it has been shown that fuzzy sets-valued data contain more information than interval-valued data.^{Footnote 2} Not only does using fuzzy sets-valued prices as a proxy for one day’s returns incorporate multiple types of price information, but it also highlights key information with a continuum of membership grades (Hocine et al., 2020; Jones et al., 2022). However, forecasting volatility with fuzzy sets-valued data is still underdeveloped due to its complexity. Given that returns are stochastic, fuzzy sets-valued data must be transformed into random fuzzy sets-valued time series, followed by evolution equations describing the time-varying pattern of volatility. Some studies have combined the fuzzy set concept with GARCH models, but in their models, prices are still point-valued or are not treated as random fuzzy sets-valued variables (D’Urso et al., 2016).

In this vein, we construct fuzzy sets-valued prices in a stochastic manner and propose a novel Set-GARCH model to address the aforementioned issues in volatility forecasting. We incorporate additional price information into a set and predict the volatility of returns by modelling the time series of sets-valued variables. The highest log-price $H_t$, lowest log-price $L_t$, and closing log-price $C_t$ of a day are integrated into a set to form a sets-valued stochastic variable, that is, $\tilde{{\varvec{P}}}_t=\{H_t,L_t,C_t\}$. In general, the Set-GARCH model has a volatility equation structure similar to that of the GARCH model. Moreover, the addition/multiplication operations and the distance measurement of the Set-GARCH model are performed in the random fuzzy set space (Körner & Näther, 2002; Li et al., 2013; Sun et al., 2020; Wang et al., 2016). In this study’s volatility forecasting practice, the Set-GARCH model is adaptable to different derivative family models and has demonstrated distinctive advantages.

The main contributions of this paper are two-fold. First, we establish a theoretical framework for modelling dynamic volatility using random fuzzy sets-valued returns data. We propose the Set-GARCH model based on the characteristics of the data structure and the sets-valued operation rules. The Set-GARCH model extends the ACI model (Sun et al., 2018), the Int-GARCH model (Sun et al., 2020), and other interval-valued time series models (Wu et al., 2023; González-Rivera & Lin, 2013; Gonzalez-Rivera et al., 2020) from interval-valued data to sets-valued data. Our Set-GARCH model is, to the best of our knowledge, the first model to describe the dynamic volatility of sets-valued returns time series.

Second, we address the specification limits and extended specification space in the Set-GARCH model and thus propose the Set-GARCH-LR model as a variation. We use crude oil, gold, and the S&P500 index, which are representative of the market, with daily, weekly, and monthly trading data, respectively, to apply the proposed models. In-sample and out-of-sample volatility forecasting demonstrate that the Set-GARCH model and Set-GARCH-LR model outperform conventional GARCH-type, CARR-type, ACI, and Int-GARCH models.

The rest of the paper is organized as follows. In Sect. 2, the definition of fuzzy sets-valued returns is provided. The specifications of our proposed Set-GARCH models are provided in Sect. 3. Section 4 presents an application to empirical data. The paper concludes with Sect. 5. “Appendix A” illustrates the benchmark volatility models used to validate the superiority of our approach by separating the point-valued and interval-valued cases. “Appendix B” contains an empirical example of a relevant technical point of the methodological section.

2 Construction of fuzzy sets-valued returns

2.1 Preliminaries of the fuzzy sets-valued random variable

We begin by defining returns in the sets-valued variable space using random set theory (Li & Guan, 2007; Li et al., 2013).

2.1.1 Sets-valued variable

Let ${\varvec{P}}_0({\mathbb {R}}^d)$ be the family of all non-empty subsets of ${\mathbb {R}}^d$. For any $\tilde{{\varvec{A}}}\in {\varvec{P}}_0({\mathbb {R}}^d)$, we first define the membership function $m_{\tilde{{\varvec{A}}}}(x):{\mathbb {R}}^d\rightarrow \{0,1\}$ as

$$\begin{aligned} m_{\tilde{{\varvec{A}}}}(x)=\begin{aligned}\left\{ \begin{array}{rcl} 0&{},&{}x\notin \tilde{{\varvec{A}}}\\ 1&{},&{}x\in \tilde{{\varvec{A}}} \end{array} \right. \\ \end{aligned} \end{aligned}$$

(1)

Membership function $m_{\tilde{{\varvec{A}}}}$ reflects whether x belongs to $\tilde{{\varvec{A}}}$. For $\tilde{{\varvec{A}}}, \tilde{{\varvec{B}}}\in {\varvec{P}}_0({\mathbb {R}}^d)$, we have the addition and scalar multiplication operation:

$$\begin{aligned} \tilde{{\varvec{A}}}\oplus \tilde{{\varvec{B}}}=\{a+b:a\in \tilde{{\varvec{A}}},b\in \tilde{{\varvec{B}}}\}\nonumber \\ \lambda \tilde{{\varvec{A}}}=\{\lambda a:a\in \tilde{{\varvec{A}}}\},\lambda \in {\mathbb {R}} \end{aligned}$$

(2)

Interval-valued subtraction comes in two forms, and the same issue arises for sets-valued subtraction. The sets-valued subtraction rule of the ACI model (named Type-A subtraction in this paper) treats subtraction as the inverse of the sets-valued addition operation, as shown in Eq. (3). In contrast, the sets-valued subtraction rule of the Int-GARCH model (named Type-B subtraction in this paper) requires subtraction to follow strictly from the basic arithmetic operations of Eq. (2), as shown in Eq. (4). “Appendix A” provides a detailed explanation of the ACI and Int-GARCH models, the two benchmark models, along with their respective subtraction rules.

Type-A Subtraction:

$$\begin{aligned} \tilde{{\varvec{A}}}\ominus _A\tilde{{\varvec{B}}}=\{x\in {\mathbb {R}}^d,x+\tilde{{\varvec{B}}}\subset \tilde{{\varvec{A}}}\} \end{aligned}$$

(3)

where $x+\tilde{{\varvec{B}}}=\{y=x+b:b\in \tilde{{\varvec{B}}}\}$.

Type-B Subtraction:

$$\begin{aligned} \tilde{{\varvec{A}}}\ominus _B\tilde{{\varvec{B}}}=\tilde{{\varvec{A}}}\oplus (-1)\otimes \tilde{{\varvec{B}}} \end{aligned}$$

(4)

where $\oplus $ and $\otimes $ are the addition and scalar multiplication operations in Eq. (2). We pay close attention to the subtraction operation because the choice between the Type-A subtraction of the ACI model and the Type-B subtraction of the Int-GARCH model directly affects the structure of our sets-valued volatility model.
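The two subtraction rules can be compared on the simplest case of closed intervals in ${\mathbb {R}}$. The following is a minimal sketch, not the authors' implementation: it assumes $d=1$ and interval-shaped sets, and the function names (`add`, `scale`, `sub_type_a`, `sub_type_b`) and the numbers are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi] with lo <= hi (assumption: d = 1)

def add(a: Interval, b: Interval) -> Interval:
    """Minkowski sum of Eq. (2): {x + y : x in a, y in b}."""
    return (a[0] + b[0], a[1] + b[1])

def scale(lam: float, a: Interval) -> Interval:
    """Scalar multiple of Eq. (2); a negative factor swaps the endpoints."""
    lo, hi = lam * a[0], lam * a[1]
    return (min(lo, hi), max(lo, hi))

def sub_type_a(a: Interval, b: Interval) -> Interval:
    """Type-A difference of Eq. (3): all x such that x + b is contained in a."""
    if (a[1] - a[0]) < (b[1] - b[0]):
        raise ValueError("Type-A difference is empty: b is wider than a")
    return (a[0] - b[0], a[1] - b[1])

def sub_type_b(a: Interval, b: Interval) -> Interval:
    """Type-B difference of Eq. (4): a + (-1) * b."""
    return add(a, scale(-1.0, b))

A = (1.0, 4.0)
B = (0.5, 1.5)
S = add(A, B)

print(sub_type_a(S, B))   # Type-A undoes the addition and returns A
print(sub_type_b(S, B))   # Type-B returns an interval wider than A
print(sub_type_a(A, A))   # the degenerate interval at zero
print(sub_type_b(A, A))   # a non-degenerate interval symmetric around zero
```

In this sketch Type-A inverts the addition, while Type-B widens the result by the width of the subtracted interval, so subtracting a set from itself does not give the zero set.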

2.1.2 Fuzzy sets-valued variable

A fuzzy set $\tilde{{\varvec{A}}}$ on ${\mathbb {R}}^d$ is identified by its membership function $m_{\tilde{{\varvec{A}}}}(x):{\mathbb {R}}^d\rightarrow [0,1]$, where $m_{\tilde{{\varvec{A}}}}$ is interpreted as the degree of acceptance that $x\in {\mathbb {R}}^d$ is a member of $\tilde{{\varvec{A}}}$. Unlike the situation described in Eq. (1), whether x belongs to $\tilde{{\varvec{A}}}$ is not definite, but exists in an ambiguous “either/or” state.

The crisp set

$$\begin{aligned} \tilde{{\varvec{A}}}_\alpha \doteq \{x\in {\mathbb {R}}^d:m_{\tilde{{\varvec{A}}}}(x)\ge \alpha \},\quad \alpha \in [0,1] \end{aligned}$$

(5)

is called the $\alpha $-cut of $\tilde{{\varvec{A}}}$. For $\alpha =0$, the support of $\tilde{{\varvec{A}}}$ is defined as $\tilde{{\varvec{A}}}_{\alpha =0}\doteq cl\{x\in {\mathbb {R}}^d:m_{\tilde{{\varvec{A}}}}(x)>0\}\doteq supp\tilde{{\varvec{A}}}$. For any two fuzzy sets $\tilde{{\varvec{A}}}$ with membership function $m^A(x)$ and $\tilde{{\varvec{B}}}$ with $m^B(x)$, the addition operation is $\tilde{{\varvec{A}}}\oplus \tilde{{\varvec{B}}}=\tilde{{\varvec{C}}}$ and the scalar multiplication operation is $\lambda \otimes \tilde{{\varvec{A}}}=\tilde{{\varvec{D}}}$. Denoting the membership functions of $\tilde{{\varvec{C}}}$ and $\tilde{{\varvec{D}}}$ by $m^C(x)$ and $m^D(x)$, and writing $m^A(x)_\alpha $ for the $\alpha $-cut of $\tilde{{\varvec{A}}}$, we have that

$$\begin{aligned} m^C(x)&=sup\{\alpha \in [0,1]:x\in m^A(x)_\alpha +m^B(x)_\alpha \} \nonumber \\ m^D(x)&=\begin{aligned}\left\{ \begin{array}{rcl} m^A(\frac{x}{\lambda })&{},&{}\lambda \ne 0\\ {\tilde{{\varvec{0}}}\in {\mathbb {R}}^d}&{},&{}\lambda =0 \end{array} \right. \\ \end{aligned} \end{aligned}$$

(6)

Similar to the sets-valued variable case, given $\tilde{{\varvec{A}}}\ominus \tilde{{\varvec{B}}}=\tilde{{\varvec{E}}}$, $\tilde{{\varvec{E}}}$ has the membership function $m^E(x)$. The subtraction $\ominus $ of fuzzy sets-valued variables could also be defined in Type-A Subtraction like Eqs. (A13) and (3),

$$\begin{aligned} m^E(x)=&m^A(x)\ominus _Am^B(x) \nonumber \\ =&sup\{\alpha \in [0,1]:x\in m^A(x)_\alpha -m^B(x)_\alpha \},\quad x\in {\mathbb {R}}^d \end{aligned}$$

(7)

or Type-B subtraction like Eqs. (A15) and (4)

$$\begin{aligned} m^E(x_E)=m^A(x_A)\ominus _Bm^B(x_B)=\mathop {Sup}\limits _{x_A-x_B=x_E}Inf\{m^A(x_A),m^B(x_B)\} \end{aligned}$$

(8)

Fuzzy sets-valued Type-A subtraction is consistent with the idea of the ACI model. Fuzzy sets-valued Type-B subtraction is consistent with the idea of the Int-GARCH model and a classic fuzzy sets-valued subtraction rule (Zhü 2014).
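The sup-inf rule of Eq. (8) can be evaluated numerically on a grid. The sketch below is only an illustration under stated assumptions: it uses triangular membership functions rather than the LR shape introduced later, approximates the supremum by a maximum over a finite grid, and all names (`triangular`, `sub_type_b_membership`) are hypothetical.

```python
import numpy as np

def triangular(x, left, peak, right):
    """Membership of a triangular fuzzy number (illustrative shape only)."""
    x = np.asarray(x, dtype=float)
    up = (x - left) / (peak - left)
    down = (right - x) / (right - peak)
    return np.clip(np.minimum(up, down), 0.0, 1.0)

def sub_type_b_membership(x_e, m_a, m_b, grid):
    """Eq. (8) on a grid: max over x_a of min(m_a(x_a), m_b(x_a - x_e))."""
    return float(np.max(np.minimum(m_a(grid), m_b(grid - x_e))))

grid = np.linspace(-5.0, 10.0, 15001)            # grid step 0.001
m_a = lambda x: triangular(x, 1.0, 2.0, 3.0)     # fuzzy set A, peak at 2
m_b = lambda x: triangular(x, 0.0, 1.0, 2.0)     # fuzzy set B, peak at 1

for x_e in (-1.0, 0.0, 1.0, 2.0, 3.0):
    print(x_e, round(sub_type_b_membership(x_e, m_a, m_b, grid), 3))
```

For these two inputs the result peaks at the difference of the peaks, and its support runs from the lowest point of A minus the highest point of B to the highest point of A minus the lowest point of B, which matches the interval form of Eq. (4) applied to each $\alpha $-cut.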

2.1.3 The distance of fuzzy sets-valued variable

We first introduce the support function. For a sets-valued variable $\tilde{{\varvec{M}}}$, the support function is $S_{\tilde{{\varvec{M}}}}(u)\doteq \mathop {Sup}\limits _{y\in \tilde{{\varvec{M}}}}<u,y>$, $u\in {\mathbb {S}}^{d-1}$, where $<,>$ is the scalar inner product and ${\mathbb {S}}^{d-1}$ is the unit sphere of ${\mathbb {R}}^d$. The support function of a fuzzy sets-valued variable $\tilde{{\varvec{A}}}$ is $S_{\tilde{{\varvec{A}}}}(u,\alpha )\doteq S_{\tilde{{\varvec{A}}}_\alpha }(u)$, $\alpha \in (0,1]$, $u\in {\mathbb {S}}^{d-1}$, where $\tilde{{\varvec{A}}}_\alpha $ is the $\alpha $-cut of $\tilde{{\varvec{A}}}$ in Eq. (5).

Using the concept of support function (Körner & Näther, 2002; Li et al., 2013), a popular $\rho _2$ distance measure between fuzzy sets-valued variable $\tilde{{\varvec{A}}}$ and $\tilde{{\varvec{B}}}$ is

$$\begin{aligned} \rho _2(\tilde{{\varvec{A}}},\tilde{{\varvec{B}}}) =\int \limits _{[0,1]^2\times ({\mathbb {S}}^{d-1})^2}{((S_{\tilde{{\varvec{A}}}}(u,\alpha )-S_{\tilde{{\varvec{B}}}}(u,\alpha ))(S_{\tilde{{\varvec{A}}}}(v,\beta )-S_{\tilde{{\varvec{B}}}}(v,\beta )))dK(u,\alpha ,v,\beta )} \nonumber \\ \end{aligned}$$

(9)

where $dK(u,\alpha ,v,\beta )$ is a kernel, and $\int \limits _{[0,1]^2\times ({\mathbb {S}}^{d-1})^2}dK(u,\alpha ,v,\beta )=1$. Moreover, the inner product between fuzzy sets-valued variable $\tilde{{\varvec{A}}}$ and $\tilde{{\varvec{B}}}$ is

$$\begin{aligned} <\tilde{{\varvec{A}}},\tilde{{\varvec{B}}}>\doteq \int \limits _{[0,1]^2\times ({\mathbb {S}}^{d-1})^2}S_{\tilde{{\varvec{A}}}}(u,\alpha )S_{\tilde{{\varvec{B}}}}(v,\beta )dK(u,\alpha ,v,\beta ) \end{aligned}$$

(10)

The expectation of any fuzzy random set $\tilde{{\varvec{X}}}$, denoted by ${\mathbb {E}}(\tilde{{\varvec{X}}})$, is also a fuzzy sets-valued variable such that, for every $\alpha \in [0,1]$,

$$\begin{aligned} ({\mathbb {E}}(\tilde{{\varvec{X}}}))_\alpha =cl\{{\mathbb {E}}f:f\in S_{\tilde{{\varvec{X}}}_\alpha }\} \end{aligned}$$

(11)

The variance of $\tilde{{\varvec{X}}}$ can be defined as

$$\begin{aligned} {\mathbb {D}}(\tilde{{\varvec{X}}})&=\int \limits _{[0,1]^2\times ({\mathbb {S}}^{d-1})^2}COV(S_{\tilde{{\varvec{X}}}}(u,\alpha ),S_{\tilde{{\varvec{X}}}}(v,\beta ))dK(u,\alpha ,v,\beta )\nonumber \\&={\mathbb {E}}(<\tilde{{\varvec{X}}},\tilde{{\varvec{X}}}>)-<{\mathbb {E}}(\tilde{{\varvec{X}}}),{\mathbb {E}}(\tilde{{\varvec{X}}})> \end{aligned}$$

(12)

where $<\tilde{{\varvec{X}}},\tilde{{\varvec{X}}}>$ is a random variable, and the fuzzy sets-valued inner product is defined in Eq. (10).

2.2 Fuzzy sets-valued price and returns

We build the highest log-price $H_t$, lowest log-price $L_t$, and closing log-price $C_t$ information of day t into a set to form a sets-valued stochastic variable, that is, $\tilde{{\varvec{P}}}_t=\{H_t,L_t,C_t\}$. The range of asset price movements is formed by the $H_t$ and $L_t$ of the asset. Compared to the opening and settlement prices, the closing price of an asset contains richer information relating to investors’ market perceptions. The closing price often reflects the level of market attention from investors towards a particular stock and can serve as an indicator of the expected movement for the next trading day.^{Footnote 3} Therefore, the performance of the closing price is worth paying attention to. In empirical research, most studies use $H_t$, $L_t$, and $C_t$ as the LR-fuzzy set-valued price for asset prices (Moussa et al., 2014; Hassan, 2009).

We specify $\tilde{{\varvec{P}}}_t$ as a classic LR-type fuzzy sets-valued variable with membership function $m_{\tilde{{\varvec{P}}}_t}(x)$ given by

$$\begin{aligned} m_{\tilde{{\varvec{P}}}_t}(x)=\begin{aligned}\left\{ \begin{array}{rcl} \phi (\frac{C_t-x}{C_t-L_t};p)&{},&{}L_t\le x\le C_t\\ \phi (\frac{x-C_t}{H_t-C_t};p)&{},&{}H_t\ge x \ge C_t \end{array} \right. \\ \end{aligned} \end{aligned}$$

(13)

where $\phi (x;p)\doteq \frac{e^{-(-x)^p}-e^{-1}}{1-e^{-1}}{\varvec{1}}(x\le 0)+\frac{e^{-x^p}-e^{-1}}{1-e^{-1}}{\varvec{1}}(x>0)$ and $x\in [L_t,H_t]$. The benefit of choosing such a $\phi (x;p)$ is that the parameter p controls the shape of $m_{\tilde{{\varvec{P}}}_t}(x)$ and can produce rich variations (see “Appendix B”).
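The membership function of Eq. (13) is straightforward to evaluate. The following minimal sketch assumes $L_t<C_t<H_t$ and assigns zero membership outside $[L_t,H_t]$; the function names and the price figures are hypothetical.

```python
import math

E1 = math.exp(-1.0)

def phi(x: float, p: float) -> float:
    """Shape function of Eq. (13): phi(0) = 1 and phi(1) = phi(-1) = 0."""
    return (math.exp(-abs(x) ** p) - E1) / (1.0 - E1)

def price_membership(x: float, low: float, close: float, high: float, p: float) -> float:
    """Membership grade of log-price x in the fuzzy price of Eq. (13).

    Assumes low < close < high; points outside [low, high] get grade 0.
    """
    if not (low < close < high):
        raise ValueError("this sketch requires low < close < high")
    if x < low or x > high:
        return 0.0
    if x <= close:
        return phi((close - x) / (close - low), p)
    return phi((x - close) / (high - close), p)

low, close, high = 1.0, 1.5, 2.5   # made-up log-prices
for p in (1.0, 2.0, 4.0):
    grades = [price_membership(x, low, close, high, p) for x in (1.0, 1.25, 1.5, 2.0, 2.5)]
    print(p, [round(g, 3) for g in grades])
```

The closing price always receives grade 1 and the two extremes grade 0. Interior points receive higher grades as p increases, so a large p pushes the fuzzy price toward a plain interval.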

Let the closing price return of day t be $R_{C,t}=C_t-C_{t-1}$. Similarly, the highest return of day t is $R_{H,t}=H_t-L_{t-1}$ and the lowest return of day t is $R_{L,t}=L_t-H_{t-1}$. Then the sets-valued stochastic variable $\tilde{{\varvec{R}}}_t$ obtained under the Type-B subtraction of Eq. (8) is also a fuzzy sets-valued variable, with membership function

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{{\varvec{P}}}_t\ominus _B\tilde{{\varvec{P}}}_{t-1}=\tilde{{\varvec{P}}}_t\oplus (-1)\otimes \tilde{{\varvec{P}}}_{t-1}\nonumber \\ m_{\tilde{{\varvec{P}}}_t}(x)&=\begin{aligned}\left\{ \begin{array}{rcl} \phi (\frac{R_{C,t}-x}{R_{l,t}};p)&{},&{}R_{C,t}-R_{l,t}\le x\le R_{C,t}\\ \phi (\frac{x-R_{C,t}}{R_{r,t}};p)&{},&{}R_{C,t}\le x< R_{C,t}+R_{r,t} \end{array} \right. \\ \end{aligned} \end{aligned}$$

(14)

where $R_{l,t}=C_t-L_t$ and $R_{r,t}=H_t-C_t$. We will explain in Sect. 2.3 why we chose the Type-B subtraction of Eq. (8) instead of the Type-A subtraction of Eq. (7). A numerical example in “Appendix B” shows the membership function trajectory of our fuzzy sets-valued returns.

There are three benefits of using the fuzzy sets-valued variable of Eq. (13): (1) Compared with the point-valued returns of GARCH-type models, it adds the $H_t$ and $L_t$ information. (2) Compared with the range-based point-valued returns of CARR-type models, it adds the “trend” information of $H_t$ and $L_t$. (3) Compared with the interval-valued returns of the ACI and Int-GARCH models, Eq. (13) can flexibly highlight the closing price information.

2.3 Why do we choose the Type-B subtraction?

If the Type-A subtraction (like the ACI model as introduced in Section A.2) is used in the calculation of returns when $L_t-H_{t-1}\le L_t-L_{t-1}\le C_t-C_{t-1}\le H_t-H_{t-1}\le H_t-L_{t-1}$ or $L_t-H_{t-1}\le H_t-H_{t-1} \le C_t-C_{t-1} \le L_t-L_{t-1} \le H_t-L_{t-1}$,^{Footnote 4} the fuzzy sets-valued returns calculated by the Type-A subtraction $\tilde{{\varvec{R}}}_{Type-A}$ with membership function $m_{Type-A}(x)$ could be

$$\begin{aligned} \tilde{{\varvec{R}}}_{Type-A}&=\tilde{{\varvec{P}}}_t\ominus _A\tilde{{\varvec{P}}}_{t-1}\nonumber \\ m_{Type-A}(x)&=\begin{aligned}\left\{ \begin{array}{rcl} \phi (\frac{R_{C,t}-x}{R_{l,t}-(L_t-L_{t-1})};p)&{},&{}L_t-L_{t-1}\le x\le R_{C,t}\\ \phi (\frac{x-R_{C,t}}{(H_t-H_{t-1})-R_{r,t}};p)&{},&{}R_{C,t}\le x< H_t-H_{t-1} \end{array} \right. \\ \end{aligned} \end{aligned}$$

(15)

If we ignore the fuzziness of $\tilde{{\varvec{R}}}_{Type-A,t}$ or assume a relatively high value of p in Eq. (15), $\tilde{{\varvec{R}}}_{Type-A,t}$ will become $\tilde{{\varvec{R}}}^*_{Type-A,t}$, i.e.,

$$\begin{aligned} \tilde{{\varvec{R}}}^*_{Type-A,t}=[L_t-L_{t-1},H_t-H_{t-1}]=[\tilde{{\varvec{P}}}_{L,t}-\tilde{{\varvec{P}}}_{L,t-1},\tilde{{\varvec{P}}}_{R,t}-\tilde{{\varvec{P}}}_{R,t-1}] \end{aligned}$$

(16)

Because $L_t\le C_t\le H_t$ and $L_{t-1}\le C_{t-1}\le H_{t-1}$ impose no ordering between $C_t-C_{t-1}$ and the differences $L_t-L_{t-1}$ and $H_t-H_{t-1}$, under the Type-A subtraction we cannot guarantee that $R_{C,t}=C_t-C_{t-1}\in Supp \tilde{{\varvec{R}}}^*_{Type-A,t}$. This would defeat our original intent of incorporating closing price data. Further, let $\Delta m(x;p)$ be the difference between $m_{\tilde{{\varvec{P}}}_t}(x)$ in Eq. (14) and $m_{Type-A}(x)$ in Eq. (15), i.e.,

$$\begin{aligned} \Delta m(x;p)=\begin{aligned}\left\{ \begin{array}{rcl} m_{\tilde{{\varvec{P}}}_t}(x)-m_{Type-A}(x)&{},&{}L_t-L_{t-1}\le x\le H_t-H_{t-1}\\ m_{\tilde{{\varvec{P}}}_t}(x)&{},&{}others \end{array} \right. \\ \end{aligned} \end{aligned}$$

(17)

then Fig. 1 demonstrates the trajectory of $\Delta m(x;p)$.

When a real-world point-valued return $r_t$ lies in $[L_t-H_{t-1},L_t-L_{t-1}]$ or $[H_t-H_{t-1},H_t-L_{t-1}]$, as p in Eq. (17) increases, the degree of membership of $r_t$ in $\tilde{{\varvec{R}}}_t$ of Eq. (14) exceeds its membership in $\tilde{{\varvec{R}}}_{Type-A,t}$ of Eq. (15) to a greater extent. When the point-valued return $r_t$ lies in $[L_t-L_{t-1},H_t-H_{t-1}]$, the greater the p, the smaller the difference between the membership of $r_t$ in $\tilde{{\varvec{R}}}_t$ and its membership in $\tilde{{\varvec{R}}}_{Type-A,t}$.

From the perspective of information absorption, when p is smaller, the difference between selecting Type-A and Type-B subtraction is smaller; when p is larger, the Type-B subtraction absorbs more information on $[L_t-H_{t-1},L_t-L_{t-1}]$ and $[H_t-H_{t-1},H_t-L_{t-1}]$. This implies that we should treat p as a prior parameter rather than estimating its value within our model. All in all, if real-world trading returns $r_t$ fall in the interval $[L_t-H_{t-1},L_t-L_{t-1}]$ or $[H_t-H_{t-1},H_t-L_{t-1}]$, the returns defined by Type-A subtraction (like the ACI model) cannot cover $r_t$. This goes against the original intent of the model we wish to create, and it also supports treating the parameter p in Eq. (13) as a preset value.
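The coverage argument of this subsection can be checked with a small numerical example. The sketch below compares the crisp supports only, ignoring membership grades; the two days of log-prices are made up, and the function names are hypothetical.

```python
from typing import Tuple

Bar = Tuple[float, float, float]  # (low, close, high) log-prices of one day

def type_a_support(prev: Bar, cur: Bar) -> Tuple[float, float]:
    """Crisp Type-A return interval of Eq. (16): [L_t - L_{t-1}, H_t - H_{t-1}]."""
    return (cur[0] - prev[0], cur[2] - prev[2])

def type_b_support(prev: Bar, cur: Bar) -> Tuple[float, float]:
    """Crisp Type-B return interval: [L_t - H_{t-1}, H_t - L_{t-1}]."""
    return (cur[0] - prev[2], cur[2] - prev[0])

def contains(interval: Tuple[float, float], x: float) -> bool:
    return interval[0] <= x <= interval[1]

prev_day = (1.0, 1.1, 2.0)   # closed near its low
cur_day = (1.2, 2.4, 2.5)    # closed near its high
close_return = cur_day[1] - prev_day[1]

print(type_a_support(prev_day, cur_day), contains(type_a_support(prev_day, cur_day), close_return))
print(type_b_support(prev_day, cur_day), contains(type_b_support(prev_day, cur_day), close_return))
```

In this example the closing return lies outside the Type-A interval but inside the Type-B interval. The Type-B interval always contains the closing return, because $L_t\le C_t\le H_t$ and $L_{t-1}\le C_{t-1}\le H_{t-1}$ give $L_t-H_{t-1}\le C_t-C_{t-1}\le H_t-L_{t-1}$.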

2.4 Discussion of $K(u,\alpha ,v,\beta )$

Here we discuss the setting of $K(u,\alpha ,v,\beta )$ in fuzzy sets-valued returns (He et al., 2021; Sun et al., 2018; Yang et al., 2016), which is used in the scalar inner product, distance, and variance calculations of $\tilde{{\varvec{R}}}_t$ in Eqs. (9) to (12). Given that the sets-valued volatility model in this study is for univariate fuzzy sets-valued time series, we have ${\mathbb {S}}^{d-1}={\mathbb {S}}^0=\{1,-1\}$ in the support function in Eq. (9). The u and v in Eq. (9) and $K(u,\alpha ,v,\beta )$ take only the values 1 or $-1$ in this study. We have (He et al., 2021; Sun et al., 2020, 2018; Yang et al., 2016)

$$\begin{aligned} K(u,\alpha ,v,\beta )=\begin{aligned}\left\{ \begin{array}{rcl} a\cdot \delta _{\alpha }(\beta )d\alpha &{},&{}u=v=1\\ b\cdot \delta _{\alpha }(\beta )d\alpha &{},&{}u=v=-1\\ c\cdot \delta _{\alpha }(\beta )d\alpha &{},&{}u=-v\\ \end{array} \right. \\ \end{aligned} \end{aligned}$$

(18)

where $\delta _{\alpha }(\beta )=1$ when $\alpha =\beta $ and $\delta _{\alpha }(\beta )=0$ when $\alpha \ne \beta $. For the settings of a, b, c, a classic form is (Körner & Näther, 2002; Näther, 2001)

$$\begin{aligned} a&=1-2\int _0^1td\psi (t)+\int ^1_0t^2d\psi (t)\nonumber \\ b&=\int ^1_0t^2d\psi (t)\nonumber \\ c&=\int _0^1td\psi (t)-\int ^1_0t^2d\psi (t) \end{aligned}$$

(19)

where $\psi (t)$ is the weight function. We set $\psi (t)=t$ in this study, thus in Eqs. (18) and (19), we have $a=1/3$, $b=1/3$, and $c=1/6$. The $\alpha $-cut of $\tilde{{\varvec{R}}}_t$ is $\tilde{{\varvec{R}}}_{\alpha ,t}$, and

$$\begin{aligned} \tilde{{\varvec{R}}}_{\alpha ,t}=[R_{C,t}-\phi ^{-1}(\alpha )R_{l,t},R_{C,t}+\phi ^{-1}(\alpha )R_{r,t}] \end{aligned}$$

(20)

where $\phi (x)$ is defined in Eq. (13). Combining Eqs. (10) and (20), the scalar inner product is

$$\begin{aligned}&<\tilde{{\varvec{R}}}_t,\tilde{{\varvec{R}}}_t>_{a=1/3,b=1/3,c=1/6} \nonumber \\&\quad =\int ^1_0(a(R_{C,t}+\phi ^{-1}(\alpha )R_{r,t})^2+b(\phi ^{-1}(\alpha )R_{l,t}-R_{C,t})^2)d\alpha \nonumber \\&\quad +\int ^1_0(2c(R_{C,t}+\phi ^{-1}(\alpha )R_{r,t})(\phi ^{-1}(\alpha )R_{l,t}-R_{C,t}))d\alpha \nonumber \\&\quad =(a+b-2c)R^2_{C,t}+aR^2_{r,t}\int ^1_0(\phi ^{-1}(\alpha ))^2d\alpha +bR^2_{l,t}\int ^1_0(\phi ^{-1}(\alpha ))^2d\alpha \nonumber \\&\quad +2R_{C,t}R_{r,t}(a-c)\int ^1_0\phi ^{-1}(\alpha )d\alpha +2R_{C,t}R_{l,t}(c-b)\int ^1_0\phi ^{-1}(\alpha )d\alpha \nonumber \\&\quad +2cR_{l,t}R_{r,t}\int ^1_0(\phi ^{-1}(\alpha ))^2d\alpha \end{aligned}$$

(21)

given the distance between $\tilde{{\varvec{R}}}_t$ and $\tilde{{\varvec{0}}}$ is $\rho _2(\tilde{{\varvec{R}}}_t,\tilde{{\varvec{0}}})=<\tilde{{\varvec{R}}}_t,\tilde{{\varvec{R}}}_t>\doteq \Vert \tilde{{\varvec{R}}}_t\Vert ^2_{\rho _2}$, and ${\mathbb {E}}(S_{\tilde{{\varvec{R}}}_t})=S_{{\mathbb {E}}\tilde{{\varvec{R}}}_t}$, the variance ${\mathbb {D}}(\tilde{{\varvec{R}}}_t)$ is,

$$\begin{aligned} {\mathbb {D}}(\tilde{{\varvec{R}}}_t)&={\mathbb {E}}(<\tilde{{\varvec{R}}}_t,\tilde{{\varvec{R}}}_t>_{a=\frac{1}{3},b=\frac{1}{3}, c=\frac{1}{6}})-<{\mathbb {E}}\tilde{{\varvec{R}}}_t,{\mathbb {E}}\tilde{{\varvec{R}}}_t>_{a=\frac{1}{3},b=\frac{1}{3},c=\frac{1}{6}} \nonumber \\&=(a+b-2c){\mathbb {D}}(R_{C,t})+a{\mathbb {D}}(R_{r,t})\int ^1_0(\phi ^{-1}(\alpha ))^2d\alpha \nonumber \\&\quad +b{\mathbb {D}}(R_{l,t})\int ^1_0(\phi ^{-1}(\alpha ))^2d\alpha +2COV(R_{C,t},R_{r,t})(a-c)\int ^1_0\phi ^{-1}(\alpha )d\alpha \nonumber \\&\quad +2COV(R_{C,t},R_{l,t})(c-b)\int ^1_0\phi ^{-1}(\alpha )d\alpha \nonumber \\&\quad +2COV(R_{l,t},R_{r,t})c\int ^1_0(\phi ^{-1}(\alpha ))^2d\alpha \end{aligned}$$

(22)
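The quantities of this subsection can be verified numerically. The sketch below computes the weights of Eq. (19) for $\psi (t)=t$, the $\alpha $-cut of Eq. (20), and the inner product of Eq. (21) in both its integral and expanded forms. It is an illustration under stated assumptions: the midpoint-rule quadrature is our own choice, the closed-form inverse of $\phi $ is derived from the definition in Eq. (13) for arguments in $[0,1]$, the function names are hypothetical, and the return figures are made up.

```python
import math

E1 = math.exp(-1.0)

def phi_inv(alpha: float, p: float) -> float:
    """Inverse of phi(.; p) on [0, 1], so phi_inv(1) = 0 and phi_inv(0) = 1."""
    y = -math.log(alpha * (1.0 - E1) + E1)
    return max(0.0, y) ** (1.0 / p)   # clamp guards against rounding below 0

def midpoint(f, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def kernel_weights(psi_density=lambda t: 1.0):
    """Weights a, b, c of Eq. (19); psi_density is psi'(t), equal to 1 for psi(t) = t."""
    m1 = midpoint(lambda t: t * psi_density(t))
    m2 = midpoint(lambda t: t * t * psi_density(t))
    return 1.0 - 2.0 * m1 + m2, m2, m1 - m2

def alpha_cut(r_c, r_l, r_r, alpha, p):
    """Alpha-cut of the fuzzy return, Eq. (20)."""
    q = phi_inv(alpha, p)
    return r_c - q * r_l, r_c + q * r_r

def inner_numeric(r_c, r_l, r_r, p, a, b, c):
    """Integral form of Eq. (21), built from the alpha-cut endpoints."""
    def integrand(alpha):
        lo, hi = alpha_cut(r_c, r_l, r_r, alpha, p)
        s_pos, s_neg = hi, -lo   # support function at u = +1 and u = -1
        return a * s_pos**2 + b * s_neg**2 + 2.0 * c * s_pos * s_neg
    return midpoint(integrand)

def inner_closed(r_c, r_l, r_r, p, a, b, c):
    """Expanded form of Eq. (21)."""
    i1 = midpoint(lambda al: phi_inv(al, p))
    i2 = midpoint(lambda al: phi_inv(al, p) ** 2)
    return ((a + b - 2.0 * c) * r_c**2 + a * r_r**2 * i2 + b * r_l**2 * i2
            + 2.0 * r_c * r_r * (a - c) * i1 + 2.0 * r_c * r_l * (c - b) * i1
            + 2.0 * c * r_l * r_r * i2)

a, b, c = kernel_weights()
r_c, r_l, r_r, p = 0.01, 0.02, 0.015, 2.0   # made-up return center and spreads
print(a, b, c)
print(inner_numeric(r_c, r_l, r_r, p, a, b, c), inner_closed(r_c, r_l, r_r, p, a, b, c))
```

The two forms of the inner product agree up to rounding, which confirms the term-by-term expansion in Eq. (21). Replacing `psi_density` changes the weights a, b, and c without affecting the rest of the calculation.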

3 The random fuzzy sets-valued based GARCH model

3.1 Grounding ideas on the model setting

The modeling philosophy embodied in Eq. (A5) in subsection A.1.1 implies that changes in current observations are driven by historical observations. If we also wish to apply this modeling philosophy to the proposed model with parameter $\theta $, one classic form is:

$$\begin{aligned} \tilde{{\varvec{R}}}_{t}=f(\tilde{{\varvec{R}}}_{t-1},\tilde{{\varvec{R}}}_{t-2},\ldots ,\tilde{\varvec{\epsilon }}_{t-1},\tilde{\varvec{\epsilon }}_{t-2},\ldots ;\theta )+\tilde{\varvec{\epsilon }}_{t} \end{aligned}$$

(23)

In Eq. (23), at time t, the term $f(\tilde{{\varvec{R}}}_{t-1},\tilde{{\varvec{R}}}_{t-2},\ldots ,\tilde{\varvec{\epsilon }}_{t-1},\tilde{\varvec{\epsilon }}_{t-2},\ldots ;\theta )$ is not stochastic any more, but is still a fuzzy sets-valued number, while $\tilde{\varvec{\epsilon }}_{t}$ is a random sets-valued variable that gives randomness to $\tilde{{\varvec{R}}}_{t}$.

Let $\tilde{{\varvec{r}}}_{t}=\{\tilde{{\varvec{r}}}_{T},\tilde{{\varvec{r}}}_{T-1},\tilde{{\varvec{r}}}_{T-2},\ldots ,\tilde{{\varvec{r}}}_{1},\tilde{{\varvec{r}}}_{0}\}$ be the observations of $\tilde{{\varvec{R}}}_{t}$, where each $\tilde{{\varvec{r}}}_{t}$ is a fuzzy sets-valued variable. Under the classic model structure of Eq. (23), when one uses the minimum loss function method to estimate the parameter $\theta $ with some loss function $\Psi $, the estimated parameters under Type-A and Type-B subtraction are

$$\begin{aligned} {\hat{\theta }}_{Type-A}&=\mathop {argmin}\limits _{\theta }\sum ^T_{i=1}\Psi (\tilde{{\varvec{r}}}_{i}\ominus _Af(\tilde{{\varvec{r}}}_{i-1},\tilde{{\varvec{r}}}_{i-2},\ldots ,\tilde{\varvec{\epsilon }}_{i-1},\tilde{\varvec{\epsilon }}_{i-2},\ldots ;\theta )) \nonumber \\ {\hat{\theta }}_{Type-B}&=\mathop {argmin}\limits _{\theta }\sum ^T_{i=1}\Psi (\tilde{{\varvec{r}}}_{i}\ominus _Bf(\tilde{{\varvec{r}}}_{i-1},\tilde{{\varvec{r}}}_{i-2},\ldots ,\tilde{\varvec{\epsilon }}_{i-1},\tilde{\varvec{\epsilon }}_{i-2},\ldots ;\theta )) \end{aligned}$$

(24)

respectively. However, given a true parameter $\theta ^*$, we will never find a ${\hat{\theta }}_{Type-B}=\theta ^*$ under the minimum loss function method, because $\tilde{{\varvec{r}}}_{i}\ominus _Bf(\tilde{{\varvec{r}}}_{i-1},\tilde{{\varvec{r}}}_{i-2},\ldots ,\tilde{\varvec{\epsilon }}_{i-1},\tilde{\varvec{\epsilon }}_{i-2},\ldots ;\theta )\ne \tilde{{\varvec{0}}}$. The reason is that if we have $\tilde{{\varvec{A}}}=\tilde{{\varvec{B}}}$, then $\tilde{{\varvec{A}}}\ominus _A\tilde{{\varvec{B}}}=\tilde{{\varvec{0}}}$, while $\tilde{{\varvec{A}}}\ominus _B\tilde{{\varvec{B}}}\ne \tilde{{\varvec{0}}}$. In contrast, we could find a ${\hat{\theta }}_{Type-A}=\theta ^*$.
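A toy calculation illustrates why the Type-B residual prevents recovery of the true parameter. The sketch below is not the Set-GARCH estimator: it assumes crisp intervals instead of fuzzy sets, a noise-free one-lag rule in which each observation equals $\theta ^*$ times the lagged interval, a hypothetical loss equal to the sum of squared residual endpoints, and a grid search. The Type-A residual is taken endpoint-wise, which coincides with Eq. (3) whenever that set is non-empty.

```python
theta_star = 0.5                                   # true parameter of the toy model
lagged = [(1.0, 2.0), (0.5, 1.5), (2.0, 3.0)]      # made-up lagged intervals
current = [(theta_star * l, theta_star * u) for l, u in lagged]

def psi(interval):
    """Hypothetical loss: sum of squared endpoints of the residual interval."""
    return interval[0] ** 2 + interval[1] ** 2

def sub_a(a, b):
    """Endpoint-wise (Type-A style) difference."""
    return (a[0] - b[0], a[1] - b[1])

def sub_b(a, b):
    """Type-B difference of Eq. (4): a + (-1) * b."""
    return (a[0] - b[1], a[1] - b[0])

def loss(theta, subtract):
    """Total loss of the fitted rule theta * lagged against the observations."""
    return sum(psi(subtract(cur, (theta * l, theta * u)))
               for cur, (l, u) in zip(current, lagged))

grid = [k / 1000 for k in range(1001)]             # candidate values in [0, 1]
theta_a = min(grid, key=lambda th: loss(th, sub_a))
theta_b = min(grid, key=lambda th: loss(th, sub_b))

print(theta_a, loss(theta_star, sub_a))   # Type-A: minimizer equals theta_star, zero loss
print(theta_b, loss(theta_star, sub_b))   # Type-B: minimizer below theta_star, positive loss
```

In this toy setting the Type-B loss stays strictly positive at the true parameter, because the residual keeps the width of both intervals, and its minimizer is pulled away from $\theta ^*$.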

When maximum likelihood (ML) is used for parameter estimation, Type-B subtraction suffers from the same issue. Given that both $\tilde{{\varvec{R}}}_t$ and $\tilde{\varvec{\epsilon }}_t$ in Eq. (23) are random variables, one could maximize the likelihood function of either $\tilde{{\varvec{R}}}_t$ or $\tilde{\varvec{\epsilon }}_t$ to estimate $\theta $ in Eq. (23). Given the fact that

$$\begin{aligned} \tilde{{\varvec{R}}}_t\ominus _Af(\tilde{{\varvec{R}}}_{t-1},\tilde{{\varvec{R}}}_{t-2},\ldots ,\tilde{\varvec{\epsilon }}_{t-1},\tilde{\varvec{\epsilon }}_{t-2},\ldots ;\theta )&=\tilde{\varvec{\epsilon }}_t \nonumber \\ \tilde{{\varvec{R}}}_t\ominus _Bf(\tilde{{\varvec{R}}}_{t-1},\tilde{{\varvec{R}}}_{t-2},\ldots ,\tilde{\varvec{\epsilon }}_{t-1},\tilde{\varvec{\epsilon }}_{t-2},\ldots ;\theta )&\ne \tilde{\varvec{\epsilon }}_t \end{aligned}$$

(25)

let the density function of $\tilde{{\varvec{R}}}_t$ and $\tilde{\varvec{\epsilon }}_t$ be $f_{\tilde{{\varvec{R}}}}$ and $f_{\tilde{\varvec{\epsilon }}}$, whether we maximize $f_{\tilde{{\varvec{R}}}}$ to obtain ${\hat{\theta }}^{\tilde{{\varvec{R}}}}$ or maximize $f_{\tilde{\varvec{\epsilon }}}$ to get ${\hat{\theta }}^{\tilde{\varvec{\epsilon }}}$. It should have ${\hat{\theta }}^{\tilde{{\varvec{R}}}}={\hat{\theta }}^{\tilde{\varvec{\epsilon }}}$. However, there is a paradox in the following maximum likelihood estimation function under model structure of Eq. (23),

$$\begin{aligned} {\hat{\theta }}^{\tilde{{\varvec{R}}}}_{Type-A}=\mathop {argmax}\limits _{\theta } \prod ^T_{t=1}{f_{\tilde{{\varvec{R}}}}(\theta \vert \tilde{{\varvec{r}}}_t)}=\mathop {argmax}\limits _{\theta } \prod ^T_{t=1}{f_{\tilde{\varvec{\epsilon }}}(\theta \vert \tilde{\varvec{\epsilon }}_t)}={\hat{\theta }}^{\tilde{\varvec{\epsilon }}}_{Type-A} \nonumber \\ {\hat{\theta }}^{\tilde{{\varvec{R}}}}_{Type-B}=\mathop {argmax}\limits _{\theta } \prod ^T_{t=1}{f_{\tilde{{\varvec{R}}}}(\theta \vert \tilde{{\varvec{r}}}_t)}=\mathop {argmax}\limits _{\theta } \prod ^T_{t=1}{f_{\tilde{\varvec{\epsilon }}}(\theta \vert \tilde{\varvec{\epsilon }}_t)}\ne {\hat{\theta }}^{\tilde{\varvec{\epsilon }}}_{Type-B} \end{aligned}$$

(26)

To solve this problem in the estimation process, one solution is to drive the dynamics of $\tilde{{\varvec{R}}}_t$ in the following model structure instead of Eq. (23)’s structure as

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{\varvec{\epsilon }}_t \nonumber \\ \tilde{\varvec{\epsilon }}_t&\sim f_{\tilde{\varvec{\epsilon }}_t}(\tilde{{\varvec{x}}};\theta ) \end{aligned}$$

(27)

where the evolution and randomness of $\tilde{{\varvec{R}}}_t$ all come from a fuzzy sets-valued stochastic variable $\tilde{\varvec{\epsilon }}_t$ with time-varying probability density $f_{\tilde{\varvec{\epsilon }}_t}(\tilde{{\varvec{x}}};\theta )$. The problem in Eq. (26) is thereby resolved, because Eq. (27) sets $\tilde{{\varvec{R}}}_t=\tilde{\varvec{\epsilon }}_t$ by construction. This kind of setting is similar to the Int-GARCH model and most GARCH-type models. In Sect. 4, we find that the model structure of Eq. (27) is still capable of predicting volatility accurately, so this restriction on the model setting does not necessarily impair the proposed model’s predictive power.

Recalling Eq. (27), $\tilde{{\varvec{R}}}_t=\tilde{\varvec{\epsilon }}_t$, $\tilde{\varvec{\epsilon }}_t\sim f_{\tilde{\varvec{\epsilon }}_t}(\tilde{{\varvec{x}}};\theta )$, if one wants to determine the change of $\tilde{\varvec{\epsilon }}_t$, a straightforward idea from Eq. (A5) of GARCH-type models is to construct a time-varying parameter $\theta _t$,^{Footnote 5} and use the past observations of $\tilde{\varvec{\epsilon }}_t$ (or $\tilde{{\varvec{R}}}_t$) to obtain $\theta _t$. From an economic perspective, whether we use point values or the fuzzy set values described in this paper to represent returns (or the innovations in returns), we must carefully consider the fact that current returns (or the innovations in returns) may be driven by past values and exhibit correlation with past values. The concept of lagged terms influencing current terms is widely applied in various econometric models (Creal et al., 2013; Koop & Korobilis, 2013).

Similar to the GARCH-type model, the type of distribution law of $\tilde{\varvec{\epsilon }}_t$ does not change over time. Let the parameter set $\theta _t$ in Eq. (27) be $\theta _t=\{\nu _{1,t},\nu _{2,t},\ldots ,\nu _{n,t}\}$. We then provide the following general model structure:

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{\varvec{\epsilon }}_t,\quad \tilde{\varvec{\epsilon }}_t=\tilde{\varvec{\epsilon }}_t(\nu _{1,t},\nu _{2,t},\ldots ,\nu _{n,t}) \nonumber \\ \nu _{1,t}&\sim l_{1,t}(\theta _{\nu _{1,t}}),\quad \theta _{\nu _{1,t}}=f_1(\tilde{{\varvec{R}}}_{t-1},\tilde{{\varvec{R}}}_{t-2},\ldots ,\theta _{\nu _{1,t-1}},\theta _{\nu _{1,t-2}},\ldots ) \nonumber \\&\dots \nonumber \\ \nu _{n,t}&\sim l_{n,t}(\theta _{\nu _{n,t}}),\quad \theta _{\nu _{n,t}}=f_n(\tilde{{\varvec{R}}}_{t-1},\tilde{{\varvec{R}}}_{t-2},\ldots ,\theta _{\nu _{n,t-1}},\theta _{\nu _{n,t-2}},\ldots ) \end{aligned}$$

(28)

where $\nu _{1,t},\nu _{2,t},\ldots ,\nu _{n,t}$ are the random scalar parameters in $\tilde{\varvec{\epsilon }}_t$, with density functions $l_{1,t},l_{2,t},\ldots ,l_{n,t}$. Following the GARCH-type model, in Eq. (28), the scalar parameters $\theta _{\nu _{1,t}},\theta _{\nu _{2,t}},\ldots ,\theta _{\nu _{n,t}}$ of the density functions $l_{1,t},l_{2,t},\ldots ,l_{n,t}$ are obtained from the past observed $\tilde{{\varvec{R}}}_t$ and from their own lagged terms.

We further explore the drivers of changes in $\tilde{{\varvec{R}}}_t$. Given the prior parameter p in Eq. (13), the shape of $\tilde{{\varvec{R}}}_t$ depends on $R_{C,t}$, $R_{r,t}$, and $R_{l,t}$. The evolution of the scalar value $R_{C,t}$ is first driven by its own past terms and by the distance between $R_{C,t}$ and 0. The $\rho _2$ distance of Eq. (9) between $\tilde{{\varvec{R}}}_t$ and $\tilde{{\varvec{0}}}$ represents the degree of change in the overall price information set, which we denote in 2-norm form by $\Vert \tilde{{\varvec{R}}}_t\Vert ^2_{\rho _2}$. The overall change will also cause a change in the distance between $R_{C,t}$ and 0. In this vein, we have

$$\begin{aligned} R_{C,t}=g_1(R_{C,t-1},R_{C,t-2},\ldots ,\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2},\Vert \tilde{{\varvec{R}}}_{t-2}\Vert ^2_{\rho _2},\ldots ) \end{aligned}$$

(29)

If $R_{C,t}$ reflects a “standard” returns level, then $R_{r,t}$ and $R_{l,t}$ reflect the degree of extreme deviation from the “standard” returns level in the positive and negative directions, respectively.^{Footnote 6} This implies that the current $R_{r,t}$ may be related to the past $R_{r,t-1}$ and the past $R_{l,t-1}$. The same holds for $R_{l,t}$. Therefore, we set the following drive mode:

$$\begin{aligned} R_{l,t}&=g_2(R_{l,t-1},R_{l,t-2},\ldots ,R_{r,t-1},R_{r,t-2},\ldots ,\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2},\Vert \tilde{{\varvec{R}}}_{t-2}\Vert ^2_{\rho _2},\ldots ) \nonumber \\ R_{r,t}&=g_3(R_{r,t-1},R_{r,t-2},\ldots ,R_{l,t-1},R_{l,t-2},\ldots ,\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2},\Vert \tilde{{\varvec{R}}}_{t-2}\Vert ^2_{\rho _2},\ldots ) \end{aligned}$$

(30)

where the form of functions $g_1$, $g_2$, and $g_3$ will be discussed later. From Eq. (28) to Eq. (30), we have

$$\begin{aligned} \tilde{{\varvec{R}}}_{t}&=\tilde{\varvec{\epsilon }}_{t} \nonumber \\ \tilde{\varvec{\epsilon }}_t&\sim f_{\tilde{\varvec{\epsilon }}_t}(R_{C,t-1},R_{C,t-2},\ldots ,R_{l,t-1},R_{l,t-2},\ldots ,R_{r,t-1},\nonumber \\ {}&R_{r,t-2},\ldots ,\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2},\Vert \tilde{{\varvec{R}}}_{t-2}\Vert ^2_{\rho _2},\ldots ) \end{aligned}$$

(31)

In this vein, we could determine the evolution of the conditional random sets-valued $\tilde{{\varvec{R}}}_{t}\vert \Omega _{t-1}$ (where $\Omega _t=\{R_{C,t},R_{C,t-1},\ldots ,R_{l,t},R_{l,t-1},\ldots ,R_{r,t},R_{r,t-1},\ldots \}$ is the information set at time t) and calculate the in-sample volatility ${\mathbb {D}}(\tilde{{\varvec{R}}}_{t}\vert \Omega _{t-1})$ and out-of-sample volatility ${\mathbb {D}}(\tilde{{\varvec{R}}}_{t}\vert \Omega _{t-1})$ using Eq. (12).

3.2 Relationship between ${\mathbb {D}}(\tilde{{\varvec{R}}}_{t})$ and $\sigma _t$

The ${\mathbb {D}}(\tilde{{\varvec{R}}}_{t})$ is the volatility of the fuzzy sets-valued returns $\tilde{{\varvec{R}}}_{t}$, which is not exactly the daily volatility (Sun et al., 2020). We need to perform an “average operation” over the “degree of fuzziness” of $\tilde{{\varvec{R}}}_{t}$ to get $\sigma _t$. First, we use a fuzziness control parameter $\zeta \in [0,1]$ to control the “degree of fuzziness” of $\tilde{{\varvec{R}}}_{t}$, that is:

$$\begin{aligned} \tilde{{\varvec{R}}}(\zeta )_t=\begin{aligned}\left\{ \begin{array}{rcl} \phi \left( \frac{R_{C,t}-x}{\zeta R_{l,t}};p\right) &{},&{}R_{C,t}-R_{l,t}\le x\le R_{C,t}\\ \phi \left( \frac{x-R_{C,t}}{\zeta R_{r,t}};p\right) &{},&{}R_{C,t}\le x< R_{C,t}+R_{r,t} \end{array} \right. \\ \end{aligned} \end{aligned}$$

(32)

The smaller the value of $\zeta $, the better the $R_{C,t}$ is able to represent the fuzzy information of this day. In particular, when $\zeta $ is 0, fuzzy sets-valued $\tilde{{\varvec{R}}}(\zeta )_t$ collapses to $R_{C,t}$. Following Sun et al. (2020), in this study, we define an aggregate sets-valued volatility $\sigma _t^{set}$, which reflects the average change from accepting all possible returns information and assigning a certain membership, to accept only $R_{C,t}$. For any fuzzy sets-valued returns $\tilde{{\varvec{R}}}(\zeta )_t$ under a set information reception level $\zeta $, we give $\zeta $ a certain weight $W(\zeta )$. Then, the volatility $\sigma _t^{set}$ defined in our study is

$$\begin{aligned} \sigma _t^{set}=\frac{\int ^1_0{\mathbb {D}}(\tilde{{\varvec{R}}}(\zeta )_t)dW(\zeta )}{\int ^1_0dW(\zeta )} \end{aligned}$$

(33)

We set a general weight function $W(\zeta )=-\zeta +1$, $\zeta \in [0,1]$ in this study.
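With this weight function, Eq. (33) reduces to a plain average over $\zeta $. The short derivation below uses only $W(\zeta )=-\zeta +1$ and reads the integrals in Eq. (33) as Riemann–Stieltjes integrals, which is our interpretation of the notation:

$$\begin{aligned} dW(\zeta )&=-d\zeta ,\qquad \int ^1_0dW(\zeta )=W(1)-W(0)=-1 \nonumber \\ \sigma _t^{set}&=\frac{-\int ^1_0{\mathbb {D}}(\tilde{{\varvec{R}}}(\zeta )_t)d\zeta }{-1}=\int ^1_0{\mathbb {D}}(\tilde{{\varvec{R}}}(\zeta )_t)d\zeta \end{aligned}$$

Under this reading, every information reception level $\zeta \in [0,1]$ receives the same weight, and the sign of $dW$ cancels between numerator and denominator.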

3.3 Model specification

In accordance with the analysis framework of subsections 3.1 and 3.2, we present our proposed random sets-valued GARCH model, Set-GARCH model, and its derivatives.

3.3.1 Set-GARCH model

We set $\theta =\{\nu _1,\nu _2,\nu _3\}$ in Eq. (28) as $\theta =\{R_{C,t}, R_{l,t},R_{r,t}\}$, which also means that $R_{C,t}$, $R_{l,t}$, and $R_{r,t}$ are stochastic processes. Thus, Eq. (27) can be expressed as:

$$\begin{aligned} \tilde{{\varvec{R}}}_t=\tilde{\varvec{\epsilon }}_t,\quad \tilde{\varvec{\epsilon }}_t=\tilde{\varvec{\epsilon }}_t(R_{C,t},R_{l,t},R_{r,t}) \end{aligned}$$

(34)

Following the GARCH-type models, we set $l_{q,t}$ in Eq. (28) to be a normal distribution ${\mathcal {N}}(0,1)$.^{Footnote 7} Thus, we have:

$$\begin{aligned} R_{C,t}&\mathop {\sim }\limits _{i.i.d}\sqrt{h_t}{\mathcal {N}}(0,1) \nonumber \\ h_t&=\omega _h+\sum ^{P_h}_{i=1}\alpha _{h,i}h_{t-i}+\sum ^{Q_h}_{i=1}\beta _{h,i}(\Vert \tilde{{\varvec{R}}}_{t-i}\Vert ^2_{\rho _2}-\frac{1}{3}R_{C,t-i}^2)+\sum ^{R_h}_{i=1}\gamma _{h,i}R_{C,t-i}^2 \end{aligned}$$

(35)

Because $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2}$ contains a term $\frac{1}{3}R_{C,t-1}^2$ by Eq. (21), we remove this term from $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2}$, since the $\gamma _{h,i}$ term in $h_t$ already contains $R_{C,t-1}^2$. Since $R_{r,t}$ and $R_{l,t}$ must be positive numbers, we set $l_{2,t}$ and $l_{3,t}$ in Eq. (28) as Gamma distributions $\Gamma (1,\theta _{l_{2,t}})$ and $\Gamma (1,\theta _{l_{3,t}})$, which can flexibly control the variance and mean of $l_{2,t}$ and $l_{3,t}$ in Eq. (28), while reducing the complexity of the model. We have ${\mathbb {E}}(l_{2,t})=1/\theta _{l_{2,t}}$ and ${\mathbb {D}}(l_{2,t})=1/\theta ^2_{l_{2,t}}$. We denote $1/\theta _{l_{2,t}}$ as $\lambda _{l,t}$ and $1/\theta _{l_{3,t}}$ as $\lambda _{r,t}$. Here, we first adopt a simple setting in which $R_{l,t}$ and $R_{r,t}$ are independent of each other, so that $COV(R_{l,t},R_{r,t})=0$. This assumption is not strong, because in the analysis of Eq. (30) we only discussed some possible influence paths of $R_{l,t}$ and $R_{r,t}$. In Sect. 3.3.2, we will discuss the case where $R_{l,t}$ and $R_{r,t}$ are not independent of each other. According to the setting of Eqs. (28) and (30), we provide the following structure:

$$\begin{aligned} R_{l,t}&\mathop {\sim }\limits _{i.i.d.}\lambda _{l,t}\Gamma (1,1),\qquad R_{r,t}\mathop {\sim }\limits _{i.i.d.}\lambda _{r,t}\Gamma (1,1) \nonumber \\ \lambda _{l,t}&=\Lambda (\omega _l+\sum ^{P_l}_{i=1}\alpha _{l,i}\lambda _{l,t-i}+\sum ^{Q_l}_{i=1}\beta _{l,i}\Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2}+\sum ^{R_l}_{i=1}\gamma _{l,i}R_{l,t-i}) \nonumber \\ \lambda _{r,t}&=\Lambda (\omega _r+\sum ^{P_r}_{i=1}\alpha _{r,i}\lambda _{r,t-i}+\sum ^{Q_r}_{i=1}\beta _{r,i}\Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2}+\sum ^{R_r}_{i=1}\gamma _{r,i}R_{r,t-i}) \end{aligned}$$

(36)

where $\Lambda :{\mathbb {R}}\rightarrow (0,\infty )$ is a conversion function that ensures $\lambda _{l,t}$ and $\lambda _{r,t}$ are positive. Compared to Eq. (35), Eq. (36) uses the $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}$ term instead of the $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}^2$ term, given that the distance, rather than the square of the distance, is more suitable for describing $R_{l,t}$ and $R_{r,t}$. We now have the proposed Set-GARCH model, a GARCH-type model for sets-valued time series:

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{\varvec{\epsilon }}_t,\quad \tilde{\varvec{\epsilon }}_t=\tilde{\varvec{\epsilon }}_t(R_{C,t},R_{l,t},R_{r,t}) \nonumber \\ R_{C,t}&\mathop {\sim }\limits _{i.i.d}\sqrt{h_t}{\mathcal {N}}(0,1),\quad R_{l,t}\mathop {\sim }\limits _{i.i.d.}\lambda _{l,t}\Gamma (1,1),\quad R_{r,t}\mathop {\sim }\limits _{i.i.d.}\lambda _{r,t}\Gamma (1,1) \nonumber \\ h_t&=\omega _h+\sum ^{P_h}_{i=1}\alpha _{h,i}h_{t-i}+\sum ^{Q_h}_{i=1}\beta _{h,i}(\Vert \tilde{{\varvec{R}}}_{t-i}\Vert ^2_{\rho _2}-\frac{1}{3}R_{C,t-i}^2)+\sum ^{R_h}_{i=1}\gamma _{h,i}R_{C,t-i}^2 \nonumber \\ \lambda _{l,t}&=\Lambda (\omega _l+\sum ^{P_l}_{i=1}\alpha _{l,i}\lambda _{l,t-i}+\sum ^{Q_l}_{i=1}\beta _{l,i}\Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2}+\sum ^{R_l}_{i=1}\gamma _{l,i}R_{l,t-i}) \nonumber \\ \lambda _{r,t}&=\Lambda (\omega _r+\sum ^{P_r}_{i=1}\alpha _{r,i}\lambda _{r,t-i}+\sum ^{Q_r}_{i=1}\beta _{r,i}\Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2}+\sum ^{R_r}_{i=1}\gamma _{r,i}R_{r,t-i} ) \end{aligned}$$

(37)

Given $COV(R_{l,t},R_{C,t})=0$, $COV(R_{C,t},R_{r,t})=0$, and $COV(R_{l,t},R_{r,t})=0$, now we give the set volatility $\sigma _t^{set}$ of Eq. (33) of Set-GARCH model as:

$$\begin{aligned} \sigma _t^{set}&=\frac{\int ^1_0{\mathbb {D}}(\tilde{{\varvec{R}}}(\zeta )_t)dW(\zeta )}{\int ^1_0dW(\zeta )} \nonumber \\&=\frac{1}{3}h_t+\frac{\lambda ^2_{l,t}+\lambda ^2_{r,t}}{18}\int ^1_0(\phi ^{-1}(\alpha ;p))^2d\alpha \end{aligned}$$

(38)

where $\int ^1_0(\phi ^{-1}(\alpha ;p))^2d\alpha $ depends on p and $h_t$ is in Eq. (35), and $\lambda ^2_{l,t}+\lambda ^2_{r,t}$ are as in Eq. (36).
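As an illustration of how Eq. (38) can be evaluated, the following minimal Python sketch computes $\sigma _t^{set}$ from $h_t$, $\lambda _{l,t}$, and $\lambda _{r,t}$. The function names are ours, and the shape function used in the example, $\phi ^{-1}(\alpha )=1-\alpha $, is a hypothetical choice made only so that the integral has a known value; the actual $\phi $ is the one defined in Eq. (13).

```python
def set_volatility(h_t, lam_l, lam_r, phi_inv_sq_integral):
    """Aggregate set volatility of Eq. (38).

    phi_inv_sq_integral stands for the integral of (phi^{-1}(alpha; p))^2
    over [0, 1]. It is passed in as a number because it depends on the
    shape function phi and the prior parameter p.
    """
    return h_t / 3.0 + (lam_l ** 2 + lam_r ** 2) / 18.0 * phi_inv_sq_integral


def midpoint_integral(func, n=10_000):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    step = 1.0 / n
    return sum(func((k + 0.5) * step) for k in range(n)) * step


# Hypothetical shape function (an assumption for illustration only).
k_p = midpoint_integral(lambda a: (1.0 - a) ** 2)

# Hypothetical values of h_t, lambda_{l,t} and lambda_{r,t}.
sigma_set = set_volatility(h_t=0.0004, lam_l=0.01, lam_r=0.012,
                           phi_inv_sq_integral=k_p)
```

Because the integral depends only on p, it can be computed once and reused for every t.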

3.3.2 Set-GARCH-LR model

One assumption in the Set-GARCH model is $COV(R_{l,t},R_{r,t})=0$. We now remove this condition and allow $R_{l,t}$ and $R_{r,t}$ to be dependent. We call the proposed model in this case the Set-GARCH-LR model.

We consider using a joint bivariate Gamma distribution $\Gamma ^2$ to characterize $R_{l,t}$ and $R_{r,t}$ (Furman, 2008), where the marginal distribution $R_{l,t}$ or $R_{r,t}$ is a univariate Gamma distribution. This setup follows the CARR model’s specification of the distribution of returns’ range (Chou, 2005) (also see “Appendix A.1.2”).

Let ${\varvec{Y}}=(Y_0,Y_1,Y_2)'$ be a trivariate vector with independent components $Y_i\sim \Gamma (\gamma _i,\alpha _i)$, each of which has the density $f_{Y_i}(y)=e^{-\alpha _iy}\frac{y^{\gamma _i-1}\alpha ^{\gamma _i}_i}{\Gamma (\gamma _i)}$, $y>0$, $\alpha _i>0$, $\gamma _i>0$. Let ${\varvec{A}}=\begin{bmatrix} \alpha _0/\alpha _1&{}1&{}0\\ \alpha _0/\alpha _2&{}\alpha _1/\alpha _2&{}1 \end{bmatrix}$ and $(R_l,R_r)'={\varvec{A}}{\varvec{Y}}$. The joint distribution of $(R_l,R_r)$ is then a bivariate Gamma distribution controlled by the parameters $\{\alpha _0,\alpha _1,\alpha _2,\gamma _0,\gamma _1,\gamma _2\}$. With $x^*=\min \{\frac{\alpha _1}{\alpha _0}R_l,\frac{\alpha _2}{\alpha _0}R_r\}$, the density function of the bivariate Gamma is:

$$\begin{aligned} f(x_1,x_2)=e^{-\alpha _2x_2}\left( x_2-\frac{\alpha _1}{\alpha _2}x_1\right) ^{\gamma _2-1}\prod ^2_{j=0}\left( \frac{\alpha ^{\gamma _j}_j}{\Gamma (\gamma _j)}\right) \int ^{x^*}_0y_0^{\gamma _0-1}(x_1-\frac{\alpha _0}{\alpha _1}y_0)^{\gamma _1-1}dy_0\nonumber \\ \end{aligned}$$

(39)

According to the definition of Eq. (39), the marginal distributions of $R_{l,t}$ and $R_{r,t}$ are Gamma distributions, and the expectations, variances, and covariance of $R_{l,t}$ and $R_{r,t}$ are ${\mathbb {E}}(R_l)=\frac{\gamma _0+\gamma _1}{\alpha _1}$, ${\mathbb {E}}(R_r)=\frac{\gamma _0+\gamma _1+\gamma _2}{\alpha _2}$, ${\mathbb {D}}(R_l)=\frac{\gamma _0+\gamma _1}{\alpha _1^2}$, ${\mathbb {D}}(R_r)=\frac{\gamma _0+\gamma _1+\gamma _2}{\alpha _2^2}$, and $COV(R_l,R_r)=\frac{\gamma _0+\gamma _1}{\alpha _1\alpha _2}$. In this paper, we reparametrize Eq. (39). Let $\gamma _0=1$, $\alpha _0=1$, $\bar{\gamma }_1=\gamma _0+\gamma _1$, and $\bar{\gamma }_2=\gamma _0+\gamma _1+\gamma _2$, and thus we have:

$$\begin{aligned} f_{(R_l,R_r)'}(x_1,x_2)&=e^{-\alpha _2x_2}(x_2-\frac{\alpha _1}{\alpha _2}x_1)^{\bar{\gamma }_2-\bar{\gamma }_1-1}\frac{\alpha _1^{\bar{\gamma }_1-1}}{\Gamma (\bar{\gamma }_1-1)} \nonumber \\&\quad \cdot \frac{\alpha _2^{\bar{\gamma }_2-\bar{\gamma }_1}}{\Gamma (\bar{\gamma }_2-\bar{\gamma }_1)}\int ^{x^*}_0(x_1-\frac{1}{\alpha _1}y_0)^{\bar{\gamma }_1-2}dy_0 \end{aligned}$$

(40)
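The moment formulas stated for the bivariate Gamma construction $(R_l,R_r)'={\varvec{A}}{\varvec{Y}}$ can be checked directly from the linear map. The sketch below assumes that the components of ${\varvec{Y}}$ are independent and that $\Gamma (\gamma _i,\alpha _i)$ is parameterized by shape $\gamma _i$ and rate $\alpha _i$, as the density given in the text suggests; the function name and the numerical values are hypothetical.

```python
def bivariate_gamma_moments(alpha, gamma):
    """Mean vector and covariance matrix of (R_l, R_r)' = A Y.

    Y_0, Y_1, Y_2 are assumed independent with
    Y_i ~ Gamma(shape=gamma[i], rate=alpha[i]).
    """
    a0, a1, a2 = alpha
    # Rows of the mixing matrix A from the text.
    A = [[a0 / a1, 1.0, 0.0],
         [a0 / a2, a1 / a2, 1.0]]
    mean_y = [g / a for g, a in zip(gamma, alpha)]      # E(Y_i) = gamma_i / alpha_i
    var_y = [g / a ** 2 for g, a in zip(gamma, alpha)]  # D(Y_i) = gamma_i / alpha_i^2
    mean = [sum(A[r][i] * mean_y[i] for i in range(3)) for r in range(2)]
    cov = [[sum(A[r][i] * A[c][i] * var_y[i] for i in range(3))
            for c in range(2)] for r in range(2)]
    return mean, cov


alpha = (1.0, 2.0, 4.0)  # hypothetical rates (alpha_0, alpha_1, alpha_2)
gamma = (1.0, 1.5, 0.5)  # hypothetical shapes (gamma_0, gamma_1, gamma_2)
mean, cov = bivariate_gamma_moments(alpha, gamma)
```

For these values, the results agree with the closed forms $(\gamma _0+\gamma _1)/\alpha _1$, $(\gamma _0+\gamma _1+\gamma _2)/\alpha _2$, and $(\gamma _0+\gamma _1)/(\alpha _1\alpha _2)$. The covariance is strictly positive, which is what allows this construction to capture dependence between $R_{l,t}$ and $R_{r,t}$.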

We keep $\bar{\gamma }_1$ and $\bar{\gamma }_2$ time-invariant, and let $\alpha _1$ and $\alpha _2$ change dynamically in Eq. (40). Under this condition, we simplify Eq. (39) while ensuring that the marginal distributions of $R_{l,t}$ and $R_{r,t}$ are Gamma distributions $\Gamma (\gamma ,\alpha )$, and more importantly, we maintain the dynamics of the first and second moments of $R_{l,t}$ and $R_{r,t}$. Further, we have:

$$\begin{aligned} (R_{l,t},R_{r,t})\mathop {\sim }\limits _{i.i.d.}&\Gamma ^2(\alpha _{1,t},\alpha _{2,t},\bar{\gamma }_{1},\bar{\gamma }_2) \nonumber \\ \left[ \begin{array}{c} \alpha _{1,t} \\ \alpha _{2,t} \end{array} \right]&=\left[ \begin{array}{c} \omega _1 \\ \omega _2 \end{array} \right] +\sum ^{P_{lr}}_{i=1}\left[ \begin{array}{cc} a_{11,i}&{}0 \\ 0&{}a_{22,i} \end{array} \right] \left[ \begin{array}{c} \alpha ^*_{1,t-i} \\ \alpha ^*_{2,t-i} \end{array} \right] +\sum ^{Q_{lr}}_{i=1}\left[ \begin{array}{c} b_{1,i} \\ b_{2,i} \end{array} \right] \circ \left[ \begin{array}{c} \Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2} \\ \Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2} \end{array} \right] \nonumber \\&\quad +\sum ^{R_{lr}}_{i=1}\left[ \begin{array}{cc} c_{11,i}&{}c_{12,i} \\ c_{21,i}&{}c_{22,i} \end{array} \right] \left[ \begin{array}{c} R_{l,t-i} \\ R_{r,t-i} \end{array} \right] \nonumber \\ \alpha ^*_{1,t}&=\Lambda (\alpha _{1,t}),\quad \alpha ^*_{2,t}=\Lambda (\alpha _{2,t}) \end{aligned}$$

(41)

where $\circ $ is the Hadamard product, $\Gamma ^2$ is the density function of Eq. (40) with four parameters, and $\Lambda (x)$ is a transformation function $\Lambda :{\mathbb {R}}\rightarrow (0,\infty )$. Combining Eqs. (34), (35), and (41), we propose the derivative of the Set-GARCH model, named the Set-GARCH-LR model, as:

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{\varvec{\epsilon }}_t,\quad \tilde{\varvec{\epsilon }}_t=\tilde{\varvec{\epsilon }}_t(R_{C,t},R_{l,t},R_{r,t}) \nonumber \\ R_{C,t}&\mathop {\sim }\limits _{i.i.d}\sqrt{h_t}{\mathcal {N}}(0,1) \nonumber \\ h_t&=\omega _h+\sum ^{P_h}_{i=1}\alpha _{h,i}h_{t-i}+\sum ^{Q_h}_{i=1}\beta _{h,i}(\Vert \tilde{{\varvec{R}}}_{t-i}\Vert ^2_{\rho _2}-\frac{1}{3}R_{C,t-i}^2)+\sum ^{R_h}_{i=1}\gamma _{h,i}R_{C,t-i}^2 \nonumber \\ (R_{l,t},R_{r,t})\mathop {\sim }\limits _{i.i.d.}&\Gamma ^2(\alpha _{1,t},\alpha _{2,t},\bar{\gamma }_{1},\bar{\gamma }_2) \nonumber \\ \left[ \begin{array}{c} \alpha _{1,t} \\ \alpha _{2,t} \end{array} \right]&=\left[ \begin{array}{c} \omega _1 \\ \omega _2 \end{array} \right] +\sum ^{P_{lr}}_{i=1}\left[ \begin{array}{cc} a_{11,i}&{}0 \\ 0&{}a_{22,i} \end{array} \right] \left[ \begin{array}{c} \alpha ^*_{1,t-i} \\ \alpha ^*_{2,t-i} \end{array} \right] +\sum ^{Q_{lr}}_{i=1}\left[ \begin{array}{c} b_{1,i} \\ b_{2,i} \end{array} \right] \circ \left[ \begin{array}{c} \Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2} \\ \Vert \tilde{{\varvec{R}}}_{t-i}\Vert _{\rho _2} \end{array} \right] \nonumber \\&\quad +\sum ^{R_{lr}}_{i=1}\left[ \begin{array}{cc} c_{11,i}&{}c_{12,i} \\ c_{21,i}&{}c_{22,i} \end{array} \right] \left[ \begin{array}{c} R_{l,t-i} \\ R_{r,t-i} \end{array} \right] \nonumber \\ \alpha ^*_{1,t}&=\Lambda (\alpha _{1,t}),\quad \alpha ^*_{2,t}=\Lambda (\alpha _{2,t}) \end{aligned}$$

(42)

The “LR” in the name of Set-GARCH-LR means that it reveals the dependence between $R_{l,t}$ and $R_{r,t}$. Recalling Eq. (33), the $\sigma _t^{set}$ calculated by the Set-GARCH-LR model is:

$$\begin{aligned} \sigma _t^{set}=\frac{1}{3}h_t+\frac{1}{18}(\frac{\bar{\gamma }_1}{\alpha ^2_{1,t}}+\frac{\bar{\gamma }_2}{\alpha ^2_{2,t}}+\frac{\bar{\gamma }_1}{\alpha _{1,t}\alpha _{2,t}})\int ^1_0(\phi ^{-1}(\alpha ;p))^2d\alpha \end{aligned}$$

(43)

where $\int ^1_0(\phi ^{-1}(\alpha ;p))^2d\alpha $ depends on p. The $h_t$ is in Eq. (35), and $\bar{\gamma }_1$, $\bar{\gamma }_2$, $\alpha _{1,t}$ and $\alpha _{2,t}$ in Eq. (41).

3.4 Parameter estimation

Given that $\tilde{{\varvec{R}}}_t=\tilde{\varvec{\epsilon }}_t$, we can directly use historical observations for maximum likelihood estimation. For the Set-GARCH model, given that $f_{(R_C,R_l,R_r)}=f_{R_C}f_{R_l}f_{R_r}$, the log-likelihood function $ll_{Set-GARCH}$ w.r.t. the parameter set $\varvec{\theta }_{Set-GARCH}=(\omega _h,\varvec{\alpha }_h,\varvec{\beta }_h,\varvec{\gamma }_h,\omega _l,\varvec{\alpha }_l,\varvec{\beta }_l,\varvec{\gamma }_l,\omega _r,\varvec{\alpha }_r,\varvec{\beta }_r,\varvec{\gamma }_r)$ is

$$\begin{aligned}&ll_{Set-GARCH}(\omega _h,\varvec{\alpha }_h,\varvec{\beta }_h,\varvec{\gamma }_h,\omega _l,\varvec{\alpha }_l,\varvec{\beta }_l,\varvec{\gamma }_l,\omega _r,\varvec{\alpha }_r,\varvec{\beta }_r,\varvec{\gamma }_r\vert \tilde{{\varvec{r}}}_t) \nonumber \\&\quad =\sum ^T_{t=1}\ln f_{R_C}(x\vert \Omega _{t-1})+\sum ^T_{t=1}\ln f_{R_l}(x\vert \Omega _{t-1})+\sum ^T_{t=1}\ln f_{R_r}(x\vert \Omega _{t-1}) \nonumber \\&\quad \propto -\frac{1}{2}\sum ^T_{t=1}\ln h_t-\sum ^T_{t=1}\frac{r^2_{C,t}}{2h_t}-\sum ^T_{t=1}\ln \lambda _{l,t}-\sum ^T_{t=1}\frac{r_{l,t}}{\lambda _{l,t}}-\sum ^T_{t=1}\ln \lambda _{r,t}-\sum ^T_{t=1}\frac{r_{r,t}}{\lambda _{r,t}} \end{aligned}$$

(44)

Thus, we can deconstruct the maximum likelihood estimation process into three sub-maximum likelihood estimation terms, i.e.,

$$\begin{aligned} (\omega _h,\varvec{\alpha }_h,\varvec{\beta }_h,\varvec{\gamma }_h)&=argmax\{-\frac{1}{2}\sum ^T_{t=1}\ln h_t-\sum ^T_{t=1}\frac{r^2_{C,t}}{2h_t}\} \nonumber \\ (\omega _l,\varvec{\alpha }_l,\varvec{\beta }_l,\varvec{\gamma }_l)&=argmax\{-\sum ^T_{t=1}\ln \lambda _{l,t}-\sum ^T_{t=1}\frac{r_{l,t}}{\lambda _{l,t}}\} \nonumber \\ (\omega _r,\varvec{\alpha }_r,\varvec{\beta }_r,\varvec{\gamma }_r)&=argmax\{-\sum ^T_{t=1}\ln \lambda _{r,t}-\sum ^T_{t=1}\frac{r_{r,t}}{\lambda _{r,t}}\} \end{aligned}$$

(45)
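The decomposition in Eqs. (44) and (45) can be sketched as three separate objective functions. The Python below is a minimal illustration, not the authors' implementation; it assumes the conditional sequences $h_t$, $\lambda _{l,t}$, and $\lambda _{r,t}$ have already been computed from candidate parameters, and all function names are ours.

```python
import math


def loglik_center(r_c, h):
    """Gaussian part of Eq. (44) for R_C, additive constants dropped."""
    return sum(-0.5 * math.log(h_t) - r * r / (2.0 * h_t)
               for r, h_t in zip(r_c, h))


def loglik_spread(r, lam):
    """Part of Eq. (44) for a spread R = lambda_t * Gamma(1, 1)."""
    return sum(-math.log(l_t) - x / l_t for x, l_t in zip(r, lam))


def loglik_set_garch(r_c, r_l, r_r, h, lam_l, lam_r):
    """Full Set-GARCH log-likelihood as the sum of three independent parts."""
    return (loglik_center(r_c, h)
            + loglik_spread(r_l, lam_l)
            + loglik_spread(r_r, lam_r))
```

Each part depends on a disjoint block of parameters, so maximizing the sum is equivalent to maximizing the three parts one at a time, which keeps each optimization problem low-dimensional.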

For the Set-GARCH-LR model, given that $f_{(R_C,R_l,R_r)}=f_{R_C}f_{(R_l,R_r)}$, the log-likelihood function $ll_{Set-GARCH-LR}$ w.r.t. the parameter set $\varvec{\theta }_{Set-GARCH-LR}=(\omega _h,\varvec{\alpha }_h,\varvec{\beta }_h,\varvec{\gamma }_h,\omega _1,\omega _2,\bar{\gamma }_1,\bar{\gamma }_2,{\varvec{a}}_{P_{lr}},{\varvec{b}}_{Q_{lr}},{\varvec{c}}_{R_{lr}})$ is:

$$\begin{aligned}&ll_{Set-GARCH-LR}(\omega _h,\varvec{\alpha }_h,\varvec{\beta }_h,\varvec{\gamma }_h,\omega _1,\omega _2,\bar{\gamma }_1,\bar{\gamma }_2,{\varvec{a}}_{P_{lr}},{\varvec{b}}_{Q_{lr}},{\varvec{c}}_{R_{lr}}\vert \tilde{{\varvec{r}}}_t) \nonumber \\&\quad =\sum ^T_{t=1}\ln f_{R_C}(x\vert \Omega _{t-1})+\sum ^T_{t=1}\ln f_{(R_l,R_r)}(x,y\vert \Omega _{t-1}) \nonumber \\&\quad \propto -\frac{1}{2}\sum ^T_{t=1}\ln h_t-\sum ^T_{t=1}\frac{r^2_{C,t}}{2h_t}-\sum ^T_{t=1}\alpha _{2,t}r_{r,t}+(\bar{\gamma }_2-\bar{\gamma }_1-1)\sum ^T_{t=1}\ln (r_{r,t}-\frac{\alpha _{1,t}}{\alpha _{2,t}}r_{l,t}) \nonumber \\&\quad +(\bar{\gamma }_1-1)\sum ^T_{t=1}\ln \alpha _{1,t}+(\bar{\gamma }_2-\bar{\gamma }_1)\sum ^T_{t=1}\ln \alpha _{2,t}-\sum ^T_{t=1}\ln \Gamma (\bar{\gamma }_1-1) \nonumber \\&\quad -\sum ^T_{t=1}\ln \Gamma (\bar{\gamma }_2-\bar{\gamma }_1)+\sum ^T_{t=1}\ln (\int ^{x^*}_0(r_{l,t}-\frac{y_0}{\alpha _{1,t}})^{\bar{\gamma }_1-2}dy_0) \end{aligned}$$

(46)

We can still find the optimal parameters to be estimated using a method similar to Eq. (45), that is:

$$\begin{aligned}&(\omega _h,\varvec{\alpha }_h,\varvec{\beta }_h,\varvec{\gamma }_h)=argmax\{-\frac{1}{2}\sum ^T_{t=1}\ln h_t-\sum ^T_{t=1}\frac{r^2_{C,t}}{2h_t}\} \nonumber \\&\quad (\omega _1,\omega _2,\bar{\gamma }_1,\bar{\gamma }_2,{\varvec{a}}_{P_{lr}},{\varvec{b}}_{Q_{lr}},{\varvec{c}}_{R_{lr}}) \nonumber \\&\quad =argmax\{-\sum ^T_{t=1}\alpha _{2,t}r_{r,t}+(\bar{\gamma }_2-\bar{\gamma }_1-1)\sum ^T_{t=1}\ln (r_{r,t}-\frac{\alpha _{1,t}}{\alpha _{2,t}}r_{l,t}) \nonumber \\&\quad +(\bar{\gamma }_1-1)\sum ^T_{t=1}\ln \alpha _{1,t}+(\bar{\gamma }_2-\bar{\gamma }_1)\sum ^T_{t=1}\ln \alpha _{2,t}-\sum ^T_{t=1}\ln \Gamma (\bar{\gamma }_1-1) \nonumber \\&\quad -\sum ^T_{t=1}\ln \Gamma (\bar{\gamma }_2-\bar{\gamma }_1)+\sum ^T_{t=1}\ln (\int ^{x^*}_0(r_{l,t}-\frac{y_0}{\alpha _{1,t}})^{\bar{\gamma }_1-2}dy_0)\} \end{aligned}$$

(47)

where $x^*=\min \{\frac{\alpha _1}{\alpha _0}R_l,\frac{\alpha _2}{\alpha _0}R_r\}$. The scoring direction search optimization method is used to solve Eqs. (45) and (47). Let $\varvec{\theta }^*_{Set-GARCH}$ and $\varvec{\theta }^*_{Set-GARCH-LR}$ be the true parameters of the Set-GARCH and Set-GARCH-LR models. We have that:

$$\begin{aligned}&(T\rightarrow \infty )\hat{\varvec{\theta }}_{Set-GARCH}\mathop {\rightarrow }\limits ^{{\mathcal {P}}}\varvec{\theta }^*_{Set-GARCH} \nonumber \\&(T\rightarrow \infty )\hat{\varvec{\theta }}_{Set-GARCH-LR}\mathop {\rightarrow }\limits ^{{\mathcal {P}}}\varvec{\theta }^*_{Set-GARCH-LR} \nonumber \\&(T\rightarrow \infty )T^{\frac{1}{2}}(\hat{\varvec{\theta }}_{Set-GARCH}-\varvec{\theta }^*_{Set-GARCH})\mathop {\rightarrow }\limits ^{{\mathcal {D}}}{\mathcal {N}}\left( 0,-\left[ {\mathbb {E}}\frac{\partial ^2ll_{Set-GARCH}}{\partial \varvec{{\theta }^*}^2_{Set-GARCH}}\right] '\right) \nonumber \\&(T\rightarrow \infty )T^{\frac{1}{2}}(\hat{\varvec{\theta }}_{Set-GARCH-LR}-\varvec{\theta }^*_{Set-GARCH-LR})\nonumber \\&\mathop {\rightarrow }\limits ^{{\mathcal {D}}}{\mathcal {N}}\left( 0,-\left[ {\mathbb {E}}\frac{\partial ^2ll_{Set-GARCH-LR}}{\partial \varvec{{\theta }^*}^2_{Set-GARCH-LR}}\right] '\right) \end{aligned}$$

(48)

where $-[{\mathbb {E}}\frac{\partial ^2ll_{Set-GARCH}}{\partial \varvec{{\theta }^*}^2_{Set-GARCH}}]'$ and $-[{\mathbb {E}}\frac{\partial ^2ll_{Set-GARCH-LR}}{\partial \varvec{{\theta }^*}^2_{Set-GARCH-LR}}]'$ are the Fisher information matrices of $ll_{Set-GARCH}$ in Eq. (44) and $ll_{Set-GARCH-LR}$ in Eq. (46) at $\varvec{\theta }^*_{Set-GARCH}$ and $\varvec{\theta }^*_{Set-GARCH-LR}$, respectively. We can compute the numerical solution of Eq. (48) to obtain the standard errors of the estimated parameters.
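To make the numerical step concrete, the sketch below computes a standard error from a finite-difference second derivative of a log-likelihood. It is a toy one-parameter case (a constant-scale spread $R=\lambda \Gamma (1,1)$), not the full Set-GARCH likelihood, and it uses the textbook convention that the asymptotic variance is the inverse of the observed information, which is how we read Eq. (48). The data values and function names are hypothetical.

```python
import math


def loglik(lam, r):
    """Log-likelihood of a constant-scale sample R = lam * Gamma(1, 1)."""
    return sum(-math.log(lam) - x / lam for x in r)


def second_derivative(func, x, step=1e-4):
    """Central finite-difference approximation of func''(x)."""
    return (func(x + step) - 2.0 * func(x) + func(x - step)) / step ** 2


r = [0.8, 1.3, 0.4, 2.1, 0.9, 1.5]  # hypothetical observed spreads
lam_hat = sum(r) / len(r)            # closed-form MLE in this toy case

# Observed information is minus the second derivative at the estimate.
observed_info = -second_derivative(lambda lam: loglik(lam, r), lam_hat)
std_err = math.sqrt(1.0 / observed_info)
```

In this toy case the result can be compared with the analytic value $\hat{\lambda }/\sqrt{T}$. For the multi-parameter models, the scalar second derivative is replaced by the Hessian matrix and the reciprocal by a matrix inverse.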

4 An empirical application

4.1 Data selection

We select daily, weekly, and monthly data from Datastream for WTI oil futures, the S&P500 stock index, and NYMEX gold futures to demonstrate the in-sample and out-of-sample volatility forecasting and returns interval forecasting capabilities of the proposed Set-GARCH model. Futures prices are chosen so that they represent the highest, lowest, and closing prices. The data selection is shown in Table 1.

Table 1 Data selection


As shown in Table 1, the SD of the highest, lowest, and closing prices of an asset generally increases as the timescale lengthens. This is because the cumulative change in asset prices over a month is typically greater than the change over a day or a week. Daily data are a good test of the forecasting performance of a model that incorporates range information, but we would also like to investigate how our Set-GARCH and Set-GARCH-LR models perform with both high- and low-frequency data. Figure 2 depicts the high, low, and closing price (returns) trajectories for crude oil over the sample period since 2018. If we only consider the closing price, we appear to lose a great deal of information.

4.2 In-sample volatility forecasting

Without loss of generality, we specify the proposed Set-GARCH model as follows:

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{\varvec{\epsilon }}_t,\quad \tilde{\varvec{\epsilon }}_t=\tilde{\varvec{\epsilon }}_t(R_{C,t},R_{l,t},R_{r,t}) \nonumber \\ R_{C,t}&\mathop {\sim }\limits _{i.i.d}\sqrt{h_t}{\mathcal {N}}(0,1),\quad R_{l,t}\mathop {\sim }\limits _{i.i.d.}\lambda _{l,t}\Gamma (1,1),\quad R_{r,t}\mathop {\sim }\limits _{i.i.d.}\lambda _{r,t}\Gamma (1,1) \nonumber \\ h_t&=\omega _h+\alpha _{h,1}h_{t-1}+\beta _{h,1}(\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2}-\frac{1}{3}R_{C,t-1}^2)+\gamma _{h,1}R_{C,t-1}^2 \nonumber \\ \lambda _{l,t}&=(\omega _l+\alpha _{l,1}\lambda _{l,t-1}+\beta _{l,1}\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}+\gamma _{l,1}R_{l,t-1})^2+0.001 \nonumber \\ \lambda _{r,t}&=(\omega _r+\alpha _{r,1}\lambda _{r,t-1}+\beta _{r,1}\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}+\gamma _{r,1}R_{r,t-1})^2+0.001 \end{aligned}$$

(49)
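The first-order recursions of Eq. (49) can be sketched as a simple filter. The Python below is a minimal illustration under stated assumptions: the norm $\Vert \tilde{{\varvec{R}}}_{t}\Vert _{\rho _2}$ is computed elsewhere from Eq. (9) and passed in, the starting values are supplied by the user because the text does not fix them, and the parameter key names and numerical values are hypothetical.

```python
def set_garch_filter(r_c, r_l, r_r, norm, params, h0, lam_l0, lam_r0):
    """Run the recursions of Eq. (49) and return (h, lam_l, lam_r).

    r_c, r_l, r_r : observed centre, left spread and right spread.
    norm          : values of ||R_t||_{rho_2}, computed elsewhere.
    params        : dict of coefficients (hypothetical key names).
    h0, lam_l0, lam_r0 : starting values (an assumption of this sketch).
    """
    def positive(x):
        # Conversion function Lambda(x) = x^2 + 0.001 used in Eq. (49).
        return x * x + 0.001

    p = params
    h, lam_l, lam_r = [h0], [lam_l0], [lam_r0]
    for t in range(1, len(r_c)):
        h.append(p["w_h"] + p["a_h"] * h[t - 1]
                 + p["b_h"] * (norm[t - 1] ** 2 - r_c[t - 1] ** 2 / 3.0)
                 + p["g_h"] * r_c[t - 1] ** 2)
        lam_l.append(positive(p["w_l"] + p["a_l"] * lam_l[t - 1]
                              + p["b_l"] * norm[t - 1]
                              + p["g_l"] * r_l[t - 1]))
        lam_r.append(positive(p["w_r"] + p["a_r"] * lam_r[t - 1]
                              + p["b_r"] * norm[t - 1]
                              + p["g_r"] * r_r[t - 1]))
    return h, lam_l, lam_r


# Hypothetical two-period example.
params = {"w_h": 0.01, "a_h": 0.5, "b_h": 0.1, "g_h": 0.2,
          "w_l": 0.1, "a_l": 0.5, "b_l": 0.2, "g_l": 0.3,
          "w_r": 0.1, "a_r": 0.5, "b_r": 0.2, "g_r": 0.3}
h, lam_l, lam_r = set_garch_filter(
    r_c=[0.1, -0.2], r_l=[0.05, 0.04], r_r=[0.06, 0.03],
    norm=[0.2, 0.3], params=params, h0=0.02, lam_l0=0.1, lam_r0=0.1)
```

The returned sequences are the inputs needed by the log-likelihood of Eq. (44). Note that the squaring in $\Lambda $ guarantees positive $\lambda _{l,t}$ and $\lambda _{r,t}$, whereas $h_t$ has no such transformation and its positivity depends on the coefficient values.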

Similarly, we set each order in the Set-GARCH-LR model to 1, i.e.,

$$\begin{aligned} \tilde{{\varvec{R}}}_t&=\tilde{\varvec{\epsilon }}_t,\quad \tilde{\varvec{\epsilon }}_t=\tilde{\varvec{\epsilon }}_t(R_{C,t},R_{l,t},R_{r,t}) \nonumber \\ R_{C,t}&\mathop {\sim }\limits _{i.i.d}\sqrt{h_t}{\mathcal {N}}(0,1) \nonumber \\ h_t&=\omega _h+\alpha _{h,1}h_{t-1}+\beta _{h,1}(\Vert \tilde{{\varvec{R}}}_{t-1}\Vert ^2_{\rho _2}-\frac{1}{3}R_{C,t-1}^2)+\gamma _{h,1}R_{C,t-1}^2 \nonumber \\ (R_{l,t},R_{r,t})\mathop {\sim }\limits _{i.i.d.}&\Gamma ^2(\alpha _{1,t},\alpha _{2,t},\bar{\gamma }_{1},\bar{\gamma }_2) \nonumber \\ \left[ \begin{array}{c} \alpha _{1,t} \\ \alpha _{2,t} \end{array} \right]&=\left[ \begin{array}{c} \omega _1 \\ \omega _2 \end{array} \right] +\left[ \begin{array}{cc} a_{11,1}&{}0 \\ 0&{}a_{22,1} \end{array} \right] \left[ \begin{array}{c} \alpha ^*_{1,t-1} \\ \alpha ^*_{2,t-1} \end{array} \right] +\left[ \begin{array}{c} b_{1,1} \\ b_{2,1} \end{array} \right] \circ \left[ \begin{array}{c} \Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2} \\ \Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2} \end{array} \right] \nonumber \\&\quad +\left[ \begin{array}{cc} c_{11,1}&{}c_{12,1} \\ c_{21,1}&{}c_{22,1} \end{array} \right] \left[ \begin{array}{c} R_{l,t-1} \\ R_{r,t-1} \end{array} \right] \nonumber \\ \alpha ^*_{1,t}&=(\alpha _{1,t})^2+0.001,\quad \alpha ^*_{2,t}=(\alpha _{2,t})^2+0.001 \end{aligned}$$

(50)
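One step of the rate recursion in Eq. (50) can be written out as follows. This is a minimal sketch with hypothetical argument names and numerical values; it only illustrates how the diagonal autoregressive part, the Hadamard product with the norm, and the cross effects of the lagged spreads combine.

```python
def lr_rate_update(alpha_star_prev, norm_prev, r_l_prev, r_r_prev,
                   omega, a_diag, b, c):
    """One step of the first-order recursion in Eq. (50).

    alpha_star_prev  : (alpha*_{1,t-1}, alpha*_{2,t-1}).
    omega, a_diag, b : length-2 sequences of coefficients.
    c                : 2x2 nested sequence of cross coefficients.
    Returns (alpha_t, alpha_star_t).
    """
    spreads = (r_l_prev, r_r_prev)
    alpha_t = tuple(
        omega[k]
        + a_diag[k] * alpha_star_prev[k]                # diagonal AR part
        + b[k] * norm_prev                              # Hadamard product term
        + sum(c[k][j] * spreads[j] for j in range(2))   # lagged R_l and R_r
        for k in range(2)
    )
    # Transformation Lambda(x) = x^2 + 0.001 from Eq. (50).
    alpha_star_t = tuple(a * a + 0.001 for a in alpha_t)
    return alpha_t, alpha_star_t


# Hypothetical inputs.
alpha_t, alpha_star_t = lr_rate_update(
    alpha_star_prev=(2.0, 3.0), norm_prev=0.5, r_l_prev=0.1, r_r_prev=0.2,
    omega=(0.1, 0.2), a_diag=(0.5, 0.4), b=(0.2, 0.3),
    c=((0.1, 0.2), (0.3, 0.4)))
```

The off-diagonal entries of `c` correspond to $c_{12,1}$ and $c_{21,1}$, which let the lagged right spread enter the left-spread rate and vice versa.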

Meanwhile, we set the prior parameter p reflecting the shape of the fuzzy set to three different values of 1, 2, and 10. Tables 2, 3, 4, 5, 6 and 7 demonstrate the parameter estimation results.

$\beta _{h,1}$ shows how the term $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}^2-\frac{1}{3}R^2_{C,t-1}$ in the Set-GARCH and Set-GARCH-LR models affects the change of $h_t$, and it is also an important coefficient revealing the usefulness of the fuzzy sets-valued variable. We found that for the same asset, $\beta _{h,1}$ is mostly insignificant under the daily data, while under the weekly and monthly data, $\beta _{h,1}$ is statistically significant. In Sect. 4.1, we found that as the data frequency decreases, the volatility of $H_t$ and $L_t$ also becomes greater. In contrast to the daily frequency, the term $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}^2-\frac{1}{3}R^2_{C,t-1}$ varies more at the weekly and monthly frequencies and provides more information, which helps to predict $h_t$ under the weekly and monthly data frequencies.

In the Set-GARCH model, $\beta _{l,1}$ and $\beta _{r,1}$ show how $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}$ affects the expectation (and also the variance) of $R_{r,t}$ and $R_{l,t}$. From Tables 2, 3 and 4, for almost all assets and time scales, $\beta _{l,1}$ and $\beta _{r,1}$ are significant, indicating that the influence of the fuzzy sets-valued variables on $R_{r,t}$ and $R_{l,t}$ does not change with the data frequency. Further, $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}$ always has a positive impact on $\lambda _{l,t}$ and $\lambda _{r,t}$ in our empirical application, which differs from the pattern of the influence of $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}$ on $h_t$. The $h_t$, $\lambda _{l,t}$, and $\lambda _{r,t}$ demonstrate strong dynamic patterns that drive the change of $\tilde{{\varvec{R}}}_{t}$. This supports the rationality of our settings for the Set-GARCH model.

Eqs. (45) and (47) imply that $\omega _h$, $\alpha _{h,1}$, $\beta _{h,1}$, and $\gamma _{h,1}$ take the same values in both the Set-GARCH and Set-GARCH-LR models, which is confirmed in Tables 5, 6 and 7. The coefficients $b_{1,1}$ and $b_{2,1}$ of the Set-GARCH-LR model are almost all significant across assets and sample frequencies, which shows that when $R_{r,t}$ and $R_{l,t}$ are not independent, $\Vert \tilde{{\varvec{R}}}_{t-1}\Vert _{\rho _2}$ affects the distribution parameters $\alpha _{1,t}$ and $\alpha _{2,t}$ in a time-varying manner. This demonstrates once again the importance of treating returns as random fuzzy sets-valued variables.

In contrast to the Set-GARCH model, the Set-GARCH-LR model includes the cross coefficients $c_{12,1}$ and $c_{21,1}$, which are also significant in most cases in Tables 5, 6 and 7. This illustrates the interaction between $R_{r,t}$ and $R_{l,t}$ of assets, and this interaction does not disappear when the data frequency changes. In summary, the model settings of Set-GARCH and Set-GARCH-LR make full use of the past information in $\tilde{{\varvec{R}}}_{t-1}$ to drive changes of $\tilde{{\varvec{R}}}_{t}$.

Table 2 In sample Set-GARCH estimation of oil price

Table 3 In sample Set-GARCH estimation of S&P500 price

Table 4 In sample Set-GARCH estimation of gold price

Table 5 In sample Set-GARCH-LR estimation of oil price

Table 6 In sample Set-GARCH-LR estimation of S&P500 price

Table 7 In sample Set-GARCH-LR estimation of gold price

We select the following two loss functions to measure the volatility forecasting accuracy (Patton, 2011). We take the squared returns $\sigma ^2_i$ as the proxy of the real volatility and denote by ${\hat{\sigma }}^2_i$ the predicted volatility (Wang et al., 2020; Zhang et al., 2020):

$$\begin{aligned} \text {MSE-SD}&=\frac{1}{N}\sum ^N_{i=1}({\hat{\sigma }}_i-\sigma _i)^2 \nonumber \\ \text {MAE}&=\frac{1}{N}\sum ^N_{i=1}\vert {\hat{\sigma }}^2_i-\sigma _i^2\vert \end{aligned}$$

(51)
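The two loss functions in Eq. (51) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `mse_sd` and `mae` are our own, and we assume that both inputs are sequences of variances (predicted variance and squared-return proxy), so that MSE-SD is computed on their square roots.

```python
import math


def mse_sd(pred_var, proxy_var):
    """Mean squared error between predicted and proxy standard deviations."""
    n = len(pred_var)
    return sum((math.sqrt(p) - math.sqrt(s)) ** 2
               for p, s in zip(pred_var, proxy_var)) / n


def mae(pred_var, proxy_var):
    """Mean absolute error between predicted and proxy variances."""
    n = len(pred_var)
    return sum(abs(p - s) for p, s in zip(pred_var, proxy_var)) / n


# Hypothetical predicted variances and squared-return proxies.
pred = [0.04, 0.09]
proxy = [0.01, 0.16]
loss_sd = mse_sd(pred, proxy)   # errors in SD are 0.1 and -0.1
loss_abs = mae(pred, proxy)     # errors in variance are 0.03 and -0.07
```

MSE-SD penalizes errors on the standard deviation scale, whereas MAE works on the variance scale, so the two losses can rank models differently.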

The Model Confidence Set (MCS) test (Hansen et al., 2011) is utilized to determine whether a model belongs to the set of accepted models at a specified confidence level. The MCS statistic ranges between 0 and 1; the greater the value, the stronger the acceptance of the model (Wang et al., 2020, 2016). For the CARR group models and the ACI model, after calculating their range, we use $\frac{H_t-L_t}{4\ln 2}$ to calculate their in-sample predicted volatility.

In general, compared to the benchmark models, the Set-GARCH and Set-GARCH-LR models exhibit significantly superior in-sample volatility prediction capabilities. This advantage is particularly pronounced as the sample frequency of assets decreases (e.g., monthly data rather than daily or weekly data). For the same frequency and asset, the Set-GARCH-LR model’s in-sample prediction performance is frequently superior to that of the Set-GARCH model. These observations indicate that the degree of absorption of sets-valued information in the sample enables the model to better fit the in-sample data. We elaborate on this claim using the evidence presented in Tables 8, 9 and 10.

Referring to the analysis in sections A.1 and 2, the Set-GARCH and Set-GARCH-LR models capture the “range” and “level” information of $H_t$ and $L_t$, and the “point” information of $R_{C,t}$. The GARCH group models only contain the “point” information, while the ACI and Int-GARCH models do not contain the “point” information of $R_{C,t}$. The CARR group models only engage the “range” information of $H_t$ and $L_t$, and the “point” information of $R_{C,t}$. Provided that the “range” and “level” information of $R_{L,t}$ and $R_{H,t}$ is rich (or changes by a relatively large amount), it would certainly improve in-sample forecasting. Compared to daily and weekly time intervals, the changes of $R_{L,t}$ and $R_{H,t}$ in monthly data are more profound. This complete information empowers the Set-GARCH and Set-GARCH-LR models to close the information gap existing in the benchmark models.

The empirical results also show that the in-sample prediction performance of the Set-GARCH-LR model is superior to that of the Set-GARCH model when it was applied to crude oil. Crude oil is a highly volatile asset (Cerqueti & Fanelli, 2021; Cerqueti et al., 2020), and the mechanism of change between $R_{L,t}$ and $R_{H,t}$ is more significant, which could make Set-GARCH-LR superior for in-sample forecasting.

A high value of p means that we increase the degree of membership of return values close to $R_{L,t}$ and $R_{H,t}$ in $\tilde{{\varvec{R}}}_t$. In terms of in-sample prediction performance, the Set-GARCH and Set-GARCH-LR models with $p=1$ and $p=2$ have better fitting results. Compared to the ACI and Int-GARCH models, which absorb all the interval-valued information equally, a small p controls the “degree of membership” of the various points in the interval-valued information. As shown in Fig. 3, there is no difference between $\tilde{{\varvec{R}}}_t$ and interval-valued variables for extremely large p values, making our model inferior to ACI and Int-GARCH. Figure 3 demonstrates the real volatility and the best models’ fitted volatility.

Table 8 In sample daily returns volatility forecasting goodness-of-fit and MCS test

Table 9 In sample weekly returns volatility forecasting goodness-of-fit and MCS test

Table 10 In sample monthly returns volatility forecasting goodness-of-fit and MCS test

4.3 Out-of-sample volatility forecasting

We use a rolling window of length 300 with one-step-ahead prediction to evaluate the out-of-sample volatility prediction performance of the Set-GARCH model and the Set-GARCH-LR model. From Eqs. (49) and (50), the one-step-ahead forecast ${\hat{\sigma }}_{set,t}(1)$ is:

$$\begin{aligned} {\hat{\sigma }}_{set,t}(1)&=\frac{1}{3}{\hat{h}}_t(1)+\frac{{\hat{\lambda }}^2_{l,t}(1)+{\hat{\lambda }}^2_{r,t}(1)}{18}\int ^1_0(\phi ^{-1}(\alpha ;p))^2d\alpha \nonumber \\ {\hat{h}}_t(1)&={\hat{\omega }}_h+{\hat{\alpha }}_{h,1}h_t+{\hat{\beta }}_{h,1}(\Vert \tilde{{\varvec{R}}}_{t}\Vert _{\rho _2}^2-\frac{R^2_{C,t}}{3})+{\hat{\gamma }}_{h,1}R^2_{C,t} \nonumber \\ {\hat{\lambda }}_{l,t}(1)&=({\hat{\omega }}_l+{\hat{\alpha }}_{l,1}\lambda _{l,t}+{\hat{\beta }}_{l,1}\Vert \tilde{{\varvec{R}}}_{t}\Vert _{\rho _2}+{\hat{\gamma }}_{l,1}R_{l,t})^2+0.001 \nonumber \\ {\hat{\lambda }}_{r,t}(1)&=({\hat{\omega }}_r+{\hat{\alpha }}_{r,1}\lambda _{r,t}+{\hat{\beta }}_{r,1}\Vert \tilde{{\varvec{R}}}_{t}\Vert _{\rho _2}+{\hat{\gamma }}_{r,1}R_{r,t})^2+0.001 \end{aligned}$$

(52)

and the one-step forward prediction of the Set-GARCH-LR is also calculated in a similar way.
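The recursions behind the one-step forecast, together with the rolling-window scheme, can be sketched as follows. This is a hedged illustration, not the authors' implementation: the helper names (`forecast_h`, `forecast_lambda`, `rolling_windows`) and the parameter tuples are hypothetical, the norm of the most recent sets-valued return is assumed to be precomputed, and the final combination into ${\hat{\sigma }}_{set,t}(1)$ is omitted because it depends on the membership function $\phi $ defined earlier in the paper.

```python
def forecast_h(params, h_last, norm_last, rc_last):
    """One-step forecast of h from the latest h, norm and close return."""
    omega, alpha, beta, gamma = params
    return (omega + alpha * h_last
            + beta * (norm_last ** 2 - rc_last ** 2 / 3.0)
            + gamma * rc_last ** 2)


def forecast_lambda(params, lam_last, norm_last, spread_last):
    """One-step forecast of a spread parameter (left or right)."""
    omega, alpha, beta, gamma = params
    linear = omega + alpha * lam_last + beta * norm_last + gamma * spread_last
    return linear ** 2 + 0.001   # square plus 0.001 keeps the forecast positive


def rolling_windows(series, window=300):
    """Yield (estimation_window, next_observation) pairs for evaluation."""
    for end in range(window, len(series)):
        yield series[end - window:end], series[end]


# Hypothetical parameter values, for illustration only.
h_next = forecast_h((0.01, 0.8, 0.05, 0.1), h_last=1.0, norm_last=1.2, rc_last=0.9)
lam_next = forecast_lambda((0.1, 0.5, 0.1, 0.2), lam_last=1.0, norm_last=1.0,
                           spread_last=0.5)
pairs = list(rolling_windows(list(range(5)), window=3))
```

In an actual evaluation, the model would be re-estimated on each window returned by `rolling_windows` before calling the forecast functions, and the forecast would be compared with the held-out observation.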

As evidenced in Tables 11 and 13, the Set-GARCH group models generally demonstrate a superior ability to predict out-of-sample volatility compared to the benchmark models. There are some exceptions: the CARR-B model exhibits certain predictive advantages for weekly data.

Unlike the Set-GARCH or the Set-GARCH-LR model with $p=1$ or $p=2$, the out-of-sample prediction performance of the Set-GARCH model with the prior parameter equal to 10 is not as satisfactory in our empirical applications. This may be because a higher p value increases the absorption of $R_l$ and $R_r$ information in out-of-sample predictions, which may lead to instability in the Set-GARCH model. The poor prediction performance of the Set-GARCH model under this large p is consistent with the poor out-of-sample volatility prediction performance of the Int-GARCH model presented in Tables 11, 12 and 13. According to the analysis in Sect. 2.3, $\tilde{{\varvec{R}}}_t$ turns into an interval-valued random variable, equivalent to that of the Int-GARCH model, when p is large. In this case, the Int-GARCH and Set-GARCH (-LR) models under $p=10$ absorb approximately the same information. We note that the prediction results of Int-GARCH and Set-GARCH (-LR) are not satisfactory. This may suggest that extra interval information will not necessarily improve the model’s performance in out-of-sample volatility prediction.

Under $p=1$ or $p=2$, the out-of-sample volatility prediction of Set-GARCH or Set-GARCH-LR model performs well. The out-of-sample prediction performance of GARCH group models is significantly inferior to our proposed Set-GARCH (-LR) model. This is largely due to the lack of information processed by the GARCH model, which only considers $R_{C,t}$ information. This confirms the importance of Set-GARCH (-LR) absorbing $R_{L,t}$ and $R_{R,t}$ information in the out-of-sample volatility prediction.

As discussed in Sect. 2.3, when p is small, the information absorbed by Set-GARCH-LR is close to the information absorbed by the ACI model. In most cases, the Set-GARCH (-LR) out-of-sample prediction with $p=1$ or $p=2$ performs better than the ACI model. This may imply that the calculation mode of $\sigma ^{Set}_t$ is better than the calculation mode $\frac{H_t-L_t}{4\ln 2}$ of the ACI model. Recalling Eqs. (38) and (43), $\sigma ^{Set}_t$ is a linear combination of the terms ${\mathbb {D}}(R_{C})$, ${\mathbb {D}}(R_r)$, ${\mathbb {D}}(R_l)$, $COV(R_C,R_r)$, $COV(R_C,R_l)$, and $COV(R_l,R_r)$.^{Footnote 8} These different pieces of information are blended together to give $\sigma ^{Set}_t$ an enhanced predictive capability. Different sample frequencies do not appear to have a substantial effect on the Set-GARCH (-LR) model’s ability to predict out-of-sample volatility.

Table 11 Out of sample daily returns volatility forecasting goodness-of-fit and MCS test

Table 12 Out of sample weekly returns volatility forecasting goodness-of-fit and MCS test

Table 13 Out of sample monthly returns volatility forecasting goodness-of-fit and MCS test

5 Conclusion

In the last few decades, the data structure of financial time series volatility models has evolved significantly, from GARCH-type models with point-valued data to CARR-type models with range-valued data, and further to the ACI and Int-GARCH models with interval-valued data based on random set theory. This study proposes a Set-GARCH model that drives the volatility changes in random fuzzy sets-valued time series. Adapting to the rules of random set operations, the proposed Set-GARCH model exhibits accurate volatility prediction.

We construct the sets-valued asset price using a fuzzy LR-form set. We present a general and adaptable form of the membership function with a prior parameter p that controls the shape of these functions. We examine the impact of various subtraction rules on sets-valued returns. This paper provides the inner-product definition, distance definition, and variance definition between two random fuzzy sets-valued returns.

Based on the sets-valued variable subtraction rule selected, we discuss the specifications that a model driving sets-valued variable changes should have and provide the specifications of our Set-GARCH model. We also propose the Set-GARCH-LR model as a derivative of the Set-GARCH model to increase the flexibility of structure settings. The Set-GARCH differs from Set-GARCH-LR in that the latter assumes that the two shape parameters in fuzzy sets-valued returns are dependent and follow a bivariate Gamma distribution. Maximum likelihood can be used to estimate the parameters of both the Set-GARCH and the Set-GARCH-LR models. In addition, we provide a transformation formula between the variance of fuzzy sets-valued returns and the volatility of real returns.

In the empirical applications, we compare the volatility forecasting performance of the Set-GARCH model to that of three classic GARCH-type models, three classic CARR-type models, the interval-valued ACI model, and the interval-valued Int-GARCH model using daily/weekly/monthly trading data for oil, gold, and the S&P500. The proposed Set-GARCH and Set-GARCH-LR models perform well in both in-sample and out-of-sample volatility prediction tests.

This paper also points out possible directions for future research on sets-valued time series volatility models. The first is to develop sets-valued time series models that could absorb more information on price aggregation (our model only absorbs three prices). The second is to develop an extension to multivariate sets-valued time series.

Notes

1. “Appendix A” introduces the GARCH, CARR, ACI and Int-GARCH models mentioned above in detail.
2. Li et al. (2013) provide a clear explanation of the relationship between fuzzy set-valued data and interval-valued data. When the fuzziness of fuzzy set-valued data degrades, it can be transformed into set-valued data. Set-valued data includes interval-valued data since intervals are a type of set value. Therefore, fuzzy set-valued data encompasses interval-valued data, and the computational properties of fuzzy set-valued data are equally applicable to interval-valued data.
3. Examples include the pricing of financial derivatives, which are typically based on the closing prices of stocks or commodities. Mutual fund net asset values (NAV) and performance are also often calculated using closing prices (Comerton-Forde & Putniņš, 2011; Suen & Wan, 2022). Moreover, when paired with the open price, these price levels provide crucial reference points for measuring strength and identifying key price levels to validate trade ideas or biases.
4. This is a necessary condition for type-A subtraction to hold.
5. In Eq. (A5), $\sigma _t$ is time-varying and treated as a time-varying parameter.
6. If there is a large $R_{r,t}$ on day t, the probability of a large $H_t$ on day $t-1$ will also be high, which may induce a large $R_{l,t}$ on day t. Similarly, if there is a large $R_{r,t}$ on day t, it would show that investors have a strong intention to push asset prices up.
7. In model settings with uncertainty and large sample scenarios, the normal distribution can maintain good asymptotic properties (White, 1982).
8. Note that Eq. (33) is just a linear transformation.
9. The range of [3, 5] is the same as the range of [13, 15].
10. If the price range of one day is 3 to 8, and the price range of another day is 13 to 18, then the “level” information of their ranges is the same, but the “trend” information is very different.

References

Atsalakis, G. S., Atsalaki, I. G., Pasiouras, F., & Zopounidis, C. (2019). Bitcoin price forecasting with neuro-fuzzy techniques. European Journal of Operational Research, 276(2), 770–780.
Barunik, J., Krehlik, T., & Vacha, L. (2016). Modeling and forecasting exchange rate volatility in time-frequency domain. European Journal of Operational Research, 251(1), 329–340.
Buansing, T. T., Golan, A., & Ullah, A. (2020). An information-theoretic approach for forecasting interval-valued sp500 daily returns. International Journal of Forecasting, 36(3), 800–813.
Cerqueti, R., & Fanelli, V. (2021). Long memory and crude oil’s price predictability. Annals of Operations Research, 299, 895–906.
Cerqueti, R., Giacalone, M., & Mattera, R. (2020). Skewed non-gaussian garch models for cryptocurrencies volatility modelling. Information Sciences, 527, 1–26.
Chou, R. Y. (2005). Forecasting financial volatilities with extreme values: The conditional autoregressive range (CARR) model. Journal of Money, Credit and Banking, 37, 561–582.
Comerton-Forde, C., & Putniņš, T. J. (2011). Measuring closing price manipulation. Journal of Financial Intermediation, 20(2), 135–158.
Creal, D., Koopman, S. J., & Lucas, A. (2013). Generalized autoregressive score models with applications. Journal of Applied Econometrics, 28(5), 777–795.
D’Urso, P., De Giovanni, L., & Massari, R. (2016). Garch-based robust clustering of time series. Fuzzy Sets and Systems, 305, 1–28.
Escobar-Anel, M., Rastegari, J., & Stentoft, L. (2021). Option pricing with conditional Garch models. European Journal of Operational Research, 289(1), 350–363.
Ezbakhe, F., & Pérez-Foguet, A. (2021). Decision analysis for sustainable development: The case of renewable energy planning under uncertainty. European Journal of Operational Research, 291(2), 601–613.
Furman, E. (2008). On a multivariate gamma distribution. Statistics & Probability Letters, 78(15), 2353–2360.
González-Rivera, G., & Lin, W. (2013). Constrained regression for interval-valued data. Journal of Business & Economic Statistics, 31(4), 473–490.
Gonzalez-Rivera, G., Luo, Y., & Ruiz, E. (2020). Prediction regions for interval-valued time series. Journal of Applied Econometrics, 35(4), 373–390.
Han, A., Hong, Y., Wang, S., & Yun, X. (2016). A vector autoregressive moving average model for interval-valued time series data. In: Essays in Honor of Aman Ullah (Vol. 36, pp. 417–460). Emerald Group Publishing Limited.
Hansen, P. R., & Lunde, A. (2005). A forecast comparison of volatility models: does anything beat a Garch (1, 1)? Journal of Applied Econometrics, 20(7), 873–889.
Hansen, P. R., Lunde, A., & Nason, J. M. (2011). The model confidence set. Econometrica, 79(2), 453–497.
Hassan, M. R. (2009). A combination of hidden Markov model and fuzzy model for stock market forecasting. Neurocomputing, 72(16–18), 3439–3446.
He, Y., Han, A., Hong, Y., Sun, Y., & Wang, S. (2021). Forecasting crude oil price intervals and return volatility via autoregressive conditional interval models. Econometric Reviews, 40(6), 584–606.
Hocine, A., Zhuang, Z.-Y., Kouaissah, N., & Li, D.-C. (2020). Weighted-additive fuzzy multi-choice goal programming (WA-FMCGP) for supporting renewable energy site selection decisions. European Journal of Operational Research, 285(2), 642–654.
Hukuhara, M. (1967). Integration des applications mesurables dont la valeur est un compact convexe. Funkcialaj Ekvacioj, 10(3), 205–223.
Jones, D., Firouzy, S., Labib, A., & Argyriou, A. V. (2022). Multiple criteria model for allocating new medical robotic devices to treatment centres. European Journal of Operational Research, 297(2), 652–664.
Joshi, D., & Kumar, S. (2016). Interval-valued intuitionistic hesitant fuzzy choquet integral based topsis method for multi-criteria group decision making. European Journal of Operational Research, 248(1), 183–191.
Koop, G., & Korobilis, D. (2013). Large time-varying parameter VARs. Journal of Econometrics, 177(2), 185–198.
Körner, R., & Näther, W. (2002). On the variance of random fuzzy variables. In Statistical modeling, analysis and management of fuzzy data (pp. 25–42).
Li, S., & Guan, L. (2007). Fuzzy set-valued gaussian processes and Brownian motions. Information Sciences, 177(16), 3251–3259.
Li, S., Ogura, Y., & Kreinovich, V. (2013). Limit Theorems and Applications of Set-valued and Fuzzy Set-valued Random Variables (Vol. 43). Berlin: Springer.
Lin, E. M., Chen, C. W., & Gerlach, R. (2012). Forecasting volatility with asymmetric smooth transition dynamic range models. International Journal of Forecasting, 28(2), 384–399.
Lyócsa, Š, Molnár, P., & Vỳrost, T. (2021). Stock market volatility forecasting: Do we need high-frequency data? International Journal of Forecasting, 37(3), 1092–1110.
Maia, A. L. S., & de Carvalho, Fd. A. (2011). Holt’s exponential smoothing and neural network models for forecasting interval-valued time series. International Journal of Forecasting, 27(3), 740–759.
Ma, F., Liao, Y., Zhang, Y., & Cao, Y. (2019). Harnessing jump component for crude oil volatility forecasting in the presence of extreme shocks. Journal of Empirical Finance, 52, 40–55.
Molnár, P. (2012). Properties of range-based volatility estimators. International Review of Financial Analysis, 23, 20–29.
Moussa, A. M., Kamdem, J. S., Shapiro, A. F., & Terraza, M. (2014). CAPM with fuzzy returns and hypothesis testing. Insurance: Mathematics and Economics, 55, 40–57.
Näther, W. (2001). Random fuzzy variables of second order and applications to statistical inference. Information Sciences, 133(1–2), 69–88.
Nowak, P., & Romaniuk, M. (2010). Computing option price for levy process with fuzzy parameters. European Journal of Operational Research, 201(1), 206–210.
Nuti, G., Mirghaemi, M., Treleaven, P., & Yingsaeree, C. (2011). Algorithmic trading. Computer, 44(11), 61–69.
Papanagnou, C. I., & Matthews-Amune, O. (2018). Coping with demand volatility in retail pharmacies with the aid of big data exploration. Computers & Operations Research, 98, 343–354.
Parkinson, M. (1980). The extreme value method for estimating the variance of the rate of return. Journal of Business, 53, 61–65.
Patton, A. J. (2011). Volatility forecast comparison using imperfect volatility proxies. Journal of Econometrics, 160(1), 246–256.
Suen, W., Wan, K.-M., et al. (2022). Call auction design and closing price manipulation: Evidence from the Hong Kong stock exchange. Journal of Financial Markets, 58, 100700.
Sun, S., Sun, Y., Wang, S., & Wei, Y. (2018). Interval decomposition ensemble approach for crude oil price forecasting. Energy Economics, 76, 274–287.
Sun, Y., Han, A., Hong, Y., & Wang, S. (2018). Threshold autoregressive models for interval-valued time series data. Journal of Econometrics, 206(2), 414–446.
Sun, Y., Lian, G., Lu, Z., Loveland, J., & Blackhurst, I. (2020). Modeling the variance of return intervals toward volatility prediction. Journal of Time Series Analysis, 41(4), 492–519.
Treleaven, P., Galas, M., & Lalchand, V. (2013). Algorithmic trading review. Communications of ACM, 56(11), 76–85.
Wang, L., Ma, F., Liu, J., & Yang, L. (2020). Forecasting stock price volatility: New evidence from the Garch-Midas model. International Journal of Forecasting, 36(2), 684–694.
Wang, X., Zhang, Z., & Li, S. (2016). Set-valued and interval-valued stationary time series. Journal of Multivariate Analysis, 145, 208–223.
Wang, Y., Wu, C., & Yang, L. (2016). Forecasting crude oil market volatility: A Markov switching multifractal volatility approach. International Journal of Forecasting, 32(1), 1–9.
White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica: Journal of the Econometric Society, 50, 1–25.
Wu, D., Dai, X., Zhao, R., Cao, Y., & Wang, Q. (2023). Pass-through from temperature intervals to China’s commodity futures’ interval-valued returns: Evidence from the varying-coefficient its model. Finance Research Letters, 58, 104289.
Yang, W., Han, A., Hong, Y., & Wang, S. (2016). Analysis of crisis impact on crude oil prices: A new approach with interval time series modelling. Quantitative Finance, 16(12), 1917–1928.
Zhang, Y., Ma, F., & Liao, Y. (2020). Forecasting global equity market volatilities. International Journal of Forecasting, 36(4), 1454–1475.
Zhu, B., Wan, C., Wang, P., & Chevallier, J. (2023). Forecasting carbon market volatility with big data. Annals of Operations Research, 1–27.
Zhü, K. (2014). Fuzzy analytic hierarchy process: Fallacy of the popular methods. European Journal of Operational Research, 236(1), 209–217.

Acknowledgements

The authors are grateful for the financial support from the National Social Science Fund of China (No. 21&ZD110), the National Natural Science Foundation of China (No. 52270183), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX21_0237).

Author information

Authors and Affiliations

College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, Jiangsu, China
Xingyu Dai & Qunwei Wang
Research Center for Soft Energy Science, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, Jiangsu, China
Xingyu Dai & Qunwei Wang
Department of Social and Economic Sciences, Sapienza University of Rome, Piazzale Aldo Moro, 5, Rome, 00185, Italy
Roy Cerqueti
GRANEM, University of Angers, SFR CONFLUENCES, Angers, F-49000, France
Roy Cerqueti
Royal Holloway University of London, Egham, London, TW20 0EX, UK
Ling Xiao

Corresponding author

Correspondence to Qunwei Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: The benchmark models

This Appendix illustrates the volatility models that are taken as benchmark frameworks. We treat separately the point-valued and the interval-valued models.

A.1 Point-valued volatility models

A.1.1 GARCH-type models

Classical GARCH-type models work with point-valued data. Let $P_t$ be the closing price at day t and $y_t=\log P_t-\log P_{t-1}$ the point-valued returns series. With $\mu _t$ a mean process (or simply a constant) and $\epsilon _t$ the innovation, the mean equation of GARCH-type models is

$$\begin{aligned} y_t&=\mu _t+\epsilon _t \nonumber \\ \epsilon _t&=\nu _t\sigma _t \end{aligned}$$

(A1)

Here $\sigma _t$ is the standard deviation (SD) of returns $y_t$ at day t, and $\nu _t$ is the standardized innovation. The variance equation, which describes the evolution of $\sigma _t$, varies across different GARCH-type models. We select GARCH, E-GARCH, and GJR-GARCH as the three benchmark models, and their respective specifications are listed below (Hansen & Lunde, 2005; Escobar-Anel et al., 2021).

Benchmark model 1, GARCH(1,1) model:

$$\begin{aligned} \sigma ^2_t=\omega +\alpha _1\epsilon ^2_{t-1}+\beta _1\sigma ^2_{t-1} \end{aligned}$$

(A2)

Benchmark model 2, EGARCH(1,1) model:

$$\begin{aligned} \ln \sigma ^2_t=\omega +\alpha _1(\vert \epsilon _{t-1}\vert -{\mathbb {E}}(\vert \epsilon _{t-1}\vert ))+\beta _1\ln \sigma ^2_{t-1} \end{aligned}$$

(A3)

Benchmark model 3, GJR-GARCH(1,1) model:

$$\begin{aligned} \sigma ^2_t=\omega +\alpha _1\epsilon ^2_{t-1}+\gamma _1I(\epsilon _{t-1}<0)\epsilon ^2_{t-1}+\beta _1\sigma ^2_{t-1} \end{aligned}$$

(A4)

where $I(\epsilon _t<0)=1$ if $\epsilon _t<0$ and 0 otherwise, and $\omega $, $\alpha _1$, $\beta _1$ and $\gamma _1$ are scalar parameters. Following the variance equations from Eq. (A2) to Eq. (A4), GARCH-type models use the past squared volatility $\sigma ^2_{t-i}$ and the past squared innovation $\epsilon ^2_{t-j}$ to determine the change of current volatility $\sigma ^2_t$ as

$$\begin{aligned} \sigma ^2_t=f(\sigma ^2_{t-1}, \sigma ^2_{t-2},\ldots ,{\mathbb {D}}_{t-1}(\epsilon ^2_{t-1}), {\mathbb {D}}_{t-2}(\epsilon ^2_{t-2}),\ldots ) \end{aligned}$$

(A5)

The $\sigma ^2_t$ is calculated using the information of the point-valued returns $y_t$, and only point-valued operations in the Euclidean space are used.
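To make the point-valued recursions concrete, the sketch below filters the conditional variance for the standard GARCH(1,1) and GJR-GARCH(1,1) forms, in which $\beta _1$ multiplies the lagged conditional variance. The function names, the initial variance `sigma2_0`, and the parameter values are illustrative assumptions and are not taken from the paper's estimation code.

```python
def garch_variance(eps, omega, alpha, beta, sigma2_0):
    """Filter sigma2[t] = omega + alpha*eps[t-1]**2 + beta*sigma2[t-1]."""
    sigma2 = [sigma2_0]                      # sigma2[0] is a user-supplied start
    for e in eps[:-1]:
        sigma2.append(omega + alpha * e ** 2 + beta * sigma2[-1])
    return sigma2


def gjr_variance(eps, omega, alpha, gamma, beta, sigma2_0):
    """GJR-GARCH(1,1): negative innovations add gamma*eps[t-1]**2."""
    sigma2 = [sigma2_0]
    for e in eps[:-1]:
        leverage = gamma * e ** 2 if e < 0 else 0.0   # indicator I(eps < 0)
        sigma2.append(omega + alpha * e ** 2 + leverage + beta * sigma2[-1])
    return sigma2


# Hypothetical innovations and parameters.
eps = [1.0, -1.0, 0.5]
s_garch = garch_variance(eps, omega=0.1, alpha=0.1, beta=0.8, sigma2_0=1.0)
s_gjr = gjr_variance(eps, omega=0.1, alpha=0.1, gamma=0.1, beta=0.8, sigma2_0=1.0)
```

In this example the two filters agree until the negative innovation at the second position, after which the GJR variance is higher by the leverage term.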

A.1.2 Conditional Autoregressive Range (CARR)-type models

The point-valued data structure loses a substantial amount of price information, and it can hardly serve as a proxy for diverse and extensive investor behaviors. A substantial amount of research has centered on using range-valued information (which is also point-valued data), incorporating the highest and lowest prices at day t, to estimate the volatility $\sigma ^2_t$ (Parkinson, 1980; Lin et al., 2012). Let $R_t$ be the log-price range of day t, defined as

$$\begin{aligned} R_t=\ln (\mathop {\max }\limits _{\tau }P_{\tau ,t})-\ln (\mathop {\min }\limits _{\tau }P_{\tau ,t}),\quad \tau =1,2,\ldots ,T \end{aligned}$$

(A6)

where $\tau $ indexes the intraday times of day t. The CARR model could describe the returns volatility $\sigma ^2_t$ by $\frac{R^2_t}{4\ln 2}$ (Chou, 2005; Parkinson, 1980). The mean equation for the range $R_t$ in Eq. (A6) is

$$\begin{aligned} R_t=\lambda _t\epsilon _t \nonumber \\ \epsilon \mathop {\sim }\limits _{i.i.d}(0,1) \end{aligned}$$

(A7)

Benchmark model 4, CARR(1,1) model:

$$\begin{aligned} \lambda _t=\omega +\alpha _1R_{t-1}+\beta _1\lambda _{t-1} \end{aligned}$$

(A8)

The CARR model incorporates the day’s highest and lowest prices, but discards the closing price information. In addition, CARR does not store trend information, only range information.^{Footnote 9} Two classic derivatives of the CARR model are the CARR-A and CARR-B models (Chou, 2005).

Benchmark model 5, CARR-A(1,1) model:

$$\begin{aligned} \lambda _t=\omega +\alpha _1R_{t-1}+\beta _1\lambda _{t-1}+\gamma y_{t-1}+\delta \vert y_{t-1}\vert \end{aligned}$$

(A9)

Benchmark model 6, CARR-B(1,1) model:

$$\begin{aligned} \lambda _t=\omega +\alpha _1R_{t-1}+\beta _1\lambda _{t-1}+\gamma y_{t-1} \end{aligned}$$

(A10)

where the parameters $\omega $, $\alpha _1$, $\beta _1$, $\gamma $, and $\delta $ are constants to be estimated. Theoretically, CARR-A and CARR-B absorb the highest price, the lowest price, and the closing price to describe volatility changes. The CARR-type models convert the price information set $\tilde{{\varvec{P}}}_t=\{H_t,L_t,C_t\}$ into multiple point-valued data and compute $\sigma ^2_t$ using point-valued operation rules.
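The range-based quantities in this subsection can be sketched as follows: the log-price range of Eq. (A6), the variance proxy $\frac{R^2_t}{4\ln 2}$, and the CARR(1,1) recursion of Eq. (A8). The helper names, the initial value `lam_0`, and the numbers used are hypothetical and serve only to illustrate the mechanics.

```python
import math


def log_range(high, low):
    """Log-price range of one day, as in Eq. (A6)."""
    return math.log(high) - math.log(low)


def parkinson_variance(r):
    """Range-based variance proxy R^2 / (4 ln 2)."""
    return r ** 2 / (4.0 * math.log(2.0))


def carr_filter(ranges, omega, alpha, beta, lam_0):
    """CARR(1,1): lam[t] = omega + alpha*ranges[t-1] + beta*lam[t-1]."""
    lam = [lam_0]                            # lam[0] is a user-supplied start
    for r in ranges[:-1]:
        lam.append(omega + alpha * r + beta * lam[-1])
    return lam


# Hypothetical daily ranges and parameters.
ranges = [0.02, 0.03, 0.01]
lam = carr_filter(ranges, omega=0.005, alpha=0.2, beta=0.7, lam_0=0.02)
var_proxy = [parkinson_variance(r) for r in ranges]
```

CARR-A and CARR-B would add the lagged return terms of Eqs. (A9) and (A10) inside the loop; they are left out here to keep the sketch short.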

A.2 Interval-valued volatility models

A.2.1 Interval-valued variable and debates on subtraction operation rules

Noting that the returns of the day are equal to the log-price at day t minus the log-price at day $t-1$, the interval-valued returns could be defined in a similar way. The interval-valued variable $\tilde{{\varvec{x}}}$ could be defined as $\tilde{{\varvec{x}}}=[a,b]=\{x\in {\mathbb {R}}\vert a\le x\le b,\ a,b\in {\mathbb {R}}\}$. A random interval-valued variable $\tilde{{\varvec{X}}}$ on a probability space $(\Omega , {\mathcal {F}}, {\mathcal {P}})$ is a measurable mapping $\tilde{{\varvec{X}}}:\Omega \rightarrow I_{{\mathbb {R}}}$ where $I_{{\mathbb {R}}}$ is the space of closed sets of ordered numbers in ${\mathbb {R}}$. The addition $\oplus $ and scalar multiplication $\otimes $ are

$$\begin{aligned} \tilde{{\varvec{A}}}\oplus \tilde{{\varvec{B}}}&=[\tilde{{\varvec{A}}}_L+\tilde{{\varvec{B}}}_L,\tilde{{\varvec{A}}}_R+\tilde{{\varvec{B}}}_R] \nonumber \\ a\otimes \tilde{{\varvec{A}}}&=[a\tilde{{\varvec{A}}}_L,a\tilde{{\varvec{A}}}_R] \end{aligned}$$

(A11)

where $\tilde{{\varvec{A}}}=[\tilde{{\varvec{A}}}_L,\tilde{{\varvec{A}}}_R]$, $\tilde{{\varvec{B}}}=[\tilde{{\varvec{B}}}_L,\tilde{{\varvec{B}}}_R]$, and $\tilde{{\varvec{A}}}_L$, $\tilde{{\varvec{A}}}_R$, $\tilde{{\varvec{B}}}_L$, $\tilde{{\varvec{B}}}_R$, $a\in {\mathbb {R}}$, $a\ge 0$. However, the sets-valued time series model may be affected differently depending on the subtraction rule chosen. The first kind of subtraction operation $-_{H}$ follows the Hukuhara rule (Han et al., 2016; Hukuhara, 1967; Sun et al., 2018), which is named the Type-A subtraction in this paper. In Type-A subtraction,

$$\begin{aligned} \tilde{{\varvec{A}}}-_{H}\tilde{{\varvec{B}}}=[\tilde{{\varvec{A}}}_L-\tilde{{\varvec{B}}}_L,\tilde{{\varvec{A}}}_R-\tilde{{\varvec{B}}}_R] \end{aligned}$$

(A12)

Given the highest price $H_t$ and lowest price $L_t$ on day t, the interval-valued returns $\tilde{{\varvec{R}}}_t$ under Type-A subtraction are

$$\begin{aligned} \tilde{{\varvec{R}}}_t=[L_t-L_{t-1},H_t-H_{t-1}] \end{aligned}$$

(A13)

Type-A subtraction is the inverse of addition, but it does not follow the rule for set operations, i.e., for any operation $*$, $\tilde{{\varvec{A}}}*\tilde{{\varvec{B}}}$ should be $\{A*B\mid A\in [\tilde{{\varvec{A}}}_L,\tilde{{\varvec{A}}}_R],B\in [\tilde{{\varvec{B}}}_L,\tilde{{\varvec{B}}}_R]\}$. We therefore give a second kind of subtraction, Type-B subtraction $\ominus $, as

$$\begin{aligned} \tilde{{\varvec{A}}}\ominus \tilde{{\varvec{B}}}=[\tilde{{\varvec{A}}}_L-\tilde{{\varvec{B}}}_R,\tilde{{\varvec{A}}}_R-\tilde{{\varvec{B}}}_L] \end{aligned}$$

(A14)

In this vein, the interval-valued returns $\tilde{{\varvec{R}}}_t$ of day t are

$$\begin{aligned} \tilde{{\varvec{R}}}_t=[L_t-H_{t-1},H_t-L_{t-1}] \end{aligned}$$

(A15)

As we will see in what follows, sets-valued random variables and intervals share similar characteristics, and different subtraction rules can lead to different outcomes.
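To make the contrast between the two rules concrete, the following minimal Python sketch implements Eqs. (A11), (A12), and (A14) on plain tuples. The helper names (`add`, `sub_type_a`, `sub_type_b`) and the two intervals are illustrative assumptions and do not come from the benchmark models themselves.

```python
# Illustrative sketch of interval addition and the two subtraction rules.
# Intervals are (left, right) tuples; all names and numbers are hypothetical.

def add(a, b):
    """Interval addition, Eq. (A11)."""
    return (a[0] + b[0], a[1] + b[1])

def sub_type_a(a, b):
    """Type-A (Hukuhara) subtraction, Eq. (A12): endpoint by endpoint."""
    return (a[0] - b[0], a[1] - b[1])

def sub_type_b(a, b):
    """Type-B subtraction, Eq. (A14): covers all pointwise differences x - y."""
    return (a[0] - b[1], a[1] - b[0])

A = (2.0, 5.0)
B = (1.0, 2.0)

diff_a = sub_type_a(A, B)      # (1.0, 3.0)
diff_b = sub_type_b(A, B)      # (0.0, 4.0)

# Type-A undoes addition: adding B back recovers A.
restored_a = add(diff_a, B)    # (2.0, 5.0)
# Type-B does not: adding B back gives an interval wider than A.
restored_b = add(diff_b, B)    # (1.0, 6.0)

print(diff_a, restored_a)
print(diff_b, restored_b)
```

In this toy case the pointwise differences $x-y$ with $x\in [2,5]$ and $y\in [1,2]$ range from $2-2=0$ to $5-1=4$, which is exactly the Type-B result, whereas the Type-A result $[1,3]$ leaves part of that range out.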

1.2.2 A.2.2 Autoregressive conditional interval (ACI) model

Range-valued time series retain only the “level” information of the price range, whereas the “trend” information is lost (Han et al., 2016).^{Footnote 10} Following the Type-A subtraction and the interval-valued returns definition in Eq. (A13), the autoregressive conditional interval (ACI) model describes the dynamic evolution of the interval-valued returns $\tilde{{\varvec{Y}}}_t$ (Han et al., 2016; Yang et al., 2016; Sun et al., 2018).

Benchmark model 7, ACI(1,1) model:

$$\begin{aligned} \tilde{{\varvec{Y}}}_t=\alpha _0+\beta _0\tilde{{\varvec{I}}}_0+\beta _1\tilde{{\varvec{Y}}}_{t-1}+\tilde{{\varvec{u}}}_t \end{aligned}$$

(A16)

where $\tilde{{\varvec{Y}}}_t=[\tilde{{\varvec{Y}}}_{L,t},\tilde{{\varvec{Y}}}_{R,t}]$ and $\tilde{{\varvec{I}}}_0=[-1,1]$ are interval-valued variables. The $\alpha _0$, $\beta _0$, and $\beta _1$ in Eq. (A16) are scalar values in ${\mathbb {R}}$, and $\tilde{{\varvec{u}}}_t$ is an interval-valued white noise process. The volatility $\sigma ^2_t$ on day t can be computed as $\frac{\tilde{{\varvec{Y}}}_{R,t}-\tilde{{\varvec{Y}}}_{L,t}}{4\ln 2}$. In both the evolution equation and the parameter estimation of Eq. (A16), the point-valued operation rules in Euclidean space are no longer used; instead, the operation rules of random set theory are adopted.
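As a rough illustration of how such an interval recursion could be evaluated, the sketch below computes a noise-free one-step interval and the width-based volatility proxy quoted above. It assumes that the autoregressive term acts on the previous day's interval, that $\beta _0\ge 0$ and $\beta _1\ge 0$ so the scalar multiplication of Eq. (A11) applies, and that $\alpha _0$ is treated as the degenerate interval $[\alpha _0,\alpha _0]$. The function names and all parameter values are hypothetical, not estimates from the paper.

```python
import math

def aci_one_step(y_prev, alpha0, beta0, beta1):
    """Noise-free one-step interval under an AR(1)-style reading of Eq. (A16).

    Assumes beta0 >= 0 and beta1 >= 0, treats alpha0 as [alpha0, alpha0],
    and uses the unit interval I_0 = [-1, 1].
    """
    left = alpha0 - beta0 + beta1 * y_prev[0]
    right = alpha0 + beta0 + beta1 * y_prev[1]
    return (left, right)

def aci_volatility(y):
    """Volatility proxy quoted in the text: interval width / (4 ln 2)."""
    return (y[1] - y[0]) / (4.0 * math.log(2.0))

# Hypothetical previous-day interval-valued return and parameters.
y_prev = (-0.010, 0.020)
y_hat = aci_one_step(y_prev, alpha0=0.0, beta0=0.002, beta1=0.5)
sigma2_hat = aci_volatility(y_hat)

print(y_hat, sigma2_hat)
```

Under these assumptions the width of the one-step interval is $2\beta _0+\beta _1(\tilde{{\varvec{Y}}}_{R,t-1}-\tilde{{\varvec{Y}}}_{L,t-1})$, so $\beta _0$ sets a floor on the range while $\beta _1$ carries over the previous day's range.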

1.2.3 A.2.3 Interval-valued GARCH (Int-GARCH) model

The interval-valued GARCH model (Int-GARCH) determines the variance of interval-valued returns $\tilde{{\varvec{Y}}_t}$ defined by Eq. (A15) using a GARCH-type structure and Type-B subtraction as

Benchmark model 8, Int-GARCH(1,1) model:

$$\begin{aligned} \tilde{{\varvec{Y}}_t}&= h_t\otimes \tilde{\varvec{\nu }_t}\nonumber \\ \tilde{\varvec{\nu }_t}&= [\epsilon _t-\eta _t,\epsilon _t+\eta _t]\nonumber \\ \epsilon _t&\mathop {\sim }\limits _{i.i.d}{\mathcal {N}}(0,1) \nonumber \\ \eta _t&\mathop {\sim }\limits _{i.i.d}\Gamma (k,1) \nonumber \\ h_t&=\alpha _0+\alpha _1\Vert \lambda _{t-1}\Vert +\beta _1\delta _{t-1}+\gamma _1h_{t-1} \end{aligned}$$

(A17)

where ${\mathcal {N}}$ is the normal distribution and $\Gamma $ is the univariate Gamma distribution. The Int-GARCH model uses the Type-B subtraction. Let an arbitrary return at position a in $\tilde{{\varvec{Y}}_t}$ be $ \tilde{{\varvec{Y}}_t}(a)=h_t(\epsilon _t+a\eta _t)$, $a\in [-1,1]$; the Int-GARCH volatility $\sigma _t^2$ is then the average of ${\mathbb {D}}(\tilde{{\varvec{Y}}_t}(a))$, or $\int _{-1}^1{{\mathbb {D}}(\tilde{{\varvec{Y}}_t}(a))}da$ (Sun et al., 2020). The Int-GARCH model does not absorb the closing price information.
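The position-wise dispersion can be worked out explicitly under some additional assumptions that are not stated above: ${\mathbb {D}}$ is read as the variance, $h_t$ is treated as known, and $\epsilon _t$ and $\eta _t$ are taken to be independent. Since a $\Gamma (k,1)$ variable has variance $k$, this gives ${\mathbb {D}}(\tilde{{\varvec{Y}}_t}(a))=h_t^2(1+a^2k)$. The sketch below, with hypothetical function names and values, integrates this over $a\in [-1,1]$ numerically and compares the result with the closed form $h_t^2(2+2k/3)$.

```python
# Sketch only: assumes D(.) is the variance, h_t is given, and eps_t, eta_t
# are independent, so Var(eps + a * eta) = 1 + a^2 * k for eta ~ Gamma(k, 1).

def pointwise_variance(h, a, k):
    """Variance of h * (eps + a * eta) at position a in [-1, 1]."""
    return h * h * (1.0 + a * a * k)

def integrated_variance(h, k, n=10_000):
    """Midpoint-rule approximation of the integral over a in [-1, 1]."""
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        a = -1.0 + (i + 0.5) * step
        total += pointwise_variance(h, a, k)
    return total * step

h, k = 0.01, 2.0                               # hypothetical scale and Gamma shape
integral = integrated_variance(h, k)
closed_form = h * h * (2.0 + 2.0 * k / 3.0)    # exact value of the integral
average = integral / 2.0                       # mean over the interval of length 2

print(integral, closed_form, average)
```

The integral and the average differ only by the factor 2 (the length of $[-1,1]$), so under these assumptions the averaged version equals $h_t^2(1+k/3)$.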

Appendix B: A numerical example of Eq. (13)

The log-prices of the S&P 500 index at $t=$ Oct 15, 2021 are 8.406445 ($H_t$), 8.40014 ($L_t$), and 8.40545 ($C_t$), and those on Oct 14, 2021 are 8.398349 ($H_{t-1}$), 8.386344 ($L_{t-1}$), and 8.398018 ($C_{t-1}$). The shape of $m_{\tilde{{\varvec{P}}}_t}(x)$ on Oct 14, 2021 is shown in Fig. 4. In Eq. (13), p is an a priori parameter, and different values of p can represent the attitudes of different types of investors toward risk. As shown in Fig. 4, when $p=1$, the membership of the closing-price return $R_{C,t}$ in the fuzzy set $\tilde{{\varvec{R}}}_t$ is 1. Moving toward $R_{L,t}$ and $R_{H,t}$ in $\tilde{{\varvec{R}}}_t$, the $m_{\tilde{{\varvec{P}}}_t}(x)$ value decreases nearly linearly to 0, so $\tilde{{\varvec{R}}}_t$ is close to a triangular fuzzy sets-valued random variable. When $p=2$, the $m_{\tilde{{\varvec{P}}}_t}(x)$ value increases rapidly when moving from the boundary of the support set of $m_{\tilde{{\varvec{P}}}_t}(x)$ toward $R_{C,t}$. As p continues to increase, the $m_{\tilde{{\varvec{P}}}_t}(x)$ value at the left and right ends of $\tilde{{\varvec{R}}}_t$ gradually decreases, and $\tilde{{\varvec{R}}}_t$ gradually becomes an interval-valued variable, so that it degenerates into an interval-valued random process (Sun et al., 2020).

If we eliminate the fuzziness of $\tilde{{\varvec{R}}}_t$ in Eq. (13), or set p in $\phi (x;p)$ to a very large number, $\tilde{{\varvec{R}}}_t$ degenerates into an interval-valued random variable $\tilde{{\varvec{R}}}^*$ similar to that of the Int-GARCH model, and the definition of $\tilde{{\varvec{R}}}^*$ becomes:

$$\begin{aligned} \tilde{{\varvec{R}}}^*=[L_t-H_{t-1},H_t-L_{t-1}]=[\tilde{{\varvec{P}}}_{L,t}-\tilde{{\varvec{P}}}_{R,t-1},\tilde{{\varvec{P}}}_{R,t}-\tilde{{\varvec{P}}}_{L,t-1}] \end{aligned}$$

(B18)

$\tilde{{\varvec{R}}}^*$ then coincides with $\textrm{Supp}\,\tilde{{\varvec{R}}}$. Consequently, we have selected the same Type-B subtraction rule as in the Int-GARCH model. Indeed, the Type-B subtraction used in Eqs. (13) and (B18) can give the returns value of day t (with specific membership) in all real-world trading cases based on the price information of day t and day $t-1$.
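The claim can be checked on the log-prices quoted at the start of this appendix. The short Python sketch below evaluates the Type-B support of Eq. (B18), the Type-A bounds of Eq. (A13), and the closing-price return; the variable names are illustrative assumptions.

```python
# Log-prices of the S&P 500 quoted in Appendix B (Oct 15 and Oct 14, 2021).
H_t, L_t, C_t = 8.406445, 8.40014, 8.40545
H_prev, L_prev, C_prev = 8.398349, 8.386344, 8.398018

# Type-B support, Eq. (B18): [L_t - H_{t-1}, H_t - L_{t-1}]
support = (L_t - H_prev, H_t - L_prev)

# Type-A bounds, Eq. (A13): [L_t - L_{t-1}, H_t - H_{t-1}]
type_a = (L_t - L_prev, H_t - H_prev)

# Closing-price return of day t.
r_close = C_t - C_prev

print([round(v, 6) for v in support])
print([round(v, 6) for v in type_a])
print(round(r_close, 6))
```

On this day the Type-B support is approximately $[0.001791, 0.020101]$ and contains the closing-price return of about $0.007432$. The Type-A bounds are approximately $0.013796$ and $0.008096$: the left endpoint exceeds the right one because the day-t range is narrower than the day $t-1$ range, and the closing-price return lies outside both values.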

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Dai, X., Cerqueti, R., Wang, Q. et al. Volatility forecasting: a new GARCH-type model for fuzzy sets-valued time series. Ann Oper Res (2023). https://doi.org/10.1007/s10479-023-05746-z


Received: 06 February 2023
Accepted: 21 November 2023
Published: 14 December 2023
DOI: https://doi.org/10.1007/s10479-023-05746-z
