
1 Introduction

Pairs trading is a well-known strategy that was invented at Morgan Stanley in 1987 and has since been adopted by many investors. The strategy exploits the arbitrage opportunity created by a temporary anomaly between two related assets. When such an anomaly occurs, one asset becomes overvalued relative to its pair; the overvalued asset is then sold while the undervalued asset is bought. Pairs trading is a market-neutral strategy that follows a two-step process: first, identify two stocks whose prices have moved together historically; second, sell the winner and buy the loser when the price relation breaks down. If the past is a good mirror of the future, the prices of the two stocks will converge back to their mean and a profit can be made. The many studies within the pairs trading framework can be sorted into three main approaches: the distance approach, the co-integration approach and the time series approach.

Firstly, the distance method uses nonparametric distance metrics, namely the sum of squared deviations between two normalized stock prices, as the criterion for forming pairs. The most cited paper is that of Gatev et al. [8], who found that the strategy provides average annualized excess returns of up to 11% on a large sample of US equities. Later, Perlin [16] extended the analysis to examine the profitability and risk of the pairs trading strategy in the Brazilian stock market. Do and Faff [6] replicated the original methodology of Gatev et al. [8] and extended the sample period to June 2008. They confirmed that the pairs trading strategy remains profitable over a long period of time, although at a decreasing rate.

Secondly, Vidyamurthy [17] developed a co-integration approach, which identifies co-moving stocks by means of formal co-integration testing. Applications of this method to pairs trading mostly rely on the threshold rule of Gatev et al. [8]. Vidyamurthy [17] suggested a univariate co-integration approach that preselects potentially co-integrated pairs and designs the trading rule with nonparametric methods based on statistical information. Using the co-integration approach, Miao [15] provided a high-frequency and dynamic pairs trading system. For co-integrated assets in a continuous-time economy, Chiu and Wong derived the optimal pairs trading strategy as a closed-form solution.

Thirdly, the time series approach was developed by Elliott et al. [9]. It uses a Kalman filter to estimate a parametric model of the mean-reverting spread; the formation period is ignored and the spread is assumed to follow a state space model. This approach focuses on describing the mean reversion of the spread with time series methods other than co-integration. Do et al. [6] criticized the method of Elliott et al. [9] and extended it into the stochastic residual spread method.

Previous studies report that the pair spread behaves non-linearly and appears to switch across economic regimes. In addition, deviations of the spread may persist temporarily or for long periods (Bock and Mestel [2]). Non-linear models have therefore been proposed to deal with this behavior, and they were found to fit the pair spread better than linear models. Bock and Mestel [2], Yang et al. [18] and Zhu et al. [19] employed the Markov Switching model to develop trading rules for pairs trading. Chen et al. [5] proposed an alternative threshold model to capture mean and volatility asymmetries in the pair spread. All of those studies report a superior fit for the non-linear models. However, financial time series often exhibit volatility clustering, asymmetry in the conditional mean and variance, and fat-tailed distributions [5], so it is important to capture this volatility as well. To capture the dynamic volatility of the pair spread, we add the generalized autoregressive conditional heteroskedasticity (GARCH) specification of Engle [7] and Bollerslev [3] to the non-linear models. To find the optimal investment decisions, the spreads of the stock pairs are compared with predictions from the calibrated model. In this study, three non-linear models, namely the Markov Switching, threshold, and kink models, are employed to find the best prediction. We thus propose the Markov Switching AR-GARCH, threshold-AR-GARCH, and Kink-AR-GARCH models and compare their prediction performance. To the best of our knowledge, this is the first study that employs the Kink-AR-GARCH model of Boonyasana and Chinnakum [4] to develop trading rules for pairs trading.

This paper is organized as follows. Sect. 2 presents the three non-linear models with GARCH effects and the pairs trading rule. Sect. 3 describes the data and reports the estimation results. Sect. 4 discusses the performance of the pairs trading strategy. Finally, Sect. 5 presents the conclusions.

2 Methodology

In this study, we propose three non-linear models: the kink, threshold and Markov Switching models. We model the return spread of potential stock pairs with these three models, each augmented with GARCH effects, and use the upper and lower regimes of each model to find the trading entry and exit signals. We consider only lag one in our models because we aim to test for zero lag-one serial correlation. If the estimated lag-one parameter is not statistically significant, potential pair arbitrage opportunities may not exist [5]. For the variance equation, we likewise consider only the AR(1)-GARCH(1,1) specification, since it is able to reproduce the volatility dynamics of financial data and has delivered a good fit and accurate predictions in many empirical applications.

2.1 Kink Autoregressive GARCH Model

This model was proposed by Boonyasana and Chinnakum [4]. They extend the classical GARCH model of Bollerslev [3] to the kink model with an unknown threshold of Hansen [12], and the resulting model is called Kink Autoregressive GARCH (KAR-GARCH). Its main feature is that it allows structural change in both the mean and variance equations. Each equation is a continuous function, but its slope has a discontinuity at a threshold point, or kink point. The lagged data are split into two groups by an indicator function. The KAR-GARCH model can be written as

$$\begin{aligned}&{{y}_{t}}={{\phi }_{0}}+\phi _{1}^{-}({{y}_{t-1}}\le \gamma )+\phi _{1}^{+}({{y}_{t-1}}>\gamma )+{{\varepsilon }_{t}}, \end{aligned}$$
(1)
$$\begin{aligned}&{{\varepsilon }_{t}}=\sqrt{h_t}{{\eta }_{t}}, \end{aligned}$$
(2)
$$\begin{aligned}&\begin{matrix} { { h }_{ t } }=\alpha _{ 0 }^{ { } }+\alpha _{ 1 }^{ - }\varepsilon _{ t-1 }^{ 2 }({ { y }_{ t-1 } }\le \gamma )+\alpha _{ 1 }^{ + }\varepsilon _{ t-1 }^{ 2 }({ { y }_{ t-1 } }>\gamma ) \\ \,\,\,\,\,\,\,+\,\beta _{ 1 }^{ - }{ { h }_{ t-1 } }({ { y }_{ t-1 } }\le \gamma )+\beta _{ 1 }^{ + }{ { h }_{ t-1 } }({ { y }_{ t-1 } }>\gamma ), \end{matrix} \end{aligned}$$
(3)

where \({{\phi }_{0}}\), \(\phi _{1}^{-}\) and \(\phi _{1}^{+}\) are the parameters of the mean equation (1), while \(\alpha _{0}^{{}}\), \(\alpha _{1}^{-}\), \(\alpha _{1}^{+}\), \(\beta _{1}^{-}\) and \(\beta _{1}^{+}\) are the parameters of the variance equation (3). \(\gamma \) is the threshold parameter, or kink point, which defines the regimes of both the mean and variance equations through the indicator functions \(({{y}_{t-1}}>\gamma )\) for the upper regime and \(({{y}_{t-1}}\le \gamma )\) for the lower regime. The error term, \({{\varepsilon }_{t}}\), is i.i.d. noise that is assumed to follow a normal, a Student-t, or a skewed-t distribution. In the GARCH equation, the parameters are restricted to \(\alpha _{1}^{-},\alpha _{1}^{+}>0\), \(\beta _{1}^{-},\beta _{1}^{+}>0\), \(\alpha _{1}^{+}+\beta _{1}^{+}<1\) and \(\alpha _{1}^{-}+\beta _{1}^{-}<1\).

2.2 Threshold Autoregressive GARCH Model

This model was proposed by Li and Li [13]. It allows both the conditional mean and the conditional variance to vary across regimes. A general two-regime threshold autoregressive GARCH model can be expressed as

$$\begin{aligned}&\begin{matrix} {y}_{t}=\phi _{0}^{(1)}+\phi _{1}^{(1)}{y}_{t-1}+{\varepsilon }_{1t}, \qquad {y}_{t-d}\le \gamma \\ {y}_{t}=\phi _{0}^{(2)}+\phi _{1}^{(2)}{y}_{t-1}+{\varepsilon }_{2t}, \qquad {y}_{t-d}>\gamma \end{matrix}\end{aligned}$$
(4)
$$\begin{aligned}&{{\varepsilon }_{1t}}=\sqrt{{{h}_{1t}}}{{\eta }_{1t}},{{\varepsilon }_{2t}}=\sqrt{{{h}_{2t}}}{{\eta }_{2t}}, \end{aligned}$$
(5)
$$\begin{aligned}&\begin{matrix} h_{ 1t }^{ { } }=\alpha _{ 0 }^{ (1) }+\alpha _{ 1 }^{ (1) }\varepsilon _{ 1t-1 }^{ 2 }+\beta _{ 1 }^{ (1) }h_{ 1t-1 }^{ { } }\, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, { { y }_{ t-d } }\le \gamma \\ h_{ 2t }^{ { } }=\alpha _{ 0 }^{ (2) }+\alpha _{ 1 }^{ (2) }\varepsilon _{ 2t-1 }^{ 2 }+\beta _{ 1 }^{ (2) }h_{ 2t-1 }^{ { } }\, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, { { y }_{ t-d } }>\gamma \end{matrix} \end{aligned}$$
(6)

where Eqs. (4) and (6) are the conditional mean and variance equations, respectively. \(\phi _{i}^{(j)}\), \(\alpha _{i}^{(j)}\), \(\beta _{1}^{(j)}\), \(j=1,2\) are the parameters of the model in the two regimes. Here, we restrict \(\alpha _{1}^{(j)}>0\), \(\beta _{1}^{(j)}>0\) and \(\alpha _{1}^{(j)}+\beta _{1}^{(j)}<1\). \({{\varepsilon }_{t}}\) is the residual term, which is composed of the conditional variance, \({{h}_{t}}\), and the standardized residual, \({{\eta }_{t}}\). The standardized residual is assumed to follow a normal, a Student-t, or a skewed-t distribution. The movement of the observations between the regimes is controlled by the threshold variable \({{y}_{t-d}}\), where the delay parameter d is a positive integer. Since we consider only lag one, d is set to 1 and the threshold variable is \({{y}_{t-1}}\). Depending on whether \({{y}_{t-1}}\) is greater than the threshold parameter \(\gamma \) or not, the observations are separated into two groups that are estimated as different regressions, so that the model varies across regimes.
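As a minimal sketch of how Eqs. (4)-(6) operate with d = 1, the following Python function filters a return-spread series through a two-regime threshold AR(1)-GARCH(1,1) for given parameter values. The function name, the parameter layout and the initialization of the variance at the sample variance are illustrative assumptions, and for simplicity a single lagged residual and variance are carried across regimes rather than separate regime-specific paths.

```python
import numpy as np

def tar_garch_filter(y, gamma, phi, alpha0, alpha1, beta1):
    """Filter y through a two-regime threshold AR(1)-GARCH(1,1) with d = 1.

    phi[j] = (intercept, AR(1) coefficient) of regime j.
    alpha0, alpha1, beta1 hold the GARCH parameters of each regime.
    Regime 0 applies when y[t-1] <= gamma, regime 1 otherwise.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    eps = np.zeros(n)
    h = np.full(n, np.var(y))      # assumption: start at the sample variance
    regime = np.zeros(n, dtype=int)
    for t in range(1, n):
        j = 0 if y[t - 1] <= gamma else 1   # threshold variable is y[t-1]
        regime[t] = j
        # Mean equation of the active regime.
        eps[t] = y[t] - (phi[j][0] + phi[j][1] * y[t - 1])
        # Variance equation of the active regime.
        h[t] = alpha0[j] + alpha1[j] * eps[t - 1] ** 2 + beta1[j] * h[t - 1]
    return eps, h, regime
```

In an estimation routine, a function of this kind would be called inside the likelihood for each candidate value of \(\gamma \); the sketch only evaluates the recursions and does not estimate anything.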

2.3 Markov Switching Autoregressive–GARCH Model

Roughly speaking, this model combines the Markov regime-switching model proposed by Hamilton [11] with the GARCH model of Engle [7] and Bollerslev [3]. As discussed by Bauwens et al. [1], the persistence estimated in a single-regime GARCH process may result from misspecification, and they therefore introduced a way to control it using an MS-GARCH model in which the regime switches are governed by a hidden Markov chain. Our study follows the works of Haas et al. [10], Marcucci [14], and Bauwens et al. [1], who used the Markov Switching GARCH approach to better capture some stylized facts of financial time series. The general form of the Markov Switching AR-GARCH(1,1) model can be written as

$$\begin{aligned}&{{y}_{t}}={{\phi }_{0,{{S}_{t}}}}+{{\phi }_{i,{{S}_{t}}}}{{y}_{t-1}}+{{\varepsilon }_{t}},\end{aligned}$$
(7)
$$\begin{aligned}&{{\varepsilon }_{t}}=\sqrt{{{h}_{t,{{S}_{t}}}}}{{\eta }_{t}}, \end{aligned}$$
(8)
$$\begin{aligned}&h_{t,{{S}_{t}}}^{{}}={{\alpha }_{0,{{S}_{t}}}}+{{\alpha }_{1,{{S}_{t}}}}\varepsilon _{t-1}^{2}+{{\beta }_{1,{{S}_{t}}}}h_{t-1}^{{}}, \end{aligned}$$
(9)

where Eqs. (7) and (9) are the mean and variance equations, respectively. Both are regime dependent, that is, their parameters are allowed to switch across regimes. The restrictions \({{\alpha }_{1,{{S}_{t}}}}>0\), \({{\beta }_{1,{{S}_{t}}}}>0\) and \({{\alpha }_{1,{{S}_{t}}}}+{{\beta }_{1,{{S}_{t}}}}<1\) on the variance equation ensure a positive conditional variance, \(h_{t,{{S}_{t}}}^{{}}\). \({{S}_{t}}\) is the state variable that indicates the prevailing regime; it follows a first-order Markov process with constant transition probabilities Q.

$$\begin{aligned} Q=\left[ \begin{matrix} {{p}_{11}} &{} {{p}_{12}} \\ {{p}_{21}} &{} {{p}_{22}} \\ \end{matrix} \right] , \end{aligned}$$
(10)

where \({{p}_{11}}=\Pr ({{S}_{t}}=1\left| {{S}_{t-1}}=1 \right. )\), \({{p}_{22}}=\Pr ({{S}_{t}}=2\left| {{S}_{t-1}}=2 \right. )\), \({{p}_{21}}=\Pr ({{S}_{t}}=2\left| {{S}_{t-1}}=1 \right. )\), and \({{p}_{12}}=\Pr ({{S}_{t}}=1\left| {{S}_{t-1}}=2 \right. )\). To estimate the parameter set in this model, the maximum likelihood method is used and the general form of the likelihood can be defined as

$$\begin{aligned} L(\varTheta _{S_{t}}\mid y)=f(\varTheta _{S_{t}}\mid y)\Pr (S_{t}=k), \end{aligned}$$
(11)

where \(f({{\varTheta }_{{{S}_{t}}}})\) is the density function, \({{\varTheta }_{{{S}_{t}}}}=\left\{ {{\phi }_{0,{{S}_{t}}}},{{\phi }_{i,{{S}_{t}}}},{{\alpha }_{0,{{S}_{t}}}},{{\alpha }_{1,{{S}_{t}}}},{{\beta }_{1,{{S}_{t}}}} \right\} \) is the state-dependent parameter set of the model and \(Pr({{S}_{t}}=k)\) is the filtered probability of regime k. To estimate \(Pr({{S}_{t}}=k)\) we employ Hamilton's filter (Hamilton [11]), which can be written as

$$\begin{aligned} \Pr (S_{t}=k\mid \varTheta _{S_{t}})=\frac{f(y_{t}\mid S_{t}=k,\varTheta _{S_{t},t-1})\Pr (S_{t}=k\mid \varTheta _{S_{t},t-1})}{\sum \limits _{j=1}^{2}f(y_{t}\mid S_{t}=j,\varTheta _{S_{t},t-1})\Pr (S_{t}=j\mid \varTheta _{S_{t},t-1})}, \end{aligned}$$
(12)

where \(f(y_{t}\mid S_{t}=k,\varTheta _{S_{t},t-1})\) is the density function of regime k.
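The filtering recursion in Eq. (12) can be illustrated with a short Python sketch. To keep the example self-contained it assumes Gaussian regime densities with fixed means and variances instead of the full AR-GARCH dynamics, and it adopts the convention `P[i, j] = Pr(S_t = j | S_{t-1} = i)` for the transition matrix; the function name and all argument names are hypothetical.

```python
import numpy as np

def hamilton_filter(y, mu, sigma2, P, p0):
    """Filtered regime probabilities Pr(S_t = k | y_1, ..., y_t).

    mu, sigma2: per-regime mean and variance (illustrative Gaussian densities).
    P[i, j] = Pr(S_t = j | S_{t-1} = i) (convention assumed in this sketch).
    p0: initial regime probabilities.
    Returns the (T, K) array of filtered probabilities and the log-likelihood.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    P = np.asarray(P, dtype=float)
    prob = np.asarray(p0, dtype=float)
    filtered = np.zeros((len(y), len(mu)))
    loglik = 0.0
    for t, yt in enumerate(y):
        predicted = prob @ P                  # Pr(S_t = k | information up to t-1)
        dens = np.exp(-0.5 * (yt - mu) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        joint = dens * predicted              # numerator of Eq. (12)
        marginal = joint.sum()                # denominator of Eq. (12)
        prob = joint / marginal
        filtered[t] = prob
        loglik += np.log(marginal)
    return filtered, loglik
```

The accumulated `loglik` is the quantity that a maximum likelihood routine would maximize over the parameter set; replacing the fixed Gaussian densities with regime-specific AR-GARCH densities would be needed to match Eqs. (7)-(9).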

2.4 Pairs Trading

Because this study employs a pairs trading strategy, the selection of the stock pairs is very important. To select appropriate pairs, we follow Chen et al. [5] and choose the pairs with the lowest value of the Minimum Squared Distance (MSD) measure, which is given by

$$\begin{aligned} MSD=\sum \limits _{t=1}^{T}(P_{t}^{A}-P_{t}^{B})^{2}, \end{aligned}$$
(13)

where \(P_{t}^{A}\) and \(P_{t}^{B}\) are the normalized stock prices, \(P_{t}^{i}=(p_{t}^{i}-\bar{p}^{i})/sd_{i}\), and \(\bar{p}^{i}\) and \(sd_{i}\) denote the sample mean and standard deviation of the price of stock i. The five pairs with the smallest MSD among the 36 stocks are selected. For the selected pairs, we compute the return series as the differences of the logarithms of the daily closing prices. The next step is to compute the return spread, \({{y}_{t}}=\text {r}_{t}^{A}-\text {r}_{t}^{B}\), and to fit our non-linear models with GARCH effects to it. Once the best non-linear model is fitted, we apply the trading rule known as the distance method, which was first proposed by Gatev et al. [8] and is also used in Yang et al. [18].
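The pair-selection step can be sketched in a few lines of Python. The function below normalizes each price series, evaluates Eq. (13) for every pair and returns the pairs with the smallest distance. The function name, the array layout and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration.

```python
from itertools import combinations
import numpy as np

def lowest_msd_pairs(prices, names, n_pairs=5):
    """Rank all stock pairs by the squared distance of Eq. (13).

    prices: (T, N) array of closing prices, one column per stock.
    names: list of N ticker labels.
    Returns the n_pairs tuples (msd, name_a, name_b) with the smallest MSD.
    """
    prices = np.asarray(prices, dtype=float)
    # Normalize each column: subtract its mean, divide by its standard deviation.
    z = (prices - prices.mean(axis=0)) / prices.std(axis=0, ddof=1)
    scores = []
    for a, b in combinations(range(len(names)), 2):
        msd = np.sum((z[:, a] - z[:, b]) ** 2)
        scores.append((msd, names[a], names[b]))
    scores.sort(key=lambda s: s[0])
    return scores[:n_pairs]
```

With N stocks the loop visits N(N-1)/2 pairs, so for the 36 stocks used in this study every one of the 630 candidate pairs is scored before the five best are kept.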

$$\begin{aligned} \begin{matrix} { sell }\, { A }\, { buy }\, { B }\, \\ { sell }\, B\, { buy }\, A \end{matrix}\begin{matrix} = \\ = \end{matrix}\begin{matrix} \, \, { { y }_{ t } }\ge \mu +\delta { { h }_{ t } } \\ \, \, { { y }_{ t } }<\mu -\delta { { h }_{ t } } \end{matrix} , \end{aligned}$$
(14)

where \(\mu \) is the predicted value obtained from the best-fit model estimated over the pair-formation period, and \(\delta \) is set to 1.96, 1.99 and 2.3428 when the error follows a normal, Student-t, and skewed-t distribution, respectively, at the 5% significance level. Finally, the average return of a pairs trade in the position that sells stock A and buys stock B is
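A minimal Python sketch of the rule in Eq. (14) is given below. It assumes that the observed spread, the predicted value and the dispersion term \(h_{t}\) are already available as arrays; the function name and the integer coding of the signals are hypothetical choices.

```python
import numpy as np

def trading_signals(y, mu, h, delta=1.96):
    """Return +1 for 'sell A, buy B', -1 for 'sell B, buy A', 0 otherwise.

    y: observed return spread; mu: predicted spread; h: the dispersion term
    entering Eq. (14); delta: band multiplier (1.96 for normal errors).
    """
    y, mu, h = (np.asarray(v, dtype=float) for v in (y, mu, h))
    upper = mu + delta * h
    lower = mu - delta * h
    signal = np.zeros(y.shape, dtype=int)
    signal[y >= upper] = 1     # spread at or above the upper band
    signal[y < lower] = -1     # spread below the lower band
    return signal
```

Passing `delta=1.99` or `delta=2.3428` reproduces the bands used for the Student-t and skewed-t cases.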

$$\begin{aligned} {{r}_{1}}=\frac{1}{D}\left[ -\ln \frac{P_{sell}^{A}}{P_{buy}^{A}}+\ln \frac{P_{sell}^{B}}{P_{buy}^{B}} \right] . \end{aligned}$$
(15)

Likewise, the average trading return on the buy stock A and sell stock B position is given as follows

$$\begin{aligned} {{r}_{2}}=\frac{1}{D}\left[ \ln \frac{P_{sell}^{A}}{P_{buy}^{A}}-\ln \frac{P_{sell}^{B}}{P_{buy}^{B}} \right] , \end{aligned}$$
(16)

where D is the number of holding days.
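Equations (15) and (16) differ only in sign, which the following Python sketch makes explicit. The function name, the argument names and the example prices in the usage note are illustrative assumptions.

```python
import math

def pair_return(p_buy_a, p_sell_a, p_buy_b, p_sell_b, days, short_a=True):
    """Average daily log return of one pair trade, Eqs. (15)-(16).

    short_a=True corresponds to 'sell A, buy B' (r1);
    short_a=False corresponds to 'buy A, sell B' (r2).
    """
    leg_a = math.log(p_sell_a / p_buy_a)
    leg_b = math.log(p_sell_b / p_buy_b)
    r2 = (leg_a - leg_b) / days
    return -r2 if short_a else r2
```

For instance, with hypothetical prices, `pair_return(100, 90, 50, 55, days=5)` gives the return of a five-day trade in which stock A falls from 100 to 90 while stock B rises from 50 to 55, which is profitable for the 'sell A, buy B' position.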

3 Estimation Results

3.1 Data Description

The daily closing prices of 36 companies listed in the Dow Jones Industrial Average (DJIA), the New York Stock Exchange (NYSE), and NASDAQ are used as an illustration. The data, covering January 3, 2005 to December 30, 2016, are obtained from Thomson Reuters Datastream, Faculty of Economics, Chiang Mai University. Before estimation, we transform all daily prices into log-returns and apply the Augmented Dickey-Fuller (ADF) test for stationarity; all log-returns are found to be stationary in levels.

The 36 selected companies are 3M (MMM), Apple (AAPL), American Express (AXP), AT&T (T), Bank of America (BAC), Boeing (BA), Caterpillar (CAT), Chevron Corporation (CVX), Cisco Systems (CSCO), Coca-Cola (KO), DuPont (DD), ExxonMobil (XOM), General Electric (GE), Google (GOOGL), The Goldman Sachs Group (GS), Hewlett-Packard (HPQ), The Home Depot (HD), Intel (INTC), IBM (IBM), Johnson & Johnson (JNJ), JPMorgan Chase (JPM), Lowe's Companies (LOW), McDonald's (MCD), Merck (MRK), Microsoft (MSFT), Nike (NKE), PepsiCo (PEP), Pfizer (PFE), Procter & Gamble (PG), Travelers (TRV), UnitedHealth Group Incorporated (UNH), United Technologies Corporation (UTX), Verizon Communications (VZ), Wal-Mart (WMT), Walt Disney (DIS), and Yahoo (YHOO).

Prior to illustrating the pairs trading strategy, we calculate the MSD between the normalized price series of every possible stock pair. With 36 stocks, the number of possible pairs is \(36\times 35/2=630\). We then select the five stock pairs with the lowest MSD as pairs trading candidates, as presented in Table 1.

Table 1. Pair selection

We then fit the non-linear models with GARCH effects to the return spreads of these five selected pairs. Once a model is fitted, the upper and lower threshold values, which are calculated from the standard deviation of the return spread of the stock pair, are used as trading signals. Following the pairs trading literature, if the return spread rises above the upper threshold or falls below the lower threshold, we sell one stock and buy the other. Once a position is open, it is closed when the spread falls back to the standard deviation line.

3.2 Model Selection

As mentioned before, we model the return spread of potential stock pairs with the three proposed models with GARCH effects. Each two-regime model is estimated with three different error distributions, namely Normal (norm), Student-t (std), and Skew Student-t (sstd). To select the best-fitting non-linear model and distribution, we compare the models using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Table 2 provides evidence that the Markov Switching model is the best-fitting non-linear model for all pairs. The NIKE-DISNEY, TRAVELERS-3M and PEPSI-COLA pairs prefer the Student-t distribution, while the Skew Student-t distribution provides the best fit for the HOME-LOWE and DISNEY-TRAVELERS pairs. Note that the AIC and BIC do not agree for the PEPSI-COLA pair; for this pair we select the Student-t distribution according to the BIC.
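For reference, both criteria are simple functions of the maximized log-likelihood, the number of estimated parameters and the sample size. The Python sketch below uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; the function name is an assumption, and the numbers in the usage note are hypothetical and not taken from Table 2.

```python
import math

def aic_bic(loglik, n_params, n_obs):
    """Standard information criteria; lower values indicate a better trade-off."""
    aic = 2 * n_params - 2 * loglik
    bic = n_params * math.log(n_obs) - 2 * loglik
    return aic, bic
```

Because the BIC penalty `n_params * math.log(n_obs)` exceeds the AIC penalty `2 * n_params` whenever the sample has more than about seven observations, the two criteria can rank models differently, with the BIC favoring the more parsimonious one. A call such as `aic_bic(-100.0, 5, 1000)` illustrates the computation with made-up inputs.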

Table 2. Model selection
Table 3. Estimation results of MS-AR-GARCH for the five pair returns

3.3 Estimation of MS-AR-GARCH Model

Table 3 shows the estimation results of the MS-AR-GARCH(1,1) model, with a Student-t error distribution for the NIKE-DISNEY, TRAVELERS-3M and PEPSI-COLA pairs and a Skew Student-t distribution for the HOME-LOWE and DISNEY-TRAVELERS pairs. The model provides a mean equation and a variance equation for each of the two regimes. We use the variance equation to interpret the regimes, since it is important to identify which regime exhibits high volatility and which exhibits low volatility. To do so, we consider the persistence of volatility shocks in each regime. Generally, volatility persistence can be measured by the sum \({{\alpha }_{1,{{S}_{t}}}}+{{\beta }_{1,{{S}_{t}}}}\), and a higher value of this sum corresponds to a higher unconditional variance of the process. According to Table 3, we interpret the first regime for all pairs as the regime with high persistence of volatility shocks and the second regime as the low-persistence regime, since the intercept of regime 1, \(f_{0S_{t}=1}\), is higher than that of regime 2, \(f_{0S_{t}=2}\). This evidence is important for investors because investing in different periods means facing different market conditions.
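The link between persistence and unconditional variance can be made concrete with a small Python sketch. It uses the textbook result for a single-regime GARCH(1,1) process, whose unconditional variance is \(\alpha _{0}/(1-\alpha _{1}-\beta _{1})\); applying this formula regime by regime is a simplifying assumption, and the function name is hypothetical.

```python
def garch_persistence(alpha0, alpha1, beta1):
    """Persistence and unconditional variance of a GARCH(1,1) process."""
    persistence = alpha1 + beta1
    if not 0 <= persistence < 1:
        raise ValueError("alpha1 + beta1 must lie in [0, 1) for a finite variance")
    return persistence, alpha0 / (1.0 - persistence)
```

For a fixed intercept `alpha0`, the returned variance grows without bound as the persistence approaches one, which is why a regime with a larger sum of coefficients is read as the more volatile one.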

Moreover, Table 3 also reports the transition matrix. The regimes are persistent for all pairs except PEPSI-COLA: the probability of staying in the current regime is larger than 96%, while the probability of switching to the other regime is less than 4%. This indicates that only an extreme event can move the pair returns from one regime to the other.
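One way to read these probabilities is through the expected duration of a regime. For a first-order Markov chain with constant transition probabilities, the expected number of consecutive periods spent in a regime with staying probability p is 1/(1 - p). The Python helper below is a small illustration; its name is an assumption.

```python
def expected_duration(p_stay):
    """Expected number of consecutive periods spent in a regime."""
    if not 0 <= p_stay < 1:
        raise ValueError("p_stay must lie in [0, 1)")
    return 1.0 / (1.0 - p_stay)
```

A staying probability of 0.96 corresponds to an expected duration of about 25 trading days, so staying probabilities above 96% imply regimes that last longer than that on average.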

4 Pairs Trading Strategy

The estimation results of the MS-AR-GARCH(1,1) model are then used to find the upper and lower threshold values, which are calculated from the predicted value of the return spread of each stock pair and serve as trading signals. We sell stock A and buy stock B when the observed spread rises above the upper threshold around the predicted value. In contrast, we sell stock B and buy stock A when the observed spread falls below the lower threshold, see Eq. (14). Figures 1, 2, 3, 4 and 5 show the return spreads of the five pairs during the in-sample periods (1 month and 6 months). The red and green lines mark the upper and lower thresholds, respectively, which are employed as trading entry and exit signals. The returns of the pair trades obtained with this strategy are shown in Tables 4 and 5.

Fig. 1. HOME-LOWES pair return spread (Color figure online)

Fig. 2. DISNEY-TRAVELER pair return spread (Color figure online)

Fig. 3. NIKE-DISNEY pair return spread (Color figure online)

Fig. 4. TRAVELER-3M pair return spread (Color figure online)

Fig. 5. PEPSI-COLA pair return spread (Color figure online)

Table 4. Company returns in five pairs and pairs returns from December 1, 2016 to December 30, 2016
Table 5. Company returns in five pairs and pairs returns from July 1, 2016 to December 30, 2016

Table 4 shows the mean returns of the companies in the five pairs, and the mean returns of the five pairs, from December 1, 2016 to December 30, 2016. This is a one-month in-sample result. In a similar way, we calculate a six-month in-sample result from July 1, 2016 to December 30, 2016, as shown in Table 5. We also compare the pair returns with the individual stock returns. For the individual stocks, we assume that the stock is bought on the first day and sold on the last day of the trading period, so that gains or losses are calculated between the first and last day of the in-sample period. According to Table 4, our trading signals yield a positive return for all stock pairs during the in-sample period. The profits are respectively 3.0227%, 1.3453%, 3.4213%, and 6.9327% for HOME-LOWE pair, DISNEY-TRAVELERS pair, NIKE-DISNEY pair, TRAVELERS-3M pair, PEPSI-COLA pair. For comparison, we consider the individual stock returns and find that most individual stocks have lower returns than the stock pairs.

In Table 5, the in-sample period is extended to six months, and we find that our trading signals yield a positive return for all stock pairs from July 1, 2016 to December 30, 2016. Among the individual pairs, the NIKE-DISNEY pair provides the highest return, followed by the HOME-LOWE pair. When we compare the pair returns with the single-stock returns, the pairs trading strategy generates a higher return in all cases.

We conclude that our pairs trading signal generates a higher return than the mean return of the individual stocks. Thus, the trading signal computed under the Markov switching approach works well in our application.

5 Conclusions and Future Research

Financial variables exhibit extreme fluctuations during periods of turbulence and market uncertainty. They can be affected by institutional policies and by the intervention of regulatory authorities. Some studies also mention that news released by these institutions and by governments is another factor leading to structural change in the market. In this study, we aim to develop a pairs trading model that uses a non-linear approach to search for trading entry and exit signals. We employed three non-linear GARCH models, namely Markov Switching GARCH, Threshold GARCH, and Kink GARCH, and also compared the results with the linear GARCH model in order to confirm the structural change in our analysis. The trading rule was applied to an investment universe of companies in the Dow Jones Industrial Average (DJIA), New York Stock Exchange (NYSE), and NASDAQ stock markets. The comparison shows that the Markov Switching model performs slightly better than the other models for all pair returns.

The empirical results suggest that the regime-switching rule for pairs trading generates positive returns and thus offers an interesting analytical alternative to traditional pairs trading rules. In particular, the pairs trading signal generates a higher return than the mean return of the individual stocks, which indicates that the signal computed under the Markov switching approach works well in our application.

A natural line of future research is the extension of our framework to more than two Markov regimes. This, however, leads to highly parameterized models that become increasingly difficult to estimate. Estimation procedures other than our ML approach could then be implemented, for example Bayesian Markov Chain Monte Carlo (MCMC) algorithms, which have the potential to circumvent the problem of path dependence [1].