Abstract
We carry out a large-scale empirical data analysis to examine the efficiency of the so-called pairs trading. On the basis of relevant three thresholds, namely, starting, profit taking, and stop loss for the ‘first-passage process’ of the spread (gap) between two highly correlated stocks, we construct an effective strategy to make a trade via ‘active’ stock-pairs automatically. The algorithm is applied to 1784 stocks listed in the first section of the Tokyo Stock Exchange leading up to totally 1,590,436 pairs. We are numerically confirmed that the asset management by means of the pairs trading works effectively at least for the past three years (2010–2012) data sets in the sense that the profit rate becomes positive (totally positive arbitrage) in most cases of the possible combinations of thresholds corresponding to ‘absorbing boundaries’ in the literature of first-passage processes.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Cross-correlations often provide us very useful information about financial markets to figure out various non-trivial and complicated structures behind the stocks as multivariate time series (Bouchaud and Potters 2009). Actually, the use of the cross-correlation can visualize collective behavior of stocks during the crisis. As such examples, we visualized the collective movement of the stocks by means of the so-called multidimensional scaling (MDS) during the earthquake in Japan on March 2011 (Ibuki et al. 2012a, b, 2013). We have also constructed a prediction procedure for several stocks simultaneously by means of multi-layer Ising model having mutual correlations through the mean fields in each layer (Ibuki et al. 2012a, b, 2013; Murota and Inoue 2013).
Usually, we need information about the trend of each stock to predict the price for, you might say, ‘single trading’ (Murota and Inoue 2013; Kaizoji 2000; Bouchaud 2012). However, it sometimes requires us a lot of unlearnable ‘craftsperson’s techniques’ to make a profit. Hence, it is reasonable for us to use the procedure without any trend-forecasting-type way to manage the asset with a small risk.
From the view point of time series prediction, Elliot et al. (2005) made a model for the spread and tried to estimate the state variables (spread) as hidden variables from observations by means of Kalman filter. They also estimated the hyper-parameters appearing in the model using EM algorithm (Expectation and Maximization algorithm) which has been used in the field of computer science. As an example of constructing optimal pairs, (Mudchanatongsuk 2008) regarded pair prices as Ornstein–Uhlenbeck process, and they proposed a portfolio optimization for the pair by means of stochastic control.
For the managing of assets, the so-called pairs trading (Vidyamurthy 2004; Whistler 2004; Gatev et al. 2006) has attracted trader’s attention. The pairs trading is based on the assumption that the spread between highly correlated two stocks might shrink eventually even if the two prices of the stocks temporally exhibit ‘mis-pricing’ leading up to a large spread. It has been believed that the pairs trading is almost ‘risk-free’ procedure; however, there are only a few extensive studies (Perlin 2009; Do and Faff 2010) so far to examine the conjecture in terms of big-data scientific approach.
Of course, several purely theoretical approaches based on probabilistic theory have been reported. For instance, the so-called arbitrage pricing theory (APT) Gatev et al. (2006) in the research field of econometrics has suggested that the pairs trading works effectively if the linear combination of two stocks, each of which is non-stationary time series, becomes stationary. Namely, the pair of two stocks showing the properties of the so-called co-integration (Engle and Granger 1987; Stock and Watson 1988) might be a suitable pair. However, it might cost us a large computational time to check the stationarity of the co-integration for all possible pairs in a market, whereas it might be quite relevant issue to clarify whether the pairs trading is actually safer than the conventional ‘single trading’ [see for instance (Murota and Inoue 2013)] to manage the asset, or to what extent the return from the pairs trading would be expected etc.
With these central issues in mind, here we construct a platform to carry out and to investigate the pairs trading which has been recognized an effective procedure for some kind of ‘risk-hedge’ in asset management. We propose an effective algorithm (procedure) to check the amount of profit from the pair trading easily and automatically. We apply our algorithm to daily data of stocks in the first section of the Tokyo Stock Exchange, which is now available at the Yahoo! finance web site http://finance.yahoo.co.jp. In the algorithm, three distinct conditions, namely, starting (\(\theta \)), profit-taking (\(\varepsilon \)) and stop-loss (\(\Omega \)) conditions of transaction are automatically built into the system by evaluating the spread (gap) between the prices of two stocks for a given pair. Namely, we shall introduce three essential conditions to inform us when we should start the trading, when the spread between the stock prices satisfies the profit-taking conditions, etc. by making use of a very simple way. Numerical evaluations of the algorithm for the empirical data set are carried out for all possible pairs by changing the starting, profit-taking and stop-loss conditions to look for the best possible combination of the conditions.
This paper is organized as follows. In the next Sect. 2, we introduce several descriptions for the mathematical modeling of pairs trading and set-up for the empirical data analysis by defining various variables and quantities. Here we also mention that the pairs trading is described by a first-passage process (Redner 2001), and explain the difference between our study and arbitrage pricing theory (APT) (Gatev et al. 2006) which has highly developed in the research field of econometrics. In Sect. 3, we introduce several rules of the game for the trading. We define two relevant measurements to quantify the usefulness of pairs trading, namely, winning probability and profit rate. The concrete algorithm to carry out pairs trading automatically is also given in this section explicitly. The results of empirical data analysis are reported and argued in Sect. 4. The last section is devoted to summary.
2 Mathematical descriptions and set-up
In pairs trading, we first pick up two stocks having a large correlation in the past. A well-known historical example is the pair of coca cola and pepsi cola http://en.wikipedia.org/wiki/Pairs_trade. Then, we start the action when the spread (gap) between the two stocks’ prices increases up to some amount of the level (say, \(\theta \)), namely, we sell one increasing stock (say, the stock \(i\)) and buy another decreasing one (say, the stock \(j\)) at the time \(t_{<}^{(ij)}\). We might obtain the arbitrage as a profit (gain) \(g_{ij}\):
when the spread decreases to some amount of the level (say, \(\varepsilon (> \theta )\)) again due to the strong correlation between the stocks, and we buy the stock \(i\) and sell the stock \(j\) at time \(t_{>}^{(ij)}\). We should keep in mind that we used here the stock price normalized by the value itself at \(\tau \)-times before (we may say ‘rate’) as
where we defined \(p_{i}(t)\) as a price of the stock \(i\) at time \(t\). It is convenient for us to use the \(\gamma _{i}(t)\) (or \(\gamma _{i}(t)+1\)) instead of the price \(p_{i}(t)\) because we should treat the pairs changing in quite different ranges of price. Hence, we evaluate the spread between two stocks by means of the rate \(\gamma _{i}(t)\) which denotes how much percentage of the price increases (or decreases) from the value itself at \(\tau \)-times before. In our simulation, we choose \(\tau =\text{250 } \text{[days] }\). Using this treatment (2), one can use unified thresholds \((\theta , \varepsilon , \Omega )\) which are independent of the range of prices for all possible pairs.
2.1 Pairs trading as a first-passage process
Obviously, more relevant quantities are now not the prices themselves but the spreads for the prices of pairs. It might be helpful for us to notice that the process of the spread defined by
also produces a time series \(d_{ij}(0) \rightarrow d_{ij}(1) \rightarrow \cdots \rightarrow d_{ij}(t) \rightarrow \cdots \), which is described as a stochastic process.
In financial markets, the spread (in particular, the Bid-Ask spread) is one of the key quantities for double-auction systems [for instance, see (Ibuki and Inoue 2011)] and the spread between two stocks also plays an important role in pairs trading. Especially, it should be regarded as a first-passage process (or sometimes referred to as first-exit process) [see for instance (Redner 2001; Inoue and Sazuka 2010; Sazuka et al. 2009; Inoue and Sazuka 2007; Sazuka and Inoue 2007) for recent several applications to finance] with absorbing boundaries \((\theta , \varepsilon , \Omega )\), and the times \(t_{<}^{(ij)}\) and \(t_{>}^{(ij)}\) are regarded as first-passage times. Actually, \(t_{<}^{(ij)}\) and \(t_{>}^{(ij)}\) are the times \(t\) satisfying the following for the first time
respectively. More explicitly, these times are given by
The above argument was given for somewhat an ideal case, and of course, we might lose the money just as much as
where \(t_{*} (>t_{<})\) is a ‘termination time’ satisfying
This means that we should decide a ‘loss-cutting’ when the spread \(d_{ij} (t)\) does not shrink to the level \(\varepsilon \) and increases beyond the threshold \(\Omega \) at time \(t_{*}\). It should bear in mind that once we start the trading, we get a gain or lose, hence the above \(t_{>}^{(ij)}\) and \(t_{*}^{(ij)}\) are unified as ‘time for decision’ \(\hat{t}^{(ij)}\) by
where we defined a unit step function as
Namely, if a stochastic process \(d_{ij}(t)\) firstly reaches the threshold \(\varepsilon \), the time for decision is \(\hat{t}^{(ij)}=t_{>}^{(ij)}\), whereas if the \(d_{ij}(t)\) goes beyond the threshold \(\Omega \) before shrinking to the level \(\varepsilon \), we have \(\hat{t}^{(ij)}=t_{*}^{(ij)}\).
2.2 Correlation coefficient and volatility
We already mentioned that the pairs trading is based on the assumption that the spread between highly correlated two stocks might shrink shortly even if the two prices of the stocks exhibit a temporal spread. Taking into account the assumption, here we select the suitable pairs of stocks using the information about correlation coefficient (the Pearson estimator) for pairs to quantify the correlation:
and standard deviation (volatility):
It should be noted that we also use the definition of the logarithmic return of the rescaled price \(\gamma _{i}(t)+1=p_{i}(t)/p_{i}(t-\tau +1)\) [see Eq. (2)] for the duration \(\Delta t\) in (11) by
and moving average of the observable \(A(t)\) over the duration \(\tau \) as
At first glance, the definition of (11) for correlation coefficient might look like unusual because (13) accompanying with (2) implies that the correlation coefficient consists of the second derivative of prices. However, as we already mentioned, the ‘duration’ \(\tau =1 \text{[year] }\) appearing in (2) is quite longer than \(\Delta t =\text{1 } \text{[day] }\), namely, \(\tau \gg \Delta t\) is satisfied. Hence, the price difference in (2) could not regarded as the same derivative as in the derivative definition of (13). Therefore, the definition of correlation coefficient (11) is nothing but the conventional first derivative quantity.
From the view point of these quantities \(\{\rho _{ij}(t), \sigma _{i}(t)\}\), the possible pairs should be highly correlated and the standard deviation of each stock should take the value lying in some finite range. Namely, we impose the following condition for the candidates of the pairs at time \(t_{<}^{(ij)}\), that is, \(\rho _{ij} (t_{<}^{(ij)}) > \rho _{0}\) and \({\sigma _\text{min}} < \sigma _{i}(t_{<}^{(ij)}) < \sigma _{\text{max}}\). Thus, the total number of pairs for us to carry out the pair trading (from now on, we call such pairs as ‘active pairs’) is given explicitly as
where a factor \(\Theta ({\tau_\text{max}}-\hat{t}^{(ij)})\) means that we terminate the game if ‘time of decision’ \(\hat{t}^{(ij)}\) [see Eq. (9)] becomes longer than the whole playing time \({\tau _\text{max}}=\text{1 } \text{[year] }\). Therefore, the number of active pairs is dependent on the thresholds \((\theta ,\epsilon , \Omega )\), and we see the details of the dependence in Tables 1 and 2 under the condition \(\Omega =2\theta -\varepsilon \) (see \(N_{w}+N_{l} \equiv N_{(\theta ,\varepsilon )}\) in the tables).
From Fig. 1, we are also confirmed that the number pairs \(N\) satisfying \(\rho _{ij} (t_{<}^{(ij)}) > \rho _{0}\) and \({\sigma _\text{min}} < \sigma _{i}(t_{<}^{(ij)}) < {\sigma _\text{max}}\) is extremely smaller (\(N \sim 300\)) than the number of combinations for all possible \(n\) stocks, namely, \(N \ll n(n-1)/2=1,590,436\).
2.3 Minimal portfolio and APT
It might be helpful for us to notice that the pairs trading could be regarded as a ‘minimal portfolio’ and it can obtain the profit even for the case that the stock average decreases. Actually, it is possible for us to construct such ‘market neutral portfolio’ (Livan et al. 2012) as follows. Let us consider the return of the two stocks \(i\) and \(j\) which are described as
where parameters \(\beta _{i},\beta _{j}\) denote the so-called ‘market betas’ for the stocks \(i,j\), and \(\Delta \gamma _{m}(t)\) stands for the return of the stock average, namely,
and here we select them (\(\beta _{i},\beta _{j}\)) as positive values for simplicity. On the other hand, \(q_{i}(t),q_{j}(t)\) appearing in (16), (17) are residual parts (without any correlation with the stock average) of the returns of stocks \(i,j\). Then, let us assume that we take a short position (‘selling’ in future) of the stock \(j\) by volume \(r\) and a long position (‘buying’ in future) of the stock \(i\). For this action, we have the return of the portfolio as
Hence, obviously, the choice of the volume \(r\) as
leads to
which is independent of the market (the average stock \(\Delta \gamma _{m}(t)\)). We should notice that \(\Delta \gamma _{ij}(t) = \Delta \gamma _{i}(t) -r \Delta \gamma _{j}(t)\) is rewritten in terms of the profit as follows:
Therefore, in this sense, the profit \(\Delta \gamma _{ij}(t)\) is also independent of the market \(\Delta \gamma _{m}(t)\). This empirical fact might tell us the usefulness of pairs trading.
In the arbitrage pricing theory (APT) (Vidyamurthy 2004; Gatev et al. 2006), the condition for searching suitable pairs is the linear combination of ‘non-stationary’ time series \(\gamma _{i}(t)\) and \(\gamma _{j}(t)\), \(\gamma _{i}(t)-r \gamma _{j}(t)\) becomes co-integration, namely, it becomes ‘stationary’. Then, the quantity possesses the long-time equilibrium value \(\mu \) and we write
with a small deviation \(\omega ({>}0)\) from the mean \(\mu \). Therefore, we easily find
namely, we obtain the profit with a very small risk. Hence, the numerical checking for the stationarity of the linear combination \(\gamma _{i}(t)-r \gamma _{j}(t)\) by means of, for instance, exponentially fast decay of the auto-correlation function or various types of statistical test might be useful for us to select the possible pairs. However, it computationally cost us heavily for large-scale empirical data analysis. This is a reason why here we use the correlation coefficients and volatilities to investigate the active pairs instead of the co-integration-based analysis as given in the references (Vidyamurthy 2004; Whistler 2004; Gatev et al. 2006; Engle and Granger 1987; Stock and Watson 1988).
3 Procedures of empirical analysis and ‘rules of the game’
In this section, we explain rules of our game (trading) using the data set for the past three years 2010–2012 including 2009 to evaluate the quantities like correlation coefficient and volatility in 2010 by choosing \(\tau =250\) [days]. In the following, we explain how one evaluates the performance of pairs trading according to the rules.
3.1 A constraint for the thresholds
Obviously, the ability of the asset management by pairs trading is dependent on the choice of the thresholds \((\theta , \varepsilon , \Omega )\). Hence, we should investigate how much percentage of total active pairs can obtain a profit for a given set of \((\theta , \varepsilon , \Omega )\). To carry out the empirical analysis, we define the ratio between the profit \(d_{ij}(t_{<}^{(ij)})-d_{ij}(t_{>}^{(ij)}) ({>}0)\) and the loss \(d_{ij}(t_{*}^{(ij)})-d_{ij}(t_{<}^{(ij)}) ({>}0)\) for the marginal spread, namely, \(d_{ij}(t_{<}^{(ij)})=\theta , d_{ij}(t_{>}^{(ij)})=\varepsilon , d_{ij}(t_{*}^{(ij)})=\Omega \) as
where \(\alpha ({>}0)\) is a control parameter. It should be noted that for positive constants \(\delta ,\delta ^{'}\), the gap of the spreads (profit) \(d_{ij}(t_{<}^{(ij)})-d_{ij}(t_{>}^{(ij)})\) is written as
Therefore, the difference \(\theta -\varepsilon \) appearing in the denominator of Eq. (26) gives a lower bound of the profit. Although the numerator \(\Omega -\theta \) in (26) has no such an explicit meaning; however, implicitly it might be regarded as a ‘typical loss’ because the actually realized loss \(d_{ij}(t_{*}^{(ij)})-d_{ij}(t_{<}^{(ij)})\) fluctuates around the typical value and it is more likely to take a value which is close to \(\Omega -\theta \).
Hence, for \(\alpha >1\), the loss for the marginal spread \(\Omega -\theta \) is larger than the lowest possible profit once a transaction is taken place and vice versa for \(\alpha <1\). If we set \(\alpha <1\), it is more more likely to lose the money less than the lowest bound of the profit \(\theta -\varepsilon \), however, at the same time, it means that we easily lose due to the small gap between \(\Omega \) and \(\theta \). In other words, we might frequently lose with a small amount of losses. On the other hand, if we set \(\alpha >1\), we might hardly lose; however, once we lose, the total amount of the losses is quite large. Basically, it lies with traders to decide which to choose \(\alpha >1\) or \(\alpha <1\); however, here we set the marginal \(\alpha =1\) as a ‘neutral strategy’, that is
Thus, we have now only two thresholds \((\theta , \varepsilon )\) for our pairs trading, and the \(\Omega \) should be determined as a ‘slave variable’ from Eq. (28). Actually this constraint (28) can reduce our computational time to a numerically tractable revel. Under the condition (28), we sweep the thresholds \(\theta , \varepsilon \) as \(0.01\le \theta \le 0.09\), \(0.0\le \varepsilon \le \theta \) (\(d\theta =0.01\)) and \(0.1\le \theta \le 1.0\), \(0.0\le \varepsilon \le \theta \) (\(d\theta =0.1\)) in our numerical calculations (see Tables 1 and 2).
3.2 Observables
In order to investigate the performance of pairs trading quantitatively, we should observe several relevant performance measurements. As such observables, here we define the following wining probability as a function of the thresholds:
where we defined
where \(N_{w},N_{l}\) are numbers of wins and loses, respectively, and the conservation of the number of total active pairs
should hold (see the definition of \(N_{(\theta ,\varepsilon ,\Omega )}\) in (15) under the condition \(\Omega =2\theta -\varepsilon \)). The bracket \(\ll \cdots \gg \) appearing in (30) is defined by
We also define the profit rate:
which is a slightly different measurement from the winning probability \(p_{w}\). We should notice that we now consider the case with the constraint (28) and in this sense, the explicit dependences of \(p_{w}\) and \(\eta \) on \(\Omega \) are omitted in the above descriptions. We also keep in mind that \(d_{ij}(t_{<}^{(ij)})- d_{ij}(\hat{t}^{(ij)})\) takes a positive value if we make up accounts for taking the arbitrage at \(\hat{t}^{(ij)}=t_{>}^{(ij)}\). On the other hand, the \(d_{ij}(t_{<}^{(ij)})- d_{ij}(\hat{t}^{(ij)})\) becomes negative if we terminate the trading due to loss-cutting. Therefore, the above \(\eta \) denotes a total profit for a given set of the thresholds \((\theta ,\varepsilon ).\)
3.3 Algorithm
We shall list the concrete algorithm for our empirical study on the pairs trading as follows:
-
1.
We collect a pair of stocks \((i,j)\) from daily data for the past one year.
-
2.
Do the following procedures from \(t=0\) to \(t=\tau (=250: \,{\rm the}\, {\rm number}\, {\rm of}\,{\rm daily} \,{\rm data}\, {\rm for} \,{\rm one} \,{\rm year}).\)
-
(a)
Calculate \(\rho _{ij}(t)\) and \(\sigma _{i}(t), \sigma _{j}(t)\) to determine whether the pair \((i,j)\) satisfies the start condition.
Start condition:
-
If \(\sigma _\mathrm{min} < \sigma _{i}(t), \sigma _{j}(t) < \sigma _\mathrm{max}\) and \(\rho _{ij}(t) > \rho _{0}\) and \(d(t_{<}^{(ij)}) > \theta \), go to (c).
-
If not, go to (b).
-
-
(b)
\(t \leftarrow t+1\) and back to (a).
-
(c)
Termination condition:
\(t \leftarrow t+1\) and go to the termination condition.
-
If \(d_{ij}(t_{>}^{(ij)})<\varepsilon \) (we ‘win’), go to the next pairs \((k,l) \ne (i,j).\)
-
If not, go back to (c). If \(d_{ij}(t_{*}^{(ij)})> \Omega \), we ‘lose’. If \(\hat{t}^{(ij)} > \tau _\mathrm{max}\), go to 1.
-
-
(a)
Then, we repeat the above procedure for all possible pairs of 1784 stocks listed in the first section of the Tokyo Stock Exchange leading up to totally \({}_{1784}C_{2}=1,590,436\) pairs. We play our game according to the above algorithm for each pair, and if a pair \((i,j)\) passes their decision time \(\hat{t}^{(ij)}\) resulting in the profit:
or the loss:
we discard the pair \((i,j)\) and never ‘recycle’ the pair again for pairs trading. Of course, such treatment might be hardly accepted in realistic pairs trading because traders tend to use the same pairs as the one which gave them a profit in the past markets. Nevertheless, here we shall utilize this somewhat ‘artificial’ treatment to quantify the performance of pairs trading through the measurements \(p_{w}\) and \(\eta \) systematically. We also simplify the game by restricting ourselves to the case in which each trader always makes a trade by a unit volume. In the next section, we show several result of empirical data analysis.
4 Empirical data analysis
Here we show several empirical data analyses done for all possible pairs of 1784 stocks listed in the first section of the Tokyo Stock Exchange leading up to \({}_{1784}C_{2}=1,590,436\) pairs. The daily data sets are collected for the past four years 2009–2010 from the web site http://finance.yahoo.co.jp. In our empirical analysis, we set \(\tau =250\) [days], \(\rho _{0}=0.6, \sigma _\mathrm{min}=0.05, \sigma _\mathrm{max}=0.2.\)
4.1 Preliminary experiments
Before we show our main result, we provide the two empirical distributions for the correlation coefficients and volatilities, which might posses very useful information about selecting the active pairs. We also discuss the distribution of the first-passage time to quantify the processing time roughly.
4.1.1 Correlation coefficients and volatilities
In Fig. 1, we plot the distributions of \(\{\rho _{ij}(t)\}\) (left) and \(\{\sigma _{i}(t)\}\) (right) for the past four years (2009–2012).
From the left panel, we find that the distribution of correlation coefficients is apparently skewed for all years and the degree of skewness in 2011 is the highest among the four due to the great east Japan earthquake as reported in Ibuki et al. (2012a, b, 2013). Actually, we might observe that most of stocks in the multidimensional scaling plane shrink to a finite restricted region due to the strong correlations.
On the other hand, the distribution of the volatility is almost independent of the year and possess a peak around \(0.02\). We are confirmed from these empirical distributions that the choice of the system parameters \(\rho _{0}=0.6, \sigma _\mathrm{min}=0.05, \sigma _\mathrm{max}=0.2\) could be justified properly in the sense that the number of pairs \((i,j)\) satisfying the criteria \(\rho _{ij}(t) <\rho _{0}\) and \(\sigma _\mathrm{min} <\sigma _{i}(t) < \sigma _\mathrm{max}\) is not a vanishingly small fraction but reasonable number of pairs (\(\sim 300\)) can remain in the system.
4.1.2 First-passage times
We next show the distributions of the first-passage times for the data set in 2010. It should be noted that we observe the duration \(t\) as a first-passage time from the point \(t_{<}\) in time axis, hence, the distributions of the duration \(t\) are given for
respectively. We plot the results in Fig. 2.
From the left panel, we find that one confirms the lose by loss-cutting by 50 days after the start point \(t_{<}\) in most cases, and the decision is disclosed at latest by 250 days after the \(t_{<}\). On the other hand, we win within several days after the start and a single peak is actually located in the short time frame. These empirical findings tell us that in most cases, the spread between highly correlated two stocks actually shrink shortly even if the two prices of the stocks exhibit ‘mis-pricing’ leading up to a large spread temporally. Taking into account this fact, our findings also imply that the selection by correlation coefficients and volatilities works effectively to make the pairs trading useful.
4.2 Winning probability
As our main results, we first show the wining probability \(p_{w}\) as a function of thresholds \((\theta ,\varepsilon )\) defined by (30) in Fig. 3. To show it effectively, we display the results as three-dimensional plots with contours. From these panels, we find that the winning probability is unfortunately less than that of the ‘draw case’ \(p_{w}=0.5\) in most choices of the thresholds \((\theta ,\varepsilon )\). We also find that for a given \(\theta ({>}\varepsilon )\), the probability \(p_{w}\) is almost a monotonically increasing function of \(\theta \) in all the three years. This result is naturally accepted because the trader might take more careful actions on the starting of the pairs trading for a relatively large \(\theta \).
To see the result more carefully, we write the raw data produced by our analysis in Tables 1 (2012) and 2 (2011). From these two tables, we find that relatively higher winning probabilities \(p_{w} \sim 0.7\) are observed; however, for those cases, the number of wins \(N_{w}\) (or loses \(N_{l}\)) is small, and it should be more careful for us to evaluate the winning possibility of pairs trading from those limited data sets.
4.3 Profit rate
In order to consider the result obtained by our algorithm for pairs trading from a slightly different aspect, we plot the profit rate \(\eta \) given by (33) as a function of thresholds \((\theta ,\varepsilon )\) in Fig. 4. We clearly find that for almost all of the combinations \((\theta ,\varepsilon )\), one can obtain the positive profit rate \(\eta >0\), which means that our algorithm actually achieves almost risk-free asset management and it might be a justification of the usefulness of pairs trading.
At a glance, it seems that the result of the small winning probability \(p_{w}\) is inconsistent with that of the positive profit rate \(\eta >0\). However, the result can be possible to be obtained. To see it explicitly, let us assume that the pairs \((i,j)\) and \((k,l)\) lose and the pair \((m,n)\) wins for a specific choice of thresholds \((\theta _{+},\varepsilon _{+})\). Then, the wining probability is \(p_{w}=1/3\). However, the profits for these three pairs could satisfy the following inequality:
From the definition of the profit rate (33), we are immediately conformed as
Hence, an active pair producing a relatively large arbitrage can compensate the loss of wrong active pairs by choosing the threshold \((\theta ,\varepsilon )\) appropriately. It might be an ideal scenario for the pairs trading.
Finally, we should stress that the fact \(\eta > 0\) in most cases of thresholds \((\theta ,\varepsilon )\) implies that automatic pairs trading system could be constructed by applying our algorithm for all possible \((\theta ,\varepsilon )\) in parallel. However, it does not mean that we can always obtain positive profit ‘actually’. Our original motivation in this paper is just to examine (from the stochastic properties of spreads between two stocks) how much percentage of highly correlated pairs is suitable for the candidate in pairs trading in a specific market, namely, Tokyo Stock Exchange. In this sense, our result could not be used directly for practical trading. Nevertheless, as one can easily point out, we may pare down the candidates by introducing the additional transaction cost, and even for such a case, the game to calculate the winning probability etc. by regarding the trading as a mixture of first-passage processes might be useful.
4.4 Profit rate versus volatilities
In Fig. 5, we plot the profit rate \(\eta \) against the volatilities \(\sigma \) as a scattergram only for the winner pairs.
In this plot, we set the profit-taking threshold \(\varepsilon \) as
and vary the starting threshold \(\theta \) in the range of \(0.1 <\theta < 0.3\). For each active winner pair, we observe the profit rate \(\eta \) and the average volatility \(\sigma \) of the two stocks in each pair, and plot the set \((\sigma ,\eta )\) in the two-dimensional scattergram. From this figure, we find that there exist two distinct clusters (components) in the winner pairs, namely, the winner pairs giving us the profit rate typically as much as
for the range of \(0.1 \le \theta \le 0.3\), which are almost independent of \(\sigma \), and the winner pairs having the profit rate linearly dependent on the volatility \(\sigma \). The former is a low-risk group, whereas the latter is a high-risk group. The density of the points for the low-risk group in Fig. 5 is much higher than that of the high-risk group. Hence, we are confirmed that our selection procedure of the active pairs works effectively to manage the asset as safely as possible by reducing the risk which usually increases as the volatility grows.
Finally, it should be noted that as we discussed in Sect. 3.1, the value \(\theta -\varepsilon \) is a lower bound of the profit rate [see Eqs. (27) and (41)]. Therefore, in the above case, the lower bound for the profit rate \(\eta \) should be estimated for \(0.1 \le \theta \le 0.3\) as
The lowest value for the profit rate (42) is consistent with the actually observed lowest value in the scattergram shown in Fig. 5.
4.5 Examples of winner pairs
Finally, we shall list several examples of active pairs to win the game. Of course, we cannot list all of the winner pairs in this paper, hence, we here list only three pairs as examples, each of which includes SANYO SPECIAL STEEL Co. Ltd. (ID: 5481) and the corresponding partners are HITACHI METALS. Ltd. (ID: 5486), MITSUI MINING & SMELTING Co. Ltd. (ID: 5706) and PACIFIC METALS Co. Ltd. (ID: 5541). Namely, the following three pairs
actually won in our empirical analysis of the game. Note that each ID in the above expression corresponds to each identifier used in Yahoo! Finance http://finance.yahoo.co.jp. As we expected as an example of coca cola and pepsi cola http://en.wikipedia.org/wiki/Pairs_trade, these are all the same type of industry (the steel industry). We would like to stress that we should act with caution to trade using the above pairs because the pairs just won the game in which the pairs once got a profit in the past are never recycled in future. Therefore, we need much more extensive analysis for the above pairs to use them in practice.
5 Summary
In this paper, we proposed a very simple and effective algorithm to make the pairs trading easily and automatically. We applied our algorithm to daily data of stocks in the first section of the Tokyo Stock Exchange. Numerical evaluations of the algorithm for the empirical data set were carried out for all possible pairs by changing the starting (\(\theta \)), profit-taking (\(\varepsilon \)) and stop-loss (\(\Omega \)) conditions to look for the best possible combination of the conditions \((\theta ,\varepsilon ,\Omega )\). We found that for almost all of the combinations \((\theta ,\varepsilon )\) under the constraint \(\Omega =2\theta - \varepsilon \), one can obtain the positive profit rate \(\eta >0\), which means that our algorithm actually achieves almost risk-free asset management at least for the past three years (2010–2012) and it might be a justification of the usefulness of pairs trading. Finally, we showed several examples of active pairs to win the game. As we expected before, the pairs are all the same type of industry (for these examples, it is the steel industry). We should conclude that the fact \(\eta > 0\) in most cases of thresholds \((\theta ,\varepsilon )\) implies that automatic pairs trading system could be constructed by applying our algorithm for all possible \((\theta ,\varepsilon )\) in parallel way.
Of course, the result does not mean directly that we can always obtain positive profit in a practical pairs trading. Our aim in this paper was to examine how much percentage of highly correlated pairs is suitable for the candidate in pairs trading in a specific market, namely, Tokyo Stock Exchange. In this sense, our result could not be used directly for practical pairs trading. Nevertheless, we may pare down the candidates by introducing the additional transaction cost, and even for such a case, the game to calculate the winning probability etc. by regarding the trading as a mixture of first-passage processes might be useful.
We are planning to consider pairs listed in different stock markets, for instance, one is in Tokyo and the other is in NY. Then, of course, we should also consider the effect of the exchange rate. Those analyses might be addressed as our future study.
References
Bouchaud J-P, Potters M (2009) Theory of financial risk and derivative pricing: from statistical physics to risk management, 2nd edn. Cambridge University Press
Bouchaud J-P (2012) Crisis and collective socio-economic phenomena: simple models and challenges. J Stat Phys 149(6):969–1172
Do B, Faff R (2010) Does simple Pairs trading still work? Financ Anal J 66(4):83–95
Elliot RJ, van der Hoek J, Malcolm WP (2005) Pair trading. Quant Finance 5(3):271–276
Engle RF, Granger CW (1987) Co-integration and error-correction: representation. Estim Test Econ 55(2):251–276
Gatev EG, Goetzmann WN, Rouwenhorst KG (2006) Pairs trading: performance of a relative value arbitrage rule. Rev Financ Stud 19(3):797–827 see also NBER Working Papers 7032, National Bureau of Economic Research Inc. (1999)
Ibuki T, Inoue J (2011) Response of double-auction markets to instantaneous Selling-Buying signals with stochastic Bid-Ask spread. J Econ Interact Coord 6(2):93–120
Ibuki T, Higano S, Suzuki S, Inoue J (2012) Hierarchical information cascade: visualization and prediction of human collective behaviour at financial crisis by using stock-correlation. ASE Hum J 1(2):74–87
Ibuki T, Higano S, Suzuki S, Inoue J, Chakraborti A (2013) Statistical inference of co-movements of stocks during a financialcrisis. Journal of Physics: Conference Series 473, 012008 (16 pages)
Ibuki T, Suzuki S, Inoue J (2012) Cluster analysis and gaussian mixture estimation of correlated time-series by means of multi-dimensional scaling. Econophysics of systemic risk and network dynamics, New Economic Windows, vol 2013, pp 239–259, Springer (Italy-Milan)
Inoue I, Sazuka N (2007) Crossover between Levy and Gaussian regimes in first-passage processes. Phys Rev E 76:021111 9 pages
Inoue J, Sazuka N (2010) Queueing theoretical analysis of foreign currency exchange rates. Quant Finance 10(2):121–130 (No. 10)
Kaizoji T (2000) Speculative bubbles and crashes in stock markets: an interacting-agent model of speculative activity. Phys A 287:493
Livan G, Inoue J, Scalas E (2012) On the non-stationarity of financial time series: impact on optimal portfolio selection. J Stat Mech Theory Exp. doi:10.1088/1742-5468/2012/07/P07025
Mudchanatongsuk S (2008) Optimal pairs trading: a stochastic control approach. Proc Am Control Conf 2008:1035–1039
Murota M, Inoue J (2013) Characterizing financial crisis by means of the three states random field Ising model. Econophysics of Agent-based Models, New Economic Windows, vol 2014, pp 83–98, Springer (Italy-Milan)
Perlin MS (2009) Evaluation of pairs-trading strategy at the Brazilian financial market. J Deriv Hedge Funds 15:122–136
Redner S (2001) A guide to first-passage processes. Cambridge University Press, Cambridge
Sazuka N, Inoue J (2007) Fluctuations in time intervals of financial data from the view point of the Gini index. Phys A 383:49–53
Sazuka N, Inoue J, Scalas E (2009) The distribution of first-passage times and durations in FOREX and future markets. Phys A 388(14):2839–2853
Stock JH, Watson MW (1988) Testing for common trends. J Am Stat Assoc 83(404):1097–1107
Vidyamurthy G (2004) Pairs trading: quantitative methods and analysis. Wiley Finance
Whistler M (2004) Trading pairs: capturing profits and hedging risk with statistical arbitrage strategies. Wiley Trading
Acknowledgments
This work was financially supported by Grant-in-Aid for Scientific Research (C) of Japan Society for the Promotion of Science No. 2533027803 and Grant-in-Aid for Scientific Research (B) of Japan Society for the Promotion of Science No. 26282089. We were also supported by Grant-in-Aid for Scientific Research on Innovative Area No. 2512001313. One of the authors (JI) thanks Anirban Chakraborti for his useful comments on this study at the early stage.
Conflict of interest
On behalf of all authors, the corresponding author (JI) states that there is no conflict of interest.
Author information
Authors and Affiliations
Corresponding author
About this article
Cite this article
Murota, M., Inoue, Ji. Large-scale empirical study on pairs trading for all possible pairs of stocks listed in the first section of the Tokyo Stock Exchange. Evolut Inst Econ Rev 12, 61–79 (2015). https://doi.org/10.1007/s40844-015-0002-5
Published:
Issue Date:
DOI: https://doi.org/10.1007/s40844-015-0002-5
Keywords
- Pairs trading
- Empirical data analysis
- Financial time series
- First-passage processes
- Tokyo Stock Exchange
- Econophysics