1 Introduction

In its most basic sense, arbitrage refers to a trading opportunity that exploits an inefficiency in the securities market by simultaneously buying and selling a security for a profit (Ehrman 2006). It is an opportunity to make a profit in the securities market without risk and without net investment of capital (Delbaen and Schachermayer 2006). Statistical arbitrage exploits statistically significant deviations from historically identified average price relationships. It involves taking offsetting positions in securities that are historically or mathematically related, at times when their relationship has been temporarily distorted.

Statistical arbitrage, or Stat Arb for short, has a history of being a highly profitable algorithmic trading strategy for many big hedge funds and investment banks. It originated in 1978 on the trading desk of Morgan Stanley, with a trade that went long General Motors (NYSE: GM) and short Ford (NYSE: F). It is credited to Wall Street quantitative analyst Nunzio Tartaglia, who, with a team of mathematicians, computer scientists and physicists, wanted to establish quantitative arbitrage strategies using state-of-the-art statistical tools. The approach rapidly extended to general pairs trading and, in the 1990s, broadened further to long and short market-neutral portfolios composed of groups of securities selected from stocks, ETFs, options, index futures and options on indices; it can also include commodities, fixed income and other derivatives. Examples of securities used in statistical arbitrage strategies include cross-listed stocks, futures and stock indexes, pairs of stocks, foreign currency, portfolios of stocks and pairs of foreign currencies (Zacks 2011).

One of the most frequently referred to statistical arbitrage tools is the ‘pairs trading’ strategy. Ehrman (2006) defined pairs trading as, “a nondirectional, relative-value investment strategy that seeks to identify two companies with similar characteristics whose equity securities are currently trading at a price relationship that is outside their historical trading range. This investment strategy entails buying the undervalued security while short selling the overvalued security, thereby maintaining market neutrality”. Thus, it is fundamentally a market neutral strategy, which obtains its returns from the inter-relation between the performance of its long position and short position. Basically, the performance of the strategy is governed by the relative performance and not by the absolute performance of the securities involved.

Pairs trading presupposes that while markets might not be in equilibrium, over time they move towards a rational equilibrium, and a trader has an opportunity to benefit from deviations from that equilibrium (Göncü and Akyildirim 2016). The theoretical explanation for the correlated or similar movement of security prices arises from Arbitrage Pricing Theory (APT). According to this theory, if two securities have exactly the same risk factor exposures, then the expected return of the two securities over a stated time interval is the same.

Although the pairs trading technique of statistical arbitrage is commonly used by hedge funds and investment banks, empirical studies testing the profitability of this strategy are scarce, especially in the Indian context. The most significant and extensive study testing the pairs trading strategy and its profitability is that of Gatev et al. (2006), which focused on the U.S. market. Their study matched stocks into pairs with minimum distance between normalized historical prices. Trading with this strategy over the period 1962-2002 yielded annualized risk-adjusted returns of about 11%.

Many academic studies provide frameworks for carrying out pairs trading. Vidyamurthy (2004) presented an implementation strategy based on cointegration by explaining the nexus between pricing theory and pairs trading theory. Several studies claimed that statistical arbitrage strategies based on cointegration with optimal weight allocations would generate significant abnormal annual returns of between 2.44% and 11.96% (Norden and Schaller 1993; Grobys 2012). Pole (2007), in a comprehensive review of statistical arbitrage and cointegration, stated that the portfolio was anticipated to generate a positive return as valuations converge. The mean-reversion paradigm relates specifically to securities being transiently over- or under-priced relative to one or more reference securities, and to market over-reaction (Lo and MacKinlay 1990).

Bossaerts and Green (1989) and Jagannathan and Viswanathan (1988) found that the pairs trading strategy might be supported within an equilibrium asset-pricing framework with nonstationary common factors. They argued that if the long and short components fluctuated with common nonstationary factors, then the prices of the component portfolios would be cointegrated and the pairs trading strategy would be expected to work. The existence of statistical arbitrage implies that the securities market is not in economic equilibrium, a necessary condition for an efficient market (Jarrow 1988).

The existence of arbitrage opportunities appears to contradict the efficient market hypothesis of Fama (1970), which states that securities are fairly priced in the market and that arbitrage situations cannot persist. It therefore seems paradoxical that the market is efficient on the one hand and yet offers arbitrage opportunities on the other. The plausible explanation is that investors and traders who seek arbitrage opportunities are themselves the reason markets are efficient. By pursuing and exploiting market inefficiencies, they eliminate these inefficiencies and enable prices to reach their appropriate level. Although the market is broadly efficient, there is generally a time lag in this efficiency, which explains the existence of arbitrage opportunities. The paradox is therefore only apparent.

India is a strong emerging economy. Substantial investment and trading activity from all over the world is directed to the Indian securities market, and global hedge funds now prioritize it for their emerging market investments. Many strategies and algorithms are implemented and tested in the Indian securities market, and statistical arbitrage has been a profitable quantitative strategy. Many platforms are available in India on which one can trade using statistical arbitrage techniques, but these platforms do not provide any documented evidence of profitability. This research studies the statistical arbitrage opportunities present in the Indian stock futures market through pairs trading and how profitable they are for investors. It is intended to help investors select the best statistical arbitrage strategies for investment.

2 Literature Review and Research Gap

Financial markets have been the subject of wide research across the globe. The majority of studies have been done in advanced countries such as the U.S., Italy and Japan. Of late, however, attempts to study ‘emerging markets’ like India have been on the rise. A large body of literature is available on the various statistical arbitrage opportunities in securities markets all over the world. One of the largest centers of statistical arbitrage in the early 1980s, Morgan Stanley, defines statistical arbitrage as, “a model-based investment process, which aims to build long and short portfolios whose relative value is currently different from a theoretically or quantitatively predicted value.” Pairs trading was first carried out at the Morgan Stanley trading desk. The trading group was named ‘Automated Proprietary Trading’ (or APT) by Nunzio Tartaglia, and by 1987 it was generating USD 50 million of annual profits.

After a very optimistic start, however, the trading strategy started to show negative results and the group was disbanded. Pairs trading nevertheless continued to attract practitioners (traders and investors) and academics alike. A large and still developing body of research has concentrated on the execution of pairs trading in different securities markets (Nath 2003; Hong and Susmel 2003; Gatev et al. 2006; Perlin 2009; Do and Faff 2010, 2012).

Gatev et al. (1999), in their breakthrough study, found statistically significant returns from a simple pairs trading strategy applied to the U.S. equity market over 1962-1997. The robustness of the results was confirmed with stringent estimates of transaction costs, and it was concluded that pairs trading payoffs were not particularly related to a classical mean reversion effect. Gatev et al. (2006) then extended their study to 2002 and found average annualized excess returns of up to 11%. The researchers argued that such excess returns from pairs trading strategies were essentially a reward to arbitrageurs for enforcing the law of one price.

In the Brazilian financial market, Perlin (2009) found that the pairs trading strategy had performed well, emphasizing that the positive excess returns were not the result of chance. Do and Faff (2010, 2012) used the same methodology as Gatev et al. (1999) for the period 2000–2009 and reported that the pairs trading strategy was still profitable but with a downward trend. They attributed this trend to worsening arbitrage risks and increasing market efficiency. After incorporating the impact of trading costs, they concluded that after 2002 pairs trading was largely a loss-making proposition. Using high-frequency pairs trading, Bowen et al. (2010) concluded that returns from pairs trading were highly sensitive to transaction costs and speed of execution. They also showed that most of the returns occur in the first and last hour of trading.

Mori and Ziobrowski (2011), in order to expand the scope of pairs trading, contrasted the implementation of pairs trading strategies in the U.S. REIT (real estate investment trust) market and the equity market during 1987–2008. They found that the REIT market gave excess returns during 1993–2000, which vanished later on. Alsayed and McGroarty (2012) found pairs trading to be a significant price-correcting mechanism in the ADR (American depository receipt) market. UK stocks and ADRs were used to form pairs, and the pairs trading strategy generated a 1.45% excess return over the risk-free return. In the Finnish stock market, Broussard and Vaihekoski (2012) used various weighting structures and trade initiation conditions to test the profitability of the pairs trading strategy. They established that the returns from a pairs trading strategy were not related to market risk and that the returns could be increased by reducing the threshold for opening a pair.

In the Indian context, Aggarwal and Gupta (2015) carried out pairs trading using futures contracts traded on financial stock futures. They found that excess returns of 3.71% were generated by a pairs trading portfolio using the methodology of Gatev et al. (2006) with a maximum holding period of 2 weeks. Zargar and Kumar (2019), in their work “Opening Noise in the Indian Stock Market: Analysis at Individual Stock Level”, suggested that traders in the Indian stock market should short a stock at the beginning of the day and go long the same stock at the end of the day when the overnight return (ONR) is positive, and do the reverse when the ONR is negative, that is, buy at the beginning and sell at the end. Thomaidis and Kondakis (2006) studied the stocks of Infosys and WIPRO from the Indian stock markets and proposed an intelligent combination of neural network theory and financial statistics for the identification of statistical arbitrage opportunities in specific pairs of stocks.

The recent literature on statistical arbitrage stresses the optimization of different phases of the pairs trading strategy and the control of the variables that influence its profitability. Interesting models were proposed by Huck (2010), Xie and Wu (2013) and Göncü and Akyildirim (2016) in this direction. Many researchers studied modifications of the pairs trading methodology employed by Gatev et al. (2006). For example, Elliott et al. (2005) used a ‘Gaussian Markov chain model’ for measuring and exploiting the spread, while Do et al. (2006) captured the spread using theoretical asset pricing methods and mean reversion.

While Vidyamurthy (2004), Burgess (2005) and Haque and Haque (2014) applied cointegration for pairs selection, Papadakis and Wisocky (2007) expanded the scope of methodology used by Gatev et al. (2006) by testing the influence of accounting information events (that is, analyst forecasts and earnings announcements) on the amount of returns of the pairs trading strategy.

Various academic studies proposed frameworks for applying pairs trading rather than giving empirical evidence of its effectiveness. Vidyamurthy (2004), for example, explained the link between pairs trading and pricing theory and proposed an execution strategy based on cointegration. Elliott et al. (2005) proposed an analytical framework for pairs trading by applying a mean-reverting Gaussian Markov chain model. Huck (2010) applied multi-criteria decision tools to the selection of pairs. Huck and Afawubo (2015), using the components of the S&P 500 index, examined the profitability of a pairs trading strategy under various pairs selection methods. They found that, after controlling for risk and transaction costs, the distance method generated insignificant excess returns, and concluded that the cointegration approach provided a stable, high and robust return.

The research gap is evident: although many studies have examined statistical arbitrage, the literature lacks a comprehensive study of stock futures using different statistical arbitrage methodologies in the Indian context. This study extends the aforementioned literature by studying statistical arbitrage opportunities in the Indian stock futures market and their profitability from the retail investor’s point of view. While market efficiency may hold in the long run, in practice markets do present a number of short-term opportunities, and hence it is important to know how such opportunities can be exploited. This study intends to add to the existing research on Indian securities markets and to provide useful suggestions to practitioners. Thus, the study has both academic and practical significance.

3 Data and Methodology

Pairs trading involves identifying two assets that share a long-term relationship and taking long-short positions when prices diverge from that relationship. The methodology closely follows the work of Gatev et al. (2006), wherein pairs trading is carried out over two periods: a pairs portfolio formation period, immediately followed by a trading period.

The population of the study includes stocks listed on the National Stock Exchange (NSE) in the cash and futures segments. The sample for the study is stocks traded in the NSE futures segment. The basis of sample selection is that the Indian market does not permit interday naked short selling in the cash segment. Shorting in the spot market has to be done on an intraday basis only, while shorting a stock in the futures segment has no such restriction. The stocks have been selected for the analysis on the basis of trading volume, taking the top quintile of all the stocks traded in the futures and options (F&O) segment. NSE offers stock futures contracts with maturities of one, two and three months. However, only one-month futures contracts have been used, as volumes tend to be lower in longer maturities. The most liquid stocks are taken so that transaction costs are lower. Daily stock futures prices have been taken from the NSE database. The list of stocks selected for the study is given in the appendix.

The study covers the period from Jan 1, 2011 to Dec 31, 2017, which spans several market upturns and downturns as well as relatively calm and volatile periods. Pairs are identified over a formation period (training period) of twelve months and traded over the following six-month trading period, on a rolling basis. The first formation period is from Jan 1, 2011 to Dec 31, 2011, followed by a trading period from Jan 1, 2012 to Jun 30, 2012. The second formation period is from Jul 1, 2011 to Jun 30, 2012, followed by a trading period from Jul 1, 2012 to Dec 31, 2012, and so on. Hence there are 12 formation periods in total, each followed by a trading period. The formation and trading period lengths of 12 and 6 months respectively are chosen based on prior studies.
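The rolling schedule described above can be reproduced with a few lines of Python. This is a minimal sketch rather than the authors' code; the helper names `add_months` and `rolling_windows` are hypothetical, and the only inputs taken from the text are the start date, the 12-month and 6-month lengths, and the count of 12 windows.

```python
from datetime import date, timedelta


def add_months(d, months):
    """Return the first day of the month that lies `months` after the month of `d`."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def rolling_windows(first_formation_start, n_windows,
                    formation_months=12, trading_months=6):
    """Build (formation, trading) date ranges that roll forward by one trading period."""
    windows = []
    for k in range(n_windows):
        f_start = add_months(first_formation_start, k * trading_months)
        t_start = add_months(f_start, formation_months)
        t_end_exclusive = add_months(t_start, trading_months)
        windows.append({
            "formation": (f_start, t_start - timedelta(days=1)),
            "trading": (t_start, t_end_exclusive - timedelta(days=1)),
        })
    return windows


windows = rolling_windows(date(2011, 1, 1), n_windows=12)
for w in windows:
    print(w["formation"], w["trading"])
```

Consecutive formation periods overlap by six months, and the twelfth trading period ends on Dec 31, 2017, which matches the study window.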

The methodology used is concerned with three major points: identifying two securities for pairs trading, how to trigger a long/short market neutral position and performance measurement (Desai et al. 2012).

The pairs are formed using two criteria: the distance approach (popularly known as the Gatev methodology) and the cointegration approach. The distance approach involves choosing a pair of securities that minimizes the sum of squared deviations between the two normalized price series. The pairs are ordered on the basis of distance. After all the securities have been paired according to the least-distance criterion, the top 5, top 10 and top 20 pairs with the smallest historical distance measure are studied. The Engle and Granger (1987) approach has been used to find cointegrating pairs of securities. A regression is performed on the two non-stationary series of stock prices. Let \(P_{1,t}\) and \(P_{2,t}\) be the prices of stocks 1 and 2 at time t; then the regression of \(P_{1,t}\) against \(P_{2,t}\) is:

$$P_{1,t} - \beta P_{2,t} = \mu + \varepsilon_{t}$$
(1)

where \(\mu\) denotes the intercept and \(\beta\) the slope of the regression (the cointegration coefficient). Potential cointegration between the two stocks is examined using the order of integration of the residuals \(\varepsilon_{t}\); the stocks are cointegrated if the residuals of the regression are stationary. Cointegration has been tested pair-wise in R using the ‘egcm’ package, with the command ‘allpairs.egcm’, which performs cointegration tests for all pairs of securities in a list. Fig. 1 depicts the approaches used to form pairs for the purpose of pairs trading.
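To make the two-step logic concrete, the sketch below fits a regression of \(P_{1,t}\) on \(P_{2,t}\) with an intercept and a slope (named `beta` here), and then computes a simple Dickey–Fuller-type statistic on the residuals. It is an illustrative Python sketch, not the ‘egcm’ implementation used in the study: the function names and price values are hypothetical, the unit-root regression omits lags and a constant, and the statistic must be compared with residual-based cointegration critical values, which the sketch does not supply.

```python
def ols_fit(y, x):
    """Ordinary least squares for y = mu + beta * x + e; returns (mu, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    mu = mean_y - beta * mean_x
    return mu, beta


def engle_granger_residuals(p1, p2):
    """Step 1: regress p1 on p2 and return (mu, beta, residuals)."""
    mu, beta = ols_fit(p1, p2)
    residuals = [a - mu - beta * b for a, b in zip(p1, p2)]
    return mu, beta, residuals


def dickey_fuller_stat(e):
    """Step 2: t-statistic of rho in  e[t] - e[t-1] = rho * e[t-1] + u[t]."""
    lagged = e[:-1]
    delta = [e[t] - e[t - 1] for t in range(1, len(e))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, delta)) / sxx
    errors = [d - rho * l for l, d in zip(lagged, delta)]
    s2 = sum(u * u for u in errors) / (len(delta) - 1)
    return rho / (s2 / sxx) ** 0.5


# Hypothetical prices: p1 is roughly 5 + 2 * p2 plus small alternating noise.
p2 = [100, 102, 101, 105, 107, 106, 110, 112]
noise = [0.5, -0.5, 0.3, -0.3, 0.4, -0.4, 0.2, -0.2]
p1 = [5 + 2 * b + n for b, n in zip(p2, noise)]

mu, beta, residuals = engle_granger_residuals(p1, p2)
stat = dickey_fuller_stat(residuals)
print(round(beta, 3), round(stat, 2))
```

A strongly negative statistic points towards stationary residuals. In practice a dedicated routine that handles lag selection and critical values would be used instead of this hand-rolled version.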

Fig. 1 Portfolio formation approaches for pairs trading
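The distance criterion can be sketched as follows. This is a hedged illustration in Python: the text specifies only that the sum of squared deviations between normalized price series is minimized, so dividing each series by its first price is an assumption made here, and the tickers and prices are hypothetical.

```python
from itertools import combinations


def normalize(prices):
    """Rescale a price series so that it starts at 1 (assumed normalization)."""
    return [p / prices[0] for p in prices]


def distance(prices_a, prices_b):
    """Sum of squared deviations between two normalized price series."""
    na, nb = normalize(prices_a), normalize(prices_b)
    return sum((a - b) ** 2 for a, b in zip(na, nb))


def rank_pairs(price_table, top_n):
    """Return the top_n pairs with the smallest formation-period distance."""
    scored = [
        (distance(price_table[a], price_table[b]), a, b)
        for a, b in combinations(sorted(price_table), 2)
    ]
    scored.sort()
    return scored[:top_n]


# Hypothetical formation-period prices for three placeholder tickers.
price_table = {
    "AAA": [100, 102, 104, 103],
    "BBB": [50, 51, 52, 51.5],
    "CCC": [200, 190, 210, 220],
}
best = rank_pairs(price_table, top_n=1)[0]
print(best)
```

Calling `rank_pairs` with `top_n` set to 5, 10 or 20 would give portfolios of the sizes studied. Under a sector restriction, the same function could be applied to each sector's price table separately.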

The analysis has been divided into two parts: with and without sector restriction. Each category has been further analyzed including and excluding transaction costs, as presented in Fig. 2:

Fig. 2 Classification of pairs trading analysis

The pairs of stocks are formed with and without sector restriction. Sector means a group of securities that share a common line of business—or set of risk factors—and are therefore expected to perform similarly to one another (Hoffstein 2017). Without sector restriction, a stock can be paired with any other stock from the entire sample of stocks, but with sector restriction, both stocks of a pair belong to the same sector, say, banking or energy.

The strategy maintains Indian Rupee (INR) neutrality, which refers to taking long and short positions of equal value so that the INR exposure is equal on each side of the portfolio. By employing INR neutrality in a market-neutral strategy, an investor ensures that her net INR exposure to market swings is zero. As futures contracts are available in predetermined lot sizes, the numbers of lots held long and short are chosen so that the monetary values of the long and short positions are as close as possible.
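One way to pick lot counts under INR neutrality is a small brute-force search over lot combinations, sketched below in Python. The function name, the `max_lots` cap and the example prices and lot sizes are hypothetical. The relative-gap measure is also an assumption, since the text says only that the two monetary values should be as close as possible.

```python
def neutral_lots(price_long, lot_long, price_short, lot_short, max_lots=5):
    """Search lot counts that bring long and short notional values closest.

    Returns (relative_gap, n_long, n_short, long_value, short_value).
    """
    best = None
    for n_long in range(1, max_lots + 1):
        for n_short in range(1, max_lots + 1):
            long_value = n_long * lot_long * price_long
            short_value = n_short * lot_short * price_short
            gap = abs(long_value - short_value) / max(long_value, short_value)
            # Strict "<" keeps the smallest lot counts when several tie.
            if best is None or gap < best[0]:
                best = (gap, n_long, n_short, long_value, short_value)
    return best


# Hypothetical contracts: INR 500 x 1000 shares per lot vs INR 1250 x 600 per lot.
gap, n_long, n_short, long_value, short_value = neutral_lots(500, 1000, 1250, 600)
print(n_long, n_short, long_value, short_value, gap)
```

Here one lot is worth INR 500,000 on the long side and INR 750,000 on the short side, so three long lots against two short lots balance exactly at INR 1,500,000 each.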

Trading positions are opened on the basis of a standard deviation metric. The mutual mispricing between the two securities is captured by the notion of spread. The greater the spread, the higher the magnitude of mispricing and the greater the profit potential. Table 1 describes the various trading strategies employed.

Table 1 Pairs trading strategies

The stocks are traded under the following assumptions and guidelines. The z-score is calculated for the log of the price ratio of the pair, using the mean and standard deviation of the log ratio during the formation period. Conditions are set on the z-score to trigger trade orders: positions are taken when the z-score reaches + 2 or − 2. Once the condition is met, the overperforming stock is sold short and the underperforming one is bought. Stop-loss (SL) and take-profit (TP) parameters are set at approximately 2% and 4% of the trading value, respectively, and a trade is closed once its loss or profit reaches the predefined SL or TP level. These levels have been chosen to avoid sub-par returns, prioritize safety of capital and increase the likelihood of executing profitable trades.
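The trigger and exit rules above can be illustrated with a short Python sketch. It is not the authors' implementation: the function names and prices are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator is used), and a strict inequality is assumed for the ±2 entry threshold.

```python
import math
import statistics


def formation_stats(prices_a, prices_b):
    """Mean and sample standard deviation of log(price_a / price_b) in formation."""
    log_ratio = [math.log(a / b) for a, b in zip(prices_a, prices_b)]
    return statistics.mean(log_ratio), statistics.stdev(log_ratio)


def z_score(price_a, price_b, mean, std):
    """Standardize the current log price ratio with formation-period statistics."""
    return (math.log(price_a / price_b) - mean) / std


def entry_signal(z, threshold=2.0):
    """Short the outperformer and buy the underperformer beyond the threshold."""
    if z > threshold:
        return "short_a_long_b"
    if z < -threshold:
        return "long_a_short_b"
    return None


def exit_reason(trade_return, stop_loss=0.02, take_profit=0.04):
    """Close the trade once the loss or profit reaches the SL or TP level."""
    if trade_return <= -stop_loss:
        return "stop_loss"
    if trade_return >= take_profit:
        return "take_profit"
    return None


# Hypothetical formation-period prices for stocks A and B.
mean, std = formation_stats([100, 101, 99, 100, 102], [100, 100, 100, 100, 100])
for price_a in (105, 100, 96):
    z = z_score(price_a, 100, mean, std)
    print(price_a, round(z, 2), entry_signal(z))
```

Under the standard deviation trading approach, `exit_reason` would be replaced by a check that the z-score has fallen back to one standard deviation. The combined approach would close on whichever condition is met first.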

The bid-ask spread is ignored. Initial margin is assumed to be 15% in all cases as per the common norms of various brokerage houses for futures trading and it is same for both buy and sell sides. Only one lot is traded at a time and signals are ignored while the trade is on. If prices do not cross before the end of the trading interval, gains or losses are calculated at the end of the last trading day of the trading interval. The duration of the strategy is six months on a rolling basis.

To measure profitability from the retail trader's point of view, transaction costs are factored in. Transaction costs are averaged on the basis of the trading expenses tracked through various stock brokers in India (Zerodha, Angel Broking, Motilal Oswal, ICICI Direct, HDFC Securities). Transaction costs include brokerage charges, securities transaction tax, exchange transaction charges, SEBI charges, GST/service tax (before the GST regime) and stamp duty charges. The Sharpe ratio (Sharpe 1994) has been used to calculate risk-adjusted returns. The Sharpe measure of portfolio performance (designated \(S\)) is stated as follows:

$$S_{i} = \frac{{\bar{R}_{i} - \bar{R}_{f} }}{{\sigma_{i} }}$$
(2)

where, \(\bar{R}_{i}\) = the average rate of return for pairs portfolio i during a specific time period, \(\bar{R}_{f}\) = the average rate of return on a risk-free investment during the same time period, \(\sigma_{i}\) = the standard deviation of the rate of return of Portfolio i during the time period.
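As a numerical illustration of Eq. (2), the following Python sketch computes the ratio from a list of period returns. The return figures are hypothetical, and the sample standard deviation is assumed for \(\sigma_{i}\).

```python
import statistics


def sharpe_ratio(portfolio_returns, risk_free_returns):
    """(mean portfolio return - mean risk-free return) / std of portfolio returns."""
    mean_excess = (statistics.mean(portfolio_returns)
                   - statistics.mean(risk_free_returns))
    return mean_excess / statistics.stdev(portfolio_returns)


# Hypothetical six-monthly returns for a pairs portfolio and a risk-free asset.
portfolio = [0.10, 0.20, 0.15]
risk_free = [0.03, 0.03, 0.03]
ratio = sharpe_ratio(portfolio, risk_free)
print(round(ratio, 2))
```

With a mean return of 15%, a mean risk-free return of 3% and a standard deviation of 5%, the ratio is 0.12 / 0.05 = 2.4.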

The returns are then attributed using the Fama–French three-factor asset pricing model (Fama and French 1993). The model is used in this study to present a finer analysis of the underlying risk in financial assets. It takes into account not only market risk, as in the Capital Asset Pricing Model (CAPM), but also size and value risk. Using thousands of random stock portfolios, Fama and French tested their model and found that when the size and value factors are combined with the beta factor, they could explain as much as 95% of the return in a diversified stock portfolio. The model is as follows:

$$R_{e} = \alpha + b\left( {R_{m} - R_{f} } \right) + S\left( {SMB} \right) + h\left( {HML} \right) + e$$
(3)

where, \(R_{e}\) = excess return on pairs portfolio, \(R_{m}\) = returns from market portfolio, \(R_{f}\) = risk free rate of return, \(SMB\) = size risk: the difference between small and big stocks, \(HML\) = value risk: the difference between value and growth stocks, \(\alpha\) = intercept showing value added by the pairs strategy, \(b\) = measure of exposure of pairs portfolio to systematic market risk, \(S\) = measure of exposure of pairs portfolio to size risk (SMB), \(h\) = measure of exposure of pairs portfolio to value risk (HML).
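Eq. (3) is an ordinary least squares regression with an intercept and three regressors. The Python sketch below estimates \(\alpha\), \(b\), \(S\) and \(h\) by solving the normal equations. It is an illustration under stated assumptions rather than the procedure used in the study: the function names and factor values are hypothetical, the factors are assumed not to be collinear, and standard errors and significance tests are not computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fama_french_fit(excess_returns, market_excess, smb, hml):
    """OLS estimates for R_e = alpha + b (R_m - R_f) + S SMB + h HML + e."""
    rows = [[1.0, mkt, s, h] for mkt, s, h in zip(market_excess, smb, hml)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, excess_returns)) for i in range(k)]
    alpha, b, s, h = solve_linear_system(xtx, xty)
    return {"alpha": alpha, "b": b, "S": s, "h": h}


# Hypothetical factor returns; excess returns are built from known coefficients
# so that the fit can be checked against them.
mkt = [0.02, -0.01, 0.03, 0.00, 0.05, -0.02]
smb = [0.01, 0.02, -0.01, 0.00, 0.03, -0.02]
hml = [-0.01, 0.01, 0.02, 0.03, -0.02, 0.00]
y = [0.01 + 0.1 * m - 0.2 * s + 0.3 * h for m, s, h in zip(mkt, smb, hml)]

fit = fama_french_fit(y, mkt, smb, hml)
print({name: round(value, 4) for name, value in fit.items()})
```

Because the example returns are generated without noise, the fit recovers the coefficients used to build them. With real data, the intercept \(\alpha\) is the quantity interpreted as the value added by the pairs strategy.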

The standard methodology has been used for calculation of different risk factors (Agarwalla et al. 2013).

4 Results and Discussion

This section examines different dimensions:

  • Does pairs trading generate profits without transaction costs?

  • Are pairs trading profits robust to transaction costs?

  • Does one portfolio or one trading strategy dominate the others?

  • Does pairs trading outperform the market?

  • Is pairs trading still profitable after accounting for traditional risk factors?

4.1 Pairs Trading Without Sector Restriction

In the following section, the summary statistics for the pairs trading strategy have been provided. Tables 2, 3 and 4 summarize the average six-monthly returns for the pairs portfolios without sector restriction. They are unrestricted in the sense that the matching stocks do not necessarily belong to the same broad industry category.

Table 2 Descriptive statistics for six-monthly returns from profit and loss trading approach without sector restriction
Table 3 Descriptive statistics for six-monthly returns from standard deviation trading approach without sector restriction
Table 4 Descriptive statistics for six-monthly returns from combined trading approach without sector restriction

Table 2 shows the summary statistics of the profit and loss trading approach, in which positions are opened when the log of the price ratio deviates by more than two standard deviations from its formation-period mean in either direction and are squared off when returns reach either the stop-loss or the take-profit parameter. The non-significant Shapiro–Wilk statistic is consistent with normally distributed returns. The portfolio of the five best pairs using the distance approach has a six-monthly average return of 16.8% (statistically significant at the 5% level) without considering transaction costs. After considering transaction costs, however, the average six-monthly return is reduced to 6.77%. As the number of pairs in a portfolio increases, the portfolio standard deviation falls. The returns of the top 10, top 20 and cointegrating pairs portfolios are negatively skewed, suggesting that pairs trading is profitable only for the top 5 pairs portfolio when pairs are traded according to the stop-loss and take-profit approach.

Table 3 shows the descriptive statistics of the standard deviation trading approach, in which positions are opened when the log of the price ratio deviates by more than two standard deviations from its formation-period mean in either direction and are squared off when it comes back to one standard deviation from that mean. The average six-monthly return of the portfolio of cointegrating pairs is 5.03%, although not statistically significant. The average six-monthly return diminishes with the inclusion of more pairs under the distance approach, from 4.34% with 5 pairs to -0.44% when a portfolio consists of 20 pairs.

Table 4 shows the summary statistics of the combined trading approach, in which positions are opened when the log of the price ratio deviates by more than two standard deviations from its formation-period mean in either direction and are squared off either when it comes back to one standard deviation from that mean or when returns reach the SL or TP parameter. The combined approach thus merges the characteristics of the profit and loss and the standard deviation trading approaches. The non-significant Shapiro–Wilk statistic is consistent with normally distributed returns. The portfolio of the five best pairs using the distance approach has an average six-monthly return of 16.66% (statistically significant at the 5% level).

4.2 Pairs Trading with Sector Restriction

In the following section, returns on pairs trading are examined where stocks are matched only within the four sector groupings, viz., Automobiles, Financial Services, Energy and Metals. Tables 5, 6 and 7 summarize the average six-monthly returns for the pairs portfolios with sector restriction.

Table 5 Descriptive statistics for six-monthly returns from profit and loss trading approach with sector restriction
Table 6 Descriptive statistics for six-monthly returns from standard deviation trading approach with sector restriction
Table 7 Descriptive Statistics for six-monthly returns from combined trading approach with sector restriction

Table 5 summarizes the results when pairs are formed with the sector restriction and are traded using profit and loss trading approach (with stop loss and take profit parameters). The top 10 pairs portfolio formed using distance approach has positive average six-monthly return of 14.25% (statistically significant at 1%). The returns are positively skewed. Even after factoring in transaction costs, all four portfolios are giving positive average returns over a six-month period.

Table 6 summarizes the results when pairs are formed with the sector restriction and are traded using the standard deviation trading approach. The top 10 pairs portfolio formed using the distance approach has a positive six-monthly return of 17.14% (statistically significant at 5%). The returns are positively skewed as well. This trading strategy has given statistically significant positive results (p < 0.05) even after factoring in transaction costs.

Table 7 summarizes the results when pairs are formed with the sector restriction and are traded using combined approach (with stop loss and take profit parameters). All the portfolios formed under distance approach are generating statistically significant positive six-monthly returns. Even after factoring in transaction costs, all four portfolios are giving positive average returns over a six-month period. The portfolio of cointegrating pairs has six-monthly return of 8.76% after transaction costs, though not statistically significant.

The pairs trading strategy is more profitable when pairs are formed with the sector restriction. The portfolios of cointegrating pairs and of top 5 and top 10 pairs under distance approach have sufficient returns that even after factoring in transaction costs, they remain fairly profitable. Portfolios formed using cointegration approach have given adequate positive average returns including transaction costs (though not statistically significant), both when pairs are formed without sector restriction and with sector restriction, and traded using various approaches.

4.3 Returns’ Attribution of Pairs Trading Strategies

In this section, an attempt has been made to choose the trading strategy that is most profitable for the retail trader with the least risk involved. For this purpose, only those portfolios that have generated statistically significant positive returns have been considered. The Sharpe ratio has been used to choose among such portfolios. However, the Sharpe ratio can be misleading when return distributions have negative skewness (Gatev et al. 2006). This is unlikely to be a concern here, because all the portfolios considered in Table 8 are positively skewed. Further, to explore the contribution of different risk factors to the returns from the various strategies, results from the application of the Fama–French (Fama and French 1993) asset pricing model are also shown.

Table 8 Portfolios with risk adjusted returns and attribution of returns

As seen from Table 8, the pairs portfolio of top 10 pairs formed with sector restriction under distance approach and traded using standard deviation approach has produced the highest returns (statistically significant at 5%), both excluding and including transaction costs (average six-monthly return of 17.14% and 15.75% respectively). However, after considering standard deviation and Sharpe ratio, the pairs portfolio of top 10 pairs formed with sector restriction and traded using stop loss and take profit approach is generating the highest risk-adjusted returns excluding transaction costs (average six-monthly return of 14.25%).

After including transaction costs, returns remain positive in all strategies. However the best portfolio after factoring in transaction costs is portfolio of top 10 pairs formed with sector restriction and traded using standard deviation approach, which is statistically significant at 5% (average six-monthly return of 15.75%). On more conservative side, from retail trader perspective, the portfolio of top 10 pairs formed with sector restriction and traded through profit and loss trading strategy can also be considered (average six-monthly return of 5.83% including transaction costs), though not statistically significant.

The conclusion is that pairs trading is profitable under every trading strategy when pairs are formed with sector restriction, though only the portfolios of the top 5 and top 10 pairs formed using the distance approach are significantly profitable. The returns are large in both an economic and a statistical sense.

Fig. 3 shows the six-monthly returns over all the trading periods for the portfolio formed from the top 10 pairs with sector restriction under the distance approach and traded using the profit and loss and the standard deviation trading strategies.

Fig. 3 Graphical presentation of six-monthly returns over twelve trading periods

Risk-adjusted returns of the strategy are estimated with the Fama and French (1993) three-factor asset pricing model, and the results are shown in Table 8. The key component is the intercept α, which measures the value added by the strategy over and above the compensation for the risk factors in the model. Excluding transaction costs, the risk-adjusted alpha is positive for all strategies (though not statistically significant) after accounting for the market, size and value risk factors. In line with the literature, the pairs trading strategy is market neutral: the exposure to the market is insignificant in nearly all cases. The exposure to the size factor has the expected sign but is statistically insignificant in most cases. The value risk factor also does not contribute significantly to the returns generated by the various portfolios. Hence, the pairs trading strategy is able to provide positive excess returns net of transaction costs, and these returns stem from the strategy itself, which has its roots in mean reversion and the law of one price.
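To make the role of the intercept concrete, the following sketch fits a three-factor regression of portfolio excess returns on market, size (SMB) and value (HML) factor returns by ordinary least squares. It is an illustration under stated assumptions rather than the estimation used for Table 8: the helper names are hypothetical, the factor series are made up, and the excess returns are constructed without noise from assumed coefficients so that the fit can be checked by eye.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def three_factor_fit(excess_returns, mkt, smb, hml):
    """Return the intercept (alpha) and the three factor loadings."""
    X = [[1.0, m, s, h] for m, s, h in zip(mkt, smb, hml)]
    alpha, b_mkt, b_smb, b_hml = ols(X, excess_returns)
    return {"alpha": alpha, "mkt": b_mkt, "smb": b_smb, "hml": b_hml}

# Hypothetical factor returns per trading period (illustrative only).
mkt = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03]
smb = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02, 0.00, 0.01]
hml = [-0.01, 0.02, 0.00, 0.01, -0.02, 0.01, 0.02, 0.00]

# Noise-free excess returns built from assumed coefficients:
# alpha = 0.05, market beta = 0.1, size beta = -0.2, value beta = 0.3.
excess = [0.05 + 0.1 * m - 0.2 * s + 0.3 * h
          for m, s, h in zip(mkt, smb, hml)]

fit = three_factor_fit(excess, mkt, smb, hml)
print({name: round(value, 4) for name, value in fit.items()})
```

With real data the residuals are not zero, so each coefficient also needs a standard error before it can be called significant; that step is omitted here.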

5 Conclusion

This study evaluates different methods of selecting and trading pairs. As the Indian capital market does not allow interday naked short selling of stocks in the spot market, pairs trading has been implemented using one-month stock futures contracts. After controlling for risk factors, pairs trading exhibits positive alpha (though not statistically significant). The average annualized return is 37.22% excluding transaction costs when the portfolio is composed of the top 10 pairs formed with sector restriction using the distance approach. Even after including transaction costs, returns can reach up to 34% on an annualized basis (statistically significant at 5%). These returns are superior to the average annualized market return of 19.32%.
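The annualized figures are consistent with compounding the average six-monthly returns reported for Table 8 (17.14% and 15.75%) over two periods per year. The sketch below assumes that convention; the function name is introduced here for illustration only.

```python
def annualize(period_return, periods_per_year=2):
    """Compound a per-period return over the number of periods in a year."""
    return (1 + period_return) ** periods_per_year - 1

# Average six-monthly returns for the top 10 pairs portfolio.
print(f"{annualize(0.1714):.2%}")  # excluding transaction costs -> 37.22%
print(f"{annualize(0.1575):.2%}")  # including transaction costs -> 33.98%
```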

These findings on the profitability of pairs trading corroborate previous research in the area, such as that carried out by Gatev et al. (2006), Nath (2003), Do and Faff (2010, 2012), Broussard and Vaihekoski (2012) and Aggarwal and Gupta (2015).

The application of the Fama and French (1993) asset pricing model further implies that systematic market risk, size risk and value risk do not help explain these returns. The study also shows that pairs trading returns are sensitive to key parameters such as the trading strategy used and the composition of the portfolios. Pairs trading is in essence a contrarian investment strategy that bets on the convergence of diverged prices (short-term price reversal).

Indian financial markets are maturing and are attracting sizable retail and institutional investment. Advanced applications like the one presented in this study are significant for investors and investment consultants, who can benefit from the different trading strategies researched here.

Emerging economies like India have arbitrage opportunities because of a lack of traders in the futures market to exploit them. In other countries, stock lending has taken off in a big way, and hence short selling is feasible in both the spot and futures markets. In India, the Stock Lending and Borrowing Mechanism (SLBM) is not that popular. At the same time, the ban on naked short selling in the interday spot market and the high transaction costs arising from margins provide for weaker arbitrage opportunities in the Indian futures market. This unique environment provides the possibility of generating significant positive returns from pairs trading.

Leading agencies like NSE and SEBI (Securities and Exchange Board of India) should promote SLBM and make it more convenient, in order to encourage arbitrage strategies in both the spot and futures markets. In its Investor Awareness Program, NSE should provide training in arbitrage trading. Indian stock markets are mature and big enough to provide space for pairs trading, and hence awareness of it should be increased. The overall quality of the underlying market needs to be sufficiently robust for various arbitrage strategies to work. Hence, SEBI should make sure that underlying markets remain transparent, prices are readily available, counterparties are easily accessible and volumes are sufficiently high.

There is a survivorship bias in the study, as only those stocks have been selected that traded in the futures segment for the entire trading period. This paper explores arbitrage opportunities using daily stock futures prices. Technological developments in computational modeling have also paved the way for the use of statistical arbitrage in high-frequency trading with machine learning methods, such as neural networks and genetic algorithms (Brogaard et al. 2014; Chaboud et al. 2014; Ortega and Khashanah 2014). The same approach can be applied to intraday data, such as five-minute interval or tick-by-tick data. It can also be extended to other security classes like commodities and currencies, and further extended by including a comparison with international stock indices in other emerging economies. In more recent years, statistical arbitrage has also seen renewed interest in emerging areas such as bitcoin (Brandvold et al. 2015; Lintilhac and Tourin 2016), big data (McAfee and Brynjolfsson 2012; Nardo et al. 2016) and factor investing (Maeso and Martellini 2017).