Introduction

News about flows from actively managed funds into indexed funds makes frequent headlines in the media. According to the ICI’s 2018 Annual Report (Footnote 1), 18% of the $19.2 trillion invested in long-term funds at the end of 2017 was held in indexed mutual funds, and a further 17% was held in indexed ETFs. Together, then, index mutual funds and ETFs comprised 35% of the total. In 2007, these two investment categories accounted for just 15% of the $9.5 trillion then invested in long-term funds. These flows into indexed, or rules-based, investment vehicles have been driven by a number of factors. One such factor is probably the plethora of independent academic papers indicating that, on average, active fund managers do not produce returns in excess of common benchmarks sufficient to cover their fees. In one influential paper, Fama and French (2010) examine the performance of around 5000 US actively managed mutual funds and ultimately come to the unflattering and perhaps understated conclusion that: “In terms of net returns to investors, performance is poor” (page 1921). Another, related, factor must be investors’ increasing focus on fees. Indexed mutual funds and equivalent ETFs are often far cheaper than comparable actively managed funds. Finally, it is likely that investors have also come to the view that a much more significant decision for their investment portfolio is the split between broad asset classes, rather than the choice between one actively managed fund and another [see Brinson et al. (1986, 1991) for early expositions of this point]. The dispersion of performance amongst active managers benchmarked against the same financial market index is often relatively small; in sharp contrast, the dispersion in performance between, say, an 80/20 equity/bond portfolio and a 20/80 equity/bond portfolio is usually very substantial from 1 year to the next. Given this, the choice between having the equity or bond proportion of one’s portfolio managed on a discretionary or rules basis becomes, at best, a secondary consideration in which cost will play a significant role.

Although the figures quoted above indicate that indexed and rules-based equity investing has become very popular, a substantial proportion of equity assets is still managed on an active, discretionary basis: actively managed funds still account for 65% of the total net assets in US long-term funds. The majority of investors and advisors therefore still appear to believe that active management is preferable to indexed fund management. There could be many reasons for such a preference, but one plausible reason is the knowledge that with a typical indexed investment, even if the manager is a competent index tracker, the return that the investor receives is guaranteed to underperform the chosen benchmark by an amount equivalent to the fees paid (a fact that explains the intense fee competition between indexed fund providers).

In this paper we do not focus on the merits of discretionary versus non-discretionary fund management techniques. Instead, we take as our starting point the fact that many investors, including individual investors, their advisors and the fund-of-funds managers who build portfolios of funds for their investors, choose to invest with active fund managers. Given their revealed preference for seeking out manager skill, are there any ex ante indicators that could help these investors choose a manager with skill? This question is the focus of this paper. Using the recursive portfolio technique due originally to Hendricks et al. (1993) and Carhart (1997) and a large sample of US-domiciled, US equity mutual funds over the period from 2000 to 2017, we test the usefulness of a range of indicators of future, benchmark-adjusted performance.

Our results provide some guide to the indicators that investors should search for in their attempt to identify a manager with skill. One set of indicators that we use is derived from benchmark-adjusted fund returns. For example, we find some evidence of positive performance persistence, but stronger evidence of negative performance persistence; that is, the worst performing funds in year t tend to be the worst performing funds in year t + 1. We also find that funds with a high tracking error in year t tend to produce poor benchmark-adjusted performance in year t + 1. However, the ranking criterion that produces one of the most consistent post-ranking results is a fund’s information ratio. Funds with a high information ratio in year t produce positive post-ranking benchmark-adjusted returns in year t + 1, while those with a low information ratio in year t tend to produce negative post-ranking benchmark-adjusted returns. We find the difference in the post-ranking performance of those funds with high information ratios in year t compared with those with low information ratios to be statistically significant. Finally, we find evidence to suggest that funds whose alpha in year t has a low t statistic tend to produce lower benchmark-adjusted returns in year t + 1 than funds whose alpha has a high associated t statistic.

We used a second set of indicators to create portfolios that could best be described as being based upon a fund’s characteristics. For example, when we use a fund’s AUM at the end of year t as the ranking indicator we find only weak evidence to suggest that both very large and very small funds tend to produce negative, benchmark-adjusted performance in the following year. We also find evidence of the negative impact of high net inflows in year t, on performance in year t + 1 when we use annual net inflows as the ranking indicator. It seems plausible that net inflows could distract managers from their investment strategies and, indeed, make it difficult to implement those strategies, depending upon their sector focus. Our results also show that high levels of turnover can be a sign of future poor benchmark-adjusted performance. Finally, our results also show that high fund fees are a reliable predictor of poor benchmark-adjusted performance.

The rest of this paper is organised as follows. In “Literature review” section we provide a review of the performance evaluation literature; in “Methodology” and “Data description” sections we describe the methodology and data used in the paper; in “Results” section we describe the results of the recursive portfolio experiments; and finally “Summary and conclusions” section provides some concluding thoughts.

Literature review

There is a very large academic literature that seeks to evaluate the performance of mutual funds. In an early paper, Grossman and Stiglitz (1980) argue that in equilibrium, expected abnormal returns (which we can think of as returns in excess of some valid benchmark) should be positive. If these equilibrium returns were zero, then there would be no incentive to gather and process information about the securities issued by corporations. However, Berk and Green (2004) develop a model where low barriers to entry ensure that any short-term abnormal profits, which might arise from either manager skill or lower production costs, are competed away. In this model, knowledge of a fund manager’s past performance will not provide information about future performance. An important aspect of this model is the assumption that inflows to a fund are subject to diminishing returns, implying a negative relationship between fund size and performance. In practice these diseconomies of scale could arise from a range of factors, including capacity. To take an extreme example, if a manager focuses on a very niche sector of an equity market, such as the small-cap South-East Asian tech sector, a strategy that works with a relatively small AUM may be impossible to implement as AUM grows, given the size of the companies in that sector. As managers of funds reach these constraints they may need to adapt or change previously successful strategies, perhaps forcing them to purchase stocks that they might previously have preferred not to hold, ultimately leading to a diminution in performance. In the Berk and Green model, then, there will be an optimal size for each fund in equilibrium. Lynch and Musto (2003) also develop a model of fund performance. In their model exogenous differences in manager skill are assumed and, in contrast to the Berk and Green model, there are no diminishing returns as AUM grows with inflows. In their model successful managers do not change their strategy, but unsuccessful managers do. As such we might expect to find that positive performance persists and that poor performance does not (since the strategies of poor performers change over time).

The theoretical literature provides competing hypotheses about the performance of mutual funds. The empirical literature typically focusses on equity mutual funds, though a small number of studies focus on the performance of fixed income mutual funds [see, for example, Blake et al. (1993) for an early study, or Moneta (2015) for a more recent study]. Although the empirical performance literature has developed over the years as the databases used to investigate many aspects of fund performance have become progressively richer, it can be separated into two broad categories. The first we can refer to as ex post evaluation; the second we can refer to as ex ante evaluation (Footnote 2). Both are relevant to the empirical work conducted in this paper.

Ex post performance evaluation

The ex post evaluation literature, as the phrase suggests, focuses on understanding the past performance of mutual funds. The first question that needs to be addressed is the format of the mutual fund returns to be analysed.

The dependent variable

Typically the literature examines returns in excess of a proxy for the risk free rate (Rf). The difference between the two represents the unconditional, nominal premium earned from investing in a mutual fund. Examining this premium is convenient for academic researchers since it allows them to embed the evaluation easily in formal asset pricing models, such as the CAPM. However, most long-only fund managers, and almost certainly all long-only managers of equity funds, are not evaluated or compensated for their performance in excess of a cash benchmark. Instead they are typically evaluated by their employers and by their investors against the performance of a financial market benchmark, which can be either a financial market index or the average performance of a relevant peer group (though this latter approach to benchmarking has been largely replaced by the former, more objective approach). Suppose, for example, that a fund markets itself as a ‘Small-Cap Value fund’; the managers of that fund will then be constrained in terms of the stocks that they can hold in the fund’s portfolio, as a result of regulatory requirements and other restrictions, including those placed on the manager by the fund’s sponsors and trustees [see Clarke et al. (2002)]. Chen et al. (2000) and Kosowski et al. (2006) both produce evidence to suggest that conclusions drawn about manager performance can be affected significantly by the choice of comparator. Arguably we should evaluate the performance of mutual funds by comparing their performance against their self-declared, primary benchmarks. Indeed Clare et al. (2015) argue that “style-appropriate, investible benchmarks not only provide a more parsimonious way of describing manager performance, but also better align performance evaluation with the real-world performance targets of fund managers”.

According to Christopherson et al. (2009) a benchmark should be a “naïve” representation of the set of investment opportunities that an investor (manager) can choose for their portfolio. The benchmark should comprise securities that are investible, so that investment in them is not restricted; that is, it should be based upon the market capitalisation of tradable shares (excluding those shares that are not freely available for purchase). Just as importantly, the benchmark’s construction methodology should be clear and transparent so that its composition is potentially replicable. The financial market indices produced by index providers such as FTSE-Russell, S&P and MSCI generally satisfy Christopherson et al.’s definition of an adequate benchmark.

Risk adjustment

Typically, the dependent variable in an assessment of the performance of mutual funds is the portfolio return in excess of the risk free rate (Rp − Rf). This excess return will be achieved by a manager over time by investing in securities with risks inherently larger than those represented by cash rates. Academic research nearly always focuses on comparing risk-adjusted returns. To do this requires the specification of a model that adequately captures investment risks. These models allow us to decompose the performance of a mutual fund into three components: the returns that the market provides for being exposed to sources of priced risk premia; the skill of the fund manager; and finally, luck, either good or bad. Expression (1) presents a typical factor model used to evaluate fund performance. It is referred to as the Carhart four-factor model (Carhart 1997):

$$R_{{{\text{p}},t}} - R_{{{\text{f}},t}} = \alpha_{\text{p}} + \beta_{{1{\text{p}}}} {\text{ER}}_{{{\text{m}},t}} + \beta_{{2{\text{p}}}} {\text{SMB}}_{{{\text{m}},t}} + \beta_{{3{\text{p}}}} {\text{HML}}_{{{\text{m}},t}} + \beta_{{4{\text{p}}}} {\text{MOM}}_{{{\text{m}},t}} + \varepsilon_{{{\text{p}},t}}$$
(1)

where ERm is the excess return on a proxy for the market portfolio, and SMB, HML and MOM are zero investment factor mimicking portfolios for size, book-to-market value and momentum effects, respectively. If we set β2p = β3p = β4p = 0, then we have the CAPM one-factor (or market) model, where \(\alpha_{\text{p}}\) represents Jensen’s alpha (Jensen 1968). If, instead, we set only β4p = 0 then the model becomes the Fama–French three-factor model derived from Fama and French (1992, 1993). In all three cases, \(\alpha_{\text{p}}\) represents a measure of manager skill; \(\varepsilon_{{{\text{p}},t}}\) represents the fund’s performance derived from luck; while the factors represent the reward derived from passive exposure to priced sources of systematic risk.
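As an illustration of how expression (1) and its restricted versions might be estimated, the following is a minimal sketch using ordinary least squares on simulated data. The function name `factor_alpha`, the simulated factor returns and the use of classical (homoskedastic) standard errors are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np


def factor_alpha(excess_fund, factors):
    """Estimate alpha, betas and the alpha t-statistic by OLS.

    excess_fund : shape (T,), fund return minus the risk free rate.
    factors     : shape (T, K), factor returns (e.g. ER_m, SMB, HML, MOM).
    Standard errors are classical OLS ones, which is an assumption here.
    """
    y = np.asarray(excess_fund, dtype=float)
    f = np.asarray(factors, dtype=float).reshape(len(y), -1)
    x = np.column_stack([np.ones(len(y)), f])   # intercept column gives alpha
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ coef
    dof = len(y) - x.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(x.T @ x)
    t_alpha = coef[0] / np.sqrt(cov[0, 0])
    return coef[0], coef[1:], t_alpha


# Simulated monthly data (hypothetical numbers, for illustration only).
rng = np.random.default_rng(0)
T = 120
factors = rng.normal(0.0, 0.03, size=(T, 4))    # columns: ER_m, SMB, HML, MOM
true_betas = np.array([1.0, 0.3, -0.2, 0.1])
excess = 0.002 + factors @ true_betas + rng.normal(0.0, 0.005, size=T)

alpha4, betas4, t4 = factor_alpha(excess, factors)          # four-factor model
alpha3, betas3, t3 = factor_alpha(excess, factors[:, :3])   # beta_4p = 0
alpha1, betas1, t1 = factor_alpha(excess, factors[:, :1])   # one-factor model
```

Dropping columns of the factor matrix corresponds to imposing the zero restrictions described above, so the same routine covers the one-, three- and four-factor cases.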

Prior to Fama and French’s papers (1992, 1993), ex post studies of mutual fund performance concentrated on calculating estimates of Jensen’s alpha. Using a sample of funds spanning the period from 1965 to 1984, Ippolito (1989) found evidence that these funds produced abnormal returns sufficient to cover fees. However, using the same sample period and adjusting for non-S&P500 stocks in the proxy for the market index, Elton et al. (1993) find no evidence of positive, pre-expense alphas. Using a later sample period from 1971 to 1991, Malkiel (1995) found little evidence of positive and statistically significant alphas using gross-of-fee returns, and none when net-of-fee returns were used.

Multi-factor models, usually the Fama–French three-factor model but also the four-factor model of Carhart, have often been used to risk-adjust performance. The evidence provided in Fama and French (2010), based on the CRSP database of funds that invest primarily in US common stocks, is fairly typical of the findings when these models are used to evaluate performance: positive and significant ex post alphas are scarce whether the one-, three- or four-factor model is used. In the paper, the authors report that the average alphas produced by the industry over the period from 1984 to 2006, net of fees, are generally negative and statistically significantly so. They conclude that: “In terms of net returns to investors, performance is poor” (page 1921).

However, some researchers believe that the use of factor models of the kind embedded in expression (1) is not an appropriate way to evaluate the performance of mutual funds. This belief is rooted in the realisation that the SMB, HML and MOM factors, which are essentially zero net wealth, arbitrage portfolios, are not replicable, and therefore do not represent a feasible investment set for a fund manager restricted to long-only investment positions. Furthermore, they do not form any part of the performance evaluation process undertaken by fund management companies when determining the remuneration of fund managers, or by most investors that invest in mutual funds. The multi-factor performance evaluation models are not replicable, investible benchmarks and as such their use in performance evaluation raises the question as to what exactly is being evaluated when they are used. Indeed, Kothari and Warner (2001) and Angelidis et al. (2013) both argue that factor-based performance measures will be unable to identify any significant abnormal performance (positive or negative) if a fund’s objectives differ from the characteristics of the benchmark used to evaluate it. Chan et al. (2009) show, for various size and value styles of US equity mutual funds over the period 1989–2001, that evaluating performance with factor-based models and with financial market benchmarks consistent with the funds’ stated styles leads to different conclusions about the sign of the excess returns generated by the funds in approximately one quarter of the cases examined. Using US equity mutual fund data from 1990 to 2011, Clare et al. (2015) show that the average performance of different sets of mutual funds measured using style-consistent benchmarks differs economically from that obtained using the standard multi-factor models, by as much as 0.34% per month in the case of Small-Cap Growth funds. All the small-cap style groups (growth, blend and value), on aggregate, generate significant superior performance (net of cost) when measured against their respective style benchmarks.

Skill versus luck

Another important development in the debate about whether there is convincing evidence of active fund manager skill is the literature that focuses on that group of managers that may seem to display ex post skill. Wermers (2003) provides evidence to suggest that a small number of managers may possess skill by taking “big bets”. However, the performance of active fund managers will almost certainly be determined by a combination of skill and luck, unlike the performance of a chess grandmaster, where the outcome will almost certainly be determined by skill, and unlike the performance of a gambler playing the roulette wheel, where the outcome will be determined entirely by luck. Using a bootstrap methodology, Kosowski et al. (2006) examine the performance of fund managers in the extreme tail of the performance distribution to determine whether the alphas identified as being positive and significant using the factor models described above were truly achieved through skill rather than through luck. The authors find that the proportion of US equity mutual funds producing a positive and significant alpha in the period from 1975 to 1989 was between 30 and 40%; however, this fell to 5% in the period from 1990 to 2002. One possible explanation for the decline in the proportion of skilled funds over this period is increased competition due to an expanding mutual fund sector and the emergence of the hedge fund sector. Using an alternative bootstrap technique, Fama and French (2010) find no evidence of superior, net alpha in the right-hand tail of the alpha distribution of a large sample of US equity fund managers. However, both Kosowski et al. (2006) and Fama and French (2010) find evidence of funds producing significant negative alpha, that is, funds that are managed by managers that display statistically significant levels of negative skill. Using Fama and French’s (2010) bootstrap technique but using style-consistent benchmarks to determine whether any observed alpha produced by a sample of U.S. equity funds is due to skill or to luck, Clare et al. (2015) find that different segments of the market, ranging from large-cap growth to small-cap value, exhibit more skill than when alphas are calculated using factor-based models. Finally, using data on 842 UK equity mutual funds for the period 1972 to 2002, Cuthbertson et al. (2008) find that only 12 of the top 20 funds produced an alpha that could be said to be due to skill rather than to luck.
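The general logic of separating skill from luck with a bootstrap can be sketched for a single fund as follows. This is a simplified residual bootstrap in the spirit of the approaches described above, not a reproduction of either published procedure: it resamples one fund’s regression residuals while holding the factor returns fixed, imposes a zero alpha, and compares the fund’s actual alpha t-statistic with the resulting “luck only” distribution. The function names, the number of bootstrap draws and the simulated data are all assumptions made for illustration; the cross-sectional step across many funds is omitted.

```python
import numpy as np


def _fit(x, y):
    """OLS fit returning coefficients, residuals and the intercept t-statistic."""
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ coef
    dof = len(y) - x.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(x.T @ x)
    return coef, resid, coef[0] / np.sqrt(cov[0, 0])


def bootstrap_alpha_t(excess_fund, factors, n_boot=500, seed=0):
    """Residual bootstrap of the alpha t-statistic under the null alpha = 0."""
    rng = np.random.default_rng(seed)
    y = np.asarray(excess_fund, dtype=float)
    f = np.asarray(factors, dtype=float).reshape(len(y), -1)
    x = np.column_stack([np.ones(len(y)), f])
    coef, resid, t_actual = _fit(x, y)
    systematic = f @ coef[1:]                    # fitted return with alpha removed
    t_null = np.empty(n_boot)
    for b in range(n_boot):
        draw = rng.integers(0, len(y), size=len(y))    # resample residuals
        t_null[b] = _fit(x, systematic + resid[draw])[2]
    # Share of "luck only" t-statistics at least as large as the actual one.
    p_value = float(np.mean(t_null >= t_actual))
    return t_actual, t_null, p_value


# Hypothetical fund with a large true alpha, for illustration only.
rng = np.random.default_rng(2)
T = 120
factors = rng.normal(0.0, 0.03, size=(T, 4))
skilled = 0.01 + factors @ np.array([1.0, 0.2, 0.1, 0.0]) + rng.normal(0.0, 0.01, T)
t_skilled, t_null, p_skilled = bootstrap_alpha_t(skilled, factors)
```

A small `p_value` indicates that an alpha t-statistic this large would rarely arise from resampled noise alone, which is the sense in which the bootstrap distinguishes skill from luck.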

Market timing

As well as attempting to determine, ex post, the skill of a fund manager as measured by the constant (alpha) in an expression such as (1), researchers have also focused on another aspect of manager skill: market timing. The focus here is the manager’s ability to increase the risk profile of their fund ahead of a general rise in the market, and reduce that exposure ahead of a general fall in the market. By doing so successfully, other things equal, the manager will be able to add value to their portfolio over time. Researchers have tended to use three approaches to identify timing skill. The first two, due to Treynor and Mazuy (1966) and to Henriksson and Merton (1981), can both be written as:

$$R_{{{\text{p}},t}} - R_{{{\text{f}},t}} = \alpha_{\text{p}} + \theta_{\text{p}} R_{{{\text{m}},t}} + \gamma_{\text{p}} f\left[ {R_{{{\text{m}},t}} } \right] + \varepsilon_{{{\text{p}},t}}$$
(2)

where \(f[R_{m,t} ] = R_{m,t}^{2}\) for the Treynor and Mazuy model and \(f[R_{m,t} ] = [R_{m,t} ]^{ + }\) in the Henriksson and Merton model. In each case the model can be estimated using OLS, where evidence that \(\gamma_{\text{p}}\) is positive and statistically significant is taken as evidence that the manager has displayed positive market timing ability. However, the unconditional model in (2) cannot properly distinguish between the skill represented by \(\alpha_{\text{p}}\) and \(\gamma_{\text{p}}\) which led Ferson and Schadt (1996) to propose a conditional version of (2) as follows:

$$R_{{{\text{p}},t + 1}} - R_{{{\text{f}},t + 1}} = \, \alpha_{\text{p}} + \beta_{{0{\text{p}}}} R_{{{\text{m}},t + 1}} + \beta_{{1{\text{p}},t + 1}}^{'} \left( {z_{t} \times R_{{{\text{m}},t + 1}} } \right) + \gamma_{\text{p}} \left( {R_{{{\text{m}},t + 1}}^{2} } \right) + \varepsilon_{{{\text{p}},t + 1}}$$
(3)

where \(\beta_{{1{\text{p}}}}\) is a vector of coefficients that allows the fund’s market beta to vary linearly with \(z_{t}\), a set of publicly available information variables.
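The timing regressions in expressions (2) and (3) can be sketched as a single OLS routine. In this minimal example the function name `timing_regression`, the argument `lagged_info` and the simulated returns are hypothetical, the Henriksson and Merton term is read as max(R_m, 0), and classical OLS standard errors are assumed. When `lagged_info` is supplied, its rows are assumed to be already aligned so that row t holds the information variables observed before the return in row t.

```python
import numpy as np


def timing_regression(excess_fund, excess_market, model="TM", lagged_info=None):
    """OLS market timing regression.

    model="TM" uses the squared market return, model="HM" uses max(R_m, 0).
    lagged_info (optional, shape (T,) or (T, L)) adds z * R_m interaction terms.
    """
    y = np.asarray(excess_fund, dtype=float)
    rm = np.asarray(excess_market, dtype=float)
    if model == "TM":
        timing_term = rm ** 2
    elif model == "HM":
        timing_term = np.maximum(rm, 0.0)
    else:
        raise ValueError("model must be 'TM' or 'HM'")
    columns = [np.ones_like(rm), rm]
    if lagged_info is not None:
        z = np.asarray(lagged_info, dtype=float).reshape(len(rm), -1)
        columns.append(z * rm[:, None])          # conditional beta terms
    columns.append(timing_term)                  # timing term is the last column
    x = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ coef
    dof = len(y) - x.shape[1]
    se = np.sqrt(np.diag((resid @ resid / dof) * np.linalg.inv(x.T @ x)))
    return {"alpha": coef[0], "theta": coef[1],
            "gamma": coef[-1], "t_gamma": coef[-1] / se[-1]}


# Simulated fund with positive timing ability (hypothetical numbers).
rng = np.random.default_rng(1)
T = 240
rm = rng.normal(0.005, 0.04, size=T)
fund = 0.001 + 0.9 * rm + 2.0 * rm ** 2 + rng.normal(0.0, 0.002, size=T)

tm = timing_regression(fund, rm, model="TM")
hm = timing_regression(fund, rm, model="HM")
z = rng.normal(size=T)                           # stand-in information variable
conditional = timing_regression(fund, rm, model="TM", lagged_info=z)
```

A positive and statistically significant `gamma` would be read as evidence of timing ability, subject to the caveats about the unconditional model noted above.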

Using both unconditional and conditional market timing models, researchers have tended to find very little evidence of market timing abilities amongst fund managers when using monthly data. For example, using US mutual fund data, Ferson and Schadt (1996), Wermers (2000) and Goetzmann et al. (2000) find no evidence of market timing ability amongst US equity fund managers; Fletcher (1995) finds little evidence amongst UK equity fund managers; and Clare et al. (2009) find no evidence amongst managers of pooled UK equity pension funds. One explanation for the absence of market timing skill as defined by these tests is the “dilution effect” [see Warther (1995) and Kothari and Warner (2001)]. This effect refers to the inflows of investor money when markets are rising, which lead to temporary increases in cash holdings, which in turn dilute the returns that a fund might otherwise be expected to generate from a rising market.

Other influences on performance

Researchers have also explored the ex post relationship between fund performance and an increasing number of fund and fund manager characteristics. For example, Chevalier and Ellison (1999a) explored the relationship between performance and whether or not managers held an MBA, finding that MBA-qualified managers tended to favour “glamour stocks”. Chevalier and Ellison (1999b) found a positive relationship between managers’ undergraduate SAT scores and risk-adjusted performance. A number of researchers have explored the relationship between fund manager gender and performance. Atkinson et al. (2003) find that there is no significant difference between male and female managers, while Bliss and Potter (2002) find that female managers of both US and international equity mutual funds tended to achieve higher raw returns than their male colleagues. Other research has focussed on fund manager age or experience [for example, Clare (2017)]; the geographical location of a fund [for example, Otten and Bams (2007)]; the relationship between a fund’s performance and the nature of the “family” of funds that it belongs to [for example, Nanda, Wang and Zheng (2004)]; and the impact of fund fees [for example, Carhart (1997)].

Ex ante performance evaluation

Attempts to identify ex post fund manager skill using factor models have, in general, produced results indicating that true fund management skill is probably quite a rare commodity. Other authors have sought to examine the performance of fund managers from an ex ante perspective. More specifically, they have sought to identify whether there might be any fund-specific indicators or characteristics that would allow investors to pick “winning” funds. If positive fund manager performance persists, in other words if a fund manager “wins” regularly, then if we could just identify, ex ante, the features or characteristics of a likely winning fund it might be possible to avoid investing with average or “losing” fund managers.

Recursive portfolio construction techniques

The most intuitively appealing approach to identifying characteristics that are related to future fund performance is the recursive portfolio methodology due originally to Hendricks et al. (1993). This methodology involves ranking fund managers into fractiles using performance or characteristic data in year t, and then calculating either the equally or value-weighted returns on each fractile over year t + 1. The process is then repeated, using data in year t + 1 to rank the funds again and calculating fractile returns over year t + 2, and so on. This process eventually produces a time series on each fractile from t + 1 to t + n, where the data sample period spans t to t + n. Conventional performance evaluation metrics (such as the Sharpe ratio) and factor models can then be used to evaluate the performance of the fractile portfolios, usually over the period t + 1 to t + n. If these fractiles produce statistically significant alphas, it implies that an investment strategy that involves choosing a set of funds at the start of each year, based on ex post criteria from the preceding year, could lead to superior performance for the investor.

Ranking by fund alpha

Clearly there is potentially a very wide set of criteria that one could use to rank the performance of a mutual fund; the main limitation on this set is the richness of the available mutual fund databases. Most extant studies that have employed the recursive portfolio construction technique to evaluate mutual fund performance have used estimates of alpha as the ranking indicator, where these alphas have been calculated using a range of factor models.

Using a recursive ranking process and a sample of 188 US equity funds, Elton et al. (1996) find evidence of a positive alpha of 0.5% pa for the top decile of funds and a negative alpha of between − 2.4% and − 5.4% pa for the bottom decile of funds, depending upon the rebalancing period used. Brown and Goetzmann (1995) found a similar result. Using a much larger sample, Carhart (1997) found that the post-ranking alpha of all decile portfolios was negative, and significantly so for the bottom three deciles. These alphas were calculated using the Carhart four-factor model. Carhart’s results suggest that the more positive results documented by Elton et al. (1996) and Brown and Goetzmann (1995) were due to a failure to account for momentum stocks that may have been prominent components of the portfolios of mutual fund “winners”. Teo and Woo (2001) argue that the returns on mutual funds should be style-adjusted to account for the differences in approach. Using US equity mutual fund data from 1984 to 1999 and an alpha calculated from an augmented four-factor model to pre-rank funds, they find that the top ranked mutual funds produced excess performance of around 3.6%. Their research also highlighted that negative return persistence was more significant, statistically and economically, than any positive performance persistence. Finally, as well as ranking by alpha, some researchers have advocated ranking by the t-ratio of the alpha since, for example, a top decile alpha in any year could be due to one lucky return observation.

Other possible ranking criteria

Sorting mutual funds into fractiles based upon pre-ranking alphas is only one of a very wide range of ranking criteria that could be used to examine, ex ante, the performance of mutual funds. Indeed, a whole host of these possible contenders could be derived from fund returns, including: cumulative return in excess of a fund-specific benchmark; standard deviation of return; Sharpe ratio; information ratio; factor betas; maximum or minimum monthly/daily returns; etc. However, researchers have used the growing richness of mutual fund databases to explore the relationship between the performance of funds and a range of fund characteristics and information, usually from an ex post perspective. But these relationships could provide potentially useful information about future fund performance.
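Two of the return-based criteria listed above, tracking error and the information ratio, can be computed from a fund’s benchmark-adjusted returns as in the following minimal sketch. The function name `active_return_stats`, the annualisation convention (12 periods per year, square-root-of-time scaling) and the four-month toy return series are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def active_return_stats(fund_returns, benchmark_returns, periods_per_year=12):
    """Annualised mean active return, tracking error and information ratio."""
    fund = np.asarray(fund_returns, dtype=float)
    bench = np.asarray(benchmark_returns, dtype=float)
    active = fund - bench                          # benchmark-adjusted returns
    mean_active = active.mean() * periods_per_year
    tracking_error = active.std(ddof=1) * np.sqrt(periods_per_year)
    info_ratio = mean_active / tracking_error
    return mean_active, tracking_error, info_ratio


# Toy monthly returns (hypothetical numbers).
fund = [0.02, 0.01, -0.01, 0.03]
bench = [0.01, 0.01, -0.02, 0.02]
mean_active, tracking_error, info_ratio = active_return_stats(fund, bench)
```

In a ranking exercise these statistics would be computed for each fund over year t and then used to sort funds into fractiles.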

There exists a significant body of research that has looked at the relationship between fund returns and net flows into those funds. This fund flow literature is related to the fund performance persistence literature. If fund performance is persistent, and investors know this so that they are able to act on this information by switching investments continuously to higher performing funds, then we would expect good (poor) performance to be followed by positive (negative) net inflows. However, there is very strong evidence to suggest a convex relationship between fund flows and fund performance, with superior performance attracting disproportionately large net inflows, while poor performance does not lead to commensurate net outflows [for example see Ippolito (1992) or Nanda, Wang and Zheng (2004)]. Zheng (1999) finds that funds that experience high cash inflows tend to outperform funds that experience low cash inflows. However, Sapp and Tiwari (2004) repeat Zheng’s experiment and find that, when they assess the post-ranking performance of funds ranked by net fund flow with a four-factor model rather than the three-factor model used by Zheng, funds that experience positive net inflows do not significantly outperform those with lower net inflows. Amihud and Goyenko (2013) estimate a multi-factor model of fund returns, and then use the R2 to predict future fund performance. They argue that a lower R2 indicates greater selectivity and that as such this metric is able to predict better future fund performance. Their results are consistent with those of Cremers and Petajisto (2009), who find that their measure of active fund management (the divergence of the mutual fund’s holdings from the benchmark) is also a good predictor of future alpha.

Finally, in tangentially related research, Barber and Odean (2000) found that “too much trading can damage your wealth”. A number of researchers have explored the relationship between turnover and performance. Using fund data from 1975 to 1993 and the recursive portfolio construction methodology, Wermers (2000) did not find that prior knowledge of a fund’s turnover could help investors generate positive abnormal net returns. However, both Carhart (1997) and Chalmers et al. (1999) found that performance was negatively related to turnover (and to fees).

Methodology

The aim of this paper is to explore whether it is possible to identify variables that might act as indicators of either future, benchmark-adjusted positive or poor performance.

A number of studies have looked at the predictability of fund returns over 1-month, 1-year and 2-year horizons [for example see Grinblatt and Titman (1992), Hendricks et al. (1993), Goetzmann and Ibbotson (1994), Brown and Goetzmann (1995) and Elton, Gruber and Blake (1996)]. Using a comprehensive database comprising over 1800 US equity mutual funds spanning the period from 1963 to 1993, Carhart (1997) applies a recursive portfolio formation methodology, similar to the one applied in this paper. Amongst many results, Carhart finds evidence of persistence based on a 1-year, net return ranking period. He also finds that the persistence identified in earlier studies may have been due to momentum effects that were not accounted for in the one- and three-factor performance evaluation models; this persistence was less significant once the Carhart four-factor model was applied.

To investigate the usefulness, or otherwise, of any set of indicators we apply the recursive portfolio methodology. This methodology involves the construction of portfolios of mutual funds formed recursively according to sets of possible indicators of future performance. The value of each indicator at the end of year t is used to rank funds, and then to create equally weighted deciles.Footnote 3 The monthly performance of these fractile portfolios is then monitored over year t + 1. At the end of year t + 1, the indicator value at the end of year t + 1 is used to re-rank the mutual funds. The monthly performance of these fractile portfolios is then monitored over year t + 2, and so on. This process eventually produces a time series of monthly portfolio returns of mutual funds from year t + 1 to tn, for the n years of the sample period. To evaluate the performance of the portfolios we report the average, net-of-fee benchmark-adjusted returns of each portfolio decile and the alpha and t-alpha of each portfolio decile based upon both the one- and four-factor model.
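The recursive procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the paper: the function names, the dictionary-based data layout and the rule that funds with the highest indicator value go into the first group are all assumptions made for the example. It also truncates each holding year to the shortest available return history, which is a simplification of how non-surviving funds would be handled in practice.

```python
from statistics import mean

def form_fractiles(indicator, n_groups=10):
    """Rank funds by indicator value (highest first) and split them into groups.

    indicator: dict mapping fund id -> indicator value at the end of year t.
    Returns a list of lists of fund ids; index 0 is the top group (D1).
    """
    ranked = sorted(indicator, key=indicator.get, reverse=True)
    n = len(ranked)
    # Integer boundaries spread any remainder across the groups.
    return [ranked[i * n // n_groups:(i + 1) * n // n_groups]
            for i in range(n_groups)]

def recursive_portfolios(indicators, returns, n_groups=10):
    """Build one monthly return series per fractile portfolio.

    indicators: dict year -> {fund id: indicator value at end of that year}
    returns:    dict year -> {fund id: list of monthly returns in that year}
    Funds are ranked on year t and held, equally weighted, over year t + 1.
    """
    series = [[] for _ in range(n_groups)]
    for year in sorted(indicators):
        hold = year + 1
        if hold not in returns:
            continue
        # Keep only funds with a ranking value and holding-year returns.
        eligible = {f: v for f, v in indicators[year].items()
                    if f in returns[hold]}
        if len(eligible) < n_groups:
            continue  # too few funds to fill every group
        groups = form_fractiles(eligible, n_groups)
        # Simplification: truncate to the shortest history in the year.
        n_months = min(len(returns[hold][f]) for f in eligible)
        for g, funds in enumerate(groups):
            for m in range(n_months):
                series[g].append(mean(returns[hold][f][m] for f in funds))
    return series
```

Each element of the returned list is the concatenated monthly series for one fractile, which can then be passed to whatever performance evaluation model is being applied.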

Data description

Fund returns

To establish whether it is possible to identify features of a fund’s performance that indicate that investors should invest in a fund or, alternatively, remove that fund from their portfolio, we collected data on the performance and other characteristics of a set of surviving and non-surviving mutual funds from the Morningstar database over the period from January 2000 to December 2017. The monthly return data were collected both gross and net of management fees. All of these funds were US equity funds, managed on an active basis and all were domiciled in the USA. Any duplicates were removed from the sample, by downloading only the oldest share class of each fund. Each fund in the sample was categorised as focussing on large-cap, mid-cap or small-cap stocks where these categories were decomposed further by Morningstar into Blend, Growth and Value styles. The range of fund styles therefore meant that the aims of each fund and the performance of each manager could vary quite substantially. As one way of evaluating the performance of such a wide range of fund management objectives, we also collected the monthly returns on a set of benchmarks. Each fund and its stated benchmark were combined to create benchmark-adjusted returns. We also calculated the returns on each fund in excess of cash, where we proxied the risk free cash rate as the yield on a 1-month US Treasury Bill. Finally, we calculated risk-adjusted returns using the single factor model, where this single factor was the excess return on the S&P 500 Composite index, and using the four-factor Carhart model. We collected the returns for the S&P 500 Composite index and the cash rate from Thomson Financials’ DataStream service and the size and book-to-market factors from Professor Kenneth French’s website.Footnote 4 Tables 1 and 2 contain some basic descriptive statistics for the performance of the funds in the database.

Table 1 Excess return descriptive statistics
Table 2 Benchmark-adjusted return descriptive statistics

Table 1 presents descriptive statistics about the performance of the full sample of funds in excess of the proxy for the risk free rate, in both gross and net-of-fee formats. The average gross and net excess returns on the full sample of 2159 funds are 0.48% and 0.38% per month, while the average gross and net monthly alphas are 0.27% and 0.17%, respectively, where these were calculated using the single factor model. The proportion of funds producing a positive and statistically significant alpha over this period, net of fees, is 13.4%. This is a relatively high proportion compared with previous studies, but there is a good reason for this. It has long been recognised that the single factor model, with an index like the S&P500 as the factor proxy, should probably be enhanced with other factors, particularly when the funds focus on different styles, for example, large-cap versus small-cap, and value versus growth stocks.

Table 1 also presents the performance statistics for sets of funds grouped according to their stated benchmark. Of the full sample, 653 are benchmarked against the S&P500. For these funds the proportion of statistically significant net alphas is 8.9%, compared to 13.4% for the full sample. A number of the categories have generated a very high proportion of significant net alphas. For example, 28.6% of the 175 funds that are benchmarked against the Russell 2000 generate a positive and significant net alpha. These results suggest that either the funds benchmarked against other indices are managed by more skilful managers, or that the S&P 500 is not able to capture the risks embodied in these funds appropriately.

Table 2 presents descriptive performance statistics based upon fund returns in excess of the fund’s benchmark for the 2133 funds that had a stated benchmark in the Morningstar database. The proportion of funds producing a statistically significant, gross-of-fee alpha is 17.6%, just shy of one in five. However, once fees are accounted for this figure falls to 7.5% of this set of funds. Table 2 also shows that only 3.3% of the 242 funds benchmarked against the Russell 1000 Growth produce positive and significant net alphas. The category that produces the highest proportion of significant net alphas is the one benchmarked against the Russell 3000 Value index, where 23.1% of the 26 funds demonstrate skill as defined by alpha generated from a single factor model. The much lower proportions achieving positive net alphas are more in keeping with the results reported by other researchers using a similar set of funds [for example see Clare et al. (2015)].

The performance statistics presented in Tables 1 and 2 indicate that manager skill may not be distributed evenly across the US equity mutual fund sector. They also confirm the results of others that around half of the small proportion of funds that do appear to produce a statistically significant alpha do not appear to do so once fees are taken into account. This suggests that fund management companies are charging too high a fee for the skill of their managers. Furthermore, our results suggest that fund managers also struggle to produce positive and significant alphas, net of fees when we benchmark-adjust fund returns using the funds’ self-declared benchmarks.

The ranking indicators

To establish whether there may be clues to the future performance of a fund based on prior information, we need to establish a set of indicators. These indicators will form the basis of the recursive portfolio evaluation technique described above. The following indicators were calculated from the returns data described in Table 1:

  • Average Benchmark-adjusted Return (Rp − Rb)

  • Tracking Error (SD(Rp − Rb))

  • Information Ratio ((Rp − Rb)/SD(Rp − Rb))

  • Minimum Monthly Return

  • t-alpha

Using a fund’s Average Benchmark-adjusted Return as a way of ranking funds, will demonstrate whether there is persistence in fund returns, that is, the performance of the portfolios comprised of the highest benchmark-adjusted returns will tend to be high if positive performance persistence exists within the sample. Conversely the subsequent performance of the portfolios comprised of the lowest benchmark-adjusted returns will tend to be low if negative performance persistence is present in the sample. The Tracking Error indicator is calculated using monthly benchmark-adjusted returns over a full calendar year. This variable represents a measure of the off-benchmark bets that a manager has taken over the course of a year. Of course a tracking error of zero indicates that the fund has probably been run like an index tracking fund. The Information Ratio indicator is the ratio of average benchmark-adjusted returns to the fund’s Tracking Error. This reward-to-risk ratio may tell us more about the future skill of the manager than either the benchmark-adjusted return or the tracking error since, for example, a high tracking error may be a sign of poor risk control if returns are low, but a sign of manager skill if achieved returns are commensurately high. The Minimum Monthly Return is simply the lowest, benchmark-adjusted return that fund i generates over the course of a calendar year in any single month. If a fund experiences a very large benchmark-adjusted loss in any single month, this could be a sign that the risk management of this fund is poor and therefore that the fund manager’s skill is also low. Finally, we estimate the alpha of each fund, using a single factor model, but rather than using the alpha as the ranking indicator we instead use the t statistic of the alpha. 
A high alpha may have been achieved by luck, but a high t-alpha may indicate that the related alpha has been achieved with some consistency and could therefore be a better measure of future manager skill.
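As an illustration of the first four indicators, the following sketch computes them from one calendar year of monthly fund and benchmark returns. The function name and the returned keys are hypothetical, and the use of the sample (n − 1) standard deviation for tracking error is an assumption, since the text does not state which convention was applied.

```python
from statistics import mean, stdev

def returns_based_indicators(fund, bench):
    """Compute ranking indicators from one year of monthly returns.

    fund, bench: equal-length lists of monthly fund and benchmark returns.
    """
    # Monthly benchmark-adjusted returns, Rp - Rb.
    active = [rp - rb for rp, rb in zip(fund, bench)]
    avg = mean(active)
    # Tracking error: standard deviation of Rp - Rb (sample version assumed).
    te = stdev(active)
    # Information ratio: average active return per unit of tracking error.
    ir = avg / te if te > 0 else float("nan")
    return {
        "avg_active": avg,
        "tracking_error": te,
        "information_ratio": ir,
        "min_month": min(active),  # worst benchmark-adjusted month
    }
```

A tracking error of zero leaves the information ratio undefined, so the sketch returns NaN in that case rather than dividing by zero.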

The second set of potential performance indicators were collected from Morningstar, and represent characteristics of the funds in the sample. This set comprises the following indicators:

  • Fund Size (AUM)

  • Annual Net Flow

  • Annual Turnover Ratio

  • Average Number of Stocks

  • Fund fees

Fund Size is the AUM of fund i at the end of year t. It is possible that the AUM of a fund may have some impact on its future performance; in particular, managers of large funds may encounter capacity constraints in employing their strategies as the fund gets larger. Annual Net Flow is the total net flow into fund i in year t. A healthy net inflow of funds could conceivably have a positive impact on future performance by giving managers the liquidity to adapt and develop their strategies; alternatively, high net inflows may be a distraction as managers seek to find new investments for the new funds, rather than focusing on the monitoring of existing strategies. Annual Turnover Ratio is the annual turnover of fund i over year t as a proportion of AUM. Managers that turn over their portfolios frequently will incur higher transactions costs and, additionally, high turnover may be a sign of a poorly focused approach to strategy that may change frequently. There may also be a relationship between the Average Number of Stocks held by a fund and its subsequent performance. A focus on a small number of stocks may be a good indicator of manager conviction and/or confidence. Conversely, too few stocks could lead to higher tracking error and a worse risk-adjusted performance in the absence of adequate diversification. To investigate this relationship, we use the average number of fund holdings as an indicator. Finally, we rank funds by their annual fee, calculated by subtracting the fund’s net-of-fee return from its gross-of-fee return in year t. A recent Morningstar study (Morningstar 2016) found evidence of a negative relationship between fund fees and subsequent fund performance.
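The fee indicator can be illustrated with a short sketch. The text specifies only that the net-of-fee return is subtracted from the gross-of-fee return in year t; compounding the monthly returns into annual returns before taking the difference is an assumption made here, and both function names are hypothetical.

```python
def annual_return(monthly):
    """Compound a list of monthly returns into a single annual return."""
    growth = 1.0
    for r in monthly:
        growth *= 1.0 + r
    return growth - 1.0

def implied_annual_fee(gross_monthly, net_monthly):
    """Fee indicator: annual gross-of-fee return minus annual net-of-fee return."""
    return annual_return(gross_monthly) - annual_return(net_monthly)
```

For example, a fund earning 1.0% gross and 0.9% net in every month of the year would have an implied annual fee of a little over 1.3% under this convention.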

Results

In this section of the paper, we present the performance evaluation results using the recursive portfolio technique described in Sect. 3. We begin by presenting the results of pre-ranking portfolios using indicators based upon net-of-fee fund returns in excess of stated benchmarks, and then report equivalent results where the decile portfolios were pre-ranked using other fund-related data. Studies of this kind are normally based on net-of-fee returns in excess of the risk free rate, but given the variation in performance from category to category reported in Tables 1 and 2 we believe that it is important to report benchmark-adjusted returns.

Figures 1, 2, 3, 4 and 5 present the annualised benchmark-adjusted performance of the decile portfolios based upon ex post returns, while Tables 3, 4, 5, 6 and 7 present the related portfolio ranking performance statistics. The performance statistics for the equally weighted portfolios of mutual funds are based upon the sample from January 2001 to December 2017, where the returns are again benchmark-adjusted. For each table, we present the annualised benchmark-adjusted return of the portfolios over this sample period, the annualised alphas and the t statistics produced by both a one-factor model and the Carhart (1997) four-factor model. For each ranking criterion, we present the performance statistics for portfolios formed into deciles (D1–D10). The size of the sample of funds used in this study meant that the average number of funds in each decile ranged from around 130 to 200. Given this, we also look further into the ranking tails by forming equally weighted portfolios comprising the top and bottom twenty ranked funds according to each ranking criterion (referred to in the tables as T20 and B20 respectively).

Fig. 1

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was the average benchmark-adjusted return

Fig. 2

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s tracking error

Fig. 3

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s information ratio

Fig. 4

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s worst monthly performance

Fig. 5

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was the t-ratio on each fund’s alpha, generated by a one-factor model

Table 3 Recursive portfolio performance statistics: benchmark-adjusted returns
Table 4 Recursive portfolio performance statistics: tracking error
Table 5 Recursive portfolio performance statistics: information ratio
Table 6 Recursive portfolio performance statistics: worst within-month return
Table 7 Recursive portfolio performance statistics: t-alpha

Returns-based rankings

Figure 1 presents the annualised returns of the portfolios of mutual funds ranked by average, benchmark-adjusted return in year t and then monitored over year t + 1, with this process repeated annually. The chart shows some evidence of positive performance persistence, but more evidence of negative performance persistence. Table 3 presents the related performance statistics of the portfolios. Portfolio D1 produces an average, benchmark-adjusted outperformance of 0.32% per annum, while portfolio D10 produces an annualised underperformance of 1.09% pa. These results therefore provide evidence of performance persistence at both the top and bottom of the performance spectrum, as documented by others (see “Methodology” section). Further into the tail, we find that the annualised benchmark-adjusted performance of B20 is − 4.08%, a result that falls just short of significance at the 90% level, with an associated p value of 0.12.

Figure 2 shows the average returns of deciles 1–10, ordered by ex post tracking error. The figure indicates that funds with high levels of tracking error subsequently tend to underperform. Table 4 presents the alphas and t-alphas produced by these portfolios. The decile portfolios generally produce average benchmark-adjusted returns that are close to zero, with the exception of Decile 1, where the annualised benchmark-adjusted return of those funds with high tracking error is − 1.07% pa. When we delve deeper into this decile, we find a very poor return for T20 of − 6.88% per annum, a result that is significant at the 95% level of confidence. This result suggests that investors should consider selling their holdings in any fund that produces a high tracking error over any year. However, the ranking of the performance deciles indicates that those funds with the lowest tracking errors, Decile 10, also produce poor subsequent performance, though not as poor as those with the highest tracking errors that comprise Decile 1. It is possible that taking fewer off-benchmark bets in a year (resulting in a low tracking error) causes performance to suffer, and that this leads to increased risk taking in the subsequent year as a way of improving performance, which in turn leads to poor performance. The alphas for all of the deciles, along with B20, are found not to be statistically significant.

Figure 3 shows a fairly clear decline in post-ranking performance from portfolio T20 to B20 when we pre-rank funds by information ratio. We find the ordering of the performance of these portfolios to be significant at the 99% level of confidence, which indicates that the process may have some value to investors. Generally, funds with a high information ratio subsequently tend to produce higher benchmark-adjusted returns. Portfolio T20 produces an annualised benchmark-adjusted return of 0.59%, while portfolio D1 produces an annualised return of 0.41%. At the other end of the scale, portfolios D8, D9, D10 and B20 all produce negative annualised benchmark-adjusted returns. Table 5 shows that the alphas are not statistically different from zero even though they are meaningful economically at the top and bottom ends of the distribution of portfolio returns. However, a test of the significance of the difference between the returns on portfolios D1 and D10 indicates a difference that falls just short of the 90% confidence level, with an associated p value of 0.13, while the same test applied to the difference in average, benchmark-adjusted returns for portfolios T20 and B20 indicates a significant difference at the 95% confidence level.

It is possible that any manager that has a lax approach to the risk management of their portfolio might experience some occasional bad return outcomes. This, in turn, may be an indicator of future poor performance. To test this hypothesis, we calculated each fund’s worst within-month benchmark-adjusted performance in each year of the sample. We then used these “worst month” returns to rank funds, so that the top ten per cent of funds with the “best–worst” months’ performance in any year were placed in decile 1, etc. The results are reported in Fig. 4 and in Table 6. On the whole the results indicate that the use of the worst month as a way of ranking funds is not very effective since the annualised returns are all very close to 0.00%. However, portfolio B20 produces a very negative annualised benchmark-adjusted return of − 4.97%. We find the alpha produced by B20 to be statistically significant at the 95% level of confidence or better when we apply both the one- and four-factor model. This result suggests that a fund that produces a very bad month in year t should be avoided in year t + 1.

Finally, in the skill and luck literature [see for example Fama and French (2010)], skill is assessed using the t statistic on alpha generated by a factor model, rather than by fund alpha. We estimate the alpha for each fund over each calendar year in the sample, as follows:

$$R_{\text{p},t} - R_{\text{b},t} = \alpha_{\text{p}} + \beta_{0\text{p}} R_{\text{m},t} + \varepsilon_{\text{p},t}$$
(4)

We then use the t statistic on each fund’s alpha generated in year t to create mutual fund portfolios, monitoring their performance over year t + 1, and rebalancing annually. Figure 5 shows a fairly monotonic decline in performance as we move from portfolio T20 to B20, which implies that funds with a high t-alpha tend subsequently to produce a positive benchmark-adjusted return and tend to outperform those that produce a lower t-alpha. We find the decline in post-ranking portfolio performance to be significant at the 99% level of confidence, which again indicates that this approach to ranking funds may be of some value. Table 7 presents the performance statistics produced by this portfolio construction exercise. Portfolio D10 produces an annualised performance of − 1.12%, while portfolio B20 produces an annualised performance of − 1.95%. Although the alphas produced by these portfolios are not significant, and the difference in performance between D1 and D10 is not quite statistically significant, the performance difference between T20 and B20 is found to be statistically significant at the 90% confidence level.
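The t-alpha used as the ranking indicator can be obtained from an ordinary least squares fit of Eq. (4). The sketch below uses the closed-form expressions for a regression with a single explanatory variable. It assumes classical (homoskedastic) standard errors, which the text does not specify, and the function name is hypothetical.

```python
import math
from statistics import mean

def alpha_t_stat(active, market):
    """Fit active_t = alpha + beta * market_t + e_t by OLS.

    active: monthly benchmark-adjusted fund returns (Rp - Rb).
    market: monthly values of the single factor (Rm).
    Returns (alpha, beta, t statistic on alpha).
    """
    n = len(active)
    if n < 3 or n != len(market):
        raise ValueError("need at least 3 paired observations")
    mx, my = mean(market), mean(active)
    sxx = sum((x - mx) ** 2 for x in market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market, active))
    beta = sxy / sxx
    alpha = my - beta * mx
    # Residual variance with n - 2 degrees of freedom.
    resid = [y - alpha - beta * x for x, y in zip(market, active)]
    s2 = sum(e * e for e in resid) / (n - 2)
    # Classical standard error of the intercept.
    se_alpha = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    return alpha, beta, alpha / se_alpha
```

With twelve monthly observations per calendar year the regression has only ten degrees of freedom, so individual t-alphas estimated this way are noisy; the ranking relies on their ordering across funds rather than on any single value.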

Ranking by other fund characteristics

Figures 6, 7, 8 and 9 and Tables 8, 9, 10 and 11 provide the performance of mutual fund portfolios formed using other criteria.

Fig. 6

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s AUM

Fig. 7

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s annual net inflows

Fig. 8

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s annual turnover

Fig. 9

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s annual holdings

Table 8 Recursive portfolio performance statistics: AUM
Fig. 10

This figure presents the annualised, net-of-fee, benchmark-adjusted performance of portfolios of US-domiciled mutual funds. The statistics are based upon equally weighted, benchmark-adjusted returns. The portfolios were formed using the recursive portfolio technique described in the text, where the ranking indicator was each fund’s annual fees

Figure 6 and Table 8 report the results of constructing portfolios according to fund AUM. The results show that the average performance of all of the portfolios from T20 to D6 is negative. This indicates that larger funds tend to underperform their benchmarks. Portfolios D7, D8 and D10 produce positive though still small excess returns. However, we do not find the alphas produced by any of these portfolios to be statistically significant; indeed, few of the alphas are even economically significant. Intriguingly, although the average return on D10 is positive, portfolio B20 produces a negative average excess return which is economically significant if not statistically significant. The funds in this portfolio are essentially the very smallest funds in the fund universe, and they seem to perform as badly as the very largest funds captured in portfolio T20. Perhaps at both extremes the funds suffer from a lack of fund manager attention: the small funds because they are just not significant from a fee generation perspective, and the larger funds perhaps because the manager does not want to rock the fee-generating boat.

Figure 7 shows a fairly smooth appreciation in post-ranking performance as we move from portfolio T20 to B20 when we rank funds according to annual net inflows. We find this improvement in performance as net flows decline to be significant at the 99% level of confidence. In Table 9, we present the related performance statistics. Portfolios D1–D5 produce negative annualised returns that are economically significant though not statistically so. The annualised underperformance of portfolio T20 is particularly large at 1.18%. These results suggest that strong net inflows tend to lead to below benchmark performance in subsequent months. Portfolios D6–D9 all produce positive average returns, though these are arguably only economically significant for D6, a portfolio that comprises those funds that receive neither extremely high nor extremely low levels of net inflow. However, once again the alphas of these portfolios are found to be insignificantly different from zero. Portfolio D10 produces a very small, negative average return, but portfolio B20 produces a positive excess return of 0.64% pa which is economically significant. We find the performance difference between portfolios T20 and B20 to be significant at just under the 95% confidence level. It seems plausible then that net inflows could be distracting managers from their investment strategies and, indeed, making it difficult for them to implement those strategies. Our results are also consistent with those of Sapp and Tiwari (2004) who, using data from a much earlier period, find that funds that experience relatively high levels of net inflows do not subsequently outperform those with lower net inflows.

Table 9 Recursive portfolio performance statistics: annual net flows

An anonymous referee suggested that small-cap funds might suffer disproportionately from net inflows, as a consequence of the additional constraints involved in investing in small-cap stocks. To test this hypothesis we recalculated the fractile performance portfolios using the recursive portfolio formation technique, first using only small-cap funds and then using only large-cap funds. We then subtracted the benchmark-adjusted returns of the large-cap portfolio from the equivalent returns generated by the small-cap portfolio. The results are shown in the last column of Table 9. They show that there is a performance difference between the two subsets of funds. The portfolios comprising the small-cap funds tend to underperform their large-cap equivalents. For example, the table shows that the small-cap D1 portfolio underperforms the large-cap D1 portfolio by 0.03% per month, thus providing tentative evidence in support of the hypothesis. However, we do not find the difference in mean performance of these portfolios to be statistically significant at conventional confidence levels.

Figure 8 presents a similar picture to the pattern shown in Fig. 7 also displaying a fairly smooth improvement in performance as we move from funds that had high turnover to funds that had lower turnover. We again find that this performance progression is statistically significant at the 99% level of confidence. In Table 10, we present the related performance statistics. The results indicate that funds with high turnover subsequently tend to produce negative benchmark-adjusted returns. Portfolios D1–D4 produce annualised returns ranging from − 0.43 to − 0.17%. The extreme turnover portfolio, T20, produces even larger, and economically meaningful annualised underperformance of − 1.01%. By contrast, portfolios D7–D10 that comprise funds with relatively low levels of portfolio turnover all produce small positive annualised returns in excess of their benchmarks of 0.21%, 0.20%, 0.40% and 0.09%, respectively. However, intriguingly, portfolio B20 produces negative average returns. The funds that make up these portfolios had the lowest turnover levels. High turnover would seem to be an indicator of future, negative excess returns, but at the same time those funds with very low turnover also subsequently tend to produce negative excess returns. It is fairly straightforward to rationalise the relationship between high turnover and negative excess returns since turnover increases the costs of running a portfolio, other things being equal, and has been associated with wealth damaging behavioural biases such as overtrading linked to overconfidence [see Barber and Odean (2000)]. Perhaps the very low levels of turnover represented in portfolio B20 are associated with a paucity of manager investment ideas or, more simply, due to manager neglect?

Table 10 Recursive portfolio performance statistics: annual turnover

Figure 9 and Table 11 present the performance of portfolios constructed according to the average annual holdings in each fund. No clear pattern emerges with this portfolio ordering technique, which indicates that this criterion is not a useful indicator of future excess performance. The only economically notable, though not statistically significant, result in Table 11 is found for portfolio B20, where the annualised negative excess return is found to be 1.45%. This might be due to poor portfolio diversification within the 20 funds that make up this portfolio each year.

Table 11 Recursive portfolio performance statistics: average annual fund holdings

Finally, Fig. 10 and Table 12 present the performance of the pre-ordered portfolios using annual fund fees as the ranking criterion. Figure 10 shows a clear outperformance of low fee funds compared to funds with higher fees. For example, portfolio D1 produces an average, benchmark-adjusted performance of − 0.08% per month, while portfolio D10 produces an average outperformance of 0.04% per month. The 0.12% difference between the two is clearly very significant in economic terms. A test for the statistical significance of the difference between the two shows that this difference is significant at the 99% level of confidence. Table 12 presents the portfolios’ alphas and alpha t statistics from the one- and four-factor model. Taken together these results indicate that, other things equal, funds with relatively low fees should be preferred to those with relatively high fees.
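The text does not state which test was used to compare two portfolios, so the following is only one plausible way to carry out such a comparison: a paired t-test on the month-by-month difference between the two return series. The function name is hypothetical, and the two-sided p value uses a normal approximation, which is an assumption that is reasonable only for long series such as the 204 months from January 2001 to December 2017.

```python
import math
from statistics import mean, stdev

def paired_difference_test(series_a, series_b):
    """Test whether two monthly return series have the same mean.

    Returns (mean difference, t statistic, approximate two-sided p value).
    """
    diffs = [a - b for a, b in zip(series_a, series_b)]
    n = len(diffs)
    d_bar = mean(diffs)
    # Standard error of the mean difference.
    se = stdev(diffs) / math.sqrt(n)
    t_stat = d_bar / se
    # Normal approximation to the two-sided p value (assumes large n).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return d_bar, t_stat, p_value
```

Pairing the observations by month removes the variation that is common to both portfolios, which is why a modest average difference can still be statistically significant.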

Table 12 Recursive portfolio performance statistics: fund fees

Summary and conclusions

In this paper, using a comprehensive and up-to-date database of US equity mutual funds, we have explored the possibility that some indicators may act as predictors of benchmark-adjusted mutual fund performance. Using indicators based upon mutual fund returns, there does seem to be some evidence to suggest that performance persists at the top and bottom end of the performance spectrum. Rankings based on average net, benchmark-adjusted returns and on the standard deviation of these returns (tracking error) both indicate that performance tends to persist, particularly at the bottom end of the performance range. But the indicators that seem to produce the most consistent, returns-based measures of ex ante performance are the information ratio and the t-ratio on an alpha generated by a one-factor model of these returns. We find a significant deterioration in performance as we move from funds with high pre-ranking t-alphas and information ratios to funds with progressively lower pre-ranking t-alphas and information ratios. We also explored other potential indicators of performance. We find weak evidence that both large and small funds subsequently produce poor benchmark-adjusted returns. Perhaps at both extremes the funds suffer from a lack of fund manager attention: the small funds because they are just not significant from a fee generation perspective, and the larger funds perhaps because the manager does not want to rock the fee-generating boat. Overall, however, we find that AUM does not prove to be a very consistent way of identifying post-ranking performance. By contrast, we find that both net flows and fund turnover do produce fairly consistent post-ranking returns. Funds that receive high net inflows in year t tend to produce poor benchmark-adjusted returns in year t + 1. It seems plausible then that in- and out-flows distract managers from their investment strategies, making it difficult for them to implement those strategies. 
We find similar evidence when we use fund turnover as a pre-ranking criterion. Funds that experience high levels of turnover in year t tend to produce poor benchmark-adjusted returns in year t + 1. High turnover has been linked to wealth damaging behavioural biases by Barber and Odean (2000). It appears that there may be some evidence for this phenomenon in our results.

The results in this paper focus on short-term post-ranking performance—over 1 year. They suggest, in particular, that investors should avoid investing in, or should disinvest from managers that produce a low information ratio, have high turnover, where the fund experiences high net inflows and where the fund fees are high. We hope that this information will be useful to professional financial advisors and fund selectors who generally have access to data sources that will help them identify the funds that comprise, for example, the top ten per cent of funds by fund fee. However, retail investors might not have access to this sort of information. For them to make use of the results in the paper, fund providers would need to include such information in the Key Investor Information Document (KIID).Footnote 5 So for example, as well as providing information about the fund’s past performance, regulators should perhaps compel fund management companies to provide percentile-based information about the fund’s fees, turnover etc., based upon a comparable cohort.