1 Introduction

The creation of exchange traded funds (ETFs) is arguably the most significant financial innovation of the past two decades. Over 2,000 ETFs trade on North American exchanges, giving investors a means of trading categories as broad as the world stock market index (iShares MSCI World ETF) and as narrow as water purification (Invesco Global Water ETF). These funds have attracted massive capital flows, first outstripping mutual funds for net inflows in 2003 and by the end of 2019 having an estimated $4 trillion in assets under management (Blackrock 2020). In the face of this substantial shift in tradable assets, a natural question arises: how does the production of information by analysts respond to such changes? We examine how the number of analysts covering a firm and the content of their recommendations change after changes in the availability of an industry ETF that holds the firm and other firms in the same industry.

A key feature of an industry ETF is that it gives investors exposure to a common industry payoff while diversifying away exposure to the idiosyncratic payoffs of the individual firms. Further, ETFs are highly liquid, making them easily tradable for both long and short positions. Consequently, they are excellent vehicles for hedged trading strategies. For example, one use of industry ETFs is to trade the strategy “long the firm, short the ETF” to concentrate the long position on the idiosyncratic portion of the firm payoff without risking exposure to industry price movements (Bloomberg Intelligence 2017). Numerous other strategies are also available, all taking advantage of the ability to separate the common payoff factor of the ETF from the idiosyncratic payoff factors of the individual firms.

With the common and idiosyncratic components of the payoff effectively trading separately, information about each component may become more valuable. For instance, knowing that the common source of payoff variation can be hedged away, one possibility is that analysts focus their attention on collecting information that pertains only to the idiosyncratic portion of the firm’s payoff. They might focus on answering the question, “What makes this firm better or worse than other firms in its industry?” And, at the same time, other analysts might specialize in writing industry reports. More generally, with a richer set of trading opportunities, the demand for analysts’ services may change, and the nature of the information they provide may also change.

The popular media has made claims that ETFs are ruining traditional financial analysis by syphoning off liquidity from individual stocks and thus lowering the value of being privately informed about individual firms. For instance, an article in Financial News reports that “Analysts think the ‘relentless’ appetite for passive investing means stock trades are based on fund flows rather than company fundamentals (Vlastelica 2017).” And, consistent with this claim, Israeli et al. (2017) find that an increase in the percentage of a firm’s shares held by all types of ETFs collectively is associated with higher bid-ask spreads and lower liquidity. However, the impact on the value of analysts’ services is much less clear.Footnote 1 Lundholm (2021) shows that the introduction of an ETF creates two competing forces on the value of being privately informed about a firm’s idiosyncratic payoff: the hedging opportunity allows investors to remove the common source of payoff variation, making their idiosyncratic information more valuable and their equilibrium demand more extreme, but the resulting extreme investment positions also increase the informativeness of prices, making their idiosyncratic information less valuable. To understand how these two forces trade off in the data, we examine how the number of analysts making firm recommendations changes around changes in the availability of an industry ETF. The premise is that analysts produce private information about firm outcomes and this information may become more or less valuable as a result of the introduction of an industry ETF. As a consequence, a firm’s analyst following might reasonably change as a result of the ETF introduction.

In addition to studying the number of analysts following a firm, we also examine changes in the raw content of analysts’ recommendations submitted to IBES. In 2002 IBES began keeping track of whether an analyst’s firm recommendation included a recommendation about the firm’s industry (Glushkov 2010). For analyst reports that include both a firm and an industry recommendation, the firm recommendation can easily be interpreted as a relative statement about the firm inside its industry.Footnote 2 Placing a firm in the context of its industry has long been a staple of detailed analyst reports; the IBES database provides an objective way to identify and quantify this activity. Our question is, do analysts change the content of their reports to include an industry recommendation in response to changes in the availability of industry ETFs? An analyst report about a firm that also contains a recommendation about the firm’s industry is precisely the information needed to trade on idiosyncratic firm information and then use the industry ETF to hedge out the common industry component.

Analysts use varying language to summarize their final recommendation for IBES. Our second measure of the content of analysts’ firm recommendations is whether the recommendation is stated in absolute or relative terms. For instance, comparing the two negatively toned recommendations “sell” and “underperform,” the first is an absolute statement while the second is relative to some benchmark. Kadan, Madureira, Wang, and Zach (2020) examine a sample of detailed analyst reports and find wide variation in what analysts list as their benchmark, including the industry, the market, or some fixed percentage return.Footnote 3 Our question is, do analysts change their recommendations from being absolute to relative in response to changes in the availability of industry ETFs? Even if the analyst recommendation doesn’t include an industry report, if the firm recommendation is stated in relative terms, it still provides the necessary information for a trade that takes opposite positions in the firm and the benchmark ETF.Footnote 4

While analysts certainly gave some attention to industry or relative observations prior to the introduction of ETFs, as an empirical matter, most did not state their opinions in a relative manner, nor did their reports express explicit views about the industry.

The number of industry ETFs has fluctuated significantly since their introduction in 1999; in the 21 years covered in our data, the total number of industry ETFs increases in 15 years and decreases in six years. As a consequence, the number of firms held by an industry ETF also fluctuates significantly over time. We find that the number of analysts making recommendations for a firm increases in the year following an increase in the number of industry ETFs holding the firm. This result remains robust after controlling for changes in institutional ownership and other firm controls, after including year and industry fixed effects, and after removing observations most likely to be affected by endogeneity. We also find that the number of firm-specific recommendations that include an industry recommendation and the number of recommendations stated in relative terms both increase in the year following an increase in the number of ETFs. For instance, the launch of a new industry ETF is followed the next year by an estimated 15 percent increase in the number of analysts following the firm, a magnitude comparable to a 13 percent increase in institutional ownership. And the launch of a new industry ETF is followed in the next year by a 10 percent increase in the frequency of analyst recommendations with industry recommendations and a 25 percent increase in the frequency of recommendations stated in relative terms. Together these results strongly and consistently show that analysts change their behavior in response to changes in the investment opportunity set of their clients. Our final test is necessarily circumstantial, as it links the new style of relative analyst recommendations to hedge fund behavior, completing the chain of association that links investment opportunities to investor behavior.

2 Related literature

Our work contributes to the literature on analysts’ coverage decisions. Jegadeesh et al. (2004) show that sell-side analysts tend to follow “glamour stocks,” characterized by high market-to-book ratios, high sales growth rates, high momentum, and high turnover.Footnote 5 Analysts also appear attracted to large firms (Bhushan 1989), firms with intangible assets (Barth et al. 2001), firms with less readable 10-K filings (Lehavy et al. 2011), and firms with good disclosure practices (Lang 1996). All of these attributes are likely to reflect the types of firms institutional investors find attractive. Consistent with this, Brown et al. (2015) survey analysts and find that client demand drives analyst coverage. We add to this literature by showing that analyst coverage responds to changes in the investment opportunities of their clients.

Our results also add to the literature on what type of information analysts provide to their clients. Bradshaw (2011, p. 40) evaluates the rankings by institutional investors and concludes: “Clearly, analysts are valued for their ability to see individual companies within the context of the industry as a whole.” Consistent with this, Lawrence et al. (2016) find that the demand for analyst information on Yahoo!Finance increases around firm-specific events, such as earnings announcements or management guidance. Much research in this area has studied the investment value of analysts’ firm recommendations. Boni and Womack (2006) show that changes in analyst recommendations predict the relative performance of stocks within an industry. Similarly, Liu (2011) decomposes the source of value in analyst recommendations, finding that their primary contribution is their firm knowledge rather than their industry knowledge.Footnote 6 However, Howe et al. (2009) aggregate individual stock recommendation changes into industry and market predictors and find that these aggregate statistics help predict industry and market returns. Rather than build an aggregate recommendation from individual ones, Kadan et al. (2012, p. 95) exploit the new industry recommendations in IBES. They find: “Analysts exhibit across-industry expertise, as portfolios based on industry recommendations generate abnormal returns over both short and long horizons, beyond what would be explained by industry momentum.”

In sum, there is strong evidence that analysts’ recommendation changes predict relative returns within the industry and some evidence that they have predictive ability across industries. While we study recommendations, our interest is not in how valuable the recommendations are; rather we contribute to the literature by showing that an increase in the availability of industry ETFs increases the demand for analyst recommendations and changes how the recommendations are framed.Footnote 7

Finally, our work contributes to the literature on the impact of ETFs on market efficiency. As already discussed, Israeli et al. (2017) find that changes in liquidity are negatively related to changes in the percentage of a firm’s shares collectively held by all ETFs. They conclude that this reduction in liquidity lowers analysts’ incentives to collect private information about the firm. However, Huang, O’Hara, and Zhong (2021) find evidence that, when hedge funds expect a positive earnings surprise, they trade the strategy of going long in the firm and short in the firm’s industry ETF. Further, they show this activity serves to lower the subsequent post-earnings-announcement drift. The results of Bhojraj et al. (2020) are consistent with this view. They find that firms held by sector ETFs exhibit more efficient pricing of the constituent firms’ earnings news when compared to firms held only by broad-based ETFs. These results suggest that the existence of an industry ETF serves to improve the market’s efficiency. We contribute to this literature by showing that increases in the availability of industry ETFs tend to attract analyst coverage and focus their attention on relative statements of value. Both results are consistent with a market for analyst services that responds to changes in their clients’ investment opportunities.Footnote 8

3 Hypothesis development

Define the payoff to an investment in firm i as \({F}_{i}\;=\;{F}^{c}\;+\;{F}_{i}^{s}\), where \({F}^{c}\) is the common industry component and \({F}_{i}^{s}\) is the firm-specific component, \({F}^{c}\) and \({F}_{i}^{s}\) are independent, and \({F}_{i}^{s}\) is mean zero. An analyst recommendation for the firm can then be characterized as the sum of her noisy signals about the industry payoff, \({Y}^{c}\;=\;{F}^{c}\;+\;\varepsilon\), and the firm payoff, \({Y}_{i}^{s}\;=\;{F}_{i}^{s}\;+\;{\varepsilon }_{i}\), where \(\varepsilon\) and \({\varepsilon }_{i}\) are noise terms, so that the firm recommendation is \({Y}_{i}\;=\;{Y}^{c}\;+\;{Y}_{i}^{s}\).

To this setup, add an industry ETF asset whose payoff is the average of the constituent firm payoffs: \({F}_{ETF}\;=\;\frac{1}{n}\sum {F}_{i}\;=\;{F}^{c}\;+\;\frac{1}{n}\sum {F}_{i}^{s}\;\approx\;{F}^{c}\), where the last approximation is valid because the average of the firm payoff components goes to zero as n increases.Footnote 9 The possibility of trading the \({F}^{c}\) payoff separately from the total \({F}_{i}\) payoff means that a hedged position can be formed by taking opposite positions in \({F}_{i}\) and \({F}_{ETF}\), thereby attaining the payoff \({F}_{i}\;-\;{F}_{ETF}\;\approx\;{F}_{i}^{s}\). The analyst now has the choice to augment her firm recommendation \({Y}_{i}\) with an industry recommendation \({Y}^{c}\). This is effectively the same as adding the firm-specific recommendation \({Y}_{i}^{s}\), since \({Y}_{i}^{s}\;=\;{Y}_{i}\;-\;{Y}^{c}\). Similarly, a recommendation that is stated in relative terms can be characterized as issuing \({Y}_{i}\;-\;{Y}^{c}\;=\;{Y}_{i}^{s}\). Note that this is effectively what happens when analysts state that their recommendations are relative to industry benchmarks. The question then is, when an industry ETF market opens and effectively makes \({F}^{c}\) and \({F}_{i}^{s}\) separately tradable, do analysts begin making recommendations on \({Y}_{i}^{s}\), either by making recommendations on both \({Y}_{i}\) and \({Y}^{c}\) or by stating the recommendation in relative terms, as in \({Y}_{i}-{Y}^{c}={Y}_{i}^{s}\)?
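The hedging argument above can be checked with a small simulation. The sketch below is only an illustration under assumed inputs: standard normal draws for \({F}^{c}\) and \({F}_{i}^{s}\), an equal-weighted ETF, and the function name `simulate_hedge` and its parameter values are all hypothetical and not part of the paper's analysis. It compares how far the hedged payoff \({F}_{i}-{F}_{ETF}\) and the unhedged payoff \({F}_{i}\) sit from the firm-specific component \({F}_{i}^{s}\).

```python
import random
import statistics


def simulate_hedge(n_firms=200, n_draws=2000, seed=0):
    """Simulate F_i = F^c + F_i^s and an equal-weighted ETF of n_firms firms.

    Returns the standard deviation of the gap between firm 0's
    firm-specific payoff and (a) its hedged payoff F_0 - F_ETF and
    (b) its unhedged payoff F_0. Distributions are assumed standard normal.
    """
    rng = random.Random(seed)
    hedged_gap = []    # (F_0 - F_ETF) - F_0^s
    unhedged_gap = []  # F_0 - F_0^s, which equals F^c
    for _ in range(n_draws):
        f_c = rng.gauss(0.0, 1.0)                            # common industry payoff
        f_s = [rng.gauss(0.0, 1.0) for _ in range(n_firms)]  # firm-specific payoffs
        payoffs = [f_c + s for s in f_s]
        f_etf = sum(payoffs) / n_firms                       # average constituent payoff
        hedged_gap.append((payoffs[0] - f_etf) - f_s[0])
        unhedged_gap.append(payoffs[0] - f_s[0])
    return statistics.pstdev(hedged_gap), statistics.pstdev(unhedged_gap)


hedged_sd, unhedged_sd = simulate_hedge()
print(f"hedged gap sd:   {hedged_sd:.3f}")
print(f"unhedged gap sd: {unhedged_sd:.3f}")
```

Under these assumptions the hedged gap is the negative of the average firm-specific payoff, so its dispersion shrinks as the number of constituents grows, while the unhedged gap retains the full variation of the industry component.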

In the simple framework above, the signal \({Y}_{i}^{s}\) can be recovered either from a relative recommendation or a firm recommendation that includes an industry recommendation. However, the two types of recommendations are not identical. An analyst who provides an industry recommendation may be doing so to provide an extra service beyond the firm recommendation. In this case, the analyst feels she or he knows something about the industry payoff. However, providing a relative recommendation says nothing about the industry payoff; rather it presumes the client doesn’t need to know anything about the industry payoff because she or he will hedge it out using the industry ETF.Footnote 10 As summarized above, there is accumulated evidence that analysts have some skill in identifying relative value within an industry and recent evidence that they can identify across-industry value.

With this simple framework we can pose a number of questions about how analysts might respond when a new industry ETF launches. First, do analyst recommendations (\({Y}_{i}\), \({Y}^{c}\), or \({Y}_{i}^{s}\)) become more valuable, and hence do more analysts provide them, once the ETF launches? Lundholm (2021) shows that, if the informativeness of the firm price remains the same, then the ability to trade the ETF increases the value of \({Y}_{i}\) or equivalently of \({Y}_{i}^{s}\). However, trading in the ETF market generally increases the informativeness of price, which lowers the value of private information about the firm payoff, and so the net result depends on the trade-off between these two forces. In other words, it is an empirical question.

Second, if indeed there is an increase in the demand for analyst services following the launch of an industry ETF, are the analyst recommendations more likely to be stated in relative terms (\({Y}_{i}-{Y}^{c}={Y}_{i}^{s}\)) or include an industry report (\({Y}^{c}\))?

Third, if the increase in demand for analyst services and the change in the nature of their recommendations respond to the launch of an ETF, is the response stronger for firms with payoffs that depend more on the industry payoff component? In particular, is the response larger when the beta from the firm’s returns regressed on the ETF’s returns is larger? To see this prediction, modify the firm payoff to be \({F}_{i} = {\gamma }_{i}{F}^{c} + {F}_{i}^{s}\), so that firms in the ETF have varying degrees of exposure to the industry payoff component. With this, the ETF payoff becomes \({F}_{ETF} = \overline{\gamma }{F}^{c} + \overline{{F }^{s}}\), where the last term is the average of the \({F}_{i}^{s}\) terms, which are mean zero by definition, so the average goes to zero as n grows.Footnote 11 Because firms’ exposure to the industry payoff varies with \({\gamma }_{i}\), the optimal hedge also varies with \({\gamma }_{i}\). By weighting the ETF payoff by \(\left({\gamma }_{i}/\overline{\gamma }\right)\), the hedge payoff is \({F}_{i}\;-\;\left({\gamma }_{i}/\overline{\gamma }\right){F}_{ETF}\;=\;{F}_{i}^{s}- \left({\gamma }_{i}/\overline{\gamma }\right)\overline{{F }^{s}}\). The last term goes to zero as n increases, leaving only \({F}_{i}^{s}\), which is exactly the payoff the informed investor has information about. Comparing the unhedged payoff (\({F}_{i}\)) with the hedged payoff (\({F}_{i}^{s}\)) gives a measure of how important it is to be able to hedge and consequently how important relative information is to investors. Taking the difference in payoffs gives \({F}_{i}- {F}_{i}^{s} = {\gamma }_{i}{F}^{c}\), where the hedged payoff equals \({F}_{i}^{s}\) only approximately, for large n. Perhaps not surprisingly, the bigger the firm’s exposure to the industry payoff, the more important it is to be able to hedge out the industry component. To estimate \({\gamma }_{i}\), regress \({F}_{i}\) on \({F}_{ETF}\). If we define Var(\({F}^{c}\)) = g and Var(\({F}_{i}^{s}\)) = h, then the population beta is given by

$${\beta }_{i}=\frac{Cov({\gamma }_{i}{F}^{c} + {F}_{i}^{s},\overline{\gamma }{F}^{c} + \overline{{F }^{s}})}{Var(\overline{\gamma }{F}^{c} + \overline{{F }^{s}})}=\frac{{\gamma }_{i}\overline{\gamma }g+h/n}{{\overline{\gamma }}^{2}g+h/n}.$$

The right-hand side approaches \({\gamma }_{i}/\overline{\gamma }\) as n gets large. Thus the importance of hedging increases with \({\gamma }_{i}\), and so we predict that the analyst response to the ETF launch strengthens as \({\beta }_{i}\) grows.
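The limit of the population beta can be illustrated numerically. The sketch below simply evaluates the closed-form expression for \({\beta }_{i}\); the function name `population_beta` and the parameter values are hypothetical choices for illustration.

```python
def population_beta(gamma_i, gamma_bar, g, h, n):
    """Population beta of F_i on F_ETF.

    gamma_i   : firm's exposure to the industry payoff
    gamma_bar : average exposure across the ETF's constituents
    g         : Var(F^c)
    h         : Var(F_i^s)
    n         : number of firms held by the ETF
    """
    numerator = gamma_i * gamma_bar * g + h / n
    denominator = gamma_bar ** 2 * g + h / n
    return numerator / denominator


# Illustrative (assumed) parameters: gamma_i = 1.5, gamma_bar = 1, g = h = 1.
for n in (5, 50, 5000):
    print(n, round(population_beta(1.5, 1.0, 1.0, 1.0, n), 4))
```

With these assumed values the beta moves toward \({\gamma }_{i}/\overline{\gamma } = 1.5\) as n grows, because the h/n terms in the numerator and denominator vanish.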

While our hypotheses about analyst behavior are based on a change in the investment opportunity set of their clients, we do not actually observe investor behavior. However, Huang et al. (2021) find that short interest in industry ETFs increases simultaneously with hedge fund holdings of the constituent stocks just before those stocks report a positive earnings surprise. This is consistent with the ETF trading strategy “long the firm, short the industry.” Anecdotally, in a 2017 interview with etf.com, Eric Balchunas, senior ETF analyst at Bloomberg, noted that short positions in ETFs amount to $104 billion as compared to only $30 billion for long positions. Balchunas remarked: “A lot of people think hedge funds are out there trying to swing for the fences and return 100% every year. But most of them are looking to isolate certain things in the market, whether they’re using merger arbitrage, event-driven, or long/short strategies. To do the short side of those trades, they’ll use ETFs so they can cancel out the beta of the market and isolate their positions.” While our results do not depend on identifying exactly how each trader uses analyst recommendations, this evidence shows that at least some important investors trade in the way that supports our hypotheses.

4 Sample and data items

The first industry ETFs were sponsored by State Street in December 1998, so we begin our sample period in 1999 and collect data through 2020. The first industry recommendations were not recorded in IBES until September 2002, so tests involving industry recommendations are based on this somewhat smaller sample. We begin with the CRSP mutual fund database and identify all equity ETFs. From this list, we manually identify all ETFs with an industry or sector focus and then identify all firms ever held by these ETFs.Footnote 12 The result is 265 distinct industry ETFs and 3,723 distinct firms over the sample period. These numbers resemble those reported by Bhojraj et al. (2020), who also manually identify industry and sector ETFs. Then, for each of these firms, we collect their history of analyst recommendations in IBES from all analysts and all years in our sample period. After eliminating duplicate observations and requiring data for our eight control variables (discussed later), we have a sample of 307,040 firm-date-analyst observations.Footnote 13 There is considerable variation in the availability of ETFs over time; Fig. 1 shows the number of new industry ETF offerings and dissolutions by year. For example, in 2008 there were 19 new industry ETFs launched, while four were dissolved, and in 2009 there were three new industry ETFs launched, while 13 were dissolved. Thus, while ETFs are relatively new, it is important to remember that their numbers have fluctuated over time, and so our tests are not simply picking up a secular trend. Further, all our tests include year fixed effects.

Fig. 1 Industry ETF Offerings and Dissolutions

4.1 Variable definitions

4.1.1 Dependent variables

Denote the number of analysts who provide at least one recommendation for the firm during the calendar year as Analysts_ctt. From these annual counts, we create the annual change in analyst following as

  • ΔAnalystst = Analysts_ctt − Analysts_ctt-1.Footnote 14

The next two dependent variables focus on the content of analyst recommendations. For each firm-analyst-date recommendation, we create two indicator variables:

  • Industry_It = 1 if the recommendation includes a recommendation for the industry and is 0 otherwise; and

  • Relative_It = 1 if the recommendation is stated in relative terms and is 0 otherwise.

Glushkov (2010) provides the SAS code to identify when the firm-specific recommendation also contains an industry recommendation. Basically, if the firm recommendation in the etext field has a “/,” then the content after the “/” is the industry recommendation. Who makes this recommendation is not shown in the database; it could be the analyst who is making the firm recommendation or it could be a strategy or macro analyst employed for this purpose. The distinction is not important to our analysis—in either case, the analyst includes the industry recommendation as part of her or his firm-specific recommendation and puts her or his name on it. We simply describe this situation as a firm-specific recommendation that also contains an industry recommendation. Note also that we are not interested in the tone or direction of the industry recommendation, only in its existence.
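The “/” rule described above can be sketched in a few lines. This is not Glushkov’s SAS code; it is a minimal Python stand-in, and the function names and the sample strings are hypothetical illustrations rather than actual IBES records.

```python
def split_recommendation(etext):
    """Split a raw etext string into (firm part, industry part or None).

    Assumes, as described in the text, that any content after the first
    "/" is the industry recommendation.
    """
    firm, sep, industry = etext.partition("/")
    firm = firm.strip()
    industry = industry.strip() if sep else ""
    return firm, (industry or None)


def industry_indicator(etext):
    """Return 1 if the recommendation carries an industry recommendation, else 0."""
    return 1 if split_recommendation(etext)[1] is not None else 0


# Hypothetical etext strings, for illustration only.
for text in ("OUTPERFORM/ATTRACTIVE", "BUY"):
    print(text, split_recommendation(text), industry_indicator(text))
```

Only the existence of the industry part matters for the indicator, which matches the point that the tone or direction of the industry recommendation is not used.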

We manually classify Relative_It as absolute (0) or relative (1) based on the words in the etext field of the IBES firm recommendation. This most basic text exists for every recommendation, regardless of whether there is an industry recommendation in the etext field. A firm recommendation is classified as relative if it implies that it is in reference to some benchmark, even though the benchmark itself is not identified. For example, “undervalued” is absolute while “underweight” is relative; “buy” or “sell” are absolute while “under-perform” or “overperform” are relative. The complete recommendation word lists and associated classifications are given in Appendix A. To validate our coding of Relative_It, we asked three fellow faculty to independently code this variable. The average correlation between their coding and the coding in Appendix A is 95 percent.
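The absolute-versus-relative coding can be pictured as a lookup against a word list. The sketch below uses only the example terms mentioned in the text; the complete lists are in Appendix A and are not reproduced here, and the names `ABSOLUTE_TERMS`, `RELATIVE_TERMS`, and `relative_indicator` are hypothetical.

```python
# Illustrative subset only: these are the example terms given in the text,
# not the full Appendix A word lists.
ABSOLUTE_TERMS = {"buy", "sell", "undervalued"}
RELATIVE_TERMS = {"underweight", "underperform", "overperform"}


def normalize(term):
    """Lowercase the term and drop hyphens so "Under-perform" matches "underperform"."""
    return term.lower().replace("-", "").strip()


def relative_indicator(term):
    """Return 1 for a relative term, 0 for an absolute term, None if unlisted."""
    key = normalize(term)
    if key in RELATIVE_TERMS:
        return 1
    if key in ABSOLUTE_TERMS:
        return 0
    return None  # a term outside this subset would need manual classification


for term in ("Buy", "under-perform", "Underweight", "undervalued"):
    print(term, relative_indicator(term))
```

Returning `None` for unlisted terms is a design choice of this sketch: it flags wording that still needs a manual decision instead of silently defaulting to either class.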

For each firm, we sum Industry_It and Relative_It over the calendar year to create annual count variables:

  • Industry_ctt = number of industry recommendations for the firm during the calendar year; and

  • Relative_ctt = number of firm recommendations worded as relative for the firm during the calendar year.

We then use these annual count variables to create the annual change variables that, along with ΔAnalystst, are the dependent variables in our main testsFootnote 15:

  • ΔRelativet = Relative_ctt − Relative_ctt-1, and

  • ΔIndustryt = Industry_ctt − Industry_ctt-1.
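The construction of the annual counts and their year-over-year changes (current-year count minus prior-year count) can be sketched as follows. The record layout, the firm and analyst labels, and the function names are hypothetical; returning `None` when either year has no recommendations is an assumption of this sketch, not necessarily the treatment used in the paper.

```python
from collections import defaultdict


def annual_counts(records):
    """Aggregate recommendation records into firm-year counts.

    Each record is (firm, analyst, year, industry_flag, relative_flag),
    where the flags are the 0/1 indicators Industry_I and Relative_I.
    """
    analysts = defaultdict(set)
    industry = defaultdict(int)
    relative = defaultdict(int)
    for firm, analyst, year, ind_flag, rel_flag in records:
        key = (firm, year)
        analysts[key].add(analyst)   # distinct analysts with >= 1 recommendation
        industry[key] += ind_flag
        relative[key] += rel_flag
    return {
        key: {
            "analysts_ct": len(analysts[key]),
            "industry_ct": industry[key],
            "relative_ct": relative[key],
        }
        for key in analysts
    }


def annual_changes(counts, firm, year):
    """Return count(year) - count(year - 1) for each variable, or None if a year is missing."""
    current = counts.get((firm, year))
    previous = counts.get((firm, year - 1))
    if current is None or previous is None:
        return None
    return {name: current[name] - previous[name] for name in current}


# Hypothetical records for one firm over two years.
records = [
    ("ABC", "a1", 2005, 0, 1),
    ("ABC", "a2", 2005, 0, 0),
    ("ABC", "a1", 2006, 1, 1),
    ("ABC", "a2", 2006, 0, 1),
    ("ABC", "a3", 2006, 1, 0),
    ("ABC", "a1", 2006, 0, 1),
]
counts = annual_counts(records)
print(annual_changes(counts, "ABC", 2006))
```

Note that the analyst count uses distinct analysts, while the industry and relative counts sum over recommendations, so an analyst who revises twice in a year adds one to the first count but can add two to the others.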

Table 1 Panel A shows that a firm recommendation is stated in relative terms 35.3 percent of the time and includes an industry recommendation 13.1 percent of the time. Over a calendar year, the median number of analysts making recommendations is five per firm, with two of their recommendations stated in relative terms; at the median, none includes an industry recommendation. The median annual changes in our three dependent variables, ΔAnalystst, ΔRelativet, and ΔIndustryt, are all zero, and their distributions are roughly symmetric. ΔAnalystst equals −2 at the 25th percentile and +2 at the 75th percentile, and ΔRelativet equals −1 at the 25th percentile and +1 at the 75th percentile.

Table 1 Descriptive Statistics

4.1.2 Treatment variables

Our primary variable of interest is the change in a firm’s industry hedgeability—how easy it is to form an investment position that takes opposite positions in the firm and its industry ETF. For each firm-year, we count the number of ETFs that hold the firm on the last day of the year; Table 1 Panel B shows the distribution of this variable, labeled # of ETFs. The median firm is held by two industry ETFs, and 19 percent of the firm-years are not held by any ETF (untabulated). The distribution of the # of ETFs is very skewed, with the 75th percentile at four and a maximum of 20 (untabulated). For our purposes, there is no meaningful difference between being held by 19 or 20 ETFs; in either case, the industry payoff component is easily hedged. For this reason, we severely damp the right side of the distribution by creating the following variable that takes only three values:

Hedge3t  =  0 if at the end of year t there are no industry ETFs that hold the firm,

Hedge3t  =  1 if at the end of year t either one or two industry ETFs hold the firm, or

Hedge3t  =  2 if at the end of year t three or more industry ETFs hold the firm.

The idea is that, with zero ETFs, the industry factor is unhedgeable; with one or two ETFs, the industry factor is potentially hedgeable, but the trade may not be priced competitively or the ETF might not capture the industry factor effectively; and with three or more ETFs, the industry factor can definitely be identified and hedged at a competitive price. The damping of the right tail in Hedge3t resembles what we would get using the log of one plus the number of ETFs, but it is easier to interpret. Table 1 Panel B shows the distribution of Hedge3t; the median is one, and the 75th percentile is two.
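The three-value coding and its similarity to a log transform can be seen in a short sketch. The function name `hedge3` and the sample ETF counts are hypothetical; the thresholds follow the definition given above.

```python
import math


def hedge3(n_etfs):
    """Map the number of industry ETFs holding a firm to the 0/1/2 hedgeability code."""
    if n_etfs < 0:
        raise ValueError("number of ETFs cannot be negative")
    if n_etfs == 0:
        return 0  # industry factor unhedgeable
    if n_etfs <= 2:
        return 1  # potentially hedgeable
    return 2      # hedgeable at a competitive price


# Compare the coding with log(1 + number of ETFs) for a few assumed counts.
for n in (0, 1, 2, 3, 4, 19, 20):
    print(n, hedge3(n), round(math.log1p(n), 2))
```

Both transformations compress the right tail, but the three-value code treats every count of three or more identically, which is the sense in which 19 and 20 ETFs are not meaningfully different.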

For our changes tests, we construct the lagged annual change of Hedge3t. We use the prior year’s change in Hedge3t as the primary treatment variable because we expect it to take some time for analysts to adapt to the newly available hedging instrument and change their recommendation style or following decisions accordingly. By using a treatment variable that is measured a year before our outcome variables, we provide stronger evidence that the treatment caused the outcome. Our primary treatment variable for all the changes specifications is therefore:

ΔHedge3t-1  =  the annual change in Hedge3t from year t-2 to year t-1.

As expected, Table 1 Panel B shows that a firm’s hedgeability doesn’t change much from year to year; ΔHedge3t-1 is zero through the 75th percentile. However, because of our sample composition, ΔHedge3t-1 has to increase at least once over the sample period, at the inception date of the first industry ETF to hold it.

We also consider two variations on Hedge3t and ΔHedge3t-1. Hedge2t is a binary variable based solely on whether an industry ETF exists; it takes the value of 0 if Hedge3t is 0 and takes the value 1 if Hedge3t is 1 or 2. We then construct the lagged annual change as before to get ΔHedge2t-1. The correlation between ΔHedge2t-1 and ΔHedge3t-1 is 0.744.
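To make the timing of the treatment variables concrete, the sketch below builds Hedge3, Hedge2, and their lagged annual changes from a hypothetical series of year-end ETF counts for a single firm. All names and numbers are illustrative assumptions; returning `None` when a needed year is unavailable is a choice made for this sketch.

```python
def hedge3(n_etfs):
    """0 if no industry ETF holds the firm, 1 if one or two do, 2 if three or more do."""
    if n_etfs == 0:
        return 0
    return 1 if n_etfs <= 2 else 2


def hedge2(n_etfs):
    """Binary version: 0 if Hedge3 is 0, and 1 if Hedge3 is 1 or 2."""
    return 0 if hedge3(n_etfs) == 0 else 1


def lagged_change(levels_by_year, t):
    """Change in the level from year t-2 to year t-1, or None if either year is missing."""
    if (t - 1) not in levels_by_year or (t - 2) not in levels_by_year:
        return None
    return levels_by_year[t - 1] - levels_by_year[t - 2]


# Hypothetical year-end counts of industry ETFs holding one firm.
etf_counts = {2003: 0, 2004: 1, 2005: 3, 2006: 3}
h3 = {year: hedge3(n) for year, n in etf_counts.items()}
h2 = {year: hedge2(n) for year, n in etf_counts.items()}

for t in (2005, 2006, 2007):
    print(t, lagged_change(h3, t), lagged_change(h2, t))
```

In this example the move from one to three ETFs during 2005 registers as an increase in the three-value measure for outcome year 2006 but leaves the binary measure unchanged, which is one reason the two change variables are correlated but not identical.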

Hedge3Zt adds to the list of industry ETFs all the short/inverse ETFs with an industry focus. These ETFs use options, futures, and leverage to gain a negative exposure to some factor. While in theory a hedged position could be formed by going long in the firm and long in the short/inverse ETF, in practice these ETFs are only used for very brief investment strategies. The tracking error on short/inverse ETFs relative to their stated benchmark is very poor after only a day or two, making them poor substitutes for shorting the ETF with direct exposure to the factor (Dulaney et al. 2012). There are 39 inverse industry ETFs in existence at some point in our sample period, offering negative exposure to each of the 12 Fama–French industries. We construct a new count that adds one to the number of ETFs for all firms in the same Fama–French 12 (FF12) industry as the short/inverse ETF during the period that it exists. From this count variable we create Hedge3Zt and ΔHedge3Zt-1. This variable is noisy, as many firms in the FF12 industry will have little exposure to the factor but still get counted as being hedgeable, unlike the observations in ΔHedge3t-1, where the firm is actually held as part of the ETF. The correlation between ΔHedge3Zt-1 and ΔHedge3t-1 is 0.581.

4.1.3 Control variables

We collect control variables based on other studies of analyst following. It is less clear what control variables are necessary in our models of ΔIndustryt and ΔRelativet, but we include the same control variables in all tests for ease of comparison. The levels of the control variables (before we construct changes) are computed at the end of the prior calendar year (i.e., they are lagged to the same period as ΔHedge3t-1). We begin with the same control variables as Israeli et al. (2017):

  • Sizet-1 =  log of market capitalization,

  • Instpctt-1 =  the percentage of shares held by institutional investors,

  • BTMt-1 = book-to-market ratio,

  • Turnovert-1 = average monthly trading volume divided by shares outstanding,

  • Volatilityt-1 = standard deviation of monthly returns computed over the prior calendar year,

  • Intangiblest-1 = ratio of intangible assets to total assets,

  • R&Dt-1 = ratio of R&D expense to total expense, and

  • Momentumt-1 = cumulative return over prior calendar year.

Numerous studies, beginning with Bhushan (1989), have shown that analyst coverage is related to Sizet-1 and institutional ownership (Instpctt-1); larger firms and firms with more institutional ownership present a larger market for analysts to sell their services. Analysts have also been shown to be attracted by glamour stocks, proxied here by the BTMt-1 ratio and Momentumt-1 (Jegadeesh et al. 2004), and high share Turnovert-1 (Brennan and Hughes 1991), presumably because trading in these firms generates higher commissions. The literature has found mixed results on the association between analyst following and return Volatilityt-1; Brennan and Hughes (1991) find a positive relation, Jegadeesh et al. (2004) find a negative relation, and Israeli et al. (2017) and Lee and So (2017) find no relation. Finally, Barth et al. (2001) find that analyst services are more valuable for firms where the standard accounting measures are less relevant, as proxied by firms with high Intangiblest-1 and high R&Dt-1.

Like Israeli et al. (2017), we construct the changes in Instpctt-1, BTMt-1, Volatilityt-1, Turnovert-1, Intangiblest-1, and R&Dt-1. We leave Sizet-1 as a levels variable because the change would be almost the same thing as Momentumt-1, and Momentumt-1 is already defined as a change variable. Table 1 Panel B gives summary statistics for the levels and changes in control variables. Comparing some of our control variables to those of Israeli et al. (2017), our firm size is somewhat smaller, with a median log of market value of 7.2 rather than theirs of 12.9. This makes sense because the requirement to be part of an industry ETF necessarily rules out large conglomerate firms. The percentage institutional ownership is noticeably higher in our sample, at 76.8 (untabulated), as compared to 62.5 in their sample. Thus, as compared to a sample of firms held by all types of ETFs, our sample of firms held by industry ETFs is smaller but more heavily owned by institutional investors.
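The construction of the lagged change controls described above can be illustrated with a minimal sketch. It is not the estimation code behind the reported tables: the toy panel, the variable subset, and the helper name `lagged_controls` are assumptions made only for illustration. The sketch differences the controls between year t-2 and year t-1 and keeps Size in levels.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy panel: (firm, year) -> control levels measured at year-end.
levels: Dict[Tuple[str, int], Dict[str, float]] = {
    ("AAA", 2009): {"Size": 7.0, "Instpct": 0.70, "BTM": 0.50},
    ("AAA", 2010): {"Size": 7.2, "Instpct": 0.75, "BTM": 0.45},
    ("AAA", 2011): {"Size": 7.1, "Instpct": 0.72, "BTM": 0.55},
}

CHANGE_VARS = ("Instpct", "BTM")  # differenced controls (subset, for brevity)
LEVEL_VARS = ("Size",)            # kept in levels, as in the text


def lagged_controls(panel, firm: str, year: int) -> Optional[Dict[str, float]]:
    """Controls for the year-t regression row, all dated t-1.

    Chg_X_{t-1} = X_{t-1} - X_{t-2}; Size_{t-1} stays in levels.
    Returns None when either of the two prior years is missing.
    """
    prev = panel.get((firm, year - 1))
    prev2 = panel.get((firm, year - 2))
    if prev is None or prev2 is None:
        return None
    row = {f"Chg_{v}": prev[v] - prev2[v] for v in CHANGE_VARS}
    row.update({v: prev[v] for v in LEVEL_VARS})
    return row


row_2011 = lagged_controls(levels, "AAA", 2011)
print(row_2011)
```

Returning `None` for firm-years without two prior observations mirrors the fact that a change variable cannot be formed at the start of a firm's history.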

Table 2 reports the correlations between our variables of interest. The dependent variables are positively correlated at significant levels, giving some assurance that they are measuring the same underlying forces. The highest correlation is for ∆Analystst and ∆Relativet, which are correlated at the 0.607 level (p-value < 0.01). All three dependent variables are significantly correlated with the primary treatment variable ΔHedge3t-1. While some of the control variables are significantly related to the dependent variables, none have a correlation higher than 10 percent.

Table 2 Correlations

5 Research design and results

Our results are organized as follows. We first present results for changes in analyst following. A change in this variable represents a significant change in the effort allocation of analysts. We then present results for changes in the two variables derived from the content of the recommendations. These variables capture specific ways an analyst may change her or his behavior. These three tables form our main results. We follow this with three different specification checks that limit the data in some way. Finally, we consider two very different model designs, repeating the analysis on quarterly periods, and examining the level of the recommendation data (rather than changes).

5.1 Annual change model and results

Our primary treatment variable is ΔHedge3t-1, the lagged change in hedgeability. A change specification is a natural way to make the firm its own control. And by lagging the main treatment variable, we increase our confidence that the changes in the dependent variables are in response to the change in the hedgeability of the firm. Below we show the specification for ΔAnalystst and ΔHedge3t-1. The specifications for the other two dependent variables, ΔIndustryt and ΔRelativet, and for the treatment variables ΔHedge2t-1 and ΔHedge3Zt-1, are the same (footnote 16):

$$\begin{array}{l} \Delta Analysts_t=\beta_0\Delta Hedge3_{t-1}+\beta_1Size_{t-1}+\beta_2Chg\_Instpct_{t-1}+\beta_3Chg\_BTM_{t-1}+ \\ \qquad \qquad \qquad \beta_4Chg\_Turnover_{t-1}+\beta_5Chg\_Volatility_{t-1}+\beta_6Chg\_Intangibles_{t-1}+\beta_7Chg\_R\&D_{t-1}+ \\ \qquad \qquad \qquad \beta_8Momentum_{t-1}+\alpha_t+\delta_k+\varepsilon_t,\end{array}$$

where αt are the year fixed effects (t = 1 to 21) and δk are the industry fixed effects (k = 1 to 12).
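To make the role of the year and industry fixed effects in this specification concrete, the following is a minimal sketch of how the coefficient on ΔHedge3t-1 could be recovered in the no-controls case. It sweeps out both sets of fixed effects by alternating demeaning (the Frisch-Waugh-Lovell logic) and then computes a simple slope. The toy data, the effect sizes, and all function names are assumptions for illustration. The control variables and the firm-level clustering of standard errors are not shown, and this is not the code used to produce the reported estimates.

```python
from collections import defaultdict


def demean_by(values, keys):
    """Subtract the group mean (groups defined by keys) from each value."""
    total, count = defaultdict(float), defaultdict(int)
    for v, k in zip(values, keys):
        total[k] += v
        count[k] += 1
    return [v - total[k] / count[k] for v, k in zip(values, keys)]


def sweep_fixed_effects(values, years, industries, n_iter=200):
    """Alternately demean by year and by industry until both effects are removed."""
    out = list(values)
    for _ in range(n_iter):
        out = demean_by(out, years)
        out = demean_by(out, industries)
    return out


def fe_slope(y, x, years, industries):
    """OLS slope of y on x after partialling out year and industry fixed effects."""
    y_t = sweep_fixed_effects(y, years, industries)
    x_t = sweep_fixed_effects(x, years, industries)
    return sum(a * b for a, b in zip(x_t, y_t)) / sum(a * a for a in x_t)


# Hypothetical, slightly unbalanced toy panel (one observation per row).
years = [2001, 2001, 2001, 2002, 2002, 2003, 2003, 2003]
industries = ["A", "B", "C", "A", "B", "B", "C", "A"]
d_hedge3_lag = [1, 0, 2, 0, 1, 3, 0, 1]           # stands in for ΔHedge3_{t-1}
year_fe = {2001: 0.5, 2002: -0.2, 2003: 0.1}      # assumed α_t
industry_fe = {"A": 0.3, "B": -0.1, "C": 0.0}     # assumed δ_k
TRUE_BETA0 = 0.18                                 # assumed coefficient

# Noise-free outcome, so the sketch should recover TRUE_BETA0.
d_analysts = [
    TRUE_BETA0 * x + year_fe[t] + industry_fe[k]
    for x, t, k in zip(d_hedge3_lag, years, industries)
]

beta0_hat = fe_slope(d_analysts, d_hedge3_lag, years, industries)
print(round(beta0_hat, 6))
```

In practice a regression package with dummy variables or absorbed fixed effects would be used. The alternating-demeaning form is shown only because it makes explicit that β0 is identified from variation within year and within industry.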

The first two columns of Table 3 present the regressions for ΔAnalystst on ΔHedge3t-1, with and without the control variables. The coefficient of interest is β0. All models include year and industry fixed effects, and t-statistics are computed with standard errors clustered at the firm level (footnote 17). The first two columns show that, after an increase in the number of industry ETFs holding the firm, there is a significant increase in the number of analysts following the firm. The estimated effect in column 1 is 0.148, relative to the median analyst following of five and the median change in following of zero. For the ΔAnalystst model with controls, the estimate increases to 0.182. In terms of the control variables, the change in institutional ownership, the change in volatility, and momentum are significant. Comparing estimates across variables in column 2, an increase of one ETF in the prior year is estimated to result in a 0.182 increase in the number of analysts, which is the same effect as increasing institutional ownership by 13.1 percent (1.384*0.131 = 0.182).

Table 3 Changes in Analyst Following

The regressions in columns 3 and 4 of Table 3 are based on changes in Hedge2t, a binary variable that equals zero if no ETF exists and one otherwise. As seen in columns 3 and 4, the coefficient on ΔHedge2t-1 is slightly higher for both the simple regression and the regression with controls added. The coefficients on the control variables also remain very similar. The regressions in columns 5 and 6 are based on ΔHedge3Zt-1, the measure of hedgeability that includes short/inverse industry ETFs. As seen in the table, the coefficient is lower than for the other two versions of this variable but still significant both with and without the controls. As discussed earlier, we conjecture that the lower coefficients are due to the noise in ΔHedge3Zt-1.

To summarize Table 3, an increase in the hedgeability of the firm leads to an increase in the number of analysts providing recommendations for the firm. This conclusion holds in the presence of industry and year fixed effects, with or without control variables, and for three different definitions of hedgeability.

The next two tables examine the relation between changes in hedgeability and changes in the use of relative and industry recommendations. We consider all three definitions of hedgeability and present results with and without the control variables. Table 4 gives the results for ΔRelativet. As seen in column 1, an increase in ΔHedge3t-1 is associated with a significant increase in the use of relative words in analyst recommendations. Recall that, for the full sample, the median firm has two relative recommendations, and so the 0.246 increase shown in column 1 is meaningful. The estimated effect increases slightly when we add the control variables, as given in column 2. The results for ΔHedge2t-1, shown in columns 3 and 4, are virtually identical to the results for ΔHedge3t-1. The coefficients on ΔHedge3Zt-1 are smaller than in the other regressions but are still significant. Across all definitions of hedgeability, the three control variables that were significant in the ΔAnalystst regressions (the change in institutional ownership, the change in volatility, and momentum) are significant in these regressions as well. In addition, Sizet-1 is significant in all regressions.

Table 4 Changes in Use of Relative Terms in Recommendations

Table 5 gives the results for ΔIndustryt. As seen in column 1, an increase in ΔHedge3t-1 is associated with an increase in the provision of industry recommendations. The overall median number of firm recommendations that include industry recommendations is 0 (with a mean of one), so the 0.102 increase shown in column 1 is meaningful. The results for ΔHedge2t-1 are slightly larger and equally significant. The coefficient on ΔHedge3Zt-1 is smaller; it is significant when the control variables are present but not in the simple regression shown in column 5.

Table 5 Changes in Provision of Industry Recommendations

To summarize Tables 3, 4, and 5, an increase in the ability to hedge the firm in year t-1 is greeted in year t with an increase in the number of analysts who follow the firm and an increase in their use of relative recommendations and industry recommendations. These results hold in the presence of industry and year fixed effects, with or without controls, and for three different measures of the hedgeability of the firm. We depict our results in Fig. 2. The figure plots the level of our three dependent variables over time, separately for observations with increasing Hedge3t-1 or decreasing Hedge3t-1 at date t-1. For all three plots, there is little difference between the increasers and decreasers prior to the change in hedgeability. However, starting with date 0 and continuing for the next three years, the difference between increasers and decreasers grows.

Fig. 2 Mean number of analysts, industry recommendations, and relative recommendations over time. (Date -1 is when the change in Hedge3 is measured. Date 0 is the year of the response.)

5.2 Robustness tests

In all the tables that follow, we only report results for Hedge3t-1, but in all cases, the results are very similar with either of the other two variations on hedgeability. The three different tests in this section are all based on subsets of the data.

5.2.1 Removing hot ETF offerings

One concern with our results so far is that some industries become hot for unmodeled reasons and this attracts analysts and ETF sponsors; that is, our results could be driven by an unobserved endogenous event. The lagged timing between the change in hedgeability and the changes in our analyst variables gives us some assurance of causality. However, we do not have a classic exogenous identifying event that could be used to remove a hot industry effect. Nonetheless, we can exploit the way that sponsors issue ETFs to increase the likelihood that we have identified a causal event. ETFs are issued by sponsoring banks, and they are frequently issued in batches. In our sample period, there are 54 industry ETF launching events with either a unique sponsor or from the same sponsor but separated by more than six months, with a median of four ETFs included in a given launch (untabulated). We reason that, when the sponsor offers ETFs in many different industries at the same time, it is less likely to be responding to a hot industry effect, since not all industries can be hot at the same time. Thus, to create a treatment variable that is less likely to be influenced by this effect, we eliminate all ETF offerings that are part of a launch that has fewer than four ETFs. We then create a new treatment variable, ΔHedge3Xt-1, that mirrors the construction of ΔHedge3t-1 but only counts ETFs using this modified database.

The results for all three dependent variables are in Table 6. For the ΔAnalystst model, comparing the first two columns of Table 3 with the first two columns of Table 6, we see that the coefficients on ΔHedge3Xt-1 in Table 6 are larger than the corresponding coefficients on ΔHedge3t-1 in Table 3 (footnote 18). For the ΔAnalystst model with controls (column 2), the coefficient goes from 0.182 in Table 3 to 0.254 in Table 6, a 40 percent increase. Similarly, for all specifications of ΔRelativet and ΔIndustryt, the coefficient is larger using the sample with hot offers removed in Table 6 than in the corresponding models shown in Tables 4 and 5. If an endogenous hot industry effect were driving our results, then we would expect that eliminating the most suspicious observations would weaken our findings. The fact that the results strengthen in all cases is added evidence that the change in hedgeability is causing the change in analyst behavior.
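The batch filter behind ΔHedge3Xt-1 can be sketched as follows. This is an illustrative reading of the rule, not the procedure actually run on the data. The record layout, the tickers and sponsors, the approximation of six months as 183 days, and the choice to measure the gap from the previous ETF of the same sponsor are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby


@dataclass(frozen=True)
class Etf:
    ticker: str      # hypothetical identifier
    sponsor: str
    launched: date


MAX_GAP_DAYS = 183   # assumption: "six months" approximated as 183 days
MIN_BATCH = 4        # launches with fewer ETFs than this are dropped


def launch_events(etfs):
    """Group ETFs into launch events.

    A new event starts when the sponsor changes or when the gap to the
    sponsor's previous launch exceeds MAX_GAP_DAYS.
    """
    events = []
    ordered = sorted(etfs, key=lambda e: (e.sponsor, e.launched))
    for _, group in groupby(ordered, key=lambda e: e.sponsor):
        batch = []
        for etf in group:
            if batch and (etf.launched - batch[-1].launched).days > MAX_GAP_DAYS:
                events.append(batch)
                batch = []
            batch.append(etf)
        events.append(batch)
    return events


def broad_launch_etfs(etfs):
    """Keep only ETFs that belong to a launch with at least MIN_BATCH ETFs."""
    return [e for batch in launch_events(etfs) if len(batch) >= MIN_BATCH for e in batch]


# Hypothetical toy data: one broad launch, one isolated launch, one small launch.
etfs = [
    Etf("S1A", "Sponsor1", date(2006, 5, 1)),
    Etf("S1B", "Sponsor1", date(2006, 5, 1)),
    Etf("S1C", "Sponsor1", date(2006, 5, 8)),
    Etf("S1D", "Sponsor1", date(2006, 6, 1)),
    Etf("S1E", "Sponsor1", date(2008, 1, 15)),
    Etf("S2A", "Sponsor2", date(2007, 3, 1)),
    Etf("S2B", "Sponsor2", date(2007, 3, 1)),
]

kept_tickers = sorted(e.ticker for e in broad_launch_etfs(etfs))
print(kept_tickers)
```

The modified treatment variable would then be built from the retained ETFs only, in the same way that ΔHedge3t-1 is built from the full set.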

Table 6 Annual Changes Excluding Hot-offerings

5.2.2 Removing cases where the ETF is not a good hedging instrument

For the next test, we measure how good the industry ETF is as a hedging instrument, as derived in our hypothesis section. We regress the firm’s monthly stock returns on the ETF’s monthly returns for the two years ending in the current year and record the beta. Recall that the usefulness of the ETF as a hedge increases with the exposure the firm has to the ETF, as measured by the beta from this regression. We then create the variable GoodHedget, which equals one if the beta is greater than or equal to one and zero otherwise, and then only study the subsample where GoodHedget = 1. As shown in Table 1 Panel B, 57.4% of the observations qualify as having a good hedge by this rule. As before, we estimate models with and without the control variables. The first two columns in Table 7 show the results for ΔAnalystst when GoodHedget = 1. Comparing columns 1 and 2 from Table 7 with columns 1 and 2 from Table 3, we see that the coefficients on ΔHedge3t-1 increase when the sample is restricted to GoodHedget = 1. For example, in the ΔAnalystst regression with controls, the coefficient increases from 0.184 in the full sample to 0.207 in the restricted sample, a 13 percent increase. Similarly, comparing the results for the ΔRelativet regression shown in column 4 of Table 7 with column 2 in Table 4 shows that the coefficient on ΔHedge3t-1 increases dramatically, going from 0.255 in the full sample to 0.388 in the restricted sample, a 52 percent increase. And finally, comparing columns 5 and 6 from Table 7 with columns 1 and 2 from Table 5 shows that the coefficients in the ΔIndustryt regression stay the same or increase. As hypothesized, as the usefulness of the ETF increases, the analyst following and the use of relative statements and industry recommendations increase in response.
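The GoodHedget rule can be illustrated with a minimal sketch: a univariate regression slope of firm returns on ETF returns over 24 months, followed by the threshold at one. The return series below are synthetic and the function names are assumptions. The sketch does not address how a firm held by several ETFs is handled.

```python
def ols_beta(firm_returns, etf_returns):
    """Slope from regressing firm returns on ETF returns (with an intercept)."""
    n = len(firm_returns)
    if n != len(etf_returns) or n < 2:
        raise ValueError("need two equally long series with at least 2 observations")
    mean_f = sum(firm_returns) / n
    mean_e = sum(etf_returns) / n
    cov = sum((f - mean_f) * (e - mean_e) for f, e in zip(firm_returns, etf_returns))
    var = sum((e - mean_e) ** 2 for e in etf_returns)
    return cov / var


def good_hedge(firm_returns, etf_returns, threshold=1.0):
    """GoodHedge indicator: 1 if the estimated beta is at least the threshold."""
    return 1 if ols_beta(firm_returns, etf_returns) >= threshold else 0


# Synthetic 24 months of ETF returns (hypothetical values).
etf = [0.01 * ((m % 5) - 2) for m in range(24)]

# Two hypothetical firms with exact exposures of 1.2 and 0.6 to the ETF.
firm_high = [0.002 + 1.2 * r for r in etf]
firm_low = [0.001 + 0.6 * r for r in etf]

print(round(ols_beta(firm_high, etf), 4), good_hedge(firm_high, etf))
print(round(ols_beta(firm_low, etf), 4), good_hedge(firm_low, etf))
```

Only observations with an indicator of one would enter the restricted sample used in Table 7.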

Table 7 Subsample where Hedge is Good

5.2.3 Removing cases around the global settlement

The next robustness test we consider restricts the sample to 2003 and beyond. The Global Settlement and all the associated exchange rule changes caused a significant shake-up in how analysts work, including the requirement that they explain what their benchmark is when they make a recommendation. The majority of these changes took effect in 2002, which is in our sample period for ΔRelativet and ΔAnalystst. (ΔIndustryt does not start until 2003.) Recall that, if these rule changes caused a change in our dependent variables in 2002, it would be captured by the year fixed effect. However, to be sure that our results are not driven by this regulatory change, in Table 8, we restrict the sample to 2003 and beyond. The results for ΔAnalystst are very similar to the results from the full sample in Table 3, with a somewhat larger coefficient in column 1 and a somewhat smaller one in column 2. The results for ΔRelativet are smaller than in Table 4 but still significant. While the changes following the Global Settlement were significant, our results are not driven by these changes.

Table 8 After the Global Settlement

5.3 Additional tests

5.3.1 Analysis of quarterly changes

The last two tests we present are based on more dramatic changes in our research design. First, we redo the entire analysis using quarters as our unit of time, rather than years. The tests are now about how a change in the ETF hedgeability in quarter t-1 is related to changes in our dependent variables in quarter t. The advantage of this test is that it is less likely to include unknown effects in the event window; the disadvantage is that it assumes analysts change their behavior in response to the change in hedgeability relatively quickly. The results are given in Table 9. As seen in the table, ΔHedge3t-1 is significant for all three dependent variables and all specifications. For instance, column 2 shows that an additional ETF in quarter t-1 is associated with an increase of 0.196 in the number of analysts covering the firm in quarter t. The results for ΔRelativet and ΔIndustryt, for all specifications, are similarly significant when using quarterly data.

Table 9 Quarterly Changes Analyses

5.3.2 Analysis based on recommendation levels

Our last test changes the unit of analysis for the two recommendation-based dependent variables to be at the firm-analyst-date level. This is the most basic level of the analyst recommendation data, with 307,040 observations for Relative_It and 242,172 observations for Industry_It. While we believe our previous changes analysis is a much stronger research design, it is interesting to see whether our hypothesis about hedgeability predicts whether each recommendation will be stated in relative terms (Relative_It = 1) or will include an industry recommendation (Industry_It = 1). In addition, with data at this level of detail, we can use brokerage-level clustering for our significance tests. This is important because brokerage houses might make policy changes that cause all their analysts to issue industry recommendations or to state their recommendations using specific language. Such a policy would cause the residuals from firms covered by the same brokerage to be correlated. Brokerage-level clustering addresses this concern (footnote 19).

Recall from Table 1 that 35.3 percent of the observations are stated in relative terms and 13.1 percent of the observations include an industry recommendation. The correlation between these two indicator variables is 0.22, and the correlation between either variable and any of the control variables is less than 10 percent (untabulated). Below we show the model for Relative_It for the OLS regression with control variables, where a recommendation observation is for a given firm at a given date by a given analyst; the model is the same for Industry_It (firm and analyst subscripts are suppressed):

$$Relative\_I_{t} = {\beta }_{0}Hedge{3}_{t} + {\beta }_{1}{Size}_{t} + {\beta }_{2}{Instpct}_{t} + {\beta }_{3}{BTM}_{t} + {\beta }_{4}{Turnover}_{t} + {\beta }_{5}{Volatility}_{t} + {\beta }_{6}{Intangibles}_{t} + {\beta }_{7}R\&{D}_{t} + {\beta }_{8}{Momentum}_{t} + {\alpha }_{t} + {\delta }_{k} + {\varepsilon }_{t}.$$

The coefficient of interest is β0. αt are the year fixed effects that capture heterogeneity across years that is firm invariant, and δk are the industry fixed effects that capture heterogeneity across the Fama and French 12 industries that is time invariant. We estimate two types of models: conditional logit models because our dependent variables are dichotomous and standard OLS models because the estimates of β0 are easier to interpret. Because we include industry and year fixed effects, the results should be interpreted as explaining variation within the industry and year. The year fixed effects are particularly important, as they control for any unobserved differences across years that might influence our dependent variables. For instance, if there is a secular trend over time or specific policy changes in a certain year, the year fixed effects control for this. We estimate the models with and without the previously discussed control variables, and all models compute standard errors clustered at the firm and brokerage levels (footnote 20).

The first four columns of Table 10 show that, for all specifications, the likelihood of the analyst recommendation being stated in relative terms is significantly higher the more hedgeable the firm is. For instance, the model in column 3 estimates that, within a given year and industry, a one-unit higher value of Hedge3t is associated with a 2.5 percent increase in the likelihood that the recommendation is stated in relative terms. The results using either the conditional logit or OLS model are significant, with and without the control variables present. The last four columns of Table 10 show that, for all specifications, the likelihood that the recommendation includes an industry recommendation is significantly higher the more hedgeable the firm is. For instance, the model in column 7 shows that a one-unit higher value of Hedge3t is associated with a 4.7 percent increase in the likelihood of having an industry recommendation. These results are significant for both types of models, with year and industry fixed effects, with or without the control variables.

Table 10 Recommendation Level Analysis

The control variables are mainly significant. The likelihood that the recommendation is stated in relative terms or that it includes an industry recommendation increases with the size of the firm and the percentage of institutional ownership. This makes sense, as it is more likely that large firms with heavy institutional ownership will be part of a hedged investment strategy. All the glamour proxies (BTM, Turnover, and Momentum) are also significant, with the signs suggesting that value stocks are more likely to receive recommendations that are relative or include an industry recommendation. This would in turn suggest that value firms are also more likely to be part of a more sophisticated investment strategy that calls for a hedged position, and so analysts provide information to aid in that type of strategy. Return volatility is significantly positive in all models, which also points to the use of a hedged strategy to lower the payoff volatility.

The results in Table 10 are consistent with our hypothesis that relative information is more valuable when investors have the ability to hedge out the industry factor. Consequently, when a firm has a hedgeable industry payoff component, analysts produce information that is most useful to an investor who is interested in hedging out the industry factor. The control variables suggest that this is most valuable for large, institutionally held value firms, with relatively volatile returns.

5.3.3 Analysis of analyst upgrades and hedge fund behavior

To investigate whether investors trade based on the relative information in an analyst recommendation, we seek evidence that hedge funds increase their holdings before an analyst recommendation upgrade and, at the same time, hedge their industry exposure with a short position in an ETF. This evidence is necessarily indirect, as hedge funds do not disclose their strategies at this level of detail. Our method closely follows that of Huang et al. (2021), who document that hedge funds appear to use a long-short strategy around positive earnings announcements.

From the large collection of 13F filers, we identify hedge funds using the hand-collected data provided by Jiang (2019) (footnote 21). Following Huang et al. (2021), we create an indicator for when the data suggest that hedge funds are going long in a stock and short in the associated ETF. For firm i at quarter-end t, we measure abnormal hedge fund holdings (AHF) as the aggregate hedge fund holdings of the stock less the average of the past four quarters’ holdings, scaled by shares outstanding at quarter end. We measure abnormal short interest (ASI) as the short interest in all ETFs that hold the stock less the average of the past four quarters’ short interest, scaled by shares outstanding. The indicator I_long-short equals one if both AHF and ASI are in the top quintile and equals zero otherwise. When the indicator equals one, the data are consistent with hedge funds trading the hypothesized long-short strategy. The second step is to create an indicator for when it is likely that the hedge fund has private information at quarter-end t about an upcoming analyst recommendation upgrade or about the underlying event that causes the upgrade. The indicator I_upgrade equals one if there is at least one analyst recommendation increase between quarter-end t and quarter-end t + 1 that is stated in relative terms or provides an industry recommendation and equals zero otherwise. This procedure replicates the Huang et al. (2021) method but replaces positive earnings announcements with analyst upgrades. As with their study, there are repeated firm-quarter observations because a stock could be included in multiple ETFs. The final sample is 235,702 ETF-firm-date observations.
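The construction of AHF, ASI, and I_long-short described above can be sketched as follows. The numbers are invented, the function names are hypothetical, and two details are assumptions rather than facts from the text: quintiles are formed over whatever cross-section is passed in, and ties at the cutoff count as being in the top quintile.

```python
def abnormal(current, past_four, shares_outstanding):
    """(current value - mean of the past four quarters) / shares outstanding."""
    if len(past_four) != 4:
        raise ValueError("expected exactly four prior quarters")
    return (current - sum(past_four) / 4) / shares_outstanding


def top_quintile_cutoff(values):
    """Smallest value that still falls in the top 20 percent of the sample."""
    ranked = sorted(values, reverse=True)
    k = max(1, len(ranked) // 5)
    return ranked[k - 1]


def long_short_flags(ahf, asi):
    """I_long-short = 1 when both AHF and ASI are in their top quintiles."""
    ahf_cut = top_quintile_cutoff(ahf)
    asi_cut = top_quintile_cutoff(asi)
    return [1 if a >= ahf_cut and s >= asi_cut else 0 for a, s in zip(ahf, asi)]


# One hypothetical observation: holdings rise from an average of 1,000 to 1,200
# shares, with 10,000 shares outstanding.
example_ahf = abnormal(1200, [1000, 900, 1100, 1000], 10000)

# Hypothetical cross-section of ten observations.
ahf = [0.010, 0.050, 0.000, 0.040, -0.010, 0.002, 0.003, 0.001, -0.020, 0.004]
asi = [0.001, 0.030, 0.002, 0.000, 0.020, 0.003, 0.004, -0.001, 0.005, 0.006]

flags = long_short_flags(ahf, asi)
print(example_ahf, flags)
```

Requiring both measures to be in the top quintile is what ties the indicator to the joint long-stock, short-ETF position rather than to either leg alone.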

To test for evidence that hedge funds are informed and trading the hypothesized strategy, we estimate the regression:

$$I\_long\text{-}short_{t} = {\beta }_{0}I\_{upgrade}_{t} + {\beta }_{1}{Size}_{t} + {\beta }_{2}{BTM}_{t}+ {\beta }_{3}{Reversal}_{t} + {\beta }_{4}{Instpct}_{t} + {\beta }_{5}{Momentum}_{t} + {a}_{t} + {b}_{k} + {c}_{j} + {d}_{i} + {e}_{t},$$

where at, bk, cj, and di are year, quarter, industry, and ETF fixed effects. The standard errors are clustered by ETF and year-quarter.

The results presented in Table 11 consistently demonstrate a positive association, with or without controls, between an analyst upgrade stated in relative terms during a quarter and hedge funds adopting a long position in the upgraded firm while simultaneously going short in an associated industry ETF. These findings not only corroborate our hypotheses but also form a crucial link in the logical chain, connecting the presence of an industry ETF to the stylistic evolution of analyst recommendations and subsequently to investor behavior.

Table 11 Relation between Analyst Upgrades and Hedge Fund Behavior

6 Conclusion

Analysts play a crucial role in gathering and disseminating information in financial markets. Our study shows that changes in the investment opportunity set of analysts’ clients cause changes in the type of information they gather and disseminate. When an industry ETF launches, the analyst following of firms in that ETF increases in the following year, and the resulting recommendations are more often stated in relative terms and more often include an industry recommendation. This behavior is consistent with the view that the primary value of analyst recommendations comes from analysts’ ability to understand the firm relative to other firms in the same industry. The advent of ETFs, offering a cost-effective method to hedge out factors where analysts lack a comparative advantage in forecasting, enhances the value of analysts’ firm knowledge. Thus, rather than concluding that ETFs are destroying the value of analysts’ research, our evidence suggests the opposite: after removing exposure to the common payoff component that the analyst has little advantage in forecasting, the value of firm knowledge grows.

An unanswered question is where the resources come from to fulfill this new demand for relative information. Are more total resources devoted to providing recommendations at both the firm and industry level? Or do analysts shift resources away from other activities to provide this new type of information?