1 Introduction

Arguably one of the most important questions in industrial organization is: what is the effect of market structure on innovation? Schumpeter (1934) initially suggested that small entrepreneurial firms were the source of most innovation but subsequently (Schumpeter 1942) argued that, to the contrary, large established firms were the force behind technological progress. Economists have subsequently offered an array of conflicting theoretical arguments and predictions about the effects of different market structures on innovation.

Arrow (1962) famously argued that because of the “replacement effect,” a firm operating in a competitive market has a greater incentive to innovate than a monopoly. Specifically, since an incumbent monopolist’s return from innovation is only the increment above the monopoly rents it already earns, while a firm operating in a competitive market could realize the full return on its investment, the latter has a greater incentive to invest in R&D. Gilbert and Newbery (1982), by contrast, argue that a monopoly firm facing the loss of its monopoly to an innovating entrant has an incentive to invest in innovation more aggressively than a prospective entrant. On the other hand, Reinganum (1982) showed that this result may reverse once the uncertainty of the outcome of the R&D process is considered.

A very large and somewhat bewilderingly diverse literature has arisen to address the relationship between firm size and innovation. Important surveys of this literature include Cohen (2010), Baker (2007), and Shapiro (2011). Several theoretical arguments have been advanced to justify a positive effect of firm size on investment in innovation. One claim is that capital market imperfections confer an advantage on large firms in securing finance for risky R&D. Additional arguments are that there are scale economies in R&D and that there are complementarities between R&D and other nonmanufacturing activities such as marketing. Empirically, the consensus is that R&D activity does indeed increase with firm size, but only proportionately (Cohen 2010). This finding suggests that, contrary to Schumpeter (1942), large size offers no advantage in the conduct of R&D since, holding industry sales constant, the same amount of R&D will be conducted whether an industry is composed of large firms or a greater number of smaller firms.

Turning from R&D inputs to innovative output, Scherer (1965), Acs and Audretsch (1988, 1990, 1991), and Geroski (1995a) present evidence that R&D productivity (i.e., innovations per unit of R&D) tends to decline with firm size, while Pavitt et al. (1987) suggest a U-shaped relationship, with very large firms displaying relatively high R&D productivity. More recently, Lerner (2006) finds that smaller firms account for a disproportionate share of financial service innovations. However, Huergo and Jaumandreu (2004) find a negative association between innovation and small size, but also that entering firms (which tend to be relatively small) have a high probability of innovating, suggesting that the probability of innovation may be related more to a firm’s age than to its size per se. Block et al. (2017) call into question the view of small entrepreneurial firms as engines of innovation and economic growth. Summarizing evidence from over 100 published empirical studies, they show that growth is mostly generated by a very small number of innovative high growth ventures, whereas the vast majority of new ventures only experience moderate growth. Firm size may also affect the type of innovation that firms pursue. Specifically, it has been found that larger incumbent firms tend to pursue relatively more incremental and relatively more process innovation than smaller firms, which tend to pursue more radical innovation (Cohen 2010).

One interpretation of the apparent decline in R&D productivity with firm size is that smaller firms are more capable of innovating than larger firms, as the established firms’ greater experience with older technologies may hobble their ability to exploit new technology, relative to newer, more aggressive firms. In this vein, Henderson (1993) suggests that larger firms’ greater experience confers an advantage with respect to incremental innovation, which is an extension of existing knowledge, but that this same knowledge and experience can disadvantage them when innovation is radical and renders previous processes and procedures obsolete. A contrasting view is advanced by Cohen and Klepper (1996), who argue that the observed lower R&D productivity of larger firms may not be evidence of a relative inefficiency of larger firms with respect to R&D but rather may reflect the larger firms’ superior ability to profit from their fixed R&D costs due to their cost-spreading advantage. In particular, larger firms will realize lower average innovative output per unit of R&D because they can profitably undertake projects further down the diminishing marginal productivity of R&D schedule than smaller firms, yielding a higher expected return per R&D dollar.

While the above literature considers the relationship between innovation and firm size, taking the latter as given, another literature attempts to account for size differences between firms. Lucas (1978) proposed an early model of firm size distribution, based on the idea that firms leverage (heterogeneous) managerial talent. Jovanovic (1982) presented a model in which firms enter without knowing their true efficiency type, and learn about their type as they produce in the industry. Over time, more efficient firms learn their type and expand output accordingly while less efficient firms contract, and may eventually exit. Hopenhayn (1992) presents a model of a steady state size distribution in which firms are subject to repeated productivity shocks which are correlated over time. Efficient firms grow more efficient, on average, over time, and expand while less efficient firms grow less efficient, contract and eventually exit. While the preceding models assume competitive markets, Fishman and Rob (2003) present a model of a steady state firm size distribution for imperfectly competitive markets characterized by consumer search frictions. Frictions keep consumers “locked in” with firms and consequently new entrants start off with a small customer base which expands over time, resulting in a firm size distribution in which size is correlated with the time of entry (age).

This paper introduces an opportunity to innovate into an equilibrium model of firm size distribution which is consistent with “stylized facts” about entry and firm survival. In particular, survival is positively correlated with firm size and age, and firm size at the time of entry is typically below the average size in the industry (Geroski 1995b; Agarwal and Audretsch 2001; Klapper and Richmond 2011). Moreover, the correlation between firm size and survival is especially pronounced when innovative activity plays an important role (Audretsch 1995). In this setting, we argue that evidence that smaller or newer entrants innovate more successfully than incumbents may be subject to sample selection bias. Specifically, in the model there are two entry periods and, because of search frictions, second period entrants only have access to new, as yet unaffiliated consumers and are therefore smaller than incumbents (first period entrants). Once the firm size distribution is determined, an opportunity to innovate by investing in a new technology arises and firms which fail to innovate are candidates for exit. Incumbents’ size advantage constitutes an endogenous first mover advantage (footnote 1) which enables incumbents which fail to innovate to survive when smaller entrants which fail cannot. Therefore, the exit rate of non innovating incumbents is lower than that of non innovating entrants and, hence, the proportion of surviving incumbents which fail to innovate is larger than the corresponding proportion of entrants. Thus, failure to account for past exit may erroneously suggest that incumbency or, equivalently, large size, hobbles innovation when in fact successful innovation is unrelated to firm age or size, or even when successful innovation is positively correlated with firm size/age (footnote 2).

The rest of the paper is organized as follows. The basic model is introduced in Section 2. In Section 3, the model is expanded to include R&D expenditures. Section 4 includes further discussion and avenues for future research. The proofs of the main propositions are relegated to the appendix.

2 The model

Consider a market for a homogeneous product which lasts for three periods. The product is launched at period 1 and at that period \(W_{1}\) identical consumers (early adopters) enter the market. At period 2, \(W_{2}\) new customers (late adopters), otherwise identical to period 1 customers, enter the market. No new consumers enter at period 3.

Firms

There is a continuum of identical potential firms, and the measure of firms which actually enter the market is determined endogenously. Any firm can enter the market at any period. At any period, a firm must pay a fixed cost \(F>0\) to be operative, a cost which it can save by exiting (exit is costless). After paying the fixed cost, it can produce any number of units at a constant unit cost which, as described below, may vary over time and across firms. We denote by \(N_{t}\) the measure of firms which enter at period \(t\).

Consumers

Each consumer demands one unit per period, for which she is willing to pay up to \(\overline {p}\). At any period, it is costly for consumers to learn firms’ prices. Specifically, upon entering the market, a new consumer is randomly matched with a firm and costlessly observes its price. The consumer can either buy from that first firm, or search by sequentially sampling the prices of other firms at random, at a cost of \(s>0\) per firm. At subsequent periods, a consumer costlessly learns the price of the firm from which she bought at the preceding period, but must incur search costs to find a new firm.

Technology

At periods 1 and 2, the only available technology is the “high cost” technology, under which the unit cost is \(c_{h}<\overline {p}\). At period 3, a new “low cost” technology becomes available which reduces unit costs to \(c_{l}<c_{h}\). In this section, we assume for simplicity that it is costless to adopt the new technology but that successful adoption is uncertain (footnote 3). Specifically, at period 3, a firm successfully adopts the low cost technology with exogenous probability \(\sigma<1\), which is independent of firm size or age, and with probability \(1-\sigma\) its unit cost continues to be \(c_{h}\). For computational convenience only, we set \(\sigma=0.5\), but nothing of substance depends on this. A firm learns its realized unit cost at the beginning of period 3, before paying the fixed cost. To economize on notation, we assume that the firms’ discount factor is 1. This assumption has no meaningful effect on the analysis.

Equilibrium

At each period, a strategy for a firm is whether or not to exit, and what price to charge. As a tie breaking rule, we assume that a firm exits if and only if its profit from remaining in the market is strictly negative and a consumer buys as long as her utility is non-negative. A strategy for a consumer is a search rule specifying which prices to accept and which to reject in favor of search. In equilibrium, firms’ entry/exit and pricing strategies maximize their expected discounted profits given the strategies of all other firms and consumers’ search rule, and consumers’ search rule maximizes their utility given firms’ strategies. Let \(V_{t}\) be the discounted expected equilibrium profit from entry at period \(t\). Free entry implies that the expected equilibrium profit of any entrant is zero. Thus, \(V_{t}\leq 0\) at every period and \(V_{t}=0\) if there is entry at period \(t\). In equilibrium, \(N_{1}>0\) since otherwise a single firm could monopolize the market at period 1 and earn positive profit (footnote 4).

Analysis

The following lemma, proved by Fishman and Rob (2003), is a straightforward application of the “Diamond paradox” (Diamond 1971) to our dynamic setting.

Lemma 2.1

  • a) At each period t, t = 1,2,3 the unique equilibrium price is \(\overline {p}\) . A consumer accepts this price at her first period and returns to buy from the same firm at subsequent periods.

  • b) If a consumer’s first firm exits at any period, the consumer exits the market at that period as well.

Thus, consumers patronize their first firm until it exits (footnote 5). We refer to those “locked-in” consumers as the firm’s “customer base.”

Let \(\pi _{h}=\overline {p}-c_{h}\) and \(\pi _{l}=\overline {p}-c_{l}\) be the profit per customer when the unit cost is \(c_{h}\) and \(c_{l}\), respectively. We normalize \(\pi_{h}=1\) and, thus, \(\pi_{l}>1\). Consider a period 1 entrant. At period 1, its profit is \(-F+\frac {W_{1}}{N_{1}}\). At period 2, if it does not exit, it retains its customer base from period 1, and in addition gets an equal share of the \(W_{2}\) new consumers (which divide equally between firms). Thus, if it is operative at period 2, its profit at that period is \(-F+\frac {W_{1}}{N_{1}}+\frac {W_{2}}{N_{1}+N_{2}}\). If it is operative at period 3, it has the same customer base as at period 2 and, thus, its profit that period is \(-F+\frac {W_{1}}{N_{1}}+\frac {W_{2}}{N_{1}+N_{2}}\) if it is high cost, and \(-F+\pi _{l}\left (\frac {W_{1}}{N_{1}}+\frac {W_{2}}{N_{1}+N_{2}}\right )\) if it is low cost. Thus, a period 1 entrant’s expected profit is:

$$\begin{array}{@{}rcl@{}} V_{1} &=& -F+\frac{W_{1}}{N_{1}}+\max\left\{0,\; -F+\frac{W_{1}}{N_{1}}+\frac{W_{2}}{N_{1}+N_{2}}\right. \\ && +\frac{1}{2}\max\left\{0,\; -F+\frac{W_{1}}{N_{1}}+\frac{W_{2}}{N_{1}+N_{2}}\right\} \\ && \left.+\frac{1}{2}\max\left\{0,\; -F+\pi_{l}\left(\frac{W_{1}}{N_{1}}+\frac{W_{2}}{N_{1}+N_{2}}\right)\right\}\right\} \end{array} $$
(1)

where the zeros in the outer and inner max operators represent the option to exit at periods 2 and 3, respectively.

Lemma 2.2

No period 1 entrants exit at period 2 and no low-cost period 1 entrants exit at period 3.

Proof

If any period 1 entrants exit at period 2, then \(-F+\frac {W_{1}}{N_{1}}+\frac {W_{2}}{N_{1}+N_{2}}<0\), which implies that \(V_{1}<0\), a contradiction. Similarly, if low-cost period 1 entrants exit at period 3, it implies \(V_{1}<0\), a contradiction. □

Now consider a period 2 entrant. Analogously to the above, its expected profit is:

$$\begin{array}{@{}rcl@{}} V_{2}=-F+\frac{W_{2}}{N_{1}+N_{2}}&+&\frac{1}{2}\max\left\{0,-F+\frac{W_{2}}{N_{1}+N_{2}}\right\} \\ &+&\frac{1}{2}\max\left\{0,-F+\pi_{l}\frac{W_{2}}{N_{1}+N_{2}}\right\} \end{array} $$
(2)

where \(V_{2}=0\) if there is entry at period 2 and \(V_{2}\leq 0\) otherwise. As was noted above, in equilibrium \(N_{1}>0\). It is possible, however, that there is no entry at period 2, so that \(N_{2}=0\). Our focus is on equilibria in which \(N_{2}>0\).
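As a reading aid, equations (1) and (2) can be evaluated numerically. The sketch below is only an illustration and not part of the formal analysis. The function names and the shorthand `x1` for \(W_{1}/N_{1}\) and `x2` for \(W_{2}/(N_{1}+N_{2})\) (customers per firm from each cohort) are our own assumptions; \(\sigma=0.5\) and \(\pi_{h}=1\) are taken from the text.

```python
def v1(x1, x2, F, pi_l):
    """Expected profit of a period 1 entrant, equation (1), with sigma = 0.5.

    x1 = W1/N1 and x2 = W2/(N1+N2) are customers per firm (hypothetical
    shorthand, not notation from the paper).
    """
    base = x1 + x2                      # customer base at periods 2 and 3
    period3 = 0.5 * max(0.0, -F + base) + 0.5 * max(0.0, -F + pi_l * base)
    stay = -F + base + period3          # value of operating at period 2
    return -F + x1 + max(0.0, stay)     # the 0 is the option to exit at period 2


def v2(x2, F, pi_l):
    """Expected profit of a period 2 entrant, equation (2), with sigma = 0.5."""
    period3 = 0.5 * max(0.0, -F + x2) + 0.5 * max(0.0, -F + pi_l * x2)
    return -F + x2 + period3


# Illustrative inputs only (F = 1, pi_l = 2); they are not claimed to be
# equilibrium values.
print(v1(x1=0.5, x2=0.75, F=1.0, pi_l=2.0))
print(v2(x2=0.75, F=1.0, pi_l=2.0))
```

At these inputs the period 1 value is positive, which would attract further entry at period 1, so the inputs cannot describe an equilibrium; free entry requires both values to be non-positive.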

Lemma 2.3

Suppose \(N_{2}>0\), i.e., there is entry at period 2. Then all high-cost period 2 entrants exit at period 3 and no low-cost firms exit at period 3.

Proof

If any high-cost period 2 entrants do not exit at period 3, then \(-F+\frac {W_{2}}{N_{1}+N_{2}}\geq 0\), implying that \(V_{2}>0\), a contradiction. Similarly, if any low-cost period 2 entrants exit at period 3, then, by the tie-breaking rule, \(-F+\pi _{l}\frac {W_{2}}{N_{1}+N_{2}}<0\), implying that \(V_{2}<0\), again a contradiction. □
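To make the implication of Lemma 2.3 concrete, the zero-profit condition for period 2 entrants can be solved for the number of customers per period 2 entrant. The following is a sketch under the lemma's conclusions (high-cost period 2 entrants exit, low-cost ones stay) and with \(\sigma=0.5\); the shorthand \(x_{2}\) is introduced here for illustration and is not the paper's notation.

$$\begin{array}{@{}rcl@{}} x_{2} &\equiv& \frac{W_{2}}{N_{1}+N_{2}}, \\ V_{2} &=& -F+x_{2}+\frac{1}{2}\cdot 0+\frac{1}{2}\left(-F+\pi_{l}x_{2}\right)=0 \\ &\Longrightarrow& x_{2}=\frac{3F}{2+\pi_{l}}. \end{array} $$

Since \(\pi_{l}>1\), this gives \(x_{2}<F\), so a high-cost period 2 entrant cannot cover the fixed cost at period 3, which is consistent with the lemma.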

The preceding lemmas establish that no low-cost firms exit and that if there is entry at both periods, high-cost period 2 entrants exit at period 3. The possibility that high-cost period 1 entrants exit has not been ruled out. The following proposition establishes parameter values corresponding to which there is a unique equilibrium, in which there is entry at both periods, all high-cost period 2 entrants exit at period 3, and no period 1 entrants exit.

Proposition 1

If \(1<\pi_{l}<2.37\) and \(W_{2}>\frac {W_{1}(5+\pi _{l})}{1+\pi _{l}}\), then there is a unique equilibrium in which there is entry at both periods 1 and 2, all high-cost period 2 entrants exit at period 3 and no period 1 entrants exit at period 3.
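The bounds on \(\pi_{l}\) can be traced to the two zero-profit conditions. The following back-of-the-envelope derivation is our own sketch, not the proof in the appendix. It assumes the conjectured equilibrium (no period 1 entrant exits, high-cost period 2 entrants exit), sets \(\sigma=0.5\), and uses the shorthand \(x_{1}=W_{1}/N_{1}\) and \(x_{2}=W_{2}/(N_{1}+N_{2})\), which is not the paper's notation.

$$\begin{array}{@{}rcl@{}} V_{2}=0 &\Longrightarrow& x_{2}=\frac{3F}{2+\pi_{l}}, \\ V_{1}=0 &\Longrightarrow& \frac{5+\pi_{l}}{2}\,x_{1}+\frac{3+\pi_{l}}{2}\,x_{2}=3F \;\Longrightarrow\; x_{1}=\frac{3F(1+\pi_{l})}{(2+\pi_{l})(5+\pi_{l})}, \\ x_{2}<F &\Longleftrightarrow& \pi_{l}>1, \\ x_{1}+x_{2}\geq F &\Longleftrightarrow& \pi_{l}^{2}+\pi_{l}-8\leq 0 \;\Longleftrightarrow\; \pi_{l}\leq \frac{-1+\sqrt{33}}{2}\approx 2.37. \end{array} $$

The third line is the condition for high-cost period 2 entrants to exit at period 3, and the fourth is the condition for high-cost period 1 entrants to stay.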

Under the conditions of the preceding proposition, high-cost period 1 entrants do not exit at period 3, while high-cost period 2 entrants do. The survival of the former is due to their size advantage over the latter which results from their earlier entry into the industry. However, it is important to note that period 1 entrants enjoy a first-mover advantage only in the ex post sense. Ex ante, firms are indifferent between entering at period 1 or period 2 since expected profit from entry is zero in any case; the positive expected profit period 1 entrants enjoy at period 3 is offset by the losses they bear at period 1. This contrasts with the usual sense of “first-mover advantage” which refers to situations in which a firm gains first-mover opportunities through some combination of superior proficiency and luck (Lieberman and Montgomery 1988). Thus, corresponding to the parameter values for which proposition 1 obtains, at period 3 all the younger/smaller firms are low cost while only half the larger/older firms are low cost. Failure to account for past exit may suggest that smaller more recent entrants are more likely to innovate than incumbents, even if in fact, as is the case in our example, proficiency at innovation is actually unrelated to size or age.
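The selection effect described above can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: the closed-form customer counts come from solving \(V_{1}=V_{2}=0\) with \(\sigma=0.5\) under the conjectured equilibrium, and the names `customers_per_firm`, `observed_innovator_share`, `x1`, and `x2` are hypothetical.

```python
def customers_per_firm(F, pi_l):
    """Per-firm customer counts implied by V1 = V2 = 0 with sigma = 0.5.

    Assumption (our derivation, not quoted from the paper): no period 1
    entrant exits and high-cost period 2 entrants exit at period 3.
    x1 = W1/N1 and x2 = W2/(N1+N2) are hypothetical shorthand.
    """
    x2 = 3.0 * F / (2.0 + pi_l)
    x1 = 3.0 * F * (1.0 + pi_l) / ((2.0 + pi_l) * (5.0 + pi_l))
    return x1, x2


def observed_innovator_share(customers, F, sigma=0.5):
    """Share of a cohort's period 3 survivors that are low cost.

    Low-cost firms are assumed to stay (Lemmas 2.2 and 2.3). A high-cost
    firm stays only if its period 3 profit, customers - F, is non-negative
    (the tie-breaking rule in the text).
    """
    high_cost_survives = (customers - F) >= 0.0
    survivors = sigma + (1.0 - sigma) * (1.0 if high_cost_survives else 0.0)
    return sigma / survivors


F, pi_l = 1.0, 2.0                      # illustrative parameter values
x1, x2 = customers_per_firm(F, pi_l)
incumbents = observed_innovator_share(x1 + x2, F)   # period 1 entrants
entrants = observed_innovator_share(x2, F)          # period 2 entrants
print(incumbents, entrants)
```

Both cohorts innovate with the same probability, 0.5, yet among survivors the observed share of innovators is 0.5 for period 1 entrants and 1 for period 2 entrants.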

3 Investing in innovation

In this section, we extend the model to a setting in which firms must invest in R&D in order to innovate. We show that the main features of the analysis of the preceding section extend to this setting as well. In fact, the potential selection bias is now even greater. In particular, it will be the case that period 1 entrants invest more than smaller period 2 entrants and therefore now the proportion of period 1 entrants which innovate successfully is greater than the corresponding proportion of period 2 entrants. Nevertheless, the percentage of surviving period 2 entrants which innovate is higher than that of surviving incumbents, which may lead to the biased inference that smaller firms innovate more successfully.

The sequence of events, model assumptions, and notation at periods 1 and 2 are now the same as in the preceding section. What is different is that at period 3 the probability with which a firm innovates depends on how much it invests in R&D. Specifically, at period 3, the probability that a firm innovates successfully and increases its profit per consumer from 1 to \(\pi_{l}>1\) is given by \(\sigma(H)\), where we assume that \(\sigma^{\prime}(H)>0\), \(\sigma^{\prime\prime}(H)<0\), \(\sigma(0)=0\) and \(\lim _{H\rightarrow \infty }\sigma (H)= 1\). As in the previous section, it is obvious that \(N_{1}>0\), while \(N_{2}\) may be zero. Again, we are interested in equilibria in which there is entry at both periods, i.e., \(N_{2}>0\), and such that high-cost period 1 entrants do not exit while high-cost period 2 entrants do. Let \(H_{1}\) and \(H_{2}\) denote the equilibrium amount invested by a period 1 entrant and a period 2 entrant, respectively. Then:

$$\begin{array}{@{}rcl@{}} V_{1} &=& \frac{W_{1}}{N_{1}}-F+\frac{W_{1}}{N_{1}}+\frac{W_{2}}{N_{1}+N_{2}}-F \\ && +\max\left\{0,\; -H_{1}+(1-\sigma(H_{1}))\max\left\{0,\frac{W_{1}}{N_{1}}+\frac{W_{2}}{N_{1}+N_{2}}-F\right\}\right. \\ && \left.+\,\sigma(H_{1})\left(\pi_{l}\left(\frac{W_{1}}{N_{1}}+\frac{W_{2}}{N_{1}+N_{2}}\right)-F\right)\right\} \\ V_{2} &=& \frac{W_{2}}{N_{1}+N_{2}}-F \\ && +\max\left\{0,\; -H_{2}+(1-\sigma(H_{2}))\max\left\{0,\frac{W_{2}}{N_{1}+N_{2}}-F\right\}\right. \\ && \left.+\,\sigma(H_{2})\left(\pi_{l}\frac{W_{2}}{N_{1}+N_{2}}-F\right)\right\} \end{array} $$

The above formulation implicitly assumes that successful innovators cannot or do not preclude rivals from innovating successfully by means of patents. And indeed, empirically, patents seem to be important in only a relatively small number of industries (Lieberman and Montgomery 1988). However, innovation is assumed to be appropriable in the sense that a firm’s success at innovation does not spill over to competitors; that is, successful innovation by a competitor does not affect the probability with which a firm innovates successfully.

Proposition 2

In any equilibrium, \(H_{1}>H_{2}\).

The logic here is similar to that in Cohen and Klepper (1996). Incumbents invest more than entrants because their investment cost is spread over a greater number of units. Note that the incumbents’ expected return from innovation is greater than that of period 2 entrants. To see this, note that if incumbents and entrants invest the same amount, and thus succeed with the same probability, incumbents’ expected return from innovation is greater than that of entrants since the former’s higher profit per customer applies to a larger number of customers. Therefore, the fact that they optimally invest more implies that incumbents enjoy a higher expected return from innovation a fortiori.
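The cost-spreading logic can also be seen from the first-order conditions for investment. The following is a sketch for the equilibria of interest (high-cost period 1 entrants stay, high-cost period 2 entrants exit), assuming interior solutions; it is not the appendix proof, and the shorthand \(x_{1}=W_{1}/N_{1}\) and \(x_{2}=W_{2}/(N_{1}+N_{2})\) is ours.

$$\begin{array}{@{}rcl@{}} \sigma^{\prime}(H_{1})\,(\pi_{l}-1)(x_{1}+x_{2}) &=& 1, \\ \sigma^{\prime}(H_{2})\,(\pi_{l}x_{2}-F) &=& 1, \\ (\pi_{l}-1)(x_{1}+x_{2})-(\pi_{l}x_{2}-F) &=& (\pi_{l}-1)x_{1}+(F-x_{2}) \;>\; 0. \end{array} $$

The last line is positive because high-cost period 2 entrants exit only if \(x_{2}<F\). The incumbent's gain from success therefore exceeds the entrant's, so \(\sigma^{\prime}(H_{1})<\sigma^{\prime}(H_{2})\), and since \(\sigma^{\prime}\) is decreasing (concavity), \(H_{1}>H_{2}\).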

It is also interesting that in equilibrium, investment by period 1 entrants, \(H_{1}\), is negatively correlated with the number of period 2 entrants, \(N_{2}\) (footnote 6). The reason is that investment is a fixed cost and therefore the expected return from investment is higher the more customers a firm has. And since period 2 consumers divide equally between all firms, the number of customers per incumbent firm decreases with \(N_{2}\).

The above proposition characterizes equilibrium properties for any concave investment function. Proving existence of the type of equilibria in which we are interested is much more complicated than in the simpler model of the previous section. To accomplish this, the following proposition specifies a particular investment function.

Proposition 3

Let \(\sigma(H)=1-1/(1+H)\) and let \(\pi_{l}=2\). For any value of \(F\), \(1<F<8.267\), and \(W_{1}>0\), there is \(W_{0}=f(F,W_{1})>0\) such that for any \(W_{2}>W_{0}\), there is a unique equilibrium such that there is entry at both periods 1 and 2, period 2 entrants invest \(H_{2}\) in R&D, period 1 entrants invest \(H_{1}>H_{2}\), all high cost period 2 entrants exit at period 3 and no period 1 entrants exit.
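For this particular investment function, the optimal investment has a simple closed form. The sketch below is an illustration under stated assumptions rather than the authors' computation: the helper names are hypothetical, the customer bases passed in are arbitrary illustrative numbers rather than equilibrium values, and a firm is assumed to choose \(H\) to maximize \(-H+\sigma(H)\times\) (gain from success).

```python
import math


def sigma(H):
    """Success probability from Proposition 3: sigma(H) = 1 - 1/(1 + H)."""
    return 1.0 - 1.0 / (1.0 + H)


def optimal_investment(gain):
    """Maximize -H + sigma(H) * gain over H >= 0.

    First-order condition: gain / (1 + H)**2 = 1, so H = sqrt(gain) - 1,
    truncated at zero when gain <= 1.
    """
    return max(0.0, math.sqrt(gain) - 1.0)


def gain_from_success(customers, F, pi_l, stays_if_high_cost):
    """Period 3 profit after success minus period 3 profit after failure."""
    if stays_if_high_cost:
        return (pi_l - 1.0) * customers      # failure still earns customers - F
    return pi_l * customers - F              # failure means exit, i.e. zero


# Hypothetical customer bases, chosen for illustration (not equilibrium values).
F, pi_l = 2.0, 2.0
H1 = optimal_investment(gain_from_success(2.5, F, pi_l, stays_if_high_cost=True))
H2 = optimal_investment(gain_from_success(1.8, F, pi_l, stays_if_high_cost=False))
print(H1, H2, sigma(H1), sigma(H2))
```

In this example the larger firm invests more and therefore succeeds with higher probability, even though both firms face the same function \(\sigma\).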

The above analysis has assumed that large and small firms face the same investment function. That is, large and small firms which invest the same amount innovate with the same probability. Actually, one might suppose that a larger firm, given its larger resources in terms of R&D personnel and ownership of relevant patents, might innovate with higher probability. In that case, the sampling bias would be even more pronounced. Not only do larger firms invest more, but dollar for dollar innovate with higher probability, which further increases the success rate of larger firms relative to smaller ones. And nevertheless, a higher proportion of surviving small firms will be observed to innovate.

Conversely, one might suppose that recent entrants are more dynamic, idea driven and unencumbered by bureaucracy, and therefore innovate with higher probability per dollar than larger incumbents. In that case, a higher proportion of smaller firms might indeed innovate, and the inference that innovation decreases with size might be correct. The direction of the size effect would then be inferred correctly, but failure to account for past exit would still exaggerate the magnitude of small firms’ advantage.

4 Summary and contribution to the literature

This paper contributes to the literature on the effect of firm size on innovation by showing, in the context of a simple industry model, that evidence implying that smaller entrants are more successful at innovation than larger incumbents may be subject to sample selection bias. In particular, even if incumbents in fact innovate more successfully, smaller entrants may appear to be more innovative simply because unsuccessful incumbent firms’ size advantage enables them to survive when unsuccessful entrants cannot. In other words, because of their size advantage, incumbents may be “too big to fail.” Thus, inferring the effect of firm size on innovation from the proportions of large and small innovators may be misleading if smaller non innovators exit at a higher rate than larger ones. Our analysis thus suggests that an empirically valid analysis of size effects should account not only for the innovative activity of contemporary firms but also for innovation by contemporaneous entrant firms which have previously exited.

Our analysis is also related to the literature on firm size distribution. While the focus of that literature, summarized in the introduction, is to account for the evolution of firm size over time or the steady state distribution of firm size, we analyze the effect of introducing an opportunity to innovate after the equilibrium firm size distribution is established, where the equilibrium level of entry during the industry’s growth stage accounts for the likelihood of successful future innovation. For concreteness, our analysis is framed in a much simplified (two period) version of the model of Fishman and Rob (2003), but qualitatively similar results would also seem to apply in the context of the alternative models of firm size distribution discussed in the introduction.

An implicit assumption of our model is that innovation is incremental. Specifically, as is established by Lemma 2.1, the assumption of sequential search with unit demand implies that the equilibrium price is independent of firms’ production costs and, thus, the lower cost attained by innovating firms cannot price non innovators out of the market. In other words, with unit demand, innovation is never “drastic” in Arrow’s sense (footnote 7). Alternatively, consider an analogous model in which consumers have identical downward sloping demand for the product. In that case, as shown by Reinganum (1979), the low-cost firms’ equilibrium price, \(p_{l}\), is lower than the high-cost firms’ equilibrium price, \(p_{h}\), where the difference between those prices is increasing in consumers’ search costs. Thus, if the cost difference between innovators and non innovators is sufficiently large, and if consumer search costs are low enough, then \(p_{h}<c_{h}\), and non innovators are priced out of the market. This suggests that the selection bias that we have identified is more likely to be an issue when innovation is incremental than if it is drastic.

While in our model, the “consumer lock-in” effect confers an endogenous first-mover advantage on earlier entrants, new entry of smaller competitors is nevertheless viable because the latter are able to attract new consumers, who choose a firm at random. This feature of random consumer search seems appropriate for industries in which there are no prominent recognized industry leaders. By contrast, in industries with universally recognized leaders, such as Apple, Google, and Facebook, new consumers will naturally gravitate to those leading firms, especially given the importance of network effects in those industries. In such cases, new entry can be viable only if entrants develop a significant technological advantage over incumbents prior to entry and a somewhat different selection bias may arise. In particular, since only the most innovative entrants will make it into the data, potential entrants which are less successful at innovation are not observed as they do not enter the industry in the first place.

Finally, our analysis considered process innovations. As these do not change demand, failure to innovate does not affect incumbents’ size advantage. By contrast, successful product innovation increases demand facing innovators at the expense of non innovators and analysis of this case could lead to different conclusions.