1 Introduction

Many high-technology firms co-locate, or cluster, spatially (Audretsch and Feldman 1996; Porter 2000). A plausible inference is that firm founders or executives often believe that co-location will enhance the firm’s expected performance (Feldman et al. 2005). Despite the plausibility of this argument, there is little hard evidence that co-location is positively related to firm performance. In particular, there is relatively little evidence concerning the impact of location or co-location on the performance of high-tech firms: “few systematic empirical studies analyze spatial heterogeneity in new venture creation in high-technology industries, and even fewer document the effect of geographic location on organizational viability” (Stuart and Sorenson 2003, p. 230). Accordingly, our research examines the effect of clustering on the growth performance of a sample of successful “new technology-based firms” (NTBFs) in the United States.Footnote 1 Almost all of these NTBFs are in either biotechnology or information and communications technology (ICT).

The primary empirical research question addressed here is whether location in, or near, a specialized industry cluster affects the growth performance of these firms. In order to consider why NTBFs might benefit from location within or near a relevant cluster, we draw upon the resource-based view (RBV) of the firm which emphasizes that a firm is a heterogeneous bundle of resources both in its initial resources (Wernerfelt 1984) and in its ability to absorb new knowledge, and other resources (Cohen and Levinthal 1990). Building on the RBV, we suggest that clustering benefits arise from the ability of firms to externally augment resources, knowledge and capabilities, which in turn give rise to competitive advantage. We distinguish between resources that can be obtained through markets and those that can be obtained through spillovers, because, as discussed below, they help to distinguish NTBFs from other firms in terms of cluster benefits and to explore the contingent nature of benefits.Footnote 2 Although clustering may benefit many types of firms, we argue that benefits are pronounced when firms require specialized resources, but are unable to generate these resources internally. Thus, we expect that cluster benefits are particularly important for NTBFs.

Although the RBV focuses on intra-industry heterogeneity, industry effects also matter because resource heterogeneity is conditioned by task domain and the nature of underlying production functions. Cluster benefits vary across industries because firms benefit from different external resources and capabilities. For example, evidence suggests that biotechnology firms rely more on external knowledge acquisition that is distance-dependent than do other firms; we argue that, as a consequence, they are more likely to benefit from clustering. An additional question is whether cluster benefits differ according to the nature of the cluster. In the Marshall–Arrow–Romer tradition, clusters are defined as being specialized to an industry or sector. However, such specialized clusters are often embedded in metropolitan areas that provide augmented access to highly diversified resources and capabilities (Jacobs 1969). We therefore examine whether cluster benefits are enhanced when the specialized cluster is located within a more diverse economic environment, and specifically whether cluster diversity provides additional benefits and whether those benefits vary by sector.

Empirically, we test the relationship between clustering and firm growth for a sample of high-growth NTBFs. In doing so, we try to integrate the growth/performance and the location/clustering research perspectives. Audretsch and Lehmann (2005, p. 207) argue, “these two research trajectories remain separate”. We augment familiar Gibrat-like growth equations with location variables in order to combine these two perspectives. Although originally framed in terms of the independence of growth and firm size, Gibrat’s Law has been extended to other systematic determinants, such as firm age. To our knowledge, however, it has not been employed to determine whether location is a systematic determinant of firm growth.

We consider the growth performance of a cohort of 451 U.S. NTBFs drawn from the Deloitte and Touche 500 fastest growing high-tech firms in North America over the period 1995–1999. We therefore focus on young and relatively small NTBFs that are “successful” in terms of growth. At the beginning of the study period, the average sample firm had existed for 5 years, the average revenue in 1995 was $15 million, and the average revenue growth rate over the period from 1995 to 1999 was over 4,000%. These firms are also successful in terms of survival—99% of the firms still existed, in some form, in 2002. Because of these growth and survival characteristics, the sample is certainly not random. Consequently, our results are most relevant for small, fast-growing NTBFs. We expect these firms to receive net benefits from clustering, because as a result of their newness, smaller size, and their sector knowledge-intensiveness, they are particularly resource-constrained. There are offsetting benefits to focusing on a relatively homogeneous sample of firms. In particular, by focusing on successful firms we are able to isolate more clearly the impact of co-location on growth from effects arising from unobserved firm heterogeneity. Firm-specific cluster benefits will depend not only on the resources, knowledge and capabilities available in the cluster, but also on the ability of a firm to absorb these resources, and in particular on its ability to absorb knowledge. Differences in absorptive capacity can therefore be a source of unobserved heterogeneity across firms. A sample of successful firms reduces the need to identify firm-specific factors apart from location that contribute to growth.

A feature of this article is that we use the Harvard Cluster Mapping Project (HCMP) data to define relevant clusters. We measure cluster benefits both by whether a firm is located within that cluster, and by the distance of a firm from the center of the cluster. We also use the HCMP data to define the degree to which a given metropolitan area contains several specialized clusters; this provides one measure of cluster diversity.

To preview our findings, our results indicate that proximity to a relevant cluster is only weakly related to the growth performance of most sample NTBFs, but more strongly and significantly related to the growth of biotech firms. We also find no evidence that most firms located within specialized clusters that are embedded within diversified metropolitan economies experience augmented growth. However, we do find that a more diversified metropolitan area is associated with better growth performance for clustering ICT firms. As we emphasize in the concluding section, these results are subject to a number of caveats, mainly regarding the degree to which the sample is representative and the difficulty of drawing causal inferences from the available data.

2 The cluster literature

There are three aspects of the cluster literature that are particularly pertinent to this study: (1) the relative lack of a firm-level, or strategic, focus, (2) the particular nature of the industries or firms, where firms are the focus of study, and (3) ambiguity around the definition and measurement of clusters. We briefly discuss each in turn.

The cluster literature builds on a long tradition of studies in economics, economic geography, and industrial organization that investigate firm agglomeration, especially in metropolitan areas (Marshall 1920; Jacobs 1969; Krugman 1991). Given these roots, this literature is not primarily concerned with the strategic behavior or performance of individual firms per se (Tallman et al. 2004). Many studies have primarily mapped industry clusters in specific locations (e.g., Baptista and Swann 1999; Braunerhjelm et al. 2000). A few studies have examined the behavior and performance of young firms in high-tech industries, but they have focused on the growth patterns of small samples of firms in one location and it is difficult to generalize from them.

Until recently there has been relatively little research that related location and clustering to firm performance. Recently, however, firm-level location choices have been investigated in a number of industries and in manufacturing, using location decisions to infer cluster benefits. These studies do not provide overwhelming support for the proposition that clustering enhances firm performance; however, they focus on industries with very different dynamics from those considered here. Kalnins and Chung (2004) study the hotel industry where demand complementarities (positive spillovers), in particular the reduction of consumer search costs that result from co-locating with competing firms, are important. Gimeno et al. (2005) study the international expansion of large telecommunications firms with few competitors. Shaver and Flyer (2000) examine foreign entrants to U.S. manufacturing; these firms are relatively old, large, and mature. In contrast, our focus is on small firms, operating in emerging industries where market structure is still relatively fluid and undefined. These differences are heightened by suggestive evidence that access to knowledge resources is more likely to drive location decisions in high-tech sectors, such as pharmaceuticals (Chung and Alcacer 2002).

There is considerable variety, and some ambiguity, concerning the meaning of a cluster (Martin and Sunley 2003; Tallman et al. 2004). The term cluster is used mostly to refer to spatial co-location involving other organizations that relate to the supply chain of the industry, often including competitors, complementors, suppliers, and customers. But, cluster terminology is sometimes used more broadly to refer to (usually urban) agglomerations involving organizations that provide a broad range of complementary assets, whether via spillovers or through market-based transactions (e.g., Porter 2000). A useful definition of the minimum requirement for a cluster is “a group of firms from the same or related industries located geographically near to each other” (Bell 2005, p. 228). It is important to emphasize that this does not mean that the benefits of clustering derive from co-locating with competing firms—for some firms, as discussed below, the presence of competing firms may not matter or may be disadvantageous. However, the co-location of competing firms is a definitional necessity for cluster existence. Consequently, clusters are specialized around at least a somewhat well-defined industry. This requirement is consistent with Marshall’s original conception of industrial districts (Marshall 1920) and with the Marshall–Arrow–Romer externalities tradition that argues that co-location by firms in a single industry fosters firm and cluster growth. However, as we discuss below, the cities or metropolitan regions in which these clusters are embedded may well be diversified. Thus, a single metropolitan area may contain more than one specialized cluster. Jacobs (1969) has emphasized the benefits from inter-industry flows of knowledge and other benefits associated with industrially diverse regions.

3 The resource-based view of the firm and cluster benefits

Based on the RBV, we posit that from the perspective of the firm, benefits arise from the ability of firms to obtain and integrate valuable external resources and capabilities made available in the cluster. NTBFs must acquire such resource bundles in order to be successful and innovative (Audretsch and Feldman 2003). Most importantly, they must access specialized inputs and complementary assets, particularly knowledge and knowledge embedded in human capital (Tallman et al. 2004). These resources and capabilities could be acquired in a number of ways: internal development, market-based transactions, or (non-market-based) spillovers. However, primarily because of their youth and size, NTBFs rely significantly on external sources, whether through market-based transactions or through non-market-based spillovers.

3.1 Market-based benefits

Firm-specific benefits can arise from the ability to acquire resources and capabilities through market transactions, whether through negotiated prices, procurement policy, employment contracts, or joint ventures (Breschi and Lissoni 2001; Owen-Smith and Powell 2004; Tallman et al. 2004). Location within a cluster reduces transaction costs, particularly search and information costs, limits the possibility of hold-up by specialized suppliers and distributors, and facilitates the acquisition of resources and capabilities (Helsley and Strange 2002). There are also benefits from co-locating with specialized upstream suppliers and/or with downstream customers (Marshall 1920; Krugman 1991; Oerlemans and Meeus 2005). Additionally, co-location with competitors may generate demand-side benefits by reducing consumer search costs. It is an empirical question as to whether market-based transaction “benefits” will translate into performance benefits for individual firms. It is quite possible that, because these effects are related to market transactions, others—suppliers, customers, value chain substitutes, or complements—may capture most or all of the rent.

3.2 Spillover benefits

Clustering benefits may also stem from various spillovers (Audretsch and Feldman 1996; Braunerhjelm et al. 2000; Feldman et al. 2005). Spillover benefits include: (1) knowledge spillovers from competing co-located firms; (2) knowledge spillovers from public infrastructure, most notably from public research institutions; and (3) spillovers that emanate from suppliers and customers. We consider each in turn.

First, for co-location to be relevant, any competing-firm spillovers must be distance-dependent to some degree. Researchers have posited that “sticky knowledge” is the critical distance-dependent spillover for high-tech firms (Audretsch and Feldman 1996). However, as firm resources are heterogeneous, competing-firm spillovers are often not symmetric or equal; some firms may gain more than others. Similarly, the nature of the spillover benefits may differ across the co-locating firms—one firm may gain technology benefits, while another gains marketing benefits. For a given firm, spillovers from co-location with competitors can be negative (Tallman et al. 2004); if spillovers are (net) negative, the firm has an incentive to locate away from competitors, holding constant, for the moment, other benefits of clustering. Shaver and Flyer (2000) also emphasize the potential for negative competing-firm spillovers. Additionally, positive benefits at the start-up stage may become negative over time as competitors increase in number (Stuart and Sorenson 2003).

Second, there are (intentional) spillovers that arise from public and quasi-public research institutions, including universities, and from other public infrastructure, including physical infrastructure such as transportation facilities (Acs et al. 1992). Obviously, for proximity to be valuable, these spillovers also have to be distance-dependent (Acs et al. 1992). When a firm assesses such spillovers, the location of rival firms per se is not relevant, unless the benefit would be dissipated because proximate competitors have access to the same knowledge resources.

Third, co-location may provide the opportunity for improving existing products to serve new market niches. Co-location with potential customers from diverse sectors may provide superior information about their evolving market needs (Porter 2000). Co-location with suppliers may facilitate the transfer of tacit knowledge regarding new input technologies. While the first two spillover effects refer primarily to costs, the third effect may have both cost and demand elements.

3.3 Diversity benefits

Finally, we note that specialized clusters may not be the only source of firm-specific location benefits. Both market and spillover benefits also arise in large metropolitan areas with highly diversified economic activities. Jacobs (1969) is most commonly identified with the view that large diversified urban economies provide benefits to firms: they gain access to more generalized human capital and to generic services, such as advertising, legal, consulting and accounting services. More diversified metropolitan areas or regions also provide a broader range of proximate customers who provide feedback on the suitability of products for new markets. Diversity benefits may also derive from the fact that public infrastructure spillovers can only be realized at scales associated with large cities. Thus, firm-specific cluster benefits may also emanate from the degree to which specialized clusters are embedded in a diversified economic region.

4 Hypotheses

4.1 Cluster benefits

Where both market and spillover effects are significant, there will be strong incentives to locate in specific locations, and extensive benefits from doing so. The evidence to date suggests that these benefits are significant for high-tech industries, particularly ones where firms are smaller and newer, lack the ability to develop resources internally, and are developing new products. Further, small firms with limited R&D resources, particularly those developing new products, benefit from both competing-firm and public infrastructure spillovers (Audretsch and Feldman 1996). Cluster benefits have been identified in the biotech and computing sectors (Swann and Prevezer 1996). Braunerhjelm et al. (2000) provide evidence of knowledge spillovers from public research institutions to high-tech firms. A number of studies have identified localized supply chain benefits in high-tech industries (Porter and Stern 2001). Hence, we argue that location within a (specialized) cluster, or proximity to such a cluster, enhances the performance of small, high-tech firms due to knowledge spillovers and proximate access to market-based benefits. Specifically we propose that location within, or proximity to, a (specialized) cluster will lead to higher growth rates for NTBFs (H1).

4.2 Cluster diversity

Building on Jacobs (1969), a number of scholars have investigated the benefits of “heterogeneity” or “metropolitan industrial diversity” (Rosenthal and Strange 2005). However, the empirical evidence on the performance effects of Jacobian-type diversity is mixed. Swann and Prevezer (1996), while concluding that the main promoter of NTBF growth is strength in own-sector firms and employment at a cluster, also find that such growth is not significantly affected by an increase of firms and employees in other geographically proximate sectors. In contrast, Glaeser et al. (1992, p. 1129) find that “industries grow slower in cities in which they are heavily over-represented”, thus lending support to Jacobs’ position that diversified cities promote innovation and growth. Acs et al. (2002) provide other empirical evidence demonstrating that diversity promotes growth. None of these studies, however, use firm-level data. Using firm-level data, Globerman et al. (2005) find evidence that in North America there are growth performance benefits to locating within a large, economically diversified metropolitan area. In contrast, Rosenthal and Strange (2005) find only limited evidence of diversity benefits within the New York metropolitan area.

Although the evidence is mixed, it is plausible that firms can also benefit from access to the resources and capabilities associated with Jacobian-type diversity. As NTBFs are particularly reliant on external sources of knowledge and other complementary assets, they should benefit from diversified clusters. Thus, in sum, we propose that clusters embedded in more diversified (Jacobian) metropolitan areas lead to higher growth rates for NTBFs (H2).

4.3 Biotech sector benefits

Although co-location benefits are firm-specific, they also can vary widely across industries (Steinle and Schiele 2002). We argue here that clustering is relatively more beneficial for biotechnology firms, both because of the greater knowledge intensity of the sector and because more biotech firms are tied to locations that do not provide cluster benefits. The empirical evidence suggests that biotech is “different” from most other high-tech sectors (Autant-Bernard et al. 2006). Biotech research is intrinsically knowledge-intensive and highly tacit in nature (Audretsch and Feldman 2003; Owen-Smith and Powell 2004), and tacit knowledge is often identified as a critical distance-dependent spillover (Audretsch and Feldman 1996). It has also been noted that specialized knowledge is often embodied in “star scientists” (Audretsch and Stephan 1999; Zucker et al. 2002) and these academic scientists frequently found biotech firms (Zucker et al. 1998). Thus, biotech firms naturally cluster around universities and benefit from knowledge spillovers to a greater degree than other NTBFs (Zucker et al. 1998; Lemarie et al. 2001).

The reliance on star scientists does, however, create the potential for greater dispersion of biotech firms. In the U.S., R&D scientists in biotech-related fields are dispersed geographically at universities (Zucker et al. 2002). Many universities are not in the metropolitan areas associated with clusters, and so many biotech firms are located outside of relevant clusters (Stuart and Sorenson 2003). Indeed, Feldman (2003) suggests these conflicting forces are finely balanced in the U.S.: “The industry is simultaneously becoming more widely distributed across a variety of locations, and at the same time, the [biotech] industry is becoming more geographically concentrated in a few locations.” This relatively greater dispersion obviously limits the ability of biotech firms to take advantage of cluster benefits. There is some evidence that the high degree of knowledge tacitness in biotech increases the value of clustering. For example, Aharonson et al. (2004) find that Canadian biotech firms locate in order to benefit from knowledge spillovers from similar firms. Clustering is likely to be particularly beneficial for biotech firms that benefit relatively more from tacit knowledge spillovers, and benefits related to specialized human capital. We, therefore, propose that location within, or proximity to, a (specialized) cluster will lead to relatively higher growth rates for NTBFs in the biotech sector (H3).

4.4 ICT sector benefits

Particular industries or sectors are more likely to benefit from access to a broad array of customers and suppliers, that is, from economic diversity. In terms of customers, when firms must innovate and customize their products, they benefit by locating close to potential customers because tacit knowledge can be more easily exchanged. ICT firms typically have a broader set of industry customers than biotech firms, and a diverse metropolitan area offers a broader range of customers for such firms. Hence, we expect that ICT firms benefit more from downstream supply chain benefits.

We therefore hypothesize that proximity to diverse customers present in more diverse metropolitan regions would promote the growth of these ICT firms. As with H2, we frame this hypothesis within the context of specialized clusters and propose that clusters embedded in more diversified (Jacobian) metropolitan areas lead to relatively higher growth rates for NTBFs in the ICT sector (H4).

5 Methodology

5.1 Dependent variable: growth

The importance of firm growth as a measure of performance has long been recognized, particularly for NTBFs (Eisenhardt and Schoonhoven 1990). There are three practical reasons for using growth to measure NTBF performance: (1) it is difficult to get information on profitability for privately held firms; (2) profitability is rarely present or observable, given the early stage of the industry life cycle; (3) NTBFs typically have significant intangible assets that are difficult to value with accounting-based performance measures. All three reasons make the use of growth as a measure of performance a practical necessity. In any case, the firm growth rate is probably a reasonable proxy for profitability: a number of studies have found higher growth rates to be an indicator of higher profitability (Klepper 1996). For these reasons, we use growth rates as a dependent variable and an indicator of strategic success. In addition, studies of high-tech firms often use either the growth of revenues or the growth of employees as the performance metric (e.g., Hamilton et al. 2002). The standard approach when using revenue is to average it over a number of years (e.g., Sadler-Smith et al. 2003); we also adopt this approach.

5.2 Gibrat’s law

We use Gibrat’s Law—otherwise known as the Law of Proportionate Effect—as a departure point for empirical testing (Sutton 1997). Gibrat’s Law posits that firm growth is random and therefore independent of systematic determinants, such as firm size, age, or location. The basic model of firm growth then is as follows:

$$ {\rm GROWTH}\,(i,t) = G^{\beta}\left[ {\rm SIZE}\,(i,t), {\rm AGE}\,(i,t) \right] e^{\mu(i,t)} \quad t' > t > 0, \left[ \mu(i,t) \sim (iid) \right] $$
(1)

where GROWTH(i, t) is the growth of firm i between period t and t′ (Sales (i, t′) − Sales (i, t)); SIZE (i, t) is the size of firm i at time t; AGE (i, t) is the age of firm i at time t; β is a growth parameter; and μ(i, t) is firm i’s draw from the common distribution of growth rates. It is further assumed that μ(i, t) ∼ N(α, σ²), and therefore that:

$$ \mu\,(i,t) = \alpha + \xi (i,t)\;{\rm where}\;E\,[\xi (i,t)] = 0 $$

Taking the natural log of both sides of Eq. 1 produces the following cross-sectional relationship:

$$ \begin{aligned} {\rm GROWTH} &= \left( \ln {\rm SIZE}\,(i,t') - \ln {\rm SIZE}\,(i,t) \right)/d \\ &= \alpha + \beta_{s} \ln {\rm SIZE}\,(i,t) + \beta_{a} \ln {\rm AGE}\,(i,t) + \xi(i,t) \quad \left\{ \xi(i,t) \sim (iid),\quad t' > t > 0,\quad d = (t' - t) \right\} \end{aligned} $$
(2)

The Gibrat hypothesis is that the estimated coefficients β_s and β_a are not different from zero, so that growth is for the most part random. Contrary to Gibrat’s Law, most recent empirical studies of high-tech industries find firm growth to be negatively correlated with both firm age and firm size over a wide range of industries, countries, and regions (e.g., Hamilton et al. 2002).Footnote 3 However, the Law still forms a useful null hypothesis because a few empirical studies do find random growth rates for small firms in some circumstances, although not in high-tech sectors (e.g., Lotti et al. 2003; Audretsch et al. 2004).
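As a rough illustration of the dependent variable in Eq. 2, the sketch below computes the annualized log growth rate from a start and an end revenue figure. The function names and the single hypothetical firm are assumptions introduced for illustration only; the firm simply borrows the sample averages quoted in the Introduction (revenue of $15 million in 1995 and cumulative growth of 4,000% by 1999) and is not an observation from the data.

```python
import math

def annualized_log_growth(size_start, size_end, years):
    """Dependent variable of Eq. 2: (ln SIZE(i, t') - ln SIZE(i, t)) / d."""
    if size_start <= 0 or size_end <= 0 or years <= 0:
        raise ValueError("sizes and the period length must be positive")
    return (math.log(size_end) - math.log(size_start)) / years

def size_ratio_from_percent(percent_growth):
    """Convert cumulative percentage growth into the ratio SIZE(t') / SIZE(t)."""
    return 1.0 + percent_growth / 100.0

# Hypothetical firm (an assumption, not a sample observation):
# revenue of 15 ($ millions) in 1995, growing by 4,000% by 1999, so d = 4.
revenue_1995 = 15.0
revenue_1999 = revenue_1995 * size_ratio_from_percent(4000.0)
growth = annualized_log_growth(revenue_1995, revenue_1999, years=4)
print(round(growth, 3))
```

Under the Gibrat null, a value computed this way would be unrelated to the firm's initial size and age; the regression in Eq. 2 tests exactly that.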

Although Gibrat’s Law provides a convenient estimation framework, we do not expect it to fully capture the systematic determinants of firm growth. Access to capital is critical to the growth of new firms, and NTBFs require more capital than do entrants to other types of industries. At the same time, there is evidence that privately held firms are more capital-constrained than comparable publicly traded counterparts (Storey 1994). Additionally, public firms generally enjoy more comprehensive limited liability than do privately held firms, which may allow them to take more risks and, if they survive, to grow more rapidly (Davidsson et al. 2002). Given these factors, we augment the Gibrat equation to include a term accounting for whether the firm was publicly traded. Specifically, we add a dummy variable to indicate whether the firm is privately held (PRIVATE = 1) or publicly traded. We expect that privately held firms would have experienced slower growth, even during the relatively benign time period between 1995 and 1999, when NTBFs had easy access to capital.

Finally, in order to test the first two hypotheses regarding the impacts of location on growth, we add two location-based variables. The first measures (specialized) cluster effects (CLUSTER) and the second measures the degree to which a specialized cluster location is itself diversified (DIV).Footnote 4

The growth equations to be estimated can be therefore summarized as:

$$ \begin{aligned} {\rm GROWTH} =& \alpha + \beta _{s} \ln {\rm SIZE}\,(i, t) + \beta _{a} \ln {\rm AGE}\,(i, t) \\&+ \beta _{p}\, {\rm PRIVATE}\,(i, t) + \beta _{c}\,{\rm CLUSTER}\,(i, t) + \beta _{d}\,{\rm DIV}\,(i, t) + \xi (i, t) \end{aligned} $$
(3)

Given the literature on the estimation of Gibrat equations discussed above, we expect that: size is negatively related to growth (β_s < 0); age is negatively related to growth (β_a < 0); and privately held ownership status is negatively related to growth (β_p < 0).

The estimated coefficients on the CLUSTER and DIV terms provide the primary tests for H1 and H2. We expect that cluster proximity promotes growth, so that distance from a cluster retards growth (β_c < 0 when CLUSTER is measured as distance from the cluster),Footnote 5 and that diversity of the cluster promotes growth (β_d > 0). H3 suggests that cluster effects are stronger for biotech firms, where biotech firms are defined to include both bio-pharmaceutical and medical devices firms. In order to test this hypothesis, we include two interactive variables (CLUSTER*BIOPHARMA and CLUSTER*MEDICAL DEVICES); these coefficients are expected to be positive. H4 maintains that diversity effects are stronger for ICT firms. Accordingly, we include two interactive variables (DIV*IT and DIV*COMMUNICATIONS), whose coefficients are also expected to be positive.
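The right-hand side of Eq. 3, extended with the four interaction terms used to test H3 and H4, can be sketched as a single row of regressors per firm. This is a minimal illustration rather than the estimation code behind the results: the `Firm` record, its field names, and the sector labels are hypothetical, and the CLUSTER and DIV values are taken as given rather than derived from the HCMP data.

```python
import math
from dataclasses import dataclass

@dataclass
class Firm:
    revenue_1995: float  # initial size, SIZE(i, t)
    age_1995: float      # years since founding, AGE(i, t); must be > 0
    private: bool        # True if privately held (PRIVATE = 1)
    cluster: float       # CLUSTER measure, taken as given
    div: float           # DIV measure, taken as given
    sector: str          # hypothetical labels, e.g. "IT", "BIOPHARMA"

def regressors(firm):
    """Build the Eq. 3 regressors plus the H3 and H4 interaction terms."""
    def is_sector(name):
        # Sector dummy: 1.0 if the firm belongs to the named category.
        return 1.0 if firm.sector == name else 0.0

    return {
        "const": 1.0,
        "ln_SIZE": math.log(firm.revenue_1995),
        "ln_AGE": math.log(firm.age_1995),
        "PRIVATE": 1.0 if firm.private else 0.0,
        "CLUSTER": firm.cluster,
        "DIV": firm.div,
        # H3: cluster effects are expected to be stronger for biotech firms.
        "CLUSTER*BIOPHARMA": firm.cluster * is_sector("BIOPHARMA"),
        "CLUSTER*MEDICAL DEVICES": firm.cluster * is_sector("MEDICAL DEVICES"),
        # H4: diversity effects are expected to be stronger for ICT firms.
        "DIV*IT": firm.div * is_sector("IT"),
        "DIV*COMMUNICATIONS": firm.div * is_sector("COMMUNICATIONS"),
    }

# Hypothetical biopharmaceutical firm located in a relevant cluster.
example = Firm(revenue_1995=15.0, age_1995=5.0, private=True,
               cluster=1.0, div=3.0, sector="BIOPHARMA")
row = regressors(example)
```

Stacking one such row per firm gives the design matrix for an ordinary least squares fit; each interaction term is nonzero only for firms in the named sector, so its coefficient captures the additional effect for that sector.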

6 Data and measurement

Our primary data source is a sample of U.S. technology firms compiled by Deloitte and Touche (D&T) (Deloitte & Touche, n.d.). The sample consists of 500 young, fast growing, publicly traded and privately held firms in the software, communications, Internet, computers, semiconductors, medical instruments, biotechnology, life sciences, and “other” technology-related sectors. The sample is selected on the basis of high revenue growth between 1995 and 1999, and therefore focuses on successful firms. D&T compile the sample by combining data from three sources: (1) D&T’s 20 regional centers; (2) external on-line nomination of candidate firms, and (3) D&T database research on the publicly traded firms in the sample. As a consequence, the sample consists of firms from a variety of technology sectors and geographic locations across the U.S. For reasons of data availability, our sample size is 451.

The D&T database contains data on revenues in 1995 and 1999, which provides us with measures of initial size and firm growth. Because these firms were selected on the basis of growth, it is not surprising that their growth rates over the period were very high (Table 1). The database also contains information on the firm’s location (address) and postal (zip) code; the founding date of the firm, from which we compute its age as of 1995; and whether the firm traded on a public stock exchange, from which we create a dummy variable for privately held firms. Firms in the sample were typically very young, with a mean age of about 5 years in 1995. They were also relatively small, with average revenues of $15 million in 1995. A slight majority (56%) was publicly traded. Lastly, D&T allocates these NTBFs to eight business sectors using categories developed by Price-Waterhouse-Cooper (PWC): software, Internet-related, computers and peripherals, semiconductors/electronics, communications, biotechnology, medical/scientific/technical and “other” (Table 1, column 1).

Table 1 Descriptive summary of sample

Next, we utilize the Harvard Cluster Mapping Project (HCMP) to define relevant clusters (Harvard Cluster Mapping Project, n.d.). The HCMP provides a “cluster competitiveness” ranking of the top 20 U.S. metropolitan locations for a large number of industries. It compares locations that are more or less attractive to firms in a given business sector, based on indicators such as overall sector employment within each metropolitan area, the percentage of national sector employment in each metropolitan area, and sector wage rates within each metropolitan area. For example, the HCMP provides a ranking of the top 20 U.S. metropolitan locations for biopharmaceutical firms, based on a biopharmaceutical cluster employment indicator.

To utilize this location information, we first match the firm’s business sector (as determined by D&T) to a relevant technology “cluster” as defined by the HCMP (Harvard Cluster Mapping Project, n.d.). Each firm is allocated to a single technology cluster category. We allocate the eight PWC categories to 5 HCMP technology clusters (see Table 1, columns 1 and 4). It is important to note that for this purpose these HCMP technology clusters do not relate to location, but rather to the technology category to which the firm belongs, regardless of its location. A relevant cluster therefore refers to a cluster that is of the same technology industry category as the firm. As shown in Table 1, almost all firms are allocated to one of four relevant technology cluster categories: information technology (IT)—324 firms, communications equipment—40 firms, biopharmaceuticals—43 firms, and medical devices—43 firms. Thus, almost all firms fall into one of two broad categories, information and communications technology (ICT) or biotechnology (including medical devices). Only seven firms fall outside these two categories.

We then use the address of the firm to determine whether it was located in, or near its relevant cluster, where the relevant cluster is defined in several ways. The simplest measure of cluster location is a dummy variable indicating that the firm is located within a top-10 cluster, as defined by the HCMP. This same variable is also weighted by the rank of the cluster, once again as defined by HCMP. For example, if a firm is located in a top-10 cluster of its technology cluster category, it is assigned a value of unity. For the weighted measure, the dummy variable is divided by the rank of the cluster so that if the firm is located in the largest cluster, the variable equals 1; if the firm is located in the second largest cluster, the variable equals ½, and so forth. Table 1 indicates that about half of the sample firms are located within a top-10 cluster, but this does vary by technology cluster category.
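As an illustration, the dummy and rank-weighted cluster measures described above could be computed along the following lines. This is a minimal sketch: the function name, the use of `None` for firms outside any ranked cluster, and the `top_n` parameter are assumptions made for the example.

```python
def cluster_measures(firm_cluster_rank, top_n=10):
    """Return (dummy, rank_weighted) cluster measures for one firm.

    firm_cluster_rank: rank (1 = largest) of the relevant cluster in which
    the firm is located, or None if it is not located in a ranked cluster.
    The None convention and the parameter names are illustrative assumptions.
    """
    in_top = firm_cluster_rank is not None and 1 <= firm_cluster_rank <= top_n
    dummy = 1 if in_top else 0
    # Weighted measure: the dummy divided by the cluster rank (1, 1/2, 1/3, ...).
    weighted = dummy / firm_cluster_rank if in_top else 0.0
    return dummy, weighted


print(cluster_measures(1))     # firm in the largest relevant cluster
print(cluster_measures(2))     # firm in the second largest relevant cluster
print(cluster_measures(None))  # firm outside every top-10 cluster
```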

We determine the distance of a firm from a relevant cluster using three measures: distance from the largest relevant cluster, or distance from the nearest relevant top-10 cluster, either unweighted or weighted by its rank.Footnote 6 The weighted measure is employed to control for the possibility that larger clusters have greater influence on firm performance. In order to measure distances between a firm and the relevant cluster, the longitude and latitude of each cluster city and the city in which the firm was located were obtained from the U.S. Geological Survey (U.S. Geological Survey, n.d.) and Look-up Latitude and Longitude–USA (Look-up Latitude and Longitude, n.d.). For clusters that include more than one city (such as the New York area), the largest city was chosen. The relevant data for each firm are obtained by entering its zip code into the U.S. Census Bureau program (U.S. Census Bureau, n.d.). Then, each longitude and latitude is entered into The Great Circle Calculator program (Great Circle Calculator, n.d.) to compute the distance (in miles) between locations. In order to ensure the accuracy of the Great Circle Calculator’s computations, we also employed a surface distance calculation program (U.S. Department of Agriculture, n.d.) that uses the longitude and latitude of a city to compute distances. This method provides essentially the same results. The average distance between a firm and the largest relevant cluster is about 900 miles (1,500 km), while the average distance to the nearest top-10 cluster is just over 80 miles.Footnote 7 Thus, there is considerable geographic dispersion in the sample.
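For readers who wish to reproduce a distance calculation of this kind, a great-circle distance can be approximated with the haversine formula. The sketch below is not the Great Circle Calculator program itself; the Earth radius constant and the example coordinates (approximate values for Boston and San Francisco) are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in miles between two points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Approximate city coordinates, used only as an example.
boston = (42.36, -71.06)
san_francisco = (37.77, -122.42)
distance = great_circle_miles(*boston, *san_francisco)
log_distance = math.log(distance)  # distances enter the regressions in logarithms
print(round(distance), round(log_distance, 2))
```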

Table 2 Means, standard deviations and correlation matrix

The diversity of a cluster is measured in two ways, which we collectively term “Jacobian cluster diversity”. First, we use the Hachman Index for the city (SMA) in which the cluster is located. The Hachman Index is a widely used measure of regional industrial diversification. The Index calculates the degree to which the share of employment across industries in a region differs from the distribution of employment for the U.S. as a whole.Footnote 8 The maximum value of the Hachman Index is 1, which occurs when the region is as diversified as the U.S., and lower values indicate lower levels of diversification. Because of the different use of the term diversification in the business strategy literature, we consistently refer to this as “cluster metropolitan economic diversity”. In order to calculate the Index for the SMAs in which the cluster is located, industry and employee statistics for each relevant SMA and the entire U.S. were obtained from the Bureau of Labor Statistics (U.S. Bureau of Labor Statistics, n.d.). Using two-digit SIC codes, we then calculate the share of employment accounted for by an industry within each SMA, as well as the U.S. The mean Hachman Index for the HCMP cluster nearest to the firm is 0.946.
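The Hachman Index is commonly written as the reciprocal of the sum, over industries, of each industry's location quotient weighted by its regional employment share. The following sketch implements that common formulation; it is an assumption that this matches the exact expression used here, and the toy employment figures are hypothetical.

```python
def hachman_index(region_employment, national_employment):
    """Hachman Index under the common formulation 1 / sum(LQ_i * share_i).

    Both arguments map industry code -> employment. Every industry present
    in the region is assumed to have positive national employment.
    """
    region_total = sum(region_employment.values())
    national_total = sum(national_employment.values())
    weighted_lq_sum = 0.0
    for industry, employment in region_employment.items():
        if employment == 0:
            continue  # an absent industry contributes nothing to the sum
        region_share = employment / region_total
        national_share = national_employment[industry] / national_total
        location_quotient = region_share / national_share
        weighted_lq_sum += location_quotient * region_share
    return 1.0 / weighted_lq_sum


# Hypothetical employment counts by two-digit industry code.
national = {"20": 50, "35": 30, "73": 20}
same_mix = {"20": 5, "35": 3, "73": 2}     # mirrors the national distribution
one_sector = {"20": 10, "35": 0, "73": 0}  # fully concentrated region
print(hachman_index(same_mix, national))
print(hachman_index(one_sector, national))
```

A region whose employment shares mirror the national shares attains the maximum value of 1, while a concentrated region scores lower.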

As a second measure, we use the Harvard cluster data to measure the range of cluster activities that are found within a specific cluster location. We refer to this as the “cluster frequency”. The measure is created by counting the number of top-20 clusters of any kind found in the city in which the cluster is located. The HCMP identifies 41 technology cluster categories in total, and each category contains a list of the top-20 clusters. We simply count the number of times a particular location is listed as a top-20 cluster. For example, West Palm Beach is listed in three of the 41 technology cluster categories: Aerospace Engines, Agricultural Products, and Communications Equipment, so its cluster frequency is three. The mean cluster frequency for the HCMP cluster located nearest to the firm is 5.12.
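The counting procedure can be sketched as follows. The West Palm Beach entries follow the example in the text; "City A" and "City B" are hypothetical placeholders, and the function name and data layout are assumptions for illustration.

```python
from collections import Counter


def cluster_frequency(top20_by_category):
    """Count how many cluster categories list each location in their top 20.

    top20_by_category maps a cluster category name to its list of locations.
    """
    counts = Counter()
    for locations in top20_by_category.values():
        counts.update(set(locations))  # a location counts once per category
    return counts


toy_rankings = {
    "Aerospace Engines": ["West Palm Beach", "City A"],
    "Agricultural Products": ["West Palm Beach"],
    "Communications Equipment": ["West Palm Beach", "City B"],
    "Biopharmaceuticals": ["City A"],
}
frequency = cluster_frequency(toy_rankings)
print(frequency["West Palm Beach"])  # 3
```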

Table 2 presents descriptive statistics and the correlation matrix for the independent variables that are employed in the subsequent analysis.Footnote 9 Consistent with the Gibrat framework, firm size and age are measured in logarithms, as are the distance measures. The latter implies that any distance effects are non-linear. The correlation coefficients suggest that multicollinearity is not an issue. The highest correlation coefficients are found among the cluster location and distance measures, but these are treated as alternatives in the empirical analysis.

7 Results

7.1 Basic results

The basic empirical results are obtained from OLS estimation, with heteroskedasticity-consistent standard errors (Table 3). We initially assume that the cluster location terms are exogenous. We begin with a model that does not include location variables, and then add different location variables, using different measures of cluster and cluster diversity effects. As a consequence of using these different measures, the relevant coefficients cannot be compared across estimated equations (i.e., across columns in Table 3). Initial estimates included business sector dummies, but these are not collectively significant (using an F-test) and are not included in the reported results. Their exclusion does not affect the results. In addition, the inclusion of higher order terms for firm size and age provides no statistically significant coefficients. In all specifications, we find that growth is negatively and statistically significantly related to initial size, firm age, and ownership status. This confirms previous results indicating that growth is not random for samples of NTBFs. Smaller and younger firms experience higher growth rates, but privately held firms are penalized in terms of growth.

Table 3 Growth model regression results. Dependent variable: Logarithmic growth rate 1995–1999

Models (2)–(8) add different cluster and cluster diversity variables to the basic model in order to test H1 through H4. It is important to note that the cluster and cluster diversity measures are always consistent within any particular equation: if the cluster effect is measured by the distance to the nearest cluster, then the diversity effect is measured by the diversity of the nearest cluster; if the cluster effect is measured by location in a cluster, then the diversity effect is measured by the diversity of that cluster. Thus, the cluster diversity measure is conditional on the way in which the cluster effect is measured. In order to test whether clusters are more important for biotech firms (H3) and cluster diversity for ICT firms (H4), interactive terms were required and these proved to be highly collinear. Consequently, in Models (3)–(8) where interactive terms are present, we introduce cluster measures and cluster diversity measures separately. The results are broadly supportive of H1 and H3, but do depend on the exact measure employed. Support for H2 and H4 is more ambiguous.

Model (2) presents a specification that includes both direct cluster and diversity effects. Here, cluster effects are measured by the distance of a firm to the largest cluster relevant to that firm (e.g., the largest biopharmaceutical cluster if the firm is a biopharmaceutical firm) and diversity effects are measured by the Hachman Index associated with that cluster. The results indicate that the coefficient on the cluster term is negative and statistically significant (supporting H1), while the coefficient on the diversity term is positive, but not statistically significant (no support for H2). Similar results are obtained when other measures are employed for cluster and diversity effects (for example, distance to the nearest cluster and number of HCMP clusters in the nearest cluster). In addition, we considered alternative specifications, specifically one that included the cluster effect and an interactive term (CLUSTER EFFECT*DIVERSITY EFFECT). F-tests rejected this specification in favor of one that contained only a cluster term. Thus, at this stage, we find evidence to suggest that all NTBFs benefit from proximity to a cluster, but the diversity of the metropolitan area in which the cluster is embedded is not important.

Cluster effects are further examined in Models (3)–(5). In these specifications, we omit the cluster diversity term, but this does not affect the results. We find that, in general, cluster effects exist if defined by distance from a relevant cluster, but cluster effects do not exist when defined by location within a cluster. When cluster effects do exist, they are stronger for biotech firms. This can be seen by comparing Models (3), (4) and (5). When cluster effects are defined by distance from either the nearest top-10 cluster (weighted by rank) or by distance from the largest relevant cluster we find that growth rates decline with distance from the cluster, suggesting that cluster benefits spill over across distance. Moreover, the effects are stronger for biotech firms (a category which includes biopharmaceutical and medical devices firms), as hypothesized.Footnote 10 However, when cluster effects are defined by whether the firm is located in a top-10 cluster, we find no evidence of these effects. We also do not find them for biotech firms. Zaheer and George (2004) recently found that market value of biotech firms is not related to location in a cluster, although they did not consider distance. For this specification, when cluster effects are measured using dummy variables, we cannot find any evidence that clustering and growth are related. However, when cluster effects are measured as a continuous variable, we find that proximity to a relevant cluster is associated with enhanced growth and the effect is stronger for biotech firms.Footnote 11

Shaver (2006) suggests that results should be discussed in terms of economic as well as statistical significance. In particular, he suggests moving the independent variable across its range and observing the impact on the dependent variable, in turn evaluated against the mean of the dependent variable. Accordingly, the economic impact of distance is shown in Table 4 where distance is measured by kilometer increments from the largest relevant cluster. From Table 4, it is evident that distance from the cluster imposes a penalty in terms of foregone growth, and that the penalty increases relatively rapidly with distance, but the effects are non-linear. About half of the lost growth is experienced within 100 km, a result similar to that reported in Globerman et al. (2005). The penalty is quite high, particularly for biotech firms, for whom the loss is nearly three times higher (and the mean growth rate for biotech firms is lower). A biopharmaceutical company located 100 km from the largest cluster will experience a growth penalty of nearly 20% of the mean growth rate for biopharmaceutical firms. Thus, for biotech firms, foregone growth due to distance from the largest clusters is particularly significant.

Table 4 Impact on firm growth of distance from the largest cluster*

The distance penalty can also be summarized in terms of its impact on revenue. First, we consider revenue impacts for biopharmaceutical firms. Relative to the mean revenues of biopharmaceutical firms ($13.4 million in 1999), revenues would be reduced by 13% if the firm is located at a distance of 100 km from a biopharmaceutical cluster, 15% if located at a distance of 500 km, and 19% if located at a distance of 2,000 km. As the 1999 revenues for this sample of biopharmaceutical firms ranged from $1.4 million to $476 million, the revenue impact would vary from as little as $180,000 (for the smallest firm, if located 100 km from the cluster) to as much as $89 million (for the largest firm, if located 2,000 km from the cluster).
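The bounds on the revenue impact follow from applying the percentage reductions to the smallest and largest firms in the sample. A short check of that arithmetic, using the rounded percentages quoted above, is sketched below; the function name is an assumption for illustration, and the small gaps relative to the reported $180,000 and $89 million presumably reflect rounding of the percentages.

```python
def revenue_impact(revenue, pct_reduction):
    """Revenue lost when revenue falls by pct_reduction percent."""
    return revenue * pct_reduction / 100.0


smallest_firm = 1.4e6  # lowest 1999 biopharmaceutical revenue in the sample
largest_firm = 476e6   # highest 1999 biopharmaceutical revenue in the sample

low = revenue_impact(smallest_firm, 13)  # smallest firm, 100 km from the cluster
high = revenue_impact(largest_firm, 19)  # largest firm, 2,000 km from the cluster
print(f"{low:,.0f}")   # 182,000
print(f"{high:,.0f}")  # 90,440,000
```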

ICT firms also benefit from location within a relevant technology cluster, but distance from the cluster does not penalize them as heavily. The mean ICT firm in our sample earned $92 million in revenue in 1999. Our results predict that the mean firm’s revenue would be reduced by 10% if it was located at a distance of 100 km from an ICT cluster, 10% if located at a distance of 500 km and 11% if located at a distance of 2,000 km. However, the revenue range of ICT firms in this sample is much greater than the revenue range of biopharmaceutical firms. For 1999, the range for this sample of ICT firms was from $1.0 million to $37 billion. Notionally, therefore, the total impact on revenues could be from as little as $100,000 to as much as $4.1 billion. Additionally, Model 3 suggests that distance from the nearest (weighted) cluster is also important for biotech firms, and the effects are qualitatively similar to those found for distance from the largest cluster. In total, these results suggest not only that cluster benefits are relatively localized, but also that these benefits depend on the size of the cluster.

Models (6)–(8) test for the effects of cluster diversity. In the reported specifications, we do not include a cluster term, but this does not affect the results. Model (6) uses the Hachman Index as the measure of cluster metropolitan economic diversity, with the cluster being defined as the nearest top-10 cluster. Model (7) uses cluster frequency as the diversity measure, but in this case it is measured relative to the cluster frequency of the SMA in which the firm is located. We use the relative measure, since the city in which the firm is located may itself provide diversity benefits. For this model, the relevant cluster is the largest cluster. Model (8) measures diversity by cluster frequency in a cluster in which the firm is located. We hypothesize that cluster metropolitan economic diversity enhances growth, and the effect is stronger for ICT firms. The results provide only mixed support for the hypotheses. Direct cluster diversity effects terms are always negative and statistically significant, suggesting that in general diversity is associated with lower growth rates, contrary to H2. However, the interaction terms suggest that software and communications equipment firms’ growth is enhanced when they are located within a diverse economic area, and the coefficients indicate that for these firms the net effect is positive, as hypothesized.Footnote 12

It is worth noting that there is little correlation (r = 0.085) between the Hachman Index measure and the cluster frequency measure. This is not as surprising as it appears at first blush: the Index measures diversity across broad levels of economic activity, whereas cluster frequency measures the degree to which a given location exhibits cross-cluster strength. Consequently, the Index records the degree to which any metropolitan area is as diversified as the U.S. as a whole, taking all industries into account. The cluster frequency measures only whether a location hosts a top-20 cluster in a specific sector. In order to further explore the effects of diversity, we estimated variations of Models (6)–(8) in which diversity of the SMA in which the firm is located is used in place of the diversity of the relevant cluster. These (unreported) results are similar to those reported above. We find no evidence that economic diversity and growth are in general related, but consistent evidence that ICT firms benefit from location in areas with high economic diversity.

7.2 Exogeneity

A problem associated with studies of cluster and agglomeration effects arises from the possibility that location choices are not exogenous. In our case, there may be unobserved firm-specific factors (such as quality of management) that result in both better performance and better location choice. As a consequence, growth and location choice are endogenous, resulting in biased estimates of location effects. In addition, causality is called into question to the extent that location choices reflect firm characteristics, including performance.

From a statistical perspective, unobserved variables such as management quality simultaneously affect location choice and performance so that the error term is correlated with an independent variable (location). It is difficult to fully address the endogeneity issue without well-defined instruments, time-series data, or a carefully designed experiment drawn on unanticipated shocks in the data. We do not have time-series data, nor is there an obvious instrument or controlled experiment available for the relevant time period. As these solutions are not available to us, we cannot completely address the endogeneity problem. However, we do offer some suggestive evidence in this regard.

Estimation problems associated with endogenous independent variables can be addressed if an appropriate instrumental variable can be found. A suitable instrument must be correlated with the suspected endogenous variable and uncorrelated with the error term, but we could find no naturally occurring variable that would satisfy both conditions. Therefore, we adopted an instrumental variables estimation procedure, using the method initially proposed by Evans and Kessides (1993), and recently used by Edwards and Waverman (2006) and Cubbin and Stern (2006). We construct a rank index-based instrument for all non-dummy location variables. For example, we sorted the log (distance to largest cluster) variable into three ranks (1, 2, 3) and so created a location rank index based on distance. By construction, this rank index is correlated with the original distance term and will also be orthogonal to the error term if exogenous disturbances do not affect a firm’s rank distance. This condition is unlikely to be violated except for observations near the rank thresholds and will be more likely to hold when the number of ranks is relatively small (Edwards and Waverman 2006). In other words, better managers may choose more advantageous locations without changing distance rank.
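A rank index of this kind can be sketched as follows. The text specifies three ranks but not the thresholds between them, so the equal-sized grouping used here is an assumption, as are the function name and the toy distances.

```python
import math


def rank_index(values, n_ranks=3):
    """Assign each value a rank from 1 to n_ranks by its sorted position.

    Equal-sized groups are an assumption of this sketch. Tied values are
    ordered by their original position and may fall in different ranks.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = position * n_ranks // len(values) + 1
    return ranks


# Hypothetical distances (miles) from the largest relevant cluster.
distances = [10, 2000, 150, 5, 900, 60]
log_distances = [math.log(d) for d in distances]
print(rank_index(log_distances))  # [1, 3, 2, 1, 3, 2]
```

Because the index changes only when a firm crosses a rank threshold, small firm-specific disturbances in location leave it unchanged, which is the property the instrument relies on.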

A regression of each rank index on each relevant distance variable confirms that the instrument is correlated with the original variable. For example, a simple regression of log (distance to largest cluster) on the rank index produced a coefficient on the latter term of 1.36 with a t-statistic of 18.64. Following Cubbin and Stern (2006), we use the residual from that equation to test for endogeneity of the log (distance) term. The test is undertaken by including that residual in the basic equation (column 1 in Table 3), and testing for the significance of its coefficient. If there is evidence of endogeneity, the predicted value of log (distance) is then used as an instrument. The results suggest that the log (distance) terms are endogenous, but the cluster diversity terms are not. The latter result is not surprising, since the cluster diversity terms refer to the characteristics of the cluster, and cannot be affected by an individual firm. We therefore re-estimated all equations involving distance from a cluster (nearest or largest) using the IV method described above.Footnote 13 The instrumental variables estimates (Table 5) are for the most part similar to OLS estimates reported in Table 3. The major, and important, difference is that the direct distance effect is reduced in magnitude and statistical significance. But the interaction terms remain statistically significant, and of about the same magnitude. In addition, none of the control variables are impacted in any significant way through estimation by instrumental variables.
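The first-stage step can be sketched as a simple regression of log distance on the rank index, from which the fitted values and residuals are retained. The code below covers only that step; the function names and toy data are assumptions for illustration, and the subsequent inclusion of the residual (or fitted value) in the growth equation would require a multiple-regression routine that is not shown.

```python
def simple_ols(x, y):
    """Intercept and slope from a least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def first_stage(log_distance, rank):
    """Regress log distance on the rank index; return (fitted, residuals)."""
    intercept, slope = simple_ols(rank, log_distance)
    fitted = [intercept + slope * r for r in rank]
    residuals = [y - f for y, f in zip(log_distance, fitted)]
    return fitted, residuals


# Hypothetical data: rank index and log distance for six firms.
rank = [1, 1, 2, 2, 3, 3]
log_distance = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
fitted, residuals = first_stage(log_distance, rank)
# The residual is the candidate regressor for the endogeneity test;
# the fitted value is the candidate instrument for log distance.
print([round(f, 2) for f in fitted])
```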

Table 5 Growth model regression results, instrumental variable approach. Dependent variable: Logarithmic growth rate 1995–1999

Although the IV estimates provide some confidence in the parameter estimates, it is difficult to address the causality issue in the absence of a well-specified structural model or time-series data. We do have data on the relocation activity of our sample firms between 1999 and 2003 and found that some 50 firms moved, but the vast majority of these (84%) moved within the same metropolitan area. This indicates that the sample firms are geographically immobile, and perhaps that young firms are unlikely to move for strategic (performance) reasons. However, this is at best suggestive, and we cannot claim to infer causality from our results.

8 Conclusions

This study examines the growth performance of a sample of successful new technology-based firms (NTBFs), focusing on the performance effects of co-location. We approach the problem from the perspective of firms that are attempting to obtain or augment their resources and capabilities. In general, our results suggest that cluster benefits exist, but are heterogeneous, even for our sample of relatively similar firms. Specifically, not all sample firms benefit from co-location, and any benefits are found to be firm-, sector-, or even cluster-specific. It would not be surprising (although our sample cannot directly address this) if cluster benefits are also variable for more heterogeneous high-tech firms.

For most sample firms, we find limited statistical support for the hypothesis that location in, or near, specialized clusters is positively related to growth performance. However, the magnitude of some of the estimated coefficients suggests that the economic impact of distance on firm growth might be substantial. We do find substantial and robust evidence suggesting that specialized cluster effects are associated with higher growth rates for young biotech firms, as hypothesized. Specifically, we find that for these firms, distance from the center of a relevant cluster matters, but that location within or outside of that cluster per se does not. Contrary to our hypothesis, we also find that location in diversified clusters is not associated with enhanced growth for all sample firms. However, cluster diversity does provide benefits for sample ICT firms, as hypothesized. Our results also hint at the importance of cluster size in determining firm growth rates, but we offered no a priori hypotheses in this regard; we suggest that future research should focus more on the determinants and effects of both cluster size and cluster diversity.

The results highlight several methodological issues relating to cluster definition and boundary specification. The results suggest that although it is possible to specify the center of a cluster on an a priori basis, it is far more difficult to do so for its boundary. Performance effects, when they exist, extend beyond political jurisdictions, such as cities or even states. When cluster effects are measured by distance from either the nearest top-10 cluster (weighted by rank), or by distance from the largest relevant cluster, we find some evidence that growth rates decline non-linearly with distance from the cluster, particularly for biotech firms. We find no evidence that location within a cluster (as we define it) per se matters. Similarly, Zaheer and George (2004) also find no evidence that location within a cluster matters for biotech firms and Keller (2002) finds that there are well-defined distance gradients for technology spillovers. Our results are also consistent with those of Rosenthal and Strange (2005) who arrive at similar conclusions based on their analysis of firm creation within the New York metropolitan area, and with Globerman et al. (2005) who find that the performance of Canadian high-tech firms declines with distance from the center of Canada’s largest city, Toronto. Even though these studies establish the “center” in different ways, it is interesting to note that all find that “distance” from the center is what matters. Given these various findings on the importance of distance, an important research question becomes: distance from what? Should distance be measured from a city or government center, or does some other reference point more accurately capture the spillover benefits?

There are a number of limitations to this study. First, endogeneity is always an issue for studies of this type. In this study, we measured directly the impact of co-location on firm performance, rather than inferring location benefits from locational choices. This approach leaves open the possibility of estimation biases and difficulties in causal inference, notably that firm performance and resource characteristics drive location choices. Although we have constructed an instrumental variable and used it for estimation purposes, we would have been more comfortable if we had been able to find a naturally occurring instrument. Despite these efforts, lack of a well-specified structural model and time-series data makes it difficult to establish causality. Thus, we view our approach to endogeneity as suggestive but not definitive and further research in this area is warranted.

A second limitation arises from the nature of the sample. Because the sample is limited to high-growth, successful NTBFs, we are unable to conclude that all high-tech firms would benefit from clustering. A sample restricted to successful firms does have an advantage in that it limits unobserved sources of firm heterogeneity, which is important when panel data are not available. However, such a sample also limits the generalizability of the findings. On a priori grounds it is difficult to speculate as to the direction and extent of any bias arising from our focus on successful small firms. One plausible inference of the RBV is that lower growth firms are the most resource-constrained, and benefit more from the resources that are available in clusters, a conjecture supported by Shaver and Flyer (2000). If this is the case, our results underestimate the relationship between clustering and growth. On the other hand, an alternative hypothesis is that lower growth firms may lack the absorptive capacity to benefit from cluster resources. In this case, our results would be biased upwards. This question clearly warrants further empirical research.

A third, and related, limitation arises from the restricted set of independent variables we are able to examine. The model presented here focuses on firm size, age, ownership status, and location measures as determinants of growth. While our results indicate that these variables are important, they are not the only factors that influence firm growth rates. For example, Eisenhardt and Schoonhoven (1990) have shown the importance of firm-specific managerial variables. Similarly, the model does not account for life-cycle effects that are likely to be important. Many of these problems can only be addressed by the use of panel data.

Fourth, we employ measures of “specialization” or “cluster diversity” that are quite broad in terms of sectors. Recent literature is beginning to focus on a narrower definition of “relevant cluster” (Aharonson et al. 2004) and this may be an important factor for capturing specialization effects. In addition, our diversity measures may not fully capture Jacobs’ emphasis on the flow of ideas among agents because they are not explicitly based on the knowledge base in each sector.

In sum, this article suggests several opportunities for future research on the relationship between clusters and firm performance. It also suggests that high-tech entrepreneurs and venture capitalists should consider carefully the consequences of location choices. For those involved in the creation of new high-tech firms, our results suggest that it is distance, and not location per se that matters for firm performance, and the effects will differ by sector.