1 Introduction

Regional industrial structure is, in theory, of great importance to firm productivity and economic growth potential. In a competitive industrial structure that is an alternative to oligopoly and exhibits a high degree of vertical disintegration and specialization, agglomeration economies are constructive for small firms to compensate for their disadvantages in scale economies (Carree and Thurik 1999). A diverse industrial structure would also enhance productivity and promote innovation through cross-fertilization, have a positive impact on value-added growth, and be very important for the development of high-tech industry and the attraction of new firms (Jacobs 1969; Henderson et al. 1995; Duranton and Puga 2001; Batisse 2002; Puga 2010). In contrast, regional industrial concentration may negatively affect firm productivity and local economic growth (Chinitz 1961; Rosenthal and Strange 2003). A region dominated by a few large firms in an industry would generate fewer positive externalities associated with agglomeration economies, ultimately diminishing productivity and obstructing entrepreneurship and innovation (Chinitz 1961; Saxenian 1994). This implies that the regional industrial structure may help to explain differences in economic performance and dynamics between regions with a similar level of industrial agglomeration (Chinitz 1961; Saxenian 1994; Rantisi 2002).

Theoretically, a highly concentrated industrial structure may weaken agglomeration economies by limiting the degree of intermediate input sharing, labor market pooling, and knowledge spillover. The reasons are as follows. First, large firms have economic advantages in seeking nonlocal suppliers for their intermediate inputs, which reduces the size of the local market of independent suppliers (Enright 1995; Porter 1998). At the same time, large manufacturers may offer long-term, large-volume supply contracts and are therefore more attractive to local suppliers than small manufacturers (Booth 1986). Local suppliers are more responsive to the input demand of large manufacturers than to that of small ones, and the presence of large manufacturers may raise input costs for small ones in the same region (Lee et al. 2010).

Second, dominant firms in a region may negatively affect labor-pooling economies for small firms in the same industry. This is because workers, especially those with skills and experience, are more attracted to large firms that offer better compensation packages and more stable employment opportunities than small firms (Booth 1986; Audretsch 2001). Third, a regional economy dominated by large firms tends to reduce knowledge spillovers and nurture business cultures that discourage innovation and adjustment to changing markets (Saxenian 1994; Chinitz 1961; Carree and Thurik 1999). Large dominant firms tend to be vertically integrated, which decreases face-to-face contact across firms (Enright 1995). If an industry is dominated by a few large inward-looking firms in a region, all other firms may suffer from a lack of flexibility and be insensitive to innovation (Porter 1998). A concentrated industrial structure would also limit entrepreneurship (Chinitz 1961), which further retards the generation of new products and technologies, as small firms are major sources of innovation and industrial evolution (Acs 1992; Audretsch 2001).Footnote 1

Empirically, the literature offers only a few papers, with limited and mixed conclusions on the impact of industrial structure on firm productivity. Feser (2002) identifies a significantly negative relationship between industrial dominance and plant-level productivity for the innovation-intensive measuring and controlling devices industry but no significant influence of industrial concentration for the low-tech farm and garden machinery industry. Drucker and Feser (2012) and Drucker (2013) conduct cross-section regressions that examine both the direct effect of industrial dominance on plant-level productivity and its intervening effect through limiting agglomeration economies. They conclude that a concentrated regional industrial structure is directly associated with lower productivity, but does not limit the types of agglomeration economies measured, such as input sharing, labor pooling, and knowledge spillover. Carree and Thurik (1999) reveal that industrial concentration matters for national industrial output, but the direction of the impact differs between developed and developing countries. Gopinath et al. (2004) find a nonlinear relationship between industrial concentration and productivity growth in US manufacturing industries.

This paper investigates the impact of regional industrial structure on firm productivity. More specifically, the paper attempts to answer the following research questions: (1) Does regional industrial structure affect firm productivity? (2) Does regional industrial structure affect localization and urbanization economies? and (3) Does the impact of industrial structure on firm productivity vary across types of firms or periods?

These questions are examined using firm-level data from China, which offers a unique case for three reasons. First, local industrial dominance underwent dramatic changes during the market-oriented reform, but its effects have not been thoroughly investigated. Important causes of the changes include, but are not limited to, the privatization of state-owned enterprises, integration into global markets (e.g., entering the WTO), and the gradual removal of government protection. In general, local own-industry dominance became weaker: the Herfindahl index calculated in this research shows a median value of 0.13 in 1998 and 0.07 in 2007. Second, China has adopted strong industrial policies since 1978 to promote its economic growth. This is manifested by the designation of numerous special development zones throughout the country, such as industrial parks, high-tech districts, and export-oriented zones. A regionally dominant firm may thus receive preferential treatment through taxes and loans, as well as legal protection from local governments. Third, local industrial policies in less-developed areas in China mainly focus on attracting large firms. Although these strategies may promote economic growth in the short run, the large and often dominant firms in these areas are always found to be resource-oriented and vertically integrated and hardly contribute to local industrial innovation or entrepreneurial activities (Gai et al. 2015; Tang 2011).Footnote 2 The negative influence of dominant firms, however, remains underexplored in the literature on agglomeration economies in China. At the same time, our findings have general implications given China's increasingly dominant private sector and well-established market economy. In the industrial sector, for instance, state-owned and state-controlled enterprises contributed 34% of total output in 1995, but only 12% in 2010 and 8% in 2017. The private economy provided more than 80% of urban jobs and more than 60% of GDP in 2017.

This paper is organized as follows. The next section discusses the literature; Sect. 3 presents data, measures, and models; Sect. 4 discusses results, and finally policy implications and conclusions are presented in Sect. 5.

2 The literature

Chinitz (1961) is among the first studies that pointed out the importance of regional industrial dominance to the growth potential of firms. He argues that New York is much more entrepreneurial than Pittsburgh as the former is distinguished by a large number of small firms and the latter is dominated by a few large integrated steel firms. The concentrated industrial structure in Pittsburgh limits the availability of capital for startups, reduces input sharing and labor pooling, and impedes entrepreneurial activities. In contrast, Rantisi (2002) reports that more than 85% of firms in New York’s garment district are small- to medium-sized enterprises, which compete with each other, benefit from specialized labor and services, and monitor rival firms’ performance and practices.

Saxenian (1994) compares Boston’s Route 128 and northern California’s Silicon Valley and concludes that the fundamental difference in the regional industrial structure is the key factor to explain their diverging growth trajectories in the 1980s. Both areas were centers of electronics and high-tech industries. When facing the changing market, Silicon Valley successfully made the transition to software and other computer-related industries, generating a large number of successful startups. In contrast, Route 128 failed to make the transition to smaller workstations and personal computers, which led to continuous stagnation and decline. Silicon Valley has a regional network-based industrial structure that promotes collective learning and horizontal communication among different firms. The Route 128 region is dominated by a few large corporations that perform their own work, and the industrial structure is relatively rigid and hierarchical. Thus, it is difficult for firms to adjust to the changing market and for small firms to survive in the Route 128 area.

Quantitative studies on the impact of regional industrial structure on agglomeration economies have yielded mixed and inconclusive results. Henderson (2003) shows that agglomeration economies are mainly created through the large number of establishments rather than their large size. Rosenthal and Strange (2003) argue that small firms generate greater external effects than medium-sized or large establishments: they find that the concentration of own-industry employment in small establishments has a larger positive impact on both the birth and employment of new firms. Carree and Thurik (1999) reveal that the influence of industrial concentration (employment share of large firms) on national industry output can be either positive or negative in twelve European countries, depending on the industry. They conclude that industrial concentration may enhance production in less-developed countries and impede production in more developed countries. Gopinath et al. (2004) demonstrate an inverted U-shaped relationship between industrial concentration, measured by a four-firm concentration ratio, and productivity growth in US manufacturing industries. Acs and Audretsch (1988) report a negative relationship between concentration and the innovation rate at the four-digit industry level. Levin et al. (1985) and Lee (2005) show that the effects of industrial concentration on innovation vary with industry conditions, the cost of imitation, patent protection, and the importance of a firm’s technological competence in an industry. Studies that measure local competition as the number of firms per worker suggest that a more competitive industrial structure would promote both employment growth (Glaeser et al. 1992) and firm birth (Rosenthal and Strange 2003; Glaeser and Kerr 2009).Footnote 3

A few empirical studies have directly examined the relationship between regional industrial dominance and the performance of firms in the same industry. Feser (2002), Drucker and Feser (2012), and Drucker (2013) incorporate both regional industrial dominance and agglomeration economies directly into the plant-level production function. Feser (2002) measures industrial concentration as the share of total sales made by the four largest firms (four-firm concentration ratio) in a commuting zone and uses plant-level data in 1992 to investigate two industries in the USA: the farm and garden machinery industry and the measuring and controlling devices industry. Feser finds a significantly negative relationship between regional industrial dominance and plant-level productivity for the innovation-intensive measuring and controlling devices industry but no significant influence of industrial concentration for the low-tech farm and garden machinery industry. Drucker and Feser (2012) and Drucker (2013) use plant-level data in 1992, 1997, and 2002 for three sectors in the USA: rubber and plastics, metalworking machinery, and measuring and controlling devices. Based on the firms’ shipment values, the researchers measure concentration by three indicators: the five-firm concentration ratio, the Herfindahl index, and the Rosenbluth index. Based on cross-section regressions by year, they conclude that a concentrated regional industrial structure is directly associated with lower productivity, but that it does not limit the three sources of agglomeration economies that they are able to measure, namely input sharing, labor pooling, and knowledge spillover.

There are plenty of studies on industrial agglomeration in China. Industries in China exhibit agglomeration economies, but the magnitude of their effects appears to be weaker than in developed countries like France, the UK, and the USA. The reason may be that Chinese industries are subject to a high degree of market fragmentation and regional protectionism (Lu and Tao 2009; Long and Zhang 2012). Ke (2010) reveals that industrial agglomeration and city-level labor productivity are positively, mutually, and causally related, which implies that industries tend to concentrate in more productive areas to achieve higher productivity. Studies examining agglomeration effects on firm performance in China have found that agglomeration could promote or discourage productivity, depending on the size or type of agglomeration (Batisse 2002; Fu and Hong 2011; Lin et al. 2011; Yang et al. 2013; Hu et al. 2015). Chen and Wu (2014) suggest that agglomeration may help to achieve social goals, because they find evidence that agglomeration in China increases a firm’s pension contribution. We do not find any studies on the relationship between regional industrial structure, agglomeration economies, and firm productivity in China. Our research fills this gap.

3 Data, variables, and model specification

3.1 Data

The data used in this study come from the China Annual Survey of Industrial Firms collected by the National Bureau of Statistics (NBS). The data cover all state-owned enterprises (SOEs) and those non-state-owned enterprises (non-SOEs) with annual sales greater than 5 million RMB, the so-called above-scale firms.Footnote 4 They report detailed information for individual firms, including location, industrial code, and production inputs and outputs. These firm-level data serve as the cornerstone of many aggregate industry statistics used and published by NBS, and a comparison of our data with NBS’s aggregate data for many key variables indicates a high level of consistency. Holz (2008) considers the data quite reliable, particularly after 1998, because they include large enterprises with well-established accounting systems, and these enterprises report regularly to the statistical authority so that inconsistencies can easily be detected. Since 2007, these data have been used extensively in many publications and have earned the trust of Chinese and international scholars (Brandt et al. 2014; Ding and Niu 2019).

Manufacturing firms in the period between 2000 and 2007 are used. We do not use data after 2007, because many key variables, such as value-added and material inputs, are no longer available. We exclude the data in 1998 and 1999, because the shares of SOEs are quite large in those 2 years, exceeding 30%. The 2 years, however, are used in constructing variables of lagged agglomeration. We drop industries in the mining, electricity, gas and water sectors, because the location and productivity of mining firms largely depend on natural conditions, and firms in the electricity, gas and water sector are mostly government owned.

While different studies do not use exactly the same methods to clean and process the data, we primarily follow the steps of Brandt et al. (2014). This mainly involves four types of work. First, we link firms over time by using not only the unique ID assigned by NBS, but also a combination of other information such as the firm’s name, phone number, and address. Second, nominal output and input variables (raw materials and intermediate inputs) are deflated at a finely disaggregated industry level. Third, external data, including the 1993 annual enterprise survey data and the nominal capital stock at the two-digit industry level by province, are combined with the firms’ nominal capital stock to estimate the real capital stock of firms.Footnote 5 Finally, we fix data problems or errors including typos, missing values, changes of industrial classification in 2002, and changes in the geographic codes used to identify the locations of firms.

3.2 Variables

3.2.1 Measuring regional industrial dominance

We use the Herfindahl index of sales in a city to measure regional industrial concentration. The index in city j is given by

$$ \text{Herfindahl index}_{j} = \sum_{i \in j} \left( \frac{\text{Sale}_{i}}{\sum_{i \in j} \text{Sale}_{i}} \right)^{2} , $$

where \( {\text{Sale}}_{i} \) is the sales of firm i located in city j, and \( \sum \nolimits_{i \in j} {\text{Sale}}_{i} \) denotes the total sales of city j.

We chose the Herfindahl index mainly because it considers the full distribution of firm size and is superior to the concentration ratio in reflecting market structure, as suggested by Scherer and Ross (1990) and Amato (1995). As the firms in our data represent most of the sales (90.9% in 2004), the influence of missing small firms should be trivial, if any.Footnote 6 In the comparison of several indicators for industrial structure, Amato (1995) suggests that the leading firm share is sometimes also a good measurement. To check the robustness of our results to alternative indicators, we use the leading firm’s share, defined as the share of the largest firm in a city.
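The two concentration measures can be illustrated with a minimal sketch. The function name, the `(group, sales)` input layout, and the toy sales figures are assumptions made for illustration; the group key stands for whatever unit the index is computed over (a city, or a city-industry cell), and sales are assumed to be strictly positive.

```python
from collections import defaultdict

def concentration_by_group(firms):
    """Return {group: (herfindahl, leading_share)} from (group, sales) pairs.

    Assumes every group has positive total sales.
    """
    sales_by_group = defaultdict(list)
    for group, sales in firms:
        sales_by_group[group].append(sales)

    result = {}
    for group, sales in sales_by_group.items():
        total = sum(sales)
        shares = [s / total for s in sales]
        herfindahl = sum(share ** 2 for share in shares)  # sum of squared shares
        result[group] = (herfindahl, max(shares))         # leading firm's share
    return result

# Hypothetical data: city A has unequal firms, city B has four equal firms.
firms = [("A", 50.0), ("A", 30.0), ("A", 20.0),
         ("B", 25.0), ("B", 25.0), ("B", 25.0), ("B", 25.0)]
indices = concentration_by_group(firms)
```

With equal-sized firms the index collapses to one over the number of firms, which is why city B's index equals its leading firm's share, while the unequal city A scores higher on both measures.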

3.2.2 Measuring agglomeration economies

The spatial unit for measuring agglomeration in this paper is the city. We define cities as prefectures’ city proper and county-level cities. The number of cities varies between 650 and 670 in the study period. These cities contain approximately three-fourths of the manufacturing firms in our data. In 2000, they occupied 18% of land area and produced 84% of GDP in the industrial and service sectors. We exclude counties, which are mainly rural areas. As shown later, this definition still captures most of the effects of agglomeration economies in our estimation. Under China’s industrial classification, there are 30 2-digit manufacturing sectors and 162 3-digit sectors. We focus on 2-digit industrial sectors and use 3-digit industries for robustness checks.

We use total employment in own-industry activities as a proxy for localization agglomeration and total employment in other-industry activities as a proxy for urbanization agglomeration. Alternatively, agglomeration can be proxied by the number of firms or by population. We use employment because our data include the majority of manufacturing workers but only a fraction of manufacturing firms.Footnote 7 Agglomeration may also generate negative consequences such as congestion, so what we estimate is the net effect of agglomeration.
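As a rough sketch of how the two proxies could be assembled from firm records, the snippet below sums employment by city-industry cell and subtracts it from the city total. The function name, the `(city, industry, employment)` layout, and the toy numbers are hypothetical; whether a firm's own employment should be excluded from its own-industry total is a design choice the text does not specify, and this sketch leaves it in.

```python
from collections import defaultdict

def agglomeration_measures(firms):
    """Map (city, industry) -> (own_industry_emp, other_industry_emp).

    firms: iterable of (city, industry, employment) records.
    """
    city_total = defaultdict(int)
    cell_total = defaultdict(int)
    for city, industry, emp in firms:
        city_total[city] += emp
        cell_total[(city, industry)] += emp
    # Other-industry employment is the city total minus the own-industry cell.
    return {
        (city, industry): (own, city_total[city] - own)
        for (city, industry), own in cell_total.items()
    }

# Hypothetical records
records = [("A", "textiles", 100), ("A", "textiles", 50),
           ("A", "steel", 300), ("B", "steel", 80)]
measures = agglomeration_measures(records)
```

Every firm in the same city-industry cell shares the same pair of values, which is why the key regressors vary only at the city level.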

Employment size is probably a better measure of urban agglomeration in China than employment density. The boundaries of Chinese cities (city proper and county-level cities) are administratively delineated. A city typically consists of an urban core or built-up areas, surrounded by rural areas (Ding 2013). The density of the city, therefore, is heavily affected by the size of its rural areas. This problem has become severe, as many adjacent rural areas (counties or towns) have been administratively merged into the city proper in the past decades. Merging and/or annexing a county could increase the area of a city proper by more than one-third and dramatically reduce its density without any significant change in its population or employment. Density is also a poor measure of agglomeration in China because government-led land development often results in excessive land supply (Ding 2007).
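A short worked example, using symbols that are ours rather than the paper's, shows the mechanical effect. Let density be \( D = E / S \), with employment \( E \) and land area \( S \). If annexation raises the area by one-third while employment is unchanged,

$$ D' = \frac{E}{\tfrac{4}{3} S} = \tfrac{3}{4} D , $$

so measured density falls by 25% even though nothing about the city's economic activity has changed.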

Table 1 provides descriptive statistics of key variables. It illustrates that firm size varies substantially, as indicated by very large standard deviations relative to the mean values of value added, employment, and capital. At the mean level, a firm hires 283 workers, uses capital worth approximately 40 thousand RMB, and produces value added slightly above 22 thousand RMB. In the same city, the firm has, on average, nearly nine thousand own-industry workers and almost 179 thousand other-industry workers. The average Herfindahl index is about 0.18 in the own industry and about 0.06 in other industries. The Herfindahl index and agglomeration variables vary widely, as shown by their standard deviations.

Table 1 Descriptive statistics of variables

3.3 Empirical specification

A direct way to test the effects of external environment such as agglomeration economies on firm productivity is to estimate the production function (Rosenthal and Strange 2004). To make our empirical results comparable to the literature, similar to Henderson (2003), Drucker and Feser (2012) and Drucker (2013), we estimate the following model where the dependent variable is a firm’s output, and independent variables include inputs as well as measures of agglomeration economies and industrial concentration:

$$ \ln\left( VA_{it} \right) = \beta_{1} H_{jt} + \beta_{2} A_{jt} + \beta_{3} \left( H_{jt} * A_{jt} \right) + \gamma \ln\left( X_{it} \right) + \delta_{t} + \theta_{ij} + \varepsilon_{ijt} \qquad (1) $$

for firm i in location j at time t. In function (1), the dependent variable \( VA_{it} \) represents the firm’s value added. The variable \( H_{jt} \) refers to the Herfindahl index, and \( A_{jt} \) refers to measures of agglomeration. We use the interaction term \( H_{jt} *A_{jt} \) to examine how regional industry dominance affects the productivity effect of agglomeration economies. As we have both localization and urbanization agglomeration (\( L_{jt} \) and \( U_{jt} \)), we have two interaction terms, \( H_{jt} *L_{jt} \) and \( H_{jt} *U_{jt} \). The term \( X_{it} \) includes a firm’s input data such as number of workers and capital. The terms \( \delta_{t} \), \( \theta_{ij} \) and \( \varepsilon_{ijt} \) capture the unobservables that influence the firm’s output. Variable \( \delta_{t} \) varies only across periods, \( \theta_{ij} \) varies only across firms, and \( \varepsilon_{ijt} \) varies across both firms and periods.
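One consequence of the interaction term can be made explicit. Taking function (1) as written and holding inputs and the unobservables fixed, the partial effects are

$$ \frac{\partial \ln\left( VA_{it} \right)}{\partial A_{jt}} = \beta_{2} + \beta_{3} H_{jt} , \qquad \frac{\partial \ln\left( VA_{it} \right)}{\partial H_{jt}} = \beta_{1} + \beta_{3} A_{jt} . $$

A negative \( \beta_{3} \) therefore means that the productivity gain from agglomeration shrinks as concentration rises, while \( \beta_{1} \) alone captures the effect of concentration only where \( A_{jt} = 0 \).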

The specification of (1) implicitly assumes Hicks neutrality, i.e., changes in the regional industrial environment do not affect the firm’s input choice. To test the validity of this assumption, we let localization agglomeration and urbanization agglomeration interact with a firm’s inputs \( X_{it} \) in estimating function (1). The estimates suggest that these interaction terms all have very small coefficients (between − 0.003 and 0.005) and are statistically insignificant. We hence believe that our assumption of Hicks neutrality is reasonable.

Endogeneity may be an issue in the estimation of function (1). First, several unobserved time-invariant factors may influence the regional industrial environment and firm productivity simultaneously. For example, high-ability entrepreneurs may not locate randomly but prefer locations with certain features such as a mild climate and a business-friendly environment. Likewise, local amenities, local resources, or business culture could help to foster large dominant firms and affect firm productivity. To control for the firm fixed effects (\( \theta_{ij} \)), we use the within estimator for panel data.Footnote 8
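The logic of the within estimator can be seen in a stripped-down sketch with a single regressor. This is an illustration under stated assumptions, not the paper's estimation code: the function name and the toy panel are hypothetical, and the actual model has several regressors plus additional fixed effects.

```python
from collections import defaultdict

def within_slope(firm_ids, x, y):
    """Slope from regressing y on x after subtracting each firm's own means."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # firm -> [sum_x, sum_y, count]
    for f, xi, yi in zip(firm_ids, x, y):
        s = sums[f]
        s[0] += xi
        s[1] += yi
        s[2] += 1

    num = den = 0.0
    for f, xi, yi in zip(firm_ids, x, y):
        sx, sy, n = sums[f]
        dx = xi - sx / n   # deviation from the firm's mean of x
        dy = yi - sy / n   # deviation from the firm's mean of y
        num += dx * dy
        den += dx * dx
    return num / den

# Toy panel: y = theta_firm + 0.5 * x, with theta = 10 for firm 1 and 0 for firm 2.
firm_ids = [1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [10.5, 11.0, 11.5, 2.0, 2.5, 3.0]
slope = within_slope(firm_ids, x, y)
```

In this toy panel the firm with the larger fixed effect has the smaller values of x, so a pooled regression would be badly biased, whereas demeaning by firm removes the fixed effect and recovers the slope of 0.5.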

China’s vast urbanization and rapid economic growth generate considerable time variation in key variables, which the fixed-effects estimator needs for identification. During our study period of 2000–2007, the urbanization rate of China increased by 9.7 percentage points (from 36.2 to 45.9%), and urban population rose by 32.1% (from 459 to 606 million). The national GDP and manufacturing GDP grew by 1.69 times and 1.77 times, respectively, and GDP per capita also increased by 1.58 times, during the period. This general pattern is echoed by our firm-level microdata. From 2000 to 2007, the median value of own-industry employment in the same city expanded by 83.3%, and the median value of other-industry employment expanded by 56.2%. Industrial concentration also changed dramatically. Consider the own-industry Herfindahl index. From 2000 to 2007, its 25th, 50th and 75th percentiles declined from 0.039 to 0.023, from 0.117 to 0.066, and from 0.282 to 0.161, respectively. Similarly large changes appear in the other-industry Herfindahl index, whose 25th, 50th and 75th percentiles declined from 0.0140 to 0.0096, from 0.0315 to 0.0190, and from 0.0718 to 0.0444.

Second, unobserved time-variant factors may lead to endogeneity. Market fragmentation at the province level in China is likely to generate differences in the prices of output and intermediate inputs (Young 2000). Such price differences not only act as shocks to value added \( {\text{VA}}_{it} \) measured in monetary terms, but also affect the firm’s input choice \( X_{it} \) as well as the regional industrial environment \( H_{jt} \) and \( A_{jt} \). The development of regional transportation likewise affects firm productivity and the regional industrial environment. To control for these factors, we add province-year fixed effects.Footnote 9 In addition, we use industry-year fixed effects to control for national shocks to productivity and allow these shocks to differ across 2-digit industries. After controlling for firm/location fixed effects, province-year fixed effects, and industry-year fixed effects, what \( \varepsilon_{ijt} \) contains is contemporaneous idiosyncratic firm output shocks.

Even though estimation with fixed effects addresses much of the endogeneity issue, we also apply instrumental variables to test the robustness of our results. Finding valid instruments for the regional industrial environment and a firm’s inputs is known to be difficult (Henderson 2003; Puga 2010). We choose lagged variables as instruments and use first-difference estimation to control for firm/location fixed effects. The variables in function (1) thus become the difference between periods \( t \) and \( t - 1 \), and the instruments are the predetermined variables at \( t - 2 \). By doing so, we assume that predetermined values (at \( t - 2 \)) of the regional industrial environment and input choices influence their future changes (from \( t - 1 \) to \( t \)), but that these predetermined values are exogenous to the residual.Footnote 10 The results of the GMM estimator, however, suggest a weak-instrument problem that may further increase bias in the estimated coefficients and enlarge asymptotic standard errors (Staiger and Stock 1997). Nevertheless, the signs of key variables in the GMM estimates are the same as those in the fixed-effects estimates. We thus focus our discussion on the results with fixed effects.
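The instrumenting idea can be sketched in its simplest just-identified form: one regressor, first-differenced, instrumented by its own level two periods earlier. This is a deliberately simplified stand-in and not the GMM estimator used in the paper; the function name and the toy panel are hypothetical.

```python
def fd_iv_slope(panel):
    """Simple IV slope for first-differenced data.

    panel: {firm: [(x_t, y_t), ...]} in time order. The change in x between
    t-1 and t is instrumented with the level of x at t-2.
    """
    z, dx, dy = [], [], []
    for obs in panel.values():
        for t in range(2, len(obs)):
            z.append(obs[t - 2][0])               # predetermined level at t-2
            dx.append(obs[t][0] - obs[t - 1][0])  # first difference of x
            dy.append(obs[t][1] - obs[t - 1][1])  # first difference of y

    n = len(z)
    z_bar, dx_bar, dy_bar = sum(z) / n, sum(dx) / n, sum(dy) / n
    num = sum((zi - z_bar) * (dyi - dy_bar) for zi, dyi in zip(z, dy))
    den = sum((zi - z_bar) * (dxi - dx_bar) for zi, dxi in zip(z, dx))
    return num / den  # a denominator near zero signals a weak instrument

# Toy panel: y = theta_firm + 2 * x, with theta = 5 for firm "a" and -1 for firm "b".
panel = {
    "a": [(1.0, 7.0), (2.0, 9.0), (4.0, 13.0), (7.0, 19.0)],
    "b": [(3.0, 5.0), (5.0, 9.0), (8.0, 15.0)],
}
iv_slope = fd_iv_slope(panel)
```

The denominator is the sample covariance between the lagged level and the subsequent change. When that covariance is small, the ratio becomes unstable, which is the weak-instrument concern raised above.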

Whether or not the residual is correlated across firms or periods is crucial for the consistent estimation of standard errors of coefficients. Given the fixed effects that we have controlled, the intertemporal correlation of the residual for a given firm should be relatively weak. Since the key variables \( H_{jt} \) and \( A_{jt} \) are computed at the city level, overlooking the correlation of the residual between firms in the same city may significantly underestimate the standard errors even if the correlation is not strong (Moulton 1986). Thus we report standard errors clustered at the city-year level in our main results. We also let the standard errors cluster at the city level, province-year level, or province level to check the robustness of our results.
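Why clustering matters can be shown with a one-regressor sketch of the cluster-robust (sandwich) variance. The function name and data are hypothetical, the variables are assumed to be already demeaned so that no intercept is needed, and no small-sample correction is applied.

```python
from collections import defaultdict
from math import sqrt

def slope_with_clustered_se(x, y, clusters):
    """OLS slope through the origin and its cluster-robust standard error."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx

    # Sum the score x_i * residual_i within each cluster before squaring,
    # so that within-cluster correlation of residuals is retained.
    scores = defaultdict(float)
    for xi, yi, g in zip(x, y, clusters):
        scores[g] += xi * (yi - beta * xi)
    meat = sum(s * s for s in scores.values())
    return beta, sqrt(meat) / sxx

# Hypothetical demeaned data
x = [-2.0, -1.0, 1.0, 2.0]
y = [-1.0, -2.0, 2.0, 1.0]

# Treating every observation as its own cluster gives the heteroskedasticity-robust SE.
beta, se_by_obs = slope_with_clustered_se(x, y, [0, 1, 2, 3])
# Grouping observations whose scores share a sign inflates the standard error.
_, se_by_city = slope_with_clustered_se(x, y, ["a", "b", "b", "a"])
```

Here the point estimate is identical in both calls; only the standard error changes, and it grows when correlated observations are grouped into the same cluster.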

4 Results

4.1 Basic results

We first report estimated results for function (1) in Table 2. The estimated results reveal the following findings.

Table 2 Effects of industrial dominance on firm productivity

First, industrial concentration in the own industry has negative effects on firm productivity. In columns (1)–(2), which include only the own-industry concentration, we find that it is negatively correlated with value added. This negative correlation results from an indirect mechanism in which concentration retards localization economies, as indicated by the negative and statistically significant coefficients for the interaction term between concentration and localization in columns (5)–(8). The magnitude of the concentration’s effect on localization economies is large. The results in column (6) suggest that a one standard deviation increase in the own-industry concentration will lead to a reduction of approximately 40% in the elasticity of localization economies. The coefficients for concentration across columns (5)–(8) all have very low significance, implying that dominance does not directly affect firm productivity.

Second, industrial concentration in other manufacturing industries has little effect on firm productivity. Column (7) adds other-industry concentration, urbanization agglomeration, and their interaction term into independent variables, and controls for all three types of fixed effects. Our results reveal that neither other-industry concentration nor the interaction term has a statistically significant coefficient. Their significance remains low in column (9) where we drop the variables of own-industry concentration, localization agglomeration, and their interaction.Footnote 11

Third, we find evidence for both localization economies and urbanization economies. According to the results in column (6), at the mean level, the elasticity of value added with respect to localization agglomeration is approximately 2%, and the elasticity with respect to urbanization agglomeration is approximately 3%. We also find that omitting industry-year and province-year fixed effects produces an implausibly high coefficient for urbanization economies, and in results not reported here, omitting firm fixed effects or the interaction term between concentration and localization economies may severely underestimate agglomeration economies. Perhaps due to collinearity when other-industry concentration and the interaction term are present, the urbanization agglomeration variable loses some significance in column (7), but is still significant at the 10% level.

The inclusion of fixed effects greatly contributes to the identification of our empirical model. We find that, in results not reported here, only 45% of the dependent variable’s variation is explained by own-industry concentration and the firm’s inputs, and adding industry-year fixed effects and province-year fixed effects increases the R-squared by about 10 percentage points. After adding firm/location fixed effects, the R-squared rises by another 30 percentage points. This implies that the unobserved heterogeneity of firms and/or locations probably has an important influence on firm output, concentration, and agglomeration. In total, implementing all three types of fixed effects nearly doubles the explanatory power of our empirical model (R-squared from 45 to 85%). These results imply that overlooking these unobserved factors, particularly the firm/location fixed effects, is likely to generate biased coefficient estimates. On the other hand, since we use within variation to identify the parameters, the estimated effects of industrial concentration are perhaps short-term effects. In that case, we might underestimate the effects of industrial concentration if long-term effects are present but not properly identified by the within estimator. The high statistical significance of key variables given so many fixed effects also makes our empirical findings more convincing.

The coefficients of employment and capital inputs reveal the following findings. First, all of the coefficients are positive and significant, consistent with our expectations. Second, the sum of the two coefficients is less than one, implying decreasing returns to scale. Third, the coefficients change significantly after controlling for fixed effects. The coefficient of capital declines by approximately 43% from column (1) to column (7). One interpretation is that unobserved variables, such as entrepreneurial ability and local accessibility, may simultaneously increase the firm’s output and its use of capital and hence cause the coefficient of capital to be overestimated in OLS. Alternatively, if the capital stock is poorly measured, attenuation bias could become stronger under fixed effects and thereby further reduce the coefficient of capital. The coefficient of employment rises under fixed effects, but the change is relatively small.Footnote 12
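The first interpretation above can be illustrated with a minimal simulation. The sketch below generates a hypothetical panel in which an unobserved firm effect (labelled `ability`) raises both capital use and output, and then compares the pooled OLS slope with the within (firm-demeaned) slope. All names, sample sizes, and parameter values are assumptions chosen for illustration; none is taken from the paper's data or specification.

```python
import numpy as np

def within_demean(values, groups):
    """Subtract each group's mean from its observations (within transformation)."""
    values = np.asarray(values, dtype=float)
    _, idx = np.unique(groups, return_inverse=True)
    group_means = np.bincount(idx, weights=values) / np.bincount(idx)
    return values - group_means[idx]

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

# Hypothetical panel: 500 firms observed for 8 years (illustrative values only).
rng = np.random.default_rng(0)
n_firms, n_years, true_beta = 500, 8, 0.2
firm = np.repeat(np.arange(n_firms), n_years)

ability = rng.normal(size=n_firms)[firm]                   # unobserved, time-invariant firm effect
log_capital = 0.8 * ability + rng.normal(size=firm.size)   # input correlated with the firm effect
log_output = true_beta * log_capital + ability + 0.1 * rng.normal(size=firm.size)

beta_pooled = ols_slope(log_capital, log_output)           # ignores the firm effect
beta_within = ols_slope(within_demean(log_capital, firm),
                        within_demean(log_output, firm))   # firm fixed effects removed

print(f"pooled OLS: {beta_pooled:.3f}, within: {beta_within:.3f}, true: {true_beta}")
```

In this setup the pooled slope is pushed well above the true value because the firm effect is positively correlated with capital, while the within estimate stays close to it. The sketch does not reproduce the measurement-error channel, which would work in the opposite direction for the within estimator.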

4.2 Robustness

We test the robustness of our findings to alternative choices in the research design. We address: (1) the temporal scope of the industrial environment; (2) the spatial dimension of agglomeration economies; (3) the industrial scope of localization and urbanization economies; (4) the measures of agglomeration economies and the dominance; (5) the dominance of SOEs and Non-SOEs; and (6) consistent estimates of standard errors. The results of those tests are displayed in Table 3.

Table 3 Robustness of estimation results

We find that adding the lagged regional industrial environment does not change our conclusions. In column (10) we investigate what the empirical results would be if it were the lagged, rather than the contemporaneous, industrial environment that affects firm productivity. Column (10) therefore uses the 1-year lagged variables \( H_{j,t - 1} \), \( L_{j,t - 1} \), \( H_{j,t - 1} *L_{j,t - 1} \) and \( U_{j,t - 1} \). This restricts the sample to firms that stay in the data for at least 3 years, which reduces the sample size from 1.17 to 0.94 million. Similar to column (6) in Table 2, column (10) suggests that industrial concentration limits localization economies. Column (11) includes both the contemporaneous and the 1-year lagged variables of the regional industrial environment, and column (12) adds the 2-year lagged variables to the regressions. The results suggest that these lagged variables are all statistically insignificant, and the coefficients of the original variables remain largely unchanged. This does not necessarily mean that past industrial concentration and agglomeration have no influence at all on present firm productivity. Our interpretation is that, after controlling for the current variables, adding lagged variables introduces little new information. Indeed, R square rises only slightly, from 0.8502 in column (6) to 0.8677 in column (11) and 0.8759 in column (12). We also find a very high correlation between the current and lagged variables. For own-industry concentration, the correlation between the current value and the 1-year lagged value is 0.9106, and that between the current value and the 2-year lagged value is 0.8492. The corresponding correlations are 0.9553 and 0.9238 for localization agglomeration, and 0.9952 and 0.9889 for urbanization agglomeration. In results not reported here, we find that adding lagged other-industry concentration does not affect our results either.Footnote 13
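The construction of lagged variables and the intertemporal correlations reported above can be sketched as follows. This is a minimal illustration on hypothetical city-industry cells; the field names `cell`, `year`, and `H`, and all values, are assumptions and do not come from the paper's data.

```python
def lagged_pairs(records, var, lag=1):
    """Pair each cell's value in year t with its value in year t - lag.

    Cells with no observation in year t - lag are dropped, which is why
    adding lags shrinks the usable sample.
    """
    lookup = {(r["cell"], r["year"]): r[var] for r in records}
    return [
        (value, lookup[(cell, year - lag)])
        for (cell, year), value in lookup.items()
        if (cell, year - lag) in lookup
    ]

def pearson(pairs):
    """Pearson correlation between the two elements of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical cells with a slowly changing concentration index H.
records = [
    {"cell": "A", "year": 2000, "H": 0.10},
    {"cell": "A", "year": 2001, "H": 0.12},
    {"cell": "A", "year": 2002, "H": 0.11},
    {"cell": "B", "year": 2000, "H": 0.40},
    {"cell": "B", "year": 2001, "H": 0.43},
    {"cell": "B", "year": 2002, "H": 0.41},
    {"cell": "C", "year": 2002, "H": 0.25},  # no earlier year, so no lag is available
]

pairs_1y = lagged_pairs(records, "H", lag=1)
pairs_2y = lagged_pairs(records, "H", lag=2)
corr_1y = pearson(pairs_1y)
print(len(pairs_1y), len(pairs_2y), round(corr_1y, 3))
```

When a variable moves slowly within cells, as in this toy example, its current and lagged values are nearly collinear, so a regression that already includes the current value gains little from the lag.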

We show that capturing agglomeration economies outside the city does not alter our conclusions. We measure dominance and agglomeration only within the city proper or county-level city because the literature suggests that agglomeration externalities are highly localized (Rosenthal and Strange 2003; Puga 2010). Nevertheless, we still examine agglomeration economies outside the city proper. A rough way to do so is to look at the entire prefecture. In column (12), we add own-industry employment and other-industry employment in the rest of the prefecture (outside the city proper but inside the prefecture). Neither of them contributes to firm productivity. We therefore believe that the city proper or county-level city is an appropriate spatial unit for examining agglomeration economies as well as the effect of industrial dominance.

Our results are robust to a finer industrial classification. We use 3-digit industrial codes to construct the variables of own-industry concentration, localization agglomeration and urbanization agglomeration. Column (13) displays the estimated results. We find that the coefficients of the key variables in column (13) are qualitatively similar to those in column (9), but their magnitudes differ. The effect of own-industry concentration declines by 35%, as indicated by a smaller coefficient of the interaction term, but it remains statistically significant. The effect of other-industry concentration, though not reported, appears negligible given its very low statistical significance. The elasticity of localization economies falls to 1% at the mean level, while the elasticity of urbanization economies increases slightly to 3.66%.

Our conclusions are robust to an alternative measure of agglomeration. While both the number of firms and employment are used in the literature to proxy for agglomeration, they might represent different sources of agglomeration. Henderson (2003) suggests that the number of firms may better capture knowledge spillovers and finds that what matters is the number of firms, not employment. In this research, we find that using employment or the number of firms generates very similar results. Column (14) suggests a negative effect of own-industry concentration on localization economies, as well as strong evidence for localization and urbanization economies. One difference is that the coefficient of own-industry concentration becomes negative and statistically significant. This implies that concentration, besides retarding localization economies, could hurt firm productivity directly.

We examine whether our estimates change with alternative indices of regional industrial dominance other than the Herfindahl index. In column (15), we measure dominance by the largest firm’s share of sales, the so-called leading firm share in the literature (Amato 1995). The estimated results suggest similar conclusions. The interaction term obtains a negative and significant coefficient, showing that a larger share of the own-industry leading firm leads to weaker localization economies. The coefficient of the own-industry leading firm share is small and statistically insignificant, implying that it does not directly affect firm productivity. Evidence for both localization and urbanization economies is present. In experiments not reported here, we find that the leading firm share in other manufacturing industries does not have any effect on firm productivity.
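The two dominance measures can be compared with a short sketch. It uses the textbook definitions, with the Herfindahl index as the sum of squared shares and the leading firm share as the largest share, computed here from hypothetical firm sales within one city-industry cell. Whether the paper's Herfindahl index is built on sales or on another size measure is not stated in this section, so the sales basis is an assumption.

```python
def market_shares(sales):
    """Convert firm-level sales in one city-industry cell into shares of the cell total."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("total sales must be positive")
    return [s / total for s in sales]

def herfindahl(sales):
    """Herfindahl index: sum of squared shares, from 1/n (equal firms) up to 1 (one firm)."""
    return sum(share ** 2 for share in market_shares(sales))

def leading_firm_share(sales):
    """Share of the largest firm in the cell."""
    return max(market_shares(sales))

# Hypothetical cells with the same number of firms but different dominance.
dominated = [80, 10, 5, 5]
competitive = [25, 25, 25, 25]

for name, cell in [("dominated", dominated), ("competitive", competitive)]:
    print(name, round(herfindahl(cell), 4), round(leading_firm_share(cell), 4))
```

Both indices rise when one firm accounts for most of the cell, which is consistent with the two measures leading to similar conclusions, although the leading firm share ignores the size distribution of the remaining firms.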

One may worry that our results are driven by the presence of state-owned enterprises (SOEs). SOEs are clearly larger, with average employment of 770 compared with 223 for Non-SOEs, and may affect agglomeration externalities through different mechanisms, such as local protectionism. We therefore include both the dominance of SOEs and the dominance of Non-SOEs in the regressions, using the leading firm share. The estimated results in column (16) show that the two types of dominance exhibit quite similar effects. Neither of them directly affects firm productivity, as indicated by the insignificance of their coefficients. Both types of dominance reduce localization economies, as indicated by the negative and significant coefficients of the interaction terms. The two coefficients of the interaction terms, we note, are very close to each other. Therefore, we conclude that it makes little difference whether it is SOEs or Non-SOEs that dominate the local market.

Last but not least, we investigate whether clustering the standard errors at different spatial levels changes our conclusions. In our main results, all standard errors are clustered at the city-year level. Although we do not think that the errors are necessarily clustered at larger levels, we let the standard errors cluster at the city level, the province-year level, and the province level, respectively, and report the resulting standard errors, in this order, in parentheses below every coefficient in column (17). Compared to column (9), a notable difference is that the coefficient of urbanization agglomeration loses statistical significance. Localization agglomeration and the interaction term both have the expected signs and remain highly significant. Our conclusion that local concentration reduces localization economies holds.
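To illustrate why the clustering level can matter for significance, the sketch below computes cluster-robust standard errors for an OLS regression on a hypothetical panel, once with city-year clusters and once with province clusters. It uses the basic sandwich formula without a small-sample correction (often called CR0), which is an assumption; the paper does not state which variant or software it uses. The data-generating process, with a regressor and an error component that both vary only across provinces, is likewise hypothetical.

```python
import numpy as np

def ols_cluster_se(X, y, clusters):
    """OLS coefficients with cluster-robust (CR0 sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # Sum the score contributions x_i * u_i within each cluster.
    _, idx = np.unique(clusters, return_inverse=True)
    scores = np.zeros((idx.max() + 1, X.shape[1]))
    np.add.at(scores, idx, X * resid[:, None])
    cov = bread @ (scores.T @ scores) @ bread
    return beta, np.sqrt(np.diag(cov))

# Hypothetical panel: 30 provinces x 10 cities x 8 years x 5 firms.
rng = np.random.default_rng(1)
n_prov, n_city, n_year, n_firm = 30, 10, 8, 5
province = np.repeat(np.arange(n_prov), n_city * n_year * n_firm)
city = np.repeat(np.arange(n_prov * n_city), n_year * n_firm)
year = np.tile(np.repeat(np.arange(n_year), n_firm), n_prov * n_city)
city_year = city * n_year + year

x = rng.normal(size=n_prov)[province]                 # regressor varying only across provinces
shock = rng.normal(size=n_prov)[province]             # province-wide error component
y = 0.5 * x + shock + rng.normal(size=province.size)  # plus idiosyncratic noise
X = np.column_stack([np.ones_like(x), x])

beta, se_city_year = ols_cluster_se(X, y, city_year)
_, se_province = ols_cluster_se(X, y, province)
print("slope:", round(float(beta[1]), 3))
print("SE (city-year clusters):", round(float(se_city_year[1]), 4))
print("SE (province clusters): ", round(float(se_province[1]), 4))
```

The point estimate is identical under both choices; only the standard error changes. When errors are correlated within a broader unit than the one used for clustering, the narrower clustering understates the standard error, which is one way a coefficient can lose significance when clusters are enlarged.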

4.3 Results of subsamples

We test whether the effects vary across different firm sizes, different types of industries, different ownerships of firms, and different periods. Table 4 displays estimated results.

Table 4 Estimated results for subsamples

We examine whether relatively small firms are more severely harmed by industrial dominance. We use firms with fewer than 900 employees, approximately the bottom 95% of the total, to estimate the coefficients in function (1).Footnote 14 The estimated results in columns (19)–(20) imply greater negative effects of concentration for relatively small firms, since the coefficient of the interaction term between localization and own-industry dominance in column (20) is larger than that in column (6). Using different employment cutoffs generates similar findings.

We divide manufacturing firms into three types of industries (traditional light industries, traditional heavy industries and high-tech industries) and find notable differences across sectors. Concentration does not affect firm productivity in high-tech industries [columns (22) and (23)] but reduces localization economies in light and heavy industries [columns (18)–(21)]. Concentration has a stronger effect in light industry than in heavy industry, as implied by a larger coefficient (around −0.05) of the interaction term in light industry.

We find evidence for localization economies in light and heavy industries but not in high-tech industries. This is indicated by a statistically insignificant coefficient of localization economies for high-tech industries (columns 22–23) and a positive and significant coefficient of localization economies for both light and heavy industries (columns 18–21). At the mean level, the elasticity of localization economies to value added in light industries is 2.58%. This is higher than that in heavy industries, which equals 2.15%.

We find evidence for urbanization economies in high-tech industries, but not in light or heavy industries. Columns (22) and (23) show positive and significant coefficients of urbanization economies in high-tech industries, where the elasticity of urbanization economies lies between 7 and 8%, considerably higher than the estimates obtained when using all manufacturing firms (columns 8–9). Columns (18)–(21) report positive but insignificant coefficients of urbanization economies in light and heavy industries. This implies that the evidence for urbanization economies found when using all manufacturing firms is mainly driven by high-tech industries.

The effects of employment and capital inputs change moderately across the three types of industries. The light and heavy industries have quite similar elasticities of employment and capital inputs, which vary between 49–51% and 19–20%, respectively. These two sectors constitute 91.6% of the firm sample. Production in high-tech industries relies more on employment and less on capital than production in the other two sectors. Its elasticity of capital input, 18.4%, is very close to those of light and heavy industries, while its elasticity of employment is approximately 14% higher than that of heavy industries and 20% higher than that of light industries. The assumption of identical input elasticities across sectors, which we impose when pooling all manufacturing sectors to estimate function (1), therefore does not seem very unrealistic. For comparison, Henderson (2003) reports similar coefficients for employment and capital across high-tech and machinery sectors.

We also re-estimate the empirical model using SOEs and Non-SOEs separately. SOEs in China have different objectives from Non-SOEs (e.g., social security and macroeconomic management) and are often criticized for low efficiency. According to the results in columns (24) and (25), SOEs are almost independent of the regional industrial environment. Industrial concentration does not affect the productivity of SOEs, nor does localization or urbanization agglomeration. The results for Non-SOEs are quite similar to our results in columns (8) and (9), except for a lower level of significance of the coefficient of urbanization agglomeration.

Finally, we investigate whether the effect of industrial concentration changes over time during China’s fast development. We divide the study period 2000–2007 into two periods, 2000–2003 and 2004–2007. Columns (28)–(31) report the estimated results for the two periods separately. We find that own-industry concentration reduces localization economies in both periods and that its effect slightly increases in absolute value over time (the coefficient of the interaction term changes from −0.0348 to approximately −0.047). Other-industry concentration exhibits no effect in either period, similar to what we have previously identified. In addition, columns (28)–(31) provide evidence that agglomeration economies increase over time. The elasticity of localization economies, at the mean level, increases from approximately 1% in 2000–2003 to approximately 2.7–2.9% in 2004–2007. We find no evidence of urbanization economies in 2000–2003, but they become quite strong in 2004–2007, with an elasticity of 7.4%. These findings imply that agglomeration externalities act as a growing source of improved firm productivity and economic development in China.

5 Conclusions and final remarks

This paper investigates whether a concentrated regional industrial structure affects firm productivity. Using China’s firm-level data from 2000 to 2007, we find no direct negative impact of regional industrial structure on a firm’s output, but we find that regional industrial structure matters for localization economies. More specifically, cities in which industrial sectors are more dominated by a few large firms in their own sector tend to have weaker localization economies. The magnitude of the effect of regional industrial structure on localization agglomeration is not trivial, as revealed by the estimated coefficient of the interaction term. For instance, a one standard deviation increase in own-industry concentration leads to a reduction of approximately 40% in the elasticity of localization economies. We conclude that regional industrial structure has little impact on urbanization economies.
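The link between the interaction coefficient and the change in the elasticity can be written out as a short derivation. It is a sketch under stated assumptions: the estimating equation is taken to be linear in own-industry concentration \( H_{j,t} \), localization agglomeration \( L_{j,t} \), their product, and urbanization agglomeration \( U_{j,t} \), with \( L_{j,t} \) measured in logs so that its marginal effect is an elasticity. The coefficient names \( \beta_H, \beta_L, \beta_{HL}, \beta_U \), the firm index \( i \), and the symbols \( \bar{H} \) and \( \sigma_H \) for the sample mean and standard deviation of concentration are notation introduced here for illustration.

```latex
\[
\ln Y_{i,j,t} = \dots + \beta_H H_{j,t} + \beta_L L_{j,t}
  + \beta_{HL}\, H_{j,t} L_{j,t} + \beta_U U_{j,t} + \dots
\]
\[
\varepsilon_L(H) \equiv \frac{\partial \ln Y_{i,j,t}}{\partial L_{j,t}}
  = \beta_L + \beta_{HL}\, H_{j,t},
\qquad
\frac{\varepsilon_L(\bar{H} + \sigma_H) - \varepsilon_L(\bar{H})}{\varepsilon_L(\bar{H})}
  = \frac{\beta_{HL}\, \sigma_H}{\beta_L + \beta_{HL}\, \bar{H}} .
\]
```

Under these assumptions, the elasticity "at the mean level" is \( \varepsilon_L(\bar{H}) \), and a negative \( \beta_{HL} \) makes the ratio on the right negative. A reduction of approximately 40% corresponds to that ratio being about \( -0.4 \).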

Our conclusions are robust to different measures of regional industrial structure (the Herfindahl index and the leading firm share), to different proxies for agglomeration, to different industrial, intertemporal and spatial scopes of agglomeration externalities, and to subsamples by ownership, by period, and by industrial sector. A breakdown by industrial sector reveals that the effect of regional industrial structure on localization agglomeration depends on the type of industry. Specifically, we find that (1) regional industrial concentration negatively affects light and heavy industries but does not affect high-tech firms; and (2) high-tech firms exhibit urbanization economies but not localization economies.

Our findings suggest two important policy implications. First, as firms are less productive in cities where their own industry is dominated by a few large firms, China should promote a more competitive industrial structure featuring less dominance of large firms. This is particularly important because both light and heavy industries tend to be negatively affected by industrial dominance while simultaneously constituting the backbone of the national economy.

Second, SOEs are much larger than non-SOEs, in terms of both average size and average output. This implies that the marketization or privatization of SOEs (e.g., through staff reductions or spin-offs) could help to increase the competitiveness of the regional industrial structure, generating more positive externalities in addition to its own merits, such as motivating management and reducing agency costs. Finally, small cities are more vulnerable to a concentrated industrial structure. This downside effect of large firms on the local or regional economy should be considered when seeking to attract large investments in an industry.