Introduction

By incorporating the conceptual framework of regional absorptive capacity into innovation studies, this paper aims to conduct a systematic assessment of the effects of absorptive capacity in cities. Although new growth theory has long recognized and extensively documented the role of innovation in the sustainable economic growth of a region (Grossman and Helpman 1991; Aghion and Howitt 1992), research in this vein has been critically reevaluated in recent years (Acs et al. 2009; Braunerhjelm et al. 2010). This is because new growth theory assumes that technological inputs are automatically converted into commercially useful knowledge, while it fails to take into account “knowledge filters,” the barriers to new knowledge production (Braunerhjelm et al. 2010; Qian and Acs 2013). Two questions emerge in this respect: does a higher level of technological input lead to a higher level of innovation, and, more specifically, to what extent do regions penetrate the knowledge filters of innovation? At the core of these inquiries lies the concept of absorptive capacity (Cohen and Levinthal 1990; Zahra and George 2002). Absorptive capacity theory argues that innovation hinges not only on the amount of technological spending but also on the capacity to recognize, assimilate, and exploit new knowledge (Cohen and Levinthal 1990). In this sense, absorptive capacity serves as a moderator and a critical intermediary of innovation (Veugelers 1997) (Fig. 1). Neglecting this capacity when examining the determinants of regional innovation may lead to an overestimation of the effects of technological spending.

Fig. 1 The traditional theoretical framework (a) and the absorptive capacity framework (b) in regional innovation studies

Through the theoretical lens of absorptive capacity, this paper re-evaluates the regional factors that determine the geography of innovation in China, highlighting in particular the diverse moderating effects of regional absorptive capacity and their urban heterogeneity. The investigation starts with a thorough inspection of the spatiotemporal dynamics of innovation using a multi-scalar Markov chain technique. At the next stage, we investigate the effects of regional absorptive capacity by carrying out a set of panel quantile regressions. The main contributions of this paper are as follows: firstly, by systematically examining the effects of regional absorptive capacity, this study unveils the extent to which absorptive capacity moderates the relationships between regional innovation and its major determinants, including industrial R&D activities, institutional environment, trade openness and agglomeration effects; secondly, unlike previous studies that are conducted at the provincial level or above, this paper probes into factors underlying regional innovation at a more detailed geographical level (i.e. the city level); lastly, unlike previous studies that are based on the assumption that the effect of absorptive capacity is constant among all regions, this study captures the heterogeneous effect of absorptive capacity across cities. In this respect, we further discover that regional absorptive capacity in China surprisingly serves as a self-reinforcing mechanism for highly innovative cities only, deepening the inequality in regional innovation performance.

Although China has achieved dramatic economic success since the start of its ongoing marketization (Wei et al. 2013), the country’s GDP growth rate dropped in 2016 to its lowest level since 1991, highlighting the reality that China can no longer rely on factor accumulation and low-end manufacturing. To address the predicament, the Chinese government has devoted tremendous efforts to boosting technological innovation. These attempts made China the second largest country in terms of R&D expenditure in 2017 (UNESCO 2018). Nonetheless, the low efficiency in utilizing these technological resources continues to hold back the nation’s catching-up process amid intensified global technological competition (Yang and Lin 2012). Therefore, China may serve as an ideal setting in which to examine the moderating effects of regional absorptive capacity, and the extent to which these effects may diverge across cities. The case study of China may also provide new insights and policy implications for other developing economies seeking to address similar obstacles.

The remainder of this article is organized as follows. The second section presents a brief review of literature on absorptive capacity. The third section presents the study area, data, and methods for the current study. The fourth section presents empirical results from a multi-scalar Markov chain analysis and panel quantile regressions. The final section sets out major conclusions and some corresponding policy implications drawn from our empirical analysis.

Literature Review

Absorptive capacity theory, as proposed by Cohen and Levinthal (1990), holds that innovation performance rests critically on the capability to recognize, assimilate, and exploit new knowledge by utilizing prior related knowledge embodied in human capital (Qian and Acs 2013; Mowery and Oxley 1995). Therefore, to quantify absorptive capacity, studies tend to adopt the stock of human capital as a measure, such as the number of scientists and engineers (Keller 1996), investments in R&D personnel (Liu and White 1997), as well as the number of R&D departments that are fully staffed (Veugelers 1997). Until recently, theoretical attempts to understand the effect of absorptive capacity have placed much emphasis on corporate innovation. For instance, Cohen and Levinthal (1990) discovered that absorptive capacity improves corporate innovation performance through effective assimilation, transformation, and exploitation of new knowledge. Meanwhile, Qian and Jung (2017) discovered that entrepreneurial absorptive capacity can penetrate the knowledge filter and consequently facilitate the commercialization of new knowledge among entrepreneurs. Additionally, Spithoven et al. (2010) found that absorptive capacity may vary with corporate R&D levels.

Although corporate absorptive capacity has been widely documented in recent years, the extent to which regional absorptive capacity influences regional innovation has not been well elucidated. The concept of regional absorptive capacity in clusters was first explored by Giuliani (2005), who defined it as the capacity to absorb, diffuse, and creatively exploit extra-cluster knowledge. Caragliu and Nijkamp (2008) further introduced the theory to regional innovation studies, and defined regional absorptive capacity as a region’s capability to understand, decode, and efficiently exploit new knowledge (Caragliu and Nijkamp 2012). In the light of the above interpretations, we define regional absorptive capacity as a region’s capability to identify, assimilate, disseminate and creatively exploit new knowledge by utilizing the region’s prior knowledge embodied in its human capital. Apart from the definitions, a strand of empirical studies has also indicated that a deficiency in human capital hampers regional absorptive capacity, and consequently constrains regional innovation. For instance, Keller (1996) discovered that technology is only implementable when the labor force has built up the corresponding skills. Moreover, Borensztein et al. (1998) discovered that spillovers from foreign direct investment (FDI) can be activated only when sufficient regional absorptive capacity is available in the host economy. Similar conclusions are also reached by Fu (2008) and Yang and Lin (2012). Both studies found that the effect of FDI on regional innovation rests critically on regional absorptive capacity, particularly in the form of human capital (Yang and Lin 2012; Fu 2008). A brief summary of existing empirical evidence is shown in Table 1.

Table 1 Summary of literature on detecting regional absorptive capacity

Thus far, although some scholarly efforts have been devoted to revealing the effect of regional absorptive capacity along several dimensions, there is still no systematic assessment of its effects on regional innovation. Such an omission appears especially serious in studies on China, where the current understanding of regional absorptive capacity still amounts to little more than its moderating effect on the association between FDI and innovation (Yang and Lin 2012; Fu 2008), which is only the tip of the iceberg. Besides, the existing literature on China is largely conducted at the provincial level, which may not be sufficiently precise to capture the features of knowledge dynamics (Audretsch and Feldman 1996). Moreover, previous literature still rests heavily on the assumption that the effect of absorptive capacity is constant among all regions. However, the severe spatial polarization of regional innovation in China (Sun 2000) indicates that the magnitude of this effect may also diverge across regions at disparate innovation levels. In sum, previous research on Chinese regional absorptive capacity still lacks comprehensiveness and requires deeper investigation.

To remedy the above inadequacies, this study aims at systematically assessing the effect of regional absorptive capacity in China, particularly focusing on its heterogeneity across cities. In this light, the Markov chain is adopted to investigate the regional dynamics of innovation at the first step. Next, with the aid of a panel quantile regression model, we further investigate the extent to which absorptive capacity moderates the relationships between regional innovation and its major determinants, and the extent to which these moderating effects vary across cities.

Data and Methods

Research Materials

We use panel data aggregated both at the prefecture level and the county level for the current study. In this paper, we first trace the spatial dynamics of innovation based on data from 2873 Chinese counties (or urban districts) over the period 2000–2015. Next, a set of panel quantile regressions is conducted based on data from 289 prefecture-level cities (including municipalities) over the period 2005–2010 (the period is shortened owing to the lack of statistics for explanatory variables at the county level and some missing variables during the periods of 2000–2005 and 2010–2015). The study area is depicted in Fig. 2.

Fig. 2 289 prefectures in the study area (administrative boundaries are drawn according to the Chinese administrative boundaries in 2015)

The key variable, regional innovation, is measured by the number of patent applications across the region. As discussed above, existing studies have indicated that regional absorptive capacity is conditioned by human capital (Veugelers 1997; Liu and White 1997; Muscio 2007; Gao et al. 2008; Knockaert et al. 2014; Keller 1996; Borensztein et al. 1998; Roper and Love 2006; Fu 2008; Saito and Gopinath 2011; Yang and Lin 2012; Jung and Lopez-Bazo 2017; Qian and Jung 2017). In this light, to systematically assess the moderating effects of absorptive capacity, we take into account the interactions between human capital and the following determinants of regional innovation: industrial R&D activities (Acs et al. 1994; Audretsch and Feldman 1996; Czarnitzki and Hussinger 2004), government support (Guellec and Van Pottelsberghe De La Potterie 2003; Lichtenberg 1984; Branstetter and Sakakibara 2002), FDI (Blomstrom and Persson 1983; Driffield and Munday 2001) and agglomeration effects (Jacobs 1969; Glaeser et al. 1992). To operationalize these concepts, we adopt the total number of researchers and scientists in a region to measure the stock of human capital. We set a threshold at an h-index of 5 to rule out most students and some early-career researchers and scientists who are likely to have little research experience, because it is the experienced researchers and scientists who have adequate prior knowledge to generate innovation. Industrial R&D activities are measured by industrial R&D expenditure. Government support and institutional environment are indicated by government R&D expenditure. Agglomeration effects are divided into spillovers from localization (i.e., Marshallian externalities) and spillovers from diversification (i.e., Jacobs’ externalities). Spillovers from localization are indicated by the employment density of the city’s highest-location-quotient sector, whereas spillovers from diversification are indicated by the city’s total employment density excluding the highest-location-quotient sector. Table 2 shows the definition and data sources of variables.
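The two agglomeration proxies described above can be illustrated with a minimal sketch. It assumes the standard definition of the location quotient (a sector's share of city employment divided by its share of national employment) and assumes that "density" means employment per unit of land area; neither detail is spelled out in the text. The function names, sector labels, and figures are hypothetical.

```python
def location_quotients(city_emp, national_emp):
    """Standard LQ: (sector share of city jobs) / (sector share of national jobs)."""
    city_total = sum(city_emp.values())
    national_total = sum(national_emp.values())
    return {
        sector: (city_emp[sector] / city_total)
        / (national_emp[sector] / national_total)
        for sector in city_emp
    }


def agglomeration_proxies(city_emp, national_emp, area_km2):
    """Return (top sector, localization proxy, diversification proxy).

    Assumption: both proxies are employment per km2; the paper does not
    state the denominator it uses for density.
    """
    lq = location_quotients(city_emp, national_emp)
    top = max(lq, key=lq.get)  # sector with the highest location quotient
    lks = city_emp[top] / area_km2  # density of the most specialised sector
    dks = (sum(city_emp.values()) - city_emp[top]) / area_km2  # all other sectors
    return top, lks, dks


# Hypothetical employment counts for one city and for the country as a whole.
city = {"textiles": 100, "electronics": 600, "services": 300}
nation = {"textiles": 2000, "electronics": 2000, "services": 6000}
top_sector, lks, dks = agglomeration_proxies(city, nation, area_km2=100)
```

In this illustration the city is most specialised in electronics, so electronics employment feeds the localization proxy and the remaining employment feeds the diversification proxy.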

Table 2 Definition and data source of variables

By introducing a set of interaction terms between human capital and the five major determinants of regional innovation into the econometric specification, the final model is specified as follows:

$$ \mathrm{LN\ PAT}_{i(t+2)} = \beta_1 \mathrm{LN\ COR}_{it} + \beta_2 \mathrm{LN\ GOV}_{it} + \beta_3 \mathrm{LN\ FDI}_{it} + \beta_4 \mathrm{LN\ LKS}_{it} + \beta_5 \mathrm{LN\ DKS}_{it} + \beta_6 \mathrm{LN\ AC}_{it} + \beta_7 \mathrm{LN\ AC}_{it} \times \mathrm{LN\ COR}_{it} + \beta_8 \mathrm{LN\ AC}_{it} \times \mathrm{LN\ GOV}_{it} + \beta_9 \mathrm{LN\ AC}_{it} \times \mathrm{LN\ FDI}_{it} + \beta_{10} \mathrm{LN\ AC}_{it} \times \mathrm{LN\ LKS}_{it} + \beta_{11} \mathrm{LN\ AC}_{it} \times \mathrm{LN\ DKS}_{it} + \alpha + \mu_i + e_{it} $$
(1)

where LN PAT stands for patent counts; LN AC denotes regional absorptive capacity; LN COR indicates industrial R&D expenditure; LN GOV represents government R&D expenditure; LN FDI stands for FDI; LN LKS indicates spillovers from localization; LN DKS represents spillovers from diversification; α is the constant term; μ is the region fixed effect; and e is the error term. We use the stock of human capital instead of other indicators that have been previously used for studies in the United States and Europe (e.g., education-based human capital (Roper and Love 2006), occupational abilities and skills (Qian and Jung 2017), or the number of employees with graduate degrees and continuous R&D activities (Fores and Camison 2011)) to capture the effect of regional absorptive capacity. The reason is that other measures of absorptive capacity are difficult to operationalize in an empirical setting due to the lack of relevant official statistics in China, especially when investigating the effect of absorptive capacity at a detailed geographical scale. In addition, the independent variables are lagged by 2 years, and all variables are in logarithmic form.
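As a reading aid for the interaction terms in Eq. (1), and not as an additional result, the usual interpretation of an interaction in a log-log specification can be written out. Differentiating Eq. (1) with respect to LN COR, holding the other regressors fixed, gives the elasticity of patenting with respect to industrial R&D as a function of absorptive capacity:

$$ \frac{\partial\, \mathrm{LN\ PAT}_{i(t+2)}}{\partial\, \mathrm{LN\ COR}_{it}} = \beta_1 + \beta_7\, \mathrm{LN\ AC}_{it} $$

A positive β7 therefore means that the elasticity of innovation with respect to industrial R&D rises with absorptive capacity, which is the sense in which absorptive capacity acts as a moderator. The same reading applies to β8 through β11 for government support, FDI, and the two agglomeration spillovers.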

Methods

Markov Chain Analysis

Markov chain analysis enables us to characterize the transitional pattern of each spatial unit’s innovation level in order to detect the regional dynamics of innovation in China (Liao and Wei 2012; He et al. 2017). The first step in the Markov chain analysis is to classify spatial units (counties) by their innovation levels. Following the existing literature (Liao and Wei 2012), Chinese counties are divided into four equally sized groups based on quartile intervals (namely, in increasing order, “Low,” “Middle-low,” “Middle-high,” and “High”). Subsequently, a 4 × 4 transitional probability matrix P is computed, in which each element p_{i,j,t} indicates the probability that a county in class i in year t will move to class j in year t + 1. By inspecting the transitional probability matrix, the evolutionary pattern of regional innovation can be identified by judging whether the temporal pattern of regional innovation in China is converging, diverging, or polarizing.
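The two steps described above (quartile classification, then counting year-to-year moves) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that quartile cut points are computed separately for each year and that transitions are pooled over all consecutive year pairs, and the function names and patent counts are hypothetical.

```python
from statistics import quantiles


def quartile_classes(values):
    """Map each value to a class 0..3 (Low .. High) using the year's quartile cut points."""
    cuts = quantiles(values, n=4)  # three cut points
    return [sum(v > cut for cut in cuts) for v in values]


def transition_matrix(classes_by_year, n_classes=4):
    """Row-normalised counts of moves from class i in year t to class j in year t + 1."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for current, following in zip(classes_by_year, classes_by_year[1:]):
        for i, j in zip(current, following):
            counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Leave a row of zeros if no unit ever started in that class.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical patent counts for eight counties in two consecutive years.
patents = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [1, 3, 2, 4, 5, 7, 6, 8],
]
classes = [quartile_classes(year) for year in patents]
P = transition_matrix(classes)
```

Large diagonal entries in P indicate persistence within a class, while mass concentrated just off the diagonal indicates that units mostly move to adjacent classes.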

Panel Quantile Regression

A substantial body of literature has applied standard ordinary least squares (OLS) regression models to identify the determinants of regional innovation, but this approach only provides the conditional expectation (mean value) of the dependent variable and fails to describe the whole picture of the conditional distribution. However, due to the tremendous heterogeneity in China, the extent to which absorptive capacity influences regional innovation differs across quantiles (i.e., across cities at different innovation levels). In contrast to the OLS method, quantile regression allows coefficients to vary across quantiles of the dependent variable (Koenker and Bassett 1978). In this regard, quantile regression has distinctive advantages in studying the heterogeneous effect of regional absorptive capacity across cities at diverse innovation levels. Besides, it is also capable of addressing other issues such as heteroscedasticity, outliers, and unobserved heterogeneity (Koenker and Hallock 2001).

Therefore, we adopt the model set out below to address the conditional quantile function of the panel data:

$$ Q_{Y_{it}}\left(\tau \mid X_{it}\right) = \beta(\tau) X_{it} + \mu_i(\tau) + e_{it} $$
(2)

where X_it denotes the independent variables; μ_i stands for the individual effect; τ denotes the τth quantile of the dependent variable Y_it; and β(τ) denotes the τth quantile regression estimators. The minimization problem of quantile regression estimation can be written as:

$$ \beta(\tau) = \underset{\beta(\tau)}{\arg\min} \sum_{k=1}^{q} \sum_{t=1}^{T} \sum_{i=1}^{N} \left( \left| Y_{it} - \mu_i - \beta(\tau) X_{it} \right| w_{it} \right) $$
(3)

where q denotes the number of quantiles; T denotes the number of years; N denotes the number of cities; and w_it is the weight of the ith city in the tth year, which can be expressed as:

$$ w_{it} = \left\{ \begin{array}{ll} \tau, & Y_{it} - \mu_i - X_{it}\beta(\tau) \ge 0 \\ 1 - \tau, & Y_{it} - \mu_i - X_{it}\beta(\tau) < 0 \end{array} \right. $$
(4)
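To make the role of the weights concrete, the following minimal sketch uses the standard Koenker and Bassett check-function convention, in which non-negative residuals receive weight τ and negative residuals receive weight 1 − τ. It drops the regressors and fixed effects and fits only a constant, in which case the minimiser of the weighted absolute loss is the sample τ-quantile. The function names and the data are hypothetical.

```python
def check_loss(residuals, tau):
    """Weighted absolute loss: tau on non-negative residuals, 1 - tau on negative ones."""
    return sum((tau if r >= 0 else 1.0 - tau) * abs(r) for r in residuals)


def best_constant(values, tau):
    """Pick the observed value that minimises the check loss when used as the fit.

    Restricting the search to observed values is enough here because a
    minimiser of the check loss can always be found among the data points.
    """
    return min(
        sorted(values),
        key=lambda c: check_loss([v - c for v in values], tau),
    )


# Hypothetical log patent counts for nine cities (illustrative numbers only).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
fits = {tau: best_constant(y, tau) for tau in (0.25, 0.5, 0.75)}
```

With τ = 0.5 the two weights coincide and the fit is the median; lowering or raising τ tilts the penalty so that the fit moves toward the lower or upper part of the distribution.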

In this study, we follow Koenker (2004) and use Tukey’s trimean as a prototype, setting the quantiles at 0.25, 0.5, and 0.75. Hence, all city samples are classified into three categories according to their innovation levels, and coefficients are estimated at the three quartiles of the dependent variable. In other words, the 25th percentile model (q25) examines cities at lower innovation levels (namely, less innovative cities), the 50th percentile model (q50) examines cities at medium innovation levels (namely, medium innovative cities), and the 75th percentile model (q75) examines cities at higher innovation levels (namely, highly innovative cities).

Results and Discussion

The Spatiotemporal Dynamics of Regional Innovation in China

Figure 3 shows snapshots of the spatial distribution of innovation in China in the years 2000, 2003, 2006, 2009, 2012, and 2015. It is evident that regional innovation is highly concentrated in three urban agglomerations along the coast: Yangtze River Delta, Pearl River Delta, and Beijing-Tianjin Area. These three urban agglomerations have played an increasingly important role over time in driving the spatial dynamics of innovation. By contrast, the spatial structure of regional innovation in central and western China is characterized by scattered cluster patches and dispersed associations. In these regions, the clustering of innovation is primarily limited to municipalities, provincial capitals, or sub-provincial cities such as Wuhan, Chongqing and Chengdu, where human capital and scientific resources are highly concentrated. However, these clusters show hardly any indication of diffusion to their surrounding areas from 2000 to 2015. Compared with the spatial structure in 2000, regional innovation had become even more concentrated in these growth poles in central and western China by 2015. Over time, innovation in the eastern region has grown markedly, with the sum of patents increasing at an annual growth rate of 155.49%. By contrast, a relatively slower increase has been witnessed in central and western China, where the sum of patents grew at annual rates of only 132.10% and 128.90% respectively, lagging far behind the eastern region. In 2017, the sum of patents in western and central China accounts for a mere 18.1% of the total nationwide, whereas Beijing, Shanghai, and Shenzhen - the three top innovative cities in eastern China - already account for 36.5% of total patent counts.

Fig. 3 Snapshot of the spatial distribution of innovation in China, 2000, 2003, 2006, 2009, 2012, and 2015

As stated above, one objective of this paper is to understand the heterogeneous effect of absorptive capacity underlying the severe spatial polarization of regional innovation in China. To better illustrate this, a more explicit spatiotemporal pattern of regional innovation is revealed by the multi-scalar Markov chain analysis (Table 3). The result for the county-level analysis indicates that the transitional pattern of regional innovation in China is relatively stable among various classes. Compared with other classes, counties in “Low” and in “High” are more likely to persist in terms of their innovation levels. More specifically, the frequencies of counties in “High” and in “Low” remaining at the same innovation levels in the following year are 88.09% and 83.66% respectively. By contrast, the frequencies are only 62.45% and 64.32% respectively for counties in “Middle-low” and in “Middle-high” to stay at the same innovation levels. Additionally, a region is more likely to move from one level to an adjacent level (e.g., from “Low” to “Middle-low”) than to a more distant one (e.g., directly from “Low” to “High”). Similar findings also hold for prefecture-level and province-level analyses. On the one hand, the results preliminarily reveal the persistence of a “Matthew effect” in regional innovation in China, where less innovative regions are relatively unlikely to achieve a breakthrough to highly innovative regions in a short time, whereas highly innovative regions tend to persist in terms of their innovation levels. On the other hand, we also discover that the transitional frequencies rise as the geographical scale of analysis becomes more detailed. In this light, in contrast with existing studies conducted at the province level, investigations at the city level and county level are more precise in capturing the features of knowledge dynamics.

Table 3 Multi-scalar Markov-chain transitional matrix for regional innovation in China, 2000–2015

Factors Influencing Regional Innovation in China

We first inspect the correlations among variables with Variance Inflation Factors (VIF). The test shows no evidence of multicollinearity among explanatory variables (max VIF < 3). Subsequently, based on the Hausman test, we select fixed-effect estimators for all econometric specifications. At the first stage of regression analysis, we estimate a set of baseline models to identify the determinants of regional innovation (Table 4). The baseline models yield several interesting findings: Firstly, industrial R&D, FDI and human capital have positive and significant effects on urban innovation within all city classes. Moreover, for industrial R&D and FDI, we further discover that their contributions to regional innovation are largest in highly innovative cities. Specifically, the coefficients of industrial R&D (LN COR) are 0.068 in the 25th quantile model (q25), 0.060 in the 50th quantile model (q50), and 0.077 in the 75th quantile model (q75). This implies that every 1% increase in industrial R&D is associated with a 0.068% increase in innovation performance for less innovative cities, with the positive effect rising to 0.077% for highly innovative cities. This finding is also substantially supported by the existing literature (Acs et al. 1994; Audretsch and Feldman 1996). Similarly, the coefficients of LN FDI are 0.055, 0.064 and 0.077 at the 25th, 50th and 75th percentiles respectively, which implies that every 1% increase in FDI is associated with a mere 0.055% growth in innovation for less innovative cities, with the positive effect gradually increasing in highly innovative cities. It is worth noting that contradictory conclusions have been found in the existing literature on FDI. Some scholars believe that FDI can facilitate regional innovation through technology diffusion (Blomstrom and Persson 1983; Driffield and Munday 2001), while others reveal a negative (Harrison 1999) or insignificant relationship (Haddad and Harrison 1993) between FDI and innovation. In the light of our finding, we believe that this lack of consistency in the extant literature can be attributed to the failure to capture regional heterogeneity, particularly the disparity in urban innovation levels.
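The percentage readings of the coefficients above follow from the log-log form of the model, in which each coefficient is an elasticity. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the calculation simply holds all other regressors fixed and ignores interaction terms.

```python
def implied_pct_change(elasticity, pct_change_in_input):
    """Percentage change in the outcome implied by a log-log coefficient.

    If LN Y = b * LN X + ..., scaling X by (1 + p) scales Y by (1 + p) ** b.
    """
    factor = (1.0 + pct_change_in_input / 100.0) ** elasticity
    return (factor - 1.0) * 100.0


# Coefficients of LN COR reported for the q25 and q75 baseline models.
small_q25 = implied_pct_change(0.068, 1.0)   # 1% more industrial R&D, q25
small_q75 = implied_pct_change(0.077, 1.0)   # 1% more industrial R&D, q75
large_q25 = implied_pct_change(0.068, 10.0)  # 10% more industrial R&D, q25
```

For a 1% change the implied response is almost exactly the coefficient itself, which is why the coefficient can be read directly as a percentage. For larger changes, such as 10%, the exact response is slightly below ten times the coefficient.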

Table 4 Results from panel quantile regression: the baseline model

For government R&D expenditure (LN GOV), we also detect a significantly positive effect for all city classes. This finding coincides with the prevailing notion that there is a positive relationship between government support and innovation performance (Guellec and Van Pottelsberghe De La Potterie 2003; Czarnitzki and Hussinger 2004; Branstetter and Sakakibara 2002). Moreover, it also implies that the positive contribution of government R&D investment still outweighs its negative “crowding-out effect” on industrial R&D in China (Lichtenberg 1984; Wallsten 2000). In sharp contrast with the effect of industrial R&D expenditure, government support tends to be more effective in less innovative cities. More specifically, the coefficients of LN GOV are 0.163 in the 25th quantile model, 0.153 in the 50th quantile model, and 0.117 in the 75th quantile model. The results indicate that every 1% increase in government support is associated with a 0.163% increase in innovation for less innovative cities, with the strength of the effect gradually diminishing to a mere 0.117% for highly innovative cities. Furthermore, since the coefficients of LN LKS and LN DKS fail to reach the 1% significance level in all the models, there is not enough evidence to support positive influences of localization and diversification on regional innovation in China.

At the next stage, to systematically examine the moderating effect of regional absorptive capacity, we sequentially add the interaction terms between human capital and each of the other five regional innovation determinants (industrial R&D, government support, FDI, spillovers from localization, and spillovers from diversification) into the models (Table 5). The results yield several findings on regional absorptive capacity: Firstly, for highly innovative cities only, regional absorptive capacity exerts positive moderating effects on the association between industrial R&D and innovation, and the association between government support and innovation, which are two of the most significant determinants of Chinese urban innovation according to the baseline model. More specifically, in model 2 and model 3, the parameters for the interaction terms of industrial R&D (LN AC × LN COR) and government support (LN AC × LN GOV) are positive and statistically significant only at the 75th percentile, indicating that the positive moderating effects of regional absorptive capacity for the above two critical factors are not present in less innovative cities. According to the existing literature (Cohen and Levinthal 1990; Zahra and George 2002; Camison and Fores 2010; Lund Vinding 2006; Lane and Lubatkin 1998), this can be ascribed to the disparity in urban knowledge bases among cities. In comparison with less innovative cities, the intensified knowledge creation in highly innovative cities is more likely to contribute to a larger stock of prior knowledge, which serves as the most critical determinant of absorptive capacity (Cohen and Levinthal 1990; Lane and Lubatkin 1998).

Table 5 Results from panel quantile regression: moderating effects of regional absorptive capacity

Similarly, regional absorptive capacity exerts a positive moderating effect on the association between FDI and innovation only in highly innovative cities. The interaction term of FDI (LN AC × LN FDI) is significantly positive only in the 75th percentile model, which is consistent with our expectation. In line with our results, the existing literature has also found that regional absorptive capacity can facilitate foreign technology diffusion from FDI (Yang and Lin 2012; Fu 2008; Liu and White 1997). However, the heterogeneous effect of absorptive capacity across different cities has not yet been revealed. Our study further illustrates that less innovative cities benefit less from FDI due to their deficiency in regional absorptive capacity. This finding is also supported by Borensztein et al. (1998), who claimed that technology diffusion from trade openness can only be triggered by a sufficient amount of human capital. In this light, the above two findings indicate that regional absorptive capacity in China surprisingly serves as a self-reinforcing mechanism solely for highly innovative cities, contributing to the rising regional inequality of innovation. To some extent, the above findings offer new insights for explaining the “Matthew effect” in the geographical distribution of innovation in China.

Additionally, although the main effects of localization and diversification are not significant at the 1% level according to the baseline model, regional absorptive capacity still improves their contributions to innovation in certain cities. For spillovers from localization (LN AC × LN LKS), regional absorptive capacity serves as a positive moderator within both highly innovative and less innovative cities, whereas for spillovers from diversification, the positive moderating effect is only evident in less innovative cities. This is because in less innovative and less developed cities, the “recombinant innovation,” which is the nature of diversification (Jacobs 1969; Castaldi et al. 2015), often carries a lower opportunity cost than it does in more innovative and more developed cities.

Conclusion

By embedding the conceptual framework of regional absorptive capacity into innovation studies, this paper re-evaluates the determinants of regional innovation in China, focusing in particular on the role of regional absorptive capacity and its heterogeneous effect across cities. First, we use patent data aggregated at the county level to trace the spatiotemporal dynamics of innovation in China from 2000 to 2015. Next, by carrying out a set of panel quantile regressions, we systematically reveal the extent to which regional absorptive capacity moderates the associations between regional innovation and its major determinants.

Our empirical study yields several interesting findings. Firstly, there is a severe “Matthew effect” in the spatial dynamics of innovation in China: less innovative regions are relatively unlikely to achieve a breakthrough, whereas highly innovative regions tend to maintain their innovation levels. Secondly, turning to the main effects of the regional innovation determinants, we reveal that industrial R&D, FDI and agglomeration externalities are more effective in highly innovative cities, while government support is more effective in less innovative cities. Thirdly, regional absorptive capacity in China unexpectedly serves as a self-reinforcing mechanism solely for highly innovative cities, exacerbating the spatial inequality of regional innovation. More specifically, although we find that absorptive capacity positively moderates agglomeration externalities in less innovative cities, its moderating effects on industrial R&D, government support and FDI, which have been shown to be the most critical factors behind regional innovation in China, are found only in highly innovative cities.

Several policy implications can be derived from the empirical findings. In highly innovative cities, local governments may wish to exploit regional absorptive capacity efficiently to trigger spillovers from advanced technologies. Moreover, since the contribution of industrial R&D reaches its highest level in highly innovative cities, we recommend that R&D policies there give free rein to industrial R&D activities to shape their comparative advantage in regional innovation. In less innovative cities, by contrast, local governments are advised to be more proactive in directing R&D activities. In addition, to truly bridge the gap with highly innovative cities, policies in less innovative cities should give more priority to fostering human capital by means of favorable talent policies. In this respect, more effort should be made to provide favorable urban amenities to attract and retain skilled laborers (Liu and Shen 2017) and highly educated youths (Liu et al. 2017), both of whom shape a region's potential competitive advantage in fostering human capital.

In sum, this paper contributes to the existing literature in three ways: firstly, the study carries out a systematic empirical assessment of the effects of regional absorptive capacity; secondly, our econometric specifications are based on cities, a more detailed geographical scale for advancing the current understanding of regional innovation; lastly, this study captures the heterogeneous effect of absorptive capacity across cities, which helps explain the rising regional inequality of innovation in China. Nonetheless, the spatial effect of regional absorptive capacity remains unclear, as we assume in this paper that cities are independent spatial units that do not interact with technological practices elsewhere. It has been suggested, however, that there are strong cross-regional interactions between cities (Pred 1977), an issue that deserves further attention. We therefore intend to examine the cross-regional moderating effects of absorptive capacity in future work.