1 Introduction

One of the most important developments in the new growth and international trade theories has been the recognition of the significant role of knowledge flows between economic agents from different regions or economic areas. According to Grossman and Helpman (1991), for instance, growth rates are faster when technological change readily flows across international borders. For Romer (1990), the non-rival and partially non-excludable nature of knowledge does not allow inventors to fully prevent other firms from using their inventions. More generally, knowledge spillovers may be driven by a variety of channels such as the mobility of workers, the exchange of information at technical conferences, or knowledge available in the scientific and technological literature, including patent documents. These knowledge externalities or R&D spillovers can benefit competitors’ R&D by lowering the costs of their own R&D activities, and in turn may contribute to their productivity performance. However, new products and processes can also render existing ones obsolete or less competitive, and firms that have difficulty staying in the R&D race may suffer from rivals’ R&D. In this case, R&D externalities are associated with competitive pressures which translate into negative effects on firms’ performance.

The specific type of knowledge flows that economists have been most interested in concerns pure knowledge spillovers.Footnote 1 Economists have often investigated the patterns of these knowledge flows from a geographic or a technological perspective, i.e. in terms of geographic proximity or technological linkages between the unit generating these flows and the recipients. Over the last decade, several studies in the literature that examines the spatial dimension of innovative activities have found that knowledge spillovers tend to be locally concentrated (Jaffe 1989; Jaffe et al. 1993). At the same time, other studies have shown evidence of a positive relationship between the R&D of ‘technological neighbours’ and the firm’s R&D productivity (as measured by patenting). In terms of productivity performance, the effects of R&D spillovers also appear to be mainly technologically localised (Jaffe 1986, 1988).

While very important for economic growth, the two types of geography and technology based R&D externalities have rarely been investigated together (Orlando 2000). A first contribution of this paper is to analyse the impact of these spillover phenomena on firms’ productivity in a unified framework. To this end, we implement two methodologies to analyse knowledge flows among firms. We construct the R&D spillover stock by considering a technological as well as a geographic proximity measure. The approach for modelling the technology based R&D spillover variable builds on the methodology that was first empirically implemented by Jaffe (1986). This method rests on the construction of technological proximities between firms in a technological space. The firms’ positions in the technological space are characterized by the distribution of their patents over patent classes. The geography based R&D spillover variable is computed from geographic distances between firms, which are derived from the latitude and longitude coordinates of corporate headquarters (Orlando 2000).

In order for R&D spillovers to be effective, firms must be able to identify, assimilate and exploit the external knowledge stock. The degree of absorptive capacity will depend on the firms’ own R&D activities. A second contribution of the study is to analyse the role of absorptive capacity in enhancing the firms’ ability to benefit from geographic and technological based R&D spillovers. Following Cohen and Levinthal (1989), the firm’s own R&D is used to measure the level of knowledge accumulation internal to the firm and the importance of absorptive capacity.

We use an extended production function to estimate the impact of R&D spillover components and absorptive capacity besides traditional inputs and own R&D stock (Griliches 1979). The dataset consists of a representative sample of 808 worldwide R&D-intensive manufacturing firms over the period 1988–1997. This information is matched to the USPTO dataset of Hall et al. (2001). The results, estimated by means of panel data econometric methods, indicate a positive and significant impact of R&D spillovers on productivity performance. On the whole, the elasticity associated with the geographic (resp. technological) R&D spillover pool is twice (resp. four times) that of the firm’s own R&D stock. Furthermore, US and Japanese firms are mainly sensitive to spillover effects generated by domestic firms while European firms appear to mainly benefit from the international R&D spillover stock.

The paper is organized as follows. In Sect. 2, we describe the dataset constructed for the purposes of the study. Then, we discuss the different methodologies used to measure the spillover stocks as well as the econometric framework. Section 3 presents the main empirical findings. A concluding section briefly summarises the empirical findings and points out some directions for future research.

2 Data and econometric framework

2.1 Data sources and matching procedure

The dataset has been constructed with a view to setting up a representative sample of the largest firms at the international level that reported R&D expenditures. The information on company profiles and financial statements comes from the Worldscope/Disclosure database.Footnote 2 The dataset consists of a balanced panel of 808 firms over the 1988–1997 period (Appendix 4). For each firm, information is available for net sales (S), the number of employees (L), net property, plant and equipment (C), annual R&D expenditures (R) and main industry sectors according to the Standard Industrial Classification (SIC, four digits). The database of Hall et al. (2001) on US patents is the second source of information used in this study. This database, which is available on the NBER website, contains all patents registered at the US Patent & Trademark Office (USPTO) over the period from January 1, 1963 to December 30, 1999. It contains a large set of information, including the technological fields corresponding to the claimed invention. In Appendix 4, we show the number of patents used in the database, by country and economic area. The third source of information concerns the geographic coordinates of firms, i.e. the latitude and the longitude. This information has been retrieved on the basis of firms’ headquarters addresses and is used to compute geographic distances between firms.

A major task in assembling the dataset has been the matching of patents from the Hall et al. (2001) data with firms in the Worldscope database. Two difficulties have been encountered. First, patents are assigned to firms on the basis of their names, which can vary from one data source to another, e.g. ‘Co’ instead of ‘Corporation’, ‘Incorporated’ or ‘Inc’ and other such changes or abbreviations. Second, many large firms have several R&D performing subsidiaries in several countries and it is not straightforward to link the patents applied for by these subsidiaries to the parent company. Ideally, one would have a ‘mapping’ of each parent company to its subsidiaries and affiliates. However, it is not easy to construct an accurate mapping, since it changes over time through the process of merger and acquisition.

Taking these issues into account, the matching procedure consisted of two steps. In a first step, patents were assigned to firms on the basis of their generic names. For instance, when searching for the word “Fiat” we retrieved 435 patent documents. Examining in more detail the firms’ full names reported in these documents, it appeared that 391 patents were assigned to “Fiat S.p.A.”, 26 patents to “Fiat Products INC.”, 14 patents to “Fiat Français” and four patents to “Fiat Products LTD”. The last three companies are clearly foreign subsidiaries of the European parent company. Hence, the patents granted to these firms have been consolidated with those of “Fiat”. In a second step, this procedure has been repeated for each firm of the sample. For about 80% of the sample there was only one firm name in the retrieved documents. For the rest, firm names which could be identified without any doubt as subsidiaries have been matched with generic names.
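The consolidation step can be illustrated with a short sketch. It is not the authors' actual procedure: the function names `normalise` and `consolidate`, the list of legal-form suffixes and the prefix-matching rule are all assumptions made for the example. Only the four Fiat patent counts come from the text.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes; the list actually used is not given in the text.
SUFFIXES = {"co", "corp", "corporation", "inc", "incorporated", "ltd", "spa"}

def normalise(name):
    """Lower-case a name, drop punctuation and strip trailing legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def consolidate(generic_names, assignee_counts):
    """Add up the patents of every assignee whose cleaned name starts with a generic name."""
    totals = defaultdict(int)
    for assignee, count in assignee_counts.items():
        cleaned = normalise(assignee)
        for generic in generic_names:
            key = generic.lower()
            # Assumed rule: exact match, or generic name followed by further words.
            if cleaned == key or cleaned.startswith(key + " "):
                totals[generic] += count
                break
    return dict(totals)

# Patent counts by assignee name, as quoted in the text for the search term "Fiat".
fiat_assignees = {
    "Fiat S.p.A.": 391,
    "Fiat Products INC.": 26,
    "Fiat Français": 14,
    "Fiat Products LTD": 4,
}
print(consolidate(["Fiat"], fiat_assignees))
```

A purely automatic prefix rule like this one can produce false matches, which is why candidate subsidiaries would still need to be checked by hand before consolidation.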

2.2 Construction of variables

Given the presence of outliers, a cleaning procedure similar to the one in Capron and Cincera (1998) has been implemented in order to reject firms whose variables displayed very high and frequent variations.Footnote 3 The process of merger and acquisition of firms over time is the most likely reason for the presence of such outliers. All variables have been converted into constant 1995 dollars. Because of the non-availability of output deflators at the industry level for each country, net sales (S), net property, plant and equipment (C) and R&D expenditures (R) have been deflated using the GDP deflators of the respective countries. The stock of R&D capital has been built on the basis of the perpetual inventory method with a depreciation rate equal to 15% and an initial stock of R&D capital calculated by assuming a growth rate of R&D expenditure equal to 5%.
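The construction of the R&D capital stock can be sketched as follows. The text gives only the depreciation rate (15%) and the assumed growth rate (5%); the initialisation K_0 = R_0 / (g + δ) and the recursion K_t = (1 − δ) K_{t−1} + R_t used below are the conventional forms of the method and are assumptions here, as are the function name and the expenditure figures.

```python
def rd_capital_stock(rd_expenditures, depreciation=0.15, growth=0.05):
    """Build an R&D capital stock series from annual real R&D expenditures.

    Assumed initial stock: K_0 = R_0 / (growth + depreciation).
    Assumed recursion:     K_t = (1 - depreciation) * K_{t-1} + R_t.
    """
    if not rd_expenditures:
        return []
    stocks = [rd_expenditures[0] / (growth + depreciation)]
    for spending in rd_expenditures[1:]:
        stocks.append((1.0 - depreciation) * stocks[-1] + spending)
    return stocks

# Hypothetical expenditures (constant dollars) growing at 5% a year.
print(rd_capital_stock([100.0, 105.0, 110.25]))
```

Timing conventions differ across studies (some lag the expenditure term by one year), so the recursion should be read as one common variant rather than the one necessarily used for this dataset.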

A key issue in the empirical analysis on knowledge spillovers is the measurement of the pool of external knowledge. This stock is usually built as the amount of R&D conducted elsewhere weighted by some proximity measure which reflects the intensity of knowledge flows between the source and the recipient of spillovers.Footnote 4 In this paper, we follow the methodology developed by Jaffe (1986) to compute the technological proximity. This procedure rests on the construction of a technological vector for each firm based on the distribution of its patents across technology classes.Footnote 5 These vectors allow one to locate firms in a multi-dimensional technological space where technological proximities between firms are computed as the uncentered correlation coefficient between the corresponding technology vectors:

$$ P_{ij}=\frac{\sum_{k=1}^K T_{ik} T_{jk}}{\sqrt{\left(\sum_{k=1}^K T_{ik}^2\right)\left(\sum_{k=1}^K T_{jk}^2\right)}} $$
(1)

where \(T_i\) is the technological vector of firm i and \(P_{ij}\) is the technological proximity between firms i and j.

According to this procedure, the total weighted stock of R&D spillovers is computed as follows:

$$ Ts_i=\sum_{j\neq i}{P_{ij} K_j} $$
(2)

where \(K_j\) is the R&D capital stock of firm j.
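Equations (1) and (2) can be illustrated with a minimal sketch. The patent counts and R&D stocks below are invented for the example, and the function names are hypothetical; the code also assumes every firm holds at least one patent, so that no vector has zero length.

```python
import math

def technological_proximity(t_i, t_j):
    """Uncentered correlation between two patent-class count vectors, as in Eq. (1)."""
    dot = sum(a * b for a, b in zip(t_i, t_j))
    norm_i = math.sqrt(sum(a * a for a in t_i))
    norm_j = math.sqrt(sum(b * b for b in t_j))
    return dot / (norm_i * norm_j)  # assumes neither vector is all zeros

def technological_spillover(i, T, K):
    """Ts_i as in Eq. (2): R&D stocks of all firms j != i, weighted by P_ij."""
    return sum(
        technological_proximity(T[i], T[j]) * K[j]
        for j in range(len(T))
        if j != i
    )

# Hypothetical patent counts of three firms over three technology classes.
T = [[10, 0, 0],
     [5, 5, 0],
     [0, 0, 8]]
# Hypothetical R&D capital stocks of the same firms.
K = [100.0, 200.0, 300.0]

for i in range(len(T)):
    print(i, round(technological_spillover(i, T, K), 2))
```

In this example the third firm patents in a class that the other two never use, so its proximity to them is zero and it receives no technological spillovers.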

Table 1 illustrates some technological proximity measures for different firms in the dataset. As emphasised by Jaffe (1986), this technological proximity index, which takes only non-negative values, relies on the strong assumption that the appropriability conditions of knowledge are the same for all firms. The more the outcomes of R&D activities are appropriable, the less knowledge will flow between R&D performers and the potential users of this knowledge. Since appropriability conditions are not observable at the firm level, they are difficult to assess directly. However, in a panel data context, one may assume that these firm specific unobserved effects are constant over the period considered.

Table 1 Technological and geographic (in italic) proximities

As for technological proximities, different measures have been proposed in the literature to measure the geographic proximity between firms.Footnote 6 Following Orlando (2000), we use the latitude and longitude coordinates of firms derived from the address of their headquarters. Assuming a spherical earth of actual earth volume, the arc distance in miles between any two firms i and j can be computed according to the Haversine formula:

$$ d_{ij}=2\times 3959\times\arcsin\sqrt{\sin^{2}\left({\frac{\hbox{lat}_{\rm j}-\hbox{lat}_{\rm i}}{2}}\right)+\cos\left({\hbox{lat}_{\rm j}}\right)\cos \left({\hbox{lat}_{\rm i}}\right)\sin^{2}\left({\frac{\hbox{lon}_{\rm j}-\hbox{lon}_{\rm i}}{2}}\right)} $$
(3)

where 3,959 is the radius of the earth in miles and latitude and longitude values are in radians.
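A minimal sketch of the distance computation is given below. It uses the standard haversine form, in which the two cosine terms multiply the longitude term, with an earth radius of about 3,959 miles. The function name, the degree-based inputs and the example coordinates are assumptions made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3959.0

def haversine_miles(lat_i, lon_i, lat_j, lon_j):
    """Great-circle distance in miles between two points given in decimal degrees."""
    # Convert to radians, as the formula requires.
    lat_i, lon_i, lat_j, lon_j = map(math.radians, (lat_i, lon_i, lat_j, lon_j))
    h = (math.sin((lat_j - lat_i) / 2) ** 2
         + math.cos(lat_i) * math.cos(lat_j) * math.sin((lon_j - lon_i) / 2) ** 2)
    # Clamp to 1 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, h)))

# Hypothetical headquarters coordinates (latitude, longitude) of two firms.
firm_a = (45.07, 7.69)
firm_b = (48.86, 2.35)
print(round(haversine_miles(*firm_a, *firm_b), 1))
```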

As stressed by Orlando (2000), the use of corporate headquarters to represent the firm location may be questionable for the purpose of spillover measurement. Quoting the author (2000, p. 14), “One may argue that our true interest is in the location of innovation, not necessarily in the location of corporate headquarters. However, if firms view R&D as their most strategically important investment they are likely to locate this activity close to corporate headquarters. Furthermore, while R&D may be a reasonable proxy of the scale of a firm’s innovative activity, spillovers from this implied knowledge base may emerge from any of the locations that compose the firm; R&D facilities, production facilities, or corporate headquarters. Thus, corporate headquarters may be as good a proxy of firm location as we can hope to find”.Footnote 7

Based on the geographic distances between firms and assuming that the spillover stock decreases with the geographic distance \(d_{ij}\), we implement a weighted sum of R&D capital stocks. We cannot use the function \(1/d_{ij}\) to compute the proximity \(G_{ij}\), since it is not defined when \(d_{ij}\) equals zero. To solve this problem, we use the negative exponential function, \(1/\hbox{e}^{{d}_{ij}}\), so that if the distance is zero, the geographic proximity is 1, i.e. the maximum possible value:

$$ G_{ij}=1/\hbox{e}^{{d}_{ij}} $$
(4)

Once we have computed the geographic proximity \(G_{ij}\) among firms, we can construct the geographic based R&D spillover stock for firm i as:

$$ Tsg_i=\sum_{j\neq i} {G_{ij} K_j } $$
(5)
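The geographic spillover stock of Eqs. (4) and (5) can be sketched in the same way as the technological one. The distance matrix and R&D stocks below are invented, and the function names are hypothetical.

```python
import math

def geographic_proximity(d_ij):
    """G_ij = 1 / e^{d_ij}, as in Eq. (4); equals 1 when the distance is zero."""
    return math.exp(-d_ij)

def geographic_spillover(i, distances, K):
    """Tsg_i as in Eq. (5): R&D stocks of all firms j != i, weighted by G_ij.

    `distances` is a square list of lists holding the pairwise distances d_ij.
    """
    return sum(
        geographic_proximity(distances[i][j]) * K[j]
        for j in range(len(K))
        if j != i
    )

# Hypothetical pairwise distances: firms 0 and 1 share a location, firm 2 is further away.
distances = [[0.0, 0.0, 1.0],
             [0.0, 0.0, 2.0],
             [1.0, 2.0, 0.0]]
# Hypothetical R&D capital stocks.
K = [100.0, 200.0, 300.0]

print(round(geographic_spillover(0, distances, K), 2))
```

Two firms at the same location receive the maximum weight of 1, and the weight falls towards zero as the distance grows.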

Table 1 also reports the geographic proximities of the same five firms as in the previous example. Here also, this proximity measure only takes positive values. Although the R&D spillovers based on the proximities in the technological or geographic space are likely to be less contaminated by pecuniary externalities and common industry effects, evidence of their impact on productivity may still be unrelated to knowledge spillovers, but rather the result of spatially correlated technological opportunities. Yet, as emphasised by Griliches (1992), if new opportunities exogenously arise in a technological area, firms active in that area will increase their R&D spending and improve their productivity.

Table 2 Technology and geographic based R&D spillovers (GMM-system estimates)

Following Bottazzi and Peri (2003), another approach to formalize geographic spillovers is to consider classes of distances. If R&D spillovers depend on the geographic distance between firms, then one can distinguish between spillovers occurring between nearby firms and those occurring between distant firms.Footnote 8,Footnote 9 One advantage of estimating different geographic spillover pools based on different ranges of distances rather than on (5) is that no a priori assumption on how this variable depends on distance needs to be made.

In order for R&D spillovers to be effective, firms must be able to ‘absorb’ the knowledge generated outside their walls. Yet, the empirical measurement of firms’ absorptive capacity, i.e. the ability of firms to identify, assimilate and exploit knowledge from the environment (Cohen and Levinthal 1989), has proven to be difficult. Absorptive capacity is usually measured through R&D.Footnote 10 Indeed, as discussed by Cohen and Levinthal (1989), R&D activities are not only aimed at generating new knowledge but also play an important role in building a firm’s absorptive capacity. In practice, it is difficult to disentangle the two faces of this dual role of R&D. Kinoshita (2000) and Grunfeld (2004) consider the interaction term between the firm’s own R&D intensity and the R&D spillover variable to evaluate the firms’ absorptive capacity. In this study we consider the R&D stock as an alternative to R&D intensity in order to capture the cumulative nature of the learning process which helps to build the absorptive capacity.

2.3 Econometric framework and summary statistics

Following Griliches (1979), the impact of technological and geographic R&D spillovers on firms’ productivity growth, besides traditional inputs and the firm’s own R&D stock, is estimated by means of an extended Cobb–Douglas production function:

$$ Y=AL^{\beta_1}C^{\beta_2}K^{\beta_3}X^{\gamma} $$
(6)

This function can also be estimated by adding an interaction term between the firm’s own R&D capital and the R&D spillover stock by setting:

$$ \gamma=\gamma_1+\gamma_2 K $$
(7)

This allows us to test for the presence and the extent of absorptive capacity. Substituting (7) into (6), taking logarithms and introducing a set of time dummies leads to:

$$ \ln Y_{it}=\alpha_i+\lambda_t+\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln X_{it} +\gamma_2 K_{it} \ln X_{it} +\varepsilon_{it} $$
(8)

where ln is the natural logarithm; i indexes the firm and t the time period; \(Y_{it}\) is net sales; \(L_{it}\) is employment; \(C_{it}\) is the stock of physical capital; \(K_{it}\) is the stock of R&D capital; \(\alpha_i\) is the firm’s fixed effect; \(\lambda_t\) is a set of time dummies; \(X_{it}\) is a vector of spillover components; β and γ are vectors of parameters and \(\varepsilon_{it}\) is the disturbance term.
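The intermediate step between (6), (7) and (8) can be written out explicitly. Taking logarithms of the production function and substituting the expression for γ gives

$$ \ln Y=\ln A+\beta_1 \ln L+\beta_2 \ln C+\beta_3 \ln K+(\gamma_1+\gamma_2 K)\ln X $$

Equation (8) then follows if the log of the efficiency term is decomposed as \(\ln A_{it}=\alpha_i+\lambda_t+\varepsilon_{it}\). This decomposition is an assumption made here to complete the derivation; the text does not state it explicitly. Under this specification, the elasticity of output with respect to the spillover stock is

$$ \frac{\partial \ln Y_{it}}{\partial \ln X_{it}}=\gamma_1+\gamma_2 K_{it} $$

so a positive \(\gamma_2\) means that firms with a larger own R&D stock gain more from a given spillover pool, which is the sense in which the interaction term captures absorptive capacity.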

Different R&D spillover components have been estimated:

  • Ts = total stock of technological spillovers;

  • Tsg = total stock of geographic spillovers.

Thus, we estimate the following models:

$$ \ln Y_{it} =\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln Ts_{it} +\varepsilon_{it} $$
(8.1)
$$ \ln Y_{it} =\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln Tsg_{it} +\varepsilon_{it} $$
(8.2)
$$ \ln Y_{it} =\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln Ts_{it} +\gamma_2 K_{it} \ln Ts_{it} +\varepsilon_{it} $$
(8.3)
$$ \ln Y_{it} =\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln Tsg_{it} +\gamma_2 K_{it} \ln Tsg_{it} +\varepsilon_{it} $$
(8.4)
$$ \ln Y_{it} =\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln Ts_{it} +\gamma_3 \ln Tsg_{it} +\varepsilon_{it} $$
(8.5)
$$ \begin{aligned} \ln Y_{it} &=\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\gamma_1 \ln Ts_{it} +\gamma_2 K_{it} \ln Ts_{it} +\gamma_3 \ln Tsg_{it} \\ &\quad +\gamma_4 K_{it} \ln Tsg_{it} +\varepsilon_{it} \end{aligned} $$
(8.6)

In order to test the approach of Bottazzi and Peri (2003) to formalize geographic spillovers, we split the stock of R&D spillovers into two components, one relative to firms located within a distance of 300 km and another relative to firms more than 300 km away.Footnote 11 Thus, we estimate the following model:

$$ \ln Y_{it} =\alpha_i +\lambda_t +\beta_1 \ln L_{it} +\beta_2 \ln C_{it} +\beta_3 \ln K_{it} +\lambda_1 \ln Ns_{it} +\lambda_2 \ln Fs_{it} +\varepsilon_{it} $$
(8.7)

where Ns = stock of spillovers between near firms (\(d_{ij} < 300\) km); Fs = stock of spillovers between far away firms (\(d_{ij} > 300\) km).
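The split into near and far spillover stocks can be sketched as follows. Two choices in the code are assumptions, since the text does not settle them: the stocks of other firms are simply summed without further weighting inside each distance class, and a distance of exactly 300 km is assigned to the far class. The function name and the data are hypothetical.

```python
def near_far_spillovers(i, distances_km, K, threshold_km=300.0):
    """Return (Ns_i, Fs_i): sums of other firms' R&D stocks within and beyond the threshold."""
    near = 0.0
    far = 0.0
    for j, stock in enumerate(K):
        if j == i:
            continue  # a firm's own stock is excluded from both pools
        if distances_km[i][j] < threshold_km:
            near += stock
        else:
            far += stock  # assumption: d_ij == threshold counts as far
    return near, far

# Hypothetical pairwise distances in kilometres and R&D capital stocks.
distances_km = [[0.0, 120.0, 850.0],
                [120.0, 0.0, 900.0],
                [850.0, 900.0, 0.0]]
K = [100.0, 200.0, 300.0]

print(near_far_spillovers(0, distances_km, K))
```

Because the pools are defined by distance classes only, no functional form linking spillovers to distance has to be chosen in advance.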

Equations 8.1–8.7 are estimated by means of two econometric models for panel data: first difference and system IV-GMM. These models allow controlling for firms’ permanent unobserved specific effects, and taking into account the possible endogeneity or simultaneity issue of the explanatory variables with the error term.Footnote 12 The more recent system GMM (GMM SYS) estimator combines the standard set of equations in first difference (GMM F.D.) with suitably lagged levels as instruments, with an additional set of equations in levels with suitably lagged first differences as instruments.Footnote 13 The validity of these additional instruments, which consist of first difference lagged values of the regressors, can be tested through difference Sargan over-identification tests. The GMM SYS estimator can lead to considerable improvements in terms of efficiency as compared to the GMM F.D. one.Footnote 14

In the following static generic model, \(y_{it}=\beta x_{it}+\eta_i+\upsilon_{it}\), where \(x_{it}\) is correlated with \(\eta_i\) and exogenous in the sense that \(E(x_{it}\upsilon_{is})=0\) for i = 1, …, N and \(s\leq t\), taking first differences to eliminate the individual effects \(\eta_i\) makes the moment conditions \(E(x_{it-s}\Delta\upsilon_{it})=0\) for t = 3, …, T and s ≥ 2 available. Lagged values of endogenous \(x_{it}\) variables dated t − 2 and earlier can then be used as instruments for the equations in first-differences.
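Written out, the first-difference transformation of the generic model is

$$ \Delta y_{it}=y_{it}-y_{it-1}=\beta \Delta x_{it}+\Delta \upsilon_{it}, \qquad \Delta \upsilon_{it}=\upsilon_{it}-\upsilon_{it-1} $$

The individual effect \(\eta_i\) drops out because it does not vary over time. The transformed error still contains \(\upsilon_{it-1}\), so a regressor that responds to contemporaneous shocks is correlated with \(\Delta\upsilon_{it}\) both at date t and at date t − 1. This is a brief sketch of why, in that case, only levels dated t − 2 and earlier qualify as instruments.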

If the \(\Delta x_{it}\) are uncorrelated with the individual-specific effects, i.e. \(E(\eta_i \Delta x_{it})=0\) for i = 1, …, N and t = 2, …, T, the following moment conditions are also available: \(E(\Delta x_{it-1} u_{it})=0\) for i = 1, …, N and t = 3, …, T. Suitably lagged first-differences of endogenous \(x_{it}\) variables can then be used as instruments for the level equations, which yields the system GMM estimator.

Since the model is overidentified in the sense that there are more instruments than parameters to be estimated, the validity of the instruments can be tested by means of the Sargan test of overidentifying restrictions. Given the set of instruments used, the test assesses the null hypothesis that the instruments are jointly valid, i.e. that the orthogonality conditions are satisfied. The Sargan test statistic is χ2 distributed under the null with (p − k) degrees of freedom (where p is the number of instruments and k is the number of variables in the regression).Footnote 15 In particular, the Sargan test results lead us to treat the \(x_{it}\) as endogenous: since the \(\Delta x_{it}\) are correlated with the error term, \(\Delta x_{it-1}\) must be used as instrumental variables for the equations in levels.

Appendices 2, 4 and 5 present some descriptive statistics and Appendix 3 gives the representativeness of the dataset in terms of R&D expenditures. R&D expenditures of the 808 firms of the dataset amount to about 30–50% of the corresponding R&D aggregates for the EU, Japan and the US.

3 Empirical findings

This section presents the main empirical findings of the paper. Table 2 reports the estimates regarding the impact of traditional inputs, the firm’s own R&D stock and the different R&D spillover stocks on productivity growth. For the reasons discussed above, our preferred estimates are those of the GMM system model.Footnote 16 The results differ from one model to the other, while the Sargan tests of overidentifying restrictions do not reject the validity of the set of instruments at the 5% level of significance.

The estimated elasticities associated with the labour variable vary between 0.44 and 0.53 while for the physical capital variable the coefficients range from 0.12 to 0.16. It should be noted that these estimates are somewhat low compared to the ones generally reported in the literature.Footnote 17

The results indicate a positive and significant impact of firms’ own R&D capital on productivity performance. The magnitude of these coefficients, i.e. 0.18–0.25, is somewhat higher than that obtained in the related literature, which can be explained by the high R&D intensity characterizing the firms of the dataset.Footnote 18

As regards R&D spillovers, there appears to be a rather strong link between technologically based R&D spillovers and firms’ productivity performance. The estimated elasticity associated with this variable, 0.61, is significant and higher than that of the firm’s own R&D stock.

The results in Table 2 also confirm the positive relationship between the growth of productivity and the geography based R&D spillover stock. The estimated elasticity associated with this variable is about 0.41 and is again higher than that of the firm’s own R&D stock. This result is robust to the alternative weighting measure proposed by Bottazzi and Peri (2003) to construct the geographic spillover variable, hence confirming the role of localisation effects on knowledge flows and on productivity growth. Furthermore, the estimated elasticity of the ‘far away’ spillover stock is not significantly different from zero, which supports the finding of Bottazzi and Peri of no spillover diffusion past a certain distance between firms (300 km). These findings, which are in line with the results of previous studies, indicate that the geographic distance between firms matters for R&D spillovers.Footnote 19

The last column of Table 2 includes both proximity-based R&D spillover variables in the same specification. This allows one to investigate which proximity measure (geographic or technological) matters most in explaining firms’ productivity growth. It is worth noting that including the two spillover stocks together does not change the estimated elasticities obtained when each variable is estimated separately. Therefore, we can conclude that the impact of technologically based R&D spillovers on productivity is higher than that of geographic knowledge externalities.

Finally, assuming that the stock of own R&D is a proxy of absorptive capacity, it is possible to analyse the extent to which this capacity interacts with both geographic and technological sources of R&D spillovers. The results reported in Table 3 are in line with the previous ones (Table 2). Furthermore, we observe a positive impact of the interaction terms between firms’ own R&D stock and both types of R&D spillover stocks. These findings suggest the presence of complementarities between these knowledge stocks. The results also indicate no particular differences between the levels of absorptive capacities and their interaction with the geographic and technological R&D spillover variables. These results are confirmed in the last column of Table 3 where both R&D spillover stocks and interaction terms with the firms’ own R&D stocks are introduced simultaneously.Footnote 20

Table 3 Absorptive capacity (GMM-system estimates)

4 Conclusions

The purpose of this paper has been to assess, besides traditional inputs and the firm’s own R&D stock, the impact of two types of R&D externalities on large international R&D companies’ productivity growth over the last decade. The first R&D spillover variable considered is formalized by weighting the R&D stocks of other firms according to two alternative geographic proximity measures. As in previous studies in the literature on the geography of innovation, the idea has been to examine the extent to which localisation effects matter in the diffusion of R&D spillovers and their impact on productivity performance. The second type of R&D externality uses the distances between firms in a technological space constructed on the basis of the distribution of firms’ patents across technological fields. Besides the physical proximity between the sender and the recipient of knowledge flows, technological ‘closeness’ is another dimension which can affect the scope and direction of R&D spillovers.

The main results of the paper can be summarised as follows. Both the geographic and technological based R&D spillover stocks have an important and positive impact on the productivity growth of firms. The effects of the pure technological externalities on firms’ economic performance appear to be higher than those of the geographic spillovers. This finding suggests that technological proximity is more important than geographic proximity for the impact of R&D spillovers on firms’ productivity growth. Finally, these results are confirmed when controlling for absorptive capacity. Including the firms’ own R&D stock, the spillover variables and the interaction between the two simultaneously, we find a complementarity effect between own R&D and both sources of R&D spillovers.

Further analyses are needed to explore these questions. As a first suggestion for future work, we plan, as has already been done in previous studies, to use information on patent citations to construct a more direct measure of R&D spillovers. Backward citations for instance, i.e. references in patent documents to former patents, can be interpreted as evidence of spillover effects from the knowledge described in the cited patent to the knowledge of the citing patent. In order to further analyse the interplay between geographic and technological proximities for the diffusion of knowledge, both types of R&D spillover stocks could be split into a national and an international component. This would allow testing for the presence of country border effects such as institutional settings, national policies, language and history (Maurseth and Verspagen 2002). Finally, the analysis could be enriched by considering alternative measures of absorptive capacity, such as the level of education of the workforce, and their impact on firms’ economic performance.