1 Introduction

When considering the creation and evolution of knowledge, two key properties must be acknowledged. First, knowledge is geographically sticky, that is, it does not easily disseminate across distance (Gertler, 2001; Feldman, 1994; Jaffe et al., 1993). Second, the process of knowledge creation is path-dependent and interactive, meaning that it generally builds on the pre-existing set of knowledge (Atkinson & Stiglitz, 1969; Dosi, 1982; Grabner, 1993). Together, these imply that the production of knowledge is best understood as an evolutionary process playing out at the regional level, a pattern at the forefront of evolutionary economic geography (Boschma & Martin, 2010; Kogler, 2016). Understanding and measuring this process is necessary when pursuing a knowledge-based growth strategy (so-called smart specialization strategies, at least in the European policy context). This paper contributes by expanding on the existing measurements of entry relatedness to consider the relative importance of existing specializations in a region as well as the potential for future linkages between knowledge domains.

As noted above, new knowledge tends to build on that which is already present. This can occur by extending understanding in a given area and/or combining multiple areas in an innovative fashion. Thus, the path-dependent and recombinant nature of knowledge (Arthur, 1989) implies that the available and accessible knowledge pool is an important precondition for subsequent innovations. An implication of this is that if the necessary knowledge domains that would lead to significant technological advances are neither obvious, feasible, nor available, this can lock a region out from developing new advances (Cohen & Levinthal, 1990; Heiner, 1983). Given the limited geographic spread of innovations, one can then describe a region in both physical and cognitive dimensions (Nooteboom, 2000). These features are captured in the “regional knowledge space” methodology (Kogler et al., 2013, 2017; Rigby, 2015). Knowledge spaces, which can be generated at any spatial scale, not only capture the state of knowledge accumulation at a given place, but more importantly capture the networked structure of the local knowledge pool (i.e. the relationship between different areas of understanding) and its evolution over time. Thus, knowledge spaces provide the possibility to observe and analyse the knowledge structure of a particular locality. Further, they can be utilized as a measurement tool to investigate the contribution of place-based knowledge and its structural properties to the growth of inventive output (Boschma et al., 2015; Kogler et al., 2013) and productivity (Rocchetta, Ortega-Argilés, et al., 2021). Going one step further, the knowledge space can be used to anticipate how the introduction of a new knowledge domain to a region will reverberate throughout the existing network. This provides a prediction of whether or not the new knowledge will flourish in an area and generate economic benefits. These predictions are at the heart of smart specialization strategies (Heimeriks & Balland, 2016; McCann & Ortega-Argilés, 2015) designed to promote innovation-based economic development.

Although knowledge can be either developed or shared between distanced actors, specialised knowledge frequently remains localized because of the high costs associated with transferring it (Döring & Schnellenbach, 2006). As a result, knowledge distribution among places is uneven, which in turn creates the necessity of adopting a regional-specific approach for analysing knowledge and capabilities in space. The regional knowledge space approach put forward by Kogler et al. (2013) has become a standard framework for analysing, among other things, regional technological change (Rigby, 2015), emerging technologies (Buarque et al., 2020), the resilience of regional economies (Rocchetta, Mina, et al., 2021; Tóth et al., 2022), and regional growth (Boschma et al., 2015; Rocchetta, Ortega-Argilés, et al., 2021). In particular, regional knowledge spaces have been used to explain changes in regional knowledge in an evolutionary context by incorporating entry and exit (Rigby, 2015) and the selection of technological knowledge domains (Kogler et al., 2017). In this context, entry occurs when novel knowledge (that perhaps already exists elsewhere) is adopted and leads to the development of inventions. Exit, on the other hand, refers to technological capabilities that previously existed in a region but have since been abandoned. Selection, also known as differential growth, refers to changes in the activity, or position within the larger network, of a specific knowledge domain. Knowledge entry in particular has been shown to be important for explaining regional technological specialization (Kogler et al., 2017), with its effects linked to both local and non-local technological relatedness (Rigby, 2015), as well as regional technological diversification (Boschma, 2017). By understanding how the effects of entry hinge on the existing stock of knowledge and potential links with further entry, it is possible to project a region’s ability to adopt and implement new technological capabilities (Kogler, 2017). This is why regional knowledge spaces are particularly useful when implementing smart specialization policies.

With this in mind, it is necessary to have metrics that describe the potential fit of a new technology into a region's knowledge space. In this study, we build upon the existing measures in two key ways that specifically focus on knowledge entry. First, a new measurement of “entry-relatedness” based on a knowledge gravity model is suggested. The knowledge gravity model, which has its origin in Newton’s law of gravitation, has been widely implemented in the field of international economics (Kabir et al., 2017) and points to two key features when describing the relationship between two countries: the size of each and geographic proximity. Together, the model suggests the greatest trade will occur between two large, proximate nations. Similarly, it has been used to describe knowledge transfer mechanisms based on the size and distance between two knowledge domains (Montobbio & Sterzi, 2013; Picci, 2010; Seliger, 2016). Turning to evolutionary economic geography, the bulk of studies use average relatedness of a new technology, which captures the “proximity” of a new technology to those already present in a region, when describing the new technology's fit to the region. This, however, misses the importance of different current technologies to the region's knowledge space, i.e. it leaves out the “size” dimension. Our new measure of entry-relatedness incorporates both the size and proximity dimensions when describing the potential fit for a new technology.

Second, a new indicator labelled “entry-potential” that indicates the potential ability to create new knowledge in a region is proposed. Whereas traditional entry-relatedness indicators used in the relevant literature, e.g. relatedness density (Boschma et al., 2015), only consider the match of new technological knowledge to the existing knowledge base of a region, the novel entry-potential measure aims to capture the potential of future linkages between technologies. Intuitively, while two technologies may be rarely combined in a region, if they are frequently combined in other locations this indicates the potential for future matches. As such, the new entry may well fit into the region's future knowledge space, which accounts for the evolution of recombinations, even if it only makes a tenuous match to the current knowledge space. In summary, it is important to consider how the new entry fits into the potential knowledge of a region. In order to capture this, indices based on co-occurrence network measures of the technological knowledge domains contained in patent documents are calculated. Thus, this new indicator allows us to capture the potential value of new technological knowledge that enters a specific region in a forward-looking way that differs from what has been done previously in the relevant literature (see Appendix 1 for an overview and contrast of some established measures employed in evolutionary diversification studies and how our proposed measures differ).

Using an integrated dataset of the European Patent Office (EPO) PATSTAT database and the European Regional Database running from 1981 to 2015, we then construct our two novel entry-relatedness and entry-potential measures across EU-15 Metro and non-Metro regions and compare these across locations to determine which are the most fertile for technology entry. Further, we consider the evolution of the measures over time to see how the relative competitiveness of regions has changed. In doing so, we compare our measures with the others common to the literature to illustrate the benefits of ours, especially when describing changes over time. Finally, we estimate the relationship between entry-relatedness and entry-potential and regional inventive output.

In the next section, we review the literature on regional knowledge spaces and related topics to put our proposed measures in context. Section 3 provides an overview of our data and a detailed discussion of how we construct the entry-relatedness and entry-potential indicators. Section 4 then uses the constructed values to describe in detail how they compare across regions and over time. Section 5 presents estimates of the relationship between the entry-relatedness and entry-potential measures and regional inventive output. Section 6 concludes.

2 Theoretical framework of regional knowledge entry

2.1 Entry-relatedness and knowledge gravity model

Given the predominant role of technology in productivity and growth, governments have long sought ways to encourage the production and adoption of new technologies as a way of achieving growth. As noted above, this is typically done at a national or even sub-national level because of the importance of local knowledge for the introduction of new innovations. A key implication of this is that, given the uneven distribution of knowledge and technological capabilities across regions, it will be difficult for lagging locations to catch up to those with a more established knowledge base (Feldman & Kogler, 2010). With that in mind, policy makers have concentrated on closing the technological gap by focussing on specific technologies which are likely to flourish in a region, a strategy known as smart specialization (Heimeriks & Balland, 2016). In particular, smart specialization tailors itself to the specifics of a region's existing knowledge space within an evolutionary framework (McCann & Ortega-Argilés, 2015). In this regard, European regions’ smart specialization strategies have been discussed in the context of a region's related technological diversification (Santoalha, 2019a, 2019b). As a result, smart specialization recognizes that the technological gap between regions is not just a simple function of the relative sizes of the local knowledge stock but also of differences in their specialisations. Thus, the identification of core competencies and the knowledge base of a region is important for establishing a competitive innovation strategy for regional growth precisely because it avoids a “one-size fits all” approach (Tödtling & Trippl, 2005).

Taking this into account, it is necessary to consider how entry in a particular knowledge domain is likely to interact with the knowledge already in the region in order to predict its impact on the creation of new knowledge. One way of measuring this is relatedness (Rigby, 2015). Similar to how product development or export patterns build on those already in a locality (Hidalgo et al., 2007), the ability of new knowledge to drive meaningful change in a regional economy depends on the possibility (or lack thereof) of recombining it with that already present. With that in mind, relatedness measures the cognitive distance between technological knowledge domains, as indicated by the co-occurrence of the domains utilized in the development of novel products and processes created in a local economy. This is then typically averaged across technologies to form average relatedness, with the expectation that being cognitively near existing technologies means that the new technology has greater potential for generating additional inventions (Kogler et al., 2013). Indeed, technological relatedness at the regional level is regarded as a driving force for technological change (Boschma et al., 2015).

One limitation of this measure, however, is that it only considers cognitive distance and not the importance of the relevant technologies. For instance, if both the existing and the entering knowledge components are small, the likelihood of a fusion which results in a new innovation is low regardless of how close the two technologies are. Conversely, even if two technologies are rarely combined, a significant amount of activity in each increases the chance of a fruitful interaction. Thus, our measure labelled “regional Knowledge Entry Relatedness”, which considers both proximity and size, may do a better job of reflecting the potential impact of the entry of a new technology into a region.

We are not the first to apply a gravity model approach to economic phenomena, with the model particularly common to international economic topics. Indeed, several examples of gravity in innovation can be found. For example, Picci (2010) and Montobbio and Sterzi (2013) used a gravity framework to analyse the international inventor collaborations finding that, as predicted, collaborations are more common between larger countries that are closer to one another. Similarly, Seliger (2016) studied knowledge flows measured as forward citations in a knowledge gravity model using “technological distance” finding yet again that technological distance inhibits knowledge flows. Using a somewhat different approach, Keller (2002) finds that international R&D expenditure spillovers decline rapidly with the distance between regions. With these in mind, it seems natural to anticipate that the likelihood of successful recombinations of a new and an existing technology is increasing in the size of each while falling in their cognitive distance, an insight which motivates our “regional Knowledge Entry Relatedness” (rKER) measure.

While the potential for knowledge entry to lead to new innovation certainly depends on what is currently available in the region, it is important to recall that a region's knowledge space is not static. Indeed, the evolution and change of knowledge is one of its defining characteristics. Like products, technological knowledge also has a technology life-cycle and just as a product passes through development, introduction, maturity and decline, so too does a given technology. Even in decline, technologies matter because new ideas arise from existing ideas in a cumulative interactive process (Weitzman, 1996). As such, today's entry can be the foundation for future entry and recombination. Thus, when describing the potential fit of an entry, it is important to also account for future changes in the knowledge space.

With this in mind, network theory and analysis techniques are useful tools to measure the degree of connectivity of a knowledge component to the overall knowledge network (Kim et al., ; Lee et al., 2018). In particular, by measuring the connectivity of an entry to a reference network that serves as a benchmark for the future evolution of a region's specific knowledge space, these techniques provide a forward-looking understanding of the fit between the entry and the evolving local knowledge space. Network theory has been implemented in various studies to measure the level of connectivity of the target of interest and to explain the relation between network position and economic performance. For example, and concerning technological knowledge, network analysis based on patent technology class co-occurrence has been used to measure a firm’s technological competitiveness and technology convergence capability (Kim et al., 2018). A high network score indicates that a node, i.e. a technological knowledge domain, is highly centralized and plays an important role in connecting other nodes. From a network perspective, highly centralized nodes have a greater possibility of accessing important resources within the system and thus create a competitive advantage compared to other less connected nodes (Kim et al., 2018, 2019; Tseng et al., 2016). Set in a regional perspective, an entry that is well connected to both the current knowledge space and the reference knowledge space is expected to generate more recombination possibilities and thus more regional growth opportunities than its counterpart. To the best of our knowledge, the relevant literature to date has neither considered nor stressed the degree of knowledge components’ connectivity in this way. We therefore propose the new forward-looking “regional Knowledge Entry-Potential” (rKEP) measure which focuses on the potential value of regional knowledge in creating an invention from a recombinant perspective.

3 Data and measurement

In this section, we first describe the data we use and then turn to the specifics of how we construct our two measures of entry fit. Our data draw from two sources: the European Patent Office (EPO) PATSTAT database and the European Regional Database (ERD). To construct the EU-15 knowledge space at the regional level, we collected all patent records from 1981 to 2015 that were invented by at least one inventor who resided in one of the EU-15 regions at the time of invention (the full list of regions is from Eurostat). To geo-locate patents produced by multiple inventors, we apply fractional inventor allocation, as is standard (Kogler et al., 2017). Furthermore, and also following common practice, we use 5-year windows as our time unit (Ahuja, 2000; Gilsing et al., 2008; Henderson & Cockburn, 1996; Podolny & Stuart, 1995; Stuart & Podolny, 1996). These longer windows are particularly suitable when considering evolutionary questions because they permit sufficient time for systemic changes in knowledge spaces to manifest (Kogler, 2016). Thus, we have seven time periods: period 1 (1981–1985), period 2 (1986–1990), period 3 (1991–1995), period 4 (1996–2000), period 5 (2001–2005), period 6 (2006–2010), and period 7 (2011–2015). The PATSTAT data are also used to construct the measure of knowledge creation (the number of new patents). The ERD provides all the other socio-economic information. The full list of variables is covered in the following section.

3.1 Regional knowledge entry

The first step in measuring regional knowledge entry-relatedness (rKER) and entry-potential (rKEP) is to capture each region’s knowledge entry. Instrumental in this regard are the Cooperative Patent Classification (CPC) classes listed on individual patent documents. As shown in Fig. 1, three steps are needed to capture knowledge entry by region: (1) a regional CPC table, (2) a regional revealed comparative advantage (RCA) table, and (3) a regional entry matrix. First, a regional CPC table for each period is constructed. Here, the share of each CPC sub-class is computed to control for the weight of each technological knowledge domain that features in a patent application. For instance, in Fig. 1 if patent A contains CPC x and CPC y, and patent B contains CPC z, 0.5 is assigned to each of CPC x and y, and 1 is assigned to CPC z. In this case, the contribution of CPC z to a patent application is bigger than that of CPC x or y because it is the sole domain utilized for the invention.
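The first step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `regional_cpc_table`, the input layout and the region label are hypothetical, each patent is assumed to split a total weight of 1 equally across its distinct CPC sub-classes, and fractional inventor allocation is ignored.

```python
from collections import defaultdict

def regional_cpc_table(patents):
    """Sum fractional CPC weights per (region, CPC).

    `patents` is a list of (region, [cpc, ...]) pairs. Each patent
    contributes a total weight of 1, split equally across its CPCs.
    """
    table = defaultdict(float)
    for region, cpcs in patents:
        unique = sorted(set(cpcs))  # ignore repeated codes on one patent
        for cpc in unique:
            table[(region, cpc)] += 1.0 / len(unique)
    return dict(table)

# Hypothetical toy data mirroring the example in the text; assigning
# both patents to a region called "A" is an assumption.
patents = [
    ("A", ["x", "y"]),  # patent A: 0.5 to x and 0.5 to y
    ("A", ["z"]),       # patent B: 1.0 to z
]
table = regional_cpc_table(patents)
```

Summing these weights by region and period would give one regional CPC table per five-year window.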

Fig. 1 Process of measuring regional knowledge entry

The second stage is to capture regional RCA values (Balassa, 1965). The RCA, also commonly known as the location quotient, is computed by dividing the share of a given CPC in a region by the share of that CPC in all regions. For instance, in Fig. 1, the RCA of CPC x in region A is measured by dividing the share of CPC x in region A (100/155) by the share of CPC x in regions A and B (110/185). The measured RCA is converted into dichotomized values (1 for above 1 and 0 for below 1) to simplify the comparison used in the following stage.
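As a rough sketch, the RCA step could be computed as below. The names `rca_table` and `dichotomize` are hypothetical. The counts for CPC x and the regional totals (155 and 30) follow the worked example in the text, while the counts for CPC y are filler values chosen only to reproduce those totals. Treating an RCA of exactly 1 as 0 is also an assumption, since the text only covers values above and below 1.

```python
def rca_table(counts):
    """Compute RCA for every (region, CPC) pair with a positive weight.

    `counts` maps (region, cpc) to a non-negative weight.
    RCA = (share of the CPC in the region) / (share of the CPC overall).
    """
    region_total, cpc_total, grand_total = {}, {}, 0.0
    for (region, cpc), weight in counts.items():
        region_total[region] = region_total.get(region, 0.0) + weight
        cpc_total[cpc] = cpc_total.get(cpc, 0.0) + weight
        grand_total += weight
    return {
        (region, cpc): (weight / region_total[region]) / (cpc_total[cpc] / grand_total)
        for (region, cpc), weight in counts.items()
        if weight > 0  # positive weight guarantees non-zero denominators
    }

def dichotomize(rca):
    """Map RCA values to 1 (above 1) or 0 (otherwise)."""
    return {key: 1 if value > 1 else 0 for key, value in rca.items()}

# x-counts and totals follow the text; y-counts are hypothetical filler.
counts = {("A", "x"): 100, ("A", "y"): 55, ("B", "x"): 10, ("B", "y"): 20}
rca = rca_table(counts)
binary = dichotomize(rca)
```

With these inputs, `rca[("A", "x")]` equals (100/155)/(110/185), which is above 1, so region A is treated as specialized in CPC x.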

In the third stage, regional knowledge entry is measured by comparing regional RCAs for a given CPC-region between two consecutive periods. Thus, the knowledge entry indicator in period t + 1 is 1 when the RCA of CPC x in region A switches from 0 in t to 1 in t + 1.
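The comparison between consecutive periods can be illustrated as follows. The function name `knowledge_entry` is hypothetical, and treating a region-CPC pair that is absent from a period as having a binary RCA of 0 is an assumption made for this sketch.

```python
def knowledge_entry(binary_prev, binary_next):
    """Flag entry: binary RCA switches from 0 in period t to 1 in t + 1.

    Both arguments map (region, cpc) to a dichotomized RCA value.
    Pairs missing from a period are treated as 0 (an assumption).
    """
    keys = set(binary_prev) | set(binary_next)
    return {
        key: int(binary_prev.get(key, 0) == 0 and binary_next.get(key, 0) == 1)
        for key in keys
    }

# Hypothetical dichotomized RCA values for region A in two periods.
period_t = {("A", "x"): 0, ("A", "y"): 1}
period_t1 = {("A", "x"): 1, ("A", "y"): 1, ("A", "z"): 1}
entry = knowledge_entry(period_t, period_t1)
```

Here CPC x and CPC z count as entries in region A, whereas CPC y, which was already a specialization in period t, does not.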

3.2 Regional knowledge entry-relatedness (rKER)

The idea of regional knowledge entry-relatedness (rKER) is to analyse the degree of relatedness of the technologies that are entering a region. The overall process of measuring rKER is presented in Fig. 2. First, the technological distance between CPCs is computed for each period, based on the relatedness between two different CPCs derived from the co-occurrence matrix (Balland, 2017). Then, knowledge entry-relatedness is computed by applying the knowledge gravity model to the technologies that “entered” in each period. Based on the knowledge gravity model, each region’s knowledge relatedness (KR) is measured as follows:

$${\text{KR}}_{ijr}^{t} = \frac{{F_{ir}^{t} *F_{jr}^{t} }}{{\left( {Dist_{ij}^{t} } \right)^{2} }}$$
(1)

where $F_{ir}^{t}$ is the amount of CPC i used in patent applications in region r at time t, and $Dist_{ij}^{t}$ is the technological distance between CPC i and CPC j at time t, computed on the entire reference region, i.e. the EU15 knowledge space. Finally, regional knowledge entry-relatedness (rKER) is as follows:

$$rKER_{r}^{t} = \frac{{\mathop \sum \nolimits_{i \in ENTRY} \mathop \sum \nolimits_{j \in INCUM} {\text{KR}}_{ijr}^{t} }}{n}$$
(2)

where n is the total number of CPCs that are included in the knowledge entry (ENTRY) and tech-incumbent (INCUM) sets. For region r at time t, ENTRY is the list of CPCs that are newly introduced and INCUM is the list of CPCs whose status has not changed (the technology is neither newly introduced nor abandoned). Since not all technology components are entering the technology pool of the region, regional knowledge entry-relatedness is measured only for CPC pairs in which one CPC satisfies the entry condition and the other is an incumbent. As shown in Fig. 2, in region B, CPC y–CPC x and CPC x–CPC z are the only cases where regional knowledge entry-relatedness is non-zero.
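Eqs. (1) and (2) can be illustrated with a minimal sketch. Several choices here are assumptions rather than the authors' implementation: the function names, the use of precomputed distances as input (how distance is derived from relatedness is not spelled out in this section), and the reading of n as the number of distinct CPCs in the union of ENTRY and INCUM. Distances must be strictly positive.

```python
def knowledge_relatedness(f_i, f_j, dist_ij):
    """Gravity-style relatedness of Eq. (1): F_i * F_j / Dist_ij^2."""
    return (f_i * f_j) / dist_ij ** 2

def rker(amounts, dist, entry, incum):
    """Regional knowledge entry-relatedness of Eq. (2) for one region.

    amounts: {cpc: amount of the CPC used in the region}
    dist:    {frozenset({cpc_i, cpc_j}): technological distance > 0}
    entry, incum: collections of CPC codes (ENTRY and INCUM sets)
    """
    n = len(set(entry) | set(incum))  # assumed reading of n
    if not entry or not incum:
        return 0.0
    total = 0.0
    for i in entry:
        for j in incum:
            total += knowledge_relatedness(
                amounts[i], amounts[j], dist[frozenset((i, j))]
            )
    return total / n

# Hypothetical toy values for a single region and period.
amounts = {"x": 4.0, "y": 2.0, "z": 3.0}
dist = {frozenset(("x", "y")): 2.0, frozenset(("x", "z")): 1.0}
score = rker(amounts, dist, entry={"x"}, incum={"y", "z"})
```

In this toy case the two pairs contribute 4·2/2² = 2 and 4·3/1² = 12, so the score is 14/3: the larger and closer incumbent dominates, which is the "size" dimension that average relatedness leaves out.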

Fig. 2 Process of measuring regional knowledge entry-relatedness

3.3 Regional knowledge entry-potential (rKEP)

In this section, the regional knowledge entry-potential is calculated using a patent network analysis based on the recombinant approach (Fig. 3). First, for each period a CPC co-occurrence network is constructed in which CPCs are the nodes and patents form the edges. This network is undirected by design, reflecting the absence of direction between CPCs, and weighted by the usage frequency of individual CPCs. From this CPC co-occurrence network, the value of each knowledge domain (CPC) is measured along three network centrality indices referring to the three main aspects of knowledge potential: connectivity, linking power, and influence. As with the rKER measure, this is done using the entire EU15 knowledge space as the baseline network.
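A co-occurrence network of this kind could be built as in the sketch below. The function name `cooccurrence_network` is hypothetical, and weighting each edge by the number of patents that list both CPCs is one plausible reading of the weighting described above, not a confirmed detail of the method.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_network(patent_cpcs):
    """Return {(cpc_i, cpc_j): weight} for an undirected, weighted network.

    Each patent that lists both CPCs adds 1 to the weight of that edge.
    """
    edges = Counter()
    for cpcs in patent_cpcs:
        # Sorting gives every unordered pair a single canonical key.
        for i, j in combinations(sorted(set(cpcs)), 2):
            edges[(i, j)] += 1
    return dict(edges)

# Hypothetical toy data: one list of CPC codes per patent.
patent_cpcs = [["x", "y"], ["x", "y", "z"], ["z"]]
edges = cooccurrence_network(patent_cpcs)
```

Patents with a single CPC add no edges, so they leave the network structure unchanged in this sketch.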

Fig. 3 Process of measuring regional knowledge entry-potential

Connectivity, which is measured by the usage frequency of a CPC in different patents, is an important indicator for the recombination of knowledge domains. From a network perspective, a node with greater connectivity is more likely to be involved in more connections. If a certain technology has been frequently used with other technologies, it is plausible to think of it as a key technology that can be easily used for a new invention. Similarly, a technology linked to many other technologies has greater potential for creating innovation through its rich connectivity. Accordingly, the connectivity of knowledge is measured with degree centrality, which calculates the importance of a node by summing the weights of its edges (Kim et al., 2018, 2019; Lee & Kim, 2018).
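For a weighted, undirected network, degree centrality in this sense reduces to summing the edge weights attached to each node. The sketch below assumes a hypothetical edge dictionary keyed by CPC pairs and leaves out any normalization.

```python
def weighted_degree(edges):
    """Sum the weights of all edges attached to each node.

    `edges` maps (cpc_i, cpc_j) to a weight in an undirected network.
    """
    degree = {}
    for (i, j), weight in edges.items():
        degree[i] = degree.get(i, 0) + weight
        degree[j] = degree.get(j, 0) + weight
    return degree

# Hypothetical toy network: x co-occurs twice with y and once with z.
edges = {("x", "y"): 2, ("x", "z"): 1}
degree = weighted_degree(edges)
```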

Another important technical feature is linking power, which refers to the degree to which a technology brokers between heterogeneous technologies. In a network, a node that connects different clusters plays an important role in information transfer. Similarly, a technology that bridges heterogeneous technologies is valuable for creating new innovations by linking them. For this purpose, betweenness centrality is used to measure the linking power of an individual technology in our CPC co-occurrence network (Kim et al., 2018, 2019; Lee & Kim, 2018).

Lastly, influence is measured with eigenvector centrality. While the two indices mentioned earlier evaluate how well such technologies are linked or contribute to the connection with other technologies, eigenvector centrality takes into account the value of the connected technologies. By doing this, the weight of the edges is differentiated by the value of the connected nodes, which allows us to capture the quality of the connected technology. For instance, if a technology is more relevant to an important technology, there is an advantage that it can be used with that technology, which can be seen as a greater influence within the network.
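To make the idea concrete, eigenvector centrality can be approximated by power iteration, as in the sketch below. The function name, the toy network and the stopping rule are illustrative assumptions. The iteration uses the adjacency matrix plus the identity, a standard shift that keeps the same leading eigenvector while avoiding oscillation on bipartite networks.

```python
import math

def eigenvector_centrality(edges, iterations=1000, tol=1e-12):
    """Approximate eigenvector centrality on a weighted, undirected network.

    `edges` maps (node_i, node_j) to a positive weight.
    """
    nodes = sorted({node for pair in edges for node in pair})
    if not nodes:
        return {}
    neighbours = {node: [] for node in nodes}
    for (i, j), weight in edges.items():
        neighbours[i].append((j, weight))
        neighbours[j].append((i, weight))
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # One step of (A + I): own score plus weighted neighbour scores.
        new = {
            node: score[node] + sum(w * score[m] for m, w in neighbours[node])
            for node in nodes
        }
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {node: value / norm for node, value in new.items()}
        if max(abs(new[node] - score[node]) for node in nodes) < tol:
            return new
        score = new
    return score

# Hypothetical toy network: a triangle x-y-z with a pendant node w on x.
edges = {("x", "y"): 1, ("x", "z"): 1, ("y", "z"): 1, ("w", "x"): 1}
influence = eigenvector_centrality(edges)
```

In the toy network, w scores lowest even though it is attached to the most central node, while y and z benefit from being linked both to x and to each other.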

The three aspects of knowledge potential measured by the network centrality indices each capture the value of a node, but from a different angle. Treating them as three dimensions, the knowledge potential (eKP) of an individual CPC i is measured by the Euclidean norm of its normalized degree centrality, betweenness centrality, and eigenvector centrality:

$$eKP_{i,EU}^{t} = \sqrt {\left( {N.deg_{i}^{t} } \right)^{2} + \left( {N.btw_{i}^{t} } \right)^{2} + \left( {N.eig_{i}^{t} } \right)^{2} }$$
(3)
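Eq. (3) can be sketched as follows, taking the three centrality scores as given inputs. The normalization method is not specified in this section, so min-max scaling is an assumption, as are the function names and the toy centrality values.

```python
import math

def min_max(values):
    """Scale a {key: value} mapping to the unit interval."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {key: 0.0 for key in values}
    return {key: (value - lo) / (hi - lo) for key, value in values.items()}

def knowledge_potential(degree, betweenness, eigenvector):
    """Eq. (3): Euclidean norm of the three normalized centralities."""
    n_deg = min_max(degree)
    n_btw = min_max(betweenness)
    n_eig = min_max(eigenvector)
    return {
        cpc: math.sqrt(n_deg[cpc] ** 2 + n_btw[cpc] ** 2 + n_eig[cpc] ** 2)
        for cpc in degree
    }

# Hypothetical centrality scores for three CPCs.
degree = {"x": 3, "y": 2, "z": 1}
betweenness = {"x": 1.0, "y": 0.0, "z": 0.0}
eigenvector = {"x": 0.8, "y": 0.5, "z": 0.2}
ekp = knowledge_potential(degree, betweenness, eigenvector)
```

Under min-max scaling the score ranges from 0 to the square root of 3, which a CPC reaches only if it tops all three rankings.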

As mentioned earlier, we include the EU subscript to acknowledge that knowledge potential is measured using the EU region as a baseline. The rest of the variable's construction is comparable to that above, i.e. the knowledge potential of CPC i in region r (KP) is the product of the CPC’s EU-level knowledge potential (eKP) and its RCA value in that region (Eq. 4). Then, the regional knowledge entry-potential (rKEP), the average KP of the entered technologies in each region, is obtained by dividing the summed KP by the number of entered technologies, as described in Eq. (5).

$$KP_{ir}^{t} = eKP_{i,EU}^{t} * RCA_{ir}^{t}$$
(4)

$$rKEP_{r}^{t} = \frac{\sum_{i \in ENTRY} KP_{ir}^{t}}{\sum_{i \in ENTRY} I\left( RCA_{ir}^{t} \right)}$$
(5)
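Eqs. (4) and (5) can be combined in a short sketch for a single region and period. The function name `rkep` and the toy values are hypothetical, and reading I(·) as an indicator that equals 1 when the RCA is above 1 is an assumption consistent with the dichotomization described in Sect. 3.1.

```python
def rkep(ekp, rca, entry):
    """Regional knowledge entry-potential for one region and period.

    ekp:   {cpc: EU-level knowledge potential}
    rca:   {cpc: RCA of the CPC in this region}
    entry: iterable of CPC codes in the ENTRY set
    """
    entry = list(entry)
    kp = {cpc: ekp[cpc] * rca[cpc] for cpc in entry}       # Eq. (4)
    n_entered = sum(1 for cpc in entry if rca[cpc] > 1)    # assumed I(RCA)
    return sum(kp.values()) / n_entered if n_entered else 0.0  # Eq. (5)

# Hypothetical toy values for two entering CPCs.
ekp = {"x": 1.5, "y": 0.5}
rca = {"x": 2.0, "y": 1.2}
potential = rkep(ekp, rca, entry=["x", "y"])
```

Because entry requires the dichotomized RCA to switch to 1, the denominator equals the number of entered technologies, so the result is the average KP across entries (1.8 in the toy case).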

4 Regional knowledge entry-relatedness and entry-potential in EU15 regions

In this section, we compare and contrast the rKER and rKEP measures across EU-15 regions. To observe the changes in both indicators, we plot them first for the period 1986–1990 (Fig. 4) and then for 2011–2015 (Fig. 5). For the sake of legibility, only metropolitan regions are included in both figures. Further, to improve the presentation, we use the log of each variable and normalize it to the unit interval. In both figures, a region’s level of rKER and rKEP places it in one of four quadrants depending on whether the respective scores are higher or lower than 0.5. For instance, a region located in the first quadrant is one whose rKER and rKEP scores are both above 0.5, meaning that it has good conditions for creating both related technologies and potential technologies for knowledge recombination.
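The transformation and quadrant assignment can be sketched as below. The function names and region labels are hypothetical, min-max scaling after taking logs is an assumed reading of "normalize to the unit interval", and scores of exactly 0.5 are assigned to the lower side by assumption. The axis convention (rKER horizontal, rKEP vertical) follows the later remark that rKEP is above 0.5 in the first and second quadrants.

```python
import math

def log_min_max(values):
    """Take logs of positive values, then scale to the unit interval."""
    logged = {key: math.log(value) for key, value in values.items()}
    lo, hi = min(logged.values()), max(logged.values())
    if hi == lo:
        return {key: 0.0 for key in logged}
    return {key: (value - lo) / (hi - lo) for key, value in logged.items()}

def quadrant(rker_score, rkep_score, threshold=0.5):
    """Quadrant with rKER on the horizontal and rKEP on the vertical axis."""
    if rkep_score > threshold:
        return 1 if rker_score > threshold else 2
    return 4 if rker_score > threshold else 3

# Hypothetical raw scores for three regions labelled P, Q and R.
rker_raw = {"P": 100.0, "Q": 5.0, "R": 1.0}
rkep_raw = {"P": 1.0, "Q": 2.0, "R": 50.0}
rker_norm = log_min_max(rker_raw)
rkep_norm = log_min_max(rkep_raw)
positions = {r: quadrant(rker_norm[r], rkep_norm[r]) for r in rker_raw}
```

Since the scaling is relative to the regions in a given period, a region's quadrant can change between periods even if its raw scores do not.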

Fig. 4 Position of EU15 Metropolitan regions in rKER and rKEP (1986–1990). Notes: All values are logged and normalized to range between 0 and 1. Labels of some regions are not included to avoid overlap.

Fig. 5 Position of EU15 Metropolitan regions in rKER and rKEP (2011–2015). Notes: All values are logged and normalized to range between 0 and 1. Labels of some regions are not included to avoid overlap.

In 1986–1990, larger regions appear to perform strongly in rKER. Large metro regions like Paris and Frankfurt are often regarded as hubs for innovation, potentially because of their concentration of the necessary resources and knowledge. As such, these regions have excellent conditions for creating new inventions and can generate a greater number of technologies when compared to others. Further, and central to our discussion of the path-dependency of knowledge evolution, the new technologies entering the region are likely to be related to the existing technologies. This then increases the relative probability of related technologies entering large metro regions.

On the other hand, relatively smaller regions tend to have higher values of the rKEP measure. Compared to their larger counterparts, these smaller metro regions arguably house fewer resources and technologies, while by the same token they should have greater opportunities to adopt new specialisations, i.e. add knowledge domains. For instance, in those regions, new technologies could have been developed based on comparably weak local technological foundations, suggesting an independence from local conditions. This independence from local resources and know-how may point to a greater ability to develop an innovation unrelated to the existing technologies. Furthermore, potential technologies for knowledge recombination were especially predominant during the 1980s and 1990s. Since our focus is on technologies ripe for knowledge recombination, this helps explain why larger metro regions might have followed paths of existing specializations rather than adopting new and thus potential, albeit risky, technologies. In this sense, what we observe here could be a reflection of both the condition of smaller regions in general and the trend of technological development during the observed period.

Lastly, it is notable that not a single region displays competitiveness in both regional knowledge entry-relatedness and regional knowledge entry-potential. In other words, not a single region had both rKER and rKEP values above 0.5 in 1986–1990. However, we observe more regions with a high regional knowledge entry-relatedness score compared to the other dimension of interest, but nevertheless the bulk of regions had scores below 0.5 in both dimensions. This points to the importance of regions with very high values along one or the other dimension.

Comparing these findings to the 2011–2015 period (Fig. 5), some interesting changes become evident. First, more metro regions show greater rKER and rKEP values. As observed in Fig. 5, more regions are located in the first and second quadrants, where rKEP is above 0.5. In particular, the importance of potential technologies seems to have increased due to the development of information and communication technologies (ICT). Throughout the years, ICT has been widely used in various technological fields and has thus fostered a converging process among different technological domains. This indicates that potential technologies are already embedded and actively developed in most regions, as opposed to being the focus of smaller regions, or even outliers, as observed in the initial time period.

Second, we also detect a greater dispersion across regions in terms of their values along the two dimensions, including three regions (Helsinki, Milano and Lille) whose rKER and rKEP measures are both above the 0.5 threshold. Nevertheless, the tendency of larger regions towards higher values of rKER, and similarly that of smaller regions towards higher values of rKEP, as observed in the initial time period, seems to persist. Thus, although larger regions may continue to have an advantage engendered by their large stocks of current technologies (i.e. high rKER scores), our introduction of the rKEP measure points to smaller regions’ growing ability to attract new technologies to their fertile landscapes. By using both measures, we therefore find reasons to look towards the “democratisation of innovation”, in which innovation is widespread across locations rather than concentrated in a few large, dominant regions. These measures also aid in evaluating the differing and changing aspects of knowledge creation for each region.

5 Research model and econometric analysis

5.1 Research model

In this section, we econometrically examine the relationship between the two dimensions of knowledge entry, i.e. rKER and rKEP, and regional innovative performance. For this purpose, a multivariate regression analysis is conducted by employing the following research model (Eq. 6):

$$\begin{aligned} Y_{it + 1} = & {\upbeta }_{0} + {\upbeta }_{1} \left( {{\text{ENT}}.{\text{Rel}}_{it} } \right) + {\upbeta }_{2} \left( {{\text{ENT}}.{\text{Pot}}_{it} } \right) + {\upbeta }_{3} \left( {{\text{ENT}}.{\text{Rel}}_{it} \times {\text{ENT}}.{\text{Pot}}_{it} } \right) \\ & + {{\varvec{\upbeta}}}_{4} {\varvec{Z}}_{{{\varvec{ij}}}} + {{\varvec{\upbeta}}}_{5} {\varvec{P}}.{\varvec{FE}}_{{\varvec{t}}} + {{\varvec{\upbeta}}}_{6} {\varvec{R}}.{\varvec{FE}}_{{\varvec{i}}} + u_{it} \\ \end{aligned}$$
(6)

where i indexes the regions in the EU15 and t indexes the period (here we continue to work with the five-year windows defined earlier).
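To make the structure of Eq. 6 concrete, the following is a minimal sketch, not the estimation code used in this paper. It assumes a balanced panel, omits the control vector Z, and absorbs the region and period fixed effects with a two-way within transformation before running OLS on the two entry measures and their interaction. The function names (`two_way_demean`, `solve`, `fit_fe`) and the toy coefficients are hypothetical and chosen only for illustration.

```python
import random


def two_way_demean(values, n_regions, n_periods):
    """Within transformation for a balanced panel: x_it - x_i. - x_.t + x_.."""
    grand = sum(sum(row) for row in values) / (n_regions * n_periods)
    region_mean = [sum(row) / n_periods for row in values]
    period_mean = [sum(values[i][t] for i in range(n_regions)) / n_regions
                   for t in range(n_periods)]
    return [[values[i][t] - region_mean[i] - period_mean[t] + grand
             for t in range(n_periods)] for i in range(n_regions)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fe(y, regressors, n_regions, n_periods):
    """OLS slopes after removing region and period fixed effects."""
    def flat(v):
        demeaned = two_way_demean(v, n_regions, n_periods)
        return [x for row in demeaned for x in row]

    yd = flat(y)
    xd = [flat(x) for x in regressors]
    xtx = [[sum(p * q for p, q in zip(xa, xb)) for xb in xd] for xa in xd]
    xty = [sum(p * q for p, q in zip(xa, yd)) for xa in xd]
    return solve(xtx, xty)


# Toy data (made up): y[i][t] stands for the outcome in the period that
# follows the one in which the regressors are measured.
rng = random.Random(0)
N, T = 30, 6
rel = [[rng.random() for _ in range(T)] for _ in range(N)]
pot = [[rng.random() for _ in range(T)] for _ in range(N)]
inter = [[rel[i][t] * pot[i][t] for t in range(T)] for i in range(N)]
region_fe = [rng.gauss(0, 1) for _ in range(N)]
period_fe = [rng.gauss(0, 1) for _ in range(T)]
y = [[-2.0 * rel[i][t] + 3.0 * pot[i][t] + 1.5 * inter[i][t]
      + region_fe[i] + period_fe[t] for t in range(T)] for i in range(N)]

beta = fit_fe(y, [rel, pot, inter], N, T)
print([round(b, 6) for b in beta])
```

Because the toy outcome contains no noise, the within estimator recovers the illustrative slopes up to floating-point error. With real data, one would add the controls and use a panel estimator that also reports standard errors.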

Our dependent variable is exploratory innovation, a measure that captures the introduction of new technologies (Gilsing et al., 2008; Guan & Liu, 2016). This is measured by comparing the technological profiles of a region between two consecutive periods, and by counting the total number of patents containing new technology classes (subclass CPCs) that did not exist in the previous period (Footnote 7). This is used in levels so that we can capture impacts of the knowledge entry measures on the total amount of entry into new areas of technological knowledge domains.
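The counting rule described above can be sketched as follows. This is an illustrative reading rather than the exact procedure used here: it assumes that "previous period" means the immediately preceding five-year window, that a patent counts once if it lists at least one CPC subclass absent from the region's profile in that window, and that periods are coded as consecutive integers. The function name `exploratory_innovation` and the toy records are hypothetical.

```python
from collections import defaultdict


def exploratory_innovation(patents):
    """Count, per (region, period), patents listing a CPC subclass that was
    absent from the region's profile in the preceding period.

    patents: iterable of (region, period, cpc_subclasses) records.
    """
    # Technological profile: all subclasses observed per region and period.
    profile = defaultdict(set)
    for region, period, cpcs in patents:
        profile[(region, period)].update(cpcs)

    counts = {}
    for region, period, cpcs in patents:
        previous = profile.get((region, period - 1))
        if previous is None:
            continue  # no baseline period to compare against
        key = (region, period)
        counts.setdefault(key, 0)
        if not set(cpcs) <= previous:  # at least one subclass is new
            counts[key] += 1
    return counts


# Toy records (made up for illustration only).
patents = [
    ("Helsinki", 0, {"H04L", "G06F"}),
    ("Helsinki", 1, {"H04L"}),
    ("Helsinki", 1, {"G06F", "A61K"}),
    ("Helsinki", 1, {"C12N"}),
    ("Milano", 0, {"B60K"}),
    ("Milano", 1, {"B60K"}),
]
result = exploratory_innovation(patents)
print(result)  # {('Helsinki', 1): 2, ('Milano', 1): 0}
```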

Turning to our explanatory variables, we use lagged values in order to mitigate endogeneity. Our primary variables of interest are regional knowledge entry-relatedness (rKER) and regional knowledge entry-potential (rKEP), both of which are constructed as described above (Footnote 8). We consider both of these measures as well as their squared values to control for possible non-linearities; further, since it seems reasonable that current and future fit can reinforce one another, we also include their interaction.

We also include additional controls, denoted by a vector Z, that are likely to affect regional growth, including patents per inventor (PAT.Inv) to control for regional patenting productivity, GDP to capture the economic size of the region, and population (POP) and the employment ratio in the manufacturing sector (EMP.m) to control for the regional industry portfolio. To further reduce the potential for omitted variable bias, we also include time fixed-effects (P.FE) and region fixed-effects (R.FE). The fixed-effects specification was selected based on Hausman test results. Detailed information on the data utilized is presented in Table 1. Table 2 presents the correlation values of all variables together with simple statistics and variance inflation factor (VIF) results. To understand the regional heterogeneity of our key variables, Table 3 presents the descriptive statistics of exploratory innovation, regional knowledge entry-relatedness, and regional knowledge entry-potential by region and by country. For all variables, dispersion (standard deviation, SD) was greater at the regional level, which shows that regional heterogeneity exceeds country-level heterogeneity.
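For readers less familiar with the VIF diagnostic, its standard textbook definition for regressor j is given below, where $R_{j}^{2}$ denotes the coefficient of determination obtained from regressing regressor j on all other regressors (this notation is introduced here for illustration only):

$$\begin{aligned} {\text{VIF}}_{j} = \frac{1}{1 - R_{j}^{2}} \end{aligned}$$

A value of 1 indicates that regressor j is uncorrelated with the other regressors, and larger values indicate stronger multicollinearity. Commonly used rules of thumb flag values above 5 or 10, although such cut-offs are conventions rather than formal tests.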

Table 1 Variable descriptions (N = 4816)
Table 2 Correlation and descriptive statistics
Table 3 Descriptive statistics of key variables by region and country

5.2 Regression results

The regression results are shown in Table 4. As the Breusch-Pagan test indicated possible heteroskedasticity, all results are reported with robust standard errors. Specifications (1) and (2) present rKER and rKEP, respectively, together with their quadratic terms. Specification (3) includes both measures, and specification (4) adds their interaction term. In all models, our control variables, including patents per inventor (PAT.Inv), GDP per capita (GDP), and population (POP), show positive and significant effects on exploratory innovation.
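As an illustration of why robust standard errors matter under heteroskedasticity, the sketch below compares the classical and the heteroskedasticity-consistent (White, HC0) standard error of the slope in a simple one-regressor model. It is a deliberately simplified stand-in: the variant of robust standard errors used for Table 4 is not specified here, and the function name `slope_standard_errors` and the toy data are hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical SE, HC0 robust SE) for y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical variance assumes one common error variance for all points.
    classical_var = sum(e * e for e in resid) / (n - 2) / sxx
    # HC0 weights each squared residual by its own leverage term instead.
    robust_var = sum((xi - x_bar) ** 2 * e * e
                     for xi, e in zip(x, resid)) / sxx ** 2
    return slope, math.sqrt(classical_var), math.sqrt(robust_var)


# Toy data (made up): errors are largest in the middle of the x range.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
errors = [1.0, 0.0, -2.0, 0.0, 1.0]
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, errors)]

slope, se_classical, se_robust = slope_standard_errors(x, y)
print(slope, round(se_classical, 4), round(se_robust, 4))
```

In this toy example the two standard errors differ noticeably even though the slope estimate is the same, which is the reason inference is based on the robust version when a test such as Breusch-Pagan suggests non-constant error variance.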

Table 4 Panel regression result

For exploratory innovation, both the linear and quadratic terms of rKER show negative and significant coefficients, implying that rKER hampers the creation of new inventions. This interpretation should, however, be made in light of our dependent variable, which measures a region's entry into new, hitherto absent, technological specializations. As such, the estimates can be explained by the path-dependency of knowledge, or via the preferential attachment effect (Sun & Liu, 2016). Preferential attachment describes the tendency of new nodes to attach preferentially to existing nodes (Barabási & Albert, 1999). New technologies are more likely to be related to the existing technologies in a region, and while this can give a “cumulative advantage”, it may at the same time act as a barrier to branching out into areas that were not previously present in the region.

rKEP, on the other hand, is significantly and positively related to exploratory innovation, albeit only in a linear fashion. High-potential technologies are those with a greater advantage in knowledge recombination. In other words, these technologies have a competitive advantage in connecting different technologies, including even those already in the existing technology pool. Thus, entry of one new technology which is highly compatible with other technologies that are also currently absent may act as an avenue for attracting those additional missing competencies (Kim et al., 2018; Tseng et al., 2016). As such, a region with greater entry-potential values is also one more likely to succeed in exploratory innovation.

As shown in column (3), these patterns hold when including both entry-relatedness and entry-potential in the same model. Finally, column (4) introduces the interaction term. Doing so does not affect the individual coefficients and the interaction itself shows a positive and significant effect. This is most likely because when a new technology relates to what is already present (high entry-relatedness) as well as what is not (high entry-potential) it can act as a bridge between those two sets of technology. This then fosters even more innovation.
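One way to read the interaction term, taking Eq. 6 as written (i.e. leaving aside the quadratic terms), is through the marginal effects it implies:

$$\begin{aligned} \frac{{\partial Y_{it + 1} }}{{\partial {\text{ENT}}.{\text{Rel}}_{it} }} = & {\upbeta }_{1} + {\upbeta }_{3} \, {\text{ENT}}.{\text{Pot}}_{it} \\ \frac{{\partial Y_{it + 1} }}{{\partial {\text{ENT}}.{\text{Pot}}_{it} }} = & {\upbeta }_{2} + {\upbeta }_{3} \, {\text{ENT}}.{\text{Rel}}_{it} \\ \end{aligned}$$

With a negative estimate for β1 and positive estimates for β2 and β3, the negative association between entry-relatedness and exploratory innovation is weaker in regions with higher entry-potential, while the positive association between entry-potential and exploratory innovation is stronger in regions with higher entry-relatedness.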

6 Conclusion

In order to form effective policies that promote knowledge production and subsequent innovative outcomes that generate economic value, it is important to have evidence-based projections of which kinds of knowledge domains are most likely to fit and succeed in a regional context. Since knowledge stays primarily local and new technologies tend to build on those that came before, this creates a need for measures of the fit between a potential new (knowledge) entry and the existing capabilities of a region. In this study, we present two new measures of that fit. Relative to the existing measures (Appendix 1), ours account for the relative importance of a region’s existing competencies (entry-relatedness) as well as the potential for the future evolution of the region's knowledge space (entry-potential).

We then proceed by using data from EU-15 regions from 1981 to 2015 to examine changes in our proposed measures. Doing so suggests a trend towards the “democratization of innovation”, i.e. rather than technological growth being driven solely by a few large centers, a greater share of regions exhibits significant innovation potential as measured by regional knowledge entry-relatedness and/or entry-potential. We then conclude by examining the impact of our measures on regional diversification. In terms of entry-potential, we find that a higher value points towards more diversification in the technological knowledge domains in a region’s patent portfolio. Although higher entry-relatedness has the opposite effect (potentially due to crowding out of unexplored and/or underperforming areas of the knowledge space), higher entry-relatedness seems to enhance the positive effect of entry-potential. This suggests that a new entry in one period can serve as a bridge between current competencies and the future entry of new technologies. This points to the dynamic process of innovation in which inventions build on inventions, i.e. the path-dependency and cumulative nature of knowledge production at the regional scale, an insight that is at the heart of Evolutionary Economic Geography inquiry (Kogler, 2016). We therefore believe that these new measures prove a useful tool in describing knowledge trajectories and mechanisms of regional diversification, and thus aid the development of more effective smart specialization strategies.