1 Introduction

Artificial intelligence (AI) is increasingly shaping complex dynamics in the technological and regional landscape. One specific feature of AI is its general-purpose nature, making this new technology applicable across multiple industries, activities, and geographical contexts (Audretsch et al., 2022; Brem et al., 2021; Klinger et al., 2018). As a General-Purpose Technology (GPT), AI is affected by the external environment, including cumulated local knowledge, collective intelligence, and distinctive regional assets, and is transforming the spirit and scope of entrepreneurship (Nambisan, 2017; von Briel et al., 2018).

The entrepreneurship literature has devoted much attention to the local environment where entrepreneurial activities occur (Audretsch, 1995; Audretsch et al., 2012; Fritsch, 1997; Schumpeter, 1939). Entrepreneurial activities and their local context are involved in a two-way relationship. On the one hand, previous studies have linked entrepreneurial activities to economic development at the regional level, showing that new firms are key drivers of regional growth and regional employment dynamics (Audretsch, 2003; Kogler et al., 2023; Van Praag & Versloot, 2007), thanks to their ability to generate and combine different pieces of knowledge that lead to the development of innovative products and services (Ghio et al., 2016). On the other hand, scholars have highlighted that the local context plays a crucial role in innovation and entrepreneurship dynamics (Audretsch, 2003; Shane, 2003). In this context, several studies have identified the local socio-economic features that may affect new venture creation and explain the local distribution of startups (Acosta et al., 2011; Cavallo et al., 2018).

Within the latter stream of the literature, the university has been identified as a key driver of entrepreneurship. More precisely, recent works point out that the knowledge generated by universities positively impacts the creation of new firms at the local level (Acosta et al., 2011; Bonaccorsi et al., 2013). They demonstrate that universities' and firms' proximity is an essential driver for innovation and entrepreneurship dynamics (Peña-Vinces & Audretsch, 2021). Indeed, the geographical closeness between universities and firms facilitates knowledge exchange due to its cumulative, localized, and tacit nature (Antonelli, 1995). As articulated by the Knowledge Spillover Theory of Entrepreneurship (henceforth KSTE) (Acs et al., 2009; Audretsch, 1995), universities produce knowledge that can remain uncommercialized.

Consequently, an opportunity to start a new firm is generated to exploit and commercialize that knowledge. The different scientific specialization of universities has a diverse effect on new firm creation at the local level (Bonaccorsi et al., 2013) that depends on the knowledge inputs required by those firms and on the solutions to complex problems that different knowledge provides. Indeed, knowledge is not considered a homogeneous good but the outcome of a combinatorial search activity (Fleming, 2001; Fleming & Sorenson, 2001; Weitzman, 1998). In this framework, it is reasonable to expect that universities are likely to affect the shaping of entrepreneurial ecosystems, depending on the specializations of their scientific research. In addition, the presence of research activities in specific disciplines can influence the local entrepreneurial environment in terms of the availability of resources (e.g., talent, capital, etc.) for startups. Overall, universities with a diverse scientific research base can provide a unique set of opportunities for startups in their local environment. By leveraging their research capabilities, universities can provide a greater variety of resources and expertise to help foster the creation of startups. This suggests that, in turn, universities may play a leading role in the technological specialization of the regions where they are located and thus contribute to the development of new knowledge and competencies in the local ecosystem (Caviggioli et al., 2022, 2023; Colombelli et al., 2021; Fagerberg & Srholec, 2023). This is particularly true in the case of artificial intelligence-based technologies, given the general-purpose nature of AI. Indeed, artificial intelligence-based technologies require knowledge and expertise beyond the realm of computer science. 
Some important areas of knowledge necessary to successfully develop artificial intelligence-based technologies include mathematics, engineering, statistics, psychology, philosophy, cognitive science, and neuroscience (Bostrom, 2014; Russell & Norvig, 2010).

Studies on AI and its benefits for entrepreneurship have confirmed that this technology offers entrepreneurs new opportunities (Obschonka & Audretsch, 2020) and have emphasized that it can influence entrepreneurs' strategic choices (Chalmers et al., 2020). At the same time, the strand of literature that has sought to combine AI education with entrepreneurship is well represented by the work of Giuggioli and Pellegrini (2022). This strand confirmed that the application of AI is essential for simulating contexts related to the real environment, enabling a more community-oriented approach to the study and practice of entrepreneurship (Ratten, 2020). In addition, learning is increasingly technology-based, with modern AI-based solutions increasingly integrated to update teaching and learning (Tarabasz et al., 2018). Regarding education in AI, Khalid's (2020) study showed that students are more willing to learn entrepreneurial activities in universities where AI learning is offered, thus confirming that AI plays a key role in fostering entrepreneurial activities at the university level.

However, despite the increasing attention on the intersection of AI and entrepreneurship at a macro level, the literature has mainly focused on the positive effects of this technology on business performance (Lasi et al., 2014) and venture creation (Chalmers et al., 2020), stating that AI and big data can function as external enablers of new entrepreneurial activities (Giuggioli & Pellegrini, 2022). Scholars have neglected to study the role played by local knowledge and skills in fostering AI-based entrepreneurship. Our work aims to shift the focus to a step prior to the use and effects of AI by understanding whether different knowledge specializations of universities foster the creation of Artificial Intelligence startups. Although attention to universities' competencies in AI development is growing (Bouslama, 2020; Hannigan et al., 2021), to the best of our knowledge, no previous studies have investigated the effect of university knowledge specialization on the formation of AI-based firms.

This article aims to fill this gap by studying how university knowledge influences the development of AI technology through the formation of AI-based startups. The study investigates the local distribution of innovative AI startups and the effect that the knowledge of the local universities has on favoring the creation of this kind of new venture. More specifically, it analyses the effect of different combinations of university knowledge specializations on AI startup creation at the local level. In light of the KSTE (Acs et al., 2009), the study addresses the following research questions: (1) Do universities' knowledge specializations affect the creation of AI-based startups? (2) Do the complementarity and interaction of different university knowledge specializations influence AI startup creation?

To answer our research questions, the empirical analysis is focused on Italian NUTS 3 regions between 2017 and 2020. Data on innovative Italian startups have been collected by taking advantage of the policy reform developed in Italy (Italian Startup Act, Law 221/2012). AI-based startups use Artificial Intelligence technologies as the core of their business model. These differ from other startups in the complexity of the technology and the range of applications and markets they target. AI startups are focused on developing AI applications or services such as natural language processing, computer vision, and machine learning. This contrasts with other types of startups, which may focus on creating products or services that do not necessarily leverage AI technology. Unlike other technology-based firms, AI startups do not have user-based platforms or other business lines that enable them to collect large amounts of data. AI startups are typically more capital-intensive than other types of startups, as they require significant investments in technology, talent, and infrastructure (Abis & Veldkamp, 2020; Rock, 2019; Wu et al., 2020). In fact, AI startups require computing power, human capital, and data availability (Bessen et al., 2022). For computing power, AI startups rely on IT assets (Jin & McElheran, 2017). AI startups rely on the market for human capital expertise and specific datasets since startups may lack needed proprietary and knowledge resources, like specific training data, to apply their general-purpose technology to a specific problem or sector.

To identify AI-based startups within the population of innovative Italian startups, we adopted an original methodology based on a set of AI keywords and consisting of a two-step approach that combines web scraping and content analysis techniques. Keyword-based search has been used in recent work to identify and classify AI-related areas, for example AI-related searches (Chowdhury et al., 2022) and AI-related trademarks (Nakazato & Squicciarini, 2021).

Finally, the dataset has been complemented with information on universities' knowledge specializations at the local level.

The contribution of the paper to the literature is threefold. First, we complement the existing KSTE literature by providing empirical evidence about the strategic role of universities as sources of knowledge exploitable for local development: diverse university knowledge specializations are among the input factors required to favor new firm creation and the consequent technological improvement at the local level. Second, we investigate the antecedents of the creation of AI technologies, filling the gap related to the generation of AI from an entrepreneurial perspective. Third, we develop and apply a classification methodology for identifying AI-related startups.

The results confirm our expectations about the need for knowledge that is not solely IT-based. The analysis confirms that combining local computer science knowledge with competencies in specific application domains fosters the emergence of AI startups.

The remainder of the paper is structured as follows. The theoretical framework underpinning the analysis is provided in Sect. 2. The original classification method is presented in Sect. 3. The data and the methodology used are in Sect. 4. Section 5 presents the results. In Sect. 6, the discussion and conclusion summarize the analysis results and explore the contribution and implications.

2 Theoretical framework and hypothesis

The academic literature on entrepreneurship and regional economics has highlighted the strong links between creating new innovative firms and the regional context (Colombelli, 2016; Vivarelli, 2013). On the one hand, a vast body of literature has emphasized startups' key role in bringing innovations and introducing new technologies onto the market. Their role is even more relevant when new radical technologies are involved (Aghion & Howitt, 1992; Audretsch et al., 2006; Carree & Thurik, 2006). In this context, efforts to systematically link entrepreneurship to economic development at the regional level have shown that new firm formation is a determinant of regional growth, interregional differences, and regional employment dynamics (Dejardin & Fritsch, 2011; Feldman et al., 2005; Fritsch & Schindele, 2011). Entrepreneurial activities are to be considered among the primary agents of change and innovation; the creation of startups is one of the most critical forms through which new technologies, such as AI, are generated.

On the other hand, starting from the consideration that entrepreneurial activity is geographically clustered, both theoretical and empirical studies have been conducted in an attempt to identify the characteristics and attributes of the local socio-economic systems that may have an impact on the new firm formation (Bartik, 1985; Carlton, 1983; Feldman, 2001; Fritsch, 1997; Reynolds et al., 1994). Within this strand, the literature has highlighted the importance of local knowledge spillovers in the entrepreneurial process. An essential reference in this domain is the KSTE. This theory was conceptualized to articulate the link between knowledge spillovers and new firm formation (Acs et al., 2009, 2013; Audretsch, 1995; Audretsch & Lehmann, 2005).

According to the KSTE, new knowledge and ideas are key sources of new entrepreneurial opportunities (Acs et al., 2009; Audretsch & Lehmann, 2005; Szerb et al., 2013). In an environment filled with new ideas and knowledge that, for a variety of reasons related to cognitive inertia, lack of capacity, or risk aversion, cannot be commercially exploited by existing firms or universities, new venture creation is a way to exploit the opportunities generated by that new knowledge and ideas. In other words, the KSTE suggests that the startup of a new firm is an endogenous response to opportunities that have been generated but not fully exploited by incumbent organizations.

The knowledge involved in the innovation process may be generated from various sources: organized research carried out in universities, activities in the R&D divisions of corporations by individual researchers, and observation of processes or experience. Previous works grounded in the KSTE have highlighted the importance of creating new firms located in areas characterized by the presence of organizations that hold or generate knowledge (Bonaccorsi & Daraio, 2007; Cavallo et al., 2021; Colombelli, 2016). Within this stream of the literature, the geographical proximity of universities to industrial areas is considered a key facilitator of the exchange of knowledge between local firms and academia (Bonaccorsi & Daraio, 2007; Cohen et al., 2002; Del Bosco et al., 2021). These contributions have revealed that university specialization can shape regional branching processes and affect the generation of knowledge in new domains (Caviggioli et al., 2022; Colombelli et al., 2021). Universities can play a fundamental role in regional specialization processes because they are key sources of new knowledge, which can be transferred to the local ecosystem through a variety of channels (D’Este & Patel, 2007). First, universities feed the local ecosystem with highly educated and skilled individuals, contribute to skill upgrading through lifelong learning programs, and attract talent to the local ecosystem (Bonaccorsi et al., 2023; Bramwell & Wolfe, 2008; D’Este & Patel, 2007; Lehmann et al., 2022). Moreover, universities promote the diffusion of an entrepreneurial culture among students and academics and stimulate the creation of new firms within the ecosystem (Carree & Thurik, 2006; Giones et al., 2022; Shane, 2003; Zucker et al., 1998).

In light of these arguments, the concept of "university knowledge specialization" describes how the specialization of universities in specific scientific domains may affect the technological specialization of that region and the development of new businesses based on cutting-edge technologies, such as AI. In other words, universities specializing in particular scientific domains and disciplines transfer knowledge and competencies to the regions they belong to, and, as a result, the regions specialize in related knowledge domains (Bonaccorsi et al., 2013). In line with these arguments, we formulate the following hypothesis:

H1

University knowledge specialization is positively associated with creating AI startups in the focal region.

Although AI is highly topical and of interest to a wide and varied range of stakeholders, the literature on AI and entrepreneurship is still scarce. The academic debate related to AI is mainly at the conceptual level, focusing on the several applications of AI and how this new technology may affect and change medicine and neuroscience (Hassabis et al., 2017; Secinaro et al., 2021), cognitive sciences and human resources (Collins & Bobrow, 2017; Yafooz et al., 2020), or engineering and technology in general (Kakkar et al., 2021; Uraikul et al., 2007; Zang et al., 2015). This wide range of applications suggests the complexity behind this new technology and, consequently, the need to involve different types of knowledge and skills. Scholars have argued that implementing AI applications requires a high level of technical expertise (Chalmers et al., 2020). Moreover, they have noted a significant skill gap in the key jobs necessary to implement AI systems (Marr, 2018). On the one hand, the knowledge and skills required for the development and application of Artificial Intelligence concern programming skills, computational thinking (Lin et al., 2021), and all those compatible with various educational strategies in engineering education, maker learning, project-based learning, and problem-oriented learning (Navghane et al., 2016), as well as software and hardware development skills (Hao et al., 2021).

On the other hand, computational ability needs to be supported by other transversal competencies. Considering the numerous applications of AI technologies, some devices can execute a role that typically involves human interpretation and decision-making. These techniques have an interdisciplinary approach and can be applied to different fields, such as medicine and health (Secinaro et al., 2021). In light of this, it can be said that the development of artificial intelligence applications is not only linked to knowledge in computer science, but it requires a combination of multiple and numerous skills from different fields.

This argument is in line with the recombinant knowledge approach (Fleming & Sorenson, 2001; Weitzman, 1998). Previous works within the KSTE-based literature have pointed out that the size of the knowledge stock and its nature is of some significance. These studies, focusing on the effects that the heterogeneous nature of knowledge has on the formation of new firms, have revealed that the generation of new knowledge is the result of a recombinant process (Bishop & Gripaios, 2010; Colombelli, 2016; Colombelli & Quatraro, 2018; Colombelli et al., 2020; Fisher et al., 2017).

In line with these arguments, we propose that:

H2

University knowledge specialization in computer science is positively associated with creating AI startups in the focal region.

As already mentioned, the development of artificial intelligence applications requires combining many varied skills from different areas of knowledge. Computer science skills must be supported and complemented by non-computer science skills, such as mathematics, natural science, etc. (Bostrom, 2014). Based on this, we formulated our third hypothesis:

H3

The interaction between university specialization in computer science and non-computer science university knowledge specializations is positively associated with creating AI startups in the focal region.

2.1 The Italian context

Italy is an interesting case study for understanding how knowledge transfer works from universities to industrial and regional contexts (Bigliardi et al., 2015; Colombelli et al., 2021; Grimaldi et al., 2021). This nation has several small and medium-sized towns, a sparse number of large cities, connected industrial regions, and clusters centered on middle- and high-tech industries (Lazzeroni & Piccaluga, 2015). Each of these industrial clusters has developed specific competencies and know-how that have given it a competitive advantage and strengths in terms of knowledge. These logics have been the basis for the wave of European policies related to the Smart Specialization Strategy (S3). The objective of S3 is to prioritize sectors and economic activities where regions or countries have a competitive advantage or have the potential to promote knowledge-driven growth to support and cope with the changes that the economy and society will face. The aim of this place-based approach is to promote assets and resources available in a well-defined district or region and support the identified priorities for knowledge-based investments, playing an important role in economic development and technological innovation.

In Italy, clusters are often characterized by the presence of universities and public research organizations interacting with local industries (Lazzeroni & Piccaluga, 2015), especially considering that a significant part of the Italian economy is based on locally-born and internationally-grown industries (Camuffo & Grandinetti, 2011; Grimaldi et al., 2021). Italy thus provides a particularly interesting context for understanding how universities transfer knowledge to regional clusters (Agasisti et al., 2019). Changes in recent decades have made academic institutions and the Italian Government more inclined and proactive toward the mechanism of technology transfer, encouraging visible processes that trigger knowledge sharing and exchange between universities and businesses, startups, and industry.

In this context of territorial promotion and revitalization through technology diffusion, there is room for the recent global trend of public policy development in digital technologies, within which Artificial Intelligence plays a central role. The need for public policy formulation has also emerged in the context of the European Union's multilevel governance, which since 2018 has required each member state to draw up its own national AI strategy (European Commission, 2018). Italy, too, is drafting its own Strategic Plan for Artificial Intelligence (MISE, 2020) with the intention of promoting the development of AI in the national business fabric by implementing numerous industrial policies on AI, also thanks to the involvement of experts capable of supporting this transition (Italian Artificial Intelligence Institute). According to the 'Strategic Programme on Artificial Intelligence' (2021), the Italian AI ecosystem consists of four categories of actors: the research community, knowledge transfer centers, technology and solution providers, and private and public user organizations. This ecosystem is in line with international peers (Germany, France, and Great Britain) in terms of research quality and output.

3 Methodology

To explore how university specializations can support the birth of artificial intelligence startups, we carried out empirical research based on regression analysis. The data collection concerns the classification of AI startups and information about the universities. To analyze the venture creation phenomenon, we collected data related to AI startups. We analyzed the "Innovative startups" registered at the “Registro Imprese” of the Italian Chambers of Commerce between 2013 and 2020 in 110 Italian NUTS3 regions. Innovative Italian startups must be registered in the innovative firms section of the Italian "Registro Imprese". With the aim of identifying the Italian startups related to AI, we propose a classification methodology based on a double approach that combines a top-down and a bottom-up process. The two-step process is needed to identify, among the Italian startups, those clearly related to AI, since this information is not available in the "Registro Imprese." Our initial sample includes 12,106 innovative startups founded in Italy between 2013 and 2020 collected from the business register (“Registro Imprese”) of the Italian Chambers of Commerce.

3.1 The top-down process

From the above-mentioned sample of 12,106 startups, we selected the ones targeting Artificial Intelligence. To achieve this goal, we built a web scraper in the Python programming language that is able to connect to websites and retrieve their HTML code. We then ran the code and saved all the startups' websites. A total of 2,225 websites were not reachable due to various server-side errors (e.g., page not existing or not reachable); thus, the corresponding startups were discarded. From the 9,881 remaining startups, we only selected the ones related to AI. We achieved this goal by setting a list of keywords, and we kept only the websites containing at least one of them. For the top-down process, we considered a list of 72 selected keywords that referred to AI technology and its domains of application (Samoili et al., 2020); we enriched that list with the translation into Italian of all keywords and added other keywords taken from the literature (see Appendix). A total of 9,128 websites were discarded because they did not contain at least one keyword from the list. The resulting 753 startups were further filtered according to the following criterion: the website must include a clear and explicit reference to AI technology. The result is a sample of 521 AI startups.
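The keyword filter at the core of the top-down process can be illustrated with a minimal sketch. It is not the scraper used in the study: the function names, the four-item keyword list, and the two toy pages are hypothetical, and the whole-phrase, case-insensitive matching rule is an assumption made for illustration. The sketch strips the markup from already-downloaded HTML and keeps a startup when at least one keyword appears in the visible text. The counts reported above are consistent with this funnel: 12,106 − 2,225 = 9,881 reachable websites, and 9,881 − 9,128 = 753 websites with at least one keyword.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of a page, lower-cased and whitespace-normalized."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split()).lower()


def matched_keywords(html, keywords):
    """Return the subset of keywords found as whole words or phrases in the page."""
    text = html_to_text(html)
    hits = set()
    for keyword in keywords:
        # Assumption: a keyword counts only when not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(keyword.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            hits.add(keyword)
    return hits


# Hypothetical keyword list (the study uses 72 keywords plus Italian translations).
KEYWORDS = [
    "artificial intelligence",
    "intelligenza artificiale",
    "machine learning",
    "computer vision",
]

# Hypothetical, already-downloaded pages keyed by startup identifier.
pages = {
    "startup_a": "<html><body><h1>Machine Learning for retail</h1></body></html>",
    "startup_b": (
        "<html><body><p>Handmade furniture</p>"
        "<script>var x = 'machine learning';</script></body></html>"
    ),
}

# Keep a startup when its website contains at least one keyword.
candidates = {name for name, html in pages.items() if matched_keywords(html, KEYWORDS)}
```

In this toy example only `startup_a` is retained, because the keyword in `startup_b` appears inside a script rather than in the visible text. Candidates selected this way would still require the manual check for a clear and explicit reference to AI described above.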

3.2 The bottom-up process

To strengthen the classification and check the results obtained from the top-down process, we used the startups resulting from the top-down process to identify new possible keywords through the bottom-up approach based on the startups' websites. Using the Nvivo software, we ran a text analysis on the 521 startup websites. From this process, we obtained new keywords, and after a manual check of them, we obtained a new list of 272 AI keywords in English and Italian. Using this new list of keywords, we processed the 9,881 startups' websites again to identify those startups related to AI. Then we identified 995 startups related to AI; those were further manually filtered to check whether the websites contained a clear and explicit reference to AI technology. Then, to verify if the startups obtained from the bottom-up process were new AI startups or not, we compared them with the 521 AI startups that had already emerged from the top-down approach, checking for matches. The aim of the bottom-up process is to check and validate the top-down process, also enriching the list of startups and keywords referred to the AI application domains.

A final sample of 532 startups was obtained (521 from the top-down process plus 11 new startups from the bottom-up process), so the bottom-up process confirmed the results obtained from the first process and enriched the sample. This process has been verified by the authors to ensure its reliability (Figs. 1, 2).
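The matching step between the two processes amounts to a set comparison, which the following minimal sketch illustrates. The function name and the startup identifiers are hypothetical; the sketch assumes each startup carries a unique identifier shared by both lists.

```python
def merge_samples(top_down_ids, bottom_up_ids):
    """Combine both classifications and report which startups are new.

    Returns (final_sample, new_from_bottom_up).
    """
    top_down = set(top_down_ids)
    bottom_up = set(bottom_up_ids)
    # Startups found only by the bottom-up process enrich the sample.
    new_from_bottom_up = bottom_up - top_down
    final_sample = top_down | new_from_bottom_up
    return final_sample, new_from_bottom_up


# Hypothetical identifiers standing in for startup records.
top_down = {"s001", "s002", "s003"}
bottom_up = {"s002", "s003", "s004"}

final_sample, added = merge_samples(top_down, bottom_up)
```

With the figures reported above, the same logic gives 521 + 11 = 532 startups in the final sample.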

Fig. 1 Top-down process

Fig. 2 Bottom-up process

4 Data

4.1 The dependent variable

The study sample includes an original database of 384 AI startups. Starting from the 532 startups obtained from the classifications, we geo-localized them and matched this information with NUTS3 level codes; then, we noticed that about 80% of them were founded between 2017 and 2020, the years in which AI technologies started to take hold in Italy. We decided to take into consideration only these startups to implement our analysis. This appeared as an appropriate context for the study for different reasons. First, Italy has a reasonably well-developed local university system, so the role of university knowledge is expected to be of particular importance in the creation of innovative startups. Second, the Italian economy seems to be advanced in mature sectors and in line with peers in terms of research quality and output of AI. However, this ecosystem fares less well when it comes to business spending on R&D, patents, and AI applications compared to other countries (Italian Government, 2021). This study, therefore, allowed us to test to what extent the relationship between the creation of innovative startups and technological knowledge, and beyond, is shaped by the regional technical context. Bonaccorsi et al. (2013) and Colombelli and Quatraro (2017) have recently assumed that new firms in local contexts could be interpreted as count data. A similar approach has been followed here, and the yearly count of the new AI startups in each province has been used as the dependent variable. Figure 3 shows the distribution of both the number of AI startups and the ratio of AI startups to the total number of startups in the Italian NUTS 3 regions for the entire observed period. The figure also shows that the total number of innovative startups in the Italian regions during the period is trivially concentrated in certain regions. 
To better understand the distribution, we must also consider how AI startups are distributed in relation to the total number of startups present in the region. Comparing the figure on the left-hand side with the right-hand side, we can see that the number of provinces belonging to the third tertile is slightly higher on the right-hand side (21) than on the left (19), also involving different provinces. This suggests that, in part, the AI startup phenomenon occurs in regions where the number of startups is already high, i.e., those provinces that move from the third tertile (darker color in the figure on the left) to the second tertile (lighter color in the figure on the right), such as the provinces of Milan, Rome, and Turin. In other cases, however, these new AI startups are concentrated in areas with a lower absolute 'startup intensity', i.e., those provinces that move from the first or second tertile (in the figure on the left) to the second or third tertile (darker colors in the figure on the right). These are the cases, for instance, of the provinces of Aosta, Novara, Pavia, Arezzo, and Isernia. The presence of AI startups therefore extends to provinces beyond those already known for the massive presence of generic startups. This suggests the need to investigate and understand the presence of local factors influencing the emergence of AI startups.
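As an illustration of how a count-type dependent variable of this kind can be assembled, the sketch below builds a balanced province-year panel from a list of founding events. It is only a sketch under stated assumptions: the function name, the province labels, and the toy records are hypothetical, and province-year cells with no new AI startup are assumed to enter the panel with a count of zero.

```python
from collections import Counter
from itertools import product


def yearly_counts(foundings, provinces, years):
    """Return {(province, year): number of new AI startups}, zeros included.

    `foundings` is an iterable of (province, founding_year) pairs; events
    outside the requested provinces or years are ignored.
    """
    observed = Counter(foundings)
    return {(p, y): observed[(p, y)] for p, y in product(provinces, years)}


# Hypothetical founding events; the 2015 record falls outside the window.
foundings = [
    ("Milano", 2017),
    ("Milano", 2018),
    ("Milano", 2018),
    ("Torino", 2019),
    ("Torino", 2015),
]
provinces = ["Milano", "Torino", "Aosta"]
years = range(2017, 2021)  # 2017 to 2020 inclusive

panel = yearly_counts(foundings, provinces, years)
```

Each of the 3 × 4 = 12 cells of the toy panel holds a non-negative integer, which is the structure that count-data models expect.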

Fig. 3 Maps

4.2 Regional specialisation index

Data collection relies on the European Tertiary Education Register (ETER database), which collects information on all Higher Education Institutions across Italy. Our analysis involved all higher education institutions, such as universities (Ph.D. awarding), as well as universities of applied sciences (Polytechnics), Colleges of Arts and Music, and “Scuole superiori” (Footnote 1). Our selection criteria were the following: the institution needs to be active in the reference year of the analysis (2016), and data must be available on the number of students enrolled, following the “Bologna” levels of education (in this case, we consider levels 5 to 7). Our selection includes institutes whose activities can be classified into the following nine FOE (fields of education): Education (i.e., educational sciences); Arts and Humanities (i.e., arts, history, linguistics, philosophy, and psychology); Social Science (i.e., social and behavioral sciences-economics, political sciences and civics, psychology, sociology, and cultural studies- and journalism and information); Business (i.e., Business and administration-accounting and taxation, finance, banking and insurance, management and administration, marketing and advertising- and law); Natural Sciences (i.e., Natural Sciences, Mathematics, and Statistics-biological and related sciences, environment, physical sciences, mathematics, and statistics-); Computer and Information Sciences; Engineering; Agricultural Sciences; and Medical Sciences. A more detailed description of the fields is provided in Table 1.

Table 1 Fields of education

The article aims to measure the impact of multiple university specializations on the creation of artificial intelligence startups in Italy. Accordingly, we used the Regional Specialization (RS) index as a measure of knowledge specialization, based on the number of students enrolled in each education field. The variable students reflects the number of students enrolled at the beginning of the academic year and is based on the count of students enrolled at ISCED (International Standard Classification of Education) levels 5 to 7 by academic discipline (field of education). To measure the regional specialization, we considered the students enrolled in the universities located in region i for specialization j (j = {Education, Arts, Information, Business, Sciences, ICT, Engineering, Agricultural, Medicine}) using the Balassa index (Balassa, 1965; Bonaccorsi et al., 2013).

$$RS_{i,j} = \frac{{stud_{i,j} }}{{\mathop \sum \nolimits_{j} stud_{i,j} }} \times \left( {\frac{{\mathop \sum \nolimits_{i} stud_{i,j} }}{{\mathop \sum \nolimits_{i,j} stud_{i,j} }}} \right)^{ - 1}$$

Here, $stud_{i,j}$ is the number of students enrolled in all the universities located in region i in the academic discipline j. To avoid the asymmetry of the index, we computed its normalized version, NRS, which is symmetric around zero. NRS values range between −1 and +1, with a neutral value at zero. Values higher than zero mean that the universities in region i are more specialized in the academic discipline j than the Italian average, and vice versa when values are lower than zero.

$${\text{NRS}}_{{{\text{i}},{\text{j}}}} = \left( {{\text{RS}}_{{{\text{i}},{\text{j}}}} - {1}} \right)/\left( {{\text{RS}}_{{{\text{i}},{\text{j}}}} + {1}} \right)$$
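
The two indices above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the study: the function name `regional_specialization`, the nested-dictionary layout, and the toy enrolment counts are all assumptions introduced here.

```python
def regional_specialization(students):
    """Compute RS and NRS for every (region, field) pair.

    `students` maps region -> {field: enrolled students} (hypothetical layout).
    """
    # Grand total of students over all regions and fields.
    grand_total = sum(sum(fields.values()) for fields in students.values())

    # Total students per field, summed over regions.
    field_totals = {}
    for fields in students.values():
        for field, n in fields.items():
            field_totals[field] = field_totals.get(field, 0) + n

    rs, nrs = {}, {}
    for region, fields in students.items():
        region_total = sum(fields.values())
        for field, n in fields.items():
            regional_share = n / region_total
            overall_share = field_totals[field] / grand_total
            value = regional_share / overall_share      # Balassa-type RS index
            rs[(region, field)] = value
            nrs[(region, field)] = (value - 1) / (value + 1)  # symmetric around 0
    return rs, nrs


# Toy data (invented for illustration only).
toy = {
    "A": {"ICT": 30, "Medicine": 70},
    "B": {"ICT": 10, "Medicine": 90},
}
rs, nrs = regional_specialization(toy)
print(round(rs[("A", "ICT")], 3), round(nrs[("A", "ICT")], 3))
```

In this toy case region A has an above-average share of ICT students, so its NRS for ICT is positive, while region B's is negative.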

As a proxy for the competencies required to develop AI technologies, we considered the regional specialization in computing, information, and communication technologies (Lin et al., 2021), i.e., $NRS_{IC}$, which we hereafter denote as ICT. We define γ as the set of normalized specialization indices (γ = {Education, Arts, Information, Business, Sciences, ICT, Engineering, Agriculture, Medicine}).

4.3 Control variables

In addition to the key explanatory variables, our model includes several control variables to account for other factors affecting new firm creation at the local level. Table 2 reports the variables used in the empirical analysis. Variables refer to regional and university characteristics. Universities fill the local environment with educated human resources, so we introduced the variable Students as a proxy for this. Universities are characterized by their specializations, represented by the indicators γ. To measure the presence of universities, we introduced the variable UNIVERSITIES, a count of the universities geographically located in region i.

Table 2 Variables

The regional characteristics are captured by firm density and the gross domestic product per capita (GDP), a proxy for the level of industrialization of the region. The presence of potentially high demand at the regional level can influence the decision to start a new business.

The creation of new firms can also be the outcome of the need to escape from unemployment. The unemployment rate (UNEMP) is calculated as the ratio between the number of unemployed people and the number of individuals in the labor force at time t in the region.

We measure the entrepreneurial intention of a region by including the population between 20 and 39 years old in province i (YPOP20_39), as we expect individuals in this age class to have a higher propensity to entrepreneurship (Bonaccorsi et al., 2013; GEM, 2016; Grilli, 2022; Kerr & Glaeser, 2009). In fact, start-uppers typically fall into the 18–39 age group, which is often characterized as highly entrepreneurial, tech-savvy, and willing to take risks. Moreover, agglomeration economies can also stem from the presence of other firms in the same place, which, to some extent, ensures the availability of local markets for intermediate goods. In this context, firm density (FIRMD), calculated as the ratio between the number of registered firms at time t in region i and the land use area, has also been added as a control variable.

To provide a comparative assessment of the performance of innovation systems, we derived from the Regional Innovation Scoreboard (RIS) the RIS innovation index (SCRBOARD) at the NUTS2 level and matched it to the examined NUTS3 provinces. Through the use of RIS, we are able to capture four main local conditions: context conditions, which measure the main drivers of innovation performance external to the firm; investment conditions, which capture the investments made in both the public and the business sector; innovation conditions, which capture the different aspects of innovation in the business sector; and impact conditions, which capture the effects of firms' innovation activities.
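
One way to match a NUTS2-level score to NUTS3 units is to exploit the hierarchical structure of NUTS codes, in which a NUTS3 code extends its parent NUTS2 code by one character. The sketch below assumes this prefix rule; the function name `match_nuts2_to_nuts3` and the score values are hypothetical and are not taken from the RIS.

```python
def match_nuts2_to_nuts3(nuts3_codes, nuts2_scores):
    """Give each NUTS3 unit the score of its parent NUTS2 region."""
    matched = {}
    for code in nuts3_codes:
        parent = code[:4]  # first four characters identify the NUTS2 region
        # None marks provinces whose parent region has no score available.
        matched[code] = nuts2_scores.get(parent)
    return matched


# Invented scores for two NUTS2 regions (illustration only).
scores = {"ITC4": 0.50, "ITI4": 0.45}
provinces = ["ITC4C", "ITC46", "ITI43", "ITF33"]
scrboard = match_nuts2_to_nuts3(provinces, scores)
print(scrboard)
```

All provinces within the same NUTS2 region receive the same value, so SCRBOARD varies only across regions, not within them.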

All the independent variables are lagged to 2016, except for the regional innovation scoreboard, which is lagged to 2015.

Table 3 reports the descriptive statistics concerning the variables used in the analysis.

Table 3 Descriptive statistics

5 Results

To test the hypotheses, we took the 110 Italian provinces as the units of analysis and estimated a multiple regression model in which the dependent variable is the presence of innovative AI startups in each province i.

In our model, the number of AI startups by the NUTS3 region in Italy is the dependent variable, while regional specializations are the key explanatory variables.

Because the dependent variable is discrete and nonnegative, the model is best estimated with count models, which are more appropriate for nonnegative integers. The candidates are a Poisson or a negative binomial model. As the descriptive statistics show over-dispersion in the dependent variable (its standard deviation is larger than its mean), we adopted the negative binomial estimator (Greene, 2003).
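
The over-dispersion argument can be illustrated with a simple descriptive check. A Poisson model implies that the variance of the counts equals their mean, so a sample variance well above the mean points towards the negative binomial. The sketch below is only a rough diagnostic, not a formal test; the function name `overdispersion_summary` and the counts are invented for illustration.

```python
from statistics import mean, variance


def overdispersion_summary(counts):
    """Return (mean, sample variance, variance-to-mean ratio) of count data."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 in the denominator)
    return m, v, v / m


# Hypothetical startup counts for ten provinces: many zeros, one large value.
counts = [0, 0, 0, 1, 0, 2, 0, 0, 15, 0]
m, v, ratio = overdispersion_summary(counts)
print(f"mean={m:.2f} variance={v:.2f} ratio={ratio:.2f}")

# A ratio far above 1 is at odds with the Poisson equal-dispersion assumption.
prefer_negative_binomial = ratio > 1
```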

To test our hypotheses, we estimated three negative binomial regression models. Model 1 is the baseline:

$$\begin{aligned} AI_{i} & = \beta_{1} \times UNIVERSITIES_{i} + \beta_{2} \times UNEMP_{i} + \beta_{3} \times GDP_{i} + \beta_{4} \times FIRMD_{i} + \beta_{5} \times YPOP20\_39_{i} + \beta_{6} \\ & \quad \times SCRBOARD_{i} + \alpha_{i} + \varepsilon_{i,t} \\ \end{aligned}$$

To test our hypothesis, we performed Model 2 using the negative binomial model of regression:

$$\begin{aligned} AI_{i} & = \beta_{\gamma} \times \gamma_{i} + \beta_{1} \times UNIVERSITIES_{i} + \beta_{2} \times UNEMP_{i} + \beta_{3} \times GDP_{i} + \beta_{4} \times FIRMD_{i} + \beta_{5} \times YPOP20\_39_{i} \\ & \quad + \beta_{6} \times SCRBOARD_{i} + \alpha_{i} + \varepsilon_{i,t} \\ \end{aligned}$$

To test the second hypothesis, we estimated Model 3, introducing the interaction between the variable ICT and each of the other indices in γ (with γ = {Education, Arts, Information, Business, Sciences, ICT, Engineering, Agriculture, Medicine}).

$$\begin{aligned} AI_{i} & = \beta_{ICT,\gamma} \times ICT_{i} *\gamma_{i} + \beta_{1} \times UNIVERSITIES_{i} + \beta_{2} \times UNEMP_{i} + \beta_{3} \times GDP_{i} + \beta_{4} \times FIRMD_{i} + \beta_{5} \\ & \quad \times YPOP20\_39_{i} + \beta_{6} \times SCRBOARD_{i} + \alpha_{i} + \varepsilon_{i,t} \\ \end{aligned}$$
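
To make the interaction terms concrete, the sketch below builds the products between the ICT index and each other specialization index for a single province. It assumes the interaction is the simple product of the two NRS values; the function name `add_ict_interactions`, the `ICT_x_` column naming, and the index values are hypothetical.

```python
def add_ict_interactions(row, fields):
    """Return a copy of `row` extended with ICT x field product terms."""
    extended = dict(row)
    for field in fields:
        if field == "ICT":
            continue  # ICT is not interacted with itself
        extended[f"ICT_x_{field}"] = row["ICT"] * row[field]
    return extended


# Invented NRS values for one province (illustration only).
province = {"ICT": 0.2, "Engineering": 0.5, "Medicine": -0.1}
expanded = add_ict_interactions(province, ["ICT", "Engineering", "Medicine"])
print(expanded)
```

Because NRS values lie between −1 and +1, a product term is positive both when a province is above average in the two fields and when it is below average in both, which is worth keeping in mind when reading the sign of an interaction coefficient.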

Column (1) reports the results of the baseline model. The mere presence of UNIVERSITIES in the region shows a non-significant coefficient, so our first hypothesis is not confirmed. The proxy for agglomeration economies, FIRMD, is not significant. The variable YPOP20_39, i.e., the population aged between 20 and 39 years, which captures people with a higher propensity to entrepreneurship, shows the expected positive and significant coefficient. The proxy of regional innovativeness (SCRBOARD) is not significant. Finally, the rate of unemployment (UNEMP) and GDP are not significantly correlated with the creation of AI startups, suggesting that unemployment does not affect the formation of innovative startups (such companies are not subject to the ‘escape from unemployment’ hypothesis). This result indicates that the founders of innovative startups are more likely to be ‘Schumpeterian entrepreneurs’ than ‘necessity entrepreneurs’ (Vivarelli, 2004).

Column (2) reports the second model. In line with the second hypothesis, set out in Sect. 2, the development of artificial intelligence applications is not influenced by the presence of single specializations: none of the variables measuring regional specializations is significant. The coefficient of UNIVERSITIES becomes positive and significant, confirming the first hypothesis. The variable capturing entrepreneurship propensity (YPOP20_39) is still positive and significant. Firm density (FIRMD) and SCRBOARD are not significant. Finally, the rate of unemployment (UNEMP) and GDP are not significantly correlated with the creation of AI startups.

Table 4 also reports the extension of the baseline model, which tests the interaction between ICT and the other specializations. Column (3) shows that the specialization in Information and Computing positively influences the creation of AI startups (the coefficient of the variable referring to universities is positive and significant). The control variables UNEMP and GDP remain positive but not significant. There is no change for the variable YPOP20_39, which remains positive and significant. Considering the interactions between specializations, the coefficient of the interaction between ICT and “Engineering” is positive and significant, as is the coefficient of the interaction between ICT and “Medicine”. The results indicate that the higher the variety in the combination of specializations in the region, the higher the number of innovative AI startups. In other words, an increase in the scope of skills available is likely to favor the creation of new enterprises. This might be because entrepreneurs can try out and experiment with new combinations of technologies available in the local context and distributed across a wide range of skills, such as mixing engineering with computer science or medical knowledge with ICT skills.

Table 4 Results

As a robustness check, we re-estimated the models with the zero-inflated negative binomial (ZINB) model, since it allows us to model settings in which there is an excess of zeros in the dependent variable. An excess of zeros can be due to the overall absence of startups or to a specific lack of AI startups in regions that nonetheless feature a certain degree of innovative startup dynamics. To strengthen the robustness, we ran two analyses with different inflation equations. The first inflation equation is based on a variable (TotStartups) that captures the overall number of innovative startups (irrespective of whether these were AI or not) in each region, for models 1a, 2a, and 3a. The second is based on the variable Universities, for models 1b, 2b, and 3b. Table 5 shows the results of our robustness check. The reported coefficients are marginal effects. Columns 1a and 1b report the results of the basic model, where the coefficient of the number of universities in the region is not statistically significant, so our H1 is not confirmed. The coefficient of the variable UNIVERSITIES becomes significant and positive in models 2a and 2b, confirming our first hypothesis. The coefficient of the variable ICT is positive and statistically significant in model 2a (0.639, p < 0.1), confirming the second hypothesis, but not in model 2b, where it is not statistically significant. In model 3a, the variable ICT is not significant, but it is in model 3b, confirming our second hypothesis. Considering the interactions between the specialization variables, model 3b shows that the coefficient of the interaction between ICT and specialization in Engineering is positive and significant, as is the coefficient of the interaction between specialization in computer science (ICT) and healthcare (Medicine). 
The interaction between ICT and Medicine is positive and significant also in model 3a (2.423, p < 0.1), confirming our H3 again. We note, as in Table 4, that in all the models, the variable describing the presence of a young population between 20 and 39 years is positively correlated with the creation of AI startups. Our results show that a region with a knowledge specialization in ICT and also in Engineering and Medicine favors the creation of AI-based startups. The result that the effect of knowledge specialization in ICT is per se relevant is not surprising. At the same time, the interaction between this and other specializations (Engineering and Medicine) is a significant result. This suggests that the skills needed for AI development are predominantly computer science but that this is not enough for the creation of AI startups. These skills must be complemented by other knowledge, such as engineering and medical knowledge, in this case. One of the main explanations lies in the GPT nature of AI. Being a general-purpose technology, AI requires the presence of knowledge and data about the application domains as a key condition for the creation of new startups.
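
The logic of the zero-inflated negative binomial model can be sketched through its probability mass function: with probability π an observation is a structural zero, and otherwise it is drawn from a negative binomial distribution. The code below assumes the common NB2 parameterization (mean μ, variance μ + αμ²); the text does not state which parameterization was used, and the parameter values are invented for illustration.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial (NB2) probability of count k; requires mu > 0, alpha > 0."""
    r = 1.0 / alpha          # dispersion expressed as the "size" parameter
    p = r / (r + mu)
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated NB: extra mass `pi` at zero, NB scaled by (1 - pi) elsewhere."""
    base = (1.0 - pi) * nb_pmf(k, mu, alpha)
    return pi + base if k == 0 else base


# Hypothetical parameters: mean 2 startups, dispersion 1.5, 30% structural zeros.
p_zero_nb = nb_pmf(0, mu=2.0, alpha=1.5)
p_zero_zinb = zinb_pmf(0, mu=2.0, alpha=1.5, pi=0.3)
total_mass = sum(zinb_pmf(k, mu=2.0, alpha=1.5, pi=0.3) for k in range(2000))
print(p_zero_nb < p_zero_zinb, round(total_mass, 6))
```

In an estimated model, π would itself depend on the inflation variable (TotStartups or Universities in the text), typically through a logit link; here it is held fixed to keep the sketch short.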

Table 5 Robustness check

6 Discussion and conclusion

In the last few years, the entrepreneurship and management literature has stressed the importance of the local environment for startup creation. At the same time, artificial intelligence is receiving growing attention from entrepreneurship researchers, who are investigating several aspects of this technology and its possible implications in different fields of application. In this study, we provided empirical evidence on how the formation of artificial intelligence firms at the local level depends on the scientific specializations of the universities located in the same area. Moreover, we provide detailed evidence on how the co-existence of multiple specializations affects startup creation. So far, the theoretical and empirical literature has devoted little attention to the knowledge and competencies required for AI development. In fact, only a few recent articles have set out to investigate the skills required for AI professionals (Verma et al., 2021) or to understand the competencies needed to exploit AI in existing organizations (Anton et al., 2020; Davenport, 2018). Therefore, a systemic understanding of the knowledge bases in a region could indicate how best to align existing AI technological opportunities with entrepreneurial capabilities in local ecosystems, as the KSTE suggests (Cetindamar et al., 2020; Ferreira et al., 2017). In line with the KSTE, our study identifies the strategic role of knowledge in fostering the creation of new firms within the same region. Furthermore, our study advances the thinking on regional specialization within a specific technological domain and its economic implications and complements it with the consideration of existing multiple specializations. Studying AI startups is essential to gain a better understanding of the creation process of this new and powerful GPT. 
By studying the development of AI startups, we can gain insight into how AI is being used to create new market opportunities and industries, as well as to improve productivity and reduce costs. Additionally, we can learn how AI is being used to drive innovation and explore new products and services. Studying AI startups can help both businesses and researchers to gain a better understanding of the potential of this technology and its implications for the future. As a GPT, AI is pervasive, able to be improved upon over time, and able to spawn complementary innovations (Brynjolfsson et al., 2018; Cockburn et al., 2018). Due to its general-purpose nature, AI is an enabling technology that opens up new opportunities, impacting productivity and labor (Acemoglu & Restrepo, 2018; Agrawal et al., 2018). Furthermore, our assessment could offer several inputs for strategic decision-makers, be they entrepreneurs, policymakers, or universities.

The results suggest that university specializations do not play the same role in promoting business creation for both AI and non-AI startups. As we have noted, regional specialization in computer science and engineering has a non-negligible positive effect on local entrepreneurship. This may indicate that the development of innovative startups needs to be supported by a local environment with specific and technical knowledge, underlining the importance of universities as generators of expertise at the local level. Among the most surprising results of our study is the presence of a significant interaction between the specialization in ICT and the specialization in Medicine. The interpretation of this result can be found in the literature and in published academic results. In fact, numerous pieces of evidence suggest that the link between AI and healthcare is strengthening across a number of different applications, shifting the attention of stakeholders around the world to this new combination. Just as the foundation of startups using AI and working in healthcare is growing, the interest of investors, which translates into numerous investments, is also captured by this phenomenon (Halminen et al., 2019). In this light, the contribution of this work to the literature is threefold. First, our results complement the existing theories on KSTE about the strategic role of universities as sources of knowledge exploitable for local development. In fact, diverse scientific specializations in universities are the input factors required to favor new firm creation and the consequent technological improvement at the local level. This result confirms the increasing engagement of universities and research centers in the process of creation and application of new technologies, such as AI (Tödtling, 1994). 
In addition, universities, by filling the local environment with educated students, enrich the local network from which other players also benefit, taking advantage of geographical proximity and agglomeration. The second contribution concerns the high relevance of investigating AI technologies and the skills and competencies—digital and not—required for their creation and development. As previously stated, the literature related to AI and entrepreneurship is still poor in contributions, and our work contributes to filling the knowledge gap related to the birth and growth of AI. Offering an exploration of the location of AI startups in Italy and how different knowledge specializations trigger business creation mechanisms, we have identified and traced in detail the path of AI, discovering what seems to be a new trend for Italy and beyond: the application of AI in healthcare.

Third, this article establishes a methodology for the operational identification of AI, consisting of a concise taxonomy and a set of keywords characterizing the core and transversal domains of AI. The methodology, which consists of two mutually enriching phases, is based on a taxonomy and keywords that help map the AI ecosystem and the agents operating in it, such as startups in our case. This goal stems from the need to clarify some aspects of AI, as there is no standard definition of what AI actually involves (Samoili et al., 2020).

The results of this analysis also have important policy implications. This study expands our understanding of specific local knowledge specializations by testing their effects on the technological and entrepreneurial generation processes within a region. The effectiveness of knowledge-generating processes within an area can be enhanced by the local promotion of diverse and complementary types of knowledge (such as engineering with ICT or medical with ICT in the generation of AI technology). In this context, regional policymakers should encourage innovation processes based on the fusion of a wide variety of unique but complementary knowledge and skills in order to support the development of new innovative firms. Our findings are in line with the goal of Smart Specialization Strategies. This study identifies regions with a competitive advantage and the potential to promote the emergence of knowledge-driven AI technology. This place-based approach aims to promote the assets and resources available in a well-defined region and to support identified priorities for knowledge-based investments, which play an essential role in economic development and technological innovation.

In the spirit of the call of the present special issue, the article advances our understanding of the role of new technologies as a conduit for entrepreneurial development, entrepreneurship, and value creation. Our work expands the comprehension of the tacit mechanism of knowledge spillovers, focusing on the trajectories through which entrepreneurs absorb knowledge from their regions and then foster economic and societal value by developing and adopting new technology (Nambisan et al., 2017). Indeed, technological innovation has a profound impact on entrepreneurship and venture creation (Elia et al., 2020; Wright et al., 2004); with this contribution, we clarify some key aspects of the dynamics of startup creation and the development of AI technology.

This contribution also has some limitations. An obvious limitation is that not all 110 provinces are included in the study, owing to the absence of universities, or of data on them, in some NUTS3 regions. Nevertheless, the sample covers almost all regions and AI startups in Italy. Furthermore, our work is focused on the analysis of local inputs that favor the development of AI startups.

Further research avenues could focus, at a micro-level, on investigating the impact of different knowledge specializations on startups' growth and post-entry performance, the survival patterns of these startups, or their acquisition by well-established firms in the territory. Further efforts should be devoted to extending the analysis to other countries and to mapping and measuring the existing collaborations between universities and startups.