Introduction

In recent years, the energy crisis and environmental pollution have become increasingly prominent, as limited resource reserves struggle to support ever-growing demand. Air pollution and other environmental problems have grown more serious, and the search for new energy sources has become an important issue for governments. Under the dual pressure of environment and energy, not only the traditional energy industry but also other energy-related industries need to transform. Against the backdrop of tight international energy supply and increasing calls for environmental protection, the demand for zero-emission, non-polluting, and sustainable energy is growing rapidly, and new energy vehicles (NEVs), as a representative of green energy, have become the focus of the market and the community for their advantages in environmental protection and energy conservation. By virtue of advantages such as zero emissions, low noise, high energy efficiency, and diversification, NEVs have become an industry that major countries and regions are competing to develop (Chen et al. 2019). Countries have introduced preferential policies for NEV research and development and application promotion, and have vigorously promoted the development of the pure electric vehicle industry.

The development of the NEV industry has strategic significance in alleviating oil dependence, reducing environmental pollution, and realizing sustainable development. As a strategic emerging industry that eases the energy and environmental crisis and drives industrial upgrading, the NEV industry has attracted much attention from governments around the world. Many countries have made the development of NEVs an important strategic initiative to compete for the economic and technological high ground and enhance their overall national strength. Research on NEV technology started early, and the technical fields studied are relatively detailed. Christensen (2011) examined component sharing across drivetrain solutions from a technology strategy perspective, arguing that modular design strategies can be extended to other conventional production systems. Zapata and Nieuwenhuis (2010) studied radical and incremental innovations in the fields of hybrid, biofuel, and hydrogen power technologies. Lee et al. (2010) used the development of automotive emission control technology in the USA as a case study of how firms manage R&D activities under uncertainty in the institutional environment. Budde and Konrad (2019) investigated the dynamic, mutually influencing relationship between expectations about the future of fuel cell technology and technology policy support. Xu and Su (2016) studied the evolution of China’s NEV policy portfolio. Ma et al. (2017) studied the policies related to NEVs in China and evaluated the government incentive mechanism. Sang and Bekhet (2015) surveyed the use of NEVs by Malaysian consumers and empirically studied the main influencing factors. Yan et al. (2018) used the main path method to study the development trajectory of NEV technology. Carlucci et al. (2018) studied the economic and behavioral factors that influence consumers purchasing hybrid vehicles.
Breitzman and Mogee (2002) used patent information for technology tracking and forecasting to determine the development path of important technologies. Other scholars used NEV patent data for technology foresight (Aaldering et al. 2019; Li et al. 2019a; Shi et al. 2019) and competitive situation analysis (Cao et al. 2020; Ma et al. 2019), as well as descriptive statistical analysis of technology patent distribution (Aaldering and Song 2019; Zhao et al. 2019) from a scientometric perspective. For instance, Neves et al. (2019) showed significant differences in the impact of technological advances on NEV demand. Yang et al. (2013) analyzed the international competitiveness of Chinese EVs from the perspective of multinational patents and found that China’s EV technology is relatively strong in domestic R&D but lacks competitiveness in the international market.

With the advent of a new round of technological revolution, the NEV industry is gradually entering a period of accelerated development. The rapid development of this industry not only injects fresh momentum into the economic growth of countries but also contributes to improving the ecological environment (Zhao et al. 2018). The NEV industry is expected to become a new point of economic growth and has become a hotspot that countries around the world are pursuing. Because the global NEV technology route has not yet been determined, global competition in the field of NEVs has begun to shift to the level of patents and technical standards. In particular, the NEV powerhouses, represented by the USA, Japan, and Europe, have actively deployed patents strategically worldwide. Patent information can provide data support for government departments and industry alliances making strategic decisions and for scientific research institutions and technology enterprises seeking partners, and can thus provide a scientific basis for formulating and deploying science and technology development strategies at the macro, meso, and micro levels (Kim et al. 2019).

China’s NEVs have made the leap from process initiation to full acceleration thanks to national and local efforts (Zhou et al. 2020). However, at this stage, most research on China’s NEV technology innovation is conducted at the national or industry level, without considering regional and inter-regional differences. China is a vast country with obvious regional unevenness in the development of science, technology, and economy (Xu et al. 2019), so how large are the regional differences in NEV technology innovation? In addition, during periods of rapid development of new technologies, no single actor can own all the resources and capabilities necessary for technological innovation, because the technological breakthrough points are many and scattered (Powell et al. 1996). So what kind of regional layout does China’s NEV technology present?

In this context, research on NEV technology prediction and layout plays a crucial role in the sustainable development of the industry. Therefore, this paper first adopts the frequent pattern growth (FP-growth) machine learning algorithm to identify the hot technology fields from massive data. Then, we build a knowledge flow matrix based on the knowledge flow characteristics represented by the main and sub-category numbers of patents and predict the technological development of the NEV industry by using input-output analysis. Finally, based on these results, the technological development and layout of China’s NEV industry are studied by using multiple co-occurrence analysis. Compared with previous studies, this paper proposes a new technology prediction method, which not only overcomes the shortcomings of incomplete data and time lag in patent citation analysis but also broadens the scope of application of input-output methods and provides new research ideas for technology prediction. It also conducts an in-depth study of China’s NEV technology layout, which helps to clarify the current status and evolution of China’s NEV technology development, enriches the system of co-occurrence analysis methods, and provides a reference for the technology strategy layout of each country. The rest of the paper is organized as follows. The next section reviews the technology forecasting and technology layout methods based on patent mining. The third section describes the research methodology of this paper in detail. The fourth section presents the results of the patent analysis. Finally, discussions and conclusions are presented.

Literature review

Technology prediction study based on patent mining

As an important area of technology future analysis, technology prediction plays an important role in the decision-making process of governments and businesses. Technology prediction has received attention from science and technology workers all over the world, and many countries have established research institutes dedicated to it. Initially, scholars mostly used the Delphi method for technology prediction (Esmaelian et al. 2017; Lintonen et al. 2014). Zhang et al. (2016) used a thematic analysis approach to identify the main research in big data technology and ultimately predicted the main research areas in big data technology by combining expert opinion with actual data. Apreda et al. (2019) used expert forecasting methods to predict future developments in medical device technology. The Delphi method relies heavily on expert experience and domain knowledge for technology identification and prediction (Martin 1996; Shin 2001). The method therefore depends on technical experts (Liu and Wang 2019), and its objectivity is constantly challenged by actual technological developments (Forster and von der Gracht 2014; Liu et al. 2007; Tichy 2004).

Along with the continuous application of quantitative analysis methods in the field of technology prediction (Barnes and Mattsson 2016; Cho et al. 2016), scholars have gradually turned to bibliometrics to study technology prediction. Attempts have been made to improve the reliability and validity of technology predictions using paper and patent data (Yoon and Kim 2011). Wang et al. (2018) used bibliometric methods to map the technology roadmap for nano electricity generation and forecast the future development of this technology through patent analysis. Adamuthe and Thampi (2019) used regression analysis to predict the future development of technologies such as cloud computing, microcomputers, and mainframes. Li et al. (2019b) used scientific and technical papers and patent information as data sources, and used text mining combined with expert opinion to identify technology evolution paths and predict trends. Wang et al. (2020a) used a combination of bibliometric and S-curve approaches to predict the development of carbon capture and storage technologies. Zhang et al. (2021) used the latent Dirichlet allocation topic model to mine blockchain patent text information and predict future development trends in this field. Han et al. (2021) proposed a new framework for analyzing R&D trends based on the latent Dirichlet allocation topic model, which was validated using data on 3D-printed patents and invalidated patents. Manesh et al. (2021) used a bibliometric approach to mine emerging technologies related to Industry 4.0 and provided a detailed analysis of future developments. These methods can help to some extent to address the problem of subjective expert bias (Yeo et al. 2015).

As an important result of technological innovation activities, patents can be used to measure an enterprise’s technological innovation capacity and competitive advantage. An in-depth analysis of patent information can reveal the technical information it contains and thus help to understand the advanced and key technologies in a particular industry (Miao et al. 2020). Patent analysis is the theory and method of mining patent intelligence from patents, and it is a key technology and main tool in information science (Feng et al. 2020). Early scholars used social network analysis to identify core technologies through indicators such as network centrality and intermediacy. Yoon and Kim (2012) used patent mining methods to analyze the technology R&D opportunities of firms. Ozcan and Islam (2014) constructed a technology cooperation network based on patent information and identified key firms and key technologies by mining the information of nodes in the network. Park and Yoon (2014) used IPC co-category data to construct a technology network and applied social network analysis to determine the centrality and intermediacy of technologies, thus helping the country to plan technology development in a rational way. Kumari et al. (2019) used the latent Dirichlet allocation topic model to mine patent data on bionic robots to identify potential research topics in the field and used co-occurrence analysis to identify hot technologies for bionic robots. Yalcin and Daim (2021) used a bibliometric approach to analyze the future of blockchain technology. Cao et al. (2021) identified key technologies in China’s energy saving and environmental protection industry based on collaborative networks and technology networks.

However, the mainstream literature above ignores the directional nature of knowledge flows and fails to explore the patterns of knowledge flow activity when examining the correlations between technical knowledge. The problem is gradually being recognized, and scholars are beginning to study the relationships between technologies from a knowledge flow perspective. Initially, scholars used citation networks to study the directionality of knowledge flows. Narin et al. (1987) used patent citation information to analyze a firm’s relative technological advantage. Leu et al. (2012) used patent citation information to identify key technology areas in the field of biofuel and bio-hydrogen technologies. Shin and Kim (2013) identified key technologies in the field of image display through a network of patent citations. Lee et al. (2018) proposed a machine learning-based model for identifying key emerging technologies from patent citation information combined with basic patent information, and applied it to the pharmaceutical field. Noh and Lee (2020) predicted hot technology areas by patent citation rate. Daim et al. (2020) applied the technology knowledge redundancy method to patent citation information to forecast technological positioning.

Technical layout

In recent years, there has been a great deal of academic research on technology layout. Some scholars studied the evolution of technology layout, such as Yeh et al. (2018) who visualized the trajectory of technological evolution in this field using the skeletal field as an example. Some scholars studied the empirical evidence of technology layout (He et al. 2020; Tan and Lin 2018) and some scholars studied the technology layout of international organizations in specific industries (Zhang et al. 2017). There are also scholars who study the technology layout and patent competition in typical industries (Yu and Zhang 2019). In the analysis of technology layout, earlier scholars often used pairwise co-occurrence analysis. From such a one-dimensional perspective, the breadth and depth of knowledge that can be explored, and hence what can be revealed, is limited. In order to reflect the interrelationships between different feature terms, the concept and method of multiple co-occurrence has been proposed. Abad-Segura et al. (2020) applied multiple co-occurrence analysis to analyze digital education trends. Sarica et al. (2020) used multiple co-occurrence analysis to mine the patents describing a firm’s technology development in order to analyze competing firms’ superior technologies. Multiple co-occurrence analysis can uncover the interrelationships between data more comprehensively, systematically, and deeply, and applying it to technology layout research helps to discover multiple, intersecting, and potential relationships (Leydesdorff 2010).

The existing literature on technology prediction and technology layout based on patent mining has achieved fruitful results, but some deficiencies remain. First, most existing literature used patent citation analysis, input-output analysis, patent analysis, and other methods for technology prediction, but these methods have shortcomings of varying degrees: patent citation analysis suffers from a time lag, and patent analysis tends to ignore the direction of knowledge flow. Huang and Huang (2015) argued that the most important technical field covered by a patent is given by its main classification number, that the remaining related technologies are given by its sub-classification numbers, and that technical knowledge flows from the important technical field to the remaining technical fields. Therefore, by defining the patent knowledge flow relationship as the flow from an important technical field (patent main classification number) to other technical fields (patent sub-classification numbers), a knowledge flow network can be established that both takes into account the directional nature of knowledge flow and overcomes the time lag of citation analysis. Second, NEVs have gradually become a hotspot of academic research. Many scholars in China conduct research from the perspective of government policy, but little literature analyzes the layout and evolution of China’s NEV technology in depth. Finally, previous patent layout analysis is mostly based on the statistical analysis of patent quantity; the research dimension is relatively single, and the information it can reveal is limited. Using the multiple co-occurrence method to analyze the technology layout can reflect the layout and its evolution more clearly, but few studies in the literature do so (Fig. 1).

Figure 1 Research framework.

Methodology

Research framework

In this paper, we first download and preprocess the relevant data from the Derwent Innovations Index (DII) patent database, and then use the FP-growth algorithm to identify the hot technologies from massive data. The knowledge flow matrix is then constructed based on the knowledge flow features contained in the main and sub-category numbers of the patent data, and is combined with input-output analysis to identify the core and frontier technologies. Identifying core technologies helps to grasp the current status of technology development, and foresight of frontier technologies makes it possible to predict the direction of future technology development more accurately. Finally, we use multiple co-occurrence analysis to study the technology layout of China’s NEVs.

Overall process

Phase 1: Data collection and preprocessing

DII is a web-based patent database that is updated weekly with global patent information. Its records are translated into English and indexed by Derwent patent experts and cover more than 40 countries and regions around the world, so the database provides a comprehensive view of global inventions (Mahlia et al. 2020). In this paper, the query TS= (New energy vehicle OR New energy automobile OR Battery Electric vehicle OR Battery Electric automobile OR pure electric vehicle OR pure electric automobile OR hybrid electric vehicle OR hybrid electric automobile) is used to search the Derwent database. The patent period is set to 1968–2018, and the search date is April 28, 2020. A total of 63,505 patents (including patent families) are retrieved. The patent information provided by the Derwent database contains patent families, in which the basic patent is the earliest patent applied for by a certain organization, and later patents with the same technical information applied for by that organization are included in the same patent family. Analyzing data that contains patent families requires extracting the individual patents in order to obtain the full patent data (Guan and Liu 2016). To ensure that the research results are scientific and rigorous, this paper uses a Python program to extract and process the Derwent patent family data, obtaining more accurate and comprehensive patent information (153,398 patents), and uses the processing results in the subsequent technical analysis.
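The family-extraction step can be sketched as follows. This is only an illustration, not the program used in the paper: it assumes that each exported record is a dictionary whose `PN` field lists the family members separated by semicolons and that the first listed number is the basic patent. The function name and the sample records are hypothetical.

```python
def expand_families(records):
    """Split each family record into one row per family member."""
    rows = []
    for rec in records:
        # Assumption: family members are separated by ";" in the PN field.
        members = [pn.strip() for pn in rec["PN"].split(";") if pn.strip()]
        # Assumption: the first listed number is the basic patent.
        basic = members[0] if members else None
        for pn in members:
            row = dict(rec)              # copy the shared fields (IP, AE, PI, ...)
            row["PN"] = pn
            row["basic_patent"] = basic
            rows.append(row)
    return rows


# Made-up records, for illustration only.
records = [
    {"PN": "CN000000001-A; US0000000001-A1", "IP": "B60L-011/18"},
    {"PN": "JP0000000001-A", "IP": "H01M-010/50"},
]
expanded = expand_families(records)
print(len(expanded))
```

Each family member keeps the classification and assignee fields of its family, so later steps can treat every row as a single patent.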

Each patent record in the Derwent data contains the Patent Number (PN), Int. Patent Classification (IP), Assignee Name (AE), and Priority Application Information and Date (PI). This paper first processes the patent families according to PN. Then, it extracts the filing date of each patent from PI. Since only 1252 patents were filed in the field of NEVs before 1989, the patent data for 1989 to 1993, 1994 to 1998, 1999 to 2003, 2004 to 2008, 2009 to 2013, and 2014 to 2018 were selected for analysis. Next, the patent country of each patent is determined by cross-checking the priority patent number, priority patent language, and patentee name in the PI. Finally, the patents whose patent country is China are selected, and the patentee information of each patent is obtained from the AE.

Phase 2: Hot tech discovery

The FP-growth algorithm, an improvement on Apriori, is an association rule mining algorithm. Association analysis, as an effective data mining tool, was previously used mainly to analyze customer spending habits, employee organizational characteristics, and similar problems. FP-growth is one of the fastest and most efficient algorithms of this kind: unlike the Apriori algorithm, it does not take a long time or produce a large number of redundant candidate sets (Sarath and Ravi 2013). Current research on the FP-growth algorithm mainly focuses on algorithm improvement (Ismail et al. 2018; Segatori et al. 2018), and little literature has used the algorithm for valuable knowledge discovery and technology identification. The FP-growth algorithm is gradually being applied to mine knowledge data of different magnitudes and obtain the corresponding association rules (Ali 2012). Research shows that the FP-growth algorithm can extract key links from knowledge co-occurrence relationships. As the amount of data increases, traditional statistical methods struggle to perform large-scale, in-depth analysis. FP-growth can therefore be used to quickly identify the main components in massive data. This method can filter out a large amount of noisy information, thus improving the accuracy of the prediction results (Sohrabi and Roshani 2017).

The core step of the FP-growth algorithm is to build the FP-tree and then use it to mine the frequent item sets. To build the FP-tree, the original dataset is scanned twice. The first scan counts the number of occurrences of every item. The second scan keeps only the items that satisfy the minimum support level and inserts them into the tree; the frequent item sets can then be mined from the completed FP-tree. Figure 2 shows the FP-tree constructed with the data from Table 1 as an example (with minimum support of 3).

Figure 2 FP-tree.

Table 1 Transactional database.
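The two-scan construction and the mining step can be sketched in plain Python. This is a minimal illustration, not the program used in the paper: the transactions are made up (they are not the data of Table 1), each transaction stands for the classification numbers of one patent, and the names `Node`, `build_tree`, and `fp_growth` are hypothetical. The minimum support is an absolute count of 3.

```python
from collections import defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Two scans: count items, then insert filtered, ordered transactions."""
    counts = defaultdict(int)
    for items, n in transactions:              # scan 1: item frequencies
        for item in set(items):
            counts[item] += n
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, n in transactions:              # scan 2: build the tree
        kept = sorted((i for i in set(items) if i in frequent),
                      key=lambda i: (-frequent[i], i))
        node = root
        for item in kept:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += n
    return header, frequent


def fp_growth(transactions, min_support, suffix=()):
    """Yield (itemset, support) for every frequent item set."""
    header, frequent = build_tree(transactions, min_support)
    for item, support in frequent.items():
        yield tuple(sorted(suffix + (item,))), support
        # Conditional pattern base: the prefix paths that lead to `item`.
        base = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        yield from fp_growth(base, min_support, suffix + (item,))


# Made-up transactions, for illustration only.
patents = [
    ["B60L-011/18", "H01M-010/50", "H02J-007/00"],
    ["B60L-011/18", "H01M-010/50"],
    ["B60L-011/18", "H02J-007/00"],
    ["B60L-011/18", "H01M-010/50", "B60K-006/04"],
    ["H02J-007/00", "B60K-006/04"],
]
result = dict(fp_growth([(p, 1) for p in patents], min_support=3))
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

No candidate sets are generated: each recursive call works only on the prefix paths of one item, which is the property that distinguishes FP-growth from Apriori.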

Phase 3: Design of core technology identification methodologies

The success of a technological field stems partly from the investment of money, equipment, personnel, and time in original innovation, and partly from the technology network: learning from, imitating, assimilating, and reinventing knowledge from other technological fields. The latter is the result of technology perception and can be called the principle of technology induction. This principle is usually elaborated quantitatively using input-output analysis from economics. Let \( X_i \) be the number of patents in the i-th technology area of the technology network, \( X_{ij} \) the number of patents whose knowledge flows from the j-th technology area to the i-th technology area, and \( Y_i \) the number of patents obtained in the i-th technology area by pure original innovation supported by R&D investment. Then:

$$ \sum \limits_{j=1}^n{X}_{ij}+{Y}_i={X}_i $$
(1)

Following the input-output analysis method in economics, the direct induction coefficient is the ratio of the number of patents whose knowledge is transferred from the j-th technology area to the i-th technology area to the total number of patents in the i-th technology area. The formula is as follows:

$$ {a}_{ij}=\frac{X_{ij}}{X_i}\quad \left(i,j=1,2,\dots ,n\right) $$
(2)

If the data is viewed in the form of a matrix, equation (1) can be written as:

$$ AX+Y=X $$
(3)

Furthermore, we can get \( X={\left(I-A\right)}^{-1}Y \).

Thus, the complete induction coefficient matrix is

$$ B={\left(I-A\right)}^{-1}-I $$
(4)
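The step from equation (3) to equation (4) can be spelled out as follows. This is the standard input-output manipulation rather than a derivation given in the source, and it assumes that \( I-A \) is invertible; the series form additionally assumes that the powers of \( A \) shrink to zero.

$$ AX+Y=X\;\Longrightarrow\;\left(I-A\right)X=Y\;\Longrightarrow\;X={\left(I-A\right)}^{-1}Y $$

$$ {\left(I-A\right)}^{-1}=I+A+{A}^2+{A}^3+\cdots \;\Longrightarrow\;B={\left(I-A\right)}^{-1}-I=A+{A}^2+{A}^3+\cdots $$

Read this way, \( B \) collects the direct induction \( A \) together with all indirect induction passed on through intermediate technology areas.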

Finally, the induction coefficient is:

$$ {U}_j=\frac{\sum_{i=1}^n{b}_{ij}}{\frac{1}{n}{\sum}_{j=1}^n{\sum}_{i=1}^n{b}_{ij}} $$
(5)

where \( {\sum}_{i=1}^n{b}_{ij} \) is the sum of the entries in column j of the complete induction coefficient matrix and \( \frac{1}{n}{\sum}_{j=1}^n{\sum}_{i=1}^n{b}_{ij} \) is the average of these column sums over all n technology areas. The induction coefficient indicates the ability of a technology to perceive, learn from, assimilate, and reinvent the other technologies in the technology network. If the induction coefficient is greater than 1, the technology has been subjected to a higher degree of technology induction than the average in the study area; the larger the coefficient, the more the technology draws on other technologies during its development. If the induction coefficient is less than 1, the technology has been subjected to a lower degree of technology induction than the average in the study area. Thus, the level of the induction coefficient can be used to identify core technologies (Liu et al. 2017).
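Equations (2), (4), and (5) can be chained in a few lines of Python. The sketch below is illustrative only: the 2 × 2 flow matrix and the totals are made up, the function names are hypothetical, and exact fractions are used so the result can be checked by hand.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(m)
    aug = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def induction_coefficients(flows, totals):
    """flows[i][j] = X_ij, totals[i] = X_i; returns one coefficient per column."""
    n = len(totals)
    # Equation (2): a_ij = X_ij / X_i
    a = [[Fraction(flows[i][j], totals[i]) for j in range(n)] for i in range(n)]
    # Equation (4): B = (I - A)^{-1} - I
    inv = inverse([[int(i == j) - a[i][j] for j in range(n)] for i in range(n)])
    b = [[inv[i][j] - int(i == j) for j in range(n)] for i in range(n)]
    # Equation (5): column sum divided by the mean column sum
    col_sums = [sum(b[i][j] for i in range(n)) for j in range(n)]
    mean = sum(col_sums) / n
    return [s / mean for s in col_sums]


# Made-up data: two technology areas with 4 patents each.
u = induction_coefficients(flows=[[0, 2], [1, 0]], totals=[4, 4])
print(u)
```

By construction the coefficients average to 1, which is why 1 serves as the dividing line between above-average and below-average induction.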

Phase 4: Design of methodologies for forecasting leading edge technologies

The impact of a technological achievement has two parts. On the one hand, it influences its own and other technical fields within the technology network, which usually shows up as diffusion of the technology to other members of the network. On the other hand, the technology also diffuses to fields outside the network. The principle of technological impact can be elaborated quantitatively using input-output analysis. Let \( X_i \) be the number of patents in the i-th technology domain of the technology network and \( X_{ij} \) the number of patents whose knowledge is transferred from the j-th technology domain to the i-th technology domain. The influence on the other technical areas of the technology network is then expressed as \( {\sum}_{j=1}^n{X}_{ij} \), and the influence on technology areas outside the network as \( {\mu}_i{E}_i \), where \( {\mu}_i \) denotes the impact effect coefficient on out-of-network technology areas and \( E_i \) denotes the number of patents whose knowledge is transferred from the i-th technology area to out-of-network technology areas. We have:

$$ \sum \limits_{j=1}^n{X}_{ij}+{\mu}_i{E}_i={\sigma}_i{X}_i $$
(6)

In equation (6), \( {\sigma}_i \) denotes the equilibrium coefficient and \( {\sigma}_i{X}_i \) denotes the total impact of the i-th technology domain inside and outside the technology network.

Following the definition of the direct distribution coefficient in input-output analysis in economics, the direct impact coefficient is the ratio of the number of patents whose knowledge is transferred from the j-th technology area to the i-th technology area to the total number of patents in the j-th technology area. It is expressed as follows:

$$ {k}_{ij}=\frac{X_{ij}}{X_j}\quad \left(i,j=1,2,\dots ,n\right) $$
(7)

The complete influence coefficient matrix is

$$ D={\left(I-K\right)}^{-1}-I $$
(8)

The formula for calculating the technical impact coefficient is expressed as follows:

$$ {V}_i=\frac{\sum_{j=1}^n{d}_{ij}}{\frac{1}{n}{\sum}_{j=1}^n{\sum}_{i=1}^n{d}_{ij}} $$
(9)

where \( {\sum}_{j=1}^n{d}_{ij} \) is the sum of the elements in row i of the complete impact coefficient matrix and \( \frac{1}{n}{\sum}_{j=1}^n{\sum}_{i=1}^n{d}_{ij} \) is the average of these row sums over all n technology areas. Technology impact represents the role of one technology as a catalyst and guide for other areas of the technology network. If the technology impact coefficient is greater than 1, the technology has had an enabling effect on other technologies that exceeds the average of the research area. If the technology impact coefficient is less than 1, the technology has had less impact on other technologies than the average in the research area. Thus, the magnitude of technological impact can be used to identify cutting-edge technologies (Liu et al. 2017).
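The impact coefficient of equations (7) to (9) can be sketched in the same spirit. To show how direct and indirect influence accumulate, this example replaces the matrix inverse of equation (8) with the truncated series K + K² + …, which agrees with (I − K)⁻¹ − I only when the series converges; that substitution, the made-up data, and the function names are assumptions of the sketch, not part of the source.

```python
def matmul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def impact_coefficients(flows, totals, terms=60):
    """flows[i][j] = X_ij, totals[j] = X_j; returns one coefficient per row."""
    n = len(totals)
    # Equation (7): k_ij = X_ij / X_j
    k = [[flows[i][j] / totals[j] for j in range(n)] for i in range(n)]
    # Equation (8), approximated by the truncated series K + K^2 + ... + K^terms
    d = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(terms):
        power = matmul(power, k)
        d = [[d[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    # Equation (9): row sum divided by the mean row sum
    row_sums = [sum(row) for row in d]
    mean = sum(row_sums) / n
    return [s / mean for s in row_sums]


# Made-up data: two technology areas with 4 and 8 patents.
v = impact_coefficients(flows=[[0, 4], [1, 0]], totals=[4, 8])
print([round(x, 6) for x in v])
```

For these numbers the exact inverse gives 5/4 and 3/4, so the first area would be read as having above-average impact on the network.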

Phase 5: Technology layout by multiple co-occurrence analysis

To analyze the technology layout of China’s NEVs more clearly, this paper uses multiple co-occurrence networks. First, we construct the province-patent classification number co-occurrence matrix, as shown in Figure 3. The matrix contains two feature items: patent province (PR) and IPC. There are three types of co-occurrence relationships: ① IPC-IPC co-occurrence, reflecting the interconnectedness of technologies; ② province-IPC co-occurrence, reflecting the distribution of technologies by province; ③ province-province co-occurrence, reflecting the cooperative relationships between different provinces in China.

Figure 3 Schematic composition of the patent province-classification number multiple co-occurrence matrix.
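The three kinds of co-occurrence can be counted in a single pass over the patents. The sketch below is illustrative: each patent is assumed to be a pair (provinces, IPC codes), the labels are tagged `"PR"` or `"IPC"` so that the relationship type of every pair can be read off the tags, and the sample data and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(patents):
    """Count pairwise co-occurrences among province and IPC labels."""
    counts = Counter()
    for provinces, ipcs in patents:
        labels = sorted({("PR", p) for p in provinces} |
                        {("IPC", c) for c in ipcs})
        # Sorting puts IPC labels before PR labels, so every pair is
        # IPC-IPC, IPC-PR, or PR-PR, matching the three relationship types.
        for a, b in combinations(labels, 2):
            counts[(a, b)] += 1
    return counts


# Made-up patents, for illustration only.
patents = [
    (["Beijing", "Shanghai"], ["B60L-011/18", "H01M-010/50"]),
    (["Beijing"], ["B60L-011/18"]),
]
counts = cooccurrence(patents)
print(counts[(("IPC", "B60L-011/18"), ("PR", "Beijing"))])
```

Splitting the counter by tag pair yields the three blocks of the combined matrix.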

Because the three types of co-occurrence relationships differ in scale, the data in the co-occurrence matrix need to be standardized to ensure the scientific validity of the results. The formula is as follows:

$$ {G}_{ij}=\frac{C_{ij}-{C}_{\mathrm{min}}}{C_{\mathrm{max}}-{C}_{\mathrm{min}}}\quad \left(0\le {G}_{ij}\le 1\right) $$
(10)

where \( C_{ij} \) is the initial value, \( C_{\mathrm{max}} \) and \( C_{\mathrm{min}} \) are the maximum and minimum values of the three initial matrices, respectively, and \( G_{ij} \) is the standardized result.
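Equation (10) is ordinary min-max scaling. The sketch below applies it to a single matrix and assumes that each of the three initial matrices is scaled with its own minimum and maximum; the source does not state this explicitly, and the function name and numbers are illustrative.

```python
def min_max(matrix):
    """Scale all entries of one matrix to the interval [0, 1]."""
    values = [v for row in matrix for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant matrix carries no contrast; map everything to 0.
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


# Made-up co-occurrence counts, for illustration only.
scaled = min_max([[0, 5], [10, 20]])
print(scaled)
```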

Analysis results

Hot technology identification

To identify hot research areas from massive data, this paper uses a Python implementation of the FP-growth algorithm to mine frequent item sets and hot technologies in the field of NEVs in different periods. Table 2 shows the number of frequent item sets mined by the FP-growth algorithm at different thresholds and the corresponding number of hot technologies included. Balancing the simplification of the data against the loss of information, this paper selects a threshold of 0.05 to mine the frequent item sets for each period in the NEV field (Wu and Zhang 2020), so as to obtain the hot technologies of each period.

Table 2 Number of frequent item sets and hot technologies identified by FP-growth at different thresholds.

The hot technologies identified in the NEV field by the FP-growth algorithm serve as the analysis data, and the hot technology knowledge transfer matrix is constructed according to the rule that knowledge flows from the main patent classification number to the sub-classification numbers. Table 3 shows the hot technology knowledge transfer matrix for 1999–2003 as an example. It is an asymmetric matrix in which column labels indicate main classification numbers and row labels indicate sub-classification numbers. For example, the value 53 in the third column of the first row is the number of patents for which H02J-007/14 is the main classification number and B60R-016/04 is a sub-classification number. In addition, the values on the diagonal of the matrix give the number of patents that have only one classification number.

Table 3 Knowledge transfer matrix by technology area for the period 1999–2003.
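The construction rule for the knowledge transfer matrix can be sketched as follows. The sketch assumes that the first classification number listed for a patent is its main classification number; the function name and the sample patents are hypothetical and do not reproduce the values in Table 3.

```python
def knowledge_flow(patents, hot):
    """Return matrix[sub][main]: rows are sub-classes, columns are main classes."""
    matrix = {row: {col: 0 for col in hot} for row in hot}
    for codes in patents:
        main = codes[0]                  # assumption: first code is the main class
        if main not in hot:
            continue
        if len(codes) == 1:
            matrix[main][main] += 1      # diagonal: single-classification patents
        for sub in codes[1:]:
            if sub in hot and sub != main:
                matrix[sub][main] += 1   # knowledge flows from main to sub
    return matrix


# Made-up patents, for illustration only.
hot = ["B60R-016/04", "H02J-007/14"]
patents = [
    ["H02J-007/14", "B60R-016/04"],
    ["H02J-007/14", "B60R-016/04"],
    ["B60R-016/04"],
]
matrix = knowledge_flow(patents, hot)
print(matrix["B60R-016/04"]["H02J-007/14"])
```

Because flows are recorded only from main to sub-classification numbers, the matrix is generally asymmetric, as the text notes.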

Core technology identification

On the whole, many technologies have remained at the core of NEV research over the past 30 years. One example is battery temperature control technology (H01M-010/50), which has been developed continuously since 1989 and has gradually risen in the rankings, becoming the third core technology by 2013. Another is the vehicle mechanical drive (B60L-011/14), which ranked among the top three core technologies for two decades, from 1994 to 2013. Over the last decade, new core technologies have emerged continuously; NEV technology is changing rapidly, and research in various fields is becoming more and more active (Table 4).

Table 4 Top 10 core technologies by period 1989–2018.

In summary, the evolution of the core technology areas of NEVs from 1989 to 2018 is approximately as follows: vehicle accumulator (B60K-001/04) → engine signal control (F02D-029/02) → vehicle mechanical drive unit (B60L-011/14) → engine signal control (F02D-029/02) → vehicle electrical energy controller (B60W-010/26) → battery mount manufacturing (H01M-002/10).

Frontier technology forecast

Table 5 shows the calculated impact coefficients, which identify the frontier technologies, for the main technology areas of NEVs. The frontier technologies of one period are related to the core technologies of the following period. From 1989 to 1993, the main research frontiers were battery mount manufacturing (H01M-002/10), battery testing instruments (G01R-031/36), and vehicle battery placement (B60R-016/04). Battery testing instruments then ranked among the top three core technologies of the next period. Battery-powered technology (B60L-011/18), which ranked first by impact coefficient from 1994 to 1998, ranked third among the core technologies in 1999–2003. Engine signal control technology (F02D-029/02), which ranked third among the frontier technologies from 1999 to 2003, ranked first among the core technologies from 2004 to 2008. The frontier technologies ranked third and fourth from 2004 to 2008 (H01M-010/42, H01M-010/50) developed into the core technologies ranked second and third from 2009 to 2013. Battery charging and discharging (H01M-010/44), which first appeared as a core technology in 2009–2013, ranked seventh among the frontier technologies from 2004 to 2008. The battery pack charging device (H02J-007/00) did not appear in the table of frontier technologies from 2004 to 2013. It ranked fourth among the frontier technologies from 2009 to 2013 and then developed into the third core technology from 2014 to 2018.

Table 5 Top 10 frontier technologies by period 1989–2018.

The analysis reveals that there are still many frontier technologies with high technological impact in the early stages that have evolved into core technologies in the later stages. The results confirm that the proposed method can be well applied in the field of technology prediction, which remedies the shortcomings and deficiencies of patent citation data and provides a new analytical approach for technology prediction based on quantitative analysis.
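The lagged comparison behind this conclusion can be sketched in a few lines. The dictionaries below hold only those ranks that are quoted in the text of this subsection, not the full contents of Tables 4 and 5, and the function names are hypothetical.

```python
PERIODS = ["1989-1993", "1994-1998", "1999-2003",
           "2004-2008", "2009-2013", "2014-2018"]

# (period, classification number) -> rank, as quoted in the text only.
frontier_rank = {
    ("1994-1998", "B60L-011/18"): 1,
    ("1999-2003", "F02D-029/02"): 3,
    ("2004-2008", "H01M-010/42"): 3,
    ("2004-2008", "H01M-010/50"): 4,
    ("2009-2013", "H02J-007/00"): 4,
}
core_rank = {
    ("1999-2003", "B60L-011/18"): 3,
    ("2004-2008", "F02D-029/02"): 1,
    ("2009-2013", "H01M-010/42"): 2,
    ("2009-2013", "H01M-010/50"): 3,
    ("2014-2018", "H02J-007/00"): 3,
}


def next_period(period):
    """Return the period that follows `period`, or None for the last one."""
    i = PERIODS.index(period)
    return PERIODS[i + 1] if i + 1 < len(PERIODS) else None


def promoted(frontier, core):
    """List frontier technologies that are core technologies one period later."""
    hits = []
    for (period, code), f_rank in sorted(frontier.items()):
        later = next_period(period)
        c_rank = core.get((later, code))
        if c_rank is not None:
            hits.append((code, period, f_rank, later, c_rank))
    return hits


hits = promoted(frontier_rank, core_rank)
```

Run against the complete tables instead of this excerpt, the same check would give the share of frontier technologies in each period that reappear as core technologies in the next one.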

Overall, the trajectory of the main frontier technologies for NEVs over the past three decades has been battery mount manufacturing (H01M-002/10) → battery power supply technology (B60L-011/18) → hybrid systems (B60K-006/02) → testing battery conditions (H01M-010/48) → vehicle safety electrical devices (B60L-003/00) → battery conductive coupling (H01M-002/20). The frontiers of NEVs differ from period to period, which reflects the rapid development of technology in the field: different technologies have achieved encouraging research results within a short time. In the future, enterprises in the NEV field should focus on the frontier technologies of 2014–2018. These technologies are expected to become the core technologies of the field and are the hot spots that countries are pursuing. Enterprises and research institutes should build on their own strengths and plan their technical layout ahead of time to achieve technological breakthroughs and leadership, so as to occupy a favorable position in future international competition.

NEV technology development and layout study in China

Statistics show that the number of NEV patent applications in China has increased year by year only since 1999. Therefore, based on Chinese patent data from 1999 to 2018, this paper focuses on the technological evolution and layout of China’s NEV sector over the past 20 years.

Analysis of the number of patents by province in China

To obtain the number of NEV patents in each province of China, this paper extracts and identifies the patentee information in the Chinese patent data from the Derwent database. To standardize company names, the Derwent database assigns a separate 4-character code to each of approximately 20,000 companies worldwide. For other companies and for individual patentees, the database assigns a non-standard 4-character code. A code of the form AAAA-C indicates a standard company, AAAA-N a non-standard company, and AAAA-I an individual. Since the region to which an individual patentee belongs cannot be determined, this paper excludes the 3,916 patents filed by individuals in China in the NEV field and analyzes the remaining 19,789 patents. To identify the province of each patent-holding organization, this paper builds a database of company codes and their regions using websites such as TianYanCha (https://www.tianyancha.com/) and Qichacha (Enterprise Search, https://www.qcc.com/). A Python program then matches each organization to its region, and a heat map of the number of patents in each province is drawn for each time period. The result is shown in Figure 4.
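The filtering and counting steps described above can be sketched as follows. The program used in this paper is not reproduced here; the patentee codes, the code-to-province lookup, and the function names are hypothetical, and the sketch assumes each record is a pair of patentee code and filing year.

```python
from collections import Counter

# Suffix of a Derwent patentee code -> patentee type, as described above.
KIND_BY_SUFFIX = {
    "C": "standard company",
    "N": "non-standard company",
    "I": "individual",
}


def assignee_kind(code):
    """Classify a patentee code such as 'ABCD-C' by its suffix."""
    return KIND_BY_SUFFIX.get(code.rsplit("-", 1)[-1], "unknown")


def period_of(year, start=1999, span=5):
    """Map a year to its five-year period, e.g. 2016 -> '2014-2018'."""
    first = start + (year - start) // span * span
    return f"{first}-{first + span - 1}"


def patents_by_province(records, province_by_code):
    """Count patents per (period, province), skipping individual patentees."""
    counts = Counter()
    for code, year in records:
        if assignee_kind(code) == "individual":
            continue  # region of an individual cannot be determined
        province = province_by_code.get(code)
        if province is not None:
            counts[(period_of(year), province)] += 1
    return counts


# Hypothetical lookup built from company-information websites.
province_by_code = {"ABCD-C": "Guangdong", "WXYZ-N": "Zhejiang"}
records = [("ABCD-C", 2001), ("ABCD-C", 2016), ("WXYZ-N", 2016), ("LIXX-I", 2016)]
counts = patents_by_province(records, province_by_code)
```

The resulting counts per period and province are the values a heat map such as Figure 4 would display.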

Figure 4

Heat map of the number of patents by province in China. A Number of patents by province in China from 1999 to 2003. B Number of patents by province in China from 2004 to 2008. C Number of patents by province in China from 2009 to 2013. D Number of patents by province in China from 2014 to 2018

The number of patents held by each province declines stepwise from east to west, and the eastern coastal region holds far more patents than the central and western regions. In particular, the provinces with the most NEV patent applications are concentrated in the Yangtze River Delta, the Pearl River Delta, and the Beijing-Tianjin-Hebei region. This indicates that the Yangtze River Delta and the Pearl River Delta are now the core contributors to China’s NEV development; these regions have encouraged the development of NEVs, and many enterprises there have invested in NEVs with encouraging results. In the northern region outside Beijing, Shandong province holds the most patents, which shows that Shandong has maintained high R&D investment, developed NEVs vigorously, and achieved good research results. The provinces along the Yangtze River continue to invest in R&D, and each holds more patents than its neighbors, indicating that China’s plan to build an NEV industry corridor based on the Yangtze River Economic Zone has achieved some success. In the future, the low-cost transportation conditions and the green, pollution-free features of the Golden Waterway should continue to be fully exploited, and the upper, middle, and lower reaches of the Yangtze River should complement one another according to their resource endowments and technical characteristics, so as to promote the synergistic development of the NEV industry in the Yangtze River economic zone.

Analysis of the evolution of hot technologies by province in China

To show clearly how R&D investment in hot NEV technologies is distributed across the provinces of China, this paper draws a Sankey diagram of the distribution of hot technologies in each province from 1999 to 2018, based on the hot technologies identified for each period, as shown in Figure 5.

Figure 5

Sankey diagram of the distribution of hot technology research by province. A Distribution of hot technologies by province in China from 1999 to 2003. B Distribution of hot technologies by province in China from 2004 to 2008. C Distribution of hot technologies by province in China from 2009 to 2013. D Distribution of hot technologies by province in China from 2014 to 2018

A comparison of Figures 4 and 5 shows that the ranking of provinces by number of hot technologies is broadly the same as their ranking by number of patents, with both declining from east to west, although individual provinces differ. For example, the top two provinces by number of hot technologies from 2014 to 2018 are Zhejiang and Shanghai, followed by Guangdong, Beijing, and Jiangsu, whereas Figure 4 shows that in the same period Zhejiang and Shanghai held far fewer patents than Jiangsu, Beijing, and Guangdong. Thus, although Zhejiang and Shanghai hold fewer patents than Beijing, Jiangsu, Guangdong, and some other provinces, they concentrated their research early on the hot NEV technology fields. As a result, the number of hot technology patents they hold far exceeds that of other provinces, which reflects the soundness of their technological layout in the NEV field.
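The province-to-technology flows that underlie a Sankey diagram such as Figure 5 can be aggregated as in the sketch below. It assumes each record pairs a province with the classification numbers of one patent and that only hot technologies are kept; the function name and the toy records are hypothetical, and the plotting step itself is omitted.

```python
from collections import Counter


def sankey_links(records, hot_codes):
    """Return (province, technology, patent count) links, largest first."""
    hot = set(hot_codes)
    flows = Counter()
    for province, codes in records:
        for code in set(codes) & hot:
            flows[(province, code)] += 1
    links = [(p, c, v) for (p, c), v in flows.items()]
    # Sort by descending count, then by name for a stable order.
    return sorted(links, key=lambda link: (-link[2], link[0], link[1]))


hot_codes = ["B60L-011/18", "H01M-010/50"]
records = [
    ("Zhejiang", ["B60L-011/18", "H01M-010/50"]),
    ("Zhejiang", ["B60L-011/18"]),
    ("Guangdong", ["H01M-010/50", "F02D-029/02"]),
]
links = sankey_links(records, hot_codes)

# Hot technology patent links per province, comparable with total patent counts.
hot_by_province = Counter()
for province, _, value in links:
    hot_by_province[province] += value
```

Comparing `hot_by_province` with the total patent counts per province is the kind of check that separates provinces with many patents from provinces whose patents concentrate on hot technologies.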

Technical layout analysis

To analyze the technology layout of NEVs in China in depth, this paper uses Pajek to draw multiple co-occurrence network diagrams. Because the number of network nodes grows steadily from 2004 onwards, the complex network formed by all the data cannot be observed directly, and the network must be reduced so that changes over time can be seen clearly. The networks were reduced with the Degree method in Pajek, and the criterion for each period was determined by repeated experimentation: 2004–2008 (6), 2009–2013 (24), and 2014–2018 (40). To facilitate visualization, squares identify provinces and key technologies (both core and frontier), triangles represent the core technologies of each period, diamonds represent the frontier technologies of each period, and circles represent the remaining technologies. Node size represents how often the province or technology occurs, and a line between two nodes indicates a co-occurrence relationship. Colors distinguish the three kinds of co-occurrence relationship: yellow for co-occurrence between patent provinces and patent classification numbers, gray for co-occurrence between technologies, and red for cooperation between provinces (Figs. 6, 7, 8 and 9).
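The three kinds of co-occurrence edge and the degree-based reduction can be sketched as follows. This is not Pajek: the reduction below is a single-pass filter that drops nodes whose number of distinct neighbours is below the criterion, which is an assumption about the Degree method, and Pajek's own procedure may differ (for example by repeating the pass). The function names and toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_edges(records):
    """Count edges of three kinds from (provinces, classification numbers) records."""
    edges = Counter()
    for provinces, codes in records:
        provinces, codes = sorted(set(provinces)), sorted(set(codes))
        for p in provinces:
            for c in codes:
                edges[("province-technology", p, c)] += 1   # yellow lines
        for a, b in combinations(codes, 2):
            edges[("technology-technology", a, b)] += 1     # gray lines
        for a, b in combinations(provinces, 2):
            edges[("province-province", a, b)] += 1         # red lines
    return edges


def reduce_by_degree(edges, min_degree):
    """Keep only edges whose two end nodes each have degree >= min_degree."""
    degree = Counter()
    for _, a, b in edges:
        degree[a] += 1
        degree[b] += 1
    keep = {node for node, d in degree.items() if d >= min_degree}
    return {e: w for e, w in edges.items() if e[1] in keep and e[2] in keep}


# Each record: provinces of the patent holders, classification numbers of the patent.
records = [
    (["Guangdong"], ["B60L-011/18", "H01M-010/50"]),
    (["Guangdong", "Zhejiang"], ["B60L-011/18"]),
    (["Zhejiang"], ["H02J-007/00"]),
]
edges = co_occurrence_edges(records)
reduced = reduce_by_degree(edges, min_degree=2)
```

A jointly held patent produces a province-province edge, which is how cross-province cooperation appears in the network. Raising `min_degree` removes peripheral nodes first, which is why larger criteria are needed for the denser networks of later periods.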

Figure 6

Multiple co-occurrence network diagram, 1999–2003.

Figure 7

Multiple co-occurrence network diagram, 2004–2008.

Figure 8

Multiple co-occurrence network diagram, 2009–2013.

Figure 9

Multiple co-occurrence network diagram, 2014–2018.

The main provinces active in NEV research from 1999 to 2003 were Beijing, Taiwan, Shanghai, Guangdong, Hubei, and Shaanxi. The provincial cooperation network reveals that collaborative R&D among the provinces had not yet begun in this period. The province-technology co-occurrence network (yellow lines) reveals that each province had its own distinct area of research.

From 2004 to 2008, the number of provinces in China engaged in NEV research increased. The provincial cooperation network shows that cross-province collaborative R&D had begun to emerge, with provinces gradually starting to cooperate. The province-technology co-occurrence network shows that the research areas of some provinces gradually came to overlap during this period. In the technology co-occurrence network (gray lines in the figure), core and frontier technologies sit essentially at the center of the network. At the same time, many non-core technologies are closely related to one another within their own fields, but there are few links between fields. This indicates that China’s non-core technologies in different fields had not yet been deeply integrated during this period, leaving many technological gaps and market opportunities that call for increased investment in R&D.

Five provinces newly appeared in the network from 2009 to 2013: Henan, Zhejiang, Fujian, Sichuan, and Heilongjiang. The provincial cooperation network shows that cross-province R&D cooperation became more frequent in this period. The province-technology co-occurrence network shows that Jiangsu, Anhui, Beijing, Shanghai, and Zhejiang have more similar research areas. The technology co-occurrence network consists mainly of two sub-networks, with very close links between technologies inside each sub-network and fewer links between them; the links that exist run mainly through Guangdong Province, Heilongjiang Province, and Li-ion battery technology (H01M-010/0525). In addition, frontier technologies such as vehicle battery power supply (B60L-011/18), battery pack power supply device (H02J-007/00), vehicle power controller (B60W-010/26), and vehicle safety electricity device (B60L-003/00) lie in the central area of the network and have large nodes. This indicates that these frontier technologies are the focus of China’s research and that the provinces’ continued investment has achieved results, reflecting a sound layout of frontier technologies that places China favorably for future technological competition.

The number of provinces researching NEVs in China continued to increase from 2014 to 2018, and the provinces collaborated on R&D and worked more closely with one another. Shanghai, Zhejiang, Guangdong, Beijing, and Jiangsu became the core nodes of the cooperation network, with Guangdong still at its center. During this period, the core provinces covered essentially all the research fields of China’s NEVs, and the technical differences between provinces gradually narrowed, reflecting close cooperation that promoted knowledge flow and enhanced the innovation capacity of each province. The core and frontier technologies of this period lie at the center of the network, reflecting China’s emphasis on them. The circular technology region on the right side of Figure 9 has fewer technologies directly connected to the provinces, but it has developed into a region centered on core and frontier technologies, with very close links between non-core technologies. The technology region on the left is closely linked to core technologies and provinces.

In general, over the past 20 years China has gradually increased technology investment in the NEV field and expanded the scope of research; cross-province cooperation has become more frequent, and the number of research areas in each province has gradually increased. Specifically, Guangdong has been at the center of the network since 2004, indicating that it has consistently been a core province for NEVs in China. The network provides a clear and intuitive view of each province’s main research areas, the status of technical cooperation, the distribution of core and frontier technologies, and technology co-occurrence. This helps the provinces to select partners and explore technical relationships, and promotes the high-quality development of China’s NEV industry.

Discussion

As an important pillar industry of China’s national economy, the automotive industry plays a key role in the sustainable development of China’s economy. However, the automotive industry is closely tied to the energy supply industry, and this dependence has become a bottleneck that hinders its sustainable development (Ren et al. 2020). Meanwhile, technologies such as electrification, networking, and intelligence have opened a new door for the automotive industry. The NEV, as a product of the combination of the automobile industry and the new energy industry, not only indicates the future development direction of the automobile industry but is also a breakthrough point for China to achieve ecological civilization and industrial transformation (Rao 2020). It will inevitably become the focal point through which the automobile industry escapes its development dilemma and achieves sustainable development.

The comparative analysis reveals that many of the identified frontier technologies with high technological impact in the early stages are gradually developing into core technologies of the future. The results of this study confirm that the technology prediction method proposed in this paper can well identify frontier technologies and thus achieve the prediction of future core technologies. By analyzing the characteristics of patent knowledge flow attributes, the method effectively compensates for the shortcomings of incomplete data and time lag of previous technology prediction using patent citation data (Huang and Huang 2015).

The analysis of 2014–2018 NEV technologies identifies battery testing instruments (Lin and Dai 2020), battery pack power supply devices (Asensio et al. 2020), battery box manufacturing (Hackl et al. 2020), and hybrid vehicle control systems (Dong et al. 2020) as the core technologies at this stage, which are the research hotspots that countries are competing for. Countries can focus on supporting these areas, increase support for industry, academia, and research, and raise the rate of scientific and technological transformation, so as to occupy the technological high ground and gain a competitive advantage in international competition.

The frontier technology analysis of 2014–2018 NEV technologies identifies battery conductive coupling (Dhara and Das 2020), battery pack power supply device (Asensio et al. 2020), secondary battery application (Wang et al. 2020b), battery cooling technology (Al-Zareer 2020), and electrokinetic control system (Gong et al. 2020) as the current frontier technologies. These research areas have attracted a large number of researchers worldwide within a short time and are expected to develop into core technologies of the field. Governments should encourage enterprises to cooperate closely with universities, plan their technological layout as early as possible, take an active part in cooperative innovation, promote the continuous development of frontier technologies, and help enterprises move into more advanced research fields.

The number of NEV patents held by each province in China decreases from east to west, and the number of patents in the eastern coastal region far exceeds that in the central and western regions. China’s plan to establish an NEV industry corridor based on the Yangtze River economic belt has achieved some success, with each province along the river holding more patents than its neighbors. In the future, the low-cost transportation conditions and the green, pollution-free features of the Golden Waterway should continue to be fully exploited, and the upper, middle, and lower reaches of the Yangtze River should complement one another according to their resource endowments and technical characteristics, so as to promote the synergistic development of the NEV industry in the Yangtze River economic zone.

Conclusions

This paper proposes a new technology prediction method. First, the method uses the FP-growth algorithm to identify hot technology areas in massive data. Second, it builds a knowledge flow matrix based on the knowledge flow between the main and secondary classification numbers of patents. Finally, it draws on input-output analysis to identify the core and frontier technologies. Taking NEV technology as an example, this paper validates the proposed method, identifies the core and frontier technologies in the NEV field, and then, based on the identification results, uses multiple co-occurrence analysis to study the technology development and layout of China’s NEV field.

Many of the early “frontier technologies” identified through the technology forecasting approach presented here have evolved into later “core technologies.” This supports the claim that the proposed method can be used effectively for technology forecasting. The method helps government and enterprise managers to identify the future direction of technology development, so that enterprises can adjust their technology development strategy, focus on breakthroughs, and occupy a favorable position in future technology competition.

This study uses multiple co-occurrence analysis to analyze the technology layout of NEVs in China. Compared with single co-occurrence analysis, multiple co-occurrence analysis explores the interrelationships in the data more comprehensively, systematically, and deeply, and when applied to the study of technology layout it is better suited to discovering multiple, intersecting, and latent relationships. The method allows potential partners for each province to be found quickly and supports a comparative analysis of the technology distribution and research capabilities of the provinces. It helps the provinces to explore technical relationships and select partners, so as to promote the high-quality development of China’s NEV industry.

The technology prediction method proposed in this paper can help enterprise managers to quickly mine the information and technology that is beneficial to enterprise development from the massive patent big data, and provide data support for enterprise technology innovation decision-making. The method makes up for the shortcomings of incomplete data and time lag in previous patent citation analysis, broadens the scope of application of input-output methods, and provides new research ideas for technology prediction, so as to provide useful insights for relevant researchers.