1 Introduction

The concept of deep learning (DL) was first introduced by Hinton et al. [1, 2] in 2006 and has since been studied and applied by researchers worldwide. DL is a set of representation learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level into a representation at a higher, slightly more abstract level [3]. Representation learning allows a machine to be fed with raw data and to automatically discover the representations needed for detection or classification [3]. Unlike standard neural networks, DL stacks multiple layers and takes the output of the previous layer as the input of the subsequent layer [4, 5]. Because of its structural complexity and high computational cost, DL initially received little attention. With advances in computer technology and the remarkable performance of DL compared with traditional machine learning (ML), DL has attracted much more attention and has developed rapidly in recent years [6]. Many international companies have built successful DL applications, such as Google’s AlphaGo and Deep Dream, Facebook’s Deep Text, Baidu’s unmanned ground vehicle and IFLYTEK’s speech recognition system. In addition, DL has been widely applied in image recognition [7,8,9,10], handwritten character recognition [11, 12], semantic segmentation [13, 14], human-level control [15], face recognition [16], face detection [17, 18], face spoofing [19], human action recognition [20] and medical image analysis [21].

Owing to this rapid development and wide range of applications, several overviews of DL have been published. Guo et al. [22] surveyed face recognition using deep learning. Other overviews have covered the application of DL in radio resource allocation [23], physical layer communication [24], object detection [25], speech emotion recognition [26], sentiment classification [27] and wireless signal recognition [28]. Shu et al. [29] reviewed the development of DL in medical image analysis. Furthermore, a survey of DL-based computer-aided diagnosis (CAD) systems developed for mammography and breast histopathology images was presented in [30], and the DL methods used to automatically detect and classify pulmonary nodules in medical images, together with their performance, were analyzed in [31]. Most of these overviews cover only part of the applications of DL and take a traditional review perspective. However, few overviews of DL have used bibliometrics. Faust et al. [32] used science mapping and bibliometrics only to analyze the keywords provided by authors.

Bibliometrics is a discipline based on quantitative analysis that combines many disciplines, such as philology, information science, mathematics and statistics [33]. It is a relatively important branch of intelligence science [34] and can be used to analyze the characteristics of the publications in a certain research direction or in a specific journal. Bibliometrics can reveal the internal structures and relationships of the publications. It has been applied to analyze the performance of international journals, such as European Journal of Operational Research [35], IEEE Transactions on Fuzzy Systems [36], Information Sciences [37] and Knowledge-Based Systems [38]. In terms of research directions, big data research in healthcare informatics [39], learning analytics [40], and genetic algorithms [41] have been explored using bibliometric methods and science mapping.

Performance analysis and science mapping are the two procedures of bibliometric evaluation. Performance analysis evaluates the performance of different scientific actors based on publications and citations [38], while science mapping illustrates the structure, evolution and dynamic features of scientific research [42]. Cobo et al. [43] compared several visualization tools for science mapping. VOS viewer and Cite Space are the most frequently used tools, and both are used in this paper. VOS viewer can display networks of co-citation, co-authorship, co-occurrence, citation and bibliographic coupling in an easy-to-interpret way [44]. Cite Space specializes in identifying emergent terms by burst detection, independently of how often the publications themselves are cited [45], and in detecting hotspots and research trends by timeline review.

The aim of this paper is to provide a comprehensive bibliometric analysis of DL publications from 2007 to 2019 (the first publication with the keywords “deep learning” and “machine learning” was published in 2007), covering publication, citation and cooperation structures as well as research trends. The main contributions of this paper are as follows:

  1. In addition to the annual publication counts, the types and the research directions of the publications, the publication structure is analyzed from the perspective of countries/regions, institutions and authors.

  2. The co-citation structures of countries/regions, institutions, authors and papers are illustrated with the science mapping tool VOS viewer, and the most cited of each are analyzed.

  3. The cooperation networks of countries/regions, institutions and authors are displayed with VOS viewer, and the corresponding strongest collaborative relationships are listed.

  4. The timeline review and citation burst detection of keywords are produced with Cite Space for an in-depth analysis of the hot spots and research trends of DL.

The rest of this paper is organized as follows. Sect. 2 describes the data source and preprocessing. The publication structure is analyzed in Sect. 3. Section 4 analyzes the citation structures of countries/regions, institutions, authors and papers. Based on VOS viewer, the cooperation networks of countries/regions, institutions and authors are illustrated in Sect. 5. Further analysis, namely the timeline review and the citation burst detection of keywords, is given in Sect. 6. Section 7 concludes the paper.

2 Data source and preprocessing

The Web of Science (WoS) contains a wealth of data on scientific content, impact and collections from 1990 to the present day on a global scale, and it is used worldwide. As one of the databases in WoS, the WoS Core Collection gives researchers access to a large number of authoritative journals and publications together with their related information, which can be exported and imported into bibliometric analysis platforms. The WoS Core Collection includes 6 citation indexes (Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Arts and Humanities Citation Index (A&HCI), Conference Proceedings Citation Index-Science (CPCI-S), Conference Proceedings Citation Index-Social Science & Humanities (CPCI-SSH) and Emerging Sources Citation Index (ESCI)) and 2 chemical indexes (Current Chemical Reactions (CCR-EXPANDED) and Index Chemicus (IC)).

To ensure the accuracy and comprehensiveness of the data, we checked all the above databases and found SCI-EXPANDED, SSCI, CPCI-S, and CPCI-SSH to be the most precise group of databases. Because DL is a part of ML, the search topic was set as “deep learning” and “machine learning”; this search returned 6,145 publications, the earliest of which was published in 2007. Thus, the search settings were as follows: topic “deep learning” and “machine learning”, timespan 2007 to 2019, and databases SCI-EXPANDED, SSCI, CPCI-S, and CPCI-SSH. After manual filtering, 5722 publications remained, and their related information (record content set as “full record and cited references”) was exported from WoS as plain text on January 16, 2020.
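The export described above is a tagged plain-text file. As a minimal sketch, and assuming the usual field-tagged layout of such exports (two-letter tags such as PT, AU, TI and PY, indented continuation lines, and ER closing each record), the records could be loaded and counted per year as follows. The function name `parse_wos_plain_text` and the sample records are hypothetical and are not part of the original study.

```python
from collections import Counter

FILE_LEVEL_TAGS = {"FN", "VR", "EF"}  # file header/footer tags, not record fields


def parse_wos_plain_text(text):
    """Parse a field-tagged export into a list of {tag: [values]} dicts."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines
        head, value = line[:2], line[3:].strip()
        if head == "ER":  # end of one record
            records.append(current)
            current, tag = {}, None
        elif head in FILE_LEVEL_TAGS:
            continue
        elif head.strip():  # a new two-letter field tag
            tag = head
            current.setdefault(tag, []).append(value)
        elif tag is not None:  # indented continuation of the previous field
            current[tag].append(value)
    return records


# Hypothetical sample in the assumed layout (not real records).
sample = "\n".join([
    "FN Clarivate Analytics Web of Science",
    "VR 1.0",
    "PT J",
    "AU Doe, J",
    "   Roe, R",
    "TI A hypothetical deep learning paper",
    "PY 2018",
    "ER",
    "",
    "PT C",
    "AU Poe, E",
    "TI Another hypothetical record",
    "PY 2019",
    "ER",
    "",
    "EF",
])

records = parse_wos_plain_text(sample)
annual_counts = Counter(r["PY"][0] for r in records)  # publications per year
print(len(records), dict(annual_counts))
```

Keeping every field as a list of lines means multi-valued fields such as authors need no special handling, at the cost of indexing `[0]` for single-valued fields such as the publication year.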

3 Publication structure analysis

In this section, the publication structure is analyzed from three aspects: annual publication, productive countries/regions, and productive institutions and authors.

3.1 Annual publication

According to the WoS data, the annual number of DL publications from 2007 to 2019 is illustrated in Fig. 1.

Fig. 1 Annual publication from 2007 to 2019

As Fig. 1 shows, the first publication appeared in 2007, and no further publication appeared until the second one in 2010. During the 5 years from 2010 to 2014, the annual number of publications did not exceed 100. In 2015, the number of publications exceeded 100 for the first time, and it has kept increasing since. In particular, the sharp increase in the last 2 years indicates the rapid development of this area. According to WoS, the main types of these publications are analyzed and the top 10 types are shown in Fig. 2.

Fig. 2 The top 10 types of the publications

Articles are the most common type, with 2843 publications, which accounts for 49.69% of all 5722 publications. Proceedings papers rank second, with 2546 publications (44.50%). The other types are review (366), early access (83), editorial material (39), meeting abstract (9), book chapter (5), correction (3), letter (3), and data paper (2). Based on the WoS analysis, the research directions of the publications are shown in Fig. 3.

Fig. 3 The top 10 research directions of the publications

In Fig. 3, computer science and engineering are the two most popular research directions. Computer science accounts for 2893 publications (50.56% of the total), followed by engineering with 2150 publications (37.57%). Beyond these two directions, research is also widespread in telecommunications (631, 11.02%), radiology nuclear medicine medical imaging (343, 6.00%), optics (236, 4.12%), mathematical computational biology (220, 3.85%), imaging science photographic technology (218, 3.81%), chemistry (214, 3.74%), biochemistry molecular biology (193, 3.37%) and physics (189, 3.30%). This suggests that DL is not only well developed in theory and methodology but also broadly applied.

3.2 Productive countries/regions

To reflect the output of countries/regions, the top 10 productive countries/regions and their annual publications are illustrated in Fig. 4.

Fig. 4 The top 10 productive countries/regions from 2007 to 2019

According to the statistics, the USA (1851 publications), China (1242 publications), England (369 publications), India (343 publications), Germany (310 publications), South Korea (302 publications), Canada (250 publications), Japan (241 publications), Australia (192 publications), and Italy (170 publications) are the top 10 most productive countries/regions, and their total number of publications is 5,100, which accounts for 89.13% of all 5722 publications. The first publication was written by authors from the USA, and American authors published one paper in each of 2010 and 2011. Most of these 10 countries/regions have published papers since 2013, while Australia published its first paper in 2012 and the USA in 2007. The annual output of the 10 countries/regions has increased steadily, and the publications of the top 6 countries/regions exceed 100. The USA was the first country to exceed 100 publications in a year, doing so in 2016 (110 publications), and its annual output has remained higher than that of the other countries/regions in the following years. The USA has published more than China in most years, the exceptions being 2014 and 2015. Together, the two most productive countries, the USA and China, account for 3,093 publications, or 54.05% of all 5,722 publications. It is apparent that the USA and China have contributed most to DL research.
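The combined share of the two most productive countries is a simple ratio of counts. A minimal sketch using the USA and China figures quoted above; `share_percent` is an illustrative helper name, not something taken from the study.

```python
def share_percent(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


TOTAL_PUBLICATIONS = 5722
usa, china = 1851, 1242  # publication counts quoted in the text

combined = usa + china
combined_share = share_percent(combined, TOTAL_PUBLICATIONS)
print(combined, combined_share)
```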

3.3 Productive institutions and authors

In this subsection, the publications are analyzed from the perspective of institutions and authors. The relevant data were downloaded from WoS, and the top 10 productive institutions and the top 14 productive authors are shown in Fig. 5 and Table 1, respectively.

Table 1 The top 14 most productive authors
Fig. 5 The top 10 productive institutions

As shown in Fig. 5, the top 10 productive institutions are CHINESE ACAD SCI (China), STANFORD UNIV (USA), MIT (USA), TSINGHUA UNIV (China), UNIV CHINESE ACAD SCI (China), HARVARD MED SCH (USA), GEORGIA INST TECHNOL (USA), IMPERIAL COLL LONDON (England), UNIV ILLINOIS (USA) and UNIV OXFORD (England). The number of publications of CHINESE ACAD SCI (122) is almost twice that of STANFORD UNIV (66). Five of the institutions are from the USA, while the other five are from China (3) and England (2).

Table 1 lists the most productive authors, each with at least 8 publications, together with their countries/regions. P represents the number of publications. Only 5 authors have published more than 10 publications in the area, and the largest number of publications is 15. The 14 most productive authors are from China (3), India (3), Japan (2), the USA (2), England (1), Germany (1), Italy (1), and Switzerland (1).

4 Citation structure analysis

To demonstrate the influence of the publications, this section examines citations from four aspects: the citations of countries/regions, institutions and authors are illustrated in turn, and the top 10 most cited publications are listed. P, C and AC represent the number of publications, the number of citations and the average citations per publication of an object, respectively. The indicator Link represents the number of objects with which an object is co-cited, and TLS, the abbreviation of total link strength, indicates how frequently the object is cited together with others.

4.1 The most influential countries/regions

At present, 106 countries/regions have published at least one paper in the area. Of these, 94 form 1036 pairs that have been cited together 11,282 times in total; some pairs have been cited together more than once. The citation network of the 94 countries/regions is illustrated in Fig. 6.

Figure 6 shows the co-citation network of the countries/regions that have published papers related to DL. Linked nodes represent countries/regions that have been cited together, and the size of a node indicates the number of citations of the corresponding country/region: the bigger the node, the more citations. The thicker the link, the more frequently the two countries/regions are cited together. For further study of the citation structure, the top 13 most cited countries/regions, each with more than 1000 citations, and their related information are presented in Table 2.
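To make the co-citation idea concrete, here is a minimal sketch under simplifying assumptions: each citing paper is reduced to the set of items it cites, and two items are co-cited once for every citing paper that lists both. The reference lists are hypothetical, and the sketch is not claimed to reproduce the counting or normalization that VOS viewer applies.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each inner list holds the items cited by one paper.
reference_lists = [
    ["USA", "China", "Canada"],
    ["USA", "China"],
    ["Canada", "Switzerland"],
]


def cocitation_counts(reference_lists):
    """Count how often each unordered pair of items is cited together."""
    pairs = Counter()
    for refs in reference_lists:
        # Sorting makes (a, b) and (b, a) map to the same key.
        for a, b in combinations(sorted(set(refs)), 2):
            pairs[(a, b)] += 1
    return pairs


pairs = cocitation_counts(reference_lists)
# Link strength of a pair = number of citing papers that list both items.
print(pairs[("China", "USA")], len(pairs))
```

In this toy data the pair count plays the role of link thickness in the figure, and the number of distinct pairs corresponds to the number of links.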

Fig. 6 The citation network of countries/regions

Table 2 The top 13 most cited countries/regions

In Table 2, the countries/regions are ordered by number of citations. The number of links is the number of countries/regions with which a country/region is cited together. As the most productive country, the USA also has the most citations, with an average of 11.57 citations per publication, which means that each publication from the USA is cited by 11–12 papers on average. Canada has published just 250 papers in the area but has the second highest number of citations (10,994) and the highest average citation (43.98). The two most cited of these 250 papers are “Dropout: A simple way to prevent neural networks from overfitting” [46] (6118 citations, published in 2014) and “Representation learning: A review and new perspectives” [47] (2501 citations, published in 2013). The citations of these two papers account for 78.41% of the 10,994 citations. Switzerland has the second highest average citation (38.50): it has been cited 4466 times with only 116 publications. Table 2 thus suggests that the number of citations does not always correspond to the number of publications. A high citation count does not imply a large number of publications, and conversely a large number of publications may attract relatively few citations.
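The average citation indicator is simply AC = C / P. A minimal sketch using the publication and citation counts quoted above for Canada and Switzerland; the helper name `average_citations` is illustrative.

```python
def average_citations(citations, publications):
    """AC = C / P, rounded to two decimals."""
    return round(citations / publications, 2)


# (P, C) pairs quoted in the text.
countries = {"Canada": (250, 10994), "Switzerland": (116, 4466)}

ac = {name: average_citations(c, p) for name, (p, c) in countries.items()}
print(ac)
```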

4.2 The most influential institutions

In this subsection, we analyze the citation network at the level of institutions. According to the records, 4656 institutions have published in the area and 205 of them have published at least 10 papers. Moreover, 218 institutions have been cited more than 100 times, and Fig. 7 displays their co-citation relationship.

Fig. 7 The citation network of institutions

The 218 institutions are grouped into 11 clusters differentiated by color. A line between two nodes indicates that the linked institutions have publications cited together, and its thickness indicates the strength of the co-citation relationship. The larger the node, the more the publications of the institution are cited. To make this clearer, the top 12 most cited institutions, each cited more than 1000 times, and their related information are listed in Table 3.

Table 3 The top 12 most influential institutions

As shown in Table 3, half of the 12 institutions are from the USA, 2 are from Switzerland, and 2 are from Canada. The other two are from China and England. UC BERKELEY EECS, SUPSI, and UNIV LUGANO have each published just one paper in the area but have the three highest citation counts and average citations. The Canadian institutions UNIV TORONTO and UNIV MONTREAL have published 34 and 12 papers in the area and have been cited 6488 and 2711 times, respectively. These two institutions rank fourth and fifth in average citations. Link is the number of institutions with which an institution is co-cited, and TLS represents the strength of the co-citation relationship. For example, UNIV TORONTO (with 808 Link and 1429 TLS) has been co-cited 1429 times with 808 institutions.

4.3 The most influential authors

In the following, the citations of authors and their co-citation relationship are described. In total, 20,711 authors have published papers in the area, and 383 authors have been cited more than 100 times. Of these 383 authors, 339 form the largest co-citation network, shown in Fig. 8.

Fig. 8 The citation network of authors

The 339 authors are clustered and differentiated by color. The size of a node represents the citations of the author, and a line links two authors whose publications are co-cited. The thicker the link, the more often the two authors are co-cited. Table 4 shows the related information of the top 17 authors, each with at least 1000 citations.

Table 4 The top 17 most influential authors

As we can see, except for Bengio, Yoshua, all the other 16 authors published only one paper in the area and yet gained the most citations. The P, C, AC, and TLS values are identical within the first 5 authors, within the next 8 authors, and within the last 2 authors. Checking WoS shows that each of these three groups co-authored a single paper and that these three papers have been cited many times. For further study, the most influential publications are analyzed in the next subsection.

4.4 The most influential papers

To further analyze the features and content of the publications, the citation and co-citation of papers are illustrated in the following. The co-citation relationship is depicted as the network shown in Fig. 9. The top 10 most influential publications, i.e. those with the most citations, are listed in Table 5.

Table 5 The top 10 most influential publications
Fig. 9 The co-citation network of articles

In Fig. 9, 114 of the 161 publications cited more than 50 times form the largest connected network. Linked publications have been co-cited, and the size of a node corresponds to the citations of the publication. The publications are grouped into 7 clusters differentiated by 7 colors. Attention is paid to the most influential publications, and the top 10 most cited publications and the corresponding information are listed in Table 5.

Table 5 lists 10 publications. Nine of them are collaborative publications; the exception is the review “Deep learning in neural networks: An overview” [49], which was written by a single author, Schmidhuber, Juergen. As for the type of the publications, 4 are reviews, another 4 are articles and the remaining 2 are proceedings papers. The research directions of most of the 10 publications are computer science and engineering; others concern Automation and Control Systems, Neurosciences and Neurology, General and Internal Medicine, Geochemistry and Geophysics, Remote Sensing, Imaging Science and Photographic Technology, Chemistry, and Instruments and Instrumentation. All of the 10 publications have been cited more than 300 times, and the top 4 have been cited more than 2,500 times, while the other 6 have been cited fewer than 1000 times. The top 10 publications were published during the period 2013–2017, mainly in 2016.

5 Cooperation analysis

To analyze cooperation, the cooperation networks of countries/regions, institutions and authors are constructed using VOS viewer. In this section, P represents the number of publications of an object. The indicator Link represents the number of partners the object cooperates with, and the total link strength (abbreviated as TLS) indicates the total number of times the object has collaborated with others.

5.1 Cooperation network of countries/regions

The relevant data were downloaded from WoS and imported into VOS viewer. With the minimum number of documents per country/region set to 20, 47 of the 106 countries/regions meet the threshold and form the collaboration network shown in Fig. 10.

Fig. 10 The cooperation network of countries/regions

The 47 countries/regions are grouped into 6 clusters, differentiated by 6 colors. A link between two nodes means that there is cooperation between them, and the width of the link represents the link strength, i.e., the frequency of cooperation. The size of a node represents its TLS, which is the sum of the strengths of all its links. In Fig. 10, the USA, China and England are the three countries with the highest TLS, and the link between the USA and China is the strongest. For further study of the collaboration among these countries/regions, the top 6 pairs of countries/regions with the strongest cooperation relationship are presented in Table 6.
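The relationship between link strength, Link and TLS described above can be sketched on a small weighted edge list. All edge weights below are hypothetical and chosen only for illustration; the function name `links_and_tls` is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical weighted cooperation edges: (node_a, node_b, link strength).
edges = [
    ("A", "B", 5),
    ("A", "C", 2),
    ("B", "C", 1),
]


def links_and_tls(edges):
    """Return {node: (number of links, total link strength)}."""
    links = defaultdict(int)
    tls = defaultdict(int)
    for a, b, strength in edges:
        for node in (a, b):
            links[node] += 1         # one more cooperating partner
            tls[node] += strength    # TLS sums the strengths of all links
    return {node: (links[node], tls[node]) for node in links}


stats = links_and_tls(edges)
print(stats)
```

Each undirected edge contributes to both of its endpoints, which is why a node's TLS can exceed its number of links whenever a partner appears more than once.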

Table 6 The top 6 countries/regions with the strongest cooperation relationship

Table 6 shows the 6 pairs of countries/regions with the strongest cooperation relationship. Total cooperation strength, defined as TLS/P, measures the degree of cooperation. Cooperation strength is the link strength divided by TLS. As the country with the highest TLS and the most collaborating countries/regions, the USA has collaborated with 73 countries/regions, most frequently with China, England, Canada, Germany, and South Korea. Of all the cooperation between the USA and other countries/regions, 23.45% is with China. As for China, 239 of its 582 cooperation links are with the USA, which accounts for 41.07%. The total cooperation strength of the USA and China is 55.05% and 46.86% respectively, i.e. about one international cooperation link for every two publications, while that of England, Germany, and Canada is considerably higher (129.27%, 97.10% and 84.80% respectively). In particular, the total cooperation strength of England exceeds 100.00%, which means that its publications involve more than one cooperating country/region on average.
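The two ratios defined above can be checked directly against the values quoted for China (P = 1242, TLS = 582, link strength with the USA = 239). A minimal sketch; the function names are illustrative.

```python
def total_cooperation_strength(tls, publications):
    """TLS / P as a percentage rounded to two decimals."""
    return round(100 * tls / publications, 2)


def cooperation_strength(link_strength, tls):
    """Share of a node's TLS that falls on one partner, in percent."""
    return round(100 * link_strength / tls, 2)


# Values for China quoted in the text.
china_p, china_tls, china_usa_link = 1242, 582, 239

china_total = total_cooperation_strength(china_tls, china_p)
china_with_usa = cooperation_strength(china_usa_link, china_tls)
print(china_total, china_with_usa)
```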

5.2 Cooperation network of institutions

According to the analysis, 338 of the 4656 institutions have published at least 7 publications in the area, and 328 of these 338 institutions form the largest connected network, shown in Fig. 11.

Fig. 11 The cooperation network of institutions

Linked items are cooperators; the thicker the link, the stronger the cooperation relationship. The size of a node denotes its total link strength. According to VOS viewer, 10 institutions have a total link strength over 100, and their related information is presented in Table 7.

Table 7 The top 10 institutions with the highest total link strength (TLS)

The number of links is the number of cooperating institutions. CHINESE ACAD SCI, the institution with the most publications, is also the institution with the most cooperators. It has cooperated with 155 institutions and its TLS is 235, which means that it has cooperated with some institutions more than once. For each institution, Link is greater than P and TLS is greater than Link, which means that some publications are collaborative works completed by more than two cooperators. For further study of the collaboration, the 6 pairs of institutions with the strongest cooperation relationship and their related information are listed in Table 8.

Table 8 The top 6 couples of institutions with the strongest cooperation relationship

As we can see, CHINESE ACAD SCI and UNIV CHINESE ACAD SCI have the strongest cooperation relationship, having worked together 43 times. For CHINESE ACAD SCI, whose TLS is 235, 18.30% of its cooperation is with UNIV CHINESE ACAD SCI, and 35.83% of UNIV CHINESE ACAD SCI’s cooperative works were completed with CHINESE ACAD SCI. The pair with the second strongest collaborative relationship, HARVARD MED SCH and MASSACHUSETTS GEN HOSP, completed 17 publications together. HARVARD MED SCH has 50 publications, 103 cooperators and a TLS of 141, and 12.06% of its 141 cooperations were with MASSACHUSETTS GEN HOSP. As for MASSACHUSETTS GEN HOSP, 16.04% of its 106 cooperative works were accomplished with HARVARD MED SCH. MASSACHUSETTS GEN HOSP published only 28 papers but has accomplished 106 cooperations with 69 institutions, which means that it has collaborated with a number of these institutions more than once. The Link of TECH UNIV DENMARK (11) is smaller than its P (12), which suggests that at least one of its publications is not a cooperative work but was completed by the institution alone.

5.3 Cooperation relationship of authors

To reflect the cooperation relationship between authors, VOS viewer is applied to analyze and depict the cooperation network of authors. According to VOS viewer, 20,711 authors have published papers on this topic, and 2856 of them form the largest author cooperation network, shown in Fig. 12.

Fig. 12 The cooperation network of authors

The cooperation network of the 2856 authors is depicted in Fig. 12. The size of a node denotes the TLS of the author, i.e. the frequency of the author’s cooperation with others, and a link between two nodes indicates that the two authors have cooperated. According to VOS viewer, the network shown in Fig. 12 contains 10,962 Link and 11,842 TLS. To make this clearer, the top 6 authors with the highest TLS and their related information are listed in Table 9. The strongest cooperation network is also displayed in Fig. 13.

Table 9 The top 6 authors with the largest total link strength (TLS)

Fig. 13 The strongest cooperation network of authors

As shown in Table 9 and Fig. 13, the top 6 authors with the highest TLS cooperate with each other and form the strongest collaboration relationship. Based on the analysis of VOS viewer, these 6 authors have also worked with others but remain each other’s strongest cooperators. Further analysis of co-authorship can be carried out based on the research of Chuan et al. [55, 56].

6 Timeline analysis and burst detection

For further study of DL, a timeline review of keywords is conducted, and the top 24 keywords with the strongest citation bursts are detected. To analyze the research trend of this area, Cite Space is used to produce the timeline review, which reflects the research emphasis in different periods and how the research trend has changed over time. The timeline review of keywords is illustrated in Fig. 14. The 24 keywords with the strongest citation bursts identified by Cite Space are listed in Table 10.

Table 10 The top 24 keywords with the strongest citation bursts

According to Cite Space, the keywords are classified into 25 clusters, which are shown in Fig. 14 and numbered from 0 to 24. Each cluster is labelled with keywords by the log-likelihood ratio (LLR) algorithm. Part of the research concerns applications of DL, and most of these applications are related to medical science, such as drug discovery, epigenetics, neuroimaging, computer-aided diagnosis, atherosclerosis, diabetic retinopathy, automated treatment planning, protein-protein interaction, and melanoma. DL is also used in areas such as object detection, semiconductor manufacturing, neuromorphic computing, human activity recognition, opinion mining, ecoacoustics, acoustic emission, land cover classification and system identification. The remaining research is more about the core technology of DL, such as adversarial machine learning, data science, feature learning, brain-computer interface, and image memorability. Research on feature learning dates from the very beginning of DL, while research related to drug discovery, neuroimaging, computer-aided diagnosis, acoustic emission and adversarial machine learning sprang up in 2014. In 2015 and 2016, DL was applied in other areas, especially in medical science. In short, the applications of DL have developed along with DL itself.

Fig. 14 The timeline review of keywords

The citation burst of a keyword reflects the period in which the keyword is cited much more frequently. The row titled “2007–2019” is the timeline, and the red segment marks the citation burst period of the keyword. Since 2013, keywords such as deep learning, deep belief network, algorithm, object recognition and network have received much attention. With a burst strength of 39.6584, deep learning was cited frequently from 2013 to 2016. The burst period of the keywords algorithm and object recognition was also from 2013 to 2016. The citation bursts of the keywords deep belief network and network lasted from 2013 to 2017 and from 2013 to 2015 respectively. During 2014–2016, keywords such as neural network, feature learning, restricted Boltzmann machine, and recognition received more citations than in other periods. Seven of the 24 keywords had citation bursts during 2015–2016, and 5 of the 24 keywords received many more citations from 2015 to 2017 than in other periods. Only three keywords, computer vision, stacked auto-encoder and QSAR (Quantitative Structure-Activity Relationships), have been cited more frequently since 2016. Among them, the citation burst of QSAR lasted from 2016 to 2019, while those of the other two keywords ended in 2017.

With the development of DL, various DL models have been proposed and applied. The most typical DL models are the stacked auto-encoder (SAE), deep belief network (DBN), convolutional neural network (CNN), recurrent neural network (RNN) and generative adversarial network (GAN). Most other DL models are variants of these models. The features, main variants and main applications of these DL models are summarized in Table 11.

Table 11 The most typical deep learning models

DL has various advantages and can be combined with several research fields. In the future, it may be used more in decision-making environments, such as decision support. With massive historical data, the decision-making process could become simpler and perform better. DL has performed well in some areas and applications, but training a DL model requires a large amount of data and high-performance computing, which limits the range and settings of its application. Future research could focus on solving this problem by improving DL or combining it with other techniques.

7 Conclusions

In this paper, a comprehensive overview of DL from 2007 to 2019 is presented. After pre-processing, 5722 publications are selected and the relevant information is exported from WoS. The publication structure is examined according to the WoS records. The citation structure and the cooperation networks of countries/regions, institutions and authors are analyzed by using VOSviewer. In addition, further analysis is carried out based on the timeline review and citation burst detection of keywords, which are depicted by CiteSpace.

  1. Since 2017, the number of publications has increased sharply, and the number of publications in the most recent 2 years is around 2000, far more than in other years. In terms of publication type, 94.19% of the 5722 publications are articles and proceedings papers. In total, 106 countries/regions have publications in the area, and 3,093 publications come from the USA and China, which accounts for 54.05%. Institutions and authors are widely distributed: 4656 institutions and 20,711 authors have publications in the area.

  2. The analysis of the citation structure shows that a high number of publications does not imply a high number of citations. The most productive countries/regions, institutions, and authors are not always the most influential ones. The USA and Canada are the most cited countries, each with more than 10,000 citations. The institutions with the highest citations and average citations, UC BERKELEY EECS, SUPSI, and UNIV LUGANO, have each published just one paper in the area. Except for Bengio, Yoshua, all the other 16 most cited authors published only one paper each. These 16 authors form 4 co-author groups, whose 4 papers are also the top 4 most cited publications. The quality of publications therefore matters much more than their quantity.

  3. The cooperation relationship between the USA and China is the strongest among countries/regions, while CHINESE ACAD SCI and UNIV CHINESE ACAD SCI have the strongest cooperation relationship among institutions, having collaborated 43 times. However, collaboration among authors is not very strong overall, because only 10,962 links and 11,842 TLS (total link strength) exist among 20,711 authors.

  4. DL has been applied widely, especially in the area of medical science. The hotspots of DL differ from period to period, and the research trend varies over time.
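The share and the collaboration figures quoted in the findings above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in this section; the density and per-author averages rest on the assumption that co-authorship links are undirected pairs counted over all 20,711 authors, which the text does not state explicitly.

```python
# Figures quoted in the findings above.
total_publications = 5722
usa_china_publications = 3093
authors = 20711
links = 10962
total_link_strength = 11842

# Share of publications contributed by the USA and China.
share = usa_china_publications / total_publications

# Assumption: links are undirected pairs over all authors.
possible_links = authors * (authors - 1) // 2
density = links / possible_links
mean_links_per_author = 2 * links / authors
mean_link_strength = total_link_strength / links

print(f"USA + China share: {share:.2%}")
print(f"network density: {density:.2e}")
print(f"mean links per author: {mean_links_per_author:.2f}")
print(f"mean strength per link: {mean_link_strength:.2f}")
```

Under these assumptions the share comes to about 54.05%, the density to roughly 5.1e-05, and each author has on average about 1.06 links with a mean strength of about 1.08 per link, which is consistent with the observation that author collaboration is sparse.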

This paper provides a comprehensive bibliometric analysis of DL from 2007 to 2019, covering the publication, citation and cooperation structures and the research trend. Although the findings in this paper do not cover all the information, the paper offers valuable and enlightening insights for researchers who are interested in DL. In the future, we will continue to follow the development of DL, especially the development, variants and applications of different DL models, such as SAE, DBN, CNN, RNN and GAN.