Introduction

Nanoscale technology has been regarded as one of the most important drivers of new materials and improved industrial techniques in recent decades. It enables control of matter at the molecular scale. Alongside its rapid development, nanotechnology has been widely applied and is still seen as having huge potential in fields of research ranging from medicine, food packaging and protective textiles to clean energy exploration.

The co-evolution of nanotechnology in multiple research areas has shown that diverse nano innovations are becoming more and more connected. Science research at the nano scale is believed to be converging and connecting different areas of science and technology (Porter and Youtie 2009; Roco 2005, 2008; Loveridge et al. 2008). As Porter and Youtie (2009) pointed out, if this convergence trend is true, it has (and will continue to have) important implications not only for nano scale science but also for governance and regulations of emerging technologies. Roco (2005, p.129) states that converging technologies will bring “tremendous improvements in transforming tools, new products and services, enable human personal abilities and social achievements and reshape societal relationships”.

Given the difficulty of quantifying connections and boundary changes among different fields, existing studies have produced mixed findings on the level of integration in nano research. There has been a debate on the trend and degree of interdisciplinarity in various areas of nanoscience. One group observes increasing interdisciplinarity in nano scale research and argues that nano research is becoming more and more integrative (Porter and Youtie 2009; Loveridge et al. 2008). Nicolau (2004, p.451) states that “Nanotechnology is the most interdisciplinary field so far. This interdisciplinarity is naturally enhanced by the fact that at the nano level the differences between very different disciplines, such as mechanics and chemistry, begin to blur to a large extent and leads to an acceleration of the knowledge production and transfer.”

Another group, on the contrary, claims that the degree of interdisciplinarity in nano scale research does not differ from other science and engineering research. Schummer (2004) argues that nano scale research shows no special interdisciplinarity but rather multidisciplinarity consisting of different (unrelated) fields sharing only a “nano” prefix. Hullmann and Meyer (2003), by examining the Science Citation Index journal discipline classifications, find that the disciplinary distribution of nano-scientific papers (between 1996 and 2001) is still developing and has not reached a stable shape yet.

There are also different opinions on how to measure the interdisciplinary nature of basic research. Citation flows (Bassecoulard et al. 2007; Igami and Saka 2007; Leeuwen and Tijssen 2000; Tomov and Mutafov 1996), subject categories (SCs) and journal disciplines (Porter and Rafols 2009; Porter et al. 2007; Hullmann and Meyer 2003) and co-author analysis (Igami and Saka 2007) have been widely applied. Rafols and Meyer (2007) and Porter and Rafols (2009) argue that cognitive dimensions of research (e.g. citations and references) show a high and consistent degree of interdisciplinarity, while social aspects (e.g. affiliation analysis) present a lesser and more erratic degree of interdisciplinarity. Rafols and Meyer (2007) suggest that bibliometric indicators based on citations and references capture the generation of cross-disciplinary knowledge more accurately than tracking disciplinary affiliations. In particular, Porter and Chubin (1985) support the use of citations outside category as an indicator of interdisciplinary research activity. However, Schummer (2004) states that co-author analysis can cover aspects of interdisciplinarity that other methods do not.

This paper examines the interdisciplinarity of nano research fields by another cognitive means: a vocabulary mining approach. Complementing the existing interdisciplinarity literature based on subject categories (SCs), citation and co-author analysis (Igami and Saka 2007), it uses vocabulary mining to explore the integration and overlap of various research fields where nanotechnology is applied. We regard the cognitive dimension, which explores the essence of nano application in different fields, as more important than the institutional (team) collaboration aspect. However, the cognitive dimension has more than one side. This paper therefore examines interdisciplinarity in nano research and development not only through co-keywords but also through citations. Co-word analysis reveals the degree of overlap between nano research fields, while citation analysis discloses the core and mature field of nano research. Institutional collaboration is also presented as an auxiliary means of explaining the interdisciplinary character of nanotechnology research areas. As to nano field classification, rather than using journal classification (Meyer and Persson 1998) or nano-titled papers (Schummer 2004; Braun et al. 1997), we classify nano research areas through vocabulary mining, which we believe provides a more accurate dimension for the analysis.

Methodology

Using Web of Science data, we have harvested 723,356 nano-publication records from 8,700 of the most prestigious academic journals over 12 years (1998–2009). The database is constructed using a lexical query search and definition strategy developed by the Georgia Institute of Technology (see Porter et al. 2008). By chaining 16 different algorithms, the search for nano scale scientific research yields a broad but not excessively expansive collection of hits within the Web of Science database (see more details in Newman et al. 2009, Wang and Notten 2010 and Huang et al. 2010). The nano publication data have been cleaned and noisy records excluded. For instance, records containing irrelevant keywords (e.g. nanoliter, nanometer and nano3) have been removed from our database.
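The cleaning step described above can be illustrated with a minimal sketch. It is not the Georgia Tech search strategy itself: the exclusion list holds only the three terms named in the text, and the function names, the whole-word matching rule and the sample records are assumptions made for illustration.

```python
import re

# Hypothetical exclusion list: only the three terms named in the text.
EXCLUDED_TERMS = ("nanoliter", "nanometer", "nano3")

def is_noisy(record_text, excluded=EXCLUDED_TERMS):
    """Return True if the text contains an excluded term as a whole word.

    Matching is case-insensitive and exact, so plural forms such as
    'nanometers' are NOT caught by this simple rule (an assumption).
    """
    text = record_text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text) for term in excluded
    )

def clean(records):
    """Keep only the records that contain no excluded term."""
    return [r for r in records if not is_noisy(r)]

# Made-up sample records, for illustration only.
sample = [
    "Synthesis of gold nanoparticles for drug delivery",
    "A 50 nanometer resolution imaging method",
    "Solubility of NaNO3 in water",
]
cleaned = clean(sample)
```

A real pipeline would presumably apply such a filter to titles, abstracts and keywords separately; the text does not say which fields were checked, so that choice is left open here.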

Because publication databases are built from academic journal output, our aim is to explore the possibility of linking basic nano research keywords with specific vocabularies that also have an applied focus.

The first step in our methodology is to select suitable vocabularies to mine. Looking at the larger communities or knowledge networks involved in research in these fields, we decided on the selection presented in Table 1.

Table 1 Selected thesauri for the studied nano research areas

These thesauri are processed, and the underlying controlled vocabularies used to build them are extracted. This allows for the building of controlled vocabulary keyword sets of the research fields chosen. Such an approach opens the possibility of employing set theory mathematics on the keyword sets (Srinivasan et al. 2001). We are interested here especially in the logical relationships between sets which can be explored using their intersections. Figure 1 shows a Venn diagram for five sets which we will use as a visual conceptualization of the interactions of our studied fields.

Fig. 1 A Venn diagram of five sets. Source: http://en.wikipedia.org/wiki/Venn_diagram. Note: Presented as a radially symmetrical composition of congruent ellipses, with simplified labels: “A” denotes A ∩ ~B ∩ ~C ∩ ~D ∩ ~E, “AD” denotes A ∩ ~B ∩ ~C ∩ D ∩ ~E, etc.

Next, we consider our vocabulary data from a rough set perspective, where a rough set is a tuple comprising a lower and an upper approximation of a set (Pawlak 1982). The lower approximation of a set X is the complete set of keywords that can be unambiguously classified as belonging to X, while the upper approximation of a set X is the complete set of keywords that are possible members of X. Referring to the Venn diagram in Fig. 1, we can visualize the lower approximation of the set A as the figure region denoted by the label A (expressed in formal set theory as the set A ∩ ~B ∩ ~C ∩ ~D ∩ ~E). Conversely, the upper approximation of the set A can be visualized as the union of all sixteen figure regions which contain A in their labels, i.e. A, AB, AC, etc. (expressed in formal set theory as simply the set A). We note that a rough set can be defined for each figure region in Fig. 1, e.g. the lower approximation of the region ABE can be expressed in set theory notation as A ∩ B ∩ ~C ∩ ~D ∩ E, while the upper approximation of the region ABE can be expressed as A ∩ B ∩ E. A set of particular interest is ABCDE, for which the lower and upper approximations are always equal, as this set is the intersection of all five sets considered.

To apply this to our data, we build rough sets of controlled vocabulary for each of our nano research fields. By way of example, we define the lower approximation for the controlled vocabulary of the NAL Agricultural Thesaurus as the set of all keywords that belong to this vocabulary and do not belong to any other vocabulary considered. We see this lower approximation as a set of keywords signifying the specialization of each field. The upper approximation of the NAL controlled vocabulary is thus the set of all keywords that belong to this vocabulary, regardless of any non-empty intersection with other sets. We see this upper set as comprising both field specialization keywords and general and integrative keywords. The lower and upper approximations of these research fields can then be used to investigate the evolution of the fields, as well as the interactions between them.
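The definitions above translate directly into set operations. The following is a minimal sketch, assuming each controlled vocabulary is available as a plain set of keyword strings; the function names and the toy keywords are hypothetical and are not taken from the thesauri in Table 1.

```python
def field_approximations(vocabularies, field):
    """Return (lower, upper) keyword sets for one field.

    upper: every keyword in the field's vocabulary.
    lower: keywords in the field's vocabulary and in no other vocabulary.
    """
    upper = set(vocabularies[field])
    others = set().union(
        *(set(v) for name, v in vocabularies.items() if name != field)
    )
    return upper - others, upper

def region_approximations(vocabularies, members):
    """Return (lower, upper) for the Venn region labelled by `members`.

    upper: intersection of the member sets (e.g. A ∩ B ∩ E).
    lower: that intersection minus every non-member set
           (e.g. A ∩ B ∩ ~C ∩ ~D ∩ E).
    """
    upper = set.intersection(*(set(vocabularies[m]) for m in members))
    outside = set().union(
        *(set(v) for name, v in vocabularies.items() if name not in members)
    )
    return upper - outside, upper

# Toy vocabularies with made-up keywords, for illustration only.
vocabs = {
    "A": {"nanoparticles", "spectroscopy", "soil"},
    "B": {"nanoparticles", "spectroscopy", "armor"},
    "C": {"nanoparticles", "algorithm"},
    "D": {"nanoparticles", "spectroscopy", "tumor"},
    "E": {"nanoparticles", "transistor"},
}

lower_a, upper_a = field_approximations(vocabs, "A")
lower_abd, upper_abd = region_approximations(vocabs, "ABD")
lower_core, upper_core = region_approximations(vocabs, "ABCDE")
```

In this toy data the lower approximation of A holds only "soil", and for the core region ABCDE the two approximations coincide, as the text notes they always must.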

Besides a keyword mining approach, the other main method adopted in this paper is bibliometrics. By tracing the cited literature in each studied field, citation analysis identifies the learning and referring relationships between nano research areas. In this paper we provide two types of citation analysis: one is the frequency and ratio of being cited, the other is cross-field citation. Cross-field citation can demonstrate that one field learns the common knowledge and shared technologies from others.

Integration between nano research areas: co-keyword analysis

Upper and lower approximation of five fields

Viewed from a rough set angle, as described above, each field-specific keyword set can be split into two approximations. The lower approximation is the set of keywords signifying the specialization of each field, while the upper approximation adds to these the keywords with a general and integrative meaning that are shared with other fields.

Figure 2a–e divides the publication output of each of the five analyzed fields where nanotechnology applies into upper and lower approximations. Upper approximation publication records show the trend in publication output per field for basic research that could be of interest to more than one area. Lower approximation publication records show the trend in publication output per field for basic research of interest specifically to that area. One can see that the gap between area-specific and cross-area research is widening. Among the five fields, the two lines plotted for the Medicine and health field (based on the MeSH vocabulary) are closer together than those in the other fields. This suggests that, in this field, the growth of field-specific technology is not diverging far from the more general research. For the remaining four fields, the figures indicate that general and integrative research grows faster than field-specific research, at least from the perspective of their nano-related publication records.

Fig. 2 Comparison between upper and lower approximation in nano-research fields. a Defense technical information. b ICT and computer science. c Physics, electrical and electronic engineering. d Medicine and health. e Agriculture. Source: Authors’ own calculation

Integration of five selected fields

This section explores the interdisciplinary nature of nanotechnology research for the five fields through co-keyword analysis in the cognitive sense. As shown in Fig. 1, the core region ABCDE contains the most widely shared keywords, those belonging to all five studied fields. The change of the core area over time indicates whether the five units are diverging or converging. The publication numbers of the total nanotechnology related research (lower approximation) and the core area (upper approximation), as well as comparisons between their growth rates, are provided in Fig. 3.

Fig. 3 Comparison of publication growth rates between nanotechnology total and intersection of 5 sets. Source: Authors’ own calculation

The co-keyword overlap among the five studied fields is increasing rapidly over time, not only in publication numbers but also as a percentage of total nano-science and -technology publications. The intersection ABCDE covers 7,884 publications in 1998, which is 24 % of the whole nanotechnology publication pool of that year. By 2009, the publication number had risen to 34,776, and the percentage had increased to 35 % (see Figs. 3, 4).

Fig. 4 Intersection of 5 sets as a percentage of total nanotechnology research publications. Source: Authors’ own calculation

The share of the core area (A ∩ B ∩ C ∩ D ∩ E) is rapidly increasing over time (see Fig. 4). The percentage of core-keyword related publications in the total nano research publication pool jumped from 24 % in 1998 to 35 % in 2009.
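As a worked illustration of these figures, the sketch below derives the growth of the core area and the totals implied by the reported shares. Only the four input numbers come from the text; the variable names are hypothetical, and because the shares are reported as rounded percentages the implied totals are approximations rather than reported data.

```python
# Figures reported in the text for the intersection ABCDE.
core_1998, core_2009 = 7_884, 34_776
share_1998, share_2009 = 0.24, 0.35  # rounded shares of the yearly pool

# Growth of the core area over the period.
growth_factor = core_2009 / core_1998
years = 2009 - 1998
annual_growth = growth_factor ** (1 / years) - 1  # compound annual rate

# Yearly totals implied by the rounded shares (approximate by construction).
implied_total_1998 = core_1998 / share_1998
implied_total_2009 = core_2009 / share_2009
total_growth_factor = implied_total_2009 / implied_total_1998

print(f"core grew {growth_factor:.2f}x, about {annual_growth:.1%} per year")
print(f"implied total grew {total_growth_factor:.2f}x")
```

The core area grows roughly 4.4-fold while the implied total grows about threefold, which is what a rising share of the core in the total pool requires.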

The above figures show that the interdisciplinarity among these five fields where nanotechnology is applied has been getting stronger over time. The use of the co-word method is superior, in our opinion, to the analysis of “nano-titled” papers as criticized by Schummer (2004), because the former examines publications from the cognitive point of view, i.e. looking at the content of the paper, while the latter deals only with papers selected on the basis of nano-prefixed titles.

The sharp increase of the intersection does not mean that the publications linked to each keyword grow at the same speed. Some of the top core keywords (intersection of 5 sets) are listed in Table 2. For instance, there were 144 articles with the keyword “Nanoparticles” in 1998, but this increased to 6,630 in 2009. Similarly, “Nanowires” articles boomed from 15 in 1998 to 1,800 in 2009. In contrast, articles related to the keyword “Spectroscopy” grew only moderately, from 1,054 in 1998 to 2,784 in 2009. The differing growth rates of publications related to different keywords indicate the evolution and development trends within nanotechnology research.
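The contrast between these keywords is easier to see as growth factors. The short sketch below uses only the article counts quoted above; the dictionary layout and function name are illustrative choices.

```python
# (articles in 1998, articles in 2009), as quoted in the text.
keyword_counts = {
    "Nanoparticles": (144, 6_630),
    "Nanowires": (15, 1_800),
    "Spectroscopy": (1_054, 2_784),
}

def growth_factor(first, last):
    """Ratio of the final count to the initial count."""
    return last / first

factors = {kw: growth_factor(*counts) for kw, counts in keyword_counts.items()}

# Keywords ordered from fastest to slowest growth.
ranked = sorted(factors, key=factors.get, reverse=True)

for kw in ranked:
    print(f"{kw}: {factors[kw]:.1f}x")
```

“Nanowires” grows 120-fold from a very small base, while “Spectroscopy” grows less than threefold from a much larger one, so a growth factor should be read together with the absolute counts.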

Interdisciplinary nature: citations in nano-fields

In bibliometrics, content analysis and citation analysis are both important. If we regard content analysis as the methodology that examines cognitive communication quantitatively, then citation analysis is the method that explores the frequency and pattern of links between academic works or researchers. Following the co-keyword analysis in the previous section, this part of our paper examines the interdisciplinary nature of nanoscience on the basis of citation analysis.

General citation analysis based on the Journal Impact Factor has often been adopted as a measure of research quality, on the assumption that the more a publication is cited, the better its quality. However, we argue that citations among research areas also signify links between different fields. Namely, the citation ratio across different fields indicates the degree to which similar (or common) technologies are shared.

Table 2 Top core-keywords and related publications

Publication quality: citations in five research fields

Table 3 reports the total number of publications per research area, divided into the two sets separated earlier: cross-field articles (upper approximation) and field-specific articles (lower approximation). It also presents stratified citation data, showing the number and ratio of articles cited at least 100, 200 and 500 times, as well as the average times cited per field and sub-set.

Table 3 Citation ratios in five research fields (1998–2007)
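The stratified indicators in Table 3 can be computed from a list of times-cited counts per sub-set. The sketch below shows one way to do so; the function name, the output keys and the sample counts are assumptions, and the thresholds are treated as inclusive (≥100, ≥200, ≥500).

```python
def citation_profile(times_cited, thresholds=(100, 200, 500)):
    """Summarize a list of per-article citation counts.

    Returns the article count, the mean times cited, and, for each
    threshold t, the number and ratio of articles cited at least t times.
    """
    n = len(times_cited)
    if n == 0:
        raise ValueError("times_cited must not be empty")
    profile = {"articles": n, "mean_cited": sum(times_cited) / n}
    for t in thresholds:
        count = sum(1 for c in times_cited if c >= t)
        profile[f"n_ge_{t}"] = count
        profile[f"ratio_ge_{t}"] = count / n
    return profile

# Made-up citation counts for one field sub-set, for illustration only.
sample_counts = [0, 3, 12, 150, 240, 510, 7, 99]
profile = citation_profile(sample_counts)
```

Running the same function on the upper and the lower approximation of each field would give the kind of side-by-side comparison the table is described as containing.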

Table 3 also shows that the nano-related publications in Medicine & health and ICT are the most highly cited. Medicine & health is quite consistent across the measures; for ICT, the average number of citations does not differ much from that of other fields, but the ratios in the highly cited groups (≥100, ≥200 and ≥500) are relatively high. We should nevertheless acknowledge the possibility of disciplinary differences in citation practices. It is also interesting to see that across the five fields analysed there is little divergence in average citation numbers and that, on average, field-specific (area-focussed) publications have a slightly higher ratio of citations. This latter characteristic is counter-intuitive. One would expect cross-disciplinary articles to receive more citations because of their wide applicability. However, the table shows the opposite.

Cross-field citations

Porter and Chubin (1985) state that citations outside category are an important indicator of the interdisciplinary level of research activities. Cross-field citations reflect interaction between different fields and reveal which field serves as a core for others. In contrast to the approach based on Subject Categories (Porter and Rafols 2009; Porter et al. 2007), this paper classifies scientific articles into field groups by keyword. The citation pattern across fields reveals the interdisciplinary features of the studied groups.

Table 4 provides the cross-cited information among the five fields. It shows that, without doubt, each of the five fields cited publications from their own area most. The very close percentage numbers indicate that cross-citation from outside fields is very important for all the studied fields. In particular, if we have the citations and publications standardised first, the citations from different fields are more or less equal. Without standardising, due to the different sizes of the publication pools in different fields, the citation rates vary greatly in particular for research area specific groups (i.e. research fields at lower approximation).

Table 4 Cited publications across five fields (1998–2007): standardized percentages

Given that citations lag behind publication, there is little sense in comparing early and later years. Therefore this table combines all the years together rather than presenting 1998 and 2007 separately.
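The text does not spell out how the percentages in Table 4 were standardized. One plausible reading, sketched below purely as an assumption, is to divide each raw citation count by the size of the cited field's publication pool and then express each citing field's row as percentages. The function name, field labels and numbers are hypothetical.

```python
def standardized_shares(raw, pool_sizes):
    """Convert raw cross-field citation counts into standardized percentages.

    raw[citing][cited] -- number of citations from `citing` to `cited`.
    pool_sizes[cited]  -- number of publications in the cited field.

    ASSUMED scheme: citations per cited publication, normalized so that
    each citing field's row sums to 100.
    """
    shares = {}
    for citing, row in raw.items():
        per_publication = {
            cited: count / pool_sizes[cited] for cited, count in row.items()
        }
        total = sum(per_publication.values())
        shares[citing] = {
            cited: (100 * value / total if total else 0.0)
            for cited, value in per_publication.items()
        }
    return shares

# Made-up counts for two fields, for illustration only.
raw_counts = {
    "Medicine": {"Medicine": 600, "ICT": 100},
    "ICT": {"Medicine": 300, "ICT": 200},
}
pools = {"Medicine": 3_000, "ICT": 500}

shares = standardized_shares(raw_counts, pools)
```

In this toy case the Medicine row is dominated by self-citation in raw counts (600 against 100) but splits evenly once pool sizes are taken into account, which mirrors the point that unstandardized rates vary mainly because the publication pools differ in size.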

Institutional cooperation

At the social level, inter-organizational networks have made an important contribution to the integrative innovation process of nano research fields. As Schummer (2004) argues, co-word and citation analyses reveal interdisciplinarity in terms of information, while co-author analysis focuses on the social aspect of interdisciplinarity. Considering the ever-growing publication records and the fact that many different authors (particularly Chinese and Korean) share the same names, which makes co-author analysis less reliable, this paper carries out cooperation analysis from an institutional viewpoint instead of using co-authorships.

The above analysis shows stronger connections between different research fields over time from the content and citation points of view. One may wonder what institutional features lie behind this fact. As Porter et al. (2007) indicate, institutional parameters which nurture interdisciplinarity are worth analyzing. In the nano publication pool, there are four types of affiliations, namely: academic, government, corporate, and hospital. Table 5 presents the features of institutional collaboration in 1998 and 2007.

Table 5 Institutional cooperation in 1998 and 2007
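Cooperation links of this kind can be counted from the affiliation types attached to each publication. The sketch below is one simple way to do it, assuming each paper is represented by a list of the affiliation types of its contributing institutions; the function name and the sample papers are hypothetical, and the text does not state how links were counted in Table 5.

```python
from collections import Counter
from itertools import combinations

def cooperation_links(papers):
    """Count papers on which each pair of affiliation types co-occurs.

    `papers` is an iterable of lists of affiliation types. Each distinct
    pair of types is counted at most once per paper.
    """
    links = Counter()
    for types in papers:
        for pair in combinations(sorted(set(types)), 2):
            links[pair] += 1
    return links

# Made-up papers, for illustration only.
papers_sample = [
    ["academic", "government"],
    ["academic", "corporate", "academic"],
    ["hospital"],
    ["academic", "government", "hospital"],
]
links = cooperation_links(papers_sample)
```

Sorting the types before pairing keeps ("academic", "government") and ("government", "academic") under a single key, so a table for one year can be compared cell by cell with a table for another.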

The cooperation links between Academic and the other organization types have increased most over time. The cooperation between Government and others has increased more modestly. However, the cooperation between Hospital and Corporate stays the same. The table shows that Academic and Government/NGO are the two most active organization types, having strengthened their cooperation with all the others (see column 1 and column 2).

From the above analysis, it is apparent that the institutions of Academic and Government seem to play very important roles in the innovation process of diverse nano technologies.

Conclusions

This paper introduces a new approach, keyword mining, for exploring the interdisciplinary nature of five nanotechnology related research areas. We employ recognized bibliometric techniques as well as set theory mathematics to define these five nano-related research fields (Defense technical information; ICT and computer science; Physics, electrical and electronic engineering; Medicine and health; and Agriculture), which are subsequently analyzed. The analysis covers two sets of keywords: the first contains specific keywords of direct interest to the research field, the second adds general keywords with varying degrees of overlap with other areas.

Our analysis involves both cognitive and institutional dimensions. Furthermore, the cognitive dimension covers not only the co-word aspect, but also the citation aspect. Co-keyword analysis reveals the overlapping degree of research and development between nano fields, while citation analysis provides the degree of learning common technologies from other fields.

The results of this paper show that the shared area (of the five studied fields) in the whole publication pool is increasing rapidly. Our analysis also shows that citations from outside categories share a fairly high proportion in the whole reference pool, which indicates a high rate of external learning.

Institutional cooperation analysis indicates that among the four studied organizational groups (Academic, Government, Corporate and Hospital), the cooperation links between Academic and the other organizational types have increased most over time. The cooperation between Government and others has increased more modestly. However, the cooperation between Hospital and Corporate has remained static.

Based on our analysis, we can draw the following conclusions. First, technologies involved in one research area become more diverse over time. The connections between nano-R&D fields become stronger and the general trend of interdisciplinarity in the studied fields is converging in the long run, although the degree of this convergence depends greatly on the indicators one chooses.

Secondly, the interaction pattern in different fields embodies the stage of knowledge development and transfer as well. If the knowledge is more tacit and ever-changing, there will be more informal means of knowledge transfer, e.g. oral communication or personnel mobility. However, “the more the knowledge is standardized, codified, simplified and independent, the more relevant are formal means of knowledge communication, such as publications, licenses, patents, and so on” (Breschi and Malerba 1997). From the above publication, citation and cooperation analysis, one can also see that nano technology (in all the five studied fields) becomes more mature and standardized, and as such more codified.

The keyword mining approach enables us to examine the five nano-research areas. These provide examples of integration between research areas where nanotechnology is applied. However, in order to examine sectors following an industry classification system, further research is needed.