Introduction

The generation and accumulation of big data in information and communication technology (ICT) has motivated various responsive communication studies focusing on social change. The cumulative nature of communication studies makes it necessary for scholars to be aware of research trends (Kamhawi and Weaver 2003; Park and Leydesdorff 2008). Previously, scholars manually reviewed selected papers from a given time window using discrete meta-analyses (Cho and Khang 2006; Kamhawi and Weaver 2003; Khang et al. 2012; Trumbo 2004). Recently, scholars have applied computer-assisted bibliometrics (Borgman 1989) to large numbers of communication publications to overcome the selectivity of manual reviews, which stems from human cognitive limitations (Feeley 2008; Park and Leydesdorff 2009; Peng et al. 2012; Potter and Riddle 2007; Zhang and Leung 2014).

Bibliometric analysis can cover all publications related to a given theme or field within a specified time window, whereas traditional meta-analysis restricts the number of publications using sampling (Cooper et al. 1994). Traditionally, bibliometric analysis has dealt with citation relationships among publications, the impact of journals based on such relationships, and the scholarly impact of an actor, e.g., an author or an organization. Rice et al. (1988) applied bibliometric analysis to capture citation flows and impact factors. They also conducted network analysis to identify the structural features of journal-to-journal citation patterns. Others have used author data in bibliometric network analysis to identify central players (Griffin et al. 2016). Recently, bibliometric analysis has been extended to titles, abstracts, keywords, and full texts by incorporating semantic network analysis (Doerfel and Barnett 1999; Chung et al. 2013) and text mining (Yan 2015). Peng et al. (2012) built a word co-occurrence network of Internet studies and analyzed nodes and clusters in the network to recognize patterns and trends.

A primary goal of this article is to provide a comprehensive knowledge structure of communication studies. To achieve this, we abstract groups of semantically associated terms into concepts such as methods and subjects, and determine connections between these methods and subjects. Thus, we propose a subject–method topic network analysis method for analyzing subjects, methods, and their relations.

We derive a bipartite subject–method topic network from abstracts published over a given period. All research considers a subject ontologically and epistemologically to determine a research question and the corresponding solutions (Fink and Gantz 1996). Typically, a research focus (i.e., the subject) and a method are features of scholarly papers (Khang et al. 2012; Kim and Weaver 2002). An abstract summarizes a paper by specifying the research problem(s), key subjects, and main results described in the full text (Booth et al. 2003).

Our study contributes to the discovery of structural preferences in subjects and methods in communication studies over 25 years based on data science. Taipale and Fortunati (2014) described preferences in research focus and problem-solving methodologies for mobile communication studies from five journals over a period of 20 years. Oxley et al. (2010) pointed out that well-defined and focused empirical studies are preferable to theoretical developments in some research fields. Thus, in the current era of massive content production, automatically identifying preferences from various perspectives is important. Unlike investigations that are limited to samples of communication studies, this study reveals the importance of traditional subjects and methods, as well as new ones, such as web analysis and new media literacy, by incorporating all communication studies in the given period.

Related work

Traditional reviews of trends in communication studies

Communication studies is a discipline that examines human communication processes. It includes various subjects, such as history of media and media studies, media law and ethics, political communications, broadcasting, the Internet, social media, mobile communication, human–computer interaction/interface, advertising, cultural studies, film studies, personal communication, international communication, audience studies, journalism, journalism trends and education, and national motion and time studies (Calhoun 2011). In the social sciences, research into such subjects often employs both qualitative and quantitative methods, such as participatory experiments, questionnaire surveys, in-depth interviews, literature reviews, discourse analysis, narrative, content and historical analysis, meta-analysis, and network analysis (Donsbach 2006; Luff et al. 2015).

Many literature reviews of communication studies are performed manually and cover a small set of data from several journals. Tomasello (2001) selected 961 papers published in five leading communication journals from 1994 to 1999 and examined the Internet’s influence on communication processes using content analysis. For the selected papers, two coders labeled the topics (six categories), the Internet interface (eight categories), and the research methodology (eight categories). Kim and Weaver (2002) sampled 561 papers published in 86 journals and books from 1996 to 2000 and conducted thematic meta-analysis for topical, methodological, and theoretical trends of Internet studies with two coders. Kamhawi and Weaver (2003) collected 889 papers from the top 10 journals listed in The Iowa Guide from 1980 to 1999 and analyzed method, focal scope, theoretical approach, funding source, and time period using thematic meta-analysis. Trumbo (2004) gathered 2649 articles from eight communication journals published between 1990 and 2000 and analyzed methods, research focus, data collection procedures, and data sources using content analysis with three coders who used the categories proposed by Cooper et al. (1994). Khang et al. (2012) collected 456 articles from 17 leading journals and new technology-specific journals from 1997 to 2010. They assessed the importance of social media by focusing on social media type, research topics, theoretical framework, research method, and data collection using content analysis with three coders. Taipale and Fortunati (2014) employed content analysis to identify 66 articles from five journals on mobile communication studies from 1992 to 2012. Zhang and Leung (2014) reviewed 84 articles on social networking service research from six leading communication journals from 2006 to 2011. They applied thematic meta-analysis to the articles. 
They found thematic patterns, such as impression management and friendship performance (e.g., Boyd and Ellison 2007), network and network structure (e.g., Hampton et al. 2011), bridging online and offline networks (e.g., Donath 2007), privacy (e.g., Debatin et al. 2009), and extended scope in social networking service research (e.g., Elphinston and Noller 2011). Because communication studies are heterogeneous and varied (Craig 1993), Chung et al. (2013) identified the theoretical structure of the communication discipline by applying a positivistic keyword network analysis to four major communication journals. However, the number of articles in communication studies is growing rapidly, and the subjects and methods used in these articles are increasingly diverse. Bibliometrics appears to be the most prominent recent approach for identifying and analyzing the structure of the communication discipline.

Bibliometric analysis

Bibliometrics (Garfield 2006), i.e., measuring the impact and knowledge structure of academic literature by citation and content analysis, has been applied to communication studies to support comprehension of this rapidly changing and growing research field. First, citation analysis allows us to discover knowledge flow via scholarly communication. Rice et al. (1988) analyzed the journal-to-journal citation networks of 20 communication journals published from 1977 to 1985. Feeley (2008) collected 19 communication journals from 2002 to 2005 and performed citation network analysis using degree centrality (Freeman 1979). Park and Leydesdorff (2009) analyzed the journal-to-journal citation network of 107 journals that cited the Journal of Communication in 2006. Barnett et al. (2011) analyzed citations among 45 communication journals from 1998 to 2007 and revealed how these journals cite journals in other disciplines. Lee and Sohn (2015) selected all scholarly communication papers containing the phrase ‘social capital’. They built a citation network of 171 articles and performed centrality analysis and main path analysis. Second, content analysis helps us understand the structure of knowledge in terms of words and groups of words. Doerfel and Barnett (1999) used co-word network analysis to analyze the titles of papers listed at the 1991 meeting of the International Communication Association. Peng et al. (2012) built a collection of 25,486 abstracts retrieved by a query relevant to Internet studies from 2000 to 2009. They applied a state-of-the-art bibliometric approach that utilizes text mining. They constructed a word co-occurrence network to identify popular keywords in Internet studies. They also conducted cluster analysis on the network to deduce themes from words. Chung et al. (2013) built a co-theory network in which the nodes are theories and the links are weighted by the frequency with which theories co-occur in articles. 
However, recently, information science scholars have begun to apply topic modeling (Blei et al. 2003) to bibliometrics (Yan 2015). Topic modeling is a text mining technique for text summarization that extracts topics from a corpus.

Bipartite network analysis

Our subject–method topic network analysis is a form of bipartite network analysis (here, we use the terms “bipartite network” and “two-mode network” interchangeably). In a bipartite network, nodes fall into two different groups, and the nodes in one group interact only with members of the other group. A typical example is an author-journal network in communication studies (Griffin et al. 2016). According to Griffin et al. (2016), among four common measures of centrality, namely degree centrality, betweenness centrality, closeness centrality (Freeman 1979), and eigenvector centrality (Bonacich 1972), degree and eigenvector centrality are reasonable measures of importance in a bipartite network. Additionally, we argue that weighted degree centrality is more reasonable than degree centrality if the bipartite network has a complete set of links between the two types of nodes (i.e., every node in one group is connected to every node in the other group), because in this case every node in the same group has the same degree. In bipartite network analysis, the units of analysis are node centrality for each group, links between nodes in different groups, and clusters of nodes in different groups.
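The argument for weighted degree can be illustrated with a minimal Python sketch. The node names and edge weights below are invented for illustration and are not taken from the dataset: in a complete bipartite network every subject has the same degree, whereas the weighted degree still separates the subjects.

```python
from collections import defaultdict

def degree_and_weighted_degree(edges):
    """Return (degree, weighted_degree) dicts for an undirected weighted edge list."""
    degree = defaultdict(int)
    weighted = defaultdict(float)
    for u, v, w in edges:
        for node in (u, v):
            degree[node] += 1       # number of incident edges
            weighted[node] += w     # sum of incident edge weights
    return dict(degree), dict(weighted)

# Hypothetical complete bipartite network: 3 subjects x 2 methods.
# Names and weights are assumptions made only for this example.
toy_edges = [
    ("s1", "m1", 5), ("s1", "m2", 1),
    ("s2", "m1", 2), ("s2", "m2", 2),
    ("s3", "m1", 9), ("s3", "m2", 4),
]
degree, weighted = degree_and_weighted_degree(toy_edges)
subject_degrees = {n: degree[n] for n in ("s1", "s2", "s3")}
subject_weighted = {n: weighted[n] for n in ("s1", "s2", "s3")}
```

Here every subject has degree 2, so degree centrality cannot rank them, while the weighted degrees differ.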

In this study, we widen our collection to include all communication studies articles and combine topic modeling and network analysis to cover all articles in a simple and comprehensive manner. We propose subject–method topic network analysis, which is both a research method and a theoretical framework. The network is bipartite, and its nodes are topics classified as either subjects or methods.

Methodology

Subject–method topic network construction

Figure 1 shows the process for constructing our subject–method topic network, a special two-mode network (Borgatti and Everett 1997) with only links between nodes of different modes. First, we build a corpus based on the collected data. Second, we perform preprocessing to convert the data to an analyzable format. Third, we cluster the preprocessed data using topic modeling. Topic modeling is a text mining technique that summarizes a collection of documents into clusters that can be abstracted as topics. Fourth, we categorize a topic as either a subject or a method by reading the terms in the topic and labeling the topic with a representative term. Fifth, we build a subject–method network by linking subjects and methods according to their co-occurrence in a document. Finally, we analyze the network from various perspectives to deduce the current state of communication studies. Here, assume that a subject–method topic network includes \(N_S\) subjects and \(N_M\) methods with \(N\) topics. The weight of an edge is the co-occurrence frequency between a subject and a method. In Fig. 1, \(w_{(s_3, m_1)}\) is the weight of the edge between Subject 3 and Method 1.

Fig. 1 Constructing a subject–method network

After we collect text data for a particular research field, we preprocess the data by annotating parts of speech (POS), removing stop words, and normalizing terms using Stanford CoreNLP (Toutanova et al. 2013). Stop words are insignificant terms that should not be used as keywords, for example, prepositions, articles, pronouns, and conjunctions. Since important terms are often nouns, we extract core information from nouns. Figure 2 illustrates the preprocessing steps for an abstract written by Lo and Wei (2010). All sentences are split into lexical items (Fig. 2b), and natural language processing identifies the POS of each item. Figure 2c shows the terms that remain after removing the predetermined stop words and selecting a focal POS. In English, various terms can be derived from a single stem, and reducing such variants to a common form is called normalization, which involves lemmatization and stemming. Figure 2d depicts lemmatization, which normalizes a term by mapping it to a standard form, its lemma.
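The preprocessing steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, POS table, and lemma table are toy stand-ins invented for the example, whereas the study itself obtains POS tags and lemmas from Stanford CoreNLP.

```python
import re

# Toy resources invented for illustration; a real pipeline would take
# POS tags and lemmas from an NLP toolkit rather than from these tables.
STOP_WORDS = {"the", "of", "on", "and", "a", "an", "in", "to", "we"}
TOY_POS = {"effects": "NOUN", "media": "NOUN", "audiences": "NOUN",
           "examine": "VERB", "news": "NOUN", "young": "ADJ"}
TOY_LEMMAS = {"effects": "effect", "audiences": "audience"}

def preprocess(text, keep_pos=("NOUN",)):
    """Tokenize, drop stop words, keep the focal POS, and lemmatize."""
    tokens = re.findall(r"[a-z]+", text.lower())                  # split into tokens
    tokens = [t for t in tokens if t not in STOP_WORDS]           # remove stop words
    tokens = [t for t in tokens if TOY_POS.get(t) in keep_pos]    # keep nouns only
    return [TOY_LEMMAS.get(t, t) for t in tokens]                 # map to lemmas

terms = preprocess("We examine the effects of news media on young audiences.")
```

For the example sentence, only the noun lemmas remain as candidate terms for topic modeling.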

Fig. 2 An example of preprocessing texts

To abstract topics from preprocessed terms, we apply topic modeling, which extracts several topics from a collection of documents. Among various topic models, we use latent Dirichlet allocation (LDA) (Blei et al. 2003), a well-known statistical topic model, as implemented in MALLET (David 2013). LDA assumes that a document is a mixture of topics and that the terms in the document represent the topics; in other words, a document consists of terms and each term is assigned to a topic. Therefore, a document is a composition of topics. LDA is a statistical model because it represents the probability of a term being assigned to a topic as a discrete probability distribution over terms. LDA infers this distribution from the input documents. When the inference process is complete, each term occurrence is assigned to one of the topics and each topic has a probability distribution over the terms. The number of topics is predetermined by the user. Unlike a clustering algorithm that groups terms exclusively, in LDA a single term can be assigned to more than one topic. The distribution is multinomial, and for Bayesian inference a Dirichlet distribution is used as the prior.
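For reference, the generative process that LDA assumes can be written compactly as follows. The notation is the conventional one for the smoothed model and is not taken from this article: \(K\) is the user-chosen number of topics, \(D\) the number of documents, \(N_d\) the number of term occurrences in document \(d\), and \(\alpha\) and \(\beta\) are the Dirichlet hyperparameters.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{d,n}})
\end{align*}
```

Here \(\theta_d\) is the topic mixture of document \(d\), \(\phi_k\) is the term distribution of topic \(k\), \(z_{d,n}\) is the topic assigned to the \(n\)-th term occurrence, and \(w_{d,n}\) is the observed term. Inference reverses this process to estimate \(\theta\), \(\phi\), and \(z\) from the observed terms.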

We label a topic by considering its most frequent terms and abstracting a concept from them. Generally, only experts are used to label topics. In this study, however, we rely on both experts and the content retrieved by querying the most frequent terms with the Google search engine. We name a topic by choosing, from a predetermined set of categories, the subject or method name that best represents this content, combining the experts’ labels with the search results. For example, in Fig. 3, the most frequent terms of topic 25 are news, media, newspaper, issue, and study. We use these terms as a query and match the retrieved content to a category to determine a label. Similarly, topic 6 is labeled argumentation, and topic 14 is labeled survey research.

Fig. 3 Linking subjects and methods

Then, we build a subject–method topic network based on the LDA results. An edge is established if a term in a subject topic and a term in a method topic occur in the same document. The weight of the edge is determined by the co-occurrence frequency. In this study, we focus on abstracts. An abstract contains a research subject and the corresponding method; thus, co-occurrence in an abstract can support the relationship between the subject and the method. For example, in Fig. 3, document 1 has the term ‘issue’, which is assigned to topic 25. Note that the same document contains the terms for topics 6, 14, 12, and 17. Therefore, there are ten edges among five topics. However, topics 25, 6, and 14 are labeled news media, argumentation, and survey, respectively. Here, topic 25 is a subject, and topics 6 and 14 are methods. Since an edge should have a subject and a method, two edges are established among the three topics.
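The linking rule can be sketched in Python using the document 1 example. Two points are assumptions rather than statements from the text: the edge weight is counted once per document in which a subject and a method co-occur, and topics 12 and 17 are left unclassified because their classes are not given here. The function names are hypothetical.

```python
from itertools import combinations
from collections import Counter

# Topic classes from the Fig. 3 example; topics 12 and 17 are left
# unlabeled because the text does not state their classes.
TOPIC_CLASS = {25: "subject", 6: "method", 14: "method"}

def all_topic_pairs(doc_topics):
    """Every unordered pair of distinct topics present in one document."""
    return list(combinations(sorted(set(doc_topics)), 2))

def subject_method_edges(documents, topic_class):
    """Count, per (subject, method) pair, the documents containing both.

    Assumption: one co-occurrence is counted per document.
    """
    weights = Counter()
    for doc_topics in documents:
        present = set(doc_topics)
        subjects = [t for t in present if topic_class.get(t) == "subject"]
        methods = [t for t in present if topic_class.get(t) == "method"]
        for s in subjects:
            for m in methods:
                weights[(s, m)] += 1
    return weights

document_1 = [25, 6, 14, 12, 17]   # topics whose terms appear in document 1
pairs = all_topic_pairs(document_1)
edges = subject_method_edges([document_1], TOPIC_CLASS)
```

The five topics yield ten unordered pairs, but only the two subject–method pairs, (25, 6) and (25, 14), become edges.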

Subject–method topic network analysis

For a subject–method topic network, we consider topic distribution, centrality, relation, and module analyses.

First, topic distribution over a given field illustrates the issues or scholarly interests in that field. Topic modeling yields groups of terms representing topics and indicates the proportion of each topic. Topics can be classified as a subject or a method; therefore, we can compute the ratio between subjects and methods. The proportions of subjects and methods also describe the degree of variety in the subjects and methods, as well as the preference among them. Furthermore, methods can be categorized as qualitative or quantitative approaches, so we may be able to infer which kind of method (qualitative or quantitative) scholars prefer.

Second, the centralities of a topic network indicate the importance of each topic in relation to the other topics. The degree centrality (Freeman 1979) or potentially the weighted degree centrality of a subject topic is the popularity or importance of the subject topic regarding method topics. In other words, a subject topic with high degree implies that the subject is handled using various methods. Similarly, a method topic with high degree indicates that the method is used to handle many subjects. Similar to topic proportion, we can identify the variety of subjects handled using qualitative methods or quantitative methods.

Note that we can compare topic proportion with topic centrality proportion. A topic centrality proportion is the relative degree of a topic among topics in a centrality perspective. The ratio of the topic centrality proportion to the topic proportion represents the relative importance of the topic regarding the frequent occurrence of the topic.
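As a worked example of this ratio, the short Python sketch below uses the percentage shares reported in the Results section for four topics. The function name is hypothetical, and the interpretation that a ratio below 1 marks a topic that is less central than its frequency suggests follows the definition above.

```python
def centrality_to_proportion_ratio(centrality_share, topic_share):
    """Ratio of a topic's weighted-degree share to its topic proportion."""
    if topic_share <= 0:
        raise ValueError("topic_share must be positive")
    return centrality_share / topic_share

# (weighted-degree share in %, topic proportion in %) as reported in the Results.
reported = {
    "media literacy": (4.59, 6.0),
    "industry": (2.81, 5.0),
    "conceptual modeling": (12.05, 10.0),
    "participatory experiment": (6.77, 6.0),
}
ratios = {name: round(centrality_to_proportion_ratio(c, p), 3)
          for name, (c, p) in reported.items()}
```

The two subjects fall below 1 (about 0.77 and 0.56), while the two methods exceed 1 (about 1.21 and 1.13).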

Third, a topic relation analysis investigates an edge between a subject and a method, i.e., a one-to-one relationship, and reveals the preferred pairs of subjects and methods. Preferred pairs indicate that the pairings are conventional and that scholars tend to be comfortable with them. Weak ties between subjects and methods imply that the subjects are not frequently handled using a given method due to certain constraints. Disconnected subjects and methods indicate that the research purpose relative to a given subject is incompatible with a given method. Thus, we can determine whether quantitative or qualitative methods are preferable for a given subject, and a potential linkage between a disconnected subject–method pair may offer a breakthrough that leads to new findings on the subject.

Fourth, a topic module includes many-to-many relations between topics. Topic modules are detected by partitioning a subject–method topic network. A community detection algorithm is applied to the network to identify modules (Blondel et al. 2008). The algorithm groups topics into modules by calculating the internal densities of modules. Topics that occur concurrently and frequently are more likely to have greater density, and they are clustered as a single module. On the other hand, topics that are less likely to occur simultaneously have lower density, and they are separated into different modules.
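The algorithm of Blondel et al. (2008), commonly known as the Louvain method, searches for a partition that maximizes modularity. The Python sketch below does not implement that search; it only computes the modularity score of a given partition, which is the quantity the method optimizes. The toy graph is invented for illustration.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: list of (u, v, weight); community_of: dict node -> module id.
    """
    total_weight = sum(w for _, _, w in edges)   # m, total edge weight
    internal = {}                                # edge weight inside each module
    strength = {}                                # summed weighted degree per module
    for u, v, w in edges:
        cu, cv = community_of[u], community_of[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total_weight
        - (strength[c] / (2.0 * total_weight)) ** 2
        for c in strength
    )

# Hypothetical toy graph: two triangles joined by one bridging edge.
toy_edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
             ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
             ("c", "d", 1)]
two_modules = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_module = {node: 0 for node in two_modules}
q_split = modularity(toy_edges, two_modules)
q_single = modularity(toy_edges, one_module)
```

Splitting the toy graph at the bridge scores higher than keeping all nodes in one module, which is why densely co-occurring topics end up in the same module.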

Results

Dataset

We applied the proposed method to papers collected from the journals listed in the 2014 JCR Social Sciences Edition as communication studies. We assume that JCR Social Sciences Edition lists only highly qualified journals, and unlisted journals have small influence on the total stream of communication studies. For example, the central journals identified by Griffin et al. (2016) are mostly included in our collection. We chose 33,272 papers from 73 journals published between 1990 and 2014 (Table 3 in “Appendix 1”). We extracted title, author names, publication year, abstract, and references from the papers via Web of Science. According to the 2014 JCR Social Sciences Edition, the papers cover communication theory, practice, and policy; journalism; broadcasting; advertising; media studies; mass communication; public opinion; speech; business; and public relations. We categorized the subjects and methods based on the present categorizations and the information retrieved from a query that integrated the terms in a single topic.

Labeling topics and their classes

We label the topics of subjects and methods using existing categories, as well as content found by information retrieval, as shown in Table 4 in “Appendix 2”. We focus on the topics of communication studies, classified into subjects and methods. In a literature review, developing categories to identify and encode a paper into a subject or a method is important. We refer to previously studied categories (Donsbach 2006; Luff et al. 2015). We also average experts’ labels, considering the content obtained from information retrieval using the relevant terms of a topic as the query. A topic consists of a distribution of words, and the frequently occurring words represent the topic. Moreover, a search engine, such as Google, recommends recent web pages that are highly relevant to the search terms. Therefore, we rely on the Google search engine and three coders to identify a concept by looking at the retrieved pages and matching the concept to one of the categories deduced in existing studies.

Topic proportion analysis

LDA gives the proportion and probability distribution over terms for each topic. Conceptual modeling, classified as a method, occupies the largest share (10 %). On the other hand, national dispute, categorized as a subject, has the lowest proportion (1 %). In communication studies, the subjects are mass media, national dispute, public health, public opinion, gameplay (possibly human–computer interaction), policy, discrimination, journalism, culture and media, media literacy, organizational communication, community, educational technology, broadcasting, relational analysis, disease, marketing, crisis and emergency risk, industry, family, and comparing media systems. The methods are argumentation, discourse, interview, factor analysis, content analysis, survey research, conceptual modeling, web analysis, and participatory experiment. The ratio of subjects to methods is 1.5. Note that the subjects are more varied than the methods. For methods, the ratio of qualitative methods to quantitative methods is 0.75. In communication studies, qualitative methods are more preferred than quantitative methods.

Topic network analysis

Here, we analyze our subject–method topic network at the node and edge levels. We also compare a conventional co-word network with our topic network. Gephi (https://www.gephi.org/), a network visualization tool, is used to display the subject–method topic network. The total number of nodes is 30, and the number of edges is 196. The mean degree centrality is 13.07, which indicates that the average number of neighbors is approximately 13. The mean weighted degree centrality is 29,595.70. In our network, circle and triangle nodes represent subjects and methods, respectively.
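The reported mean degree follows directly from the node and edge counts, because each undirected edge contributes to the degree of two nodes. A minimal Python check, with a hypothetical function name, is:

```python
def mean_degree(num_nodes, num_edges):
    """Average number of neighbors in an undirected graph: 2E / N."""
    return 2 * num_edges / num_nodes

# Node and edge counts as reported for the subject-method topic network.
avg = mean_degree(num_nodes=30, num_edges=196)
```

With 30 nodes and 196 edges, this gives 392 / 30, i.e., approximately 13.07.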

Topic centralities

The topics classified as subjects include marketing, national dispute, gameplay, journalism, crisis and emergency risk, public health, culture and media, family, broadcasting, community, disease, discrimination, policy, and public opinion. In total, twenty-one subjects are associated with nine methods. According to weighted degree centrality, media literacy is the most important subject, followed by comparing media systems, mass media, organizational communication, educational technology, relationship analysis, industry, public opinion, policy, discrimination, disease, marketing, community, broadcasting, family, culture and media, public health, crisis and emergency risk, journalism, gameplay, and national dispute.

In Table 1(a), media literacy is the most central subject: its weighted degree is 39,338, which is 4.59 % of the total weighted degree of all nodes. In terms of topic proportion, however, media literacy has a share of 6 % among all subject and method topics. This means that media literacy is relatively less important in its relations to other topics than in its usage. The weighted degree of comparing media systems is 30,928 (3.61 % of the total weighted degree), and that of mass media is 30,814 (3.59 %). Comparing media systems and mass media are also less important in terms of weighted degree than in terms of topic proportion. In particular, industry accounts for 5 % in topic proportion but only 2.81 % in topic centrality proportion. This means that 5 % of all the documents treated industry, but the importance of industry in relation to other topics is 2.81 %. In addition, the proportions of all subject topics are lower in the subject–method topic network than in the topic distribution. Relatively speaking, methods are the central topics.

Table 1 Subject topics, method topics, and the relationships between subject topics and method topics

Method topics are participatory experiment, argumentation, interview, discourse, factor analysis, content analysis, survey research, web analysis, and conceptual modeling. The order of the topics according to topic proportions is conceptual modeling, participatory experiment, discourse, factor analysis, survey research, interview, content analysis, web analysis, and argumentation. However, according to the weighted degree, the order becomes conceptual modeling, participatory experiment, discourse, survey research, factor analysis, web analysis, content analysis, interview, and argumentation. The weighted degree of conceptual modeling is the largest at 103,366 (12.05 %), and that of participatory experiment is the second largest at 58,076 (6.77 %). The topic proportions of conceptual modeling and participatory experiment are 10 and 6 %, respectively. The difference between the weighted degree values of the two methods is larger than the difference between their topic proportions. We find that topic network analysis assigns the methods higher values relative to subjects. Note that the order changes in the subject–method topic network, e.g., survey research precedes factor analysis, and web analysis and content analysis come before interview.

Topic relations

Here, we list the highest weighted edges and the least weighted edges to observe prominent subject–method pairs. Table 1(b) shows that 12 of the 14 pairs share a common method, conceptual modeling. The subjects in the top nine pairs are media literacy, comparing media systems, mass media, organizational communication, educational technology, relationship analysis, industry, public opinion, and policy. The tenth pair comprises media literacy and participatory experiment. The pairs of discrimination and conceptual modeling, media literacy and discourse, and marketing and conceptual modeling follow. It appears that theory construction based on previous literature is typical because the top edges frequently contain conceptual modeling. In addition, media usage, the change due to media, and the evolution of media itself are the main themes in communication studies. In particular, media literacy is often handled using participatory experiment and discourse in the top-ranked edges. This tells us that scholars had high interest in media literacy and that various experiments were conducted. In Table 1(c), the least weighted edges include national dispute, human–computer interaction, public health, journalism, and crisis and emergency risk as subjects. In addition, they involve qualitative methods, such as argumentation, interview, and content analysis. Such subject and method pairs appear to be minor research topics in communication studies. In other words, novel breakthroughs are needed in these minor research topics and more attention should be given to them. Specifically, national dispute appears frequently among the least weighted edges.

Comparing topic network with conventional term network

Figures 4 and 5 illustrate the difference between a subject–method topic network and the conventional co-word network of communication studies over 25 years. The colors in Fig. 4 indicate the modules partitioned in the term network. However, the modules are difficult to identify and label. In contrast, Fig. 5 has a small number of nodes and edges. It is also a bipartite graph (i.e., a subject is represented as a triangle and a method is expressed as a circle). Thus, one can easily grasp the whole structure and the relationships among subjects and methods. Therefore, a subject–method topic network is more suitable for analyzing the current state and trends of a certain research field than a conventional term network.

Fig. 4 Co-word network of communication studies over 25 years showing only central nodes. (Color figure online)

Fig. 5 Modularized subject–method topic network of communication studies over 25 years. (Color figure online)

Topic modules

As mentioned in the “Methodology” section, the modules of the subject–method topic network enable us to identify many-to-many relations among subjects and methods. Figure 5 shows the six modules of topics discovered by a community detection algorithm. The largest module (turquoise) has conceptual modeling at its center. The second largest module (green) is centered on media literacy. The third largest module (yellow) has discourse theory at its center. The next largest module (red) has educational technology as a midpoint. The next module (blue) has argumentation as a core. The last module (purple) has web analysis as a hub. In the following, we describe the network module by module in terms of subjects and methods, and give an example for each module to aid understanding.

Module 1 has public health, marketing, relational analysis, journalism, comparing media systems, family, crisis and emergency risk, policy, and organizational communication as subjects. Its methods are conceptual modeling, survey research, and content analysis. In the past 25 years, the subjects and methods in this module were major topics. We can estimate that scholars in communication studies were primarily interested in media and its effects on public health, marketing, journalism, media systems, family, crisis and emergency risk, policy, and organizational communication. Examples of conceptual modeling of comparing media systems are citation analysis (So 2010), critical evaluation (Lo and Wei 2010), critical literature review (Iwabuchi 2010; Kim 2010), and trend analysis (Chen 2009). They are essentially built on previous studies to gain new insights. In other words, over 25 years, scholars placed constant attention on media and its effects.

Module 2 comprises subjects such as culture and media, discrimination, and community. This shows that scholars who treated these subjects tend to use discourse analysis as a study method. The composition of the subjects and the method implies that scholars focused on the interaction between media and society, and analyzed it using a conventional qualitative approach. Examples include neoliberal discourse on gender minorities (Gray and Harris 2015) and a discourse on heteronormativity and animosity to gender minorities in TV (Dhaenens 2014).

Module 3 covers subjects such as disease and national dispute, with argumentation as its method. We confirm that such subjects were traditionally studied using argumentation. Examples include Antaki and O’Reilly’s (2013) study of communication among children with mental illness. They controlled the order of questions and the situation to observe how results differed when conversing with children with mental illness. They identified advantages and disadvantages according to the results in terms of argumentation. In addition, McMullen and Sigurdson (2014) argued that depression and its inhibitors are analogous to diabetes and insulin.

Module 4 contains public opinion and educational technology as subjects, and incorporates factor analysis as a method. New media has influenced the way we ascertain public opinion and the channels used to deliver educational content. We suspect that people were curious about the effects of media and the factors mediating these effects. For example, Cabero-Almenara and Marín-Díaz (2014) examined the ICT education processes of universities to identify preferences between in-class and online education. In particular, they checked the effect of social software on collaboration. In addition, Abad (2014) investigated the information divide between age groups in the context of Internet- and ICT-focused education in Spain.

Module 5 involves broadcasting, media literacy, and gameplay (human–computer interaction) as subjects, and has participatory experiment and interview as methods. We expect that these subjects are based on human experience and concentrate on cognitive processes, which are difficult to quantify; thus, qualitative methods were employed. McKeever (2015) examined how empathic responses to articles about characters with severe depression, who were either highly or minimally similar to the reader, influenced helping people with depression. Manosevitch et al. (2014) studied a way for people to actively participate in political issues on online forums. Restivo and Van De Rijt (2014) conducted an experiment on designing a reward system to encourage people to contribute to Wikipedia.

Module 6 has industry and mass media as subjects, and web analysis as its method. At its core, it appears to comprise Internet-based mass media and web-influenced industry. Thus, scholars’ efforts to analyze the relationship between new mass media and industry formed the basis for this module. For example, Karlsson et al. (2015) analyzed the content of hyperlinks within Swedish online news to identify the importance of hyperlinks from a digital journalism perspective. Humphreys et al. (2014) studied private information sharing among Twitter users and verified that people share private information, both offline and online, through Twitter. Wilken (2014) examined the industrial significance of the location-based Facebook platform. He confirmed the effect of local service recommendation and location-based mobile advertising.

By constructing several subject–method topic networks over time, we show the trends of communication studies in terms of subjects and methods. In this study, we separated the 25 years into four periods: 1990–2000, 2001–2005, 2006–2010, and 2011–2014. For each period, we built a subject–method topic network, and each network had three or four modules of topics. Over time, some subjects and methods moved from one module to another, as shown in Table 2. From this, we were able to describe the changes in subjects and methods over time. In the following, we focus on several issues.
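The period-by-period comparison can be sketched in a few lines of Python. The period boundaries come from the text; everything else (the function names, the input formats, and the toy module assignments) is hypothetical and serves only to illustrate how one might track which topics share a module with a given topic in each period. The toy data are not the contents of Table 2.

```python
# Period boundaries as stated in the text (inclusive years).
PERIODS = [(1990, 2000), (2001, 2005), (2006, 2010), (2011, 2014)]

def period_of(year):
    """Return the (start, end) period containing year, or None if outside."""
    for start, end in PERIODS:
        if start <= year <= end:
            return (start, end)
    return None

def group_by_period(articles):
    """articles: iterable of (year, article_id) pairs (assumed format)."""
    groups = {period: [] for period in PERIODS}
    for year, article_id in articles:
        period = period_of(year)
        if period is not None:
            groups[period].append(article_id)
    return groups

def module_mates(topic, modules_by_period):
    """For each period, return the other topics in the module holding topic.

    modules_by_period: {period: list of sets of topic labels} (assumed format).
    """
    mates = {}
    for period, modules in modules_by_period.items():
        for module in modules:
            if topic in module:
                mates[period] = set(module) - {topic}
                break
    return mates

# Toy module assignments (illustrative only).
toy_modules = {
    (1990, 2000): [{"mass media", "factor analysis"}, {"family", "interview"}],
    (2001, 2005): [{"mass media", "web analysis"}, {"family", "survey research"}],
}
mates = module_mates("mass media", toy_modules)
groups = group_by_period([(1999, "a"), (2001, "b"), (2020, "c")])
```

Comparing the sets returned for consecutive periods shows when a topic changes its companions, which is the kind of movement between modules discussed in the following paragraphs.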

Table 2 The changes in modules of subject–method topic network over periods

Family was examined by qualitative approaches, such as interview or content analysis, during the first period. However, from the second period onward, the subject was investigated by conceptual modeling, participatory experiment, web analysis, survey research, and factor analysis. In other words, quantitative approaches began to be applied to the subject of family. We surmise that the subject has become increasingly important; thus, various methods were utilized and quantitative data regarding the subject became available.

Educational technology, along with subjects such as family and discrimination, was explored by applying interview and content analysis methods during the first period. However, in the following periods, such subjects were mixed with other subjects, such as media, communication, culture, journalism, policy, and national dispute. These subjects were frequently studied using qualitative methods.

Factor analysis was a popular method utilized in the mass media subject during the first period. In the first period, the application of mass media, such as TV, radio, and newspapers, for various purposes required scholars to examine media factors. In addition, after 2001, at which time the Internet had become widespread and media was changing rapidly, various studies explored the relationship between media and other subjects, such as public health, marketing, family, industry, broadcasting, organizational communication, media literacy, public opinion, and human–computer interaction.

Argumentation is a qualitative method that finds reasons for supporting the organization and content of texts. During the first period, argumentation, like discourse, was a popular method applied to crisis and emergency risk, culture and media, policy, media literacy, national dispute, and journalism. During the second and third periods, it was particularly applied to national dispute, policy, and comparing media systems. We confirm this phenomenon in Module 2 in Table 2(b) and Module 4 in Table 2(c). However, as shown by Module 2 in Table 2(d), in the fourth period, it was used along with other qualitative methods, such as interview, discourse analysis, and content analysis, for the comparing media systems, educational technology, community, culture and media, discrimination, journalism, national dispute, and policy subjects. It seems that new media technology development during the 21st century generated many social problems that cannot easily be dealt with using quantitative methods; thus, argumentation became popular in many communication studies.

Web analysis was used in various subjects from 1990 to 2005, when the Internet was rapidly penetrating society, as shown in Module 1 in Table 2(a, b). As the Internet population grew, people utilized the web in more diverse ways and information transfer via the web increased drastically. After 2006, web analysis became specialized in media literacy. Note that social media swept the world around 2005. Social media supports an interactive web. It gave everyone the power to produce digital content, including news, whereas traditional media is dominated by only a handful of companies. In addition, social media allowed people to instantly respond to others. We suppose that the need for new media literacy increased thereafter, and our results reflect this phenomenon.

The pairing of media literacy and web analysis became an independent module as of 2006. The effect of the web forced people to comprehend different media, analyze various information, and communicate with others. In the transition from period (c) to period (d) in Table 2, the number of modules decreases, but the pair still retains its own module. This shows that web-based media became necessary in communication. In particular, mass media was frequently coupled with factor analysis between 1990 and 2000. However, from 2001 to 2005, mass media formed a module with web analysis. Thereafter, the web became a significant part of mass media and a target for analysis.

Conclusion

We have proposed a method to identify subjects, methods, their relations, and their communities over time, which we refer to as subject–method topic network analysis. This analysis method allows us to review all articles in a research field easily using text mining and bibliometrics. It summarizes articles into several topics, each classified as a subject or a method. The proposed method links topics according to their co-occurrence in a document and constructs a subject–method topic network. We applied the proposed analysis method to a collection of articles listed in the JCR Social Science Edition as communication studies between 1990 and 2014.
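The linking step can be illustrated with a minimal Python sketch. It assumes that each document comes with topic proportions and that a topic counts as present in a document when its proportion reaches a cutoff; the cutoff value, the function name, and the toy documents are all hypothetical, and the actual linking rule is the one given in the “Methodology” section.

```python
from collections import Counter
from itertools import product

def build_subject_method_network(doc_topics, topic_kind, threshold=0.1):
    """Count subject-method co-occurrences across documents.

    doc_topics: list of {topic_label: proportion} dicts, one per document.
    topic_kind: {topic_label: "subject" or "method"}.
    threshold: hypothetical cutoff for treating a topic as present.
    Returns a Counter mapping (subject, method) -> number of documents.
    """
    edges = Counter()
    for proportions in doc_topics:
        present = [t for t, p in proportions.items() if p >= threshold]
        subjects = sorted(t for t in present if topic_kind[t] == "subject")
        methods = sorted(t for t in present if topic_kind[t] == "method")
        # Every subject present is linked to every method present,
        # which yields a bipartite, weighted network.
        for subject, method in product(subjects, methods):
            edges[(subject, method)] += 1
    return edges

# Toy input (illustrative values, not the study's data).
topic_kind = {
    "public health": "subject",
    "journalism": "subject",
    "survey research": "method",
    "content analysis": "method",
}
docs = [
    {"public health": 0.6, "survey research": 0.3, "journalism": 0.05},
    {"public health": 0.5, "survey research": 0.4},
    {"journalism": 0.5, "content analysis": 0.45},
]
edges = build_subject_method_network(docs, topic_kind)
```

Because edges only ever join a subject to a method, the resulting network is bipartite by construction, and the edge weight records how many documents contain both topics.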

We identified subjects such as public health, marketing, relational analysis, family, industry, crisis and emergency risk, disease, broadcasting, organizational communication, public opinion, mass media, gameplay (human–computer interaction), comparing media systems, educational technology, community, culture and media, discrimination, journalism, national dispute, policy, and media literacy. We also detected methods such as factor analysis, participatory experiment, conceptual modeling, survey research, interview, discourse analysis, content analysis, argumentation, and web analysis. The topic ratio between subjects and methods was 1.5, and that between qualitative methods and quantitative methods was 3. According to the proportion of topics, the most and least reported subjects were media literacy (6 %) and national dispute (1 %), respectively. The most and least reported methods were conceptual modeling (10 %) and argumentation (2 %), respectively.

In a centrality analysis of our subject–method topic network, we found that method topics generally became more central than subject topics. Notably, over the years, factor analysis increased in importance more than survey research, and web analysis became more central than content analysis and interview. Thus, we assume that quantitative methods have become preferred. Specifically, developments in the web and in statistical analysis may have brought the web analysis and factor analysis methods to prominence.

In topic relations, conceptual modeling was the dominant method applied to media literacy, comparing media systems, mass media, organizational communication, educational technology, relational analysis, industry, public opinion, and policy. Media literacy was also studied with participatory experiments. These results imply that subjects relevant to media usage, social change driven by media, and media change were popular in communication studies. Conceptual modeling was the major method because communication studies are often developed on top of existing theories. The minor subjects in topic relations were national dispute, human–computer interaction, public health, journalism, and crisis and emergency risk, and the corresponding methods were qualitative methods, such as argumentation, interview, and content analysis.

In topic modules, we identified six modules in total. The modules included empirical studies on changes in family, company, nation, and public opinion regarding media, socio-cultural studies by discourse analysis, argumentation studies into disease and national dispute, factor analysis of public opinion and educational technology, studies into human experience utilizing media or the effects of media on people, and web analysis for industry and mass media. These modules demonstrate a preference for pairing subjects and methods among scholars in communication studies. Within each period, three or four modules appeared. Interestingly, media literacy and web analysis became an independent module. This implies that the web and Internet have influenced human communication such that we have had to consider literacy from emerging perspectives.

The main contributions of our study are threefold. First, a more comprehensible knowledge structure of a research domain can be obtained. Namely, the central subjects, central methods, central subject–method pairs, and central subject–method groups of a research field can be discovered easily. Furthermore, our study shows the evolution of the knowledge structure over time. Previously, a knowledge structure was often represented by lower-level elements such as words, which produced too many nodes and made the structure difficult to examine. Second, from an informetrics viewpoint, our approach not only identifies a fine-grained unit of analysis but also detects the relationships among the units. Third, the subject–method topic network analysis extends the territory of bipartite network analysis by including topics as nodes. In text mining, a topic is considered a latent variable that abstracts and represents many terms. The latent variable is often a key to semantic findings. Thus, by treating topics as nodes, this study turns a bipartite network into a meaningful bipartite topic network.

This study has three limitations. First, when labeling topics, we relied on existing categorizations, information retrieved from the Google search engine, and experts, rather than solely on experts in communication studies. However, we believe that the existing categorizations and the precise and recent results from the search engine helped us label topics adequately. Second, we connected subject and method topics by their co-occurrence, which indicates only an indirect relationship between topics. Nevertheless, co-occurrence is widely used in bibliometric content analysis. Third, we used only weighted degree centrality; using other centrality measures may improve our understanding.
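For readers unfamiliar with the measure, weighted degree centrality can be sketched as follows. The weighted degree of a topic is the sum of the weights of all edges incident to it. The function name and the toy edge weights are hypothetical and only illustrate the computation.

```python
from collections import defaultdict

def weighted_degree(edges):
    """Return {node: sum of weights of incident edges}.

    edges: {(u, v): weight} for an undirected network (assumed format).
    """
    strength = defaultdict(float)
    for (u, v), weight in edges.items():
        strength[u] += weight
        strength[v] += weight
    return dict(strength)

# Toy subject-method edges (illustrative values, not the study's data).
toy_edges = {
    ("media literacy", "conceptual modeling"): 5,
    ("media literacy", "participatory experiment"): 2,
    ("policy", "conceptual modeling"): 3,
}
centrality = weighted_degree(toy_edges)
# Topics ordered from most to least central.
ranked = sorted(centrality, key=centrality.get, reverse=True)
```

Because this measure considers only direct neighbors, other measures that take the wider network structure into account, such as betweenness or eigenvector centrality, could rank topics differently.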

In the future, we will use the named entity recognition technique in text mining to label topics, along with the help of experts in communication studies. We can also consider relational terms in topics that can connect topics semantically. In addition, we can merge a subject–method topic network from another research field with the communication studies network to explore convergence between the two fields.