1 Introduction

The widespread adoption of social media is challenging the way traditional media have been used to distribute news and to discuss top social and political issues [23, 36, 41]. A large body of Computational Social Science research focuses on the study of individuals and their behaviors on such platforms [7, 35, 44]. Various seminal papers investigate social and political conversations on social platforms like Twitter [3, 19, 45, 48] and Facebook [6, 8, 20]. Yet, little work has been devoted to understanding how the main actors of political discussion, the politicians themselves, adopt and leverage such platforms [10, 29, 31]. During the 2008 Presidential Election, Barack Obama used fifteen social media sites to support his campaign. His successful effort demonstrated the central role of Twitter and other social platforms as integral parts of modern political communication. Since then, online political discussion has grown, as has the attention paid to political candidates and political figures and their social media presence. Politicians are influential figures in the offline world, and can acquire a great deal of influence in the social media sphere as well. Their social media activity, in turn, can affect their success and their careers, especially during election time. The online campaigns preceding the 2016 Presidential Election, carried out by both parties in support of various potential nominees, including Hillary Clinton, Bernie Sanders, and Donald Trump, further demonstrate the power of social media to shape the political scene [51, 52]. A better understanding of politicians’ usage of social media channels for political conversation could therefore reveal something about the complex mechanisms of political success in the era of social politics.

Yet, social media are not limited to political “propaganda”. The effects of social media political communication on the offline world are tangible. Examples of political campaigns that preceded mass mobilizations and civilian protests include the Arab Spring [26, 32], Occupy Wall Street [13, 14], and the Gezi Park protest [50]. Although it is difficult to establish a causal link, the “Twittersphere” can be a strong indicator of political and public opinion [49]. The open nature of Twitter (Footnote 1) probably contributed to its political communicative power. The ability to communicate interesting political issues gives users the opportunity to acquire more visibility and influence [2, 9, 43], although Twitter political discussion is plagued by a number of issues related to manipulation and abuse [21, 22, 45].

In this paper we explore how the main actors of political discussion, the politicians, adopt Twitter to cover social and political issues. We focus on U.S. President Obama and all 50 U.S. State Governors, and adopt the framework of agenda-setting theory to identify their main topics of discussion. The analysis of over one hundred thousand of their tweets reveals how Governors and the President use Twitter, which patterns of political discussion emerge, what the top issues are for each party, and which politicians exhibit the most coherent political agenda.

2 Social Media and Politics

Twitter was born in 2006. In less than 10 years, it acquired half a billion users, 310 million of whom are active and produce over 500 million tweets per day as of July 2016 (Footnote 2). Twitter suggests that “each tweet represents an opportunity to show one’s voice and strengthen relationships with one’s followers” (Footnote 3). As a modern political toolbox, Twitter has been widely used by various Presidents, Congressmen, Governors, and other politicians all over the world. In the United States in particular, Twitter and other social media have been not only the subject of extensive research, but also the platforms used to run large-scale social experiments to study political mobilization [6]. Scholars from various disciplines have investigated the role of these platforms in modern political communication.

Generally, social media research related to politics can be categorized into two areas. The first focuses on the possibility of using social media signals to predict political elections. A large number of papers have addressed this challenging question, at times with promising results. For example, Gibson and McAllister’s study [27] demonstrated a significant relationship between online campaigning and candidate support. Macnamara found evidence of a “significant online political engagement” in the 2008 U.S. Presidential Election [37]. Other studies covered the U.S. Presidential debate and Twitter sentiment, finding an alignment between popular opinions and votes [17, 18, 48]. Despite some promising work, the issue of predicting elections using social data remains debated [25].

The second area of research investigates Twitter users’ behaviors, opinions and topics of political interest, at times proposing methods to identify their political alignments [11, 15]. Some of these studies highlighted interesting socio-political phenomena: for example, Conover et al. [12] found that the network of political retweets exhibits a highly segregated bipartisan structure, which seems to reflect the users’ political leanings, similarly to political blogs [1]. Research by Shogan et al. showed that, in recent years, Republican politicians tweeted more than five times as often as Democrats, suggesting that Twitter might be particularly appealing to American opposition politicians, who use it as an instrument for voicing their dissent directly to the public [28, 47]. A study conducted by Chi and Yang [10] found that Democratic congressmen tend to release information that citizens want to hear, while Republican congressmen share their own agenda with citizens. Hemphill’s work suggested that Congressmen of opposing parties use very different strategies to choose the hashtags that best reflect their framing efforts [30].

It appears that most of the literature either focuses on Twitter and elections, especially before and during election time, or focuses on the President or Congressmen, even though “most Americans have more daily contacts with their state and local governments than with the federal government” (Footnote 4).

Studies on State Governors and their social media presence are absent, and this paper aims at filling this gap. Although some research focuses on how politicians use social media before and during their election, what happens after that? Voters are excited about their party’s success, and they are vocal about it. What comes after this initial excitement? We want to shed light on which Governors really follow their agenda after their election, and determine whether a framing of clear intents and goals emerges from their political channels online.

As of April 25, 2015, the 50 U.S. State Governors in office had collectively gathered over 3 million followers and sent out over 150,000 tweets. Though the majority of their Twitter accounts are purely political, some, such as Michigan Governor Rick Snyder’s “OneToughNerd” account, reveal some personality traits, while others lend a certain intimacy, for example by including family pictures, as do Maryland Governor Larry Hogan, New Jersey Governor Chris Christie, Maine Governor Paul LePage and Louisiana Governor Bobby Jindal. This balance of personal life and public service information makes State Governors’ Twitter accounts very interesting objects for studying the Governors’ political stance in front of the public. This paper explores this largely uncharted field by analyzing the State Governors’ Twitter accounts through agenda-setting theory, to understand whether the State Governors’ activity on Twitter can be used to predict the popularity of parties or coalitions.

3 Agenda-Setting Theory

Twitter allows politicians to set their political agenda and reach their audience directly. Studying their behaviors offers a promising opportunity to further our understanding of agenda setting in digital media [46]. Agenda-setting theory is regarded as a key element in explaining mass communication effects and mass media influence over the long term. The primary assumptions of the theory were formulated by Maxwell McCombs and Donald Shaw in 1972 [39]. Agenda setting has been one of the most widely used theories in communication studies since then [33, 34, 38, 53, 54].

Agenda setting is the filter mass media perform when selecting certain issues and portraying them frequently and prominently, which leads people to perceive those issues as more important than others. Two levels of agenda-setting theory will be used in this study. The first-level agenda setting focuses on the amount of coverage of an issue, suggesting which issues the public will be more likely to be exposed to. The second-level agenda setting, also called framing as suggested by McCombs, Shaw and Weaver [40], examines the influence of attribute salience, or the properties, qualities, characteristics, and relations. By making some political issues salient, agenda setting makes these specific issues more accessible than others.

The first level of agenda setting is the issue level. Though some scholars categorize top issues manually [46], we use the top issues listed on the White House’s homepage. As of April 2015, the top seven issues listed were: economy, education, foreign policy, health care, immigration, climate change, energy and environment, and civil rights. April 2015 is also the time of our Twitter data collection. We identify whether politicians give attention to these issues by analyzing how often keywords and hashtags related to these issues are mentioned on their Twitter accounts. In the second level of agenda setting, we analyze whether Democrats and Republicans highlight different attributes of the same issue by examining the hashtags and keywords they choose when they do discuss an issue. We also examine the relations among those hashtags and keywords by constructing co-occurrence networks to see how they are framed in the Governors’ tweets.

Many researchers, such as Shogan et al. [28, 47] and Chi and Yang [10], found different tweeting patterns among Democratic and Republican Congressmen. Our research likewise aims to find whether State Governors’ Twitter accounts exhibit different levels of engagement. We then further our understanding of the general patterns of usage by applying second-level agenda-setting theory, or framing, to scrutinize the network structure of hashtags and keywords. Hence, we formalize the following three research questions:

RQ 1: How frequently do Governors use Twitter to discuss their political agenda? Do party differences emerge?

RQ 2: How do Governors’ Twitter accounts reflect their political agenda, and how similar are political agendas across Governors?

RQ 3: What similarities and differences emerge in hashtag usage among Governors’ Twitter accounts?

4 Data Collection

We used the Governors’ timelines to reference the tweets from the 50 U.S. Governors and the U.S. President Barack Obama. We collected 114,316 tweets from the Governors’ timelines. We downloaded the stream of tweets for each account by querying the Twitter Public API for user timeline by using a manually-collected list of account names. This returns the entire stream of tweets for each account, avoiding sampling issues [42]. We performed the queries between January 23 and April 26, 2015, for all 51 accounts, in a systematic way and with a 100 s pause between each account. The pause was set to prevent our script from sending queries that exceed the rate limitation of the API. All data were finally stored into a JSON file and later analyzed.
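The collection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: `fetch_timeline` is a hypothetical callable standing in for whatever client queries the Twitter API, and `collect_timelines` and `save_as_json` are names introduced here. Only the 100 s pause between accounts and the JSON storage come from the text.

```python
import json
import time
from typing import Callable, Dict, Iterable, List


def collect_timelines(
    accounts: Iterable[str],
    fetch_timeline: Callable[[str], List[dict]],
    pause_seconds: float = 100.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[dict]]:
    """Fetch one timeline per account, pausing between consecutive accounts.

    `fetch_timeline` is a hypothetical stand-in for the API client call.
    """
    timelines: Dict[str, List[dict]] = {}
    for index, account in enumerate(accounts):
        if index > 0:
            # Wait before every account except the first to stay under the rate limit.
            sleep(pause_seconds)
        timelines[account] = fetch_timeline(account)
    return timelines


def save_as_json(timelines: Dict[str, List[dict]], path: str) -> None:
    """Store all collected tweets in a single JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(timelines, handle)
```

Passing `sleep` as a parameter is a design choice made for this sketch so that the pause can be replaced by a no-op when testing.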

We parsed each tweet to extract words and hashtags using the regular expression package re with Python 3. We first removed the URLs by excluding patterns starting with http, https, ftp, and mailto. Then, tweet texts were converted into lowercase for consistency. Finally, we obtained hashtags and words by another set of regular expressions. The hashtags were defined as sets of concatenated characters starting with a pound sign (#), while the words were defined as concatenated sets that start and end with alpha-numeric characters.
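A minimal sketch of this parsing step with Python's `re` module is given below. The exact expressions used in the study are not reported, so the three patterns here are assumptions chosen to match the description: URLs are removed first, the text is lowercased, hashtags start with a pound sign, and words start and end with an alphanumeric character.

```python
import re

# Assumed patterns; the paper does not list its exact regular expressions.
URL_PATTERN = re.compile(r"\b(?:https?|ftp|mailto):\S+")
HASHTAG_PATTERN = re.compile(r"#\w+")
# A word starts and ends with [a-z0-9]; internal apostrophes and hyphens are allowed.
# The lookbehind keeps the body of a hashtag from being counted as a word.
WORD_PATTERN = re.compile(r"(?<![#\w])[a-z0-9]+(?:['’\-][a-z0-9]+)*")


def tokenize(tweet_text):
    """Return (hashtags, words) for one tweet after URL removal and lowercasing."""
    cleaned = URL_PATTERN.sub(" ", tweet_text).lower()
    hashtags = HASHTAG_PATTERN.findall(cleaned)
    words = WORD_PATTERN.findall(cleaned)
    return hashtags, words
```

Because the word pattern is restricted to ASCII letters and digits, accented characters split tokens; that restriction is a simplification of this sketch.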

We identified the keywords by manually looking for the most frequent words that could be indicative of specific topics and sound meaningful to ordinary readers. To identify the candidate words associated with each topic, we first manually parsed our collection of tweets and assigned the words that appeared together with the target topic as the candidate words for that topic (Footnote 5). For example, when we query for “health care” we assign each of the 17 words (we, will, fight, to, protect, the, healthcare, of, Floridians, their, right, to, be, free, from, federal, overreach) appearing in the tweet “We will fight to protect the healthcare of Floridians & their right to be free from federal overreach.” as a candidate keyword for health care. All the stop-words identified by the Python Natural Language Toolkit (NLTK) were removed. In the previous example, the set of candidate words after this further cleansing is reduced to (fight, protect, healthcare, Floridians, right, free, federal, overreach).

The next step was to remove the words that are syntactically needed but not contextually meaningful. We identified the words that were candidate keywords of more than one topic and manually marked each of them to be kept or removed. Words shared by more than one topic were marked for deletion if we were unable to find a potential topic for them; words that possibly related to any of the topics were marked to be kept. In the example, words to be deleted included: fight, protect, Floridians, right, federal, overreach. These words could not be attached unequivocally to any one topic. For example, the words fight and protect appeared more often attached to foreign policy and immigration issues, and the word right appeared more often related to civil rights issues. Words to be kept included: healthcare (as well as health care with a space), and freedom, which could be assigned to health care, in particular in relation to the Affordable Care Act (or ObamaCare). After we identified which words to delete or keep, we updated the set of candidate keywords for each topic.

We then ranked the candidate keywords by their overall frequency in our collection. The top seven candidate keywords for each category were used to identify the topic of each tweet. We assigned a tweet to a topic whenever any of the 7 keywords for that topic appeared in the tweet. The topics were not mutually exclusive: one tweet could be assigned to more than one topic when top candidate keywords from different categories occurred in it. We counted the number of tweets for each topic among the Governors. The agenda was finally recovered by ranking the topics by the number of tweets associated with them: the results are displayed in Table 1. We judge the quality of the agenda produced by our semi-automatic method to be satisfactory: the seven topics are each clearly identified by a short list of intuitive keywords. Using the same approach, we varied the number of keywords to include more words, finding that the results (discussed later) were substantially unaltered. Finally, the proposed method to generate the agenda was preferred over traditional topic modeling techniques that we tested, such as LDA, because of the inability of such probabilistic generative models [4] to discriminate between topics related to issues relevant to politics and other topics, irrelevant for our purpose, that appeared in the Governors’ Twitter timelines.
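The automatic parts of this procedure (stop-word removal, topic assignment, and agenda ranking) can be sketched as below. Several things here are assumptions made for illustration: `STOP_WORDS` is a tiny stand-in for NLTK's English stop-word list, the keyword sets passed in are toy values rather than the contents of Table 1, and keywords are treated as single tokens, so a two-word keyword such as "health care" would need separate handling. The manual disambiguation step is not modeled.

```python
from collections import Counter

# Stand-in stop list (assumption); the paper uses the NLTK English stop words.
STOP_WORDS = {"we", "will", "to", "the", "of", "their", "be", "from"}


def candidate_words(words, stop_words=STOP_WORDS):
    """Drop stop words from the tokens of a tweet that matched a topic query."""
    return [word for word in words if word not in stop_words]


def assign_topics(words, topic_keywords):
    """Return every topic with at least one keyword in the tweet (not exclusive)."""
    present = set(words)
    return {topic for topic, keywords in topic_keywords.items() if present & set(keywords)}


def rank_agenda(tokenized_tweets, topic_keywords):
    """Rank topics by the number of tweets assigned to them, most frequent first."""
    counts = Counter()
    for words in tokenized_tweets:
        counts.update(assign_topics(words, topic_keywords))
    return counts.most_common()
```

Applied to the lowercased tokens of the example tweet, `candidate_words` leaves the eight words listed in the text.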

Table 1. Top words per category

5 Experimental Results

5.1 Overall Tweeting Patterns

To answer RQ 1, we analyze the 114,316 tweets collected from the Governors’ timelines. The number of tweets produced by each Governor ranged from 30 to 3,242, with a median of 2,838. These figures demonstrate that the majority of Governors are quite active on the platform. There were 46,125 tweets posted by the 19 Democrat Governors, and 68,047 by the 30 Republican ones: this suggests that, on average, each Democrat produced 2,427 tweets, and each Republican posted 2,268 tweets; this difference is not particularly significant. President Obama contributed 3,242 to the Democrats, and the independent Governor of Arkansas had 144 tweets. We were able to identify 75,202 hashtags and words from the tweet texts after removing the URLs. Democrat Governors used 50,960 words while Republican Governors used 41,263. The Democrats also tweeted more distinct hashtags, 6,463, while Republicans had only 4,264. A previous study conducted by Shogan et al. [28, 47] on the House tweeting patterns suggested that Republicans tweet more, and that Twitter might be particularly appealing to American opposition politicians. Our analysis demonstrates that there are no significant differences in terms of average posting volumes between the two parties, and that the larger sheer number of Republican tweets is to be attributed to the significantly greater number of Republican Governors (30 versus 19 Democrats). However, some stylistic differences emerge, in that Democrat Governors seem to make a much more pervasive and diverse use of hashtags than Republicans.
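The per-party averages quoted above can be checked with a few lines of arithmetic. The figures are those reported in the text; the variable names are introduced here for illustration. The two party totals plus the independent Governor's 144 tweets add up to the 114,316 tweets collected, which suggests that President Obama's 3,242 tweets are counted within the Democratic total.

```python
# Tweet totals and Governor counts as reported in the text.
party_totals = {"Democrat": (46125, 19), "Republican": (68047, 30)}
independent_tweets = 144

# Average number of tweets per Governor for each party.
averages = {party: total / governors for party, (total, governors) in party_totals.items()}

# Sum of the party totals and the independent Governor's tweets.
collection_size = sum(total for total, _ in party_totals.values()) + independent_tweets

for party, average in averages.items():
    print(f"{party}: {average:.1f} tweets per Governor")
print(f"Total: {collection_size}")
```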

5.2 Political Agenda and Keywords Usage

To answer RQ 2, we describe each Governor’s posting behavior according to the agenda we defined in Table 1. For each Governor’s account, we calculated the number of times each keyword of Table 1 appeared in any of the Governor’s tweets. By sorting this dictionary of keywords and their usage in descending order, we obtain a ranking of each Governor’s keyword usage. We can therefore use the ranked keyword dictionaries to perform pairwise comparisons of Governors and try to capture similarities and differences in priorities regarding the categories of political discussion. Note that using rankings is preferable to using simple feature vectors of keyword counts: ranks are more amenable to direct comparisons (for example via Spearman’s rank correlation) without data normalization to account for different intensities of activity and other biases.

To measure the correlation of discussion keywords between all pairs of Governors, we use Spearman’s correlation applied to their ranked keyword dictionaries. Spearman’s rank correlation assigns each pair \(\langle X_i, X_j \rangle\) a similarity score between \(-1\) and 1, with \(X_i\) and \(X_j\) being the keyword ranks of Governors i and j respectively. Scores of 1 and \(-1\) indicate perfect positive and negative correlation, respectively, whereas a score of 0 suggests no correlation. The distribution of pairwise correlation scores is plotted in Fig. 1. The scores range roughly from \(-0.2\) (indicating a slight negative correlation) to very strong positive correlations greater than 0.8. The skewness towards positive scores can be attributed to the fact that we have considered only seven words per category, with seven total categories, for determining the rank distributions.
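As an illustration of the comparison just described, the sketch below computes Spearman's rank correlation between two Governors' keyword counts in plain Python. It is a minimal version under assumptions made here: ties receive the average of their ranks, keywords missing from an account count as zero, and the function names are hypothetical. The study's own implementation is not specified.

```python
from math import sqrt


def average_ranks(values):
    """Rank values in descending order (1 = largest), averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda index: -values[index])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all ranks are tied")
    return covariance / (spread_x * spread_y)


def spearman(counts_i, counts_j, keywords):
    """Spearman correlation between the keyword rankings of two accounts."""
    ranks_i = average_ranks([counts_i.get(keyword, 0) for keyword in keywords])
    ranks_j = average_ranks([counts_j.get(keyword, 0) for keyword in keywords])
    return pearson(ranks_i, ranks_j)
```

Because only ranks enter the computation, an account that tweets ten times as much about every keyword yields the same score, which is the normalization-free property noted above.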

Fig. 1. Distribution of Spearman rank correlation scores

Figure 2 shows the matrix of pairwise Spearman correlations among the 50 U.S. Governors plus the U.S. President Barack Obama. Visual inspection of Fig. 2 suggests the presence of a strong block structure, as groups of highly correlated accounts are clearly identifiable. To test this hypothesis, we generated a weighted graph of inter-Governor similarity using the matrix of Fig. 2 as adjacency matrix. The resulting graph is displayed in Fig. 3, where, for visual clarity, self-loops have been removed and all edges with weights (i.e., Spearman correlation) less than 0.8 have been filtered out. Figure 3 captures the agenda similarity network among Governors. Its analysis suggests the emergence of a strong community structure, where some Republican and Democratic Governors appear to be strongly aligned on agenda priorities and form two tight clusters: the large red cluster revolves around Wisconsin Governor Scott Walker, North Carolina’s Pat McCrory, Mississippi’s Phil Bryant, Iowa’s Terry Branstad, (former) Indiana Governor (and current Vice President nominee) Mike Pence, Maine’s Paul LePage, and a few others.
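The filtering used to draw the similarity graph can be sketched as follows, assuming the correlations are held in a symmetric list-of-lists matrix aligned with a list of account names. The function name and data layout are illustrative assumptions; only the removal of self-loops and the 0.8 cut-off come from the text.

```python
def similarity_edges(names, correlation, threshold=0.8):
    """Return (name_a, name_b, weight) for each pair whose correlation reaches the threshold.

    Only pairs with j > i are visited, which drops self-loops and avoids
    listing each undirected edge twice.
    """
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            weight = correlation[i][j]
            if weight >= threshold:
                edges.append((names[i], names[j], weight))
    return edges
```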

The similarity in terms of agenda priorities (as measured by the rank correlation) seems to be slightly less pronounced for Democrats: President Barack Obama seems to be isolated, carrying out an agenda significantly different from that of any Democratic Governor. A blue cluster nonetheless emerges, in which Colorado Governor John Hickenlooper, Missouri’s Jay Nixon, Kentucky’s Steve Beshear, Washington’s Jay Inslee, Vermont’s Peter Shumlin, and a few others show some agenda similarity. All the other Governors sit at the periphery of this network, showing spurious alignments with some of their counterparts and a less pronounced inter-party sharing of agenda priorities.

Fig. 2. Keyword-based correlation among Governors

Fig. 3. Governors network through the lens of agenda setting theory

5.3 The Governor-Hashtag Graph

To address RQ 3, we finally explored the similarity among the Governors at the hashtag level. We extracted the hashtags from each Governor’s timeline and created a Governor-hashtag graph. The nodes in this bipartite graph represent the Governors and the hashtags they used. A Governor node and a hashtag node are connected if the Governor used the hashtag in any of his/her tweets. The edge weight is the number of tweets that contain that hashtag. We only extracted the hashtags that were used more than 10 times among all the Governors and by more than two Governors, to focus specifically on more common hashtags. This yielded 658 common hashtags in our collection. We also tried to recover the community structure by using the Louvain modularity maximization algorithm [5]. The results for the Governors’ hashtag usage are shown in Fig. 4. For visual clarity, the graph only shows the nodes connected by edges with weights larger than four. The large circles denote the Governor nodes, and the small ones the hashtag nodes.
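The construction of the weighted Governor-hashtag graph can be sketched as below. The input layout (a dictionary mapping each Governor to a list of per-tweet hashtag lists) and the function name are assumptions made for this example, and "used more than 10 times" is interpreted here as the number of tweets containing the hashtag. Community detection is left out of the sketch.

```python
from collections import Counter, defaultdict


def governor_hashtag_edges(hashtags_by_tweet, min_total=10, min_governors=2):
    """Return {(governor, hashtag): weight} for sufficiently common hashtags.

    The weight is the number of that Governor's tweets containing the hashtag.
    A hashtag is kept only if it appears in more than `min_total` tweets
    overall and is used by more than `min_governors` Governors.
    """
    weights = Counter()
    for governor, tweets in hashtags_by_tweet.items():
        for tags in tweets:
            for tag in set(tags):  # count each hashtag once per tweet
                weights[(governor, tag)] += 1

    total_uses = Counter()
    users = defaultdict(set)
    for (governor, tag), weight in weights.items():
        total_uses[tag] += weight
        users[tag].add(governor)

    return {
        (governor, tag): weight
        for (governor, tag), weight in weights.items()
        if total_uses[tag] > min_total and len(users[tag]) > min_governors
    }
```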

We identified four communities using the modularity algorithm with the resolution set to 2.0. Varying the resolution limit parameter [24] provided consistent results. The four communities contained 36, 9, 3 and 3 Governors, respectively, as shown in Fig. 4. We colored the largest community in red to indicate that it is the community with the largest fraction of Republican Governors (24). The second community is colored in blue to indicate that it is the community with the largest fraction of Democrats (8). The other two communities were colored in green and purple, respectively. We believe that the green cluster should belong to the Democrats (it contains Democrats like Vermont’s Peter Shumlin and New Hampshire’s Maggie Hassan); the purple cluster contains several Republican Governors (e.g., Ohio’s John Kasich and Maine’s Paul LePage). Overall, the clustering algorithm assignment was correct for 32 of the 51 Governors (62.7 %). It generated 24 correct assignments out of the 30 Republicans (80.0 %) and 8 correct among the 19 Democrats (42.1 %), while the Independent Governor of Arkansas was assigned to the reds.
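The agreement figures above can be reproduced with a small helper like the one below. It is only a sketch: the function name and the dictionary-based inputs are assumptions, and a community without a party label (such as the green and purple ones) is treated as a mismatch, which is consistent with the counts 24 + 8 = 32 reported in the text.

```python
from collections import Counter


def assignment_accuracy(party_of, community_of, party_of_community):
    """Fraction of accounts per party whose community carries that party's label.

    party_of: {account: party}
    community_of: {account: community id}
    party_of_community: {community id: party label}; unlabeled communities never match.
    """
    correct = Counter()
    total = Counter()
    for account, party in party_of.items():
        total[party] += 1
        if party_of_community.get(community_of[account]) == party:
            correct[party] += 1
    return {party: correct[party] / total[party] for party in total}
```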

In light of the most meaningful keywords for each of the seven categories summarized by Table 1, we parsed each Governor’s timeline to determine to what extent the tweets of each individual were representative of each category. The underlying assumption of this strategy is that the more a State Governor tweets about any particular category, the more he/she is concerned about that particular issue, or at least wants to convey that message to his/her followers. In general, for both parties, it is quite easy to scrutinize the most recurring topics of discussion of each Governor and identify those who concentrate more or less on politics and policy related topics, or other types of events.

Fig. 4. Governors and hashtag network (Color figure online)

Figure 4 illustrates the most commonly occurring hashtags and issues of discussion of the two groups. Its analysis yields a good deal of insight into U.S. political discussion. One can notice the commitment of certain Governors to specific topics: for example, Vermont Governor Peter Shumlin seems to be pushing an agenda focused on environment, energy, and local economy issues. Others, like Connecticut Governor Dan Malloy, Arkansas’ Asa Hutchinson, and U.S. President Obama, focus on issues related to climate change, equality, health care, and education.

The Republican agenda is fairly diverse but focuses mostly on issues related to the economy (small business, innovation, “made in USA”, agriculture), immigration and security (human trafficking, Texas), and civil rights (especially veterans’, military, and marriage rights). A number of external events are also discussed (note that we did not remove any hashtag from the Governor-hashtag graph as long as it met the threshold criteria explained above): some examples include references to sports events (NASCAR, college basketball’s March Madness, etc.), political events (2012 Elections, the GOP Convention, etc.) and tragedies (the Boston Marathon bombing, the Sandy Hook school shooting, etc.).

6 Conclusions

In this article we explored the landscape of U.S. Governors’ political communication on Twitter using the tool of agenda-setting theory. We first collected a sizable number of tweets (over one hundred thousand) generated by these politicians, and found that most of them are quite active Twitter users. Our results clarified some previous research about the usage of social media platforms by Democratic and Republican politicians, showing that Republican and Democrat Governors tend to be more or less equally active on Twitter on average. However, they exhibit different styles of communication, with the Democrats significantly more inclined to use hashtags than their counterparts.

We furthered our understanding of Governors’ priorities using agenda-setting theory to identify a set of seven categories of top socio-political issues, by means of a semi-automatic annotation strategy. After inferring the priorities of each Governor and computing the pairwise similarity among Governors, we constructed a network that reflects the similarity of Governors’ agendas. Its analysis illustrates that President Obama has a distinctive agenda-setting strategy, which has no affinity with that of either Democratic or Republican Governors.

The graph also shows that Republican Governors, such as Wisconsin Governor Walker, North Carolina’s McCrory, Mississippi’s Bryant, Iowa’s Branstad, Indiana’s Pence, Maine’s LePage, and a few others, shared the most similar issue agendas. On the Democratic side, Colorado Governor Hickenlooper, Missouri’s Nixon, Kentucky’s Beshear, Washington’s Inslee, Vermont’s Shumlin, and a few others form a tight blue cluster of aligned agendas. Republican and Democratic Governors’ clusters tend to be quite polarized, which confirms the intuition that the two parties have significantly different (at times conflicting) agendas and different political priorities. Similar insights emerged from the analysis of the hashtag co-occurrence networks, which allow for an easy identification of the topics of discussion of both parties.

This study described the high-level dynamics of the adoption of Twitter by U.S. Governors, based on how they set their agenda on top political issues and how they frame their conversation around it. Further studies should explore public agenda setting, that is, the agenda of the public in each State, to see whether it shares similar trends with the Governor’s agenda. This would shed light on the effects of politicians’ social media conversation on the public.