Introduction

In recent years, due to the wide diffusion of mobile technologies and easy availability of online sources, interest in the content that users generate and share through the Internet has constantly grown. In particular, researchers have investigated microblogs and social networks not only to understand interests, sentiment, opinions and social behaviours but also to make sense of the huge amount of information coming from human sensors (Goodchild 2007) to detect anomalies, to note interesting events or to support decision makers.

With 500 million tweets a day and 271 million active users (Twitter Advertising Blog 2015), Twitter is a representative case of user-generated content and has therefore attracted several researchers from different fields including computer science, economics, and the geo-sciences. Nevertheless, the comprehension of such data is not an immediate process due to the particular nature of the tweets. In fact, each tweet is a piece of unstructured text that can have up to 140 characters, can contain abbreviations or slang forms and typing errors, can include links to videos or photos, can include one or more hashtags (preceded by a “#”) to indicate the subject(s) of a tweet, can be a direct answer to another user (identified by “@” followed by the user name of the addressed user), or can reprise and pass along another tweet (indicated by “RT”, or retweet). Moreover, each user can create his/her own network of contacts by subscribing to, or becoming a follower of, other user(s) based on shared interests, familiar/friendship relationships, or the followed user being a public person or an institution. Because the only limit to the content of a tweet is the user’s imagination, these data cover a very wide spectrum of themes, which can be useful in several domains, but they are also extremely noisy. This noise can prevent the detection of interesting events or behaviours. Moreover, apart from the content of the tweet itself, several other aspects can be considered when working with these data, e.g., the location in which a comment has been tweeted, when this tweet occurred, and whether this information is reliable.

In many application domains, the analysis of huge amounts of data can take advantage of the relatively new field called Visual Analytics (VA). Defined as the science of analytical reasoning facilitated by interactive visual interfaces (Thomas and Cook 2005), the strength of VA resides in the intertwined use of analytical and visual methods supported by human-computer interaction, e.g., to discover interesting patterns, unknown insights, events, and relationships among these.

Thus, the focus of this survey is on the visual analytics approaches designed to analyse the microblogging content of Twitter data, taking into account the spatial and temporal aspects. We are aware that the strong influence of a single commercial product such as Twitter might, to a certain extent, limit the validity of the conclusions; however, several benefits come from our analysis. First, despite the high level of interest in the research community, to the best of our knowledge, no survey covers this topic; therefore, our analysis represents a first attempt to fill this gap. Second, the approach we propose can be extended to other commercial products such as Sina Weibo (Sina Weibo 2009) and to other textual sources. Third, the business strategies recently adopted by Twitter, such as the acquisition of the live-video streaming start-up Periscope (Yoree and Evelin 2015) and the planned sale of trillions of tweets to data miners (Garside 2015), suggest that this company will continue generating social content in the near future. Therefore, a general overview of approaches to analysing this content will be of interest in the VA research field.

Thus, the major contributions of this work are the following:

  • An overview of the visual analytics approaches for the spatio-temporal exploration of microblogging content. With a particular focus on the research questions that inspired these approaches and on the application fields to which they were applied, we identified three main research directions and grouped the application fields into a limited set of categories.

  • A quantitative and qualitative comparison of analytical and visual methods proposed by the surveyed papers. These methods are organized according to their characteristics and purpose, and different tables summarize and illustrate common aspects and interesting and unusual insights.

  • Identification of present and future challenges that we feel are worth further investigation. On the one hand, we note the challenges that can be identified in the surveyed papers; on the other hand, we also identify the open issues resulting from our comparisons.

  • A simple but effective collection of interactive charts to explore the surveyed papers. Using a freely available tool, we created an interactive online counterpart. These resources might allow, for example, researchers and students to interact and further explore the surveyed papers from different points of view such as research questions they address, analytical and visual methods adopted, and publication year.

The remainder of the paper is structured as follows. After the description of the adopted methodology and related work (Methodology to Analyse Microblogging Content Based on the Example of Twitter and Related Work sections), the core contributions of the paper are presented. From a general overview (A General Comparison of Research on Microblogging Content section) addressing research questions and application fields, we move to a more detailed comparison concentrating on analytical and visual methods, in which we emphasize common issues, limitations and interesting cases (Detailed Comparison of Research on Microblogging Content Using Twitter as an Example section and, for the online counterpart, An Interactive State of the Art section). Finally, in addition to these analyses, we delineate opportunities and challenges that future research should investigate (Challenges section).

Methodology to Analyse Microblogging Content Based on the Example of Twitter

In general, different approaches have been adopted in recent years to analyse and comprehend microblogging content. They can be grouped into (i) qualitative, (ii) quantitative, and (iii) mixed approaches, which are presented in the following.

Several solutions address the content of the data directly and are based on qualitative approaches. On the one hand, one can investigate the purpose of single tweets and user intentions and thus ask why people tweet; on the other hand, one can explore what users are talking about, the subjects of their tweets and possibly trending topics. In the first case (why), (Java et al. 2007) identify four major user intentions: a) communicating about daily routine (called daily chatter); b) conversation (using a “@”); c) sharing information or URLs; and d) reporting news. Similarly, (Alhadi et al. 2011) define eight different purposes for microblogging activities such as Twitter: a) social interaction with people; b) promotion or marketing; c) share resources; d) give or require feedback; e) broadcast alert/urgent information; f) require/raise funding; g) recruit a worker; and h) express emotions. In the second case (what), (Dann 2010) suggests the use of six top-level categories and 23 distinct subcategories for in-depth analysis of Twitter. The top-level categories are a) conversational; b) pass along; c) news; d) status; e) phatic; and f) spam. Similarly, (Crawford 2010) classifies the tweets into eight categories: a) user’s current status; b) private conversations; c) links to web content; d) politics, sports, and current events; e) product recommendations/complaints; f) advertising; g) spam; and h) other (i.e., messages that match none of the above categories). Finally, one can focus on the content of tweets with an analysis based on the hashtags, as in (Kamath et al. 2013), to identify the most relevant subjects. However, from our experience and also from a recent study (Potts et al. 2011), although the use of hashtags to characterize a tweet is well known, most users tend to abuse (Woollaston 2013) or misuse (Henry 2012) hashtags, or even neglect them altogether, which makes the analysis even more complicated and most likely less reliable.

Quantitative approaches are also commonly adopted. These approaches consider, e.g., the hourly number of tweets related to a given topic, the daily distribution of specific terms or topics, and the number of tweets per user. Such an approach can be applied in several fields, for instance, to detect influenza epidemics (Culotta 2010), emergencies such as fires (Dobson and Fisher 2003), or other types of spatially and temporally located events (Fujita 2013).
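
As an illustration of such a quantitative approach, the following minimal sketch (in Python, with a few hypothetical tweets) counts the hourly number of tweets mentioning a given topic; the keyword, timestamps and texts are placeholders, not data from any surveyed work.

```python
import pandas as pd

# Hypothetical pre-collected tweets: one row per tweet (timestamp and text).
tweets = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2013-04-15 08:12", "2013-04-15 08:47", "2013-04-15 09:03",
    ]),
    "text": [
        "Flu season is hitting hard #flu",
        "Stuck in traffic again",
        "Half the office is out with the flu",
    ],
})

# Keep only the tweets mentioning the topic of interest (simple keyword match).
topic = tweets[tweets["text"].str.contains("flu", case=False)]

# Hourly number of tweets related to the topic.
hourly_counts = topic.set_index("created_at").resample("1H").size()
print(hourly_counts)
```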

Mixed approaches are also applicable; these are based, e.g., on user interviews and query log analysis. Thus, one can explore, for instance, the search behaviour of Twitter users (Teevan et al. 2011) and discover that users tend to seek temporally relevant information (e.g., breaking news and popular topics) or information related to people (e.g., comments or opinions about a film), and that they monitor such results by repeating the same queries.

The focus of this survey is the literature addressing the content (what) of Twitter data, that is, approaches and methods more related to a qualitative investigation. The most important scientific digital libraries and journals were searched, and the most relevant venues were considered to collect the necessary material (even in the form of videos whenever possible). The temporal interval considered spans from 2008 to 2016, both for publications and venues. Considering that Twitter was launched in July 2006 and required approximately two years to reach a critical mass of users and popularity, this constraint appears reasonable. This intuition is supported by the fact that one of the oldest tools (likely the oldest one) considering all of the aspects we focus on was presented in 2009 at InfoVis09 (see Tweetsters (Kim et al. 2009)).

We considered 80 publications; concerning the typology of publication (see Fig. 1), we distinguished between journal and conference publications. Moreover, we grouped the papers according to three main categories, that is, visualization-related, geo- (or generically spatial-) related, and data/knowledge engineering-related publications, plus a last category (“other”) containing the works not belonging to the previous ones and those from venues represented by a single paper. For more details, please consult Table 1.

Fig. 1
figure 1

Number of papers based on typology (journal or conference) and category. The yellow bar summarizes venues with a single paper that are not included in the three main publication categories (visual, geo, or data/knowledge engineering-related).

Table 1 Typology and category of publication. The first column shows typology and category of the surveyed papers; the second column provides examples of journals or conferences belonging to the typology/category in the corresponding left cell

After collecting the papers, the next step was devoted to the definition of the criteria based on which these works would be organized and compared. Two main background issues guided this phase: first, what the researchers were aiming at as they investigated the field, that is, the research questions guiding their work; second, because VA plays a relevant role in each of the examined works, to understand to what extent this is the case, we examined the papers to determine what analytical methods are commonly adopted, whether the user is involved, what visual methods are usually employed, the aim for which they are intended and the type of interaction they provide. The results of our comparison are organized in different tables, providing both quantitative and qualitative information and several interesting insights. Moreover, an interactive online counterpart of these tables is made available; thus, the reader (e.g., a student or scientist) can easily access the content and insights of this survey and further explore the papers according to his/her interests.

Finally, the authors would like to emphasize that, although they are aware that some publications might be missing, they are convinced that this omission does not hinder deriving general conclusions or identifying the weaknesses and potentialities of this research field. Similarly, this work clearly represents a first attempt in this field; consequently, the structure of the comparison presented here may be extended or partially revised in the future based on new discoveries.

Related Work

As mentioned in the introduction, to the best of our knowledge, no previous work covers the topic we address, that is, how visual analytics approaches are designed to explore the microblogging content of Twitter data, taking the spatial and temporal aspects into account. Surveys on Twitter do exist, but rather than on the content, they focus on the need for new data management and query language frameworks (e.g. (Goonetilleke et al. 2014)). However, we must note that, although they represent interesting examples of the visual analysis of tweets, notable works were intentionally not included in our survey because they completely neglect the spatial aspects of the data. We refer for instance to systems such as TIARA (Pan et al. 2013) or applications such as #FluxFlow (Zhao et al. 2014), which address the analysis of large collections of textual data and focus respectively on the temporal evolution of topics (TIARA) and on the anomalous spreading of information (#FluxFlow). Thus, we can extend our comparison a step further to works investigating textual data in general. In particular, a recent paper (Wanner et al. 2014) provides a comprehensive overview of visual analysis approaches for event detection in textual data streams, and Twitter data are considered one of the available textual sources. However, as our survey emphasizes, this paper can be further extended from different points of view. First, event detection is only one of the main research categories that can be identified in the literature, and the other categories we identified also contain valuable works. Second, from a methodological point of view, relevant methods such as tf-idf are excluded because they are judged elementary; independent of whether the methods are elementary, we show that they are frequently adopted in the analysis of tweets and therefore cannot be excluded tout court. Similarly, the spatial aspect of the data is almost completely neglected, whereas we illustrate how tightly connected the temporal and spatial components are, particularly in tweet analysis. Finally, whereas (Wanner et al. 2014) provide a vague statement concerning the limited user interaction with the analytical methods, we emphasize the lack of user involvement with concrete examples and references.

A General Comparison of Research on Microblogging Content

In this section, the first part of the comparison of the surveyed papers is illustrated. To perform such a task, we start with a more general overview that considers two criteria that represent an intuitive approach to arranging the examined works – the research questions that the authors try to address and the application fields in which they employ their approach. Thus, one can comprehend (also from a quantitative point of view) the research foci in which the scientists concentrate their efforts and the application fields for which the analysis of Twitter data was considered of interest.

Concerning the former criterion (research question), independent of formulation, application context and methods adopted, these research questions share basic concepts such as, for example, the volume of data, noise, relevant events, and normal or abnormal situations. After collecting and arranging them, we believe they can be summarized into three main categories: 1) research questions addressing support (for) decision-making; 2) those aiming at event detection; and 3) those concentrating generically on the comprehension of what people discuss, e.g., what are the opinions or sentiments related to a certain topic such as a concert or a football game. The columns “Research question” and “Main research category” (see Table 2) emphasize this fact with more detail. For example, the main research category support (for) decision-making includes questions such as “How can "noise" from tweeting during a crisis be reduced?” (Pritchard et al. 2012) and “How can crisis management and decision-making be supported using social media?” (MacEachren et al. 2011a; Yin et al. 2012; Robinson et al. 2013). Event detection encompasses questions such as “How can important events be distinguished from not-relevant groups of information in a large volume of data?” (Bosch et al. 2013), “Can human sensors be adopted to detect earthquakes quickly and with good accuracy?” (Sakaki et al. 2013), and “Is it possible to identify and support the exploration of relevant events in a large volume of data?” (Dou et al. 2012). Finally, the category comprehension comprises issues such as “How can microblog (Twitter) content and its changes over time be identified and represented?” (Lohmann et al. 2012) and “When and where is an idea dispersed? How can it be represented with opinion/sentiment about it?” (Cao et al. 2012). Another interesting insight is the temporal distribution of these categories. As the digital counterpart of this table shows (see An Interactive State of the Art section), the interest of researchers in event detection clearly increased beginning in 2012. This increase can be related to the increasing use of Twitter as a further source of information, e.g., in the case of an earthquake.

Table 2 Research questions and application fields. The number of times each application field is adopted is given in brackets (e.g., “earthquake” was adopted as a case study 11 times). In the last column, SDM, ED, and C stand for support for decision-making, event detection, and comprehension, respectively

Concerning the latter criterion (application field), that is, the contexts in which human tweeting activity has been considered until now, we not only list these application focusses but also try to further group them into six major groups (first six columns of Table 2): conversational, sport and social events, crime, health & epidemic outbreak, disaster & crisis management, and information diffusion. Note that the same work may in some cases belong to more than one group, which appears reasonable; in fact, the same solution can be applied to and cover different application contexts. However, this table emphasizes two additional facts. First, the scientists investigated how human social behaviour can be analysed, understood and forecasted by employing these data (column conversational). Second, from a more quantitative perspective, they tried intensively to understand what type of support this source of information could provide during natural disasters or health-related problems (columns disaster & crisis management and health & epidemic outbreak).

As a final remark, we note that different kinds of users can be involved in any of the identified main research categories, ranging from less-skilled users to experts. For instance, practitioners and analysts with knowledge of visualization and GIScience are involved in (MacEachren et al. 2011a) to support a decision-making process; generic users (e.g., a journalist with basic knowledge of visual tools) can take part in the event detection shown in (Dou et al. 2012; Marcus et al. 2011) and can investigate (i.e., comprehension) the interests and emotions of tweeters (Saravia et al. 2015) or the geographical distribution and occurrence of keywords (Klomklao et al. 2016).

Detailed Comparison of Research on Microblogging Content Using Twitter as an Example

In this section, the second part of the comparison of the surveyed papers is presented, centred on the importance that the examined works give to GeoVisual Analytics (GVA). The strength of GVA resides in the intertwined use of analytical and geovisual methods supported by human-computer interaction. To understand to what extent this use occurs, the papers have been examined to determine what analytical methods are commonly adopted and whether the user is involved (Analytical Methods section). Similarly, we investigated what visual methods are usually employed, the aim for which they are intended and the type of interaction they provide (Visual Methods section). Considerations about the type of analysis (static or dynamic) are also presented (Static vs. Dynamic section). Finally, the online counterpart of our state-of-the-art study is briefly presented (An Interactive State of the Art section).

Analytical Methods

Table 3 shows the results of our comparison and some interesting insights. First, a large variety of analytical methods (see the bottommost cells in each column) are adopted to analyse the microblogging content. However, they can be grouped into seven major categories or general methods (the uppermost cells with bold text): those related to speech or entity analysis (e.g., natural language processing, part-of-speech tagging) and to terms or keywords analysis (e.g., term frequency or document frequency); those based on probabilistic models (e.g., Latent Dirichlet Allocation, LDA for short (Blei et al. 2003)), classification, or clustering; and those trying to extend and improve the information coming from the tweet using, e.g., a link to videos or pictures or the semantic meaning of terms (this category is named enrichment for this reason). The final category includes those methods not belonging to any of the previous categories because they are very specific or adopted in a single or very few cases.
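
To make the terms or keywords analysis category concrete, the following sketch computes tf-idf weights over a handful of hypothetical tweets and ranks the most characteristic terms of one tweet; it is an illustrative baseline using scikit-learn, not the pipeline of any specific surveyed system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

tweets = [
    "Earthquake felt in the city centre #earthquake",
    "Traffic jam near the stadium before the game",
    "Strong earthquake, buildings shaking downtown",
]

# Term frequency / inverse document frequency over the small tweet collection.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(tweets)

# Rank the terms of the first tweet by their tf-idf weight.
terms = vectorizer.get_feature_names_out()
weights = tfidf[0].toarray().ravel()
print(sorted(zip(terms, weights), key=lambda x: -x[1])[:5])
```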

Table 3 Identified analytical methods. The column “other” includes methods not belonging to any of the previous ones and not often adopted, such as seasonal-trend decomposition, self-organizing maps (SOM), and support vector regression

Second, the first three categories (i.e., speech or entity analysis, terms or keywords analysis and probabilistic models) can be further grouped into a coarser one, which can be named “Topic identification” because the methods contained in these categories primarily focus on the identification of topics. However, they define the topics in different manners, i.e., starting from a single term, from a set of terms, or from a distribution of terms, respectively (Figs. 2 and 3).

Fig. 2
figure 2

An example of user involvement in analytical methods is represented by Scatterblog2 (Bosch et al. 2013) that allows the interactive creation of classifiers and modification of the number of iterations or topics of the LDA model

Fig. 3
figure 3

Another rare example of user involvement is given by (Senaratne et al. 2014) in which the user can modify the cluster parameters, e.g., minimum points for cluster and radius

Third, and most likely the most relevant outcome related to the analytical methods adopted, is that the user does not play a central role in the analysis: apart from very few exceptions, the background knowledge s/he might possess is usually ignored. Moreover, even in these exceptions, the involvement is limited. Specifically, (Bosch et al. 2013; Chae et al. 2012) allow modifying the number of iterations or the number of topics of the LDA model, (Bosch et al. 2013) permits the interactive creation and training of classifiers, (Pritchard et al. 2012) involve the user in the selection of POS elements, (Senaratne et al. 2014) allows modifying the cluster parameters, e.g., minimum points per cluster and radius, whereas (Kim et al. 2011) permit the user to define correlated functions to aggregate geo-tagged tweets. Even more limited, but notable, is the user involvement provided by (Fischer and Keim 2014). In this case, the analyst can remove or reorder some features using a drag-and-drop interaction and work on the results of the algorithm; however, s/he cannot directly change the parameters of the algorithm.
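
The kind of parameter exposure discussed above can be illustrated with a short sketch: an LDA model whose number of topics and iterations are user-chosen values, and a density-based clustering of geo-tagged tweets with a user-chosen radius and minimum number of points. The sketch uses scikit-learn as a stand-in; the parameter values and toy data are hypothetical, and the surveyed tools obviously wrap such calls in interactive interfaces.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

tweets = ["flood downtown #flood", "river rising fast, stay safe", "great concert tonight"]

# Topic model whose number of topics and iterations could be set interactively.
n_topics, n_iterations = 2, 20                      # hypothetical user-chosen values
dtm = CountVectorizer(stop_words="english").fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=n_topics, max_iter=n_iterations,
                                random_state=0).fit(dtm)

# Density-based clustering of geo-tagged tweets with user-chosen
# radius (eps, here in degrees) and minimum points per cluster.
coords = np.array([[51.05, 13.73], [51.06, 13.74], [48.14, 11.58]])
labels = DBSCAN(eps=0.05, min_samples=2).fit_predict(coords)
print(lda.components_.shape, labels)
```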

Visual Methods

Tables 4 and 5 illustrate the results of our comparison for the visual methods. The methods have been grouped into those that consider the spatial aspects, the temporal aspects, or both (Table 4) and those related to the content of the data (Table 5). The latter have been further subdivided into visual methods intended to visualize terms or topics, to graphically depict quantitative information or relationships, and to query and filter the data.

Table 4 Visual methods and spatio-temporal aspects
Table 5 Visual methods and content

Along with the identified visual methods grouped according to their focus on spatial, temporal or spatio-temporal aspects, Table 4 shows the aim of such methods (row “General aim”) and the number of times each visual method is adopted (row “Specific visualization”, in brackets). A first consideration relates to the quantitative aspects of the comparison. Not surprisingly, when addressing spatial information, a representation in the form of a more or less sophisticated map is very common (see the column “Geographical Map” in Table 4). However, although the combination of these maps with methods for representing or filtering temporal information is also frequent, the integration of spatial and temporal aspects in a single solution is rarely employed (e.g., the space-time cube (Kim et al. 2010)).

Different interactive functionalities are provided, but zoom & pan represents a sort of basic feature for maps that almost all of the approaches adopt. A single selection (i.e., the selection of a single visual item) and brushing (e.g. (Chae et al. 2012; Bertini et al. 2011)) are also very frequently used, whereas multiple selection is rarely adopted (e.g. (MacEachren et al. 2011a, b)); however, all of them commonly support linking. Details on demand (DoD) can be obtained via single selection to show the tweet content represented by the selected icon on the map; in some cases, DoD are also provided by mouse-over to show, e.g., date and topic (Lohmann et al. 2012), the number of tweets (Abel et al. 2012a, b), or entities (Dou et al. 2012).
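
A minimal example of a zoomable map with details on demand can be sketched with the folium library: each geo-tagged tweet becomes a marker whose popup shows the tweet content on selection. The coordinates and texts below are hypothetical.

```python
import folium

# Hypothetical geo-tagged tweets: (latitude, longitude, text).
tweets = [
    (40.7128, -74.0060, "Power is out in Lower Manhattan"),
    (40.7306, -73.9866, "Subway station flooded near Union Square"),
]

m = folium.Map(location=[40.72, -74.00], zoom_start=12)   # zoom & pan by default
for lat, lon, text in tweets:
    # Clicking a marker reveals the tweet content (details on demand).
    folium.CircleMarker(location=[lat, lon], radius=5,
                        popup=folium.Popup(text, max_width=250)).add_to(m)
m.save("tweet_map.html")
```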

A particular solution is proposed by (Bosch et al. 2013; Chae et al. 2012; Thom et al. 2012; Andrienko et al. 2013); hovering the mouse over a certain geographical area works as a content lens and shows the related terms in the form of a (tag) cloud. Very rarely, the layout can be reconfigured; the user can change the layout to be circular or longitudinal ((Cao et al. 2012), Fig. 4), can change the opacity and radius of geo-tags (Hassan et al. 2014) or can enable/disable tweets from certain regions (Musleh 2014).

Fig. 4
figure 4

Sunflower metaphor proposed by Whisper (Cao et al. 2012) to investigate the information diffusion over Twitter; temporal animation and changeable map layout are some of the interactive functionalities provided

Although some approaches provide no interaction for explicitly addressing the temporal aspects, most of the works do so – and in very different ways. White and Roth (2010), Dou et al. (2012), and Abel et al. (2012a, b) adopt mouse-over to show, e.g., topic and date; (Dou et al. 2012) provide linking (i.e., the map and word cloud show only the keywords related to the event hovered over on the timeline). Marcus et al. (2011), Yin et al. (2012), and Ji et al. (2013) permit changing the temporal granularity of the timeline (based on different predefined temporal granularities), and (Marcus et al. 2011) also allow selecting peaks. Animations to represent the temporal aspects of the data are used by, e.g. (Bertini et al. 2011; Musleh 2014); nevertheless, from an overall perspective, such approaches are rarely adopted. However, interesting solutions are provided by (Yin et al. 2012; Marcus et al. 2011), who integrate timeline and temporal filters to facilitate selecting a time range of interest within the same visualization.

Another option to address the temporal aspects of the data, but with a different aim from those described previously (i.e., to filter rather than to explore), is the use of a temporal filter. This option is realized in different ways. A single slider (e.g. (ap Cenydd et al. 2011; Ji et al. 2012)) or two sliders (e.g. (Jackoway et al. 2011; Croitoru et al. 2012)) allow adjusting the beginning and ending points of the time interval of interest; in contrast, exact start and end dates are required by (Hassan et al. 2014; Kumar et al. 2011; Meyer et al. 2011). In some cases, DoD are given in the form of a sparkline (e.g. (MacEachren et al. 2011a, b)), heatbar (e.g. (Bosch et al. 2011)) or histogram (Bosch et al. 2013) over the temporal filter to show, for example, the frequency, volume, or sentiment of the elements contained in the selected interval. Finally, animation and the ability to choose a time window are only provided by (Cao et al. 2012).
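
The basic mechanics of such a temporal filter can be sketched as follows: a user-selected interval (the two slider positions) restricts a set of hypothetical timestamps, and a small histogram over the filter summarizes the volume within the selected window.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical tweet timestamps and a user-selected interval (the two "sliders").
timestamps = pd.Series(pd.to_datetime([
    "2012-10-29 18:05", "2012-10-29 18:40", "2012-10-29 19:10", "2012-10-30 02:30",
]))
start, end = pd.Timestamp("2012-10-29 18:00"), pd.Timestamp("2012-10-29 20:00")

selected = timestamps[(timestamps >= start) & (timestamps < end)]

# Small histogram over the filter: tweet volume within the selected window.
selected.groupby(selected.dt.floor("30min")).size().plot(kind="bar")
plt.tight_layout()
plt.savefig("interval_volume.png")
```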

As a final consideration concerning the first group of visual methods, we emphasize that among the few cases trying to mix the temporal and spatial aspects, the most interesting solutions are most likely StickViz (Kim et al. 2010) and AnimatedRibbon (Itoh et al. 2016) (see Fig. 5). They allow the user to select among five types of shapes for representing a phenomenon of interest and to visually identify events and relevant changes.

Fig. 5
figure 5

AnimatedRibbon (Itoh et al. 2016) is one of the few cases trying to mix temporal and spatial aspects to identify traffic anomalies

As mentioned previously, Table 5 illustrates the visual methods intended to represent the microblogging content, grouped based on their focus; the aim of such methods is represented by the row “General aim”, and the number of times each one was found is shown in brackets (row “Specific visualization”). Starting from a quantitative point of view, the most frequent terms, topics or tags are usually represented in the form of a cloud such as a word cloud (e.g. (Bertini et al. 2011)), tag cloud (e.g. (Morstatter et al. 2013)), or topic cloud (e.g. (Chae et al. 2014)). Although most of the cases provide no interaction, some interesting approaches are found. For instance, hovering the mouse over a term can highlight the corresponding co-occurring or related terms (Lohmann et al. 2012; Kraft et al. 2013), can simply display a tooltip with the number of tweets, can be connected with other views and highlight the related tweet in the tweet list (MacEachren et al. 2011a, b), or can be connected with more sophisticated content, e.g., reference articles, news and related terms from Wikipedia (Purohit et al. 2013). More interestingly, (Lohmann et al. 2012) provide further details about the temporal variation of term usage using a histogram as background to each term and allow reconfiguring the layout of the word cloud based on factors such as the linear/logarithmic scale of the terms' font size (based on the number of terms to be displayed) or the threshold (i.e., the minimum number of times a term must occur to appear). Moreover, as mentioned previously, few approaches involve the user directly in the configuration of the analytical methods. These exceptions comprise the solutions proposed by (Bosch et al. 2013; Chae et al. 2012), which visually support the user in adjusting the number of iterations and the number of terms per topic for the LDA computation (see, e.g., the column “LDA topic view” in Table 5).
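
As a minimal counterpart to these cloud-based views, the following sketch builds a word cloud from term frequencies over a few hypothetical tweets using the wordcloud library; the surveyed tools add the interaction and linking discussed above on top of such a basic rendering.

```python
from collections import Counter
from wordcloud import WordCloud

tweets = [
    "earthquake downtown #earthquake",
    "felt the earthquake at home, no damage",
    "just a short shake, everyone is fine",
]

# Term frequencies over the collection (a minimal stand-in for the
# keyword extraction step of the surveyed tools).
freqs = Counter(word.strip("#,.") for tweet in tweets for word in tweet.lower().split())

cloud = WordCloud(width=400, height=200, background_color="white")
cloud.generate_from_frequencies(freqs)
cloud.to_file("tweet_cloud.png")
```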

Another interesting insight we discovered is that, although visualizations inspired by ThemeRiver (Havre et al. 2000) appear as an intuitive means of representing the temporal variation of term/topic usage, they are rarely adopted (see the column “Themeriver-like” in Table 5). Even rarer are the possibilities to interact with them: Hassan et al. (2014) (see Fig. 6) and Wang et al. (2012) allow the adjustment of parameters to provide a different level of detail (Hassan et al. 2014) or a different time unit based on which the groups of tweets related to a given topic are shown (Wang et al. 2012); (Dou et al. 2012; Ribarsky et al. 2014) provide a mouse-over effect that highlights the corresponding topic in the topic cloud, and selecting an item shows more details about the related content.
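
A static, non-interactive approximation of such a ThemeRiver-like view can be obtained with a symmetric stacked stream of topic volumes, e.g., via matplotlib's stackplot; the topic series below are synthetic.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic hourly volumes of three topics over one day.
hours = np.arange(24)
topics = {
    "weather":  5 + 3 * np.sin(hours / 3.0),
    "football": np.where(hours >= 19, 12.0, 2.0),
    "traffic":  np.where((hours >= 7) & (hours < 10), 8.0, 1.0),
}

# ThemeRiver-like view: a symmetric, stacked stream of topic volumes over time.
plt.stackplot(hours, list(topics.values()), labels=list(topics.keys()), baseline="sym")
plt.legend(loc="upper left")
plt.xlabel("hour of day")
plt.savefig("themeriver_like.png")
```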

Fig. 6
figure 6

SoDa (Hassan et al. 2014) incorporates content, location and temporal aspects in a composite interface. In detail, the weighted tag network (A) shows the semantic relationships between related hashtags; the GeoTag Map (B) illustrates the geographic distribution of the tweets; the adjustable ThemeRiver (C) depicts the temporal evolution of topics; the right panel (D) offers plenty of interaction and manipulation options; the left sub-window (E) supplies general statistics; the tweet monitor (F) displays a list of currently incoming tweets; finally, the third sub-window (G) allows the search for keywords or the application of language filters

Simple charts providing quantitative information (e.g., showing the hourly volume of tweets) can frequently be found in the examined literature. In most of the cases, no interaction is allowed; however, a mouse-over is provided by (Morstatter et al. 2013) (to reveal date and number of tweets) and by (Chae et al. 2014) (which highlights the bars belonging to the same period). Moreover, (Mearns et al. 2014) allows moving a slider on the x-axis and shows the volume of terms, bursting terms, and anomalies.

Relationships (among tags, hashtags, and users) are illustrated by a network graph. Usually, such a visual method is also employed to show how information propagates. Apart from interactions such as zoom & pan or a single selection, which are quite common, (White and Roth 2010) allow multiple selection and focusing on a part of the graph without losing the rest (focus + context), (Morstatter et al. 2013) offer details on demand with mouse-over and by selecting a node (a panel with further info is then shown), and (Hassan et al. 2014) permit adjusting the parameters of the network such as node/edge threshold and link distance.
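
The construction of such a relationship graph, including a user-adjustable edge threshold, can be sketched with networkx; the hashtag lists below are hypothetical and stand in for the tags extracted from real tweets.

```python
import itertools
import networkx as nx

# Hypothetical hashtag sets, one list per tweet.
tweets = [
    ["#flood", "#weather", "#news"],
    ["#flood", "#weather"],
    ["#football", "#news"],
]

# Hashtag co-occurrence graph; edge weight = number of tweets sharing both tags.
G = nx.Graph()
for tags in tweets:
    for a, b in itertools.combinations(sorted(set(tags)), 2):
        weight = G.get_edge_data(a, b, default={"weight": 0})["weight"]
        G.add_edge(a, b, weight=weight + 1)

# A user-adjustable edge threshold, as offered by some of the surveyed tools.
threshold = 2
strong_edges = [(u, v) for u, v, d in G.edges(data=True) if d["weight"] >= threshold]
print(strong_edges)
```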

A common characteristic we noted is that the whole message is often shown in a table (named, e.g., message list (Aramaki et al. 2011), message view (Hassan et al. 2014), or tweet view (Marcus et al. 2011)) providing a limited set of interactions or no interaction at all. The ability to scroll up & down the table is a sort of basic feature for the tools providing this type of view. However, (Bosch et al. 2011, 2013; Chae et al. 2012; Andrienko et al. 2013) also allow selecting and possibly removing an unwanted message; (Kumar et al. 2011) let the user switch between tweets and entities to show their frequency in the tweets, and (MacEachren et al. 2011a, b) permit organizing the list of the 500 most related tweets based on relevance, time, and space.

Finally, the insertion of a set of keywords is generally the most adopted solution to filter relevant tweets. (Pritchard et al. 2012; Abel et al. 2012a, b) also allow selecting from a set of given items (e.g., POS elements (Pritchard et al. 2012)); special cases are represented by (Bosch et al. 2013; Rui et al. 2012) and (Singh et al. 2010), which permit a geographical search (Bosch et al. 2013; Rui et al. 2012) or adopt a well-defined algebra (Singh et al. 2010).

Static vs. Dynamic

As a final remark on our comparison, we note that the surveyed papers offer the analysis of pre-collected datasets (static), continuous streams of data (dynamic), or both. Specifically, based on our study, a good number of works support both static and dynamic analysis. All dynamic approaches exploit the APIs provided by Twitter to collect and query data: Streaming APIs (Twitter 2016), Search APIs (Twitter 2016), and REST APIs (Twitter 2016). In some cases, a combination of them is also adopted (e.g. (Yin et al. 2012; Robinson et al. 2013; Cao et al. 2012; Abel et al. 2012a, b; Rui et al. 2012)). A particular case is represented by (Sabty et al. 2013); the authors adopt a self-developed infrastructure called RADAR, which combines different crawlers (not only to collect Twitter data). Moreover, although the Search APIs allow up to 400 keywords, the strategies used are different: for instance, (Sugumaran and Voss 2012; Lee et al. 2013) use only 2 keywords/hashtags, whereas (Morstatter et al. 2013) adopt 34 keywords.
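
For illustration, a keyword-based collection via the Streaming API might be set up as in the following sketch, which assumes the tweepy 3.x-era Stream/StreamListener interface; the credentials and the two tracked terms are placeholders.

```python
import tweepy

# Placeholder credentials; a real application would use its own keys.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

class KeywordListener(tweepy.StreamListener):
    def on_status(self, status):
        # Every incoming tweet matching the tracked keywords arrives here.
        print(status.created_at, status.text)

stream = tweepy.Stream(auth=auth, listener=KeywordListener())
# The track parameter mirrors the keyword/hashtag strategies discussed above
# (here only two terms, as in the most restrictive of the surveyed cases).
stream.filter(track=["earthquake", "#earthquake"], languages=["en"])
```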

Concerning the representation of the dynamic aspects, to obtain a view of the changes in terms/topics over time, a local database (such as MongoDB (MongoDB 2009)) plus a query system represents a common solution (which can be found, e.g., in (Hassan et al. 2014; Kraft et al. 2013)). Thus, it is possible to navigate into the past using different interactions; a time slider is adopted, e.g., by (Bosch et al. 2013; Chae et al. 2012; Musleh 2014; Morstatter et al. 2013; Liu et al. 2011), whereas animation can be found in (Cao et al. 2012; Abel et al. 2012a, b; Musleh 2014). Finally, a geographical search (independent of the bounding box defined in the APIs) is occasionally possible; it can be pre-defined (e.g., Ireland (Pozdnoukhov and Kaiser 2011), US (Meyer et al. 2011)) or user defined (Bosch et al. 2013; Chae et al. 2012).
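
The local-database solution can be sketched with pymongo: the interval chosen on a time slider translates into a range query on the stored tweets. The collection name, field names and dates below are assumptions for illustration.

```python
from datetime import datetime
from pymongo import MongoClient

# Local tweet store (database and collection names are assumptions).
collection = MongoClient("localhost", 27017)["twitter"]["tweets"]

# The interval a time slider would select.
start = datetime(2013, 4, 15, 8, 0)
end = datetime(2013, 4, 15, 12, 0)

# Navigate into the past: fetch the tweets posted within the selected window.
cursor = collection.find(
    {"created_at": {"$gte": start, "$lt": end}},
    projection={"text": 1, "created_at": 1, "_id": 0},
).sort("created_at", 1)
for doc in cursor:
    print(doc["created_at"], doc["text"])
```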

An Interactive State of the Art

To make the content of this survey available not only in a static form (as a paper) but also in a more interactive fashion and for different purposes, we created a set of interactive charts. These charts represent the interactive, online counterpart of the comparisons presented in the previous sections and allow further exploration and comparison of the surveyed papers.

To this end, we adopted (Tableau Public 2014), free software that allows the creation of interactive data visualizations for the web. Figures 7 and 8 illustrate how we organized the content.

Fig. 7
figure 7

A screenshot from the online counterpart of the surveyed papers. It shows the identified research categories and the application fields. Interactive filters are also available

Fig. 8
figure 8

This screenshot shows analytical methods and user involvement (only the few exceptions are selected). Visual methods grouped based on spatio-temporal aspects and content analysis, respectively, can be found at the given links

The first chart (Fig. 7) represents the identified research categories and application fields, as described in “A General Comparison of Research on Microblogging Content” section. The papers are organized according to publication year and into the main research category to which they belong. Moreover, three different filters allow focussing on, e.g., a specific research category or one or more application fields.

The second chart illustrates the analytical methods and user involvement, as discussed in “Analytical Methods” section. In particular, Fig. 8 shows the few exceptions among the surveyed papers that involve the user in the analytical part of the process. Similar to the previous case, a researcher or a student can further explore the papers using the provided filters.

Finally, the last two charts encompass the visual methods that focus on spatial and temporal aspects and the visual methods that focus on the microblogging content analysis, respectively (Visual Methods section). As in the previous cases, filters related to visual methods and publication year support a more detailed exploration based on different user needs.

Challenges

In this section, we describe some major research directions related to the spatio-temporal exploration of microblogging content. To this end, we considered the outlook sections of the articles referenced in this paper, investigated previous review papers (such as (Goonetilleke et al. 2014; Wanner et al. 2014; Luo and MacEachren 2014)), and included our own conclusions about possible research challenges based on the analyses performed in the previous sections. At the end of this process, we identified the following major research directions (Fig. 9):

  • Analysis, visualisation and interaction methods

  • Data streams and real-time processing

  • Scalability of methods and combined use of other (heterogeneous) data sources

  • Social relationships, external context and information diffusion

  • Trustworthiness and privacy

Fig. 9
figure 9

Major research directions related to spatio-temporal exploration of microblogging content

The list is a subjective selection and is not claimed to be complete. In fact, there are further opportunities and more specific challenges (related, for example, to the analysis of trajectories and mobility characteristics or to utilisation for application or usability testing), which, however, remain beyond the scope of this paper.

Develop or Extend Analysis, Visualisation and Interaction Methods

Whereas most research papers indicate possibilities to improve their presented analysis and visualisation methods, only some of them mention the further development and refinement of user interfaces and interaction methods. Thus, this subsection lists some future research directions on analytical methods, (geo-) visualisation methods and interaction methods, respectively.

An example is the improvement and combination of filter approaches mentioned by (Bosch et al. 2013). Although filter methods based on classifiers, statistically derived keyword lists or spatio-temporal restrictions work well for specific information requirements when used on their own, the results of filter combinations are less predictable and not easily transferable to other situations.

Geoparsing is a research topic that is also quite relevant for the analysis of microblogging content, particularly given that only a minority of the data contains explicit geographic coordinates. Challenges include, first, toponym extraction with the ambiguities present in natural languages and, second, the differentiation between the location of activity (i.e., the place from which the tweet was sent) and the location of interest (i.e., the place of the described event), which do not necessarily correspond.
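
The ambiguity problem can be made tangible with a small geocoding sketch based on geopy's Nominatim wrapper: an ambiguous toponym returns several candidate locations, and choosing among them (e.g., using the tweet context) is precisely the open research question; the toponym and service usage below are illustrative only.

```python
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="microblog-geoparsing-example")

# A toponym extracted from a tweet; "Springfield" is deliberately ambiguous.
candidates = geolocator.geocode("Springfield", exactly_one=False, limit=5)
for place in candidates or []:
    # Each candidate is one possible resolution of the same place name.
    print(place.address, (place.latitude, place.longitude))
```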

(Geo-)visualisation challenges refer to occlusion issues in the spatio-temporal presentation of tweets and to the design of glyphs with varied sizes adapted to the zoom level. Additional visualisation metaphors are required that are able to incorporate visual encoding of external contexts such as information spreads and diffusion. The sunflower metaphor proposed by Whisper (Cao et al. 2012) provides an example of geoinformation diffusion on Twitter concerning an earthquake. Potentialities to analyse and present less-structured geodata are also shown by the georeferenced word clouds, a cartographic presentation method that was not used on traditional paper maps.

Finally, interaction methods are crucial to enable a dialogue between the analyst and the data. As noted in our paper, various types of interaction support the visualisation methods, but such support is rarely provided to assist the analytical methods (see Table 3). Research is required to find solutions that support the analysts in developing strategies to explore large amounts of data in a semi-automated fashion. In this respect, the development of methods combining animation with meaningful user interaction appears particularly challenging for dynamic presentation methods.

Data Streams and Real-Time Processing

A major issue concerns the analysis of a continuous stream of data in real time. Although web applications such as (Trendsmap 2009), (Onemilliontweetmap 2013), and (Tweereal 2013) demonstrate the potential either to store and query huge amounts of streaming data or to show them, we remain quite far from tangible real-time analysis. In fact, several approaches available in the literature employ static datasets; those analysing dynamic datasets, although using, e.g., a time slider or animation, provide a comparison of several static snapshots from the past rather than a continuous analysis showing how the tweet content evolves.

A successful example of a customized metaphor for the dynamic display of classic area charts and histograms is the concept of visual sedimentation, inspired by the process of physical sedimentation (Huron et al. 2013). New information is presented as objects falling on top of older ones and aggregated into strata over time. Recent information is presented more prominently, whereas older information receives less attention, but still enough to enable a quantitative comparison on a temporal basis. In this way, a focus-and-context technique is suitably applied to a dynamic visualisation framework.

Real-time detection and tracking of spatial events can also benefit from incremental approaches. A first interesting incremental stream-clustering algorithm is proposed by (Andrienko et al. 2015).

The potential for real-time trend detection over Twitter streams is highlighted by (Crooks et al. 2013). There is also a desire to do targeted marketing based on real-time streaming tweet analysis (Ribarsky et al. 2014) and to detect geo-temporal sentiment patterns, trends and influences in customer feedback streams for live alerts (Hao et al. 2013).

Scalability of Methods and Combined Use of Other (Heterogeneous) Data Sources

The scalability requirement of analysis and visualisation methods refers in the first instance to the processing of varying amounts of input data, but the requirement also has a spatial, temporal and thematic dimension. From a spatial point of view, an increased amount of data can result not only from covering larger spatial areas (even on a global scale) but also from more detailed information analysis and presentation (with a local focus). The resolution of the GPS positioning device could be considered a lower limit; nevertheless, how strongly the location of a tweet correlates with the described content is questionable (Hahmann et al. 2014). Although the description (tags) of a georeferenced picture (with the object of interest in visibility range) has a strong spatial correlation, the place in which a tweet was posted might be quite far away from the described event. Furthermore, scalability in a spatial sense imposes requirements on automated abstraction and generalisation methods. Thus, two slightly different issues must be solved: first, the selection of an appropriate level of detail, e.g., when presenting tweets within a map through individual points, individual markers, aggregated clusters, density surfaces, word clouds or phrases; second, legibility issues, which could be tackled through displacement, the use of transparency values, filtering or aggregation. As Yin et al. (2012) mention, more experiments on large-scale datasets must be conducted to evaluate the overall system performance.

Changing the spatial focus can have implications for the analysed language. Although some of the analysis methods might be easily transferable between languages, such a transfer becomes difficult as more grammar is considered, for example, with the identification of negation or amplifications (Hauthal and Burghardt 2014). Thus, there is much potential for the usage of in-depth language processing (Wanner et al. 2014), e.g., disambiguation, polarity extraction and named entity recognition. In fact, scalability related to the thematic dimension requires the selection of multiple groups of keywords, the processing of datasets with large numbers of entities, or a high degree of detail represented by attributes or relationships. The temporal dimension of scalability becomes important when datasets with very long histories or those with varying granularity from years to seconds are analysed. High update rates require methods of real-time processing, as mentioned in the section on data streams. Challenging research on temporal information also relates to the evolution of events, topics or topic categories, e.g., to discover that some topics show repetitive behaviour at certain hours or on certain days or recur during the weekend rather than on weekdays.

Some of the analysed papers mention the research challenge of a combined use of other (heterogeneous) data sources, e.g., improving the performance of burst detection and tweet classification by using additional, external resources to compensate for tweets’ terseness (Yin et al. 2012). More general data integration is needed to combine complementary datasets referring to the same spatial event or physical objects captured by different people or organisations, at different times, with different underlying data models and quality levels. Similarly, matching and integration of microblogging content and topographic data (which can be used for example as underlying reference data) pose significant research questions concerning topology and semantics (Sester et al. 2014).

Social Relationships, External Context and Information Diffusion

The analysis of microblogging content is biased by the characteristics of people participating in this type of information exchange and other social network activities. Thus, there is a need to know about the characteristics and demographics of the people generating messages. However, social science research focusses on the process of information spreading (e.g., in the form of retweets) and on determining who is influential in a social network. In particular, spatio-temporal research related to information diffusion is challenging when the analysis combines “when”, “where”, and “how” an idea is dispersed.

Visual encodings about the elements in the diffusion process might support the analyst in investigating the relationship between information consumption and a broader context. From a methodological viewpoint, the integration of spatial and social network analytics can provide further insights. (Luo and MacEachren 2014) argue that actors close to one another tend to have greater similarity than those far apart. Therefore, geographical and social network distance, relationships and interactions should be considered. Developing theory, methods, and tools to consider spatial and network factors simultaneously has the potential to achieve new insights about human processes. Similar research questions are mentioned by (Fujita 2013), who proposes a function for visualising the collected data by focussing on the relationships between locations and communication among users.

On a more general level, geosocial science will perform research to understand the effect of geo-social relationships on society, for example, the shift of traditional decision-making approaches with the emergence of social media (Luo and MacEachren 2014). An additional use of social media can be achieved by developing new geo-social visual analytic methods that incorporate data exploration, decision-making, and predictive analysis as a whole. A significant challenge for predictive analysis is that geo-social systems are highly sensitive to social adaptive behaviours in crises such as epidemics or natural disasters. Nevertheless, harvesting ambient geographic information from Twitter will enhance our situational awareness capabilities. Although past research and analysis of microblogging content and tweets has largely addressed event detection, future work will also investigate the evolution and change of events and consider the effects on people and their reactions.

Trustworthiness and Privacy

When addressing user-generated content, the “3 Vs” can be used to characterize the data: volume (the size of the data), velocity (the speed of data management in real time) and variety (the heterogeneity of sources) (Laney 2001). More recently, a fourth term, veracity, has been introduced into the discussion (IBM 2016): the quality, provenance and trustworthiness of the data. Related research directions comprise the development of new methods able to determine the influence of users, methods that are based not only on the number of followers or retweets but also on content (e.g., specific terms and trending topics) or on sentiment (e.g., positive and negative influence), or that are part of more complex information diffusion models. Similarly, new methods are envisaged that are able to evaluate the reputation and reliability of users to easily distinguish relevant events or information from noisy data. One can imagine methods that relate the trustworthiness and the geographical location of tweets (e.g., in a flood emergency, one could decide to assign a higher trust to the tweets coming from locations near the river). However, the location-trust correlation might be extended with more sophisticated measures to describe a user's degree of trust (which would then reflect on the information s/he is communicating).
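
As a purely hypothetical illustration of such a location-trust correlation, the following sketch weights a tweet by an exponentially decaying function of its distance from a feature of interest (e.g., a river gauge in a flood emergency); the decay scale and coordinates are arbitrary and would need to be grounded in a real trust model.

```python
from math import asin, cos, exp, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def location_trust(tweet_pos, feature_pos, scale_km=5.0):
    # Trust decays with distance from the feature of interest (e.g., a river).
    return exp(-haversine_km(*tweet_pos, *feature_pos) / scale_km)

print(location_trust((51.05, 13.74), (51.06, 13.73)))   # close to the river
print(location_trust((52.52, 13.40), (51.06, 13.73)))   # far away
```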

Finally, we see much promise in the visual representation of such a network of influences and trust. Although some recent efforts were a step in this direction (such as Whisper (Cao et al. 2012) and TweetXplorer (Morstatter et al. 2013)), we remain quite far from what could be called an interactive map of influences that could describe how a topic propagates based on its users, its reliability, the sentiment expressed, and its location and timing.

Although the analysis of human sensor data has many useful applications, as shown in Table 2, the collection and analysis of such data can be accompanied by privacy issues. In fact, as with any other technology related to geospatial location, the risk of generating privacy issues is very high when merging the information coming from a tweet with other publicly available information. Future research directions should therefore not only further investigate solutions such as (Andrienko and Andrienko 2012) (based on the generalization and abstraction of movement data) or (Wakamiya et al. 2011) (based on urban characterization of crowd behavioural patterns) and possibly extend them but also focus on content aspects from a user's point of view to understand users' conception of privacy and how it is applied when using microblogging technologies in general and Twitter specifically.

Conclusion

This survey focussed on visual analytics approaches developed for the spatio-temporal exploration of microblogging content. Based on different criteria, papers and tools from journals and conferences were collected and compared from quantitative and qualitative points of view. The comparison emphasized the common research issues that these works share, the weaknesses that should be overcome (such as the very limited user involvement in the analytical methods) and their advantages and disadvantages. These results have been organized in different tables, including an interactive online counterpart. Finally, future research directions and challenges were identified that we believe are worthy of further investigation.

Because the Twitter world is a very dynamic environment (as the evolution of the timeline (O'Brien 2014) and the introduction of polls (Shermann 2015) show), we are aware that this work represents only a first attempt to describe it. Therefore, our plans include work not only to update the study to reflect the current state of the Twitter world but also to further revise the study based on new discoveries. Moreover, we wish to explore the feasibility of a more interactive and visually intriguing state of the art to include the aspects investigated in this survey, representative figures (e.g. of geovisualization techniques) and possibly videos.