
1 Introduction

The analysis of dynamic network data has become an increasingly important research field with promising application areas in different real-world domains, including the analysis of organizational knowledge and collaboration networks [25]. As the temporal dimension adds a new level of complexity, the demands on computational methods, and the cognitive effort required of their users, are even higher than in static network analysis [7, 33].

While several visual and computational methods for examining temporally evolving networks have been proposed in recent years, their effectiveness and utility for end users need to be analyzed further. Considering the increasing complexity and novelty of these methods, adopting participatory design strategies can be beneficial. Such strategies can help to improve the methods, and particularly their application to real-world scenarios [4, 35], by bringing users’ needs and experiences into the development process. Moreover, by analyzing users’ preferences and performance when dealing with such methods in specific scenarios, it is possible to gain insights that might be applicable in a more general context.

Following this approach, we evaluated a visual analytics method for dynamic networks along its development process: first, we performed an intermediate evaluation by means of mock-up studies, and second, we conducted a qualitative evaluation of the final interactive prototype.

In the following sections, we discuss related work, summarize insights gained from a mock-up study, give an overview of the main results of the prototype evaluation, highlight examples of pathways of (participatory) development and design, and bundle the outlined issues into conclusions and future research questions.

1.1 Related Work

While several methods for the visualization of static networks have been proposed in the Graph Drawing [11], Information Visualization [20], and Data Mining [10] communities, the interactive visual analysis of networks evolving over time is an emerging research field. Besides the choice of a visual representation for the relational data (e.g. node-link diagrams or matrix-based representations), an important issue for time-varying networks is the appropriate visual encoding of the temporal dimension [2]. At least four different approaches exist: animation [16], superimposition [5], juxtaposition [3], and two-and-a-half-dimensional views [6, 12]. But finding an adequate visual encoding for the time dimension is not sufficient to solve the problem of visualizing dynamic networks. Another important aspect is to obtain a sequence of diagrams that facilitates the perception of changes by preserving the user’s mental map [13]: it must minimize unnecessary changes while emphasizing temporal trends or patterns. An early formulation of the problem is sketched in [33], while [7] discuss it systematically from a graph drawing perspective. Several computational methods, which descend from Social Network Analysis (SNA) [45], can be integrated into visualizations. A common approach is to compute static SNA metrics associated with nodes and edges and then encode them in a chosen visual variable or exploit them for dynamic filtering [34].

To test prototypes in the field of visual analytics, various methods for empirical user studies have been discussed in recent years. Especially in the visual analytics community, the use of highly standardized quantitative methods (see [4]) has been criticized for being too rigid and producing artificial results [14]; therefore, more qualitative approaches have been favored [22, 39, 40]. Methods that allow insights into which problems occur and why they occur [27] should also engage users to “search to learn” and show real behavior instead of relying on simple “lookup tasks” [28]. Another necessary step to avoid artificial results when covering users’ exploration process [41] is to use real-world data with context [47]; therefore, the selection of expert groups who have to deal with (often ill-defined) real data is favored by some authors [21, 23]. A rather novel trend in analyzing exploration behavior with visual methods is to focus on the multiple problem solving processes of the users [30].

1.2 Visual Analytics Methods

The prototype at hand is aimed at the examination of dynamic social networks and has been designed and implemented on the basis of a visual analytics approach [15]. It features the integration of visual, analytical and interactive techniques, guided by basic perceptual principles, and it is tailored to small longitudinal network datasets (up to 50 nodes), collected manually by means of questionnaires (discrete time domain). Even though it is far from being of general applicability and does not cover all recent developments in the field, its integration of several different techniques gives us the opportunity to observe how users exploit, alternate between, or combine them for visual network data exploration.

The visualization is based on node-link diagrams, and three ways of mapping the temporal dimension onto them lead to three different views: juxtaposition (JX), superimposition (SI) and a two-and-a-half-dimensional (2.5D) view.

The JX view (see Fig. 1) is obtained by mapping time to space (a horizontal temporal axis), i.e. by placing the diagrams of different time-slices side by side. It applies the principle of small multiples [43] and allows the reader to compare time-slices directly. Coordinated zooming, panning and highlighting further facilitate comparison.

Fig. 1 Juxtaposition view (JX)
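To illustrate the time-to-space mapping behind the JX view, the following minimal sketch (an illustration only, not the prototype’s implementation) draws one node-link diagram per time-slice side by side with networkx and matplotlib, reusing a common layout so that the panels remain comparable; the toy edge lists are hypothetical:

```python
# Minimal sketch of a juxtaposition (small multiples) view:
# one panel per time-slice, sharing a common node layout.
import networkx as nx
import matplotlib.pyplot as plt

# Hypothetical toy data: one edge list per time-slice.
slices = [
    [("Anna", "Ben"), ("Ben", "Carl")],
    [("Anna", "Ben"), ("Carl", "Dora"), ("Ben", "Dora")],
]

# Compute a single layout on the union graph so node positions
# stay stable across panels (a simple form of mental map preservation).
union = nx.Graph()
for edges in slices:
    union.add_edges_from(edges)
pos = nx.spring_layout(union, seed=42)

fig, axes = plt.subplots(1, len(slices), figsize=(4 * len(slices), 4))
for t, (ax, edges) in enumerate(zip(axes, slices), start=1):
    g = nx.Graph(edges)
    g.add_nodes_from(union.nodes)          # show all nodes in every panel
    nx.draw_networkx(g, pos=pos, ax=ax, node_color="lightsteelblue")
    ax.set_title(f"t{t}")
    ax.set_axis_off()
plt.show()
```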

The data shown by the dynamic network visualizations in this report, and in all empirical studies we refer to, is a real-world data set covering eight different relations of knowledge communication structures at a university department [48], with four time steps and 33–34 nodes per time point (38 in total). Relational questions included content-related and technical advice, intensive collaboration, awareness of individual knowledge, knowledge substitution, discussion of new ideas, and suggested communication that should be intensified.

The SI view is obtained by superimposing the node-link diagrams (see Fig. 2). It can be described as mapping time to a visual variable, namely transparency, which is used to differentiate between time-slices so that more recent elements appear more opaque. It requires less screen space than the previous view, but suffers more from visual clutter and occlusion. To reduce these problems, only the nodes are shown at first, while edges can be displayed on demand.

Fig. 2 Superimposition view (SI)
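The time-to-transparency mapping can be sketched as follows, again as an illustrative approximation rather than the prototype’s code, reusing the toy data and layout from the previous sketch: older time-slices are drawn with lower alpha on a single set of axes.

```python
# Minimal sketch of a superimposition view: all time-slices share
# one set of axes, and the slice index is mapped to opacity.
import networkx as nx
import matplotlib.pyplot as plt

slices = [
    [("Anna", "Ben"), ("Ben", "Carl")],
    [("Anna", "Ben"), ("Carl", "Dora"), ("Ben", "Dora")],
]
union = nx.Graph()
for edges in slices:
    union.add_edges_from(edges)
pos = nx.spring_layout(union, seed=42)

fig, ax = plt.subplots(figsize=(4, 4))
for t, edges in enumerate(slices, start=1):
    alpha = t / len(slices)                 # more recent slices are more opaque
    g = nx.Graph(edges)
    nx.draw_networkx_nodes(g, pos, ax=ax, alpha=alpha, node_color="steelblue")
    nx.draw_networkx_edges(g, pos, ax=ax, alpha=alpha)  # edges could be shown on demand only
ax.set_axis_off()
plt.show()
```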

In the 2.5D view (see Fig. 3), the diagrams of the individual time-slices are drawn on separate transparent planes, which are stacked orthogonally along a horizontal time axis. It can be seen as mapping time to an additional spatial dimension, along which further information can be displayed, as described below. 3D zooming, rotating and panning controls allow the user to choose a suitable viewpoint.

Fig. 3 2.5D view

In order to preserve the user’s mental map and provide a common context for the interactive exploration of the three views, they are built upon a consistent spatial metaphor, which also drives smooth animated transitions between them (see Fig. 3): the sheets on which the diagrams are drawn are stacked upon each other in the SI view, translated along the time axis in the JX view, and finally rotated by 90° around their vertical axes in the 2.5D view.

As for the layout of the node-link diagrams (i.e. the way nodes are arranged), the prototype adopts a continuously running force-directed layout that also ensures the preservation of the mental map across time slices. The user can interactively control the amount of preservation: a simple slider in the Graphical User Interface (GUI) allows users to select stability (maximum mental map preservation) or consistency (independent layouts) and to move from one to the other through stepwise transitions (see Fig. 1 at the bottom).
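One common way to realize such a stability-consistency trade-off, shown here only as an illustration (the prototype’s actual algorithm is described in [15]), is to blend a per-slice force-directed layout with the node positions of the previous slice, weighted by the slider value:

```python
# Illustrative sketch of a stability-consistency trade-off:
# blend an independent per-slice layout with the previous slice's
# positions, controlled by a slider value in [0, 1].
import networkx as nx

def dynamic_layout(graphs, stability=0.5, seed=42):
    """graphs: list of nx.Graph, one per time-slice.
    stability=1.0 -> keep previous positions (maximum mental map preservation),
    stability=0.0 -> fully independent layouts (maximum per-slice consistency)."""
    layouts = []
    prev = None
    for g in graphs:
        # Seed the solver with the previous positions where available.
        init = {n: prev[n] for n in g if prev and n in prev} or None
        free = nx.spring_layout(g, pos=init, seed=seed)
        if prev is None:
            cur = free
        else:
            cur = {}
            for n, (x, y) in free.items():
                if n in prev:
                    px, py = prev[n]
                    cur[n] = (stability * px + (1 - stability) * x,
                              stability * py + (1 - stability) * y)
                else:
                    cur[n] = (x, y)
        layouts.append(cur)
        prev = cur
    return layouts
```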

An integrated SNA computational component provides the calculation of SNA metrics on demand (e.g. different types of centrality). The user can interactively select a metric to be computed for a relation type of interest; the entire temporal multi-relational network is then partitioned into as many static single-relational networks as there are time-slices, and the requested metric is computed for each of them. The resulting values are encoded as visual variables within the visualization (color and size of nodes) for each time-slice, or shown in numeric form in a tooltip when the user hovers over a node.
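A rough sketch of this per-slice computation and its visual encoding, assuming networkx’s degree_centrality purely as an example metric (the prototype’s own SNA component and metric set are not reproduced here), could look as follows:

```python
# Illustrative sketch: compute a node-level SNA metric per time-slice
# and map it to node size and a colormap value.
import networkx as nx

def metric_per_slice(graphs, metric=nx.degree_centrality):
    """Return a list of {node: value} dicts, one per time-slice."""
    return [metric(g) for g in graphs]

def visual_encoding(values, min_size=100, max_size=600):
    """Map metric values of one slice to node sizes and normalized color values."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0
    sizes = {n: min_size + (v - lo) / span * (max_size - min_size)
             for n, v in values.items()}
    colors = {n: (v - lo) / span for n, v in values.items()}  # feed into a colormap
    return sizes, colors
```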

Besides the dynamic layout with its user-controlled stability, the prototype features further interaction techniques to facilitate the exploration of dynamic networks: a dedicated technique to highlight a given node and its connections, and the on-demand visualization of node trajectories, by which users can focus on specific nodes and track their evolution. In the 2.5D view, for example, trajectories run along the spatial dimension dedicated to time. Shading different colors along the trajectory of a given node shows how its values for a certain metric vary over time. In this way, the results of analytical methods are integrated directly into the main visualization of the network, enabling the user to examine its relational and temporal aspects simultaneously without any additional diagram.

1.3 Overview

To match the described features with the needs of the intended user group, the development process of the prototype passed through six main stages, three of which brought user expectations, evaluations and participatory elements into play (marked with a star below):

  1. Assessment of the State of the Art,

  2. User and Task Analysis*,

  3. Design,

  4. Mock-up Study*,

  5. Implementation, and

  6. Prototype Evaluation*.

While the initially conducted user and task analysis served to match state-of-the-art options to the targeted group of users and to align possible features with their real-world tasks and needs (see [15, 48]), the next section focuses on the participatory part of a mock-up study on dynamic network layouts, and the subsequent sections turn to the empirical results of the evaluation of the implemented features.

2 Mock-Up Study

The aim of the mock-up study was to test three early sketches of non-interactive dynamic network visualizations (one JX view and two versions of SI views) with respect to their comprehensibility, visual design and utility.

2.1 Study Design

To this end, we conducted an experiment with 38 participants, including 10 experts (with at least two years of SNA experience) and 28 non-experts in the field of social network analysis. Each participant was tested individually for about an hour and had to solve four open tasks as well as four pre-defined tasks. These tasks were similar to tasks 2–7 of the evaluation study (see Table 1), and the participants’ thinking aloud and viewing behavior were observed, recorded and analyzed.

Table 1 Mock-up study plan

Our real-world data on two time points of knowledge communication at a university department was first visualized in a JX view. The network structure of the two layers differed, since the layout was computed for each network with a medium stability-consistency balance. This also applied to the first variant of an SI view, in which the two layers were displayed as a stacked overlay and the nodes were additionally connected by trajectories over time (see Fig. 4); we will refer to this view as the comet plot. In the second SI view (referred to as the SPOCC plot, for Stable POsitions, Color Coded), nodes kept a fixed position over time, but relations and nodes were color-coded according to their temporal attribute: red for relations or nodes existing only at time point 1 (t1), green for relations or nodes appearing at time point 2 (t2), and blue for relations or nodes present at both t1 and t2 (see Fig. 4, right-hand side).

Fig. 4 Closeups of two variants of a superimposition (SI) view (see also http://www.smuc.at/cometandspocc/): comet plot (left-hand side) and SPOCC plot (right-hand side). The comet plot shows relations of time point t1 in orange and of t2 in blue; it allows a certain shifting of the nodes due to the consistency of the temporally changing network structure and its force-directed visualization by the chosen spring embedder layout. In contrast, the SPOCC plot keeps all node positions stable, but codes temporal changes with the colors green (nodes or ties which emerged at t2), red (nodes or ties which vanished after t1) and blue (nodes or ties present at both time points)
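The SPOCC color coding can be sketched as a simple classification of ties by their temporal presence; the following is a minimal illustration of that rule, not the code used to produce the mock-up:

```python
# Illustrative sketch of the SPOCC color coding: classify the ties of two
# time points by their temporal attribute (colors as in Fig. 4).
def spocc_colors(edges_t1, edges_t2):
    """edges_t1, edges_t2: sets of (source, target) tuples.
    Returns {edge: color} with red = vanished after t1,
    green = emerged at t2, blue = present at both time points."""
    e1, e2 = set(edges_t1), set(edges_t2)
    colors = {}
    for e in e1 | e2:
        if e in e1 and e in e2:
            colors[e] = "blue"
        elif e in e1:
            colors[e] = "red"
        else:
            colors[e] = "green"
    return colors

# Example:
# spocc_colors({("A", "B"), ("B", "C")}, {("A", "B"), ("C", "D")})
# -> {("A", "B"): "blue", ("B", "C"): "red", ("C", "D"): "green"}
```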

The analysis of visual information processing is necessary to examine how users gain insights from network visualizations. In this study, we used eye tracking technology to analyze which parts users focused on to understand the network. Eye tracking provides a means to observe a viewer’s point of gaze (e.g., [36]). In the past, eye tracking focused mainly on scene perception and reading under laboratory conditions [18, 36]; only in recent years have applications in more everyday settings [30] become possible with the emergence of more usable technology.

Central eye-movement measures are fixations and saccades. Saccades are shifts from one point of gaze to another; fixations indicate visual attention to the information at that point [36]. In scene perception, top-down and bottom-up influences control where one looks [18, 46]. Bottom-up influences are stimulus-driven and mainly based on the visual salience of the stimulus, i.e. features such as color and saturation [28]. Top-down influences, on the other hand, comprise a viewer’s knowledge about the stimulus, his or her domain knowledge, and his or her goals [18]. Regarding domain knowledge, [9] showed that, due to their greater knowledge of possible configurations, chess experts can more easily create chunks of information.

Eye movements were recorded using an SMI iView X™ RED eye tracker at a temporal resolution of 60 Hz. It tracks the corneal reflection of the pupils and allows relatively free movement of the head when seated approximately 60 cm from the tracking device. As it allows eye tracking with glasses and contact lenses, a wide range of participants could be included. Each participant was tested individually. After an explanation of the purpose of the study, the functionality of the eye tracking device was explained to the participants, and the device was calibrated using a nine-point calibration. Participants viewed the scenes on the 17-inch computer screen integrated in the eye tracking device. The experimenter was seated next to the participant with a control screen showing the participant’s gaze, in order to intervene if the gaze was lost by the eye tracking system.

Think-aloud notes were used to study the participants’ problem solving strategies and to gain deeper insights into their exploration behavior. Using this method, we logged participants’ interactions, tracked their eye movements, observed their behavior, and asked them to think aloud during the experiment. We integrated these data sources, segmented them according to the tasks, and documented the users’ success levels.

Eye tracking data were analyzed with the BeGaze™ analysis software from SMI. We segmented the recordings by task and extracted fixations (number and duration) and saccades (number and amplitude). To analyze the visual attention paid to highly informative regions, the scenes were coded according to predefined Areas of Interest (AOIs), similar to [19], depending on the tasks [24].
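Conceptually, an AOI-based analysis aggregates fixations that fall into a predefined region. The following sketch illustrates this idea on a generic fixation table; the column names and AOI coordinates are hypothetical and do not reflect the actual BeGaze export format:

```python
# Illustrative sketch of an AOI-based analysis: count fixations and sum
# glance durations inside a rectangular Area of Interest.
import pandas as pd

def aoi_stats(fixations: pd.DataFrame, aoi):
    """fixations: columns 'x', 'y', 'duration_ms' (one row per fixation).
    aoi: (x_min, y_min, x_max, y_max) in screen coordinates."""
    x0, y0, x1, y1 = aoi
    inside = fixations[(fixations.x.between(x0, x1)) &
                       (fixations.y.between(y0, y1))]
    return {
        "fixation_count": len(inside),
        "glance_duration_ms": inside.duration_ms.sum(),
        "median_fixation_ms": inside.duration_ms.median(),
    }

# Example: statistics for a hypothetical AOI around the node "Leonard"
# aoi_stats(fixations_task6, aoi=(420, 310, 520, 390))
```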

The mock-up study plan (see Table 1) consisted of four larger parts. It started with an introductory/calibration phase, followed by an open-task session with a static network and the three different mock-ups, in which participants were allowed to explore the networks freely while thinking aloud and to get familiar with the visualizations and the GUI (tasks 1–4). Then the users had to solve and interpret a pre-defined set of tasks, ranging from rather basic tasks up to more complex structural analyses (tasks 5–12). Finally, in a post-questionnaire, users could provide additional feedback, discuss problems and suggest improvements (if they had not done so before). The test plan used two slightly varying variants of the network data underlying the mock-ups in order to vary the stimuli; the results for the two variants were merged for the following analysis and for the eye tracking results.

2.2 Study Results

Overall, the feedback from the participants was promising: dynamic network visualizations can be made comprehensible with such graphs, and at least two of the three variants allowed for further fine tuning and interactive enrichment. All participants, even the non-experts, were able to comprehend the JX view quickly and easily. The comet plot was the most difficult to comprehend; only some experts grasped the concept behind this visualization at first glance, without an explanation of how to read the graphs. Many of the non-experts asked for an explanation, and some of them could not utilize the structural visualization in the intended way. The SPOCC plot was easier to understand, but suffered from visual clutter to a high extent.

We will come back to these user problems, eye tracking results and the resulting design decisions in more detail further down (see the section on visual clutter).

3 Prototype Evaluation

The qualitative prototype evaluation was conducted to assess the prototype’s usability and the comprehensibility of the different views and interaction techniques, and to cover users’ exploration process [41].

In contrast to the mock-up study described above, the sample consisted of nine experts who work in the field of social network research as pre- and postdocs, mainly computer scientists or graph theorists, plus another computer scientist from the visual analytics field. None of the participants had prior knowledge of the prototype or had been tested in the mock-up study.

In the first phase of this study, the prototype was presented to the participants in an interactive session together with an instructor. Participants were encouraged to explore the functions of the prototype, ask questions, give feedback about the prototype’s usability and express their ideas and suggestions for improvement. In the second phase, they had to solve seven tasks, which were derived from [1] and were selected on the basis of our experiences with the mock-up study (see Table 1).

These tasks (see Table 2) included lower-level activities such as the identification and comparison of the relations of a single node at two time points, as well as higher-level activities [29] such as the description of structural group changes over time. In the first task, participants were allowed to use all prototype functions and views freely; in all other tasks they were required to work with a preselected initial view. In the third and last phase, participants were asked to summarize their impressions and give additional feedback.

Table 2 List of tasks for the prototype evaluation. The tasks were named according to the scheme proposed by Ahn et al. [1]

The material consisted of the same real-world data set used in the mock-up study (as proposed by [21, 23]), except that we used four instead of two time steps. Verbal comments and a screen cast were recorded during all phases, which lasted about 1.5 to 2 hours in total, and notes were taken by an observer during these sessions. The notes were jointly analyzed by a team of three usability experts who were also part of the testing team. They were segmented into single observations, which were categorized and counted as presented in the following section. We first give an overview of users’ feedback and the problems observed during the introduction phase, and then describe the main insights derived from the task analysis.

3.1 Evaluation Results

The evaluation results are structured as a matrix spanning the main visual, computational and interactive features of the prototype (see Fig. 5).

Fig. 5 Frequency of observations which feature problems (red, left-hand side of each column), positive feedback (green, center of each column) and ideas for improvement (blue, right-hand side of each column), identified for all views, for single views (i.e. the SI, JX, and 2.5D view) and for the GUI itself

The feedback was segmented into 255 distinct observations, which were categorized as problems (118), positive feedback (45) and ideas for improvement (109). It has to be noted that similar observations were counted multiple times, so that we could identify 155 unique observations in total.

To give an overview, we focus mainly on areas in which many observations were made, leaving bugs and overly specific implementation issues aside. In all views, many participants stated that the transitions were too slow, although the idea of maintaining the mental map through transitions consistently received positive feedback. Regarding the highlighting feature, participants recommended additional interactions to make comparisons easier by highlighting more than one node at a time.

SNA measures of nodes, such as centralities, were always double-coded by size and by color in the prototype. Many users expressed the wish to have more freedom in selecting how these measures are displayed, preferring to use their favorite color palette. This applies to the relations too: different types of relations should be visualized by different visual features such as color or line style.

Concerning the main views, the juxtaposition view (JX) was rated as the most comprehensible according to users’ comments, and we detected the fewest problems in this view.

In the superimposition view (SI), participants mainly struggled with following transitions and with visual information overload. We describe these problems in detail in a later section.

For the 2.5D view, users reported navigation as being too slow and not responsive enough, and they missed immediate feedback from the prototype when zooming, panning or rotating. Most users suffered from perspective distortion when comparing node sizes, and they mentioned legibility issues, since node labels and tooltips were heavily distorted in the two middle layers. Many users also mentioned visual information overload as soon as many of the (too boldly styled) trajectories were displayed in the 2.5D view.

Regarding the GUI, most users reported serious problems in understanding some of the labels, especially those of the dynamic views. Seven of eleven users (none of them native English speakers) reported comprehension problems with the naming of the views (mainly “Superimposition” and “Juxtaposition”), and two proposed to use icons instead of names. Only one person made sense of all the chosen view names. Taking user feedback seriously, this can also be seen as a hint that the untested transfer of technical terms (here: from the InfoVis community) via a prototype to an audience without that specific domain knowledge can have a negative impact on usability.

Aside from this summary of problem-oriented feedback, which mostly concerned implementation issues, nearly all participants expressed a remarkably positive assessment of the prototype in their overall summary.

3.2 Task Completion Analysis

We used two indicators to assess the effectiveness of the prototype’s features in supporting users to solve the assigned tasks: correctness and confidence. Correctness is defined as the conformity of the user’s answer with the answer we obtained by numerical methods and, for certain tasks, also from our previous knowledge of the real-world network at hand. Confidence differentiates between affirmative, certain answers and uncertain answers expressed in vague forms (e.g. “I would say”, “I guess”, “I am not sure”). We disregarded task completion time, because we were more interested in the reasoning process and asked users to think aloud and explain how they arrived at their answer rather than to give the fastest answer.

The overall correctness of the answers was 89% (see Table 3). Half of the incorrect answers were given to task 1, but these might be ascribed to the openness of the task (without any default settings of the view and other parameters) and to the fact that it was intrinsically hard to solve, demanding the detection of a very slight variation in network density. As for confidence, 82% of the correct answers overall were also certain answers, with the highest values for tasks 3 and 5, and the lowest, again, for task 1.

Table 3 List of tasks for the prototype evaluation. The tasks were named according to the scheme proposed by Ahn et al. [1]

As a general conclusion, we observed that most users were able to provide correct, complete and confident answers for tasks 2 to 7 (see Table 3), mostly by using the combination of visual, analytical and interactive options we had set, with noticeable exceptions and unexpected behavior that we discuss below (see the section on multiple problem solving strategies). For some users, their performance on given tasks also affected their initial preference for a given view: for example, some users were initially skeptical about the 2.5D and SI views, but changed their mind after realizing they had been able to solve tasks 6 and 7 by using them.

4 Pathways of Development

To illustrate how the process of prototype development was related to participatory aspects and the results of the final evaluation outlined above, we use this section to trace some pathways of development in detail. The first will be referred to as maintenance of the mental map, the second as avoidance of visual clutter, and the third follows the implications of multiple problem solving strategies and unexpected user behaviors.

4.1 Maintenance of the Mental Map

Within the context of dynamic network visualization, the general visualization principle of “preserving the mental map” [32] predominantly refers to the challenge that the layout randomness introduced by the random steps of spring embedder algorithms has to be brought under control. Starting with the network data of a given time point, force-directed layout algorithms usually generate node-link arrangements that are driven by the overall aim of stress minimization (or majorization). This procedure reliably reproduces global patterns like clusters and local configurations like node neighbourhoods, but these can still be realized by infinitely many specific arrangements, all solving the overall stress minimization problem. This means that even two instances of a barely evolving network tend to look quite different, if no further layout preservation methods ensure visual comparability.

To still allow for the visual analysis of network dynamics, the spring embedder layout of a second instance has to be coordinated with the first layout solution, so that the mental map which a user builds when viewing the first instance can be preserved and leveraged to analyze stabilities or changes in the second or third instance as well. Hence, the sequence of layouts of the different instances of an evolving network has to provide a minimum amount of graph stability, whereas structural changes and the shifting of single nodes (due to a consistent layout solution at a certain time point) should not be overly suppressed. This means that an appropriate trade-off between inter-time stability and layout consistency has to be found [38]. The solution implemented in the prototype at hand allows the user to control this balance herself, depending on the data and tasks at hand [15].
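One way to state this trade-off formally, given here only as an illustrative sketch and not as the prototype’s actual objective function, is to minimize a weighted sum of the per-slice stress and an inter-time displacement penalty:

```latex
\[
E(X_t) \;=\; (1-\alpha)\,\underbrace{\sum_{i<j} w_{ij}\bigl(\lVert x_i^{t}-x_j^{t}\rVert - d_{ij}^{t}\bigr)^2}_{\text{layout consistency (stress at time } t\text{)}}
\;+\; \alpha\,\underbrace{\sum_{i}\lVert x_i^{t}-x_i^{t-1}\rVert^2}_{\text{inter-time stability}} ,
\]
```

where \(x_i^{t}\) is the position of node \(i\) at time \(t\), \(d_{ij}^{t}\) the desired graph-theoretic distance, \(w_{ij}\) a weighting factor, and \(\alpha\in[0,1]\) the user-controlled stability slider.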

Beyond that solution, the basic requirement of maintaining mental network maps was generalized and pursued as an overall aim for all cases of interactions, which re-arrange the structure of a node-link diagram. This led to the implementation of three kinds of methods which maintain the mental map within the linked view architecture of the (superimposed, juxtaposed or 2.5 dimensionally stacked) time panels of each dynamic view:

  • (A) Maintaining the mental map over time: aside from the dynamic layout control mentioned above, a continuously running real-time layout provides smooth structural transformations after all kinds of user-triggered structural changes.

  • (B) Maintaining the mental map throughout user interactions: implemented methods include the coordinated highlighting of single nodes or neighbouring nodes after hovering over a node on any panel (i.e. the visual linking of the same nodes at different time points), as well as coordinated positional shifts after dragging and dropping nodes on a single panel.

  • (C) Maintaining the mental map amongst the three different views: a smooth transition feature was implemented, which allows for animated transitions from one view to the other.

The basic idea of this general line of development was evaluated considerably positively. Suggested improvements mainly addressed detail or implementation issues such as the speed of the continuous layout, the duration of its re-stabilization, or the duration of the transitions between views. Still, the continuously provided visual integration and visual feedback arising from the combination of (A) and (B) was consistently rated positively. The highlighting function was appreciated for connecting different instances of evolving nodes or patterns across the time layers, and hence for strongly reducing visual work: while the mock-up study had shown that finding the same node on other layers was quite time consuming (even in spite of the given layout stability), the prototype feature of linked highlighting (red for the focal node, hovered on any layer, yellow for all neighbouring nodes) solved this problem entirely.
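As a minimal illustration of such linked highlighting across time panels (a sketch only, not the prototype’s implementation), the following function recolors a hovered node and its neighbours consistently in every time-slice:

```python
# Illustrative sketch of coordinated highlighting across time panels:
# the hovered (focal) node is colored red on every panel, its neighbours
# in the respective time-slice yellow, all other nodes keep a base color.
import networkx as nx

def linked_highlight(graphs, focal, base="lightgray"):
    """graphs: list of nx.Graph, one per time-slice.
    Returns a list of {node: color} dicts, one per panel."""
    colorings = []
    for g in graphs:
        colors = {n: base for n in g.nodes}
        if focal in g:
            colors[focal] = "red"
            for neighbour in g.neighbors(focal):
                colors[neighbour] = "yellow"
        colorings.append(colors)
    return colorings

# Example: feed the per-panel color dicts into nx.draw_networkx(...,
# node_color=[colors[n] for n in g.nodes]) for each panel.
```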

Similarly, the method of smooth transitions between different views was consistently rated as supportive for understanding the operational principles of the different dynamic views. On the one hand, the way the display architecture of a view works can be inferred just from observing the smooth transitions and how the layers are visibly re-arranged; as one participant of the prototype evaluation put it, when being new to the tool, the transitions could save hours that would otherwise be spent reading a manual. On the other hand, several subjects pointed out that this feature should be made optional for daily use, where the mental maps of all views would already have been successfully acquired. Aside from these functional evaluations, the transitions of the tested prototype version were rated as too slow for efficient use.

By analysing the strategies adopted by users to solve the assigned tasks, we conclude that the techniques implemented to maintain the mental map were generally working effectively. For example, considering the mental map preservation amongst views (C), we looked at tasks 3, 4 and 5. These three tasks have a sub-task in common, namely finding a certain node by visually inspecting the network. Predefined settings provided the JX view for tasks 3 and 4, and the 2.5D view for task 5. Even if we did not explicitly measure task completion time, the ‘finding’ sub-task turned out to be much harder in 2.5D because of the perspective distortion of the node-link diagrams, according to users’ oral feedback. We observed that some users remembered the position of the node to be found in task 5 from previous explorations in a different view, which supports the idea that not only the mental map, but also the learning curve is somehow preserved amongst views. Moreover, one user switched to the JX view to find the requested node, and then back to the 2.5D view to track its temporal evolution; this observation suggests that the mental map is also preserved when switching views for solving sub-tasks of a more complex task.

4.2 Avoidance of Visual Clutter

For static node-link diagrams, there is no single a priori criterion determining which topological or geometric properties make a layout good, but several “good” layout approaches have been proposed based on both computational and comprehension aspects [36]. Much research is based on optimizing the graph layout to enhance perception and comprehension [17], for example by minimizing edge crossings, preserving symmetry, minimizing edge bends, or minimizing edge length. There is also a research trend focusing on optimizing the readability of huge networks [44].

Techniques to avoid clutter, a pressing issue already for static graphs, become even more important for small dynamic networks with 30–50 nodes, since the dynamics can multiply the information to be displayed by the number of time steps and relations. In the following section we describe our efforts and insights at some decisive points during the development process.

With the help of the mock-up study, we wanted to gain first insights into how the perception of visual clutter (for a definition see [37]) can be influenced by different layouts, and where comprehension or interpretation problems arise if information is hidden or compressed to reduce clutter. As described earlier, we used only two time steps and about 35 nodes for the construction of the mock-ups, but even in this case participants frequently reported clutter problems (“There are so many lines, I can’t see anything”).

Our approach consisted of two analysis stages. At stage one, we collected basic behavioral indicators related to visual clutter, answering questions such as how easily a single node can be found, how easily its number of relations can be compared, and how cognitively demanding this comparison process is. These behavioral data were obtained from the users’ viewing behaviour recorded with eye tracking methods. At stage two, we looked at more subjective measures and analysed users’ feedback and their reported problems. In contrast to the first stage, more complex tasks like structural analysis could also be taken into account here, and presumably more top-down processes in the users’ cognition were involved.

Based on selected results of the eye tracking data, we first provide an overview of the viewing behaviour of our participants with an emphasis on visual clutter. We selected a task in which the relational dynamics of the actor Leonard (Le) had to be analyzed (see Table 1, tasks 6–8) for the three mock-ups with three different network questions. We chose one of the simplest, least demanding tasks in order to analyse the effects of clutter at a near-perceptual level and to be able to include non-experts, who had no problems solving these basic tasks. Another advantage of selecting simple tasks for this analysis was that all participants used the same strategy to solve them (see next section).

The first sub-task was to find the actor Leonard within the node-link diagram (see Table 4). For this and the following analyses, we used a subset of the test sample consisting of experts and non-experts (n = 18), leaving out the group of involved participants in order to prevent biases caused by the use of previous knowledge.

Table 4 Amount of time to find Leonard (n = 18); medians for experts and non-experts

Interestingly, the median durations to find Leonard in the JX and SPOCC plots were similar and higher than in the comet plot. A possible explanation could be that the JX plot is smaller in size, since two separate networks require more space than two merged networks. The SPOCC plot has fewer lines, but since there is evidence from other graph-based eye tracking studies that lines are mainly ignored during the search process anyway [26], the coloring might be responsible for some distraction. The comet plot might be the one in which the actors are most salient, since the comet tails emphasize the nodes visually. A problem with this analysis could have been learning effects, but it has to be noted that the participants had had the chance to become acquainted with the layouts and had seen and analyzed similar structures in previous tasks.

After the users had found Leonard, the next sub-task was to compare Leonard’s connections at two time points. To gain insights into the effort needed for this sub-task, we summed all fixation durations on the node Leonard or Leonard’s direct neighbourhood.

The SPOCC and the comet plot showed only small differences in the descriptive statistics compared to the JX plot. In the JX plot, mean durations at a single AOI were clearly shorter, which is a positive result for the readability of the network and is also in line with users’ feedback. For a fair comparison, however, we have to sum the left and the right AOI in the JX plot to compare the sub-task with the other plots (see Table 5, bottom row). In summary, the amount of time needed to solve the sub-task is clearly higher for the JX plot, possibly due to the additional effort of closing the lateral gap (see also Fig. 6, upper plot).

Table 5 Glance durations on the Area of Interest (AOI) around Leonard in milliseconds (sum of fixation durations on the node Leonard and Leonard’s direct neighbourhood), n = 18

Fig. 6 A comparison of the JX plot (upper figures), SPOCC plot (middle figures) and comet plot (lower figures). On the left-hand side the original plots are displayed; on the right-hand side scan paths are shown as overlays. Red lines denote scan paths, blobs the fixations, and blob size the fixation duration. The paths of the first half of the group of participants are shown (n = 9), since the plots for the other half differed slightly to vary the experimental condition

Figure 6 shows a comparison of the scan paths for each plot: on the left the original plots, on the right the same plots with the paths overlaid. The thick red lines denote the saccades, or jumps of the eye; red blobs denote fixations, and the size of the blobs their duration. In the upper plot (JX plot), there are three main visual attractors: the actor Leonard in the left and in the right network, and the legend in the lower right corner. Although legends are frequently used, which is a typical viewing behaviour when reading graphs [42], interestingly only the legend on the right-hand side was used. Most of the smaller red blobs are short fixations during the initial scanning for the label “Leonard”. During this scanning process, both networks were scanned to find Leonard, but all of the 18 participants first fixated the AOI of Leonard in the left network.

As noted earlier, the size of the red blobs indicates the fixation duration, which is also a marker of cognitive effort [24]. We also analyzed the fixation durations statistically, since “a longer fixation duration indicates difficulty in extracting information, or it means that the object is more engaging in some way” [36]. We found significant differences between the three mock-ups, for experts as well as non-experts: the shortest durations occurred for the comet plot, with a clear difference to the two other plots (experts: χ²(2, N = 720) = 12.54, p = .002; non-experts: χ²(2, N = 530) = 9.19, p = .01), and longer durations for both the SPOCC and the JX plot (no significant difference for experts, χ²(1, N = 488) = 0.23, p = .63, but significantly shorter durations when non-experts used the SPOCC plot: χ²(1, N = 388) = 4.12, p = .042). Table 6 presents the median fixation durations, indicating that the SPOCC and JX plots demanded more cognitive effort.

Table 6 Median fixation duration in milliseconds
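The reported χ² statistics are consistent with a rank-based comparison of fixation durations such as a Kruskal-Wallis test, whose H statistic follows a χ² distribution; the following sketch shows such an analysis only as an assumption about the kind of test used, not as a reproduction of the original analysis:

```python
# Illustrative sketch: nonparametric comparison of fixation durations
# between the three mock-ups (Kruskal-Wallis H test, reported via a
# chi-square distributed statistic). Assumed analysis, hypothetical data.
from scipy import stats

def compare_mockups(durations_jx, durations_spocc, durations_comet):
    """Each argument: list of fixation durations (ms) for one mock-up."""
    h, p = stats.kruskal(durations_jx, durations_spocc, durations_comet)
    n = len(durations_jx) + len(durations_spocc) + len(durations_comet)
    return {"chi2": h, "df": 2, "N": n, "p": p}

# Example (hypothetical data):
# compare_mockups([210, 250, 190], [260, 300, 280], [180, 190, 170])
```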

To sum up the results of the eye tracking data, the comet plot seems to have some advantages over the two other plots when used for simple, very elementary tasks. For the SPOCC plot, in particular, we found hints of high visual density (corresponding to high cognitive demands) during the scanning process (sub-task 1) and sub-task 2.

In the second analysis stage, we also assessed the think-aloud notes and the participants’ feedback about their impressions when working with the three mock-ups. For the comet plot (see Fig. 4, left-hand side, for a zoomed view), overlay problems were reported frequently. One problem arises when a node does not change its position over time: in this case the node at t1 (orange circle, see Juliette) was hidden by the node at t2. This kind of projection problem, mentioned by many users, was difficult to grasp since it was unclear which node the orange relations belonged to. Users also reported that, in case a node changed its position over time, some relations at t1 were masked by the trajectories. It has to be noted that, as stated earlier, users also frequently reported general comprehensibility problems and needed more time to figure out how to interpret the dynamics than in the other layouts.

Regarding the mock-up of the SPOCC plot (see Fig. 4, right-hand side), users frequently reported problems following the coloring scheme. The aim of the color coding was to avoid visual clutter by using colors to compress information: for example, we presented only one blue relation instead of two if the relation was stable. But color coding quickly becomes difficult when applied to network dynamics. For example, if A relates to B at t1 (orange) and B relates to A at t2 (green), we have to display two relations (or define a new color for this case, and this is not the only such combination). For this mock-up we decided to overlay such relations with some transparency (alpha = .7), but many users mentioned that it is stressful to deal with these “brownish lines”. An example where an orange and a green line are mixed can be found in Fig. 4 (right-hand side), with one of Juliette’s relations coming nearly vertically from the top at 1 o’clock; users had to find the arrows to decompose the direction. These findings, the knowledge that more time points would make the whole topic even more complex, and the observation that many non-expert users had difficulties finding structural changes by color-based macro-reading led to the decision to close this branch of development.

From a design point of view, we gained the impression that users easily run into clutter problems with the SPOCC plot, but also with the comet plot, where many users reported having trouble understanding the visualization at first glance and solving more than basic tasks. We therefore identified color coding of ties to visualize dynamics as a dead end with respect to the design decisions of the mock-up study, since it worked neither for simple nor for complex tasks. Regarding the indication of movements with tails or trajectories, we decided to rework the visualization. For the prototype implementation, where interaction components can help to reduce clutter, we therefore recommended the strict motto of hiding as much information as possible and making it available on demand only. But how much and which information can and should be hidden to be beneficial, and where are the drawbacks?

In the final prototype evaluation, the superimposition view, initially displayed with temporal trajectories like the comet plot but without relations, received reasonably positive feedback regarding clutter from many experts at first sight. However, some of them stated that this reduced view does not provide enough information to be interpreted safely, since no relational information is available; users are thus forced to rely on movement information alone, which was seen as a general drawback of this visualization. The interaction feature to highlight and show previous temporal relations on demand was therefore highly welcomed by most participants, but seen as too limited, since there was no opportunity for multiple selection.

To sum up the participatory design of the SI view, both strategies, showing too many or too few relations, were criticized for their specific disadvantages. As color coding of ties turned out to be very difficult to use for differentiating temporal dynamics, only additional interaction methods will be able to deliver the basis for a user-controlled solution. Such methods would help to control the amount of complexity currently shown (from no relations up to all relations), together with continuously adjustable node highlighting (from single nodes and their relations up to multiple node selections) and various graph lenses [5]. In our view, these methods will have to be combined and further fine-tuned to meet users’ full acceptance.

With this description of how we dealt with visual clutter, we wanted to give some insights into the difficulties, possibilities, and limitations of finding appropriate pathways through the method and design space of dynamic network analysis by means of empirical user feedback during the development process.

4.3 Multiple Problem Solving Strategies

Concerning the data collected in the final prototype tests, we analyzed the think-aloud audio recordings and the prototype interaction screencasts, using a categorization scheme during observation to extract information on how the participants’ interaction relates to their insights, and categorizing the different solution strategies for every task. This is comparable to the work of [31], who used a similar insight-based procedure for analyzing open tasks. Beyond the correctness and confidence of users’ answers, we found interesting empirical results: we noticed multiple problem solving strategies spanning both tasks and users, pointing to relevant differences in either the alternative or the combined use of several prototype features.

The first empirical result refers to the integration of visual and analytical methods and their balance. When addressing task 1, which was the toughest to accomplish and registered the lowest correctness, most users looked or asked for an analytical method directly providing the numeric answer (which was actually missing, since the given SNA component so far computes only node-level measures and does not provide any network-level measure such as density). Also for task 6, one user said that a binary table would have helped her in tracking nodes’ presence more than any visualization. Conversely, we noticed an opposite and unexpected behavior for tasks 3 and 4. Task 3 required comparing the degree of a given node over time; by default, these values are mapped to the color and size of nodes. Task 4 required finding the out-degree of a given node at a given time; by default, this value pops up as a numerical tooltip on mouse-over. By analyzing these tasks, we observed that some users disregarded the analytical hints and preferred to find the (out-)degree simply by counting the adjacent nodes, with the help of the highlighting interaction. This observation leads us to infer that users prefer to solve visually those tasks they think they can manage, and to have recourse to analytical methods for harder tasks. It is worth noting that for task 4 the users who counted were as confident as the users who looked at the numeric tooltip, but the former were correct less often than the latter. The analysis of task 5 revealed another interesting user behavior: after finding the sign of the variation of the eigenvector centrality by looking at the provided visual mapping of the analytical value (color and size of the trajectory), some users rotated the 2.5D view or switched to the JX view in order to verify whether the network topology was compliant and confirmed the answer they had given.

Fig. 7 Interaction graph for task 1 of the prototype test. The first two rows show the task duration (gray, row T) and occurring insights (light yellow, row R); rows 3–5 show the usage of the main views (SI = blue = row V1, JX = green = row V2, 2.5D = red = row V3), whereas the remaining rows show the usage of different interaction and exploration techniques

Another interesting result, derived from the analysis of the task completion procedures, concerns the recourse to different views to solve the same task. For task 6, for example, we provided a predefined setting with the 2.5D view and all trajectories activated. Most users solved the task by looking at the interruptions of trajectories (an interruption on the left side indicates a node that has joined the network, an interruption on the right side indicates a node that has left the network, and an interruption in the middle of two trajectory segments indicates a short leave). In answering the second part of the task, about the relational consequences of these changes, some users switched back and forth between other views, while others kept on using the predefined view and 3D navigation. For task 7, some users switched from the predefined SI view to the 2.5D view, where they looked at the slopes of trajectories to investigate movements of nodes. In Fig. 7 we can compare the exploration strategies of two subjects dealing with task 1. This figure not only shows a temporal view of the interaction logs (i.e. which views and which interactions were used or performed and when), but also task lengths and occurrences of insights. In this context, an insight is meant as a guess, a partial answer (in the sense of knowledge-building insights; see [8]), or any additional remark. In particular, the first two rows show the task duration (gray) and the sequence of insights (light yellow). The following three rows correspond to the three views: superimposition (blue), juxtaposition (green), and 2.5D (red). Finally, the remaining rows show the sequence of different types of interactions: transition between views (purple), 2D navigation (light green), 3D navigation (light blue), change of layout stability (pink), highlighting nodes and/or their trajectories (light purple), computing an SNA metric (orange), and showing additional details in the info area or in the tooltip (light orange). Comparing the exploration behavior of these two subjects, namely Subject A (top) and Subject B (bottom), we see that A was faster, used as few interactions as possible and solved the task straightforwardly. Conversely, B seemed to exploit task 1 to play with the tool, exploring most of its views and interactions. It is worth noting that both were possible expected behaviors, given the openness of the first task.
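Such interaction graphs can be rendered from time-stamped interaction logs by drawing one horizontal bar per logged interval and one row per event category; the following is a generic sketch (not the tool actually used to produce Fig. 7), with hypothetical categories, colors and timestamps:

```python
# Illustrative sketch: render an interaction timeline from logged events.
# Each event is (category, start_seconds, end_seconds); categories become rows.
import matplotlib.pyplot as plt

def plot_interaction_graph(events, row_order, colors):
    """events: list of (category, start, end); row_order: categories top-to-bottom;
    colors: {category: color}. Categories and coordinates are hypothetical."""
    fig, ax = plt.subplots(figsize=(8, 0.4 * len(row_order) + 1))
    for row, cat in enumerate(row_order):
        spans = [(s, e - s) for c, s, e in events if c == cat]
        ax.broken_barh(spans, (row - 0.4, 0.8), facecolors=colors.get(cat, "gray"))
    ax.set_yticks(range(len(row_order)))
    ax.set_yticklabels(row_order)
    ax.invert_yaxis()                      # first category on top, as in Fig. 7
    ax.set_xlabel("time (s)")
    plt.show()

# Example (hypothetical log excerpt):
# plot_interaction_graph(
#     [("JX view", 0, 40), ("2.5D view", 40, 95), ("highlight", 50, 55)],
#     row_order=["JX view", "2.5D view", "highlight"],
#     colors={"JX view": "green", "2.5D view": "red", "highlight": "violet"},
# )
```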

In Fig. 8 we compare the exploration strategies of the same pair of subjects dealing with tasks 2–7. While the correctness of the answers is the same (100%) and the completion times are similar, we can identify very different patterns of interaction and only few similarities. Overall, we note that subject A switched views quite often, while subject B never changed the predefined view offered by the experiment setup. Moreover, A never performed 3D panning or rotation, which were used quite often by B; conversely, A made intensive use of the tooltip while B used it very seldom. Nevertheless, there is a similarity in the layout adjustment, which was performed by both users only for task 2 (clusters and their stability) and task 7 (shifts from core to periphery).

Fig. 8 Interaction graph for tasks 2–7 of the prototype test

Looking in detail at specific tasks, we also find more differences than similarities. When dealing with task 2, for example, subject A started the analysis at a local level with a lot of 2D zooming and panning, followed by some layout adjustments, while subject B set the layout first and then analyzed the network at a global level during a long visual reasoning phase without any interaction, apart from some highlighting at the end. For task 4 (node out-degree), both subjects showed the same interaction pattern, but with a difference: A gave an answer only after reading the SNA value in the tooltip, while B counted the highlighted nodes, gave the correct answer, and then used the tooltip to confirm it. Tasks 5 and 6, whose predefined view was the 2.5D view, also showed differences: subject A explored the view by highlighting nodes and trajectories, while subject B navigated it in 3D space. For task 5, in particular, we can see that both subjects launched an SNA computation, but A looked at the numeric value in the tooltip, while B looked at its visual mapping (as explained above).

In general, considering all subjects of our study, we found that the different views complement each other, and that the ‘best’ view does not depend only on the data and the task, but also on the users, who might have different strategies even if they belong to a homogeneous group with a common background. Furthermore, even a single user dealing with a single task might find it beneficial to switch from one view to another to ensure the correctness and/or completeness of her insights. Similarly, for many tasks there is no perfect choice between visualization only (node-link diagrams) and computation only (numbers and tables); the best choice is to integrate both of them to support the visual analytics reasoning process. This approach enables users to best exploit their preferred problem solving strategy and to gain complex insights from a multifaceted methodological approach to network analysis. A possible disadvantage is that beginners might choose a wrong way, and in more complex cases training might be needed to help them switch to the fastest and most accurate strategy, but in general flexibility and multiple options seem to be beneficial.

5 Conclusions and Future Research Questions

We have presented the main results of the evaluation of a visual analytics approach to dynamic network analysis. After describing the features of the existing prototype, we provided insights into its participatory development process and focused on the results of a final prototype evaluation. The main contribution of this approach lies in its consistent focus on the users of visual analytics methods as one of the crucial factors for a method’s utility and success in real-world task and data scenarios.

In the case of the prototype we examined, this evaluation approach provided a fine-grained matrix showing specific strengths, weaknesses and suggestions for further development. Aside from technical implementation issues, the continuation of this work follows the three main strands we have pointed out:

Maintenance of the mental map: as a general aim, the extended preservation of the analyst’s mental map is to be seen as one of the major challenges for future developments of complex visual analytics methods and technologies. The aim is to free the analyst’s cognitive capacities from the visual effort of searching, matching or navigating, and to focus them earlier on the intended tasks of pattern analysis and exploration. Still, as our results show, the control of the amount of mental map maintenance should be made optional for daily use, where it may already have been successfully internalized as cognitive scripts and schemes.

Avoidance of visual clutter: our empirical results show that visual clutter is a decisive aspect when developing visualizations for dynamic network analysis, even for smaller networks and a small number of time points. For the color coding of temporal aspects we could not derive a satisfying and sustainable solution. Interactions can help to avoid clutter, but they have to be carefully aligned with users’ interaction patterns and exploration behavior. Extended interaction techniques might include lens techniques or other smart interactions to open up more space for all the ink that temporal visualization requires.

Multiple problem solving strategies: we observed many different problem solving strategies in a small group of subjects sharing the same skills and coping with the same task on the same data. In particular, subjects made different recourse to visual and computational methods. From this observation, we can preliminarily infer that a seamless integration of several views and computations in a consistent framework gives better results than optimizing the design of a single technique. As a result, we advocate the implementation of multiple problem solving options and methods in complex information visualization tools and technologies, since only this strategy seems able to cope with the diversity of future users and their (evolving) tasks.