Keywords

1 Introduction

Storyline visualizations have been the focus of intense research in the last decade. Originally introduced to describe the narrative of a movie [12], this visualization paradigm has been successfully used to represent the temporal dynamics of the interactions between actors (individuals or organizations) in a social network or in a working environment [10, 14, 16,17,18,19,20]. In a storyline visualization, the narrative unfolds from left to right, each actor is represented as a line, and two lines may converge or diverge at a time instant based on whether the two corresponding actors interact or not at that instant; see Fig. 1(a). Since a group of lines bundled together usually reflects an in-person meeting, a common constraint in a storyline visualization is that an actor cannot belong to two different groups at the same point in time. However, this constraint represents a severe limitation for some application scenarios, for example when groups model associations that are not in-person meetings (e.g., paper co-authorships) or when each point in time of the storyline corresponds to a relatively long time interval (e.g., one year).

In this paper we generalize the classical storyline model by allowing an actor to simultaneously belong to distinct groups. We call this model Storyline with Ubiquitous Actors (SUA); see Fig. 1(b). Essential to our model is that an actor is represented as a tree rather than a single line. Our contribution is: (i) We propose a visualization paradigm for the SUA model and identify quality metrics for it. (ii) We define an algorithmic pipeline for storyline visualizations in the SUA model. (iii) We provide a proof-of-concept implementation and apply it to produce visualizations in real-life scenarios.

Related Work. Tanahashi and Ma [18] present a general framework for generating aesthetically pleasing storyline visualizations. Subsequent papers focus on specific optimization problems like crossing minimization [5, 7, 8] and wiggle minimization [6]. Padia et al. [15, 16] consider storyline visualizations with multiple timelines. Efficient approaches that compute storyline visualizations with hierarchical relationships or with streaming data are described by Liu et al. [10] and by Tanahashi et al. [19], respectively. Qiang and Bingjie [17] present a system that embeds storyline visualizations into a radial layout. For a broader dissertation on storytelling and visualization refer to the survey of Tong et al. [20]. We remark that our scenario is strongly related to the dynamic sets visualization; see, e.g., [11, 13], and [3] for a survey. In this regard, it is worth mentioning a recent work by Agarwal and Beck [2], who adopt storylines for visualizing dynamic sets.

2 Storyline Visualizations and Ubiquitous Actors

We first recall basic definitions and principles of classical storyline visualizations and then define our visualization for the SUA model.

Classical Storyline Visualizations. A storyline \(\mathcal {S}= (\mathcal {A},\mathcal {G})\) consists of a set \(\mathcal {A}=\{a_1, a_2, \dots , a_n\}\) of actors and a set \(\mathcal {G}= \{G_1, G_2, \dots , G_k\}\) of groups. Each group \(G_i \in \mathcal {G}\) is a triple \(\langle \mathcal {A}(G_i), b_i, e_i\rangle \), where \(\mathcal {A}(G_i) \subseteq \mathcal {A}\) is a subset of actors, \(b_i\) is the begin-time of \(G_i\) and \(e_i\) is the end-time of \(G_i\). We say that \(G_i\) is active at any time instant in the interval \([b_i,e_i]\), and that each actor \(a_j \in \mathcal {A}(G_i)\) participates to \(G_i\). A common assumption is that an actor cannot participate to two distinct groups at the same point in time, i.e., if \(\mathcal {G}_i\) and \(\mathcal {G}_j\) are two distinct groups such that \([b_i,e_i] \cap [b_j,e_j] \ne \emptyset \) then \(\mathcal {A}(G_i) \cap \mathcal {A}(G_j) = \emptyset \).

In a storyline visualization, each actor \(a_j\) is represented as a line \(\ell _j\) that flows from left to right; see Fig. 1(a). Some basic principles are considered: (i) For each group \(G_i\), the lines representing the actors in \(\mathcal {A}(G_i)\) are adjacent, i.e., they run close together from the begin-time \(b_i\) to the end-time \(e_i\) of \(G_i\); (ii) lines of actors that are not in the same group at the same time are depicted relatively far from one another; (iii) a line should not deviate unless it converges or diverges with another line. In addition, common quality metrics for the readability of storyline visualizations are: (a) Line or block crossings – a line crossing occurs when two lines intersect while a block crossing is caused by two blocks of parallel lines that pairwise intersect. (b) Line wiggles – line deviations that, when frequent, negatively affect the visual flow of the layout. (c) White space gaps – white areas used to separate lines of actors that do not participate to the same group.

Fig. 1.
figure 1

Storyline visualization: (a) Classical model. (b) SUA model.

Visualizations with Ubiquitous Actors. To support the visualization of ubiquitous actors, we represent an actor \(a_j\) as a tree \(\tau _j\) rather than as a line (see Sect. 3 for a formal definition of \(\tau _j\)). Informally speaking, when an actor simultaneously participates to different groups, the line of the actor branches out and forms a tree. For example, in Fig. 1(b) we see the trees of five actors \(a_1, \dots , a_5\). At time \(t_1\) actor \(a_2\) participates to group \(G_1\) while at time \(t_2\) it simultaneously participates to groups \(G_1\) and \(G_3\). As a consequence, the line of \(a_2\) at time \(t_1\) is split into two branches. The choice of a tree is motivated by the fact that we want to represent each actor by a connected geometric feature (avoiding discontinuities); at the same time, we want to keep such geometric feature as simple as possible, since the addition of edges may increase the number of crossings. Such tree representations add new quality metrics:

  • Actor planarity. It is natural to require that each tree representing an actor is not self-intersecting. While this is trivially guaranteed when an actor is a line, it requires an algorithmic effort in the SUA model.

  • Branch continuity. To avoid interruptions in the continuity of the story, the number of branches of an actor at time \(t_h\) that continue at time \(t_{h+1}\) should be maximized. If an actor participates to m groups at time \(t_h\) and to \(m' \ge m\) groups at time \(t_{h+1}\), all branches at time \(t_h\) should continue at time \(t_{h+1}\).

  • Branch degree. When an actor tree needs new branches at some time instant \(t_h\), it is desirable that the maximum number of branches that emanates from a common branch at time \(t_{h-1}\) is minimized.

We note that such new metrics may be in conflict with classical ones (see Fig. 2).

Fig. 2.
figure 2

(a) A layout with optimal branch continuity. (b) Violating branch continuity for \(a_3\) at \(t_3\) and for \(a_1\) at \(t_5\) removes 9 crossings and reduces line wiggles. (c) A layout with optimal branch degree. (d) Violating branch degree for \(a_3\) at \(t_3\) reduces line wiggles and removes 2 crossings.

3 The SUA Algorithmic Pipeline

We compute storyline visualizations in the SUA model by means of an algorithmic pipeline based on the concept of actor-tree \(\tau _j\) associated with an actor \(a_j\). The life-time of actor \(a_j\) is the interval between the first and the last time instant at which \(a_j\) belongs to some group. Tree \(\tau _j\) is defined as follows. Node set – Tree \(\tau _j\) has a root \(r_j\). For each time instant \(t_h\) in the life-time of \(a_j\): If \(a_j\) participates to at least one group \(G_i\) (\(i>0\)) active at \(t_h\), \(\tau _j\) has a node \(u_{h,i}\) for each such group; otherwise \(\tau _j\) has a single node \(u_{h,0}\) that is not associated with any group. Edge set – The parent of a node \(u_{h,i}\) (\(i \ge 0\)) is assigned as follows: If \(t_{h-1}\) is not in the life-time of \(a_j\), the parent of \(u_{h,i}\) is the root \(r_j\). Else, if \(i>0\) and \(G_i\) is active before time \(t_h\), the parent of \(u_{h,i}\) is \(u_{h-1,i}\). Else, the parent of \(u_{h,i}\) is one of the nodes \(u_{h-1,l}\). Our algorithmic pipeline consists of four steps:

1. Actor-tree Initialization. It defines an initial actor-tree \(\tau _j\) for each actor \(a_j\). Namely, given the nodes of \(\tau _j\), it assigns the parent to each node of \(\tau _j\).

2. Branch Permutation. For each time instant \(t_h\) this step computes a permutation (i.e., a vertical order) of all nodes at time \(t_h\) in the union of all actor-trees .

3. Actor-tree Untangling. For each actor-tree \(\tau _j\), it redefines the parent of some nodes, so to reduce self-intersections of \(\tau _j\) without changing its node degrees.

4. Branch-coordinate Assignment. It assigns the y-coordinates to actor-tree nodes.

In Step 1 we aim to optimize branch continuity and branch degree. Step 2 aims to minimize block or line crossings. Step 3 tries to enforce actor planarity. Step 4 aims to reduce line wiggles and space gaps. Different algorithmic strategies are applicable to each step. We briefly describe our solution; see also Fig. 4.

Actor-Tree Initialization. For any actor-tree \(\tau _j\), let \(V_{h-1}\) and \(V_h\) be the sets of nodes at time instant \(t_{h-1}\) and \(t_h\). The parents of the nodes in \(V_h\) are chosen among the nodes in \(V_{h-1}\) so that the distribution of the degrees in \(V_{h-1}\) is as uniform as possible. For each node \(u_{h,i}\) in \(V_h\) that belongs to the same group of a node \(u_{h-1,i}\) in \(V_{h-1}\), the parent of \(u_{h,i}\) is \(u_{h-1,i}\). For the remaining nodes of \(V_h\), we adopt a round robin policy to assign children to the nodes in \(V_{h-1}\) so that the difference of the degrees of any two nodes of \(V_{h-1}\) is at most one.

Branch Permutation. We exploit a state-of-the-art algorithm for classical storyline visualizations, namely the algorithm by van Dijk et al. [5] based on a SAT formulation, which optimally solves the problem of minimizing block crossings. To this aim, we transform the output of Step 1 into an instance for a classical storyline visualization: Each actor tree is partitioned into a set of edge-disjoint paths by duplicating each node with \(k \ge 2\) children into k nodes each having one child (see Fig. 3(a)). Each path is processed by the algorithm in [5] as a distinct actor. All copies of the same node are then recombined into a single node to restore the tree. However, if disjoint paths originating from two copies of the same node are treated independently, they can create many crossings when recombined back into the tree. To alleviate this drawback, we let the initial node of each path belong to the same group of its original duplicate, unless this operation makes the path belonging to multiple groups at some other point in time. Moreover, when copies of the same node are recombined into a single node, two edges incident to this node may create a crossing, which is easily removed as depicted in Fig. 3(b). Hence, a crossing in an actor tree only involves independent edges.

Fig. 3.
figure 3

(a) Transformation of an actor tree into a set of disjoint paths. (b) Crossing removal when merging two copies of the same node. (c) Actor-tree untangling. (d) Preservation of the edge order around a node when splitting it.

Actor-Tree Untangling. For each actor-tree \(\tau _j\), we redefine the parent of some nodes to reduce the number of crossings between the edges of \(\tau _j\). If two edges \((u_{h-1,p},u_{h,q})\) and \((u_{h-1,r},u_{h,s})\) (\(p \ne q\), \(r \ne s\)) of \(\tau _j\) cross, we replace them with two new edges \((u_{h-1,p},u_{h,s})\) and \((u_{h-1,r},u_{h,q})\) (see Fig. 3(c)). This operation removes at least one self-intersection and does not create any new one. Also, the degree of \(u_{h-1,p}\), \(u_{h,q}\), \(u_{h-1,r}\), and \(u_{h,s}\) does not change. We repeat this procedure until it is no longer possible to remove self-intersections from \(\tau _j\).

Branch-Coordinate Assignment. As in the Branch-permutation step, we consider the set of paths that decompose the tree and make them disjoint by duplicating each node with \(k \ge 2\) children into k nodes each having one child. The cyclic order of the edges around each node defines the permutation of the lines that correspond to these edges (see Fig. 3(d)). Any technique that assigns coordinates to the paths, while reducing line wiggles and white space gaps can be applied (see, e.g., [6, 18]). This assignment preserves the vertical permutations of the paths.

Fig. 4.
figure 4

Illustration of the algorithmic pipeline.

4 Implementation and Case Studies

We developed a prototype web application, StoryTreeViewer, which implements the algorithmic pipeline of Sect. 3, see https://bit.ly/2yS3Fvi. StoryTreeViewer offers a simple interactive interface, which we used to evaluate effectiveness and limits of our model through two case studies on publication data extracted from DBLP [9] and Scopus [1].

Fig. 5.
figure 5

Visualizations of our case studies. See the full version [4] for larger images. (Color figure online)

Case Study 1. The first case study, see Fig. 5(a), describes scientific collaborations among the authors of this work in the various editions of the Graph Drawing Symposium (GD) since 1999. Each actor is an author and a group \(G_i\) is a subset of actors who co-authored some papers. \(G_i\) is active in \([b_i,e_i]\) if all their members co-authored at least one paper in each year from \(b_i\) to \(e_i\). The layout reveals the following dynamic. In the first part of the story there is a strong collaboration between the three oldest actors (pink, green, and blue), in particular they form a group lasting from 2004 to 2009. In 2003, the pink actor was the chair of GD, which prevented him to publish together with the other two authors. In 2010 and 2011 the collaboration of the three actors is weaker, as they mainly collaborated with researchers outside their university. The dynamic becomes more involved in the last years, when two new members joined the group (cyan and orange), and new theoretical and application research topics were activated.

Case Study 2. The second case study, see Fig. 5(b), describes scientific collaborations among five of the research teams (universities) with the highest number of papers published at GD. The actors are the teams and the groups are defined as in case study 1. Namely, a group \(G_i\) is a subset of teams that appear together in some papers (in terms of author affiliations); \(G_i\) is active in \([b_i,e_i]\) if all its teams appear together in at least one paper in each year from \(b_i\) to \(e_i\). The layout shows some interesting facts. From 1999 to 2002 there is a strong collaboration between Roma Tre and Perugia, witnessing that the group in Perugia stems from researchers coming from Rome. The collaboration between the five research teams increases since 2007 and becomes stronger since 2011. This is partly explained by the series of workshops started around 2006 (BWGD, HOMONOLO, GNV, etc.) that increased international collaborations.

Limits. Working on the case studies, we observed some limits of our approach: (i) The implementation for the Branch Permutation step exploits the algorithm in [5], splitting each actor-tree into multiple disjoint paths. The size of this transformed instance raises some computational complexity issues. (ii) Our visualizations appear to be readable for relatively few actors and further work is needed to better evaluate the effectiveness of the SUA model on larger instances.

5 Conclusions and Future Work

We introduced the SUA model, which allows ubiquitous actors in storyline visualizations. This model extends the spectrum of applications for this type of representation and opens up to many intriguing research directions. Among them: (i) Are there more effective ways of modeling ubiquitous actors other than using trees? (ii) Design and experiment different algorithms for the SUA pipeline.