1 Introduction

Hierarchies involve entities and connections between these entities. These connections are organized into levels that indicate status differences [6]. Hierarchies occur in many application domains and can be visually represented in a variety of ways. The efficiency of their visual representation relies on domain knowledge, convention [2, 3, 5], and on the naturalness of the link between the representation and the information depicted [7]. There exists a natural link between hierarchies and tree diagrams [1, 6]. This paper explores the question of how tight this correspondence is, and to what extent it is overruled by conventional associations of hierarchies with diagrams other than trees.

More specifically, this paper focuses on visual representations of hierarchies in three generally well-known application domains: organizational structure, folder and file systems, and arithmetic expressions. For organizational hierarchy, it is common to use a tree diagram. Folder and file systems are displayed by many file browsers in an indentation representation. In arithmetic expressions, nested parentheses are used to indicate the order in which operations are to be applied. Like graphs and charts, these three visual representations can be seen as schematic diagrams, as they provide a concise representation of abstract information [2].

Each of the three diagrams conveys meaning in a different way. First, the entities involved and the number and nature of their possible hierarchical connections are different. The entities are conceptually different, and in the organizational and folder hierarchies they are less abstract than in the arithmetic one. In organizational and folder hierarchies, non-terminal nodes may dominate one or more nodes at a lower level. Arithmetic expressions always involve binary branchings. Connections in organizational hierarchies are hierarchical in the sense that entities at a higher level manage and control entities at a lower level. In folder structures, the connection is one of inclusion. In arithmetic expressions, it is one of the order in which operations are applied. Second, the visualization approach differs. While the tree and indentation diagrams exploit the horizontal and vertical dimensions of the display space to visualize hierarchical connections, nested parentheses make use of only one dimension and use special characters (parentheses or similar characters) to mark levels and group entities. The tree diagram is arguably the only fully graphical diagram, as it uses lines to connect entities of different hierarchical levels. Each diagram has its variants, depending on orientation and/or the order in which objects of the same level are placed on the display.

All domains involve hierarchies based upon the same logical information, but this information is represented differently in the three domains. To gain insight into the relative merits of each type of diagram for producing and understanding hierarchies in the selected application domains, we conducted an experiment. In this paper, we are especially interested in how natural correspondence and conventional association interact in the use and interpretation of diagrams that communicate a hierarchical message in different application domains.

2 Method

The experiment consisted of two parts: (i) construction of a visual representation based upon a textual description of a hierarchy in each of the three domains selected for this study; (ii) comprehension of hierarchies in these domains via tree, indentation and nested parentheses representations. The diagrams were displayed with as few graphical devices as possible. Textual labels were used to denote the entities. Trees were shown top-down. Prefix notation was chosen for the indentation and parentheses representations of organizational and folder structure, but for the arithmetic expressions we decided to use infix notation. The design of the experiment was within-subject. The independent variable in both parts was the application domain, with three conditions. In the comprehension part, diagram type, also with three conditions, was added as a second independent variable.
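
The difference between the textual representations described above can be illustrated with a minimal sketch. It is not the software used in the experiment; the `Node` class, the function names and the example labels are all hypothetical, chosen only to show how the same hierarchy yields an indentation listing, a prefix parenthesized form, and (for binary arithmetic nodes) an infix parenthesized form.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A hierarchy entity with a textual label and ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def to_indentation(node: Node, depth: int = 0, indent: str = "  ") -> List[str]:
    """Prefix order: a parent line is followed by its indented children."""
    lines = [indent * depth + node.label]
    for child in node.children:
        lines.extend(to_indentation(child, depth + 1, indent))
    return lines


def to_prefix_parentheses(node: Node) -> str:
    """Prefix order on one line: parent label, then children in parentheses."""
    if not node.children:
        return node.label
    inner = " ".join(to_prefix_parentheses(c) for c in node.children)
    return f"{node.label} ({inner})"


def to_infix_parentheses(node: Node) -> str:
    """Infix order for binary operator nodes: (left operator right)."""
    if not node.children:
        return node.label
    left, right = node.children  # assumes exactly two operands per operator
    return f"({to_infix_parentheses(left)} {node.label} {to_infix_parentheses(right)})"


# Hypothetical example hierarchies (labels are illustrative only).
folders = Node("Documents", [Node("Work", [Node("report.txt")]), Node("Photos")])
expression = Node("*", [Node("+", [Node("2"), Node("3")]), Node("4")])

print("\n".join(to_indentation(folders)))
print(to_prefix_parentheses(folders))    # Documents (Work (report.txt) Photos)
print(to_infix_parentheses(expression))  # ((2 + 3) * 4)
```

The indentation form spreads levels over two dimensions, whereas both parenthesized forms encode the same levels on a single line using grouping characters only.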

2.1 Participants

Sixty-three participants took part in the experiment, in the period from September 7 to September 18, 2015. The participants recruited were all native Dutch speakers enrolled at the University of Groningen (33 male; 30 female; average age: 22.3 with a standard deviation of 2.2). Domain knowledge and conceptual knowledge necessary for correct interpretation of the representations were assumed to be available to all participants.

2.2 Design and Materials

In the production task, each participant was told to read a textual description (displayed on a computer screen) of an instance of a hierarchical ordering in each of the three application domains (presented in random order, and verbalized in terms suited to the domain at hand), and to draw by hand, with paper and pencil, a visual representation of the information provided. No specific instructions were given, as the goal was to elicit spontaneous drawings. Participants were allowed as much time as they needed to construct the diagrams. All drawings were categorized as belonging to a certain representation type. Categories were obtained inductively. Figure 1 illustrates the hierarchy description for folder structure (translated from Dutch).

Fig. 1. Illustration of textual description in the folder domain for the production task.

In the comprehension task, participants were asked to respond to yes-no questions about hierarchical relations in each domain (organization, folders, arithmetic), and for each representation (tree, indentation, nested parentheses). The questions, verbalized in domain-specific terms, were shown on a computer screen (display size 2560 × 1600) together with one of the three representations. Correct responses and response times were recorded for each question. To reduce learning effects and fatigue, only nine questions were selected from a total set of twenty-seven questions, which varied slightly in formulation for each domain. The hierarchies represented varied in complexity. A randomization algorithm for the selection of these nine questions from the whole set of twenty-seven was set up in such a way that all participants had to process all three visualization types in a random order for different domains. Table 1 shows examples of questions (translated from Dutch) that participants were asked to respond to in the comprehension task.
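
One way such a selection could be organized is sketched below. The source does not specify how the pool of twenty-seven questions was structured or how the algorithm worked, so everything here is an assumption: the pool is taken to consist of three question variants for each of the nine domain and representation combinations, and one variant per combination is drawn and then shuffled. All names are hypothetical.

```python
import random
from itertools import product

DOMAINS = ("organization", "folders", "arithmetic")
REPRESENTATIONS = ("tree", "indentation", "parentheses")
VARIANTS_PER_CELL = 3  # assumption: 3 domains x 3 representations x 3 variants = 27


def build_pool():
    """Enumerate the assumed pool as (domain, representation, variant) triples."""
    return [
        (domain, representation, variant)
        for domain, representation in product(DOMAINS, REPRESENTATIONS)
        for variant in range(VARIANTS_PER_CELL)
    ]


def select_questions(rng):
    """Draw one variant per domain/representation cell, in random order."""
    chosen = [
        (domain, representation, rng.randrange(VARIANTS_PER_CELL))
        for domain, representation in product(DOMAINS, REPRESENTATIONS)
    ]
    rng.shuffle(chosen)
    return chosen


# A seeded generator makes the selection for one participant reproducible.
for question in select_questions(random.Random(7)):
    print(question)
```

Under these assumptions every participant answers nine questions and meets each representation in each domain exactly once, while the order and the specific variants differ between participants.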

Table 1. Illustration of questions for each domain in the comprehension task.

2.3 Procedure

The experiment was conducted individually with each participant. The participant was seated at a desk in a quiet room with a computer, screen, keyboard and mouse. Production preceded comprehension. The participant read instructions and scenarios from the computer screen and made the drawings by hand. Next, the participant was directed back to the computer screen to respond to the questions. The whole experiment took about twenty minutes for the construction task and five minutes for responding to the questions.

Fig. 2. Sketches drawn by two participants in response to the text in Fig. 1.

3 Results

3.1 Production

The drawings produced were categorized as belonging to one of the following ten categories, collected inductively: tree, indentation, nested parentheses, network, treemap, table, pie diagram, iconic diagram, text, other. Figure 2 shows hand sketches that two participants constructed in response to the text given in Fig. 1. The left picture was categorized as treemap, the right one as indentation.

Table 2 shows the ten diagram types used to categorize the drawings (N = 189) produced for the three different domains by the participants (N = 63), with their frequencies. Trees and networks were popular among the representations constructed for both organizational hierarchy and folder structure. The arithmetic scenario gave rise to the most diversity. Remarkably, no tree or tree-like diagrams were drawn in response to this scenario. Table 2 also shows that, for each domain, the hierarchy representation conventionally linked to it was produced by some of the participants. In the organization condition, nearly half of the participants drew a tree.

Table 2. Frequencies of representation types constructed in the production task.

3.2 Comprehension

Figure 3 visualizes as box plots the results for the response times in the comprehension task. Figure 4 shows the results for accuracy as percentages of incorrect answers. Both figures give an overview of the results obtained for each representation type independent of domain, and for those for each specific domain.

The results are reported for different numbers of participants. All results of two participants were filtered out, as they were clear outliers. Additionally, outliers with respect to response times at the representation and domain level were removed from the results visualized in Fig. 3. Figure 3 shows that the mean and median of the response times in the parentheses condition for all domains are higher than those in the tree and indentation conditions. A repeated-measures ANOVA with a Greenhouse-Geisser correction shows that the mean response times differ significantly (F(1.752, 105.121) = 11.053, p < 0.001). Post-hoc testing with Bonferroni correction indicates that nested parentheses have a significant impact on the time spent to complete the question task (mean = 27930.61, std. dev. = 13962.096). Parentheses were processed more slowly than the other representations. A Wilcoxon signed-rank test was used to test for significant differences in response accuracy (see Fig. 4). This test shows that trees perform significantly better on accuracy than indentation (Z = -3.332, p = 0.001). This is mainly caused by the combination of indentation and arithmetic hierarchy, which turned out to yield very slow performance. Arithmetic expressions were most easily processed in a representation with nested parentheses. However, for the folder and especially the organizational domain, nested parentheses yielded worse performance than trees and indentation.
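
The two test statistics reported above can be unpacked with a short sketch. It is not the analysis script behind the paper, and two values in it are inferred rather than stated in the text: a sample of 61 participants (63 minus the two excluded) and a Greenhouse-Geisser epsilon of 0.876, which together with three representation conditions reproduce the reported degrees of freedom up to rounding. The function names are hypothetical, and the Wilcoxon function uses the standard normal approximation with tie correction, without continuity correction.

```python
import math


def corrected_df(k, n, epsilon):
    """Sphericity-corrected degrees of freedom for a one-way
    repeated-measures ANOVA with k conditions and n participants."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)


def wilcoxon_signed_rank_z(x, y):
    """Z statistic of the Wilcoxon signed-rank test for paired samples,
    based on the sum of positive ranks (normal approximation)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of equal absolute differences and give it the mean rank.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        run_length = j - i + 1
        tie_term += run_length ** 3 - run_length
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    return (w_plus - expected) / math.sqrt(variance)


# Inferred values: k = 3 conditions, n = 61 participants, epsilon = 0.876.
df1, df2 = corrected_df(k=3, n=61, epsilon=0.876)
print(round(df1, 3), round(df2, 3))
```

A negative Z from this function means that the first sample tends to have the smaller values in the paired comparison; which direction a reported Z takes depends on how the pairs were ordered in the original analysis.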

Fig. 3. Average response time for each representation type in general and for each application domain.

Fig. 4. Percentage of response error for each representation type, in general and for each application domain.

4 Discussion and Conclusion

The purpose of this paper was to gain insight into the use of specific hierarchy representations and into the ease of processing them in three different application domains. We were interested in the extent to which the natural correspondence between a hierarchy and a tree would be overruled by conventional associations. Neither the results of the production task nor those of the comprehension task lead to the unequivocal conclusion that trees are the most prominent and best-performing candidates for processing hierarchies in the three domains considered in this paper, although trees did yield the most accurate performance. Parentheses were clearly slowest. We discern an important difference between the tree and indentation representations on the one hand, and nested parentheses on the other. Nested parentheses are difficult for users to process in a folder or organizational hierarchy. This may be caused by the one-dimensionality of parentheses, which do not exploit the levels property [4] and which look cluttered, especially in combination with (large) textual labels denoting the hierarchy entities. Interestingly, this inconvenience did not lead to participants producing fewer correct answers. In contrast, linear representations with parentheses seem to fit the operational hierarchy in arithmetic far better than folder and organizational structure. The results of this study suggest that trees, and especially representations with indentation, perform poorly for arithmetic. This may be because, at a conceptual level, arithmetic expressions are not associated with hierarchies. This observation is supported by the findings of the production task, where no participant constructed a tree or tree-like diagram in response to the arithmetic scenario. The infix notation we chose may also have caused the slow processing of indentation representations of arithmetic expressions.
Finally, we assumed domain knowledge to be fully available for all domains. Future work will explore whether this is a confounding factor.