1 Introduction

Although some research has already been done to augment exhibits in the museum (e.g., [30, 40, 41, 46]), many exhibits are still passive and silent, i.e., the objects themselves do not provide any additional information or recommendations. This is why the artifacts are often augmented through short textual descriptions or explanations from a guide. However, commonly there is more data available than what can be written on a small plate or told in a short time (e.g., knowledge of curators or information in databases and documents). Using mobile devices to convey this extra information has been proven effective in several research projects [10, 43].

Fig. 1. Museum setting showing the visitor using the mobile application with visualization (a), the physical object in the museum (b), and Bluetooth beacons at exhibits telling the app where the user is located (c).

Many museums enable their visitors to download additional information to their smartphones (e.g., Mumok [42], Deutsches Museum [22]). However, this content is often identical to that presented within the exhibition. Besides, while most museum apps do offer maps and lists to assist users in navigation [23], the apps are not location-aware, and visitors still have to select the objects of interest themselves.

Providing a friendly and efficient user interface is crucial when conveying the vast amount of information behind an exhibition, so an app can also reach less experienced visitors and, thus, more users in general [44]. We believe this can be achieved through Information Visualization (InfoVis) methods. While there is much research on human-computer interaction in museums [31], the use of interactive data visualization in museums is still a growing area of research (e.g., [13, 59]). Our research aims to bring the perspective of InfoVis to the visitors’ mobile devices.

We conducted a design study to identify a suitable visualization for a location-aware mobile application that guides visitors through a museum exhibition (see Fig. 1). Our overall goal was to develop a visualization that is integrated into a mobile application for a museum exhibition. Therefore, we first explored three different visualization concepts and evaluated them in a lab test. In the second stage, we built upon the results of the first stage, developed variations of the preferred visualization concept, and evaluated them in both a lab and the actual museum setting. The main contributions of this work are:

  • the design of situated, mobile visualization concepts, their prototypical implementation using web technologies and application in a real museum setting (Sect. 4.1 and 5.1),

  • the results of two consecutive comparative studies (Sect. 4 and 5), as well as

  • a set of lessons learned and implications about visualization & interaction and the insights derived from the study process (Sect. 6).

In the following, we describe the background of our study and the design process (Sect. 2). We then present related work on mobile devices in museums, timeline visualization, and InfoVis on mobile devices and in museums (Sect. 3). Sections 4 and 5 document the two iterations performed to evaluate the visualization concepts. In Sect. 6, we reflect on the results and the design process.

2 Background and Design Process

The exhibition “Des Kaisers neuer Heiliger” (The Emperor’s New Saint) at the Klosterneuburg Monastery, Austria, told the stories of Emperor Maximilian I (Habsburg) and Margrave Leopold III (Babenberg) at a time of media in transition [56]. The exhibition was a mixture of the two extreme museum types (highly interactive science museums vs. do-not-touch art museums and galleries) described in [31]. Our study was performed in the context of this exhibition.

2.1 Method

We followed a design process with two iterations. First, we developed three basic visualization concepts and implemented them as clickable mockups for evaluation (Iteration 1). Based on the chosen timeline concept, we noticed there were multiple options for visualizing the data. Therefore, we conceived three concepts, which were developed as fully functional prototypes and assessed (Iteration 2). Figure 2 illustrates the timeline of the iterative development phases.

Fig. 2. Timeline of the development phases from July 2018 to November 2019.

The concepts were developed by experts in mobile development, HCI, and InfoVis. They were then refined based on feedback from three other experts, who work not only in InfoVis and HCI but also in the development of digital artifacts to enhance museum visits. As part of our co-design process, we also held an initial workshop with the five exhibition curators, who acted as domain experts with backgrounds in history and cultural mediation. In this workshop, we got to know the exhibition concept, discussed existing data, and defined the requirements. Furthermore, we brainstormed about what the visualization could look like. One of the ideas was a timeline that looked like a newspaper, which, after contributions from the different experts, became the timeline concepts designed for Iteration 1. We brought the ideas back to the curators in one of the monthly workshops that we held with them while preparing the entire interactive exhibition. Thus, we regularly discussed the progress of the visualization design and specified the underlying data.

2.2 Problem Analysis

Based on the results of the initial workshop with the exhibition curators, we defined data and requirements for the visualization.

Data Overview. For each exhibit, the data contained a title, time (year(s)), a detailed description in text form, images, and additional interactive components (e.g., AR or a single choice game). The temporal dimension was not visible in the exhibition except on the exhibit labels, which showed title, short description, and year. Thus, both the curators and we were interested in including time in the visualization.

In total, 27 exhibits were planned for the exhibition. Because of the storytelling aspects, the curators decided not to integrate all exhibits into the application. This is why the exhibits within the app were reduced to 13. These exhibits were divided into six content-related sections.
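The data described above could be modeled roughly as follows. This is only a minimal sketch, not the data format used in the study: the class names, field names, tag strings, and sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exhibit:
    title: str
    start_year: int
    end_year: Optional[int] = None      # None when a single year is assigned
    description: str = ""               # detailed description in text form
    images: List[str] = field(default_factory=list)
    interactive: Optional[str] = None   # hypothetical tag, e.g. "ar" or "single_choice_game"

    def years_label(self) -> str:
        """Return the year(s) as a label, e.g. '1490' or '1500-1510'."""
        if self.end_year is None or self.end_year == self.start_year:
            return str(self.start_year)
        return f"{self.start_year}-{self.end_year}"


@dataclass
class Section:
    title: str
    exhibits: List[Exhibit] = field(default_factory=list)


# Placeholder content, not taken from the exhibition.
demo = Section("Section A", [
    Exhibit("Exhibit 1", 1500, 1510, interactive="ar"),
    Exhibit("Exhibit 2", 1490),
])
```

Under this assumed model, the app content would consist of six `Section` objects holding 13 `Exhibit` objects in total.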

Requirements for Visualization. Together with the curators, we defined requirements for our visualization concepts:

  • RQ1. The visualization should function as a guide through the exhibition. Thus, the order of the individual exhibits within the museum has to be shown.

  • RQ2. Depending on the visitor’s location, exhibits should appear dynamically. The focus should be on the closest exhibit.

  • RQ3. To visualize the time dependency within the exhibition, exhibits appear chronologically within each section.

  • RQ4. Complementary information is shown for the selected exhibit, thus providing the visitors with additional information.
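Two of these requirements lend themselves to a short illustration. The sketch below shows one possible way to pick the closest exhibit from beacon readings (RQ2) and to order exhibits for display (RQ1, RQ3). Using signal strength (RSSI) as a proximity proxy, the threshold value, and the tuple layout are assumptions; the study does not specify how proximity was computed.

```python
from typing import Dict, List, Optional, Tuple


def closest_exhibit(rssi_by_exhibit: Dict[str, float],
                    min_rssi: float = -80.0) -> Optional[str]:
    """RQ2: return the exhibit whose beacon has the strongest signal.

    rssi_by_exhibit maps an exhibit id to its latest RSSI in dBm
    (values closer to 0 mean a stronger signal). Readings weaker than
    min_rssi (an assumed cut-off) are ignored.
    """
    in_range = {k: v for k, v in rssi_by_exhibit.items() if v >= min_rssi}
    if not in_range:
        return None
    return max(in_range, key=in_range.get)


def order_for_display(exhibits: List[Tuple[int, int, str]]) -> List[Tuple[int, int, str]]:
    """RQ1/RQ3: keep sections in exhibition order, sort by year within each.

    Each exhibit is a (section_index, year, title) tuple.
    """
    return sorted(exhibits, key=lambda e: (e[0], e[1]))
```

For example, `closest_exhibit({"a": -60, "b": -72})` selects `"a"`, and a reading set in which every beacon is weaker than the cut-off yields `None`, so no exhibit would be focused.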

3 Related Work

Based on the data and requirements, we focus on time-oriented data visualization for a location-aware mobile application in a museum. Therefore, in this section, we cover related work in four areas. First, we summarize the use of mobile devices in museums. Additionally, we take a closer look at timeline visualization, InfoVis on mobile devices, and InfoVis in museums.

3.1 Mobile Devices in Museums

In the 1950s, Acoustiguide [1] introduced the first audio guides with mobile devices in museums. Around 40 years later, the first museum applications using PDAs and Pocket PCs were documented. For example, HyperAudio [45] is an early mobile guide based on a PDA with additional infrared sensors developed in the late 1990s. HyperAudio displayed hypermedia pages with an audio channel depending on location in the museum. Several other visitor guides followed (e.g., [18, 43]). Economou and Meintani [23] evaluated museum applications for mobile phones. Most of them function as guided tours and representations of exhibitions, which is why maps or lists are often integrated to allow visitors to navigate according to the spatial layout or the chronological or alphabetical order of exhibits. In addition to functioning as guides, mobile phones are used to augment exhibits or exhibition spaces. Luna et al. [37] analyzed heritage applications in Europe that integrate Augmented Reality (AR). Most of the 35 studied apps (23) used AR to reconstruct spaces and buildings. Interestingly, fewer apps (10) extended exhibits. Mobile devices have also been integrated into multi-device ecologies (MDEs). Such MDEs are often studied in the context of games [33] and as a combination of guides and games [25]. Ghiani et al. [25] extended large screens by combining a mobile guide with games. An underlying architecture for such MDEs integrating the visitor’s mobile device has been proposed by Blumenstein et al. [8]. The authors distinguish between active exhibits (interactive, e.g., those that connect to a multi-touch table) and passive (traditional) exhibits.

3.2 InfoVis on Mobile Devices and in Museums

Timeline Visualization. Using time-based approaches [2] to visualize data dates back to the 18th century. Already in 1765, Joseph Priestley used timelines to visualize the lifespans of famous people [49] for his ‘Chart of Biography’. Khulusi et al. [34] created an interactive version of Priestley’s chart with data of musicians. Such timelines were also used to visualize personal histories based on medical records [47] and interactions of movie characters [39]. Newer research explores timelines in combination with storytelling [14].

InfoVis on Mobile Devices. In 2006, Chittaro [19] published an article about visualizing information on mobile devices. The main conclusion was that “visualization applications developed for desktop computers do not scale well to mobile devices”. The arguments mainly followed the lines of the smaller size, lower resolution, different aspect ratio, and less powerful hardware. Over the years, the performance of smartphones has been enhanced considerably. However, a survey article by Isenberg & Isenberg [32] seven years later showed that smartphones had only been used in 6% of the 100 analyzed research projects, although the user base of smartphones had been continuously growing during these years. In recent years, visualization on mobile devices has attracted more and more attention in research [9, 15, 20, 35] and practice [51, 53].

Several research works focused on tablets as target devices. Baur et al. [5] presented TouchWave (touchable stacked graphs). Sadana and Stasko [52] implemented multiple coordinated views for tablets. Later research on tablets explores details-on-demand techniques for interactive visual exploration [57] or proposes consistent interaction across different types of visualization [55]. Compared to research on tablet devices, research on smartphones is underrepresented [9]. Besides, Hoffswell et al. [29] proposed design guidelines for responsive visualization addressing news visualization.

InfoVis in Museums. Previously, visualization research focused heavily on expert users (e.g., [48]). When designing InfoVis for museum visitors, however, users of the visualization are not domain experts [7]. Research by Börner et al. [11] revealed a rather low level of data visualization literacy among science museum visitors. A promising fact, however, is that participants showed interest in the presented visualization. In 2012, the utilization of casual InfoVis in museums was described “as a rudiment of utopia in the cultural organization” [36]. However, some applications show that data visualization fits very well into the museum space. Hinrichs et al. [28] demonstrated the potential and challenges of such applications with InfoVis on a large touch display. Other examples of visualization in museums included tools with which visitors were able to explore scientific data [38] or a visualization that showed the area around Hamburg through space and time from the Middle Ages to the present [3]. The target device of those applications was a horizontal touch display (tabletop).

Rogers et al. [50] performed a comparison of in-situ and remote exploration of museum collections with three visualizations (choropleth map, bar graph, and list of artifacts) on tablets. Results showed that a keyword search was more likely to be used than visualization filters. One reason for this might be that the app was not implemented as a situated visualization, which is why it did not react to the visitors’ location, and visitors had to figure out which exhibit was next to them. Situated visualization is defined as a data representation that depends on the situation of the user or the object closest to the user [64].

Currently, there is hardly any focus on InfoVis on mobile devices (especially smartphones) in museums. Nevertheless, museums have utilized mobile applications as an additional way of extending visits. Therefore, introducing InfoVis to this area seems like a promising approach for enhancing the design possibilities for navigation through exhibits.

4 Iteration 1: Visualization Concepts

We combined basic InfoVis premises (e.g., overview first and then details on demand [54]) and mobile guidelines (e.g., screen size limitation, vertical scrolling, occlusion and fat finger problem [19, 63]) to propose different visualization options. For presenting time-oriented information, we selected three of five representation aspects as described in [14] (linear, grid, and radial). We designed a conventional linear timeline (Timeline) and a radial approach that is optimized to take advantage of mobile screen size (Timeflower). Since our data are also historical, which is often connected to documents and books, we also created a grid visualization based on a bookshelf as a metaphor (Bookshelf).

4.1 Visualization Concepts

In Timeline, data are represented as a linear vertical timeline (Fig. 3 (a)). Each exhibit is represented by a box that contains its title (Fig. 3 (a) 1) and is anchored to a year on the timeline (Fig. 3 (a) 2). If a museum visitor has not yet passed an exhibit, it is displayed as inactive (Fig. 3 (a) 5). Addressing RQ2, an exhibit is activated on the timeline whenever visitors walk by, and it automatically moves to the center of the screen. The year of the closest exhibit is marked with a border (Fig. 3 (a) 6). If the visitor selects an exhibit by tapping on the box, a larger box appears containing additional text (Fig. 3 (a) 4, RQ4). Each section is introduced by a section-introduction element (Fig. 3 (a) 3), which, in contrast to regular exhibits, features a different icon and starts a new timeline instead of displaying a year.
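The activation behavior described for Timeline can be summarized as a small piece of state. The following is a hedged sketch of that logic only; the class and method names are hypothetical, and rendering and scrolling are reduced to plain fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class TimelineState:
    exhibit_ids: List[str]                        # in exhibition order
    activated: Set[str] = field(default_factory=set)
    closest: Optional[str] = None                 # drawn with a border
    centered: Optional[str] = None                # moved to the screen center

    def on_exhibit_passed(self, exhibit_id: str) -> None:
        """Called when the visitor walks by an exhibit (RQ2)."""
        if exhibit_id not in self.exhibit_ids:
            raise KeyError(exhibit_id)
        self.activated.add(exhibit_id)            # stays active from now on
        self.closest = exhibit_id
        self.centered = exhibit_id

    def is_active(self, exhibit_id: str) -> bool:
        """Exhibits not yet passed are shown as inactive."""
        return exhibit_id in self.activated
```

Keeping `activated` as a set that only grows reflects that an exhibit, once passed, is no longer displayed as inactive, while `closest` always points to the most recently passed exhibit.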

Fig. 3. Three design concepts for visualizing historical data on a mobile device: Timeline (a), Bookshelf (b), and Timeflower (c).

The Bookshelf visualization shows the data in a bookshelf-like grid layout in which each tier represents one section of the exhibition (Fig. 3 (b)). Bookshelves were used as a metaphor in visualization before (e.g., [4, 58]). In contrast, we intended to visualize a real bookshelf. Each book corresponds to one exhibition object and shows a keyword on its spine (Fig. 3 (b) 1). Labels on the underlying tiers show the year of the exhibits (Fig. 3 (b) 2). Book holders represent section introductions (Fig. 3 (b) 3). While the tiers (sections) are aligned vertically, the horizontal alignment of the books shows the chronological order of the exhibits (RQ1, RQ3). If the years of multiple exhibits are the same, the books are stacked on top of each other (Fig. 3 (b) 7). Once again, the books are marked as inactive until the visitor has passed the corresponding exhibit. The book which represents the closest exhibit is shown with a border (Fig. 3 (b) 6). Its tier scrolls to the middle of the screen upon activation (RQ2). Tapping on the book reveals detailed information as an overlay (Fig. 3 (b) 4, RQ4).
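The grid logic of Bookshelf (one tier per section, columns in chronological order, same-year books stacked) can be sketched as a grouping step. The function name and the `(section, year, keyword)` tuple layout are assumptions for illustration, and the sample values in the usage note are placeholders.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shelf_layout(books: List[Tuple[int, int, str]]) -> Dict[int, List[Tuple[int, List[str]]]]:
    """Group books into tiers and year columns.

    books: (section_index, year, keyword) tuples in any order.
    Returns {section_index: [(year, [keywords stacked in that column]), ...]}
    with columns sorted by year, i.e. left to right on the tier.
    """
    tiers = defaultdict(lambda: defaultdict(list))
    for section, year, keyword in books:
        tiers[section][year].append(keyword)
    return {
        section: [(year, stacks[year]) for year in sorted(stacks)]
        for section, stacks in sorted(tiers.items())
    }
```

Calling `shelf_layout([(0, 1500, "a"), (0, 1490, "b"), (0, 1500, "c")])` would place `"b"` in the first column and stack `"a"` and `"c"` in the second.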

Timeflower was inspired by People Garden [65], which is a graphical representation of users based on their past interactions. The entire content of the exhibition is represented by a flower-like structure (Fig. 3 (c)) in which each exhibit is a petal (Fig. 3 (c) 1). Stamens mark section introductions (Fig. 3 (c) 3). By swiping across the screen, the flower can be rotated. While the flower takes up the lower half of the screen, an information box is displayed on the upper half (Fig. 3 (c) 4, RQ4). This box contains the title and a teaser text of the exhibit petal that is currently facing upwards. Between the box and the petal, the year of the corresponding exhibit is shown (Fig. 3 (c) 2). When visitors pass an exhibit, the flower automatically rotates so that the corresponding petal is in the middle of the screen, and its information is displayed. An additional border marks the petal that represents the closest exhibit (Fig. 3 (c) 6, RQ2).
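The rotation behavior of Timeflower can be illustrated with a little angle arithmetic. The sketch below assumes that petals are evenly spaced around the flower and that angles are measured clockwise from the top. Neither assumption is stated in the concept description, and stamens are not modeled.

```python
def rotation_to_face_up(petal_index: int, n_petals: int) -> float:
    """Rotation (degrees, clockwise) that brings the given petal to the top."""
    step = 360.0 / n_petals
    return (-petal_index * step) % 360.0


def petal_facing_up(rotation_deg: float, n_petals: int) -> int:
    """Index of the petal nearest to the top for a given flower rotation.

    Useful after a swipe, when the rotation is arbitrary and the
    information box should show the petal that currently faces upwards.
    """
    step = 360.0 / n_petals
    return round(((-rotation_deg) % 360.0) / step) % n_petals
```

With 13 petals, one per exhibit in the app, `rotation_to_face_up(i, 13)` followed by `petal_facing_up(..., 13)` returns `i` again, so automatic rotation and swipe handling stay consistent with each other.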

Our three concepts address the defined requirements. The order of the exhibits in all three concepts represents the path through the museum (RQ1). This order also represents a chronological order within each section (RQ3). As Timeline and Timeflower have a sequential scale, we do not visualize the period in between the exhibits. Bookshelf, on the other hand, visualizes such data on a chronological scale. To address RQ2, we activate, center, and highlight exhibits that are close to the visitors in all three concepts. Besides, we show additional information for selected exhibits (RQ4). However, our three designs have different strengths and weaknesses. Timeline has the look and feel of a news app, which is well known. The vertical scrolling is familiar to the user [16]. Yet, it is a classic approach that might offer the least fun experience. Bookshelf provides a good overview and looks well structured. Its weakness is that parts of the visualization are covered when detailed information is shown. The strength of Timeflower is the excellent mixture between overview and detail, but it is less known for presenting data and uses vertical text orientation, which could influence readability.

4.2 Evaluation

As the three designs are quite different in their approaches, we conducted a comparative evaluation of clickable mockups to see which concept was the easiest to understand and use. To focus on the different design concepts, we added neither the location-aware aspect nor coloring in this first evaluation step. The evaluation was counter-balanced with a within-subject design. We selected four tasks to find out 1) how participants interpret our designs, 2) whether it is possible for them to navigate within the prototype, 3) find additional information, and 4) get back to the overview page. During these tasks, participants were asked to think aloud. Afterward, they answered three post-task questions about 1) the comprehensibility of the navigation, 2) the comprehensibility of changing between exhibition objects, and 3) the ease of use of the visualization. Once all three concepts had been tested, participants were requested to fill in a post-study questionnaire to directly compare the concepts. All questions were presented as a seven-point Likert scale varying from negative to positive scores.
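One common way to counterbalance three conditions in a within-subject design is to cycle through all six possible presentation orders. The sketch below illustrates that scheme. It is an assumption that the study assigned orders this way, since the text only states that the evaluation was counter-balanced.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def counterbalanced_orders(conditions: Sequence[str],
                           n_participants: int) -> List[Tuple[str, ...]]:
    """Assign each participant one presentation order, cycling through
    all permutations so that every order is used equally often when
    n_participants is a multiple of the number of permutations."""
    orders = list(permutations(conditions))
    return [orders[i % len(orders)] for i in range(n_participants)]


orders = counterbalanced_orders(["Timeline", "Bookshelf", "Timeflower"], 24)
```

With 24 participants and 3! = 6 possible orders, each order is used four times, and each design appears in the first position eight times.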

Subjects. Twenty-four persons (P) participated in the assessment (13 female and 11 male), with an age range of 19 to 77 years (M = 41.6, SD = 14.9). Seventeen participants used Android as the operating system, while the others used iOS. Only one user reported not having experience using smartphones. As Iteration 1 was conducted in a university, participants recruited were students and administrative staff who had not previously been involved in the project.

Data Analysis. Data were analyzed with R. When distributions were not Gaussian (according to the Shapiro-Wilk test), the effect of the three designs on the participants’ scores was evaluated using the non-parametric Friedman test with posthoc Wilcoxon analyses (paired samples).
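To make the Friedman test more concrete, the following standard-library sketch computes the tie-corrected Friedman statistic for a table of ratings (one row per participant, one column per design). It is an illustration only and not the R analysis used in the study. The ratings in the usage note are made up, and the closed-form p-value holds only for two degrees of freedom, i.e., for three designs.

```python
import math
from typing import List, Sequence


def average_ranks(row: Sequence[float]) -> List[float]:
    """Rank values within one participant's row; ties get the mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # mean of 1-based ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_chi2(scores: Sequence[Sequence[float]]) -> float:
    """Tie-corrected Friedman chi-square for n rows and k columns."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in scores:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = list(row).count(value)
            tie_term += t ** 3 - t
    correction = 1 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("all ratings are tied within every row")
    stat = 12 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) - 3 * n * (k + 1)
    return stat / correction


def p_value_df2(chi2: float) -> float:
    """Upper-tail chi-square probability for exactly 2 degrees of freedom."""
    return math.exp(-chi2 / 2)
```

For the made-up ratings `[[7, 5, 3], [6, 5, 4], [7, 6, 2], [5, 4, 3]]`, every participant ranks the designs identically, so the statistic reaches its maximum of n(k - 1) = 8. As a plausibility check on the reported values, a statistic of 9.30 with two degrees of freedom gives p = exp(-4.65) ≈ 0.0096, which agrees with a result reported as p < .01.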

Fig. 4. Iteration 1. Scores from post-task and post-study questionnaires across designs. The y-axis maps the Likert scale from 1 (negative) to 7 (positive).

4.3 Results

When performing the assigned tasks, the design did not affect user performance. Participants reached correct results in 79% (SD = 23.2) of cases with the Timeline design, in 78% (SD = 21.2) of cases with Bookshelf, and in 74% (SD = 23.0) of cases with Timeflower. However, results regarding informal feedback are in contrast to these findings. Figure 4 summarizes the results for the post-task and post-study questionnaires.

Post-task Questions. The design had an effect on the comprehensibility of the navigation (\(\chi ^2\)(2) = 9.30, p < .01). Posthoc tests showed that Timeline provided an easier navigation than Timeflower (p < .01). In addition, there was an effect of the design on the ease of use of the visualization (\(\chi ^2\)(2) = 6.03, p < .05) where Timeline also supported a better usage than Timeflower (p < .05). However, designs did not differ in the ratings for comprehensibility when changing between exhibition objects.

Post-study Questions. The design also had an effect on the understandability rating of the visualization (\(\chi ^2\)(2) = 9.95, p < .01). The Timeline design was assessed as more understandable than both Bookshelf (p < .05) and Timeflower (p < .01). In addition, the design affected the quality of the overview offered by the visualization (\(\chi ^2\)(2) = 11.05, p < .01). This time, Timeflower was judged to be less suitable for overview compared to both Timeline (p < .01) and Bookshelf (p < .05). Both results seem to be related to the responsiveness of Timeflower. The design also had an effect on how easy it was to navigate through the visualization (\(\chi ^2\)(2) = 11.39, p < .01). Again, Timeline was rated as supporting navigation better than Timeflower (p < .01). In addition, regarding the combination between overview and detailed information, there was also a difference between designs (\(\chi ^2\)(2) = 6.11, p < .05). Timeline offered a better balance between overview and detailed information than Timeflower (p < .05).

Informal Feedback. Qualitative results showed that the idea behind Timeline was received well. Comments like “one can scroll, and there are exhibits” (P2), “I would expect it when I download a museum app” (P7), and “exhibition topic with historical order of objects” (P9) documented this fact. For Bookshelf, we recognized two different opinions within the comments of the participants: “overloaded, do not know where to look” (P10) vs. “well structured and numbered” (P18). Timeflower was hardly recognized as a flower. Participants recognized, e.g., a crown, a sun, or arrows. The most critical issue design-wise referred to the sectioning; in each respective design, 14 (for Timeline and Bookshelf) and 18 (for Timeflower) participants could not imagine what the subdivision meant.

5 Iteration 2: Linear Timelines

Based on the results of the first evaluation, in which the Timeline concept received the best overall rating, we developed the linear concept further. Since we had to visualize time and duration for six different sections, we noticed different options concerning scale and layout [14]. First, we reproduced the same timeline concept, which focused on the exhibits’ order. This visualization therefore displayed the different sections consecutively (Stack-based, Fig. 5 (a)). In general, stacking items is common for mobile screens. However, it means having stacked timelines as well, which corresponds to a faceted layout [14]. To avoid a very long list of six chronological timelines, a sequential scale was chosen, which means that distances between exhibits do not correspond to chronological distances [14]. In our second approach, we used pagination (Section-based, Fig. 5 (b)) to overcome the stacking (faceted layout). Thus, we could use a chronological scale to visualize the exhibits. Alternatively, we also implemented a unified option prioritizing time over exhibits’ order by including all exhibits in one timeline (All-in-one, Fig. 5 (c)) with a chronological scale.
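The difference between a sequential and a chronological scale can be shown with two small position functions. This is an illustrative sketch with hypothetical function names and placeholder years, not the layout code of the prototypes.

```python
from typing import List, Sequence


def sequential_positions(years: Sequence[int], spacing: float = 1.0) -> List[float]:
    """Sequential scale: items are evenly spaced in their display order,
    so distances do not reflect the time between them."""
    return [i * spacing for i in range(len(years))]


def chronological_positions(years: Sequence[int], length: float = 1.0) -> List[float]:
    """Chronological scale: positions are proportional to the year,
    mapped linearly onto an axis of the given length."""
    lo, hi = min(years), max(years)
    if hi == lo:
        return [0.0 for _ in years]
    return [(y - lo) * length / (hi - lo) for y in years]
```

For the placeholder years 1400, 1410, and 1500, the sequential scale yields positions 0, 1, and 2, whereas the chronological scale on an axis of length 100 yields 0, 10, and 100. The second result shows why a chronological scale can leave large empty stretches on a small screen.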

Fig. 5. Timeline visualizations showing different areas of the exhibition: (a) Stack-based, (b) Section-based, and (c) All-in-one visualization.

5.1 Visualization Concepts

In all three visualizations, exhibits are represented as cards with a title and the exhibit’s time frame either on the card (Fig. 5 (b, c) 1) or on the timeline next to it (Fig. 5 (a) 1). The card representing the exhibit closest to the visitors is highlighted (Fig. 5 (a, b, c) 5, RQ2), and a location button can be used to scroll to the corresponding card (Fig. 5 (a, b, c) 7). Inactive cards represent exhibits visitors have yet to unlock by walking by them and are indicated by higher transparency (Fig. 5 (a, b, c) 4). Once an exhibit is unlocked, the visualization scrolls to the corresponding object card (RQ2). In case this exhibit is in a different section, both the section color and the background image change accordingly. In all three visualization prototypes, clicking on a card opens the exhibit’s detail page showing detailed descriptions in the form of text, images, and/or additional interactive components (RQ4). In both Stack-based and Section-based visualization, the current exhibit’s section is displayed on section introduction cards showing its title and icon (Fig. 5 (a, b) 3). In the All-in-one visualization this information is shown as a footer instead (Fig. 5 (c) 3).

In the Stack-based visualization (Fig. 5 (a)), a timeline is divided into sections. Whenever visitors walk into a new section, the timeline’s color as well as the background color and image change. Within these colored sections, the objects are listed chronologically, but the time axis jumps between years according to the sequence of the displayed objects (Fig. 5 (a) 9).

The Section-based visualization shows each section as a page. For navigation through these pages, we positioned a navigation bar at the bottom (Fig. 5 (b) 8). The time frame of the time axis (Fig. 5 (a) 9) remains the same through all sections. The coloring of this axis is based on the section’s color. Objects within the sections are listed chronologically. Timelines [60] on the left side of the object’s cards indicate the temporal assignment as their height is determined by the start and end years the exhibit is assigned to (Fig. 5 (b) 2).

The All-in-one visualization shows one time axis (Fig. 5 (a) 9), integrating all exhibits. Within each section, exhibits still appear based on the order in the exhibition. To differentiate between the exhibition’s sections, each card is connected to a timeline showing the temporal assignment (Fig. 5 (c) 2). The timelines are colored according to the section they belong to. Whenever a visitor is located in a section, the background color and the image change accordingly. In addition, all cards which belong to this section are shown in full size (Fig. 5 (c) 1). The other cards are reduced to a small card showing the dedicated section icon (Fig. 5 (c) 10), similar to a fisheye view [24]. In case exhibits are allocated to the same year or time frame, their cards are shown as aggregated (Fig. 5 (c) 6).

Our three visualizations are well-suited for getting to know the time the exhibits are related to (RQ3). Nevertheless, these visualizations have different strengths and weaknesses. The Stack-based visualization is a simple and straightforward concept, backed by Iteration 1. However, the focus is on the exhibits’ order within the exhibition rather than on time (RQ1). The Section-based visualization tries to solve this by displaying each section on demand. In this way, it is possible to compare the time between objects of different sections, as the time axis stays in the same position. On the other hand, it might present too much white space, as exhibits in each section are spread over different years of the same time axis. The focus of the All-in-one visualization is on the chronological order (RQ3) rather than the exhibit’s order within the exhibition (RQ1). Its strengths are the overview of all exhibition objects and the ease of comparing the time of exhibits. However, such a concept might be overwhelming for first-time users.

5.2 Evaluation

To evaluate which visualization is the easiest to understand and provides the best experience, we once again prepared a counter-balanced evaluation of the visualizations (within-subjects), this time in a lab and a museum setting (between-subjects). We implemented the three visualizations with web technologies (HTML, CSS, JavaScript, and D3 [12]). As we used Bluetooth tags to trigger the location of the exhibits, we wrapped the web content in a native application for iOS and Android. All subjects were provided with prepared iPhone devices (iPhone 7 and 8) to ensure that different devices did not influence results. The three tasks were the same for all visualizations: 1) Describe the visualization. 2) Which year(s) is/are assigned to the last object you activated? 3) Show us two objects which are linked in the same year(s). After each visualization, participants answered three questions about 1) ease of use, 2) understandability of the temporal assignment, and 3) ease of comparing the time between exhibits. When all three visualizations were tested and rated, users completed an additional questionnaire where the visualizations were directly compared. For both the post-task and the post-study questions, we used a 7-point Likert scale varying from negative to positive scores.

For both the lab and the museum setting, we used the same process. Each visualization was tested in two sections (two exhibits in the first section, one exhibit in the second section). For the lab setting, we prepared a path comparable to the museum setting with five sections. Additionally, the test in the museum setting was carried out at a time when there were hardly any visitors in the museum so that we could reduce distractions.

Subjects. Overall, thirty-six persons participated in the assessment. The evaluation was conducted in two different locations: first, with 24 participants (14 female and 10 male, age range 20 to 56 years (M = 32.8, SD = 11.2)) in a lab setting. The second location was in the field, testing 12 users (6 female and 6 male, 11 and 12 years old (M = 11.8, SD = 0.6)) in the actual exhibition.

As the laboratory setting of Iteration 2 was conducted in a university, participants recruited were students and administrative staff who had not previously been involved in the project. The museum setting was conducted with a school class visiting the museum.

In the pre-questionnaire, 25 (museum: 9) participants reported a daily smartphone usage time of 61 min or more, 6 (museum: 2) participants reported 30 to 45 min, 3 participants in the lab setting reported 45 to 60 min, while 15 to 30 min (museum) and 10 to 15 min (lab) were each reported once. Participants were also asked to rate their data visualization experience on a 7-point Likert scale (low to high), resulting in a mean of 4.0 (SD = 1.4) for participants in the lab setting and 2.3 (SD = 1.7) in the museum setting.

Fig. 6. Iteration 2. Scores from post-task and post-study questionnaires across visualizations. The y-axis maps the Likert scale from 1 (negative) to 7 (positive).

Data Analysis. The same methods as in Iteration 1 were applied. There were no significant differences between the assessed treatments. Following an exploratory analysis [61], Pearson tests were also performed to test for correlations between the users’ scores and their demographics: gender (male, female), age group (child & adolescent (under 18, n = 12), young adult (18 to 30, n = 13), middle and old aged (above 30, n = 11)), smartphone usage, data visualization experience, setting group (lab vs. museum), and condition type (order of design during the test). After detecting significant correlations, a factorial analysis was adopted with a factorial ANOVA test and Tukey test for posthoc analysis.

5.3 Results

Performing the tasks, participants reached correct results in 68% of cases with Stack-based (SD = 32.0) and Section-based design (SD = 33.1), and in 78% (SD = 33.7) of cases with All-in-one design. These findings reflect the overall results regarding user experience. Figure 6 shows the aggregated results for the post-task and post-study questionnaires.

Post-task Questions. The ANOVA test reported a significant effect of setting group (lab vs. museum) on ease of use (F(1,102) = 4.44, p = .04). Ease of use was rated significantly better (p = .04) by the museum group than by the lab group. Additionally, there was a significant effect of setting group on comparing the time (F(1,102) = 16.11, p < .001), which was rated better (p < .001) by the museum group than by the lab group.

Age group had an effect on comparing the time (F(2,99) = 8.76, p < .001). Young adults and middle & old aged rated this task significantly lower (p < .01 & p < .001) than children & adolescents.

There was also an effect of data visualization experience on comparing the time (F(6,87) = 2.62, p = .02). Participants with medium experience (M = 4.48, SD = 2.01) rated significantly lower (p = .02) than participants with the lowest experience (M = 6.33, SD = 1.28) in general.
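As a sanity check, the degrees of freedom reported above appear consistent with a two-factor ANOVA that includes an interaction term and is fitted to 108 ratings. The sketch below assumes that each of the 36 participants rated all three designs and that the second factor has 2 levels (setting), 3 levels (age group), or 7 levels (data visualization experience); the function name and this reading of the model are assumptions, not a description of the authors' analysis.

```python
def factorial_dfs(n_obs: int, levels_a: int, levels_b: int) -> dict:
    """Degrees of freedom for a two-factor ANOVA with an interaction term."""
    return {
        "a": levels_a - 1,
        "b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        # One cell mean is estimated per combination of factor levels.
        "residual": n_obs - levels_a * levels_b,
    }


# Assumption: 36 participants x 3 designs = 108 ratings per question.
N_OBS = 36 * 3

by_setting = factorial_dfs(N_OBS, 3, 2)     # design x setting group
by_age = factorial_dfs(N_OBS, 3, 3)         # design x age group
by_experience = factorial_dfs(N_OBS, 3, 7)  # design x experience (1 to 7)
```

Under these assumptions, the residual degrees of freedom come out as 102, 99, and 87, matching F(1,102), F(2,99), and F(6,87).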

Post-study Questions. Correlation tests revealed positive correlations between setting groups and each of the post-task questions. For familiarity, ANOVA revealed a significant difference between designs (F(2,102) = 3.29, p = .04). Participants had more fun with the Section-based visualization than with the All-in-one visualization (p < .001). There was also a significant difference between setting groups (F(1,102) = 11.73, p < .001). The museum group rated significantly better than the lab group (p < .001) in general. We found the same significant differences for understandability (F(1,102) = 9.83, p < .01), fun (F(1,102) = 19.25, p < .001), temporal assignment (F(1,102) = 4.64, p = .03), comparing the time (F(1,102) = 18.75, p < .001), suitability as a guide (F(1,102) = 20.97, p < .001), noticing the closest object (F(1,102) = 5.76, p = .02), and ease of use (F(1,102) = 11.22, p = .001). The designs also had an effect on ease of use (F(2,102) = 3.34, p = .04). Again, the Section-based visualization was easier to use than the All-in-one visualization (p = .04).

Both designs (F(2,99) = 3.28, p = .04) and age groups (F(2,99) = 3.28, p = .04) had an effect on familiarity. On the one hand, there was a marginally significant difference (p = .05) showing that the Section-based visualization is more familiar than the All-in-one visualization. On the other hand, young adults and middle & old aged rated familiarity significantly lower (p = .04 & p < .01) than children & adolescents. In terms of understandability, we found a significant difference based on age groups (F(2,99) = 5.28, p < .01). Again, the post hoc test showed that young adults and middle & old aged gave significantly fewer points (p = .02 & p = .01) than children & adolescents. Additionally, there was an interaction effect between designs and age groups (F(4,99) = 3.04, p = .02). For middle & old aged, the Section-based design is more understandable than the Stack-based design (p = .04). Design exhibited a significant effect (F(2,102) = 3.38, p = .04) on ease of use. The Section-based design was easier to use than the All-in-one design (p = .03). Additionally, ease of use (F(2,99) = 5.99, p < .01) as well as fun (F(2,99) = 9.80, p < .001), comparing the time (F(2,99) = 9.64, p < .001), and suitability as a guide (F(2,99) = 10.38, p < .001) showed effects of age group. Young adults and middle & old aged rated significantly lower than children & adolescents. For temporal assignment, we also found significant differences between age groups (F(2,99) = 4.05, p = .02). This time, only middle & old aged rated significantly lower (p = .16) than children & adolescents.

ANOVA showed a significant difference between designs (F(2,102) = 3.93, p = .04). Also, in combination with data visualization experience, participants reported that the Section-based visualization is easier to use than the All-in-one visualization (p = .03). Data visualization experience had an effect on comparing the time (F(6,87) = 2.26, p < .05) and on suitability as a guide (F(6,87) = 2.64, p = .02). Participants with the lowest experience (M = 6.61, SD = 1.04) gave significantly (p = .03) more points than participants with an experience rating of three (M = 5.10, SD = 1.84).

Informal Feedback. Qualitative results show that the general “timeline” concept was well received. However, both qualitative and quantitative results show that user opinions on the different designs diverge considerably. The Stack-based design was properly recognized as “multiple stacked timelines”. Participants described it as “clear” or “looking ordered”. At the same time, others called it “not clear”, “overwhelming”, or found the time “complete messy”. The Section-based design was described as a “vertical timeline with multiple parallel timelines” which “feels like a conventional app layout”. Comments regarding this visualization were that it is “irritating” or “cool, but hard to understand”. One participant reported that they “need longer to do something, but with the sections, it is straightforward”. Another participant stated that “the comparison is not possible”, although she had given the correct answer for the comparison task. The All-in-one design was described as “multiple timelines side by side with colors and icons”. Judging from the comments, it seems to have caused the most diverging opinions. On the one hand, All-in-one was praised for providing “a good overview”, for being “easy to understand”, “well organized”, and for having “a clean layout”. On the other hand, participants’ comments included remarks such as “loose orientation”, “totally overwhelming”, “looks more confusing”, “overloaded”, or “very complicated”. Overall, participants liked the section coloring, which supported guidance within the exhibition.

6 Reflections and Lessons Learned

In this section, we reflect on visualization & interaction and the process of our design study and derive implications. Based on our reflections, we derived the following guidelines for future studies on InfoVis for casual users in museums:

  • Choose familiar and linear visualization techniques.

  • The prototype should consider the interaction needed.

  • Choose participants with different profiles.

  • Combine quantitative scores with qualitative feedback.

  • Performing tests in the museum is always essential; leave highly controlled comparison tests for the lab.

Choose Familiar and Linear Visualization Techniques. Our results show that the general public may lean towards more familiar and linear visualization techniques. Börner et al. [11] already stated the importance of familiarity for visitors with low data literacy. In Iteration 1, participants were able to perform the tasks well with all three designs, although the user experience with the designs was different. Based on the users’ ratings, there was no difference between designs regarding their ability to display content about exhibits (Find Objects). However, participants rated Timeflower as the least preferred approach regarding navigation and ease of use. Timeflower was functional and met the system requirement of focusing mainly on the current exhibit, which was shown by the scores it received for the overview. Nevertheless, it was also a somewhat unfamiliar technique and metaphor. Whenever users were unable to identify which object Timeflower represented, they were not able to see its affordance, which negatively influenced usability. The Bookshelf and Timeline designs, on the other hand, were more apparent and more familiar to the users, yielding close scores in many aspects such as navigation and overview. However, participants rated Bookshelf as significantly less understandable than Timeline. Even when the meaning of the Bookshelf metaphor was clear, Timeline showed objects more straightforwardly and functionally. Timeline also reflects the same visual metaphor found in news feeds and social media (e.g., card-based layout). This is why we chose Timeline as the basis for Iteration 2. With three versions of a linear timeline in Iteration 2, users were again able to perform the tasks well with all three designs, with no significant differences in performance and user experience.

Clickable Mockups Are Not Always the Optimal Choice. In Iteration 1, we produced medium-fidelity clickable mockups as we wanted to make use of the real physical device [27]. Nevertheless, we noticed that there might be an effect of the static mockup setup, particularly on the Timeflower assessment. The design of Timeflower included a wheel interaction not present in the other designs. Missing interaction elements (e.g., a smooth spinning of Timeflower) may influence the perception of the visual representation, especially when comparing familiar (Timeline or Bookshelf) with non-familiar techniques (Timeflower). Therefore, we conclude that click dummies are not always optimal for comparing different visualization techniques. In future projects, interaction should be the primary criterion when deciding on prototype fidelity. If the interaction varies between designs, all interactions must be implemented as they should work in the final version (i.e., with higher fidelity). Otherwise, it is possible to use lower-fidelity prototypes and simulate interactions (e.g., with a Wizard-of-Oz technique [62]).

Age vs. Design Alternatives. Different age groups responded differently to the assessed prototypes. The Stack-based and Section-based visualization yielded close scores, but the Section-based design was shown to be easier to use for older users (middle & old aged). Finally, we consistently observed the prevalence of older visitors despite our museum partner reporting its target group as people from a wide age range. In such a scenario, considering only the differences between age groups, the final design should follow a Section-based implementation. However, both Stack- and Section-based visualizations were shown to be suitable.

Combine Quantitative Scores with Qualitative Feedback. In our studies, we wanted to perform a comparative evaluation but still consider the user’s experience. That is why we selected a combination of a task-based method and structured Likert-scale questions, which could work even in the real museum setting and has been used before (e.g., [33]). Thus, we could adequately observe when the performance of the users differed from their perceived performance. With a mixed-method approach, we could search for statistical effects between the designs. Also, asking participants about the motivation for their ratings provided us with additional qualitative information, which could be combined with the ratings. Quantitative data could also support the interpretation of qualitative data [21].

Choose Participants with Different Profiles. Correlation tests in Iteration 2 revealed that participants who reported having more experience with data visualization consistently rated their experience lower than those with less data visualization experience. Therefore, more experienced participants might be more analytical than less experienced users. Regarding the age of our participants, we could not determine a correlation between smartphone usage and age. However, our youngest age group (children & adolescents) consistently rated their experience higher than the older participants. The younger ones are surrounded by technology [17] and use it frequently [6]. Hence, they might have shown a higher motivation to use their smartphone within the museum and a different perspective when rating the designs. Visually impaired people are an underrepresented target group in visualization research [26]. Although accessibility was not the focus of our study, we provided each element with the required ARIA labels (Footnote 1), enabling users to read all visualization elements via a screen reader. We observed a blind person using the Section-based visualization (final version) and found that accessing an exhibit in the middle of the timeline was a main issue, as users had to go through all previous exhibits in the timeline to reach the desired item every time the screen was reloaded. To solve this problem, we added anchor points to jump to exhibits directly. As casual users differ in visualization literacy (cf. [48]), motivations for using technology, and limitations, future studies need to select participants with different profiles, especially when targeting casual users.

Weighing Up Lab Against Museum Setting. In Iteration 2, we tested our location-aware visualization in two different settings. The lab provided a controlled setting aiming at high internal validity. In addition, we wanted to address the lack of knowledge about the context of usage [9] through additional testing in the museum. In our study, the two questions focusing on the location-aware aspect (noticing the closest object & suitability as a guide) were rated significantly higher by the museum group, which might indicate a clear benefit of using the real exhibition setting. However, the museum group was the same as the youngest age group. Therefore, we cannot differentiate between the effects of age group and test setting. Performing a comparative study in a running exhibition environment required considerable effort, which is why we recruited a school class. On the other hand, in such a class, the population was far too homogeneous, with participants presenting similar profiles.

7 Conclusion

In the context of an exhibition in an Austrian museum, we contribute a design study on visualizing historical data through a mobile museum application. In total, we performed two comparative studies: 1) with clickable mockups of three basic visualization concepts, and based on these results, 2) with three functional prototypes of timeline visualization concepts both in a lab and museum setting. As a final step, we defined guidelines for the design and development of future studies on visualization for casual users in museums based on the lessons learned. The guidelines should support the design of interactive visualizations for cultural heritage applications, covering aspects from the visualization and interaction implementation, as well as the general design process.